Skip to content

Data Analytics

Join our forum to discuss your ideas with Aiven community or check out our public roadmap.

Data Analytics

Categories

JUMP TO ANOTHER FORUM

74 results found

  1. I need to read and write Parquet format files to/from Azure Blob Storage.

    3 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  2. Dear Aiven Community,

    As a DevOps Engineer,
    I want to be able to configure the shardindexingpressure settings directly within the Aiven platform and through the Aiven Terraform provider,
    so that I can better manage indexing loads on Opensearch clusters, optimize performance during high data throughput, and prevent potential bottlenecks.

    In addition, this configuration capability is essential for dynamically adjusting indexing pressure based on real-time data demands. Currently, trying to enable or modify shardindexingpressure settings results in a 403 error, indicating that the feature is not supported in Aiven's current Opensearch offerings. Enabling this feature would allow users to set parameters…

    8 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  3. As a user sending Kafka messages to Aiven for ClickHouse,
    I want to be able to send those messages using the protobuf format

    3 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  4. As an operator of Opensearch,
    I want to be alerted when my shards are outside of recommend best practices of 10-50GB / shard,
    so that I can avoid having overly large shard size cause performance problems for ingestion and query.

    In addition, please tell me how within the alert to split my shards if they do get too large.

    1 vote

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  5. As a SRE or operations engineer
    I want to be able to create index snapshots manually or with a policy
    so that I can easily and quickly restore partial data after an unintentional index corruption.
    In addition, I understand this would require a shared storage between cluster nodes which means additional storage so it would make sense to make it a paid feature and/or include it in the tiered storage project.

    3 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  6. As a developer
    I want to use a custom plugin in OpenSearch
    so that I can implement custom scoring or text analysis or use a community open source plugin.

    2 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  7. As an OpenSearch operator
    I want to be able to pause, stop, and start cross cluster replication
    so that I can use this feature to support failover in a disaster recovery scenario.

    Typically in Elasticsearch or Opensearch, CCR can be used to support a DR deployment by placing two clusters (a leader and follower) in separate regions. When the leader cluster becomes unavailable, applications/clients can failover to the follower cluster by stopping replication on the following cluster which makes it a regular index.

    https://opensearch.org/docs/latest/tuning-your-cluster/replication-plugin/api/

    1 vote

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    Currently we are controlling this process of start/pause etc so we can control replication process during the entire cluster's lifecycle (through all the node recycling, upgrading etc.) to ensure the stability of our services.


    I put this to Gather interest, I am also aware of the failover capability mentioned in the ideas as well (we have different ideas ticket for that), this is something we can have a look and see if we need to expose all APIs if the main usecase is failover 

  8. As an architect
    I want to create right-sized clusters for my use case
    so that I can get the most value.

    Currently, all the OpenSearch clusters have a 1:4 CPU:RAM ratio. High throughput application search use cases often have small data sets and can benefit from more relative CPU than RAM or disk (e.g. 1:2 CPU:RAM ratio). Logging use cases with large volumes of data may benefit from storage optimized instances with 1:8 CPU:RAM ratio with more disk.

    1 vote

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  9. As an Opensearch administrator, I want to be able to limit the impact of heavy requests on my cluster.
    So that when some client applications make these requests, I can mitigate impacts for other client applications.

    An example would be to be able to use backpressure mechanism
    https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/search-backpressure/

    13 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  10. As an OpenSearch architect
    I want to provision the minimum amount of hardware I need to meet my requirements
    so that I can optimize costs.

    Currently, OpenSearch on Aiven only supports 3 AZ deployments for production-grade plans. OpenSearch clusters with data nodes deployed across 2 AZs can be considered production-grade as long as you have master nodes across 3 AZs.

    1 vote

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  11. As an user of Opensearch

    I want to be able to store large amounts of immutable logs for lengthy periods of time

    so that I can support compliance and other regulatory requirements placed on me

    In addition, I need this to be provided in a cost efficient manner, leveraging technologies such as object storage. Given queries against this data are infrequent price far outweighs performance.

    2 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  12. As Developer
    I want to be able to leverage the new ML capabilities in OpenSearch (https://opensearch.org/docs/latest/ml-commons-plugin/)
    so that I can use new features like Semantic search, leveraging external models etc.
    There are various cluster settings that needs to be exposed to the end-users to enable these features.

    The most urgent is "onlyrunonmlnode" which is set to "true" in Aiven Clusters. This needs to be set to "false" to allow ML workloads to be executed on any node (until we have the capability to assign dedicated ML-nodes in the cluster)

    There are more configs that…

    3 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  13. As an Data Engineer
    I want to use 3rd party ETL tools to populate Clickhouse tables
    so that I can use clickhouse as my warehouse. Currently popular tools such as airbyte do not work when connected to Aiven clickhouse due to the fact that it creates internal state tables within clickhouse, and it fails given that tables are not allow to be created outside the context of the Aiven console. Customers are unable to use external tools that can self manage state and required the permission.

    2024-01-26 22:37:45 normalization > Code: 497. DB::Exception: avnadmin: Not enough privileges. To execute this…

    3 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  14. As Opensearch Administrator
    I want to be able to set custom Base_URLs for my OpenSearch clusters
    so that I can simplify usage for my customers when they have multiple OS services per group/client/customer.

    1 vote

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  15. As a Security Analyst,

    I want to utilise the Observability plugin & PPL in an efficient manner. Features such as tooltips and autocomplete would help a lot, as well as bug fixes and regular updates.

    The syntax is not well nor widely understood, and there are lingering bugs, which for a user are very hard to duplicate across the microcosm of repositories which bundle into the suite.

    Observabilitiy & PPL feels like a very promising place for OpenSearch to become more useful to a security operations team, where currently the capabilities are extremely limiting.

    While OpenSearch Dashboards, with tenancy and…

    2 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  16. As a user of other services wanting to migrate my data to Aiven for ClickHouse,
    I want to be able to use generic JDBC compatible tools to move my data out of my current service into Aiven for ClickHouse easily, and at an acceptable speed.
    This can be done using the JDBC table function or table engine.

    0 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  17. As a developer
    I would like to have the following configuration options exposed:

    opensearch_security.cookie.ttl
    opensearch_security.session.ttl
    opensearch_security.session.keepalive

    so that I can lengthen the dashboard session timeout for my users.

    3 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  18. At this moment there are only the primary Postgres instances available for a clickhouse connect. Wouldn't it make sense to have the capability connection Clickhouse to a read replica to prevent performance impacts on the primary?

    1 vote

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  19. As a security analyst,

    I want to create a visualisation from a search, which can then be added to an existing or new dashboard, so that I can save time creating dashboard elements and create dashboard elements in a much easier manner than is currently possible.

    2 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
  20. As a security operator,

    I want to have a view of our compliance status across various data sources, in a 'continual assurance' manner. e.g. PCI, SOC2, ISO27001, or frameworks such as NIST CSF. So that I can get a continual view of degredations as they occur.
    so that I can [describe the benefit or a problem you want to solve]
    In addition, [share any additional context or why this idea is important to you]

    2 votes

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)

    We’ll send you updates on this idea

    How important is this to you?

    We're glad you're here

    Please sign in to leave feedback

    Signed in as (Sign out)
← Previous 1 3 4
  • Don't see your idea?