Data Analytics

Join our forum to discuss your ideas with the Aiven community, or check out our public roadmap.


18 results found

  1. As an OpenSearch administrator, I want to be able to limit the impact of heavy requests on my cluster,
    so that when some client applications make such requests, I can mitigate the impact on other client applications.

    An example would be the ability to use the search backpressure mechanism:
    https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/search-backpressure/

    13 votes
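
    For illustration, a minimal sketch of how the search backpressure mode could be switched via the cluster settings API, assuming Python with the requests library and placeholder endpoint/credentials; this is the API the idea asks Aiven to expose:

      import requests

      # Placeholder Aiven for OpenSearch endpoint and credentials.
      OS_URL = "https://my-service.aivencloud.com:26482"
      AUTH = ("avnadmin", "password")

      # Switch search backpressure from the default "monitor_only" to "enforced"
      # so that resource-heavy search tasks can be cancelled under load.
      resp = requests.put(
          f"{OS_URL}/_cluster/settings",
          auth=AUTH,
          json={"persistent": {"search_backpressure.mode": "enforced"}},
      )
      print(resp.status_code, resp.json())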

  2. Dear Aiven Community,

    As a DevOps Engineer,
    I want to be able to configure the shard_indexing_pressure settings directly within the Aiven platform and through the Aiven Terraform provider,
    so that I can better manage indexing loads on OpenSearch clusters, optimize performance during high data throughput, and prevent potential bottlenecks.

    In addition, this configuration capability is essential for dynamically adjusting indexing pressure based on real-time data demands. Currently, trying to enable or modify shard_indexing_pressure settings results in a 403 error, indicating that the feature is not supported in Aiven's current OpenSearch offerings. Enabling this feature would allow users to set parameters…

    8 votes
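
    As a sketch of what is being requested (hosts and credentials are placeholders), this is the kind of cluster-settings call that currently returns 403 on Aiven and that the idea asks to support, directly and through the Terraform provider:

      import requests

      OS_URL = "https://my-service.aivencloud.com:26482"  # placeholder endpoint
      AUTH = ("avnadmin", "password")                      # placeholder credentials

      # Enable shard-level indexing pressure; on Aiven this currently fails with 403.
      resp = requests.put(
          f"{OS_URL}/_cluster/settings",
          auth=AUTH,
          json={"persistent": {"shard_indexing_pressure.enabled": True}},
      )
      print(resp.status_code)  # 403 today; 200 once the setting is supported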

  3. There is a reported issue where incomplete results are returned when querying rollup indexes in OpenSearch. The issue was raised upstream and can be found here:

    https://github.com/opensearch-project/index-management/issues/903#issuecomment-2146610171

    The suggested solution is to enable the setting "plugins.rollup.search.search_all_jobs", and local testing confirms that enabling it resolves the bug report; however, access to /_cluster/settings is currently not supported.

    3 votes
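
    A minimal sketch of the suggested workaround, assuming the opensearch-py client and placeholder connection details; this is the call that is blocked today because /_cluster/settings is not exposed:

      from opensearchpy import OpenSearch

      # Placeholder connection details for an Aiven for OpenSearch service.
      client = OpenSearch(hosts=["https://avnadmin:password@my-service.aivencloud.com:26482"])

      # Allow rollup search to consider all rollup jobs, which resolves the
      # incomplete-results issue reported upstream.
      client.cluster.put_settings(
          body={"persistent": {"plugins.rollup.search.search_all_jobs": True}}
      )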

  4. As a developer or DevOps engineer,
    I want the follower OpenSearch cluster to be auto-promoted to leader after I set up cross-cluster replication, in case of an outage,
    so that I can make sure there will not be any disruption.
    In addition, auto-resume would be beneficial once the leader cluster is stable again.

    40 votes
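
    Today the promotion is a manual step; a rough sketch of what would need to be automated (endpoint, credentials and index name are placeholders), using the replication plugin's stop API, which detaches a follower index and makes it writable:

      import requests

      FOLLOWER_URL = "https://follower-service.aivencloud.com:26482"  # placeholder
      AUTH = ("avnadmin", "password")                                  # placeholder

      # Stop replication for the follower index, effectively promoting it to a
      # regular writable index; the idea asks for this to happen automatically
      # when the leader cluster becomes unavailable.
      resp = requests.post(
          f"{FOLLOWER_URL}/_plugins/_replication/my-index/_stop",
          auth=AUTH,
          json={},
      )
      print(resp.json())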

  5. As an OpenSearch admin/developer,
    I want to assign a dedicated role to each node in my OpenSearch clusters,
    in order to optimise the performance of my OpenSearch clusters.

    26 votes

  6. Provide a capability to back up OpenSearch to a secondary region in the event of a hyperscaler regional outage, allowing customers to develop DR strategies that meet their RTO/RPO.

    2 votes

  7. As a database administrator,
    I want to restore from an external snapshot that isn't hosted on Aiven,
    so that I can migrate data from OpenSearch and Elasticsearch clusters (Aiven and non-Aiven) to Aiven for OpenSearch.

    3 votes
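
    For context, restoring from an external S3-hosted snapshot roughly comes down to the standard snapshot APIs below (bucket, repository and snapshot names are placeholders); registering such a repository is the part that is not possible on Aiven today:

      import requests

      OS_URL = "https://my-service.aivencloud.com:26482"  # placeholder endpoint
      AUTH = ("avnadmin", "password")                      # placeholder credentials

      # Register a read-only repository pointing at the external snapshot bucket.
      requests.put(
          f"{OS_URL}/_snapshot/external_repo",
          auth=AUTH,
          json={"type": "s3", "settings": {"bucket": "my-snapshots", "readonly": True}},
      )

      # Restore a snapshot taken on the source (non-Aiven) cluster.
      requests.post(
          f"{OS_URL}/_snapshot/external_repo/snapshot_1/_restore",
          auth=AUTH,
          json={"indices": "logs-*"},
      )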

  8. As a developer or database administrator,
    I would like to utilise object storage in my Aiven for OpenSearch instance,
    so that I can store larger amounts of data at a lower cost than attached SSD or HDD.

    37 votes

  9. As a developer,
    I want to be able to leverage the new ML capabilities in OpenSearch (https://opensearch.org/docs/latest/ml-commons-plugin/)
    so that I can use new features like semantic search, leveraging external models, etc.
    There are various cluster settings that need to be exposed to end users to enable these features.

    The most urgent is "only_run_on_ml_node", which is set to "true" in Aiven clusters. This needs to be set to "false" to allow ML workloads to be executed on any node (until we have the capability to assign dedicated ML nodes in the cluster).

    There are more configs that…

    5 votes
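
    The full setting name is plugins.ml_commons.only_run_on_ml_node; a minimal sketch of the change being requested, with placeholder endpoint and credentials:

      import requests

      OS_URL = "https://my-service.aivencloud.com:26482"  # placeholder endpoint
      AUTH = ("avnadmin", "password")                      # placeholder credentials

      # Allow ML Commons workloads to run on any node instead of dedicated ML nodes.
      resp = requests.put(
          f"{OS_URL}/_cluster/settings",
          auth=AUTH,
          json={"persistent": {"plugins.ml_commons.only_run_on_ml_node": False}},
      )
      print(resp.status_code)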

  10. As a user of Aiven for ClickHouse on AWS, Microsoft Azure or Google Cloud,
    I would like to leverage my own object storage account (the one I am already using for BYOC) with tiered storage for Aiven for ClickHouse.

    4 votes

  11. As a ClickHouse user, I want to be able to use table functions and engines that require authentication credentials (e.g. a private remote S3 bucket, a remote Delta Lake table, etc.) without hassle or risk.
    With named collections in Aiven for ClickHouse, you can set your integration credentials once and use them safely in all your remote queries.
    Moreover, you can easily rotate credentials if needed: change them once in the Aiven console and have the change applied to all your integrations.

    1 vote
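
    A sketch of how this could look, assuming ClickHouse's named collection DDL and the clickhouse-connect Python client; the service host, collection name and credentials are placeholders:

      import clickhouse_connect

      # Placeholder connection details for an Aiven for ClickHouse service.
      client = clickhouse_connect.get_client(
          host="my-clickhouse.aivencloud.com", port=443,
          username="avnadmin", password="password", secure=True,
      )

      # Store the S3 credentials once as a named collection ...
      client.command("""
          CREATE NAMED COLLECTION my_s3 AS
              access_key_id = 'AKIA-placeholder',
              secret_access_key = 'secret-placeholder'
      """)

      # ... and reuse it in any remote query without repeating the secrets.
      rows = client.query(
          "SELECT count() FROM s3(my_s3, url = 'https://my-bucket.s3.amazonaws.com/data/*.parquet')"
      )
      print(rows.result_rows)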

  12. As a developer using Kafka to stream messages to ClickHouse,
    I want to be sure that messages are delivered and ingested into ClickHouse exactly once, as opposed to at least once, so that I don't have to deal with duplicated data that I then have to deduplicate down the line. This removes both a cognitive and an operational load from my data pipeline.
    cf. https://github.com/ClickHouse/clickhouse-kafka-connect

    Exactly-once semantics have been implemented in newer versions of the ClickHouse Kafka Connect Sink (https://github.com/ClickHouse/clickhouse-kafka-connect).

    5 votes
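
    As a rough sketch of what enabling this could look like (connector name, hosts, credentials and the exact config keys are assumptions to be checked against the connector's documentation), the sink's exactly-once mode is turned on through the connector configuration submitted to the Kafka Connect REST API:

      import requests

      CONNECT_URL = "https://my-kafka-connect.aivencloud.com:443"  # placeholder

      # ClickHouse Kafka Connect Sink configured for exactly-once delivery.
      config = {
          "name": "clickhouse-sink",
          "config": {
              "connector.class": "com.clickhouse.kafka.connect.ClickHouseSinkConnector",
              "topics": "events",
              "hostname": "my-clickhouse.aivencloud.com",  # placeholder
              "port": "8443",
              "database": "default",
              "username": "avnadmin",                      # placeholder
              "password": "password",                      # placeholder
              "exactlyOnce": "true",
          },
      }

      resp = requests.post(f"{CONNECT_URL}/connectors", json=config, auth=("avnadmin", "password"))
      print(resp.status_code)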

  13. As an operator of OpenSearch,
    I want to be alerted when my shards are outside the recommended best practice of 10-50 GB per shard,
    so that I can avoid overly large shards causing performance problems for ingestion and querying.

    In addition, the alert should tell me how to split my shards if they do get too large.

    1 vote
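
    For reference, the remediation the alert could point to boils down to OpenSearch's _split API (index names and shard counts are placeholders): the oversized index is made read-only and then split into a target index with more primary shards:

      import requests

      OS_URL = "https://my-service.aivencloud.com:26482"  # placeholder endpoint
      AUTH = ("avnadmin", "password")                      # placeholder credentials

      # 1. Block writes on the oversized source index (required before splitting).
      requests.put(
          f"{OS_URL}/logs-2024/_settings",
          auth=AUTH,
          json={"index.blocks.write": True},
      )

      # 2. Split into a new index with more primary shards (the target count must
      #    be a multiple of the source index's shard count).
      requests.post(
          f"{OS_URL}/logs-2024/_split/logs-2024-split",
          auth=AUTH,
          json={"settings": {"index.number_of_shards": 4}},
      )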

  14. As a developer with a current PostgreSQL instance,
    I would like to utilise the ClickHouse PostgreSQL Table engine,
    so that I can easily read from that external PostgreSQL instance and insert the data into my Aiven for ClickHouse instance.

    3 votes
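
    A minimal sketch of the engine being requested, using the clickhouse-connect client; all hosts and credentials are placeholders:

      import clickhouse_connect

      # Placeholder connection details for an Aiven for ClickHouse service.
      client = clickhouse_connect.get_client(
          host="my-clickhouse.aivencloud.com", port=443,
          username="avnadmin", password="password", secure=True,
      )

      # Proxy table backed by the external PostgreSQL instance (placeholders).
      client.command("""
          CREATE TABLE pg_orders (id UInt64, amount Float64)
          ENGINE = PostgreSQL('pg.example.com:5432', 'shop', 'orders', 'pguser', 'pgpassword')
      """)

      # Read from PostgreSQL and materialise the rows in a local MergeTree table.
      client.command("CREATE TABLE local_orders ENGINE = MergeTree ORDER BY id AS SELECT * FROM pg_orders")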

  15. As someone running analytics with ClickHouse,
    I want to be able to create functions that read data from an external MongoDB instance, as well as create tables that span data in external MongoDB instances,
    so that I can enrich my analytics with data stored elsewhere, or easily migrate data from a MongoDB instance to my Aiven for ClickHouse instance.

    5 votes
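
    A sketch of both forms mentioned, assuming ClickHouse's mongodb() table function and MongoDB table engine, with placeholder connection details:

      import clickhouse_connect

      client = clickhouse_connect.get_client(
          host="my-clickhouse.aivencloud.com", port=443,   # placeholders
          username="avnadmin", password="password", secure=True,
      )

      # Ad-hoc query through the mongodb() table function (structure declared inline).
      rows = client.query("""
          SELECT count()
          FROM mongodb('mongo.example.com:27017', 'appdb', 'events',
                       'mongouser', 'mongopassword', 'id UInt64, payload String')
      """)
      print(rows.result_rows)

      # Or a persistent table that spans the external MongoDB collection.
      client.command("""
          CREATE TABLE mongo_events (id UInt64, payload String)
          ENGINE = MongoDB('mongo.example.com:27017', 'appdb', 'events',
                           'mongouser', 'mongopassword')
      """)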

  16. As a customer of Aiven, I would like to pull data from many disparate sources into my ClickHouse data warehouse to provide my users best-in-class analytics and performance. Adding support for DeltaLake would open up new sources from which I can seamlessly consume.

    4 votes
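
    For illustration, upstream ClickHouse already exposes a deltaLake() table function; a minimal sketch with placeholder bucket, keys and connection details:

      import clickhouse_connect

      client = clickhouse_connect.get_client(
          host="my-clickhouse.aivencloud.com", port=443,   # placeholders
          username="avnadmin", password="password", secure=True,
      )

      # Query a Delta Lake table stored in S3 (URL and keys are placeholders).
      rows = client.query("""
          SELECT count()
          FROM deltaLake('https://my-bucket.s3.amazonaws.com/delta/events/',
                         'AKIA-placeholder', 'secret-placeholder')
      """)
      print(rows.result_rows)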

  17. As someone running analytics queries with ClickHouse against multiple data sources,
    I want to have the MySQL table engine enabled so that I can create tables in ClickHouse that span data in external MySQL instances,
    so that I can bring the value of external data sources into my analysis.

    2 votes
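
    A minimal sketch of the MySQL table engine being asked for, with placeholder hosts and credentials:

      import clickhouse_connect

      client = clickhouse_connect.get_client(
          host="my-clickhouse.aivencloud.com", port=443,   # placeholders
          username="avnadmin", password="password", secure=True,
      )

      # Table that reads directly from an external MySQL instance (placeholders).
      client.command("""
          CREATE TABLE mysql_customers (id UInt64, name String)
          ENGINE = MySQL('mysql.example.com:3306', 'crm', 'customers', 'mysqluser', 'mysqlpassword')
      """)

      # Pull external MySQL data into a local analytics query.
      rows = client.query("SELECT count() FROM mysql_customers")
      print(rows.result_rows)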

  18. As a user of OpenSearch,

    I want to be able to store large amounts of immutable logs for lengthy periods of time,

    so that I can support compliance and other regulatory requirements placed on me.

    In addition, I need this to be provided in a cost-efficient manner, leveraging technologies such as object storage. Given that queries against this data are infrequent, price far outweighs performance.

    2 votes
