Data Analytics

Join our forum to discuss your ideas with the Aiven community, or check out our public roadmap.

93 results found

  1. As an AI engineer or developer using OpenSearch for ML applications,
    I want to incorporate machine learning into relevance ranking
    so that I can improve my application with proper ranking and recommendation.

    8 votes

  2. As a developer with a disaster recovery (DR) requirement, I would like to be able to restore the service from its backup when the service's hosting region is down.

    4 votes

  3. As a developer working with ClickHouse,
    I want to utilise the S3Queue engine available in v23.11
    so that I can watch an S3 bucket and automatically consume new files, via a materialized view, as soon as they arrive.

    This will enable me to keep my analytics up to date with no need for additional engineering effort or streaming pipelines.

    7 votes

  4. As an operator of OpenSearch,
    I want to be alerted when my shards are outside the recommended best-practice range of 10-50 GB per shard,
    so that I can avoid overly large shards causing performance problems for ingestion and queries.

    In addition, please tell me within the alert how to split my shards if they do get too large.
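
The check behind such an alert could look roughly like the sketch below. It is only an illustration of the 10-50 GB rule from the idea: the function name, the input shape (shard name mapped to size in bytes), and the use of binary gigabytes are assumptions, and it does not call any OpenSearch or Aiven API.

```python
GB = 1024 ** 3  # assumption: sizes are compared in binary gigabytes


def classify_shards(shard_sizes_bytes, min_gb=10, max_gb=50):
    """Label each shard as 'too_small', 'ok', or 'too_large'.

    shard_sizes_bytes is a hypothetical mapping of shard name to size in bytes.
    """
    labels = {}
    for name, size in shard_sizes_bytes.items():
        size_gb = size / GB
        if size_gb < min_gb:
            labels[name] = "too_small"
        elif size_gb > max_gb:
            labels[name] = "too_large"
        else:
            labels[name] = "ok"
    return labels


def shards_to_alert_on(shard_sizes_bytes, min_gb=10, max_gb=50):
    """Return the sorted names of shards outside the recommended range."""
    labels = classify_shards(shard_sizes_bytes, min_gb, max_gb)
    return sorted(name for name, label in labels.items() if label != "ok")
```

For example, `shards_to_alert_on({"logs-0": 5 * GB, "logs-1": 30 * GB})` would flag only `logs-0`.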

    1 vote

    Roadmapped  ·  Hoang Minh Vo responded

    Thanks Jason. We can roadmap this. We will update the idea when we have a more concrete timeline.

  5. As a developer
    I would like to have the following configuration options exposed:

    opensearch_security.cookie.ttl
    opensearch_security.session.ttl
    opensearch_security.session.keepalive

    so that I can lengthen the dashboard session timeout for my users.

    3 votes

  6. As an OpenSearch operator
    I want to be able to pause, stop, and start cross-cluster replication
    so that I can use this feature to support failover in a disaster recovery scenario.

    Typically in Elasticsearch or OpenSearch, CCR can be used to support a DR deployment by placing two clusters (a leader and a follower) in separate regions. When the leader cluster becomes unavailable, applications and clients can fail over to the follower cluster by stopping replication on the follower cluster, which turns the follower index into a regular index.

    https://opensearch.org/docs/latest/tuning-your-cluster/replication-plugin/api/

    1 vote

    Currently we control this process (start, pause, and so on) ourselves so that we can manage replication throughout the cluster's entire lifecycle (node recycling, upgrades, etc.) and ensure the stability of our services.

    I have set this to Gather Interest. I am also aware of the failover capability mentioned in other ideas (we have a separate idea ticket for that). We can look into whether we need to expose all the APIs if the main use case is failover.

  7. As a security analyst,

    I want to create a visualisation from a search, which can then be added to an existing or new dashboard, so that I can save time creating dashboard elements and create dashboard elements in a much easier manner than is currently possible.

    2 votes

  8. As a security operator,

    I want to have a view of our compliance status across various data sources, in a 'continual assurance' manner (e.g. against standards such as PCI, SOC2, and ISO27001, or frameworks such as NIST CSF), so that I can get a continual view of degradations as they occur.

    2 votes

  9. As a security analyst,

    I want to collect events directly from cloud resources (XaaS, e.g. AWS, Azure, Okta, GitHub, GCP...) without having to run an intermediary host such as Logstash, so that I can lower my infrastructure cost, external hosting complexity, and maintenance overhead.

    2 votes

  10. As a security analyst,

    I want to be able to search across more than one index within Discover (and Dashboards queries), so that I can enrich data between sources.

    For example, Okta logs contain an organisation's user logins, along with their IP addresses. We may also have SSHd logs, and between the two we could correlate on IP address to bring user details into a search of SSH logs. Many other examples could be found.
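
The Okta and SSH correlation described in this idea amounts to a join on IP address. The sketch below shows that enrichment on plain Python dictionaries; the field names (`ip`, `user`) and the function name are hypothetical, and it does not represent how Discover or Dashboards would implement a cross-index search.

```python
def enrich_ssh_with_okta(ssh_events, okta_events):
    """Attach the Okta users seen on each source IP to the matching SSH events.

    Both inputs are lists of dicts; the 'ip' and 'user' keys are assumed names.
    """
    users_by_ip = {}
    for event in okta_events:
        # Several users may log in from one IP, so collect them in a set.
        users_by_ip.setdefault(event["ip"], set()).add(event["user"])

    enriched = []
    for event in ssh_events:
        users = sorted(users_by_ip.get(event["ip"], ()))
        # Copy the SSH event and add the correlated users (empty if no match).
        enriched.append({**event, "okta_users": users})
    return enriched
```

SSH events from an IP that never appears in the Okta logs keep an empty `okta_users` list, so unmatched activity stays visible rather than being dropped.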

    2 votes

  11. As a security analyst,

    I want to have a unified alerting, dashboarding and search experience in my SIEM, so that our capabilities are not spread across multiple plugins with differing query languages.

    Currently, Dashboards/Discover, Security Analytics, and Observability do not offer a unified experience. Each component has a different set of upstream repositories, with seemingly little co-ordination between them on features, documentation, and bug fixes, which makes the overall experience confusing and difficult to use.

    2 votes

  12. As a security or data analyst,

    I want to be able to treat a string as another data type at search time, for example searching the string "1" as an integer, so that I can search data appropriately without having to update the mapping and reindex all data.

    2 votes

  13. As a security analyst,

    I want to utilise 'range' in visualisations without having to Edit Query as DSL,
    so that I can save time and also have people without extensive DSL knowledge create visualisations.

    2 votes

  14. As a security analyst and operator,

    I want to utilise up-to-date SIGMA rules in the OpenSearch Security Plugin, so that I can utilise current contributions from the open-source community.

    For example, at the time of writing, the Okta rules in the Security Plugin repo (main branch) have not been updated since February 2023, with 13 rules available, while the Okta rules in the SIGMA repo (master branch) were last updated in December 2023, with 21 rules available, notably including rules based on the high-profile Okta breach in 2023.

    This can be observed across many rule categories, with…

    2 votes

  15. As an architect
    I want to create right-sized clusters for my use case
    so that I can get the most value.

    Currently, all the OpenSearch clusters have a 1:4 CPU:RAM ratio. High throughput application search use cases often have small data sets and can benefit from more relative CPU than RAM or disk (e.g. 1:2 CPU:RAM ratio). Logging use cases with large volumes of data may benefit from storage optimized instances with 1:8 CPU:RAM ratio with more disk.
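
To make the ratios above concrete, here is a small arithmetic sketch. It assumes that a ratio of 1:N means one vCPU per N GB of RAM, which is an interpretation rather than something the idea states; the function name and the example vCPU count are hypothetical and do not correspond to actual Aiven plans.

```python
def ram_gb_for(vcpus, ram_per_cpu_gb):
    """RAM in GB for a node with the given vCPU count at a 1:ram_per_cpu_gb ratio.

    Assumes the CPU:RAM ratio is expressed as vCPUs to GB of RAM.
    """
    if vcpus <= 0 or ram_per_cpu_gb <= 0:
        raise ValueError("vcpus and ram_per_cpu_gb must be positive")
    return vcpus * ram_per_cpu_gb


# The three profiles mentioned in the idea, for a hypothetical 8-vCPU node.
profiles = {
    "compute-optimised (1:2)": ram_gb_for(8, 2),
    "current (1:4)": ram_gb_for(8, 4),
    "storage-optimised (1:8)": ram_gb_for(8, 8),
}
```

Under that assumption, the same 8 vCPUs would come with 16, 32, or 64 GB of RAM depending on the profile.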

    1 vote

  16. As a user of both PostgreSQL (on-prem, on another cloud provider or at Aiven) and Aiven for ClickHouse,
    I want to be able to ingest my PostgreSQL data, not as a one-time snapshot or a remote view of the data as the current integrations allow, but as tables created and stored in ClickHouse that pull updates from PostgreSQL regularly.

    2 votes

  17. As a business owner,
    I would like to have the ability to set up a replication of my Aiven for ClickHouse service, and all the object storage attached to it, to a secondary region,
    so that in the case of needing disaster recovery from the primary region going down, we could fail our workload traffic over to the secondary region.

    The replicated secondary region does not need to be writable until the primary is unavailable, but we would like to read from it to validate consistency. As a customer we'll handle the failover mechanism.

    7 votes

  18. As a developer
    I want to connect to OpenSearch using an mTLS connection
    so that I can rely on a trusted connection instead of relying only on IP filtering or other mechanisms.

    5 votes

  19. At the moment, only primary PostgreSQL instances are available for a ClickHouse connection. Wouldn't it make sense to be able to connect ClickHouse to a read replica, to prevent performance impact on the primary?

    1 vote

  20. As a security analyst,

    I want to 'reduce' the logs searched to reduce the data to common patterns, allowing me to easily see meaningful events.
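
One way to picture this kind of reduction is to mask the variable parts of each log line and count the resulting templates. The sketch below does that with two regular expressions; the placeholder tokens, function names, and the choice to mask only IPv4 addresses and numbers are assumptions, not a description of any existing OpenSearch feature.

```python
import re
from collections import Counter

# Mask IPv4 addresses first so their digits are not replaced one by one.
_IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
_NUMBER = re.compile(r"\d+")


def to_pattern(line):
    """Replace IPv4 addresses and numbers in a log line with placeholders."""
    line = _IPV4.sub("<IP>", line)
    return _NUMBER.sub("<NUM>", line)


def reduce_logs(lines):
    """Return (pattern, count) pairs, most frequent pattern first."""
    return Counter(to_pattern(line) for line in lines).most_common()
```

Rare patterns end up at the bottom of the list, which is often where the meaningful events are.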

    1 vote
