Metrics for InfluxDB nodes
As a DevOps engineer, I want to track the performance of my multiple InfluxDB instances. This will enable us to set up alerts in our systems, troubleshoot issues, identify bottlenecks, and plan for growth.
Currently, the only user-facing metrics are a few limited dashboards on the Metrics page of each instance. Navigation and time window selection are limited, there is no interactivity (no drill-down), and it is not possible to hook up any external system for alerting. We have seen that the Support team uses more advanced dashboards, though (richer, with more metrics than the ones available to end users).
The rest of our systems, and the industry in general, use standards to communicate metrics so that they can be adapted to each environment. An example is OpenCensus/OpenTelemetry, which we use in many of our systems to export metrics to Google StackDriver. Such an integration would give us a better understanding of the system, let us connect it to alerting and incident handling, and let us explore its metrics in relation to the other services we use.
-
Hi Hernan, thanks for posting this idea. We are actually in the process of implementing a new, Prometheus-compatible metrics database service to replace InfluxDB. Also, the OTel community is exploring ways to support exporting OTel metrics to Prometheus, which would further enhance the compatibility of this new service with the OTel metrics protocol.
For this reason, unfortunately, we have to park this idea for now while we focus on the new service, which will come with the same metrics and metrics integrations as other current Aiven products.
(Edited by admin)