I have a list of 1000 devices that I monitor for restarts every day. I have a restart count field in the database that is incremented whenever a device restarts.
I am using a graph to visualize this data. Every day I query the last 48 hours of data so that, if a restart occurred on any device, I can see a spike for that particular device.
The problem is that the graph shows all devices as flat, straight lines, which gives the impression that there were no restarts. That isn't really the case: when I click on a device (from the list just below the graph), the lines for the other devices vanish, only the line for that device is shown, and I can see a spike.
What I want is for any outlier or spike to be easily identifiable in the graph for any device. Currently I have no other way than to manually click each device in the list and check for a spike.
Please don't mind my English. I cannot post screenshots from my organization, and I am fairly new to Grafana and still exploring. I need a way to identify outliers/spikes in the graph or by any other means. Please keep in mind that the device list is going to grow in the future. Thanks in advance.
It would help if you provided your data source and the expected visual.
Don’t visualize the raw restart counter value; visualize its rate of change, i.e. the derivative of that value. If there is no restart, all devices will have a nice flat 0 time series (they will be “normalized”). When the counter increases, the derivative won’t be 0, and that will be your “spike”. Of course, also use proper time grouping.
The implementation of course depends on the time series DB (TSDB) you use, so check its documentation. Your TSDB may also have other useful math functions (for example, the functions available in InfluxQL) and other options, e.g. machine learning or anomaly detection (plus other big “buzzwords”), but a simple math function can do a great job here.
Also keep in mind edge cases - e.g. reset of the counter.
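To make the idea concrete outside of any particular TSDB, here is a minimal Python sketch of what "rate of change" does to a counter series. The function name restart_increases and the sample values are made up for illustration, and the reset policy (a drop in the counter means it restarted from 0, so the new value is counted as the increase) is only one possible assumption; your TSDB may handle resets differently.

```python
from typing import List


def restart_increases(counter_samples: List[int]) -> List[int]:
    """Return the increase between consecutive samples of a restart counter.

    Assumption (hypothetical policy): if the counter drops, it was reset to 0,
    so the current value is taken as the increase since the reset.
    """
    increases = []
    for previous, current in zip(counter_samples, counter_samples[1:]):
        delta = current - previous
        if delta < 0:
            # Counter went backwards: treat it as a reset, not a negative spike.
            delta = current
        increases.append(delta)
    return increases


# Illustrative samples only: one value per time bucket for three devices.
stable = restart_increases([7, 7, 7, 7])            # no restart: flat zeros
restarted = restart_increases([120, 120, 121, 121])  # one restart: a single 1
after_reset = restart_increases([50, 50, 0, 1])      # counter reset, then one restart
print(stable, restarted, after_reset)
```

Devices with very different raw counter values (7 versus 120) end up on the same flat 0 baseline, so a single restart stands out as a non-zero point regardless of how large the counter already was.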
The data source is InfluxDB.
Expected visual: a graph that clearly shows spikes for devices whose restart counter has been incremented since the previous day.
Thanks for your response.
Could you please let me know how to make the rate-of-change modification you mentioned? Please keep in mind that I should still be able to see the exact restart count increase when I hover over the graph line of a particular device. My data source is InfluxDB, if that is of any help.
I already linked the InfluxDB doc; that is your friend. Example query:
SELECT DERIVATIVE(LAST("field_restart_counter")) FROM "measurement" WHERE $timeFilter GROUP BY time($__interval), "tag_device_id"
Thank you so much. This solution reduced the manual effort to a significant extent.