Grafana data amount handling

Hi,

Is there any limit on the amount of data Grafana can handle? Likewise, is there a limit on the number of dashboards, panels per dashboard, or users Grafana can handle? If the database is connected to hundreds of sensors, is there any restriction or risk of performance degradation?

Thanks.
Deb


Hi,

There is no restriction or limit on the amount of data Grafana can handle, nor on the number of dashboards, panels, or users, in the context of an on-premise installation. You need to contact the Grafana team if you want to use Grafana Cloud.

I suggest using MariaDB or MySQL as the Grafana backend to store the configuration of dashboards, users, orgs, etc., instead of the SQLite default installation.
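A minimal sketch of the relevant grafana.ini section for switching the backend to MariaDB/MySQL; the host, database name, user, and password below are placeholders for your own setup:

```ini
; grafana.ini — [database] section (values are placeholders)
[database]
type = mysql
host = 127.0.0.1:3306
name = grafana
user = grafana
password = secret
```

After changing this, restart the Grafana service so it recreates its tables in the new database.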

I have a Grafana installation covering around 500 VMs, and Grafana handles it easily; it relies on my InfluxDB and Telegraf to collect the vCenter data.

Regards,
Fadjar Tandabawana


Hi Fadjar,

Thank you for your explanation.

I am actually using a locally installed Grafana with MariaDB as the backend DB server. It is just a test at the moment. In practice, though, the likely scenario is collecting data from hundreds of devices through sensors, visualizing it in real time, and analyzing the past data as well.

Though Grafana is described as a visualization and analytics tool, I think it is basically for visualization, isn't it? Is there any actual analysis mechanism there, in case I am missing something? I am quite new to Grafana.

Thanks and regards,
Deb

Hi…

What kind of analysis do you want?
There are features for derivatives, moving averages, percentiles, standard deviation, cumulative sums, and some statistical models…

You need to read the complete documentation to make all these capabilities fit your needs.
One area that still needs improvement is machine learning: functions such as forecasting, anomaly detection, and prediction are still missing. I think these features are still in development… :grin:
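As an illustration, a statistic like a moving average can also be pushed down to the backend database instead of being computed in the panel; a sketch of a 7-point moving average, assuming MySQL 8+ window functions and a hypothetical sensor table:

```sql
-- 7-point moving average of a sensor reading, computed in the database
-- (table and column names are hypothetical; requires MySQL 8+ window functions)
SELECT
  datetime AS "time",
  AVG(temperature) OVER (
    ORDER BY datetime
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS temperature_ma7
FROM measurement
ORDER BY datetime;
```

Computing in the database keeps the heavy lifting off the browser, at the cost of a slightly more complex query.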

Regards,
Fadjar Tandabawana

Hi,

Thanks for your quick response.

Those analysis features are fine. I understand that the AI features are unavailable; it would be great if they were added in the future, though. At present that has to be done in the backend, I suppose.

So these (non-AI) analysis features can be used in the frontend? Can they be run on user request, or must panels be prepared with them in advance? Frontend analysis may take time if the data volume is big and the calculation is complicated. What is your opinion?

Are there any docs or links where I can get a little more detail on how to use them?

Thanks and regards,
Deb

You can read about the calculation features in this:

Regards,
Fadjar Tandabawana

Thank you!

Regards,
Deb

Hi,

Would like to ask a related question.

I understand there is no restriction on data volume in Grafana if it is installed locally. But what is the response time with huge datasets, say 100,000 rows or more? Also, if further calculation is required (average, moving average), will it incur a delay?

Regards,
Deb

It depends on several things:

  1. The database Grafana queries, including the disk type
  2. The time interval you select in the Grafana visualization
  3. The network from Grafana to the client, because Grafana's frontend is TypeScript and mostly runs on the client's local machine

I suggest trying several scenarios with different time intervals…

Regards,
Fadjar Tandabawana

Hi,

Thank you for your prompt reply. I have used 50,000 rows (pseudo data). I will try different scenarios as you suggested.

Regards,
Deb

Hi,

I have tested with increasing data sizes. As expected there is some delay in loading the data for display, though not much: say 5 to 6 seconds. One thing I have noticed is that it seems to fetch all the data from the database and then show only what is set in the panel (time range/query). Even after increasing it, the data size is still in the MB range. But if it were in the GB range and Grafana tried to fetch the entire dataset before display (even when asked to show only the current month/year), it would stall the system.
Is my understanding correct, or am I missing something?

InfluxDB seems to have retention policies, and aggregation and compaction appear to be possible. Also, InfluxDB's response time might be faster for time-series data (though I am not sure).
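For reference, in InfluxDB 1.x a retention policy is created with a statement like the following (the database and policy names here are hypothetical):

```sql
-- Keep raw sensor data for 52 weeks, then let InfluxDB expire it automatically
CREATE RETENTION POLICY "one_year" ON "sensors" DURATION 52w REPLICATION 1 DEFAULT
```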

Any inputs in this regard (handling huge data) would be very helpful.

Regards,
Deb

Not sure where the problem is…
But I can give you some clues for finding the bottleneck.

Use top or htop on the Linux box that hosts Grafana and InfluxDB while you run the biggest query you want to show.
Check how the CPU and memory behave, and also the disk I/O, to verify the response time of the computation machine.
On Linux, you can also run node_exporter and send the data to Prometheus to check the server's performance while the biggest query is running.
This procedure tells you whether the capacity is suitable for the report at the interval you need.

If everything is fine, then check the communication channel to your local desktop, and also check the CPU and memory on your local desktop.
If the query result is big, another option is to optimize your query's time interval: set the date histogram's time interval, or set Min Interval under Query Options; I suggest using the interval variable $__interval.

Regards,
Fadjar Tandabawana

Hi,

Thank you very much for the detail explanation and suggestions.

Looking at Query Inspector → Stats, which gives the processing time and the total time: if the processing time covers fetching the data from the database with the query, then the main cost looks like transmission delay, because the processing time is on the order of 0.1 to 0.2 s.

I have tested locally with a mysql command-line query collecting the same data on the server side. The time is less than 2 seconds. So if the total time is 3–4 s, the difference could be network delay.

I read about $__interval. This auto-calculated time interval looks very useful.

Here I am using query with variables in Grafana, as follows:

SELECT
  datetime AS "time",
  temperature
FROM measurement
WHERE
  batt_id = $BATT_ID AND
  cell_id = $CELL_ID
ORDER BY datetime

'measurement' is the table name, and 'temperature', 'batt_id', and 'cell_id' are fields; the query collects the 'temperature' values for the batt_id/cell_id pair selected from the drop-down menus.

I am not sure where I should use '$__interval' here. Should it be set while building the query in the UI, i.e., selecting from the query builder instead of writing the query by hand?

Thanks again.

Regards,
Deb

For the variable $__interval, you don't need to include it in the query. Just go to Query Options, fill Min Interval with $__interval, and it will follow the interval you select in the UI. It is very useful for visualizing the result at the selected time interval.
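For a MySQL/MariaDB data source there is also the option of letting Grafana rewrite the query itself via its macros; a sketch of the earlier temperature query, time-filtered and bucketed per interval ($__timeFilter and $__timeGroup are macros of Grafana's MySQL data source; table and column names follow the query above):

```sql
-- Fetch only rows inside the dashboard's time range ($__timeFilter),
-- and average them into one point per $__interval bucket ($__timeGroup)
SELECT
  $__timeGroup(datetime, $__interval) AS "time",
  AVG(temperature) AS temperature
FROM measurement
WHERE
  $__timeFilter(datetime) AND
  batt_id = $BATT_ID AND
  cell_id = $CELL_ID
GROUP BY 1
ORDER BY 1
```

This way the database returns at most one row per bucket, which keeps the payload small even for large raw tables.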

Regards,
Fadjar Tandabawana

Thank you.

I actually thought so before. But if I set Min interval to $__interval and keep the other fields blank, nothing shows in the graph. If I put $__interval together with values in Max data points and/or Relative time, still no data shows in the graph. There is no error, but no graph either. I am a little confused: what mistake am I making?

I thought that if I do not specify any Relative time (time range) in the Query options, it should take the dashboard time range. It does so as long as I don't put $__interval in Min interval.

Regards,
Deb