What happened?
I am using Netdata as a data source for Grafana to monitor system uptime through InfluxDB. However, the Grafana graph intermittently shows a drop to zero (as if the system rebooted) at 2024-12-27 09:30:00, even though the system was not actually rebooted at that time.
My Linux system logs show that the machine was actually rebooted on Fri, Dec 27 at 12:36, but Grafana shows no uptime data around that time.
This suggests a time mismatch between the actual system uptime and what is displayed in Grafana.
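To put a number on the mismatch, here is a minimal Python sketch that computes the gap between the drop shown in Grafana (09:30) and the reboot recorded in the logs (12:36), and checks whether that gap could be a plain timezone offset. The function names are hypothetical, and the check relies on the assumption that UTC offsets in current use are whole multiples of 15 minutes. It is only a rough heuristic, since the times I read off the graph may be approximate.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

def offset_minutes(grafana_time: str, log_time: str) -> int:
    """Minutes by which the log timestamp is ahead of the Grafana timestamp."""
    delta = datetime.strptime(log_time, FMT) - datetime.strptime(grafana_time, FMT)
    return int(delta.total_seconds() // 60)

def looks_like_timezone_offset(minutes: int) -> bool:
    # Heuristic (assumption): a display-timezone mismatch would shift
    # timestamps by a non-zero multiple of 15 minutes.
    return minutes != 0 and minutes % 15 == 0

drop_in_grafana = "2024-12-27 09:30"   # drop to zero shown in Grafana
reboot_in_logs = "2024-12-27 12:36"    # reboot time from the Linux logs

minutes = offset_minutes(drop_in_grafana, reboot_in_logs)
print(minutes, looks_like_timezone_offset(minutes))  # prints: 186 False
```

A gap of 186 minutes is close to three hours but not exactly, so a display-timezone difference might explain part of the mismatch without accounting for all of it.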
What did you expect to happen?
I expected the uptime graph to increase continuously unless an actual reboot occurs.
The displayed uptime should match the system’s actual uptime without false resets or missing data.
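This expectation can be written as a simple check: between reboots, the sample time minus the reported uptime (the inferred boot time) should stay constant. Below is a minimal Python sketch of that idea. The function name, the sample values, and the 60-second tolerance are my own assumptions for illustration. The tolerance is there because each 1-minute bucket is timestamped at its start while `max("value")` comes from somewhere inside the bucket.

```python
from typing import List, Optional, Tuple

# (bucket time as Unix seconds, uptime in seconds, or None for an empty bucket)
Sample = Tuple[int, Optional[float]]

def inferred_boot_times(samples: List[Sample], tolerance: int = 60) -> List[int]:
    """Return the distinct boot times implied by the samples, in order.

    Empty buckets (None) are skipped, so a gap in the data alone
    never counts as a reboot.
    """
    boots: List[int] = []
    for ts, uptime in samples:
        if uptime is None:
            continue
        boot = round(ts - uptime)
        # A new boot time is recorded only if it differs from the last
        # one by more than the tolerance.
        if not boots or abs(boot - boots[-1]) > tolerance:
            boots.append(boot)
    return boots

# Made-up samples for illustration only.
no_reboot = [(1000, 400.0), (1060, 460.0), (1120, None), (1180, 580.0)]
with_reboot = no_reboot + [(1240, 10.0)]

print(inferred_boot_times(no_reboot))    # prints: [600]
print(inferred_boot_times(with_reboot))  # prints: [600, 1230]
```

If the data exported from InfluxDB yields a single inferred boot time across the period where Grafana shows a drop, the drop is more likely a gap in the data than a real reboot.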
Did this work before?
I am unsure if this issue existed before, but I recently noticed the mismatch while monitoring uptime in Grafana.
How do we reproduce it?
- Configure Data Sources:
  - Set up Netdata as a data source in Grafana.
  - Store uptime metrics in InfluxDB.
  - Use the following query to fetch uptime data:
    SELECT max("value") FROM "netdata.system.uptime.uptime" WHERE time >= now() - 6h GROUP BY time(1m) fill(null)
- JSON Configuration for Grafana Uptime Panel:
{
  "type": "graph",
  "title": "Uptime",
  "gridPos": {
    "x": 0,
    "y": 0,
    "w": 12,
    "h": 7
  },
  "id": 1,
  "fieldConfig": {
    "defaults": {
      "custom": {}
    },
    "overrides": []
  },
  "pluginVersion": "7.3.6",
  "datasource": "NetData",
  "targets": [
    {
      "alias": "$3",
      "dsType": "influxdb",
      "groupBy": [
        {
          "params": [
            "$__interval"
          ],
          "type": "time"
        },
        {
          "params": [
            "null"
          ],
          "type": "fill"
        }
      ],
      "measurement": "netdata.system.uptime.uptime",
      "orderByTime": "ASC",
      "policy": "default",
      "refId": "A",
      "resultFormat": "time_series",
      "select": [
        [
          {
            "params": [
              "value"
            ],
            "type": "field"
          },
          {
            "params": [],
            "type": "max"
          }
        ]
      ],
      "tags": []
    }
  ],
  "options": {
    "alertThreshold": true
  },
  "renderer": "flot",
  "yaxes": [
    {
      "decimals": 1,
      "format": "s",
      "label": "",
      "logBase": 10,
      "max": null,
      "min": "1",
      "show": true
    },
    {
      "format": "short",
      "label": null,
      "logBase": 1,
      "max": null,
      "min": null,
      "show": false
    }
  ],
  "xaxis": {
    "buckets": null,
    "mode": "time",
    "name": null,
    "show": true,
    "values": []
  },
  "yaxis": {
    "align": false,
    "alignLevel": null
  },
  "lines": true,
  "fill": 3,
  "linewidth": 2,
  "dashLength": 10,
  "spaceLength": 10,
  "pointradius": 5,
  "legend": {
    "alignAsTable": true,
    "avg": true,
    "current": true,
    "max": true,
    "min": true,
    "rightSide": false,
    "show": true,
    "total": false,
    "values": true
  },
  "nullPointMode": "null",
  "tooltip": {
    "shared": true,
    "sort": 0,
    "value_type": "individual"
  },
  "aliasColors": {},
  "seriesOverrides": [],
  "thresholds": [],
  "timeRegions": [],
  "bars": false,
  "dashes": false,
  "height": "",
  "links": [],
  "percentage": false,
  "points": false,
  "stack": false,
  "steppedLine": false,
  "timeFrom": null,
  "timeShift": null,
  "fillGradient": 0,
  "hiddenSeries": false
}
- Simulate a Network Disconnection:
  - On a Debian system, disconnect the internet connection.
  - Example scenario:
    - At 12:10 on 2024-12-27, the network connection was lost.
    - The system remained powered on, but Grafana falsely indicated a reboot.
- Observe Grafana Behavior:
  - Despite the system being powered on, the Grafana uptime graph drops to zero at 09:30.
  - The actual reboot, according to Linux logs, occurred at 12:36, but Grafana shows no uptime data around that time.
- Perform an Actual Power Cycle:
  - At 12:40, the power cable was unplugged and plugged back in, causing a real reboot.
  - After this, uptime data was restored correctly.
Request for Help:
- Could this issue be due to how Netdata reports uptime, or is it related to InfluxDB aggregation?
- Are there any recommended configurations in Netdata or Grafana to ensure uptime is continuously tracked without false reboot indications?
- Has anyone faced a similar issue, and if so, what was the resolution?
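To help narrow down the first question, here is a minimal Python sketch of how I would try to separate the two cases in the query result: buckets with no data (which `fill(null)` returns as null, for example while the network is down) versus buckets where uptime actually went down (a real reset). The function name and the sample values are hypothetical, and the input is assumed to be the query result exported as (timestamp, value) pairs.

```python
from typing import List, Optional, Tuple

# (bucket start as Unix seconds, max uptime in seconds, or None for an empty bucket)
Point = Tuple[int, Optional[float]]

def classify_points(points: List[Point]) -> List[Tuple[int, str]]:
    """Label notable buckets as 'gap' (no data) or 'reset' (uptime decreased)."""
    events: List[Tuple[int, str]] = []
    last_seen: Optional[float] = None  # last non-null uptime value
    for ts, uptime in points:
        if uptime is None:
            events.append((ts, "gap"))
            continue
        # Uptime only decreases when the counter restarts, i.e. after a reboot.
        if last_seen is not None and uptime < last_seen:
            events.append((ts, "reset"))
        last_seen = uptime
    return events

# Made-up values for illustration only, not taken from my database.
points = [(0, 600.0), (60, 660.0), (120, None), (180, None), (240, 840.0), (300, 20.0)]
print(classify_points(points))
# prints: [(120, 'gap'), (180, 'gap'), (300, 'reset')]
```

If the drop at 09:30 turns out to be a run of gaps rather than a reset, that would point to missing samples or to how the panel renders them, rather than to Netdata reporting a wrong uptime.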
Grafana platform?
A package manager (APT, YUM, BREW, etc.)