Hello,
I’ve built my first Grafana server, with InfluxDB, Telegraf, Prometheus, Node Exporter and Blackbox and have imported our VMware estate.
I have now been asked to create a simple dashboard listing all servers in something like a heatmap or grid, showing each server as up or down: green for up, red for down, and amber for packet loss.
I guess I need to ping these servers, but how would I create this grid?
Thanks
After a lot of experimenting with different panels I now do this with the Bar gauge panel. It’s not really designed for this, but with some particular settings and the right shape panel (tall and thin) it looks good.
So for example, with Prometheus as a data source using the blackbox exporter (mostly using the tcp prober), my query to show all devices for a site might be an Instant query of: probe_success{site_name="Site 1"}
Then I put a couple of my labels as the Legend: {{name_short}} {{icon}}
Orientation: Horizontal
Display mode: Basic
Show unfilled area: Enabled
Min: 0
Max: 1
Threshold: Green 1
Threshold: Red Base
Value mapping: 0 DOWN
Value mapping: 1 UP
Then with a narrow dashboard panel you get a small green bar for everything up, an empty bar with a red line for everything down, the device name to the left of the bar and UP (in green) or DOWN (in red) to the right of the bar.
Thanks, I will give this a go when I’m back at work. This will hopefully solve a headache I’ve had trying to get this to work.
I will let you know how I get on.
Polystat would probably be a good plugin to display this.
Hi, don’t suppose you have a screenshot of this dashboard?
And a sample of your blackbox config?
Thanks!
@g0nz0uk Here is an image showing 3 panels with the text pixelised. If something was down I get an empty grey bar with a red line at the far left of it and a red “DOWN” text.
The icons are just emoji pasted into the blackbox config file. The part of blackbox.yml for tcp probe is just this:
modules:
  tcp_connect:
    prober: tcp
And my prometheus.yml targets under the tcp_connect probe are like this. In this example it’s an NVR which listens on port 8000:
- targets: ['192.168.0.100:8000']
  labels:
    aaa: '1.0'
    name: 'Network Video Recorder 1'
    name_short: 'NVR 1'
    type: 'NVR'
    icon: '🖥️ '
    location: 'Site 1'
    location_short: 'S1'
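For anyone wiring this up from scratch, a targets entry like the one above normally sits inside a blackbox scrape job. Here is a minimal sketch of such a job, not my exact config: the job name is made up, and it assumes the blackbox exporter runs on the same host on its default port 9115. The relabel rules follow the usual pattern from the blackbox exporter README.

```yaml
scrape_configs:
  - job_name: 'blackbox_tcp'          # hypothetical job name
    metrics_path: /probe
    params:
      module: [tcp_connect]           # module defined in blackbox.yml
    static_configs:
      - targets: ['192.168.0.100:8000']
        labels:
          name_short: 'NVR 1'
          location: 'Site 1'
    relabel_configs:
      # Pass the original target address to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed address as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the blackbox exporter itself (assumed address)
      - target_label: __address__
        replacement: 127.0.0.1:9115
```

Whatever labels you attach here are the ones you can filter on in the panel query, so with the labels in this sketch the query would be something like probe_success{location="Site 1"}.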
This is impressive, I didn’t know you could add jpegs and get them to change based on a situation, so you have 2 jpegs (up and down)?
I was hoping to just ping the servers with ICMP and show them as down if they don’t respond.
So you base yours on listening ports like TCP 8000?
@g0nz0uk no, the icon/Emoji is static, just to indicate the device type e.g. NVR, Camera, Routers, Website etc.
All I care about is a basic “UP” or “DOWN”, or more accurately “can I see the device on the network”, nothing else. So I use the TCP probe for most of my checks, either on the web ports 80 or 443 if they have a web interface, or on a particular port the device uses.
And for the few web sites I monitor I use the http_2xx probe with a “fail_if_not_matches_regexp:” definition so if the site does not return the page I expect it will return as a failure.
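As a rough sketch of what such a module can look like in blackbox.yml: the regular expression below is a placeholder, not my real one, and the option name is an assumption about your exporter version. Recent blackbox exporter releases document it as fail_if_body_not_matches_regexp, so check the name against the version you run.

```yaml
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      # Probe fails unless the response body matches every pattern listed
      fail_if_body_not_matches_regexp:
        - 'Welcome to Example Site'   # placeholder pattern
```

With no valid_status_codes set, the HTTP prober accepts any 2xx response, so the probe succeeds only when the status is 2xx and the body matches.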
I have not used the icmp probe so I can’t comment on that.
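For reference only, since I have not run it myself, an ICMP module in blackbox.yml would look roughly like the sketch below. The timeout and the IPv4 preference are assumptions to adjust for your network, and the exporter generally needs permission to open raw sockets (for example root or CAP_NET_RAW on Linux) for ICMP probes to work.

```yaml
modules:
  icmp:
    prober: icmp
    timeout: 5s                       # assumed value
    icmp:
      preferred_ip_protocol: ip4      # assumed; omit to use the default
```

The scrape job would then pass module: [icmp] and list bare hostnames or IP addresses as targets, with no port. The same probe_success query and Bar gauge settings should apply unchanged.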
I’ve been looking for a way to replicate the simple up or down dashboard that almost all monitoring tools have by default. I considered using inputs.ping or outputs.health in some way, then it occurred to me to just check for a null value from one of the standard metrics.
As Linux and Windows don’t appear to share standard fields, I’ve created two rows and two panels, one for each OS.
I’m using this as my Linux query, then using the standard options to change the result to boolean on/off, with a value mapping override to show null results as down in red.
Linux
SELECT last(used) FROM "mem" WHERE host =~ /$x_server$/
Windows
SELECT mean("Available_Bytes") FROM "win_mem" WHERE host =~ /$server$/
I’m sure there’s a cleaner way to achieve this result. Hopefully when I have some time I’ll refine this dashboard.
Dashboard ID is 19357