I have a panel that shows a stacked line chart of the number of users on multiple servers, using Graphite as the data source. Because the number of servers can vary, the query I use is similar to:
mything *server-* users where the server names are like server-01, etc. The graph works fine and shows exactly what I want to show.
What I’m trying to figure out is how to create an alert that fires when the total number of users across all of those servers is greater than a specified value. A simple version of the alert, something like when avg() of query(A,15m,now) is above 100, doesn’t do this: clicking Test Rule shows me that it’s testing that value against each of the “n” servers separately. I can’t just divide the threshold by the number of servers and alert per server, because A) I don’t know how many servers there will be, and B) it’s OK if the load isn’t evenly balanced, as long as the total isn’t over the alert threshold. I tried using the sum() function, but that just adds up the values within each series rather than across the series. (The max threshold won’t change often; I can deal with that manually when it happens.)
I found that I can make the alert work if I change the query to do a groupByNode(), but then the graph no longer shows the individual per-server values.
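To make the difference concrete, here is a rough Python sketch of the two behaviours as I understand them. The numbers are made up, and the names (series, THRESHOLD, per_series_alerts, aggregate_alert) are hypothetical; this only illustrates the logic and is not how Grafana actually implements alert evaluation.

```python
from statistics import mean

# Hypothetical sample data: users per server over the alert window (made-up numbers).
series = {
    "server-01": [40, 45, 50],
    "server-02": [30, 35, 40],
    "server-03": [20, 25, 30],
}
THRESHOLD = 100  # the aggregate limit I care about


def per_series_alerts(series, threshold):
    """What Test Rule appears to do: check avg() of each series on its own."""
    return {name: mean(values) > threshold for name, values in series.items()}


def aggregate_alert(series, threshold):
    """What I want: sum the servers at each timestamp, then check avg() of that single series."""
    totals = [sum(point) for point in zip(*series.values())]
    return mean(totals) > threshold


print(per_series_alerts(series, THRESHOLD))
print(aggregate_alert(series, THRESHOLD))
```

With this sample data no individual server is over 100, so the per-series check never fires, while the summed series averages over 100 and should trigger the alert.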
Is there a way to alert on the aggregate while still showing each server individually on the graph?