Scheduled/automated annotations

Hi All!

I really hope I’m in the right place, as I’m new to the forum. Sorry if I’ve made any mistakes.

We are using Grafana v8.4.7 (23bf3ef043) (because of scheduled upgrades) for our server monitoring through Nagios. We have used it for a long time and usually love it, but more and more often I have one ‘question’ to which I have not yet found an exact answer.

Is it possible to make scheduled/recurring annotations for time periods with known special processes? I’ll try to explain it with some simple examples.

We have backup operations and virus scanning operations (and lots of others). These operations are recurring processes that put a smaller or larger extra load, memory consumption, etc., on the corresponding environments. For daily operational analytics and process optimization, it would be a very big help if the graphs could have automatic/scheduled annotations showing that a special but known operation was running in the investigated time period. For the two examples mentioned, it would be very nice if we could always see our backup times, scanning times, etc. (there could be many more similar operations).

There could be two theoretical answers to the question:

  1. Simple time-based (scheduled) annotations.
    • I don’t think I can explain it further, as it is really trivial.
  2. Data-based automated annotations.
    • For example, if we record such processes in a special DB/table, we could read them out and make automated annotations based on them. In theory this is a very nice and sophisticated solution, but far less reliable than the simple time-based one, as it depends not on Grafana itself but on the corresponding servers; if they have any problems, the corresponding data could be corrupted.
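
To make the data-based variant more concrete, here is a minimal sketch of reading recorded runs from a table and shaping them as annotation-like records. The table name `maintenance_runs`, its columns, the tag names and the in-memory SQLite database are all assumptions for illustration; the `time`/`timeEnd`/`tags`/`text` field names follow the shape used by Grafana's annotation HTTP API, with times in epoch milliseconds.

```python
import sqlite3
from datetime import datetime, timezone


def to_epoch_ms(iso_text):
    """Convert an ISO 8601 string (assumed to be stored as UTC) to epoch milliseconds."""
    moment = datetime.fromisoformat(iso_text).replace(tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


def load_annotations(connection, since_iso):
    """Read runs from the hypothetical table and shape them as region annotations."""
    rows = connection.execute(
        "SELECT job_name, started_at, finished_at FROM maintenance_runs "
        "WHERE started_at >= ? ORDER BY started_at",
        (since_iso,),
    )
    return [
        {
            "time": to_epoch_ms(started),
            "timeEnd": to_epoch_ms(finished),
            "tags": ["scheduled", job],
            "text": f"{job} run",
        }
        for job, started, finished in rows
    ]


# In-memory demo data standing in for the real process log (hypothetical values).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE maintenance_runs (job_name TEXT, started_at TEXT, finished_at TEXT)"
)
demo.execute(
    "INSERT INTO maintenance_runs VALUES "
    "('backup', '2024-01-01T02:00:00', '2024-01-01T02:45:00')"
)
annotations = load_annotations(demo, "2024-01-01T00:00:00")
```

If the table lives in a database that Grafana can already query, an annotation query on that data source may make the script unnecessary; the sketch only shows the shape of the data.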

Is there any solution for these ‘suggestions’?

Many thanks in advance if anybody can help us with this topic!

Welcome

Are you sure you want to implement such a thing? Have you thought through the possible cry-wolf effect of this approach?

Hi!

Thanks for the fast response. It’s just a question for now, nothing more.

Could you please explain the dark side of the thing to me? I’m definitely not talking about alerts but about annotations, which should not alert us. Basically they are just comments on the graphs that we continuously analyze to optimize our operational background. At first sight I do not see a negative effect of properly ‘commented’ graphs. I could add such comments manually to indicate some special background, but for cyclical events manual commenting is of course not the way to go.

As this is just a ‘question’, I would be really happy if you could point out the problem with the ‘concept’.

Thanks for your time!

So other real events that need your attention, unrelated to backups etc., could be obscured by these auto-generated annotations.

But that said, maybe you can use a scripting language such as Python and leverage the annotations API to generate these on a cron schedule.
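
A minimal sketch of that cron-plus-API approach might look like the following. It assumes Grafana's HTTP API endpoint `POST /api/annotations` with `time` and `timeEnd` in epoch milliseconds; the URL, the token, the 02:00 UTC schedule, the 45-minute duration and the tag names are placeholders, not anything taken from this thread.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

GRAFANA_URL = "https://grafana.example.com"  # assumption: placeholder URL
API_TOKEN = "REPLACE_ME"  # assumption: an API key allowed to create annotations


def build_annotation(start, duration_minutes, text, tags):
    """Return the JSON body for a region annotation (times in epoch milliseconds)."""
    end = start + timedelta(minutes=duration_minutes)
    return {
        "time": int(start.timestamp() * 1000),
        "timeEnd": int(end.timestamp() * 1000),
        "tags": list(tags),
        "text": text,
    }


def post_annotation(payload):
    """POST the payload to /api/annotations and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{GRAFANA_URL}/api/annotations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def main():
    """Annotate today's (hypothetical) nightly backup window; meant to be run from cron."""
    start = datetime.now(timezone.utc).replace(hour=2, minute=0, second=0, microsecond=0)
    payload = build_annotation(start, 45, "Nightly backup", ["backup", "scheduled"])
    print(post_annotation(payload))
```

Because the payload names no dashboard, the annotation is not tied to a single panel; a dashboard would then need an annotation query that filters on the `backup` or `scheduled` tag, which can also be toggled off to avoid clutter.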

Hello, why don’t you use a simple table, which could be static or dynamic, to
display your special prerequisite if there is one?
Why go through annotations?

I don’t think this is a real problem.

  1. Annotations can be displayed differently from alerts.
  2. Annotations can be hidden with a ‘switch’.
  3. Annotations do not send notifications, while alerts do.

Basically, annotations could be a good fit for exactly such operations, and alerts can be commented with them too. I think one of the main goals of annotations is commenting on graphs.

Anyway, thanks for your opinion; I’m interested in everybody’s opinion. I have done a great many analyses in the past, for example in Google Analytics, where a similar annotation feature was very helpful for exactly the same thing. Of course, in that context there was no need for automatic comments, as all of our developments and their results were commented into GA manually by the related management team.


But about your second thought, the Annotation API suggestion is very nice! Thank you for it! If there is no other way, we will definitely use it! Thanks for the ‘idea’!


Hi!

Sadly, that is not quite an efficient solution, which is why I raised the topic of annotations. I do not know how often you have done deep analysis in analytic systems, but if you have, I’m sure you have met many situations with lots of parameters around a single investigation point. In those cases we have to compare a lot of data, many parallel graphs, etc., where even minutes or seconds can count. Searching for a bug or for an inefficient process can be very hard and requires a lot of attention. In those cases it would be very helpful to be easily reminded of things that everybody knows under normal circumstances but that can be overlooked in a deep investigation because of all the parallel data. We are real people, sadly with our mistakes, and such analyses basically look like big data analysis. I really like to investigate and analyze, but sadly there can be circumstances where I simply cannot see the forest for the trees.

If the task is simple, such situations do not arise, but when the situation is sprawling, it’s a bit hard to take every possible aspect into account.

In these situations it would be very helpful if we could see a direct comment on the investigated time period/time frame.

Sadly, in such moments a separate table with the relevant data is far from efficient; I’m sure you understand. This is why many analysts try to combine several different graphs to extract summarized results.


Hello, I think this is a really interesting discussion.
Let me share my approach to our analytics and resolution method:
We have a database with thousands of computation batches every night, parallelized and clustered by criticality level.

My approach to improving our analytical diagnostics was based on two major ideas.

  • 1st, meta structure of data presentation:

From large to specific, with multiple dashboards linked together.
The first dashboard shows batch-level information (each level contains some tasks, each task some calculations).


I use a Gantt panel to display batches and a Gauge to display and compute key indicators of domain activity (based on service engagement) and technical indicators.
You can click on a Gantt task to be redirected to the same dashboard, but one level of detail deeper.
And again to see every calculation…

  • 2nd, I made a table panel to link every board level to our ticketing software.
Users can click to automatically open a new ticket and report it with the correct criticality, a quick screenshot of the key indicators and the technical error if one exists, and that’s it.

Finally, the big-picture analysis is done by other people, who have a board that takes into account all ticket criticality and technical failures. They can organize recovery actions by opening new internal tickets…

Hope this gives you some ideas.
If you are interested, I can provide blurred screenshots of my prod config tomorrow. This screenshot is an example.
@tamastoth.ebola


Hi @alexandrearmand! Sorry, your answer somehow escaped my attention, and some days later I paddled into other waters. But now I have noticed my absence.

Thanks for the detailed idea, which at first seemed a bit different from my planned use case, but on a second look I noticed its potential.

Let me describe a very simple sample case:

We have simple ‘load’ graphs/panels for each of our (usually ‘web’) servers. Nothing special, just the basics. There are lots of other panels with various data, integrated or standalone, but all of them show only characteristics of the server and its services, not our very own and in some cases scheduled activity. I mean I can nicely see all the basic services, such as SQL, Apache, etc., but not the specially, ‘manually’ handled processes like backup, virus scanning, etc. These are basically not the strictly business-required services that we already have on our dashboards, but ‘special solutions’ only for the operational background. Of course we could write scripts for all of them to expose various characteristics, but as high-level operations are not our main business, we have no time, capacity, or staff for it. The idea of scheduled annotations came to me for those reasons. If I could simply configure them, I would not have to ask my developers to refine all of our processes to the required level just for the sake of monitoring.

But what you showed me is a bit ‘thought-provoking’. I really like your Gantt charts, and it flashed into my mind that with correct value mapping and almost just one very simple script we could ‘log’ all of our special, scheduled cyclical processes ‘into’ one normal graph, where the levels would represent the corresponding services, which the Grafana config could translate to real ‘service names’, and which would show their time course like your Gantt charts. It could be really simple with ‘running’/‘not running’ states, where the running state has a different level for each service.
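
As a rough sketch of that ‘one script, one graph’ idea: each service gets its own numeric level while it is running, and 0 means nothing is running. The service names, level numbers, time windows and 5-minute step below are hypothetical, and how the samples reach a data source is left out.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping: one level per service; 0 means "nothing running".
# A Grafana value mapping could translate 1 -> "backup", 2 -> "virus_scan".
SERVICE_LEVELS = {"backup": 1, "virus_scan": 2}


def level_at(moment, windows):
    """Return the level of the first service whose window contains `moment`, else 0."""
    for service, start, end in windows:
        if start <= moment < end:
            return SERVICE_LEVELS[service]
    return 0


def sample_levels(windows, start, end, step_minutes=5):
    """Produce (timestamp, level) samples that a graph panel could plot."""
    samples = []
    moment = start
    while moment < end:
        samples.append((moment, level_at(moment, windows)))
        moment += timedelta(minutes=step_minutes)
    return samples


# Hypothetical schedule: backup 02:00-02:45, virus scan 03:00-04:00 (UTC).
day = datetime(2024, 1, 1, tzinfo=timezone.utc)
windows = [
    ("backup", day.replace(hour=2), day.replace(hour=2, minute=45)),
    ("virus_scan", day.replace(hour=3), day.replace(hour=4)),
]
samples = sample_levels(windows, day.replace(hour=1, minute=55), day.replace(hour=3, minute=10))
```

Overlapping windows would need a different encoding (for example one series per service), since a single level can only name one running service at a time.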

If I just want to check what could be the reason for a spike in the load, I could compare all the already available service data with the mentioned ‘operational tasks’ graph.

It is really simple and easy; not only my developers but even I could prepare its background.

I still have to compare two or more graphs, as I do currently, which would be easier with ‘integrated’ annotations, but it is not so bad, so basically acceptable.

Thanks for your answer, which guided me in a very simple but acceptable direction!

Hello,
I’m happy it inspired you! The tree architecture (from large to specific) is the best.

Just to take the idea one step further: I made a Python script to build a profile of the data over time.
This way you can automatically detect emergent issues never seen before, precisely define the difference from the normal profile, and take action, alert, or recover.
Mine is basic (I made a prototype using TSkmeans), but you can go far in automation with this.

I also use a simple regression script to detect issues in database task time before they appear: because the calculations are made in the database, task time can grow indefinitely for some reason.
This way you can send an alert if the task time is going to be too high.
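
The script itself is not shown here, so the following is only a sketch of the general technique, assuming an ordinary least-squares line fitted to recent task durations and extrapolated to a threshold. The function names, the sample durations and the 30-minute threshold are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def runs_until_threshold(durations, threshold):
    """Estimate how many more runs until the fitted trend reaches `threshold`.

    Returns None when the trend is flat or decreasing.
    """
    xs = list(range(len(durations)))
    slope, intercept = fit_line(xs, durations)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - xs[-1])


# Hypothetical nightly task durations in minutes, growing by 2 minutes per run.
durations = [10, 12, 14, 16, 18]
remaining = runs_until_threshold(durations, threshold=30)
```

With these made-up numbers the fitted trend reaches 30 minutes six runs after the last measurement, so an alert could fire well before the task actually becomes too slow.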

I have to say that some close competitors of Grafana have already put this at the core of their application.


I will not advertise them, but this is interesting, and I think the reward-to-cost ratio is really high for this kind of tech and for the future of dashboarding apps. But this is speculation :)

@yosiasz maybe you will find this interesting too
