Our observability stack for Go with OpenTelemetry:
- Grafana
- Alloy
- Prometheus (Mimir)
We had memory issues with cumulative temporality, so we moved to delta temporality with the deltatocumulative processor (to handle counter reset problems). This works well for metrics that are emitted continuously.
But for sparse services, we are seeing missing metrics. This is what I see in the graph:
- Service emits metric at T1
- Alloy pushes to Prometheus using remote_write a few seconds after T1
- Alloy keeps pushing the same metric as cumulative to Prometheus until T1 + 5 min
- No more pushes from Alloy after T1 + 5 min
- Service emits metric at T2
- The process repeats.
- Between T1 and T2, I don’t see any metrics in Prometheus. Since this is a cumulative metric, I expected it to keep being sent after T1 + 5 min as well.
I updated the max_stale config from its default of 5m, but I don’t think this is the correct config. Other configs, like gc_frequency from the Prometheus exporter, also won’t work, I guess.
Is there any other configuration I can look at so that these metrics are not dropped? Even a pointer to the relevant code would be helpful.
Thanks!