Gathering metrics and logs at the same time

obiwan · October 25, 2024, 4:46pm

For many years I have been running Promtail to gather logs and Prometheus to gather metrics. In any given cluster I need to run Promtail on each node so that it can access the filesystem and gather data from the log files.

Conversely, in any given cluster I only need to run one instance of Prometheus, for these reasons:

Prometheus can easily access the HTTP endpoints of all the services running in a cluster.
Multiple instances of Prometheus scraping the same targets means needing to do deduplication downstream.

Moving to Alloy means that I could collect metrics and logs with one tool instead of two. However, if I configure Alloy to run on every node, using the same configuration as before, I will have to deal with deduplication. One option is to have Alloy only gather metrics from the services running on the same node, instead of from all services in the cluster.
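
To make that last option concrete, here is a minimal sketch in Python (not Alloy configuration) of node-local target selection. The `Target` type, the addresses, and the node names are all hypothetical; the only point is that when each agent keeps just the targets on its own node, every target is scraped exactly once and nothing needs deduplicating downstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    address: str  # host:port of the metrics endpoint (hypothetical values below)
    node: str     # name of the node the service instance runs on


def local_targets(targets, this_node):
    """Return only the targets that run on this_node."""
    return [t for t in targets if t.node == this_node]


# Hypothetical cluster-wide inventory, as every agent would discover it.
targets = [
    Target("10.0.0.1:9100", "node-a"),
    Target("10.0.0.2:8080", "node-a"),
    Target("10.0.1.1:9100", "node-b"),
]

# Each node's agent ends up with a disjoint slice of the inventory.
per_node = {node: local_targets(targets, node) for node in ("node-a", "node-b")}
```

The trade-off is that a target is only covered while the agent on its node is healthy, which is the same failure domain the log collection already has.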

How are you doing logs and metrics collection in your system?

tonyswumac · October 25, 2024, 6:50pm

This is what we currently do:

On every host we have an Alloy agent installed that collects logs and scrapes metrics only from the host it’s running on, meaning Linux host metrics and the like.

Separately, we have a cluster of Alloy agents (size depends on the environment) configured in cluster mode, and this cluster is responsible for scraping the actual HTTP endpoints.
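
As a rough illustration of why a clustered set of scrapers avoids duplicates, here is a Python sketch in which every peer sees the full target list but only scrapes the targets it "owns". This is not Alloy's actual implementation; rendezvous hashing is used here as a simple stand-in, and the peer and target names are made up.

```python
import hashlib


def owner(target, peers):
    """Pick the peer that owns a target using rendezvous (highest-score) hashing."""
    def score(peer):
        digest = hashlib.sha256(f"{peer}|{target}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(peers, key=score)


def my_targets(targets, peers, me):
    """Targets that the peer named `me` should scrape."""
    return [t for t in targets if owner(t, peers) == me]


# Hypothetical peers and HTTP endpoints.
peers = ["alloy-0", "alloy-1", "alloy-2"]
targets = [f"svc-{i}.example.internal:8080" for i in range(20)]

# Every target has exactly one owner, so no sample is collected twice.
assignment = {p: my_targets(targets, peers, p) for p in peers}
```

With this scheme, removing a peer only moves the targets that peer owned; every other target stays with its current owner.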

obiwan · October 25, 2024, 11:03pm

@tonyswumac, ok, that means you have a couple of different Alloy configurations running in your system. One definition for the local agents, then another definition for the cluster of agents that scrape HTTP endpoints. That is another option I can consider. Thank you for sharing that.

Does your Alloy cluster also scrape data from your local Alloy agents? Or are the local Alloy agents doing remote_write to log and metric storage destinations?

tonyswumac · October 28, 2024, 7:46pm

We are currently doing remote write from the individual Alloy agents to a centralized Alloy cluster, then to Mimir. The primary reason we did it this way is that we used to use Telegraf + InfluxDB, which is push-based, and it was easier for us to migrate with a push mechanism.

We did do a POC that scraped from all the Alloy agents, using EC2 discovery, and it worked quite well too. We just haven’t considered whether we want to switch.
