Count number of unique values without hitting series-limit

I am using Grafana together with Grafana Loki to monitor a service through its access logs. I have been able to extract the number of requests over a given time range, as well as the top 10 paths and requests to specific URLs.

What I would also like to count is the number of unique users. My options are very limited here, so I decided to count the number of unique IPs as an approximation. But when I run the following query over larger time frames (7 days in my case), it fails due to the 500-series limit.

count(count_over_time({job="data-collections-explorer", code="200"} | regexp `^(?:[^ ]+ )?(?P<ip>[^ ]+) - -.*` | keep ip [$__range]))

I understand why the query runs into this issue (more than 500 unique IPs result in more than 500 time series), but I was hoping there are alternative ways to retrieve the metric I am looking for. In my case, I don't even need the time series that count_over_time creates; I just want the absolute number of unique values over the given range. Is there any way of doing something like this in LogQL?

I assume using the transformations in the Grafana dashboards is not a good approach, as it would run into the line limit (1000 log lines, which in my case reaches only about 2 days into the past).

(Side note: I have already set up recording rules for unique IPs over the last 24 hours; I am just looking for ways to make the above query work. That would allow me to also record the number of unique IPs per 7 days.)
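For context, here is a minimal sketch of what such a recording rule could look like (assuming the Loki ruler is enabled; the group name and recorded metric name are illustrative, not my actual configuration):

```yaml
groups:
  - name: unique-ips  # illustrative group name
    interval: 1h
    rules:
      # Hypothetical metric name; counts distinct IPs seen in the
      # last 24h. The same 500-series limit applies to larger windows.
      - record: job:unique_ips:count24h
        expr: |
          count(
            sum by (ip) (
              count_over_time(
                {job="data-collections-explorer", code="200"}
                  | regexp `^(?:[^ ]+ )?(?P<ip>[^ ]+) - -.*`
                [24h]
              )
            )
          )
```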

Thank you for taking the time to read my post :smile:

How about this?

count(
  sum by (ip) (
    count_over_time(
      {job="data-collections-explorer", code="200"}
        | regexp `^(?:[^ ]+ )?(?P<ip>[^ ]+) - -.*`
      [$__range]
    )
  )
)

Thank you for your reply. While your idea sadly does not work (it still hits the 500-series limit), it gave me the idea to try something else. Instead of running count_over_time over the whole $__range, I now run it over just 1h. Since the labels (IPs) are preserved in the per-1h time series, I can then count over them using a Reduce transformation (where I calculate the total number of accesses per IP across all 1h segments). Finally, I display the data in a Stat panel, which can show the count of values in the response. That yields exactly the number of unique IPs over the entire time range, and it even works beyond 500 IPs.
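For completeness, a sketch of the per-1h query I mean (the Reduce and Stat steps then happen in the Grafana panel, not in LogQL):

```
sum by (ip) (
  count_over_time(
    {job="data-collections-explorer", code="200"}
      | regexp `^(?:[^ ]+ )?(?P<ip>[^ ]+) - -.*`
    [1h]
  )
)
```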

If you need to count a large number of unique label values (for example, millions of unique IP addresses), then Loki won't help here :frowning: Try VictoriaLogs instead - it provides the count_uniq() stats function, which can count millions of unique values in the blink of an eye. For example, the following query counts the number of unique ip values over the time range selected in Grafana:

{job="data-collections-explorer", code="200"}
    | extract " <ip> - -"
    | stats count_uniq(ip) as uniq_ips

The ip values are extracted from the log message using the extract pipe.