I have a small logging system for DNS queries and tried to build a metrics dashboard on top of it (top queried hosts, etc.), but I ran into really poor performance and behaviour I don't understand.
My Loki stack was started from this compose file, without any changes: https://github.com/grafana/loki/blob/master/production/docker/docker-compose-ha-memberlist.yaml
To ship logs I am using the Loki Docker driver (see "Docker driver client" in the Grafana Loki documentation).
A simple log query returns quickly:
time logcli query '{q_hostname="dns-cache01",q_message="CLIENT_QUERY",q_type="A"}' --since=50m --limit 10 -q --stats
Ingester.TotalReached 3
Ingester.TotalChunksMatched 115
Ingester.TotalBatches 0
Ingester.TotalLinesSent 0
Ingester.HeadChunkBytes 42 kB
Ingester.HeadChunkLines 437
Ingester.DecompressedBytes 0 B
Ingester.DecompressedLines 0
Ingester.CompressedBytes 19 MB
Ingester.TotalDuplicates 0
Store.TotalChunksRef 0
Store.TotalChunksDownloaded 0
Store.ChunksDownloadTime 0s
Store.HeadChunkBytes 0 B
Store.HeadChunkLines 0
Store.DecompressedBytes 0 B
Store.DecompressedLines 0
Store.CompressedBytes 0 B
Store.TotalDuplicates 0
Summary.BytesProcessedPerSecond 7.0 MB
Summary.LinesProcessedPerSecond 72510
Summary.TotalBytesProcessed 42 kB
Summary.TotalLinesProcessed 437
Summary.ExecTime 6.026743ms
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=110.250.188.20.zz.countries.nerd.dk q_src_ip=194.0.200.251 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=gmr-smtp-in.l.google.q_src_ip=178.20.158.192 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=119.206.150.45.zz.countries.nerd.dk q_src_ip=194.0 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=buy-sat q_src_ip=193.200. q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=api.dropbox q_src_ip=10.5.0 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=ip102.ip-167-114-25. q_src_ip=178.20 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=ip102.ip-167-114-25 q_src_ip=178.20 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=pulse.sum.im q_src_ip=193.200 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=119.206.150.45.zz.countries.nerd.dk q_src_ip=194.0 q_type=A q_code=NOERROR
2020-11-28T21:44:37+02:00 {} q_host=dns-cache01 q_domain=alt1.gmr-smtp-in.l.google q_src_ip=178.20 q_type=A q_code=NOERROR
But if I run a metric query like this:
root@prometheus01:~/compose-dnstap# time logcli query 'topk(5, sum by (q_domain) (count_over_time({q_hostname="dns-cache01",q_message="CLIENT_QUERY",q_type="A"} | logfmt | q_domain =~ ".*" | q_src_ip =~ ".*" [5m])))' --since=50m --limit 2 -q --stats | jq '.[].metric.q_domain'
Ingester.TotalReached 3
Ingester.TotalChunksMatched 242
Ingester.TotalBatches 0
Ingester.TotalLinesSent 0
Ingester.HeadChunkBytes 161 kB
Ingester.HeadChunkLines 1680
Ingester.DecompressedBytes 238 MB
Ingester.DecompressedLines 2025896
Ingester.CompressedBytes 39 MB
Ingester.TotalDuplicates 0
Store.TotalChunksRef 0
Store.TotalChunksDownloaded 0
Store.ChunksDownloadTime 0s
Store.HeadChunkBytes 0 B
Store.HeadChunkLines 0
Store.DecompressedBytes 0 B
Store.DecompressedLines 0
Store.CompressedBytes 0 B
Store.TotalDuplicates 1011298
Summary.BytesProcessedPerSecond 39 MB
Summary.LinesProcessedPerSecond 330676
Summary.TotalBytesProcessed 238 MB
Summary.TotalLinesProcessed 2027576
Summary.ExecTime 6.13160133s
"alex-car.com.ua"
"alt1.gmr-smtp-in.l.google.coma"
"alt2.gmr-smtp-in.l.google.coma"
"demeter.freehost.com.ua"
"gmr-smtp-in.l.google.cos"
"gh.microsoft"
"myandex.r"
"vido.ua"
"za08.in.ua"
"zabbixcom.ua"
real 0m6.308s
user 0m0.150s
sys 0m0.068s
The most interesting part: the command has both --limit 2 and topk(5), yet the result ignores both limits and returns ten domains.
How can I optimize the query so that I can build a graph of the top queried hosts?
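One possible explanation for the ten domains, which I have not confirmed: if LogQL follows Prometheus semantics, topk in a range query is evaluated separately at every step, so the union of series across all steps can be larger than k. As far as I understand, --limit caps log entries rather than metric series, which would explain why it has no effect here either. The sketch below only simulates that per-step behaviour in Python; the function name, the domain names and the counts are made up for illustration and are not taken from the output above.

```python
from collections import Counter


def topk_per_step(steps, k):
    """Apply topk independently to each evaluation step (assumed range-query behaviour)."""
    kept = []
    for counts in steps:
        # Keep only the k series with the highest value at this single step.
        kept.append(dict(Counter(counts).most_common(k)))
    return kept


# Hypothetical per-step query counts per q_domain (invented numbers).
steps = [
    {"a.example": 9, "b.example": 7, "c.example": 1, "d.example": 0},
    {"a.example": 2, "b.example": 1, "c.example": 8, "d.example": 6},
]

per_step = topk_per_step(steps, k=2)

# A series shows up in the result if it made the cut in at least one step.
series = sorted({name for step in per_step for name in step})
print(len(series), series)
```

With k=2 each step keeps only two domains, but the two steps keep different ones, so four series come back in total. If this is what happens in my case, evaluating the same expression once over the whole 50m window (an instant query) should return at most five domains.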