Help with Promtail Config For Loki - Errors

Hey, I configured Loki and Promtail to collect the logs from my Docker containers. I have some tags on the containers and was able to create labels from them with no problem. My containers are web applications, so I wanted to add labels for things like IP address, status code, etc. Ever since I updated my config, though, I keep getting the errors below in the Promtail logs, and when I try to view Loki dashboards in Grafana, I get gateway timeouts. I'm guessing my configuration is bad, but I can't figure out what's wrong.

The error:

level=warn ts=2022-03-04T09:42:57.65872831Z caller=client.go:349 component=client host=10.128.2.123:3100 msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): Maximum active stream limit exceeded, reduce the number of active streams (reduce labels or reduce label values), or contact your Loki administrator to see if the limit can be increased"

My Promtail Config:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://10.128.2.123:3100/loki/api/v1/push


scrape_configs:
- job_name: containers
  static_configs:
  - targets:
      - localhost
    labels:
      job: containerlogs
      host: ${HOSTNAME}
      __path__: /var/lib/docker/containers/*/*log

  pipeline_stages:
  - json:
      expressions:
        output: log
        stream: stream
        attrs:
  - regex:
      expression: (?P<ip>((?:[0-9]{1,3}\.){3}[0-9]{1,3})).+(?P<request>(GET|POST|HEAD|PUT|DELETE|CONNECT|OPTIONS|TRACE|PATCH)).(?P<endpoint>(.+) ).+\".(?P<status>([0-9]{3}))
      source: output
  - json:
      expressions:
        tag:
      source: attrs
  - regex:
      expression: (?P<image_name>(?:[^|]*[^|])).(?P<container_name>(?:[^|]*[^|])).(?P<image_id>(?:[^|]*[^|])).(?P<container_id>(?:[^|]*[^|]))
      source: tag
  - timestamp:
      format: RFC3339Nano
      source: time
  - labels:
      tag:
      stream:
      image_name:
      container_name:
      image_id:
      container_id:
      ip:
      request:
      endpoint:
      status:
  - output:
      source: output

The lines that I added that started causing the issues:

...
  - regex:
      expression: (?P<ip>((?:[0-9]{1,3}\.){3}[0-9]{1,3})).+(?P<request>(GET|POST|HEAD|PUT|DELETE|CONNECT|OPTIONS|TRACE|PATCH)).(?P<endpoint>(.+) ).+\".(?P<status>([0-9]{3}))
      source: output
...
      ip:
      request:
      endpoint:
      status:

Any help would be greatly appreciated!

EDIT
The error in Loki when getting the gateway timeout was:

level=error ts=2022-03-04T14:08:59.793376377Z caller=scheduler_processor.go:199 org_id=fake msg="error notifying frontend about finished query" err="rpc error: code = ResourceExhausted desc = grpc: received message larger than max (7954289 vs. 4194304)" frontend=172.28.0.25:9095

I also found the logcli tool and ran the following:

# logcli --addr="http://localhost:3100" series {} --analyze-labels

Total Streams:  147
Unique Labels:  13

Label Name      Unique Values  Found In Streams
filename        45             147
endpoint        45             126
container_id    36             134
tag             36             134
image_id        30             134
container_name  21             134
image_name      19             134
ip              10             126
status          5              126
host            4              147
request         2              126
stream          2              147
job             1              147

Looking at the config, the default value for grpc_server_max_concurrent_streams is 100, so I changed the following configuration options:

  grpc_server_max_concurrent_streams: 500
  grpc_server_max_recv_msg_size: 10000000
  grpc_server_max_send_msg_size: 10000000

That seems to have stopped the errors for now. The Grafana dashboard is slow to load, but it doesn't throw the error anymore. I'm still not 100% sure whether this is a real fix or just a band-aid for my bad Promtail config.
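
For comparison, the 429 message itself suggests the other direction: reduce labels or label values. Each unique combination of label values becomes its own stream, so labels like ip and endpoint multiply the stream count. A minimal, untested sketch of a trimmed labels stage for the pipeline above follows. Which labels to keep is an assumption based on the logcli output (tag, ip, endpoint, image_id and container_id are dropped here), not a confirmed fix:

```yaml
  # Sketch only: would replace the existing "labels" stage in pipeline_stages.
  # Assumption: ip and endpoint do not need to be labels, because they are
  # still part of the log line and can be filtered on at query time.
  - labels:
      stream:
      image_name:
      container_name:
      request:
      status:
```

The regex stages can stay as they are. Values that are extracted but not listed under labels are simply not attached to the stream.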