Promtail unable to send logs to a remote Loki instance

I am trying to ship logs from a local Promtail instance to a remote Loki and Grafana setup. I keep getting the following error:

Error Message:

level=warn ts=2020-12-16T12:58:32.454345353Z caller=client.go:241 component=client host=some-remote-loki-ip-over-vpn:3100 msg="error sending batch, will retry" status=-1 error="Post http://some-remote-loki-ip-over-vpn:3100/loki/api/v1/push: net/http: request canceled (Client.Timeout exceeded while awaiting headers)"

Note:

  1. If I push the same logs using Postman or Insomnia, they reach the remote Loki instance successfully.
  2. I thought the Promtail container might be unable to reach the remote endpoint, but curl and ping work as expected from inside it, so Promtail can find the remote Loki instance.
  3. I searched online for similar posts and found a few (not Loki-specific) where people suggested changing the DNS server to 8.8.8.8, but that didn’t help.

Promtail Configuration:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /env/app/output/promtail/positions.yaml

clients:
  - url: http://some-remote-loki-ip-over-vpn:3100/loki/api/v1/push

scrape_configs:
- job_name: system
  static_configs:
  - targets:
      - localhost
    labels:
      __path__: /env/app/input/**/*.log

Hmm, I’m not really sure what’s going on here. The error suggests it’s a network problem, but I do have one thought.

How long does the request take when you use Postman? If the connection is really slow, maybe that is what’s causing the timeout?
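
If slowness does turn out to be the cause, one thing to try is giving the client more time per request. Here is a rough sketch of the `clients` block, assuming your Promtail version supports the `timeout`, `batchwait`, and `backoff_config` options described in the Promtail configuration reference. The values are illustrative guesses on my part, not tuned recommendations:

```yaml
clients:
  - url: http://some-remote-loki-ip-over-vpn:3100/loki/api/v1/push
    # Maximum time to wait for the server to respond to a push request.
    # The documented default is 10s; 60s here is only an example value.
    timeout: 60s
    # Maximum time to wait before sending a batch, even if it is not full.
    batchwait: 2s
    # Retry behaviour after a failed push (example values).
    backoff_config:
      min_period: 1s
      max_period: 1m
      max_retries: 10
```

If the error goes away with a longer timeout, that points to a slow link or a slow host rather than a reachability problem.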

The connection speed appears to be fine. As soon as I start Promtail, I send some system logs to the remote Loki, and those go through just fine. But when I try to ship logs from the backend services, I see the following pattern:

  1. It takes about 30-40 minutes before Promtail makes a request to push those logs.
  2. The above error pops up every time.

Not sure what’s going on here. I can ping the host and curl the POST request from inside the Docker container, so it’s not an access problem. Still looking into it.

I have the same problem. How did you solve this issue?

I have the same issue. Can we get some help from the experts?

In my case I was testing with the Calico CNI plugin, and in most cases where pods were struggling with networking problems I fixed them by setting the hostNetwork: true parameter. For Promtail it was the opposite: removing that parameter resolved the problem. Note: I’m still running the Calico CNI.
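
For anyone unsure where that parameter lives, here is a trimmed sketch of a Promtail DaemonSet with the setting removed. The resource name, labels, and image reference are placeholders I made up for illustration, not my actual manifest:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: promtail            # hypothetical name
spec:
  selector:
    matchLabels:
      app: promtail         # hypothetical label
  template:
    metadata:
      labels:
        app: promtail
    spec:
      # hostNetwork: true   # removed: with this set, pushes to Loki timed out for me
      containers:
        - name: promtail
          image: grafana/promtail   # tag omitted; use the version you run
```

With hostNetwork unset (it defaults to false), the pod uses the cluster's pod network through the CNI instead of the node's network namespace.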

Same problem here. It is a genuine timeout: it works on my high-performance nodes but not at all on my Raspberry Pis.