"WriteTo failed" issues with distributed Tempo in a Docker Swarm

I have taken the example distributed compose file and converted it over to Docker Swarm. However, I am seeing continual network problems. This is the Swarm YAML file I am using:

version: '3.8'

services:
  distributor:
    image: grafana/tempo:1.4.1
    hostname: distributor
    command: "-target=distributor -config.file=/etc/tempo.yml"
    # No ports exposed, the multiple NICs mess up Tempo
    #ports:
    #  - "3100:3100"
    #  - "4317:4317"
    volumes:
      - /opt/tempo/configs/tempo.yml:/etc/tempo.yml:ro

  ingester:
    image: grafana/tempo:1.4.1
    hostname: ingester-{{.Task.Slot}}
    command: "-target=ingester -config.file=/etc/tempo.yml"
    volumes:
      - /opt/tempo/configs/tempo.yml:/etc/tempo.yml:ro
    deploy:
      placement:
        max_replicas_per_node: 1
      replicas: 3

  query-frontend:
    image: grafana/tempo:1.4.1
    hostname: query-frontend
    command: "-target=query-frontend -config.file=/etc/tempo.yml"
    volumes:
      - /opt/tempo/configs/tempo.yml:/etc/tempo.yml:ro
    deploy:
      replicas: 1

  querier:
    image: grafana/tempo:1.4.1
    hostname: querier
    command: "-target=querier -config.file=/etc/tempo.yml"
    volumes:
      - /opt/tempo/configs/tempo.yml:/etc/tempo.yml:ro
    deploy:
      replicas: 1

  compactor:
    image: grafana/tempo:1.4.1
    hostname: compactor
    command: "-target=compactor -config.file=/etc/tempo.yml"
    volumes:
      - /opt/tempo/configs/tempo.yml:/etc/tempo.yml:ro
    deploy:
      replicas: 1

  metrics_generator:
    image: grafana/tempo:1.4.1
    hostname: metrics_generator
    command: "-target=metrics-generator -config.file=/etc/tempo.yml"
    volumes:
      - /opt/tempo/configs/tempo.yml:/etc/tempo.yml:ro
    deploy:
      replicas: 1

networks:
  default:
    name: some-network
    driver: overlay
    attachable: true
    # If this is not internal, the multiple NICs mess up Tempo
    internal: true

If the network is not declared as internal, I see errors like this in the logs:
level=warn ts=2022-08-22T08:20:36.41597158Z caller=tcp_transport.go:428 component="memberlist TCPTransport" msg="WriteTo failed" addr=172.18.0.9:7946 err="dial tcp 172.18.0.9:7946: i/o timeout"

If I expose any port, I see errors like this:
level=warn ts=2022-08-22T08:31:10.297712956Z caller=tcp_transport.go:428 component="memberlist TCPTransport" msg="WriteTo failed" addr=172.18.0.7:7946 err="dial tcp 172.18.0.7:7946: connect: network is unreachable"

The IPs getting logged seem to relate to the docker_gwbridge network.

The issue seems to come down to the container having multiple NICs in these scenarios. No amount of fiddling with http_listen_address, grpc_listen_address, instance_interface_names, interface_names, advertise_addr, or bind_addr to “lock” the executing service to eth0 has fixed them, although the settings do have an effect: when I set everything to eth0 (including the advertise/bind IP), I get errors like this:
level=warn ts=2022-08-22T08:32:28.387377885Z caller=tcp_transport.go:428 component="memberlist TCPTransport" msg="WriteTo failed" addr=10.0.0.9:7946 err="dial tcp 10.0.0.9:7946: i/o timeout"
level=warn ts=2022-08-22T08:32:28.394078007Z caller=tcp_transport.go:428 component="memberlist TCPTransport" msg="WriteTo failed" addr=172.18.0.9:7946 err="dial tcp 172.18.0.9:7946: i/o timeout"

Note the mix of ingress and docker_gwbridge IPs.

Can anyone offer a suggestion on how to resolve these warnings and have it only use the valid NIC and IPs, or point me to where in the documentation I can find an answer? My problem looks similar to issue 927, but I didn’t find a solution there.

An internal network with no exposed ports isn’t of much use, and the only solution I can think of is adding something like HAProxy to the mix. At the moment the stack can’t consume spans, with the distributor logging errors like this:
level=error ts=2022-08-22T09:12:57.061880819Z caller=rate_limited_logger.go:27 msg="pusher failed to consume trace data" err="DoBatch: InstancesCount <= 0"
And the ingesters:
ts=2022-08-22T09:49:56.071293583Z caller=memberlist_logger.go:74 level=error msg="Push/Pull with distributor-96cf3228 failed: dial tcp 10.0.0.102:7946: connect: connection refused"

Host OS is CentOS 7.9.2009

Tempo uses the ring to discover the other members of the cluster. When an ingester starts up, it attempts to determine its IP address by reading the NICs and then adds itself to the ring with what it assumes is its IP. If you visit /ingester/ring on your distributors, you will see a page showing all the ingesters and what they think their IP is.

There are a few settings you can use to control the IP/port the ingester adds to the ring. To change which interfaces the ingester checks for IPs, set the following config options on the ingester:

ingester:
  lifecycler:
    interface_names:
    - ?
    - ?

To skip auto-discovery and set the IP/port directly, use:

ingester:
  lifecycler:
    address: <ip>
    port: <int>
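For example (the address and port here are purely illustrative; the port should be whatever port the ingester serves gRPC on):

```yaml
ingester:
  lifecycler:
    # Hypothetical values: the overlay-network IP of this ingester and
    # its gRPC port. Replace with your own.
    address: 10.0.1.7
    port: 9095
```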

Hopefully this helps?


Thanks, I am now seeing the ingesters:

Even with the interface or address set, I am still seeing warnings flood the logs, for example from the distributor:

level=info ts=2022-08-22T15:06:35.368651832Z caller=memberlist_client.go:542 msg="joined memberlist cluster" reached_nodes=4
ts=2022-08-22T15:06:43.03000974Z caller=memberlist_logger.go:74 level=warn msg="Was able to connect to ingester-3-43da4322 but other probes failed, network may be misconfigured"
ts=2022-08-22T15:06:48.030561892Z caller=memberlist_logger.go:74 level=warn msg="Was able to connect to ingester-2-a06d7368 but other probes failed, network may be misconfigured"
ts=2022-08-22T15:06:53.030784722Z caller=memberlist_logger.go:74 level=warn msg="Was able to connect to metrics_generator-ba2cb484 but other probes failed, network may be misconfigured"
ts=2022-08-22T15:06:58.031631393Z caller=memberlist_logger.go:74 level=warn msg="Was able to connect to ingester-1-d816f38c but other probes failed, network may be misconfigured"
ts=2022-08-22T15:06:58.53276489Z caller=memberlist_logger.go:74 level=warn msg="Refuting a suspect message (from: distributor-260fdb52)"

And the ingesters:

ts=2022-08-22T15:12:39.094606347Z caller=memberlist_logger.go:74 level=info msg="Marking distributor-5770526a as failed, suspect timeout reached (2 peer confirmations)"
level=warn ts=2022-08-22T15:12:40.204746758Z caller=tcp_transport.go:428 component="memberlist TCPTransport" msg="WriteTo failed" addr=10.0.0.146:7946 err="dial tcp 10.0.0.146:7946: i/o timeout"

The IP mentioned, 10.0.0.146, appears to relate to the distributor, yet I have its ring address set; so why that ingress IP is in play is quite beyond me.

Memberlist error chatter is common during rollouts or if a pod goes away for other reasons. As old IPs disappear memberlist logs errors when it can no longer reach a member. These lines can be ignored in and around rollouts.

If these errors are occurring regularly outside of rollouts then I would look into:

  • What IPs do your memberlist.join_members settings resolve to? Are these all valid?
  • Can all your pods communicate on the memberlist port (7946) by default?
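For the second point, a quick sanity check from inside one of the containers is to loop over the join members and probe the memberlist port. This is a sketch assuming the BusyBox nc applet is available (it typically is in Alpine-based images); adjust the member names to your own services:

```shell
# Probe port 7946 on each memberlist peer.
# The names below mirror the join_members list in this thread.
MEMBERS="distributor ingester-1 ingester-2 ingester-3 compactor querier"
for m in $MEMBERS; do
  if nc -z -w 2 "$m" 7946 2>/dev/null; then
    echo "$m:7946 reachable"
  else
    echo "$m:7946 NOT reachable"
  fi
done
```

If any member shows as unreachable, that peer is either down, not on the same network, or blocked on the memberlist port.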

No pods in play here, this is Docker Swarm.

All the containers are on the same Docker network, configured in the stack file, so they can see each other without issue. I have tested the set-up and it is working. This is an abridged version of my tempo config:

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
  ring:
    instance_addr: ${bindAddr}

ingester:
  lifecycler:
    address: ${bindAddr}
    ring:
      replication_factor: 3

memberlist:
  abort_if_cluster_join_fails: false
  advertise_addr: ${bindAddr}
  bind_addr: ["${bindAddr}"]
  bind_port: 7946
  join_members:
    - distributor:7946
    - ingester-1:7946
    - ingester-2:7946
    - ingester-3:7946
    - compactor:7946
    - query-frontend:7946
    - querier:7946
    - metrics_generator:7946

(bindAddr is set in a custom docker-entrypoint.sh based on the configured NIC, eth0, and each container will be using their own value for bindAddr.)
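For reference, that determination can be sketched like this, assuming the BusyBox ip applet and an eth0 interface; the sample line is canned here so the parsing is visible:

```shell
# Live version (inside the container) would be:
#   addr_line=$(ip -4 -o addr show eth0)
# Canned sample of that output:
addr_line='2: eth0    inet 10.0.1.7/24 brd 10.0.1.255 scope global eth0'

# Field 4 is "ADDR/PREFIX"; strip the prefix length to get the bare IP.
bindAddr=$(echo "$addr_line" | awk '{print $4}' | cut -d/ -f1)
export bindAddr
echo "$bindAddr"
```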

The errant IPs are related to the Docker ingress and gateway bridge networks, and even though I am explicitly setting the NIC to use (eth0) and the IP for the container (10.0.x.y) via env var expansion, the other NICs still seem to be getting tried.
This isn’t just at startup; it’s constant. Perhaps that’s just a side effect of using Swarm, but using K8s is not an option available to me.

Maybe that isn’t even a problem and I can just set the log level to ERROR, but I find it a concern.

Thanks again for the help.

Edit: Meant to add that without forcibly setting the bind address, the ingesters all show up as “unhealthy”.

Edit 2: Got it working now with no errors. I had to alter how I determined the IP, but then it started working with no apparent warnings. (I would use static IPs if I could.)

Nice to see things are working out! Can you share your solution so others who are having similar issues can use it?

The main issue is how networks appear within containers that are part of a swarm: they can sometimes get multiple NICs, and this leads to all the problems detailed above. Which IP ends up on which NIC is non-deterministic.

The first solution would be to not use Docker Swarm at all, but Kubernetes, as that is what seems to be documented! :slight_smile:

If Swarm is a must, then assign static IPs if you can and specify those within the Tempo configs (example config file above):

  ingester-1:
    image: grafana/tempo:latest
    ...
    environment:
      - bindAddr=10.20.30.42
    networks:
      the_network:
        ipv4_address: 10.20.30.42
    deploy:
      mode: replicated
      replicas: 1

networks:
  the_network:
    driver: overlay
    attachable: true
    ipam:
      config:
        - subnet: 10.20.30.0/24

Where that is not possible, the next option is to find the “correct” IP, assign it to a variable, and then refer to it in the Tempo config.

The Tempo image has the Almquist shell with getopt available, and this can all be done relatively trivially in a custom docker-entrypoint.sh where you pass one parameter for the IP prefix (e.g. -i 10.20.30 if the running container will be on a 10.20.30.0/24 network) and use that with grep to find the full IP:
export bindAddr=$(ifconfig | grep "inet addr:${whatever-i-was}" | cut -d: -f2 | awk '{print $1}')
Obviously this must yield a unique result.
Then pass another parameter for all the Tempo settings (e.g. -t "-target=ingester -config.file=/etc/tempo.yml") and simply call Tempo at the end with those settings:
/tempo ${whatever-t-was} -config.expand-env=true
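Putting those pieces together, a hypothetical skeleton for such an entrypoint might look like the following. The option names and echoed output are illustrative, not the exact script I used, and the real script would exec Tempo at the end:

```shell
#!/bin/sh
# Hypothetical sketch of the custom docker-entrypoint.sh described above.
# -i : expected IP prefix (e.g. 10.20.30)
# -t : flags to pass to Tempo (e.g. "-target=ingester -config.file=/etc/tempo.yml")

parse_args() {
  OPTIND=1
  while getopts "i:t:" opt; do
    case "$opt" in
      i) ip_prefix="$OPTARG" ;;
      t) tempo_args="$OPTARG" ;;
    esac
  done
}

# Example invocation, as the container's command would supply it:
parse_args -i 10.20.30 -t "-target=ingester -config.file=/etc/tempo.yml"

# In the real script, the prefix selects the unique matching address and
# Tempo is exec'd with env expansion enabled (commented out here, since it
# needs a live container):
#   bindAddr=$(ifconfig | grep "inet addr:${ip_prefix}" | cut -d: -f2 | awk '{print $1}')
#   export bindAddr
#   exec /tempo ${tempo_args} -config.expand-env=true

echo "prefix=${ip_prefix}"
echo "flags=${tempo_args}"
```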

Maybe there is an easier way to do this, but I did not find it. I am, however, far from being an expert.