I'm trying to find a way to monitor network partitions. I can consistently reproduce the error on the frontend side, so I do get the message that Mnesia has experienced a network partition, but none of the online examples I found seem to work for Prometheus monitoring.
Other available metrics, such as rabbitmq_unreachable_cluster_peers_count, also show no difference between my clusters in a normal state and the one that has a network partition. The only way I can get this metric to change is to restart a member of the cluster.
Hello,
Use rabbitmq-diagnostics (CLI)
Run:
rabbitmq-diagnostics cluster_status
Look for the "Network Partitions" section in the output, then export that result to Prometheus.
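A rough sketch of that export step, in Python. It assumes the JSON output of cluster_status (--formatter json) has a "partitions" key mapping each node to the peers it is partitioned from; please verify that against your RabbitMQ version. The metric name rabbitmq_partitions_detected is made up for this example, and the output is plain Prometheus text format that something like the node_exporter textfile collector could pick up.

```python
import json
import subprocess


def read_cluster_status() -> dict:
    """Run the CLI and parse its JSON output (needs rabbitmq-diagnostics on PATH)."""
    out = subprocess.run(
        ["rabbitmq-diagnostics", "cluster_status", "--formatter", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


def partition_count(cluster_status: dict) -> int:
    """Count partitioned peers listed under the "partitions" key.

    Assumed shape: {"partitions": {"rabbit@a": ["rabbit@b"], ...}}.
    An empty or missing value is treated as "no partition".
    """
    partitions = cluster_status.get("partitions") or {}
    if isinstance(partitions, dict):
        return sum(len(peers) for peers in partitions.values())
    return len(partitions)


def to_prometheus_text(count: int) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    name = "rabbitmq_partitions_detected"  # hypothetical metric name
    return (
        f"# HELP {name} Partitioned peers reported by cluster_status.\n"
        f"# TYPE {name} gauge\n"
        f"{name} {count}\n"
    )
```

You could run to_prometheus_text(partition_count(read_cluster_status())) from cron and write the result to a .prom file, then alert when the value is greater than 0.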
or
Use the RabbitMQ HTTP API
Look for the "partitions" field in the response, and export it to Prometheus the same way, with the collector enabled.
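Something like this could work for the API route. It is only a sketch: it assumes the management plugin is enabled, that GET /api/nodes returns one object per node with a "name" and a "partitions" list, and that you pass in your own base URL and credentials. The metric name rabbitmq_node_partitions is made up for the example.

```python
import base64
import json
import urllib.request


def fetch_nodes(base_url: str, user: str, password: str) -> list:
    """GET <base_url>/api/nodes with HTTP basic auth and return the parsed JSON."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        f"{base_url}/api/nodes",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def partitions_by_node(nodes: list) -> dict:
    """Map each node name to the number of peers it reports as partitioned."""
    return {n["name"]: len(n.get("partitions") or []) for n in nodes}


def to_prometheus_text(counts: dict) -> str:
    """Render one labelled gauge per node in the Prometheus text format."""
    name = "rabbitmq_node_partitions"  # hypothetical metric name
    lines = [f"# TYPE {name} gauge"]
    for node, count in sorted(counts.items()):
        lines.append(f'{name}{{node="{node}"}} {count}')
    return "\n".join(lines) + "\n"
```

For example, to_prometheus_text(partitions_by_node(fetch_nodes("http://localhost:15672", "guest", "guest"))) would give one line per node; the URL and credentials here are just the usual local defaults and need to be replaced with yours.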
I'm not sure this will work, but you can try it.
Please also check that the metrics are reaching Prometheus on port 9090, i.e. that the Prometheus target exists and its status is UP.
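If you prefer to check that from a script instead of the Targets page, here is a small sketch using the Prometheus HTTP API (/api/v1/targets). The URL http://localhost:9090 is an assumption about where your Prometheus server runs; the field names ("activeTargets", "health", "scrapeUrl", "lastError") are the ones I expect from that endpoint, so double-check them against your version.

```python
import json
import urllib.request


def fetch_targets(prometheus_url: str = "http://localhost:9090") -> dict:
    """Return the parsed response of the Prometheus targets endpoint."""
    url = f"{prometheus_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def unhealthy_targets(payload: dict) -> list:
    """List (scrape URL, last error) for every active target that is not up."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl", ""), t.get("lastError", ""))
        for t in targets
        if t.get("health") != "up"
    ]
```

An empty list from unhealthy_targets(fetch_targets()) means every active target is being scraped; if the RabbitMQ target shows up in it, the lastError text usually says why.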