Need more docs for a possible future scenario

I need to educate myself on the following hypothetical scenario:

  • 4 geographically separated datacenters (two in Brazil, two in USA)
  • Each datacenter has 2 physical servers
  • Each server has 8 VMware virtual machines
  • Each virtual machine has about 80 Docker containers
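To put numbers on that, here is a quick sketch that multiplies out the figures from the list above. The variable names are my own, and since "about 80" is approximate, the totals are too.

```python
# Topology figures taken from the list above.
DATACENTERS = 4
SERVERS_PER_DC = 2
VMS_PER_SERVER = 8
CONTAINERS_PER_VM = 80  # "about 80", so treat the totals as approximate

# Log sources each per-datacenter Loki server would have to handle.
containers_per_dc = SERVERS_PER_DC * VMS_PER_SERVER * CONTAINERS_PER_VM

# Totals the central Loki server would see.
total_vms = DATACENTERS * SERVERS_PER_DC * VMS_PER_SERVER
total_containers = DATACENTERS * containers_per_dc

print(f"containers per datacenter: {containers_per_dc}")
print(f"VMs overall: {total_vms}")
print(f"containers overall: {total_containers}")
```

So each per-datacenter server would be collecting from roughly 1,280 containers, and the central one from roughly 5,120.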

Now the plan:

  • Add 1 physical server running one Grafana Loki per datacenter, collecting data from inside that datacenter only
  • Add 1 physical server running one Grafana Loki collecting (or receiving) data from the other 4 Grafana Loki servers

Let me know if I got any of this wrong:

  • having one Grafana Loki server per datacenter and, for instance, one Grafana Loki server per geographical location should minimize lost logs (in 12 years I have never had a network failure inside a datacenter, so it’s a possible but minimal risk)
  • those servers have expensive, high-speed disks so they can ingest everything quickly
  • having one central Grafana Loki server will let our team query and monitor logs from all locations from a single point
  • the central server has 40TB of “not so fast but cheap” disks aggregating everything from the other servers

Of course, I’ll have to measure disk speed to make sure everything can be stored in time.
Is there a guide for a similar setup, or is there a better practice for this kind of scenario?
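For the disk measurement, this is the kind of back-of-the-envelope estimate I have in mind. It is only a sketch: the per-container log rate of 1,000 bytes per second is a placeholder I made up and would have to be replaced with a measured value, the function name is hypothetical, and it ignores compression, index overhead, and replication.

```python
def estimate_sizing(containers_per_dc, datacenters,
                    bytes_per_container_per_s, central_capacity_tb):
    """Return (per-DC write rate in B/s, central write rate in B/s,
    days until the central disks are full), ignoring compression and indexes."""
    per_dc_rate = containers_per_dc * bytes_per_container_per_s
    central_rate = per_dc_rate * datacenters
    capacity_bytes = central_capacity_tb * 1e12  # decimal terabytes
    days_until_full = capacity_bytes / central_rate / 86_400
    return per_dc_rate, central_rate, days_until_full


# 1280 containers per datacenter = 2 servers * 8 VMs * about 80 containers.
# ASSUMPTION: 1000 B/s per container is a made-up placeholder, not a measurement.
per_dc, central, days = estimate_sizing(
    containers_per_dc=1280,
    datacenters=4,
    bytes_per_container_per_s=1000,
    central_capacity_tb=40,
)

print(f"per-datacenter write rate: {per_dc / 1e6:.2f} MB/s")
print(f"central write rate: {central / 1e6:.2f} MB/s")
print(f"days until 40TB is full: {days:.0f}")
```

With that placeholder rate the 40TB would last around 90 days, but the real answer depends entirely on how chatty the containers actually are.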

I really appreciate your thoughts on this.

Regards,

ER