Query de-acceleration using blooms

We were pretty excited to test out bloom filters: at our volume, the slowest part of a query is pulling chunks from GCS. However, query performance drops significantly when bloom filtering is enabled, and I can pinpoint where the time goes…

Here is the comparison between a plain query and the same query with a structured metadata filter added (key=value matching, which engages the blooms):
Results of: sum by (ctxt_msg)(count_over_time({service_name="api",method=~"v2_website"} |= "server state check" [$__auto]))

Stats
Total request time	11.3 s
Number of queries	1
Total number rows	1845
Data source stats
Summary: total bytes processed	20.0 GB
Summary: exec time	8.02 s

Results of: sum by (ctxt_msg)(count_over_time({service_name="api",method=~"v2_website"} |= "server state check" | ctxt_msg="<NULL>" [$__auto]))

Stats
Total request time	3.43 mins
Number of queries	1
Total number rows	1443
Data source stats
Summary: total bytes processed	8.78 GB
Summary: exec time	3.37 mins

Bloom settings are exactly as mentioned in the documentation:

loki:
  bloom_build:
    enabled: true
    planner:
      planning_interval: 6h
  bloom_gateway:
    enabled: true
  limits_config:
    bloom_creation_enabled: true
    bloom_split_series_keyspace_by: 1024
    bloom_gateway_enable_filtering: false
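
As an aside, since `bloom_gateway_enable_filtering` lives in `limits_config`, it should also be toggleable per tenant through a runtime-overrides file instead of a redeploy. A sketch, assuming a standard runtime-config setup; the file path and tenant ID are placeholders:

```yaml
# loki.yaml: point Loki at a runtime-overrides file
runtime_config:
  file: /etc/loki/runtime-overrides.yaml   # placeholder path

# runtime-overrides.yaml: flip filtering for one tenant without restarting
overrides:
  tenant-a:                                # placeholder tenant ID
    bloom_gateway_enable_filtering: true
```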

I am running 10 bloom gateways and 15 index gateways, and I can see that most of the time is spent on chunk_refs_fetch_time (index lookups):

total_time_s                   11.512756889
chunk_refs_fetch_time          11.441082839s
index_bloom_filter_ratio       0.00
index_post_bloom_filter_chunks 11
index_total_chunks             11
index_used_bloom_filters       true
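
Reading those stats back, the blooms eliminated nothing for this query while the chunk-ref lookup dominated the runtime. A quick sanity check on the numbers above (plain arithmetic on the pasted values, no Loki API involved):

```python
# Values copied from the query statistics above.
index_total_chunks = 11
index_post_bloom_filter_chunks = 11
chunk_refs_fetch_time_s = 11.441082839
total_time_s = 11.512756889

# Fraction of chunks the bloom filters ruled out before fetching.
filtered_out = 1 - index_post_bloom_filter_chunks / index_total_chunks
print(f"chunks filtered out by blooms: {filtered_out:.0%}")  # 0%

# Share of the query spent just resolving chunk refs through the index.
share = chunk_refs_fetch_time_s / total_time_s
print(f"chunk-ref fetch share of total time: {share:.0%}")  # 99%
```

So the blooms added index-side work without pruning a single chunk, which matches `index_bloom_filter_ratio 0.00`.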

At this point I am lost and have disabled bloom filtering altogether; hopefully the community will come to the rescue!