Setting tls_insecure_skip_verify: false in Loki deployment fails with "cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs" error

I have enabled TLS for the gRPC server, and that works fine. However, when I set "tls_insecure_skip_verify: false" in all gRPC clients, I get the error below:

level=error ts=2024-12-21T19:33:54.344126319Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=x.x.x.x:9095 err="rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs""
level=error ts=2024-12-21T19:33:55.457565818Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=x.x.x.x:9095 err="rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs""

All certificates have correct domain names.

The common config is as below.

auth_enabled: true
common:
  compactor_address: 'https://loki-backend:3100'
  path_prefix: /var/loki
  replication_factor: 1
  storage:
    s3:
      bucketnames: chunks
      insecure: false
      region: us-east-1
      s3forcepathstyle: false
compactor:
  compaction_interval: 10m
  delete_request_store: s3
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
  retention_enabled: true
  working_directory: /var/loki/data/retention
compactor_grpc_client:
  tls_ca_path: /etc/rmagdum/pki/client-certs/ca.crt
  tls_cert_path: /etc/rmagdum/pki/client-certs/tls.crt
  tls_enabled: true
  tls_insecure_skip_verify: false
  tls_key_path: /etc/rmagdum/pki/client-certs/tls.key
frontend:
  grpc_client_config:
    tls_ca_path: /etc/rmagdum/pki/client-certs/ca.crt
    tls_cert_path: /etc/rmagdum/pki/client-certs/tls.crt
    tls_enabled: true
    tls_insecure_skip_verify: false
    tls_key_path: /etc/rmagdum/pki/client-certs/tls.key
  max_outstanding_per_tenant: 1024
  scheduler_address: ""
  tail_proxy_url: http://loki-querier.rmagdum.svc.cluster.local:3100
frontend_worker:
  grpc_client_config:
    tls_ca_path: /etc/rmagdum/pki/client-certs/ca.crt
    tls_cert_path: /etc/rmagdum/pki/client-certs/tls.crt
    tls_enabled: true
    tls_insecure_skip_verify: false
    tls_key_path: /etc/rmagdum/pki/client-certs/tls.key
  scheduler_address: ""
index_gateway:
  mode: simple
ingester:
  max_chunk_age: 2h
ingester_client:
  grpc_client_config:
    tls_ca_path: /etc/rmagdum/pki/client-certs/ca.crt
    tls_cert_path: /etc/rmagdum/pki/client-certs/tls.crt
    tls_enabled: true
    tls_insecure_skip_verify: false
    tls_key_path: /etc/rmagdum/pki/client-certs/tls.key
limits_config:
  discover_log_levels: false
  discover_service_name: []
  ingestion_burst_size_mb: 128
  ingestion_rate_mb: 64
  max_cache_freshness_per_query: 10m
  per_stream_rate_limit: 128M
  per_stream_rate_limit_burst: 256M
  query_timeout: 300s
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  retention_period: 24h
  split_queries_by_interval: 15m
  volume_enabled: true
memberlist:
  join_members:
  - loki-backend.rmagdum.svc.cluster.local
  - loki-read.rmagdum.svc.cluster.local
  - loki-write.rmagdum.svc.cluster.local
pattern_ingester:
  enabled: false
querier:
  max_concurrent: 1024
query_range:
  align_queries_with_step: true
query_scheduler:
  grpc_client_config:
    tls_ca_path: /etc/rmagdum/pki/client-certs/ca.crt
    tls_cert_path: /etc/rmagdum/pki/client-certs/tls.crt
    tls_enabled: true
    tls_insecure_skip_verify: false
    tls_key_path: /etc/rmagdum/pki/client-certs/tls.key
  max_outstanding_requests_per_tenant: 1024
ruler:
  storage:
    s3:
      bucketnames: ruler
      insecure: false
      region: us-east-1
      s3forcepathstyle: false
    type: s3
runtime_config:
  file: /etc/loki/runtime-config/runtime-config.yaml
schema_config:
  configs:
  - from: "2023-01-01"
    index:
      period: 24h
      prefix: index_rmagdum
    object_store: s3
    schema: v13
    store: tsdb
server:
  grpc_listen_port: 9095
  grpc_tls_config:
    cert_file: /etc/rmagdum/pki/server-certs/tls.crt
    client_auth_type: RequireAndVerifyClientCert
    client_ca_file: /etc/rmagdum/pki/server-certs/ca.crt
    key_file: /etc/rmagdum/pki/server-certs/tls.key
  http_listen_port: 3100
  http_server_read_timeout: 600s
  http_server_write_timeout: 600s
  http_tls_config:
    cert_file: /etc/rmagdum/pki/server-certs/tls.crt
    client_auth_type: RequireAndVerifyClientCert
    client_ca_file: /etc/rmagdum/pki/server-certs/ca.crt
    key_file: /etc/rmagdum/pki/server-certs/tls.key
storage_config:
  aws:
    bucketnames: ${BUCKET_NAME}
    s3: s3://${AWS_ACCESS_KEY_ID}:${AWS_SECRET_ACCESS_KEY}@${BUCKET_REGION}
  boltdb_shipper:
    index_gateway_client:
      server_address: dns+loki-backend-headless.rmagdum.svc.cluster.local:9095
  hedging:
    at: 250ms
    max_per_second: 20
    up_to: 3
  object_prefix: rmagdum/loki-s3
  tsdb_shipper:
    active_index_directory: /var/loki/data/tsdb-index
    cache_location: /var/loki/data/tsdb-cache
    index_gateway_client:
      grpc_client_config:
        tls_ca_path: /etc/rmagdum/pki/client-certs/ca.crt
        tls_cert_path: /etc/rmagdum/pki/client-certs/tls.crt
        tls_enabled: true
        tls_insecure_skip_verify: false
        tls_key_path: /etc/rmagdum/pki/client-certs/tls.key
      server_address: dns+loki-backend-headless.rmagdum.svc.cluster.local:9095
tracing:
  enabled: false

Are you sure that the Loki components are connecting to that domain name and not directly to the IP?

Thanks for the reply @jangaraj. From the logs, it seems the gRPC clients are connecting by IP address and not by domain name. Is any additional configuration required to make them use domain names?

Note: The HTTPS connection (from the nginx gateway) works fine with the same certificates.

Try configuring tls_server_name in the gRPC client config.

I was initially using a separate server certificate for each component (read, write and backend), which made it difficult to configure tls_server_name for every gRPC client. I then switched to a single server certificate for all components and set tls_server_name to the DNS name that certificate was issued for. It's working now.
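
For anyone landing here later, the change amounts to something like the fragment below. This is only a sketch: the name loki.rmagdum.svc.cluster.local is a hypothetical placeholder and must be replaced with a DNS SAN that is actually present in the shared server certificate. With tls_server_name set, the client verifies the peer certificate against that name instead of the IP address it dialed, which is why the "doesn't contain any IP SANs" error goes away. Only one client block is shown here; the same key would go into each of the other grpc_client_config blocks and into compactor_grpc_client.

```yaml
# Sketch only: "loki.rmagdum.svc.cluster.local" is a hypothetical DNS name.
# Use a DNS SAN that really exists in the server certificate shared by all components.
ingester_client:
  grpc_client_config:
    tls_enabled: true
    tls_ca_path: /etc/rmagdum/pki/client-certs/ca.crt
    tls_cert_path: /etc/rmagdum/pki/client-certs/tls.crt
    tls_key_path: /etc/rmagdum/pki/client-certs/tls.key
    # Verification stays on; the certificate is still checked against the CA.
    tls_insecure_skip_verify: false
    # Name to verify the peer certificate against, instead of the dialed IP.
    tls_server_name: loki.rmagdum.svc.cluster.local
```

The trade-off is that every component must present a certificate valid for that one name, which is why a single shared server certificate was the simpler route here.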

Thank you so much for your quick response @jangaraj

One topic = one issue pls.