Logstash-output-loki docker image

I’ve been trying to make the logstash-output-loki image from Docker Hub work with an s3 input.

My logstash config looks like:

input {
    s3 {
        bucket => "foo-test-alb"
        region => "us-east-2"
        # access_key_id => ""
        # secret_access_key => ""
        add_field => {
            "doctype" => "aws-application-load-balancer"
            "log_format" => "aws-application-load-balancer"
        }
    }
}

filter {
    if [doctype] == "aws-application-load-balancer" or [log_format] == "aws-application-load-balancer" {
        grok {
            match => [ "message", '%{NOTSPACE:request_type} %{TIMESTAMP_ISO8601:log_timestamp} %{NOTSPACE:alb-name} %{NOTSPACE:client} %{NOTSPACE:target} %{NOTSPACE:request_processing_time:float} %{NOTSPACE:target_processing_time:float} %{NOTSPACE:response_processing_time:float} %{NOTSPACE:elb_status_code} %{NOTSPACE:target_status_code:int} %{NOTSPACE:received_bytes:float} %{NOTSPACE:sent_bytes:float} %{QUOTEDSTRING:request} %{QUOTEDSTRING:user_agent} %{NOTSPACE:ssl_cipher} %{NOTSPACE:ssl_protocol} %{NOTSPACE:target_group_arn} %{QUOTEDSTRING:trace_id} "%{DATA:domain_name}" "%{DATA:chosen_cert_arn}" %{NOTSPACE:matched_rule_priority} %{TIMESTAMP_ISO8601:request_creation_time} "%{DATA:actions_executed}" "%{DATA:redirect_url}" "%{DATA:error_reason}" "%{DATA:target_port_list}" "%{DATA:target_status_code_list}"']
        }
        date {
            match => [ "log_timestamp", "ISO8601" ]
        }
        mutate {
            gsub => [
                "request", '"', "",
                "trace_id", '"', "",
                "user_agent", '"', ""
            ]
        }
        if [request] {
            grok {
                match => ["request", "(%{NOTSPACE:http_method})? (%{NOTSPACE:http_uri})? (%{NOTSPACE:http_version})?"]
            }
        }
        if [http_uri] {
            grok {
                match => ["http_uri", "(%{WORD:protocol})?(://)?(%{IPORHOST:domain})?(:)?(%{INT:http_port})?(%{GREEDYDATA:request_uri})?"]
            }
        }
        if [client] {
            grok {
                match => ["client", "(%{IPORHOST:c_ip})?"]
            }
        }
        if [target_group_arn] {
            grok {
                match => [ "target_group_arn", "arn:aws:%{NOTSPACE:tg-arn_type}:%{NOTSPACE:tg-arn_region}:%{NOTSPACE:tg-arn_aws_account_id}:targetgroup\/%{NOTSPACE:tg-arn_target_group_name}\/%{NOTSPACE:tg-arn_target_group_id}" ]
            }
        }
        if [c_ip] {
            geoip {
                source => "c_ip"
                target => "geoip"
            }
        }
        if [user_agent] {
            useragent {
                source => "user_agent"
                prefix => "ua_"
            }
        }
    }
}

output {
  stdout {}
}

output {
  loki {
    url => "https://loki.mon:3100/loki/api/v1/push"
    batch_size => 112640 #112.64 kilobytes
    retries => 5
    min_delay => 3
    max_delay => 500
    message_field => "message"
  }
  # stdout { codec => rubydebug }
}

This config works fine if I install Logstash 7.14.0 on an EC2 instance and comment out the loki section (the plugin isn’t installed there): the logs in the s3 bucket get piped to stdout.

I want this to run in ECS (Elastic Container Service), so I wrote this Dockerfile to build a container:

FROM grafana/logstash-output-loki
COPY logstash.conf /home/logstash/loki-test.conf
COPY ./logstash.yml /usr/share/logstash/config/logstash.yml

The logstash.yml looks like this:

http.host: "0.0.0.0"
xpack.monitoring.enabled: false

I replaced the default logstash.yml because, with the one shipped in the image, Logstash kept trying to find an Elasticsearch server to connect to for licensing reasons.

I then launched this container in my monitoring ECS cluster but it fails to connect to the s3 bucket and start processing logs.

It doesn’t give much in the way of errors: 14 log lines in total, and none above warning level. By studying the initialisation messages I have worked out what constitutes success and failure.

On Logstash 7.14.0 on an EC2 instance I see:
[INFO ] 2021-08-26 19:57:25.767 [Converge PipelineAction::Create<main>] Reflections - Reflections took 220 ms to scan 1 urls, producing 120 keys and 417 values
but from the container I get:
[INFO ][org.reflections.Reflections] Reflections took 93 ms to scan 1 urls, producing 22 keys and 45 values
The former then goes on to spew formatted logs to stdout, but the latter provides no output after the
Successfully started Logstash API endpoint {:port=>9600}
log line.

I initially assumed this was going to be IAM/access related and have spent quite some time checking that the permissions were all OK. I also ran a container with the same role and security groups in the same cluster, with the AWS CLI tools on it, and it can quite happily list the contents of the bucket. In addition, I installed Docker on the EC2 instance where Logstash 7.14.0 works from the CLI, and when I run the container built from the logstash-output-loki image there, it also fails to connect to s3. Hence I am now convinced that this is not an IAM issue. (famous last words)

Has anyone run into similar issues?
Why is the Logstash in the logstash-output-loki image on Docker Hub 7.9.3 and not 7.14.0?
Is upgrading the Logstash version likely to help?

I think my next move will be to build a container based on logstash/logstash-oss:7.14.0-amd64 with the loki plugin installed but I would be very grateful for any advice anyone has.
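
A minimal sketch of what that Dockerfile might look like, not something I have tested. It assumes the OSS image is pulled from Elastic's registry under the tag shown (the exact image name and tag are an assumption), that the s3 input plugin is already bundled in that base image as it was in my EC2 install, and that the plugin installs under the gem name logstash-output-loki. It also copies the pipeline config into the image's default pipeline directory rather than /home/logstash, which is a change from my Dockerfile above.

```dockerfile
# Base image and tag are assumptions; adjust to whichever 7.14.0 OSS image you use.
FROM docker.elastic.co/logstash/logstash-oss:7.14.0

# Install the Loki output plugin into the image
# (assumes the plugin is published as logstash-output-loki).
RUN bin/logstash-plugin install logstash-output-loki

# Replace the pipeline that ships with the image with the s3 -> loki config.
# /usr/share/logstash/pipeline/ is where the official images look by default.
COPY logstash.conf /usr/share/logstash/pipeline/logstash.conf

# Same settings file as before (http.host and xpack.monitoring.enabled).
COPY logstash.yml /usr/share/logstash/config/logstash.yml
```

Building this with docker build and running it on the EC2 instance first, where the IAM role is known to work, should show whether the Logstash version or the image layout makes the difference before moving it to ECS.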
