JSON object appearing across multiple lines instead of as a single log entry

Hi Team,

I’m trying to monitor event logs from an application in Grafana. The application outputs JSON, which a script dumps into an eventLogs.json file on disk - here’s an example containing three objects:

{
  "object": "list",
  "data": [
    {
      "object": "event",
      "type": 1100,
      "itemId": "aba8329b-16fb-407d-abda-af51009b3d01",
      "collectionId": null,
      "groupId": null,
      "policyId": null,
      "memberId": null,
      "actingUserId": "dffc7182-1203-4af0-b1d0-af51009a9a0c",
      "installationId": null,
      "date": "2022-11-18T09:25:12.3433333Z",
      "device": 10,
      "ipAddress": "2.198.xxx.xxx"
    },
    {
      "object": "event",
      "type": 1108,
      "itemId": "0deba616-9f81-488f-81c1-af4a01040347",
      "collectionId": null,
      "groupId": null,
      "policyId": null,
      "memberId": null,
      "actingUserId": "dffc7182-1203-4af0-b1d0-af51009a9a0c",
      "installationId": null,
      "date": "2022-11-18T09:24:57.071Z",
      "device": 10,
      "ipAddress": "2.198.xxx.xxx"
    },
    {
      "object": "event",
      "type": 1107,
      "itemId": "f812baad-6e31-4fac-8c8a-af4a0103a7f4",
      "collectionId": null,
      "groupId": null,
      "policyId": null,
      "memberId": null,
      "actingUserId": "83cd55a9-95bf-4eb5-a221-af4900c54bf7",
      "installationId": null,
      "date": "2022-11-11T15:45:36.167Z",
      "device": 10,
      "ipAddress": "2.198.xxx.xxx"
    }
  ],
  "continuationToken": "5249723801789057904"
}

I’m scraping this using Promtail - here’s my scrape config:

- job_name: xxx-eventLogs_json
  pipeline_stages:
    - json:
        expressions:
          event: event
          eventType: eventType
          itemId: itemId
          collectionId: collectionId
          groupId: groupId
          policyId: policyId
          memberId: memberId
          actingUserId: actingUserId
          installationId: installationId
          date: date
          device: device
          ipAddress: ipAddress
    - labels:
        event:
        eventType:
        itemId:
        collectionId:
        groupId:
        policyId:
        memberId:
        actingUserId:
        installationId:
        date:
        device:
        ipAddress:
        job: xxxlogs-eventLogs_json
    - timestamp:
        format: RFC3339Nano
        source: date
  static_configs:
  - targets:
      - localhost
    labels:
      job: xxxlogs-eventLogs_json
      __path__: /var/xxxlogs/eventLogs/*.json
      host: xxx-grafana

These lines are appearing correctly labelled inside Grafana, but each line of the pretty-printed JSON shows up as its own log line, instead of each object arriving as a single log line with multiple labels.

In addition, my timestamp pipeline stage doesn’t seem to have taken effect - ‘date’ is just a label on the data.

I also made an attempt at exporting to .csv, but using the JSON labels seems ‘neater’.

I would very much appreciate any help in getting these JSON objects read as single log lines.

Two problems I can see here:

  1. Promtail expects each line to be an individual log line.
  2. I am not aware of a way to do a loop within a pipeline.

In your example, you would need to extract everything under data and then loop through each element as a single JSON object, which I don’t think can be accomplished with Promtail configuration alone. I’d recommend some sort of script to transform the data list into an individual log line per element.
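A minimal sketch of such a script, assuming Python and hypothetical input/output paths (eventLogs.json in, eventLogs.ndjson out) - it flattens the data array into one compact JSON object per line, so Promtail sees one entry per event:

```python
import json

def flatten_events(payload: dict) -> list[str]:
    """Return each element of payload["data"] as a compact one-line JSON string."""
    return [json.dumps(event, separators=(",", ":"))
            for event in payload.get("data", [])]

if __name__ == "__main__":
    # Hypothetical paths: read the dumped API response, write NDJSON
    with open("eventLogs.json") as src:
        payload = json.load(src)
    with open("eventLogs.ndjson", "w") as dst:
        dst.write("\n".join(flatten_events(payload)) + "\n")
```

Pointing `__path__` at the flattened output (and keeping the json pipeline stage) should then give one labelled entry per event.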

OK - perfect, so I’d be better off abandoning the JSON import, going the .csv route, and then extracting labels via regex.

I’d previously been playing with this configuration and got it to a stage where I was happily extracting the labels, but not managing to convert the timestamp. I’ll have another look tomorrow.

Thanks once again.

OK - I’ve had another play.

I’m now outputting my event logs to .csv format - here are some example lines:

event,1100,0deba616-9f81-488f-81c1-af4a01040347,,,,,83cd55a9-95bf-4eb5-a221-af4900c54bf7,,2022-11-11T15:46:40.6766667Z,10,2.198.xxx.xxx
event,1115,f812baad-6e31-4fac-8c8a-af4a0103a7f4,,,,,83cd55a9-95bf-4eb5-a221-af4900c54bf7,,2022-11-11T15:45:38.63Z,10,2.198.xxx.xxx
event,1107,f812baad-6e31-4fac-8c8a-af4a0103a7f4,,,,,83cd55a9-95bf-4eb5-a221-af4900c54bf7,,2022-11-11T15:45:36.167Z,10,2.198.xxx.xxx
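For reference, the JSON-to-CSV transform itself can be sketched in Python - the column order is inferred from the sample lines above, and the file paths are hypothetical:

```python
import csv
import json

# Column order inferred from the sample CSV lines above
FIELDS = ["object", "type", "itemId", "collectionId", "groupId", "policyId",
          "memberId", "actingUserId", "installationId", "date", "device",
          "ipAddress"]

def events_to_rows(payload: dict) -> list[list]:
    """Turn the API response's "data" list into CSV rows, one per event,
    writing null or missing fields as empty strings."""
    return [["" if event.get(f) is None else event.get(f) for f in FIELDS]
            for event in payload["data"]]

if __name__ == "__main__":
    with open("eventLogs.json") as src:
        payload = json.load(src)
    with open("eventLogs.csv", "w", newline="") as dst:
        csv.writer(dst).writerows(events_to_rows(payload))
```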

and the applicable part of my Promtail config.yml now looks like this:

scrape_configs:
- job_name: xxx-eventLogs_csv
  pipeline_stages:
    - regex:
        expression: '^(?P<event>[a-z]*),(?P<eventType>\d{4}),(?P<itemId>[^,]*),(?P<collectionId>[^,]*),(?P<groupId>[^,]*),(?P<policyId>[^,]*),(?P<memberId>[^,]*),(?P<actingUserId>[^,]*),(?P<installationId>[^,]*),(?P<date>[^,]*),(?P<device>[^,]*),(?P<ipAddress>\d{8,16})$'
    - labels:
        eventType:
        itemId:
        collectionId:
        groupId:
        memberId:
        actingUserId:
        date:
        ipAddress:
    - timestamp:
        format: RFC3339Nano
        source: date
  static_configs:
  - targets:
      - localhost
    labels:
      host: xxx-grafana
      job: xxxlogs-eventLogs_csv
      __path__: /var/xxxlogs/eventLogs/*.csv

I’m aware that this is quite a large number of labels, but that’s something I’ll deal with afterwards.

My issue now is that the pipeline stages I’ve set don’t appear to be picked up. I can’t see any errors in the logs or the GUI - just nothing is happening.

I am expecting labels to be added to my data based on the (?P<name>) capture groups in the regex, and then the field extracted as ‘date’ to be set as the timestamp.

From this screenshot, the timestamp is still being set to the moment at which Promtail scraped the log, and no labels are being added.
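One way to narrow this down is to test the expression against a sample line outside Promtail. A quick check in Python (whose re module accepts the same (?P<name>) named-group syntax; Promtail itself uses Go’s regexp engine) suggests the expression never matches - `(?P<ipAddress>\d{8,16})` allows only digits, and an address such as 2.198.0.1 contains dots, so the whole regex fails and no fields are extracted:

```python
import re

# The expression from the Promtail config above, verbatim
expression = (
    r'^(?P<event>[a-z]*),(?P<eventType>\d{4}),(?P<itemId>[^,]*),'
    r'(?P<collectionId>[^,]*),(?P<groupId>[^,]*),(?P<policyId>[^,]*),'
    r'(?P<memberId>[^,]*),(?P<actingUserId>[^,]*),(?P<installationId>[^,]*),'
    r'(?P<date>[^,]*),(?P<device>[^,]*),(?P<ipAddress>\d{8,16})$'
)

# One of the sample lines, with the masked address replaced by a
# plausible real one (the masking is not the problem - the dots are)
line = ("event,1100,0deba616-9f81-488f-81c1-af4a01040347,,,,,"
        "83cd55a9-95bf-4eb5-a221-af4900c54bf7,,"
        "2022-11-11T15:46:40.6766667Z,10,2.198.0.1")

print(re.match(expression, line))  # → None: the stage extracts nothing
```

Loosening the final group (e.g. `(?P<ipAddress>[^,]+)`) is one plausible fix, though that is an assumption based only on the sample lines shown here.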

All help is greatly appreciated.