Filtering option for Elasticsearch datasource

karthick2020 · January 23, 2020, 5:59am

Hi Team,
We have designed a dashboard in Grafana with Elasticsearch as the datasource. In the Metricbeat agent, a new field is added via metricbeat.yml with the config below:

fields:
  application: ["exxS-e11", "eBxxxxH-e11", "exxS-e10"]
fields_under_root: true

So in Kibana, the app info is displayed as below

In Grafana, we created a variable to list the applications for filtering.

When we filter on any one application to get the unique count, it by default includes the other 2 apps in the count and displays the value as “3”. This may be due to the cardinality aggregation feature in ES, but we want to filter and display the value as “1”. Is this feasible?

malamgir · January 23, 2020, 2:15pm

Yes, you can do this. Since you have created a variable named application, put this variable in the where clause of your query, like
where application in $Variablename (the name you assigned to the variable). I hope this will be helpful.

karthick2020 · January 23, 2020, 5:26pm

Thanks. But this is for Elasticsearch as the source, and we use a Lucene query in Grafana. When we filter on one application, it shows the unique count as 3 (including the other 2 in the array), as shown below.

malamgir · January 24, 2020, 6:41am

I plan to work with Elasticsearch for log files, but I have not used it yet. Anyway, thanks for the update. When you find the solution, let me know. It will be helpful for me. Thanks.

b0b · February 4, 2020, 10:01am

Hi @karthick2020,

Is 3 not the expected result? The screenshot in your original post shows 3 unique values for application.

I think you should use Count if you want to graph the occurrence of a specific $APPLICATION.

I don’t use Elasticsearch much as a Data Source so I am not 100% sure though…

karthick2020 · February 12, 2020, 12:58pm

Hi @b0b,
Yes, we expect the unique count result to be “1”, since we apply a filter and search for a specific application from an array of strings.

b0b · February 12, 2020, 1:36pm

As far as I know, Lucene/Elasticsearch does not work quite like that.

I’m sure you will see the same result if you query Elasticsearch directly like this

You will search the application field for $APPLICATION, and the result will be how many unique values of application were found in all results combined that matched application:$APPLICATION.

The result is application:["exxS-e11","eBxxxxH-e11","exxS-e10"] which is evaluated as 3.

That is exactly what I would expect Grafana to return for that query.
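
To make the behaviour described above concrete, here is a small Python sketch. It only mimics what a Lucene filter followed by a unique count does on an array field; it does not call Elasticsearch. The host name and the helper names `matches` and `unique_count` are assumptions made up for the illustration.

```python
# One hypothetical Metricbeat event; `application` is an array, as in the original config.
docs = [
    {"host": "server-1", "application": ["exxS-e11", "eBxxxxH-e11", "exxS-e10"]},
]

def matches(doc, app):
    """Mimic a Lucene query such as application:"exxS-e11" on an array field."""
    return app in doc["application"]

def unique_count(hits, field):
    """Mimic a unique count: distinct values of `field` across all matching documents."""
    values = set()
    for doc in hits:
        values.update(doc[field])
    return len(values)

hits = [d for d in docs if matches(d, "exxS-e11")]
print(len(hits))                          # 1 matching document
print(unique_count(hits, "application"))  # 3 distinct values inside that document
```

The filter selects whole documents, so the unique count still sees every value stored in the array of each matching document.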

karthick2020 · February 13, 2020, 8:46am

Hi @b0b,
Thanks, but our team expects to see the response value as “1”. Are there any alternative options I can use to get this exact unique count (ignoring the combination) as “1”, either at the Elasticsearch level or in Grafana?

b0b · February 13, 2020, 10:26am

It is mainly Elasticsearch, but Unique Count in Grafana also does not do what you expect it to do.

Can you change application from an array to a hash?

Instead of this

application: ["exxS-e11","eBxxxxH-e11","exxS-e10"]

You would have

application:
  exxS-e11: true
  eBxxxxH-e11: true
  exxS-e10: true

Then you could query with _exists_:application.$APPLICATION

Elastic forums have other suggestions
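
As a rough illustration of the hash layout suggested above, the following Python sketch mimics an `_exists_:application.<name>` check on nested documents. It does not talk to Elasticsearch; the two sample events, the host names, and the helper `field_exists` are hypothetical.

```python
# Hypothetical events using the hash layout instead of an array.
docs = [
    {"host": "server-1",
     "application": {"exxS-e11": True, "eBxxxxH-e11": True, "exxS-e10": True}},
    {"host": "server-2",
     "application": {"exxS-e11": True}},
]

def field_exists(doc, path):
    """Mimic _exists_:a.b by walking nested dicts along a dotted path."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return node is not None

selected = "exxS-e11"  # stands in for the $APPLICATION template variable
hits = [d for d in docs if field_exists(d, f"application.{selected}")]
hosts = sorted({d["host"] for d in hits})
print(hosts)  # ['server-1', 'server-2']
```

With this layout each application is its own field, so a count can be taken on the matching documents (or on another field such as the host) rather than on the values of a shared array.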

karthick2020 · February 13, 2020, 11:27am

Hi @b0b,
Thanks for the suggestions.

I will check on using a hash (I also need to check its flexibility with automated deployment scenarios).
From the two ES forum references you shared, I could try the 1st reference and check whether the option below is helpful. But it may be challenging to handle via the Grafana template variable filter. We would also have to ensure that our filter or search always matches the expected response structure.

Configure:
PUT metricbeat/doc/1
{
  "application": [
    "exxS-E11",
    "ebxxxh-E10"
  ],
  "required_matches": 2
}

Search as:
GET metricbeat/doc/_search
{
  "query": {
    "bool": {
      "must": [{
        "terms_set": {
          "application": {
            "terms": ["exxS-E11", "ebxxxh-E10"],
            "minimum_should_match_field": "required_matches"
          }
        }
      }],
      "filter": [{
        "range": { "required_matches": { "gte": "2" }}
      }]
    }
  }
}

2nd ES forum reference may consume more resources and lead to performance issues.
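
The matching rule behind the terms_set search above can be sketched in plain Python. This is only a simulation under the assumption that `application` is matched exactly (as a keyword field would be); the second document and the helpers `terms_set_match` and `search` are hypothetical.

```python
def terms_set_match(doc, field, terms, min_match_field):
    """Mimic terms_set: the document matches when at least
    doc[min_match_field] of the query terms occur in doc[field]."""
    required = doc[min_match_field]
    found = len(set(terms) & set(doc[field]))
    return found >= required

def search(docs, terms, min_required):
    """Combine the terms_set clause with the range filter on required_matches."""
    return [
        d for d in docs
        if terms_set_match(d, "application", terms, "required_matches")
        and d["required_matches"] >= min_required
    ]

docs = [
    {"application": ["exxS-E11", "ebxxxh-E10"], "required_matches": 2},
    {"application": ["exxS-E11"], "required_matches": 1},  # hypothetical extra document
]

both = search(docs, ["exxS-E11", "ebxxxh-E10"], 2)
single = search(docs, ["exxS-E11"], 2)
print(len(both))    # 1: only the document listing both applications
print(len(single))  # 0: one term cannot satisfy required_matches = 2
```

This shows the limitation mentioned above: the search has to supply the exact combination of applications, which is hard to drive from a single-value template variable.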

b0b · February 13, 2020, 2:33pm

Maybe adding metadata from a process is easier than adding fields directly which might not be very dynamic…

https://www.elastic.co/guide/en/beats/metricbeat/current/add-process-metadata.html

I use metadata fields in Logstash and they can be dynamically updated and changed. Maybe that can work for Metricbeat as well.

karthick2020 · February 14, 2020, 9:54am

Thanks @b0b. Initially I was planning to use this process metadata, but I refrained and used fields for unique tracking of certain additional details such as application. What would be the major difference between metadata and fields? Could you please clarify, or share a reference to understand the difference better?

b0b · February 14, 2020, 10:20am

Not sure how well this works for metricbeat… Metadata is mostly internal only to the service in question.

When I use metadata fields in Logstash I can set them on different inputs for different kinds of logs. Then I use the metadata to set to which Elasticsearch index the messages will be routed. My Logstash output looks like this

output {
  elasticsearch {
        hosts => ["10.0.0.1:9200"]
        index => "%{[@metadata][log_prefix]}-%{[@metadata][index]}-%{+yyyy.MM.dd}"
  }
}

This is just to illustrate how I use them. With this I do not need to use conditionals based on field values to route certain logs to certain indices…
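
For readers unfamiliar with the index pattern in the output above, this Python sketch shows roughly how such a pattern expands for one event. It is not Logstash code; the metadata values and the helper `index_name` are made up, and the date part is taken from the event timestamp in UTC.

```python
from datetime import datetime, timezone

def index_name(metadata, timestamp):
    """Expand a pattern like <log_prefix>-<index>-yyyy.MM.dd for one event."""
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"{metadata['log_prefix']}-{metadata['index']}-{day}"

# Hypothetical metadata, as it might be set on one Logstash input.
event_metadata = {"log_prefix": "prod", "index": "nginx"}
event_time = datetime(2020, 2, 14, 10, 20, tzinfo=timezone.utc)

print(index_name(event_metadata, event_time))  # prod-nginx-2020.02.14
```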

I was imagining that maybe this would be possible in metricbeat

  fields:
   application: %{[@metadata][application_id]}

Something like that but I can not find if it is possible in metricbeat or not… And how you would assign the value for the metadata field…

Which is why I suggested Add process metadata as that is something that is documented. Or if you are running containers there is also Add Docker metadata.

That is about as much as I know on this topic.

Hope that helps.

karthick2020 · February 17, 2020, 4:37am

Hi @b0b,
Thanks. I tried the suggested hash, but in combination with an array (to handle the template variable), as shown below

fields:
  application:
    exxxx11: true
    eBxxxxx11: true
  application1: [exxxx11,eBxxxx11]
fields_under_root: true

application - for the hash
application1 - array retained only to handle “template variable” filtering
But it does not help when I try to get the count dynamically with the filter option (from the array-based field, used only for the template variable), as shown below

karthick2020 · February 24, 2020, 9:17am

Hi @b0b, could you please check the hash-based approach I shared above and clarify?

b0b · February 24, 2020, 3:30pm

I’m doing some tests @karthick2020,

I have not worked much with Elasticsearch as a Data Source…

b0b · February 24, 2020, 3:56pm

This is my test data

{"doc_nr": 1,"app1":true,"applications":["app1"]}
{"doc_nr": 2,"app2":true,"applications":["app2"]}
{"doc_nr": 3,"app3":true,"applications":["app3"]}
{"doc_nr": 4,"app1":true,"app2":true,"applications":["app1","app2"]}
{"doc_nr": 5,"app3":true,"app2":true,"applications":["app3","app2"]}
{"doc_nr": 6,"app3":true,"app1":true,"app2":true,"applications":["app3","app2","app1"]}

I did not get the grouping to work the way I wanted… I haven’t used it before like this…

It is also not possible to choose several apps the way I did it…
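
The difference between the two layouts in the test data above can be checked offline with a short Python sketch. It only processes the six sample documents locally; the helpers `count_flag` and `unique_apps` are hypothetical names, not Grafana or Elasticsearch functions.

```python
import json

raw = """
{"doc_nr": 1,"app1":true,"applications":["app1"]}
{"doc_nr": 2,"app2":true,"applications":["app2"]}
{"doc_nr": 3,"app3":true,"applications":["app3"]}
{"doc_nr": 4,"app1":true,"app2":true,"applications":["app1","app2"]}
{"doc_nr": 5,"app3":true,"app2":true,"applications":["app3","app2"]}
{"doc_nr": 6,"app3":true,"app1":true,"app2":true,"applications":["app3","app2","app1"]}
"""
docs = [json.loads(line) for line in raw.strip().splitlines()]

def count_flag(docs, app):
    """Count documents where the per-application boolean field is set."""
    return sum(1 for d in docs if d.get(app) is True)

def unique_apps(docs, app):
    """Distinct values of `applications` across documents that contain `app`."""
    values = set()
    for d in docs:
        if app in d["applications"]:
            values.update(d["applications"])
    return len(values)

print(count_flag(docs, "app1"))   # 3 documents carry the app1 flag
print(unique_apps(docs, "app1"))  # 3 distinct values appear alongside app1
```

The per-application flag gives a document count for one app, while a unique count over the array still includes every app that shares a document with the selected one.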

karthick2020 · February 25, 2020, 5:38am

Thanks @b0b. Is the grouping you are referring to about selecting multiple apps from the template variable and getting the exact unique count based on the multiple selection?

b0b · February 25, 2020, 8:46am

Hi @karthick2020,

I should have written “Group by” instead of grouping

The third row in the query editor. With a short interval when I used “Group by” Date Histogram I got a float instead of 0 or 1. I guess it was the average over the time range when split into interval sized buckets, if that makes sense…

This is unfortunately not a problem I personally need solving at the moment, and I don’t have time for more testing as I have other priorities…

As I mentioned before, I have no direct experience of doing exactly what you are doing, so everything I have written has been theoretical suggestions of what could work.

Good luck. Hopefully you get it to work the way you expect it to.

karthick2020 · February 25, 2020, 8:52am

Thanks @b0b. Did you configure that test data directly in the metricbeat yml or via the API? I tried a similar configuration in metricbeat.yml, but the config file does not load when starting Metricbeat.
