Some operators turn operand into string constant

I am using Grafana 5.3.1 with InfluxDB. When using the Graph panel editor to construct a WHERE clause on a field value, there is no >= and <= operator in the pull down. When I type it in the input box, the (numeric) operand is put in the clause as string constant. InfluxDB returns no results when comparing a numeric field with a string.
My measurement has the resource usage by user (a few dozen users). I want to make a stacked chart with the most heavy users shown, and the small ones grouped as “others” so have two queries:
select sum(cpu) where cpu > 0.5 group by time(1m),user
select sum(cpu) where cpu <= 0.5 group by time(1m)
I can use “<” and hope nothing was exactly 0.5, but it’s kind of odd.

Rob

InfluxDB editor is designated for filtering by tags, but your are trying to filter by field value. Toggle Edit Mode and write InfluxDB query with correct syntax manually.

Ok, but the inconsistency between “<” and “<=” means that for example the “<” operating on a tag is incorrectly including the value (selected from the pull down) as a numeric value and forms an incorrect query.
I now realize the InfluxDB query does not work for “<” and “>” on tags, so if the query editor is supposed to do tags, we probably should not have “<” and “>” in the menu. But I think it would be nice to have the editor also do clauses on fields instead.

“<=” is not suggested operator by the editor, that means it is not implemented and you should expect the problem with generated syntax. You will need to handle also exceptions such as “<=” for the string/boolean fields, …

Filtering by field value is not recommended, because it can cause full table scan = implementation of filtering by field value is not a good idea. The editor must be generic and foolproof. You can still switch to raw editor mode, where you can write and use what you need (not offered functions/operators, subqueries, filtering by field value, timezone, …).

Anyway, you can create PR if you think that this feature is useful.

You’re right, but the “<” currently is suggested by the editor, and using that on tag values at best returns no result, but possibly also a query that is syntactically incorrect.

True, filtering by field value means InfluxDB scans all records returned for that time range, but so does it for a sum() or max() aggregation. I’m not sure this is so much worse.

Let’s test it (my dataset has 90M+ records; value is a field):

# aggregation function SUM
$ time influx --execute 'SELECT MAX(value) FROM measurement'
name: measurement 
time max_value
---- -----------
0    5643545

real    0m31.940s
user    0m0.116s
sys     0m0.065s

# filtering by field value
$ time influx --execute 'SELECT value FROM measurement WHERE value<-100000000'
        name: measurement 
time                value
----                -----
1533631800000000000 -732515639190

real    0m46.496s
user    0m0.090s
sys     0m0.043s

32 vs 46 sec → filtering by field value is much worse in this particular test case. Of course, if you have a small data set in the selected time range, then you won’t notice any performance problem.

Neat. Thanks. I wonder whether the difference in elapsed time is because the operation can be forked and run in parallel on multiple shards. Your second case would probably want to return the result in order, when there were multiple hits. But indeed, in my case I am running against a fairly short time range and aggregate by time.

At least I now understand why some of my queries did not return a result.

Actually, no order has been specified + result has had only 3 records → order is not a problem. Let’s EXPLAIN it:

$ influx -execute 'EXPLAIN SELECT MAX(value) FROM measurement'
QUERY PLAN
----------
EXPRESSION: max(value::float)
...
NUMBER OF FILES: 257884
NUMBER OF BLOCKS: 329388
SIZE OF BLOCKS: 202085541

$ influx -execute 'EXPLAIN SELECT value FROM measurement WHERE value<-100000000'
QUERY PLAN
----------
EXPRESSION: <nil>
...
NUMBER OF FILES: 515768
NUMBER OF BLOCKS: 658776
SIZE OF BLOCKS: 404171082

So aggregation function doesn’t need full table scan in this case (see a number of files) - it is designated for this usage.

Thank you. I guess I have some more reading to do before I understand the effect. I trust the time range will be applied before the filtering on value, so in my case the impact would be less severe.
I don’t really see how I can avoid it; there are too many categories so the legend would be useless. My filter ensures that I only see the ones that are significant anywhere in the period displayed. It is effective for fairly short periods. Ideally I would want the top() categories, but I can’t get that to work with InfluxDB.