How can I compute the difference between two PromQL queries when I have null values?

I have two PromQL queries in Grafana.

Query 1: max_over_time(counter{label="label1"}[5m])

Query 2: max_over_time(counter{label="label1"}[5m] offset 10m)

There’s an exact match between the labels in both queries, so I don’t believe I need to use the on() modifier. I would like to compute the difference between these queries:

Query 3: max_over_time(counter{label="label1"}[5m]) - max_over_time(counter{label="label1"}[5m] offset 10m)

Query 3 returns a resulting vector which is correct for the most part. If, for example, the resulting vector of Query 1 has an entry at the i’th position with value 1500, and the resulting vector of Query 2 has an entry at the i’th position with value 1000, then the i’th position of the resulting vector in Query 3 becomes 1500 - 1000 = 500.

But when Query 1 has a value of 1000 and Query 2 a value of null (which is formatted as 0), the result becomes 1000 - null = null.

I would like the result to be 1000 in this case. I have attempted to convert all null values to zero, but based on what I’ve read, Prometheus seems to already treat nulls as zeros. I have also attempted to use vector(0):

(max_over_time(counter{label="label1"}[5m]) or vector(0)) - (max_over_time(counter{label="label1"}[5m] offset 10m) or vector(0))

But this doesn't change the result.

Here is a subset of the results obtained from Grafana's query inspector for all three queries.

[Image: correct results for the difference query]

I get correct results for the difference query when there are non-null values. However, as soon as null values occur, the difference can no longer be computed.

[Image: incorrect difference values when there are null values. 1000 - null should be 1000, not null.]

I would really appreciate some helpful tips. Thanks in advance.



Solution 1:[1]

To the extent that vector(0) ever works (and I'm not convinced it would work in this case), it only works as a fallback when the entire data set is null. vector(0) returns a single sample with no labels, so it cannot stand in for an individual missing series. If, for example, Query 1 returns data for 3 different pods and Query 2 returns data for only 1 of those pods (because the other 2 were just scaled up), then the offset data for pods 2 and 3 will be null, causing the subtraction to also result in null for pods 2 and 3. Using vector(0) would not fix this, since it carries no pod labels to match against.

There is no built-in way that I could find to force the subtraction to treat all nulls in the offset data as 0. However, this can be accomplished by subtracting the non-offset data from itself and ORing the result with the offset data, which forces the offset side to return 0 in every case where there is non-offset data:

max_over_time(counter{label="label1"}[5m]) -
   (max_over_time(counter{label="label1"}[5m] offset 10m) or
      (max_over_time(counter{label="label1"}[5m]) -
       max_over_time(counter{label="label1"}[5m])))
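
To see why this works where vector(0) does not, here is a rough Python sketch of the matching rules. It is a toy model, not Prometheus code: an instant vector is modelled as a dict keyed by label set, and the helper names (labels, sub, or_) and the pod values are made up for illustration.

```python
# Toy model of PromQL instant vectors: {label set -> sample value}.
# Hypothetical helpers for illustration only; this is not Prometheus code.

def labels(**kv):
    """Build a hashable label set, e.g. labels(pod="pod1")."""
    return frozenset(kv.items())

def sub(lhs, rhs):
    """Model of `lhs - rhs`: only label sets present on BOTH sides survive."""
    return {k: v - rhs[k] for k, v in lhs.items() if k in rhs}

def or_(lhs, rhs):
    """Model of `lhs or rhs`: all of lhs, plus rhs entries whose label set is not in lhs."""
    out = dict(rhs)
    out.update(lhs)
    return out

# vector(0) is a single sample with an empty label set.
VECTOR_0 = {labels(): 0.0}

# Assumed data: pod2 was scaled up recently, so it has no sample 10 minutes ago.
now = {labels(pod="pod1"): 1500.0, labels(pod="pod2"): 1000.0}
earlier = {labels(pod="pod1"): 1000.0}

# Plain difference: pod2 has no partner on the right-hand side, so it is dropped.
plain = sub(now, earlier)

# `or vector(0)` on both sides: the label-less 0 never matches pod2,
# so pod2 is still missing from the result.
with_vector0 = sub(or_(now, VECTOR_0), or_(earlier, VECTOR_0))

# The trick: `now - now` is 0 for every label set in `now`, so ORing it onto
# the offset data fills each gap with a 0 that carries the right labels.
with_trick = sub(now, or_(earlier, sub(now, now)))

assert labels(pod="pod2") not in plain
assert labels(pod="pod2") not in with_vector0
assert with_trick[labels(pod="pod2")] == 1000.0
```

In this model the zero only helps when it carries the same labels as the series it replaces, which is what the self-subtraction provides.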

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 BrianEff