Prometheus: Count metric value over a period of time

I don't speak English very well, but I need some advice. I'm using Prometheus. How can I count how many times a service was down over a period of time? Here is my current query:

 irate(ALERTS{job="blackbox", alertstate="firing"}[2h])


Solution 1:[1]

From the image you provided, it seems like the value of the metric (say metric_x) is 1 when the server is down, otherwise 0.

You can use the sum_over_time(range-vector) query function to count how many samples had the value 1 within a given interval.

The following query calculates the sum of metric_x (i.e. how many times metric_x was equal to 1) within the last 5 minutes.

sum_over_time(metric_x{job="xxx"}[5m])

Similarly, the following query calculates the sum of metric_x over a one-day window that ended one week ago. The range selector sets the window length and the offset shifts the end of the window back in time.

sum_over_time(metric_x{job="xxx"}[1d] offset 1w)
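To make the window and offset behaviour concrete, here is a minimal Python sketch that imitates sum_over_time on a hand-made 0/1 series. The sample data, the one-minute scrape interval, and the exact boundary handling (start excluded, end included) are assumptions for illustration only; this is not how Prometheus is implemented.

```python
from datetime import datetime, timedelta

def sum_over_time(samples, now, window, offset=timedelta(0)):
    """Sum the values of samples whose timestamp lies in (end - window, end],
    where end = now - offset. Boundary handling is an assumption."""
    end = now - offset
    start = end - window
    return sum(value for ts, value in samples if start < ts <= end)

# Hypothetical series scraped once a minute for 10 minutes.
# The service is "down" (value 1) at 2 and 3 minutes before `now`.
now = datetime(2024, 1, 1, 12, 0, 0)
samples = [(now - timedelta(minutes=m), 1 if 2 <= m <= 3 else 0) for m in range(10)]

# Like metric_x[5m]: two samples with value 1 in the last 5 minutes.
down_last_5m = sum_over_time(samples, now, timedelta(minutes=5))

# Like metric_x[5m] offset 5m: the 5-minute window that ended 5 minutes ago.
down_previous_5m = sum_over_time(samples, now, timedelta(minutes=5), offset=timedelta(minutes=5))

print(down_last_5m, down_previous_5m)
```

Note that the result counts samples, not outages: one long outage that spans several scrapes contributes one unit per scrape.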

Update:

If I were you, I would create a recording rule for the query and run sum_over_time() on the new metric that the rule creates.

groups:
  - name: rules
    rules:
    - record: blackbox:ALERTS:irate
      expr: irate(ALERTS{job="blackbox", alertstate="firing"}[2h])

Then query the recorded metric:

sum_over_time(blackbox:ALERTS:irate{job="blackbox", alertstate="firing"}[1d] offset 1w)

Solution 2:[2]

If the metric can only have the values 0 or 1, then sum_over_time(metric[d]) calculates the number of samples with value 1 over the specified lookbehind window d. For example, sum_over_time(up[1h]) returns the number of up samples with value 1 during the last hour. The number of samples with value 0 can then be calculated as count_over_time(up[1h]) - sum_over_time(up[1h]).
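The arithmetic above can be checked with a small Python sketch. The list of up samples and the 15-second scrape interval are made-up values, and the two helper functions only mimic the PromQL functions of the same name on a plain list.

```python
def count_over_time(values):
    """Number of samples in the window."""
    return len(values)

def sum_over_time(values):
    """Sum of sample values in the window."""
    return sum(values)

# Hypothetical `up` samples inside one lookbehind window (1 = up, 0 = down).
up_samples = [1, 1, 0, 0, 1, 0, 1, 1]

ones = sum_over_time(up_samples)                 # samples with value 1
zeros = count_over_time(up_samples) - ones       # samples with value 0

# Assumed scrape interval; multiplying gives only a rough downtime estimate.
scrape_interval_seconds = 15
approx_downtime_seconds = zeros * scrape_interval_seconds

print(ones, zeros, approx_downtime_seconds)
```

The downtime figure is an approximation, since the service state between two scrapes is unknown.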

If the metric can have values other than 0 and 1, then Prometheus doesn't yet provide a function for counting the number of samples with a particular value :( But VictoriaMetrics provides such a function: count_eq_over_time. For example, the following MetricsQL query returns the number of samples of the some_metric time series with the value 42 over the last hour:

count_eq_over_time(some_metric[1h], 42)
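As a rough illustration of what such a function computes, the Python sketch below counts the samples equal to a target value in a plain list. The sample values are invented, and the helper only borrows the MetricsQL function's name; it says nothing about how VictoriaMetrics implements it.

```python
def count_eq_over_time(values, target):
    """Count the samples in the window whose value equals `target`."""
    return sum(1 for v in values if v == target)

# Hypothetical samples of some_metric within the last hour.
some_metric = [42, 7, 42, 0, 13, 42]

matches = count_eq_over_time(some_metric, 42)

# Summing the values does not work here, because the values are not limited to 0 and 1.
plain_sum = sum(some_metric)

print(matches, plain_sum)
```

This shows why the sum_over_time trick only applies to 0/1 metrics: for arbitrary values the sum no longer equals a count.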

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 valyala