PromQL: Filter time series based on presence of recent metrics

In general, my question is: how can I filter time series down to only those that have a recently recorded sample?

The specific case is this: I'm trying to graph container_network_receive_bytes_total from cAdvisor, which gives the bytes received by each container since it started running:

rate(
  container_network_receive_bytes_total{name=~".+",interface="eth0"}[5m]
)

but only for containers that are currently running. The problem is that the above query also shows old, terminated containers in the results, which I don't want (they might be useful elsewhere, but not in this case). For example, when a Docker Swarm stack is broken and stuck in a crash loop, creating a new container and then terminating it every few seconds, I see a bunch of time series for dead, irrelevant containers in the results.

I tried doing a join with the on modifier, based on the current state of the container, something like:

rate(
  container_network_receive_bytes_total{name=~".+",interface="eth0"}[5m]
)
  + on(name) group_right
  container_tasks_state{state="running"}

but this doesn't seem to work, because the operator is evaluated against the container's state at each timestamp in the graph, not against its state at the latest timestamp.
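One possible way around the per-timestamp evaluation is PromQL's @ modifier, which pins a selector to a fixed evaluation time; end() resolves to the end of the graphed range. The query below is a sketch, not something tested against this setup. It assumes a Prometheus version in which the @ modifier is available (older releases needed the promql-at-modifier feature flag), and it assumes that series for terminated containers stop receiving samples, so they no longer appear at the end of the range.

```promql
# Left side: the original rate, evaluated at every step of the graph.
rate(
  container_network_receive_bytes_total{name=~".+",interface="eth0"}[5m]
)
# "and" keeps only left-hand series that have an identically labelled
# series on the right-hand side.
and
# Right side: the same selector pinned to the end of the query range, so it
# only contains series that still have a fresh sample at that time.
container_network_receive_bytes_total{name=~".+",interface="eth0"} @ end()
```

Because the right-hand side is the same at every step, a container that is gone by the end of the range is dropped from the whole graph, while one that is still reporting keeps its full history.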



Solution 1:[1]

container_network_receive_bytes_total provides the image label.

When a container is stopped for whatever reason, the pause container is used as a placeholder (I believe this is not GKE-specific; as far as I know, IBM Cloud and EKS use it too).

Therefore you can filter out non-current instances using a query such as

rate(container_network_receive_bytes_total{name=~".+",interface="eth0",image!~"k8s.gcr.io/pause.*"}[5m]) > 0

after inspecting your image label values.
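To see which image values are actually present before writing the exclusion regex, an aggregation by that label can help. This is only a sketch; the interface="eth0" matcher is carried over from the question and may need adjusting for your setup.

```promql
# One result per distinct image label value; the sample value is the
# number of series carrying that image.
count by (image) (
  container_network_receive_bytes_total{interface="eth0"}
)
```

Any placeholder image that shows up in this list can then be added to the image!~ matcher.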

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jawu