Duplicated metrics using Prometheus federation

What happened?

Duplicated metrics appear on the main Prometheus server:

namespace_workload_pod:kube_pod_owner:relabel{cluster="sample-eks", job="federate", namespace="cert-manager", pod="cert-manager-848f547974-xkg8j", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0", workload="cert-manager", workload_type="deployment"}
namespace_workload_pod:kube_pod_owner:relabel{cluster="sample-eks", namespace="cert-manager", pod="cert-manager-848f547974-xkg8j", workload="cert-manager", workload_type="deployment"}

The only difference is the additional job, prometheus, and prometheus_replica labels on the first series.

Did you expect to see something different?

No duplicated metrics.

How to reproduce it (as minimally and precisely as possible):

Install two kube-prometheus stacks on different clusters:

  • one main
  • one child

The main cluster pulls data from the child cluster through the /federate endpoint.

Federation config:

- job_name: 'federate'
  scrape_interval: 15s
  honor_labels: true
  metrics_path: '/federate'
  params:
    'match[]':
      - '{job=~".*"}'
  static_configs:
  - targets:
    - 'prometheus-sample.dev.xyz.com'

Environment

I use the latest git tag of the kube-prometheus repo on AWS EKS 1.21.

  • Prometheus Logs:
level=warn ts=2021-10-01T15:04:21.300Z caller=scrape.go:1399 component="scrape manager" scrape_pool=federate target="http://prometheus-sample.dev.oryxlabs.com:80/federate?match%5B%5D=%7Bjob%3D~%22.%2A%22%7D" msg="Error on ingesting samples with different value but same timestamp" num_dropped=20


Solution 1:[1]

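The best option here is to deploy all Prometheus servers with replicaExternalLabelName: "" set in the Prometheus spec. With an empty string, the operator does not add the prometheus_replica external label, so the federated series no longer carry it.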

More info here:

https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#prometheusspec
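
As a rough illustration of the setting above, the fragment below sketches where the field sits in a Prometheus custom resource. The name k8s and the namespace monitoring are assumptions inferred from the prometheus="monitoring/k8s" label in the question, and the replica count is purely illustrative. With kube-prometheus the resource is normally generated from jsonnet, so the field would be set there rather than by editing the YAML by hand.

```yaml
# Sketch only: name, namespace, and replicas are assumed, not taken from a real cluster.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s              # assumed from prometheus="monitoring/k8s"
  namespace: monitoring  # assumed from prometheus="monitoring/k8s"
spec:
  replicas: 2            # illustrative value
  # An empty string tells the operator not to add the
  # prometheus_replica external label to this server's series.
  replicaExternalLabelName: ""
```

After the change is applied on every server, the external labels can be checked in the global.external_labels section of the configuration shown on the Prometheus Status > Configuration page.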

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Drazul