Duplicated metrics using Prometheus federation
What happened?
Duplicated metrics appear on the main Prometheus server:
namespace_workload_pod:kube_pod_owner:relabel{cluster="sample-eks", job="federate", namespace="cert-manager", pod="cert-manager-848f547974-xkg8j", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0", workload="cert-manager", workload_type="deployment"}
namespace_workload_pod:kube_pod_owner:relabel{cluster="sample-eks", namespace="cert-manager", pod="cert-manager-848f547974-xkg8j", workload="cert-manager", workload_type="deployment"}
The only difference is the additional job, prometheus and prometheus_replica labels on the first series.
Did you expect to see something different?
No duplicated metrics.
How to reproduce it (as minimally and precisely as possible):
Install two kube-prometheus stacks on different clusters:
- one main
- one child
The main cluster gets the data from the child cluster through the /federate endpoint.
Federation config:
- job_name: 'federate'
  scrape_interval: 15s
  honor_labels: true
  metrics_path: '/federate'
  params:
    'match[]':
      - '{job=~".*"}'
  static_configs:
    - targets:
      - 'prometheus-sample.dev.xyz.com'
Environment
I use the latest git tag of the kube-prometheus repo on AWS EKS 1.21.
- Prometheus Logs:
level=warn ts=2021-10-01T15:04:21.300Z caller=scrape.go:1399 component="scrape manager" scrape_pool=federate target="http://prometheus-sample.dev.oryxlabs.com:80/federate?match%5B%5D=%7Bjob%3D~%22.%2A%22%7D" msg="Error on ingesting samples with different value but same timestamp" num_dropped=20
Solution 1:[1]
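The best option here is to deploy all Prometheus servers with the option replicaExternalLabelName: "". With an empty value, the Prometheus Operator no longer adds the prometheus_replica external label to the series each server exposes.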
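As a rough illustration, this is where the option would sit in a Prometheus custom resource managed by the Prometheus Operator. It is only a sketch: the name and namespace are assumptions taken from the prometheus="monitoring/k8s" label in the question, and every other field of the spec is left out.

```yaml
# Sketch of a Prometheus custom resource (Prometheus Operator).
# name/namespace are assumed from the prometheus="monitoring/k8s" label above;
# adjust them to match your own deployment.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  # An empty string tells the operator not to add the
  # prometheus_replica external label.
  replicaExternalLabelName: ""
  # ... rest of the existing spec unchanged ...
```

If the stack is generated from the kube-prometheus jsonnet sources, the same field would have to be set there rather than by editing the rendered manifest, otherwise the next regeneration overwrites it.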
More info here:
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Drazul |
