Elasticsearch get unique values from a text value

I have k8s log data collected by Fluent Bit and sent to Elasticsearch in a format such as

{"log": "job_id: 1", "kubernetes.host":"minikube", ... }
{"log": "job_id: 2", "kubernetes.host":"minikube", ... }
{"log": "job_id: 3", "kubernetes.host":"minikube", ... }
{"log": "job_id: 1", "kubernetes.host":"minikube", ... }

and I would like to write an Elasticsearch GET query that returns the unique job_ids as

["1", "2", "3"]

Any help would be appreciated



Solution 1:[1]

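You can use a terms aggregation to return the list of unique job IDs. This assumes the log field has a keyword sub-field (log.keyword), which the default dynamic mapping creates for string values: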

{
    "size": 0,
    "aggs": {
        "unique-job-id": {
            "terms": {
                "field": "log.keyword"
            }
        }
    }
}
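
One thing to watch: a terms aggregation returns only the top 10 buckets by default, so with more than 10 distinct values you would need to set size inside the terms block. Below is a minimal Python sketch that builds the same request body with that setting. The function name build_unique_terms_query and the max_buckets parameter are hypothetical names chosen for illustration, and the value 1000 is an arbitrary assumption, not something from the original answer.

```python
import json


def build_unique_terms_query(field="log.keyword", max_buckets=1000):
    """Build a search body returning one bucket per distinct value of `field`.

    `max_buckets` is a hypothetical parameter name; it is passed as the
    `size` option of the terms aggregation.
    """
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            "unique-job-id": {
                "terms": {
                    "field": field,
                    "size": max_buckets,  # terms aggregations default to 10 buckets
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the _search endpoint.
    print(json.dumps(build_unique_terms_query(), indent=4))
```

The printed JSON can be used as the body of the search request in place of the query above.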

The search response will contain one bucket per distinct log value:

"aggregations": {
    "unique-job-id": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
            {
                "key": "job_id: 1",
                "doc_count": 2
            },
            {
                "key": "job_id: 2",
                "doc_count": 1
            },
            {
                "key": "job_id: 3",
                "doc_count": 1
            }
        ]
    }
}
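
The bucket keys are the full log strings ("job_id: 1"), not the bare IDs asked for in the question. One way to get ["1", "2", "3"] is to strip the prefix on the client side. The following is a rough sketch in Python, assuming the response has already been parsed into a dict and that every log value follows the "job_id: N" pattern shown in the question. The helper name extract_job_ids is made up for this example.

```python
import re

# Matches the numeric id in keys such as "job_id: 1"
# (format assumed from the sample documents in the question).
JOB_ID_PATTERN = re.compile(r"job_id:\s*(\d+)")


def extract_job_ids(response):
    """Return the unique job ids from the aggregation buckets, in bucket order."""
    buckets = response["aggregations"]["unique-job-id"]["buckets"]
    job_ids = []
    for bucket in buckets:
        match = JOB_ID_PATTERN.search(bucket["key"])
        # Skip keys that do not match, and ids that were already collected.
        if match and match.group(1) not in job_ids:
            job_ids.append(match.group(1))
    return job_ids


if __name__ == "__main__":
    sample_response = {
        "aggregations": {
            "unique-job-id": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {"key": "job_id: 1", "doc_count": 2},
                    {"key": "job_id: 2", "doc_count": 1},
                    {"key": "job_id: 3", "doc_count": 1},
                ],
            }
        }
    }
    print(extract_job_ids(sample_response))  # ['1', '2', '3']
```

If the log lines carry more text than the job id, the aggregation would produce one bucket per distinct line, so extracting the id into its own field at ingest time is likely the cleaner option.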

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Paulo