Filebeat 7.10 fails to collect events from multiple Kubernetes pods

Filebeat is configured to collect events from multiple Kubernetes pods using an or condition. Events from one specific pod are collected continuously, but events from another pod are collected very slowly, and after some time no events are collected from it at all.

Commenting out all the other pods and leaving a single one in the configuration works well: its events reach Elasticsearch quickly.

There are 3 worker nodes, and Filebeat (v7.10.2) runs on each of them as a DaemonSet. Each Filebeat pod has a CPU limit of 4 cores and a memory limit of 4 GB. One index is generated per day, and its size does not exceed 2 GB.

I want Filebeat to collect events from all the pods and update Elasticsearch with minimal delay. Please help me understand the issue and the best practices for improving Filebeat performance.

filebeat.yml -

  filebeat.autodiscover:
    providers:
      - type: kubernetes
        node: ${NODE_NAME}
        tags:
          - "kube-logs"
        templates:
          - condition.or:
              - contains:
                  kubernetes.pod.name: "ne-db-manager"
              - contains:
                  kubernetes.pod.name: "ne-mgmt"
              - contains:
                  kubernetes.pod.name: "list-manager"
              - contains:
                  kubernetes.pod.name: "scheduler-mgmt"
              - contains:
                  kubernetes.pod.name: "sync-ne"
              - contains:
                  kubernetes.pod.name: "file-manager"
              - contains:
                  kubernetes.pod.name: "dash-board"
              - contains:
                  kubernetes.pod.name: "config-manager"
              - contains:
                  kubernetes.pod.name: "report-manager"
              - contains:
                  kubernetes.pod.name: "clean-backup"
              - contains:
                  kubernetes.pod.name: "warrior"
              - contains:
                  kubernetes.pod.name: "ne-backup"
              - contains:
                  kubernetes.pod.name: "ne-restore"
            config:
              - type: container
                paths:
                  - "/var/log/containers/*-${data.kubernetes.container.id}.log"
                multiline.type: pattern
                multiline.pattern: '^[[:space:]]'
                multiline.negate: false
                multiline.match: after
  logging.level: debug
  processors:
    - drop_event:
        when.or:
           - equals:
               kubernetes.namespace: "kube-system"
           - equals:
               kubernetes.namespace: "default"
           - equals:
               kubernetes.namespace: "logging"
  output.logstash:
    hosts: ["logstash-service.logging:5044"]
    index: filebeat
    pretty: true
  setup.template.name: "filebeat"
  setup.template.pattern: "filebeat-*"
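
For reference, the thirteen contains clauses under condition.or can also be written as a single regexp condition, which Beats supports alongside contains and equals. The sketch below is meant as an equivalent, more compact form of the template above; it is not a known fix for the stalled collection, only a shorter way to express the same pod filter. The pod-name fragments are copied unchanged from the configuration above.

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      tags:
        - "kube-logs"
      templates:
        # One unanchored regexp replaces the list of "contains" clauses:
        # a pod matches if its name contains any of the alternatives.
        - condition:
            regexp:
              kubernetes.pod.name: "ne-db-manager|ne-mgmt|list-manager|scheduler-mgmt|sync-ne|file-manager|dash-board|config-manager|report-manager|clean-backup|warrior|ne-backup|ne-restore"
          config:
            - type: container
              paths:
                - "/var/log/containers/*-${data.kubernetes.container.id}.log"
              multiline.type: pattern
              multiline.pattern: '^[[:space:]]'
              multiline.negate: false
              multiline.match: after
```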

logstash.conf -

 input {
   beats {
     port => 5044
   }
 }
 filter {
   if "beats_input_codec_plain_applied" in [tags] {
     mutate {
       #rename => [ "log", "message" ]
       add_tag => [ "DBBKUP", "kubernetes" ]
     }
     mutate {
         remove_tag => ["beats_input_codec_plain_applied"]
     }
     date {
       match => ["time", "ISO8601"]
       remove_field => ["time"]
     }
     grok {
         match => { "message" => "%{TIMESTAMP_ISO8601:LogTimeStamp}%{SPACE}%{GREEDYDATA:Message}" }
         remove_field => ["message"]
         add_tag => ["DBBKUP"]
     }
     date {
       match => [ "LogTimeStamp", "yyyy-MM-dd HH:mm:ss", "ISO8601" ]
       #match => [ "LogTimeStamp", "yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd'T'HH:mm:ss.SSSZ", "ISO8601" ]
       target => "LogTimeStamp"
     }

     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "( DEBUG )" }
         add_tag => ["DEBUG"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "( INFO )" }
         add_tag => ["INFO"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "( ERROR )" }
         add_tag => ["ERROR"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(Exception)" }
         add_tag => ["EXCEPTION"]
       }
     }

     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(ne_management-)" }
         add_tag => ["NE_MGMT"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "NE_MGMT" in [tags] {
       grok {
         match => { "Message" => "(NE_DATA_HISTORY)" }
         add_tag => ["NE_DATA_HISTORY"]
       }
     }
     if "NE_DATA_HISTORY" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(scheduler-)" }
         add_tag => ["SCHED"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "SCHED" in [tags] {
       grok {
         match => { "Message" => "(SCHEDULE_EXECUTED)" }
         add_tag => ["SCHEDULE_EXECUTED"]
       }
     }
     if "SCHEDULE_EXECUTED" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(dash_board-)" }
         add_tag => ["DASHBOARD"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DASHBOARD" in [tags] {
       grok {
         match => { "Message" => "(NFS_MONITORING)" }
         add_tag => ["NFS_MONITORING"]
       }
     }
     if "NFS_MONITORING" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DASHBOARD" in [tags] {
       grok {
         match => { "Message" => "(SYNCNE_CLEANBKUP_DATA)" }
         add_tag => ["SYNCNE_CLEANBKUP_DATA"]
       }
     }
     if "SYNCNE_CLEANBKUP_DATA" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(config_management-)" }
         add_tag => ["CONFIG_MGMT"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(NEDBManagerApp-)" }
         add_tag => ["DO_BKUP"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DO_BKUP" in [tags] {
       grok {
         match => { "Message" => "(BACKUP_EXECUTION_DETAILS)" }
         add_tag => ["BACKUP_EXECUTION_DETAILS"]
       }
     }
     if "BACKUP_EXECUTION_DETAILS" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DO_BKUP" in [tags] {
       grok {
         match => { "Message" => "(NE_RESTORE_REPORT)" }
         add_tag => ["NE_RESTORE_REPORT"]
       }
     }
     if "NE_RESTORE_REPORT" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DO_BKUP" in [tags] {
       grok {
         match => { "Message" => "(SCHEDULE_RUN_DETAILS)" }
         add_tag => ["SCHEDULE_RUN_DETAILS"]
       }
     }
     if "SCHEDULE_RUN_DETAILS" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(ReportApp-)" }
         add_tag => ["REPORT"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(sync_ne-)" }
         add_tag => ["SYNC_NE"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "SYNC_NE" in [tags] {
       grok {
         match => { "Message" => "(NE_DATA_HISTORY)" }
         add_tag => ["NE_DATA_HISTORY"]
       }
     }
     if "NE_DATA_HISTORY" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "SYNC_NE" in [tags] {
       grok {
         match => { "Message" => "(SYNC_DATA_HISTORY)" }
         add_tag => ["SYNC_DATA_HISTORY"]
       }
     }
     if "SYNC_DATA_HISTORY" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(file_management-)" }
         add_tag => ["FILE_MGMT"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "FILE_MGMT" in [tags] {
       grok {
         match => { "Message" => "(BACKUP_DOWNLOADED_DETAILS)" }
         add_tag => ["BACKUP_DOWNLOADED_DETAILS"]
       }
     }
     if "BACKUP_DOWNLOADED_DETAILS" in [tags] {
       grok {
         match => { "Message" => "%{USERNAME:AppName}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:msgkey} : %{GREEDYDATA:value}" }
         }
       json{
               source => "value"
               target => "data"
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(cleanup_backups-)" }
         add_tag => ["CLEANUP"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(list-manager-)" }
         add_tag => ["LIST_MGR"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(vz1-warrior-job)" }
         add_tag => ["JOBS"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DBBKUP" in [tags] {
       grok {
         match => { "Message" => "(Katana Log)" }
         add_tag => ["WARRIOR"]
         remove_tag => ["DBBKUP"]
       }
     }

     if "_grokparsefailure" in [tags] {
       grok {
         match => { "Message" => "%{WORD:LogLevel}: %{GREEDYDATA:Message}" }
         remove_field => ["message"]
         add_tag => ["log"]
       }
     }

     if "DBBKUP" in [tags] and "ne-backup" in [kubernetes][pod][name] {
       grok {
         match => { "message" => "%{GREEDYDATA:bkupLog}" }
         remove_field => ["message"]
         add_tag => ["WARJOBS"]
         remove_tag => ["DBBKUP"]
       }
     }
     if "DBBKUP" in [tags] and "ne-restore" in [kubernetes][pod][name] {
       grok {
         match => { "message" => "%{GREEDYDATA:bkupLog}" }
         remove_field => ["message"]
         add_tag => ["WARJOBS"]
         remove_tag => ["DBBKUP"]
       }
     }
   }
 }
 output {
   elasticsearch {
     hosts => ["http://index.elastic:9200"]
     manage_template => false
     index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
   }
 }

I can see the 'WARJOBS' tag being collected continuously from the 'ne-backup' pod, but the 'BACKUP_EXECUTION_DETAILS' tag from the 'ne-db-manager' pod is collected at the beginning and then stops being collected.

I see this issue with other tags as well. If I comment out the rest of the pods in the Filebeat configuration, logs from the single remaining pod are collected very quickly.
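
To tell whether events are stalling inside Filebeat or being slowed by back-pressure from Logstash, it may help to lower the log verbosity and expose Filebeat's internal metrics. The following is a sketch only: the option names are standard Filebeat settings, but every value shown (port, metrics period, queue size, worker count, batch size) is an illustrative assumption, not a tested recommendation for this cluster.

```yaml
# Debug logging is costly under load; info level still logs periodic metrics.
logging.level: info
logging.metrics.enabled: true
logging.metrics.period: 30s        # assumed interval

# Local HTTP endpoint that serves Filebeat's internal counters under /stats.
http.enabled: true
http.host: localhost
http.port: 5066                    # assumed port

# In-memory queue between the harvesters and the output (assumed sizes).
queue.mem:
  events: 8192
  flush.min_events: 2048
  flush.timeout: 1s

output.logstash:
  hosts: ["logstash-service.logging:5044"]
  worker: 2                        # assumed: parallel connections to Logstash
  bulk_max_size: 2048              # assumed: events per batch
```

With the endpoint enabled, requesting http://localhost:5066/stats from inside a Filebeat pod returns JSON counters such as filebeat.harvester.running, libbeat.pipeline.events.active, and libbeat.output.events.acked. If the number of active events stays high while acknowledged events barely increase, the bottleneck is more likely downstream (the Logstash grok chain or Elasticsearch) than in the autodiscover conditions.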



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow