'Kafka server configuration - listeners vs. advertised.listeners

To get Kafka running, you need to set some properties in config/server.properties file. There are two settings I don't understand.

Can somebody explain the difference between listeners and advertised.listeners property?

The documentation says:

listeners: The address the socket server listens on.

and

advertised.listeners: Hostname and port the broker will advertise to producers and consumers.

When do I have to use which setting?



Solution 1:[1]

listeners is what the broker will use to create server sockets.

advertised.listeners is what clients will use to connect to the brokers.

The two settings can be different if you have a "complex" network setup (with things like public and private subnets and routing in between).

Solution 2:[2]

From this link: https://cwiki.apache.org/confluence/display/KAFKA/KIP-103%3A+Separation+of+Internal+and+External+traffic

During the 0.9.0.0 release cycle, support for multiple listeners per broker was introduced. Each listener is associated with a security protocol, ip/host and port. When combined with the advertised listeners mechanism, there is a fair amount of flexibility with one limitation: at most one listener per security protocol in each of the two configs (listeners and advertised.listeners).

In some environments, one may want to differentiate between external clients, internal clients and replication traffic independently of the security protocol for cost, performance and security reasons. A few examples that illustrate this:

  • Replication traffic is assigned to a separate network interface so that it does not interfere with client traffic.
  • External traffic goes through a proxy/load-balancer (security, flexibility) while internal traffic hits the brokers directly (performance, cost).
  • Different security settings for external versus internal traffic even though the security protocol is the same (e.g. different set of enabled SASL mechanisms, authentication servers, different keystores, etc.)

As such, we propose that Kafka brokers should be able to define multiple listeners for the same security protocol for binding (i.e. listeners) and sharing (i.e. advertised.listeners) so that internal, external and replication traffic can be separated if required.

So,

listeners - Comma-separated list of URIs we will listen on and their protocols. Specify hostname as 0.0.0.0 to bind to all interfaces. Leave hostname empty to bind to default interface. Examples of legal listener lists:

  • PLAINTEXT://myhost:9092,TRACE://:9091
  • PLAINTEXT://0.0.0.0:9092, TRACE://localhost:9093

advertised.listeners - Listeners to publish to ZooKeeper for clients to use, if different than the listeners above. In IaaS environments, this may need to be different from the interface to which the broker binds. If this is not set, the value for listeners will be used.

Solution 3:[3]

There's so much confusion or little information in answers provided here for the question. So posting my elaborate answer for clarity.

  1. listeners - Used by the embedded jetty web server in kafka to bind to. This jetty web server is used to provide REST API that provides the control plane for Kafka Connect workers. The hostname in this setting can be left empty if you want kafka to bind to localhost (it does by calling InetAddress.getLocalHost().getCanonicalHostName() java api)
  2. advertised.listeners: This address is published to zookeeper by every kafka broker. If this setting is not set, then value of listeners will be used here and published to zookeeper. That's the only purpose of this setting for notifying others. Kafka Clients use the 'advertised.listeners' setting published to zookeeper (as /brokers/ids/<id>/ # endpoints) to talk to Kafka broker.

Now the question is why to have two setting? Why not a single setting? Let's say your kafka broker is sitting behind a proxy. And all the kafka clients have to talk to the proxy to reach the broker. In this case, we want kafka's embedded jetty server to bind to localhost and local port, but we can't publish this to zookeeper as clients can't use it. So kafka admin can set the setting advertised.listeners to the proxy host and port.

Also, in some of our production hosts, InetAddress.getLocalHost().getCanonicalHostName() returns empty and so listeners setting's hostname was empty which was fine for jetty to bind. But advertised.listeners was published to zookeeper as NULL:9092 since it took the same value as listeners by default. Now all the brokers tried to publish in this way to zookeeper and so brokers got the error java.lang.IllegalArgumentException: requirement failed: Configured end points null:14092 as advertised.listeners as NULL:9092 is already registered by broker 101. The fix was to change the advertised.listeners setting to have hostname in it.

Solution 4:[4]

Listeners are all the addresses the Kafka broker listens on (it can be more than 1 address) whereas advertised listeners are the addresses other agents (producers, consumers, or brokers) need to connect to if they want to talk to the current broker.

The 2 lists should be the same if all are running on the same machine (can connect using localhost:9092 or 127.0.0.1:9092) but if consumers, producers, or other brokers do not stay on the same machine or Docker instance, they must use different addresses (that's why we have advertised listeners). Two examples:

  • Saying we use Docker to run 2 Kafka instances named kafka and kafka2. kafka2 for sure cannot connect to kafka using localhost:29092. It must use kafka:9092 instead.
  • Producer from host machine cannot connect to kafka using kafka:9092. It must use localhost:29092 instead.

Let use the following docker-compose config to understand more about the startup process of a Kafka broker:

# config/docker-compose.yml
  kafka:
    image: docker.io/bitnami/kafka:3
    ports:
      - "29092:29092"
      - "9092:9092"
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CLIENT:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_CFG_LISTENERS=CLIENT://:9092,EXTERNAL://:29092
      - KAFKA_CFG_ADVERTISED_LISTENERS=CLIENT://kafka:9092,EXTERNAL://localhost:29092
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=CLIENT
    depends_on:
      - zookeeper

With this config, Docker will start 1 Kafka broker instance which listens on 2 ports:

  • 9092 with name CLIENT
  • 29092 with name EXTERNAL

The broker then connects to Zookeeper at zookeeper:2181 and registers its 2 addresses: kafka:9092 and localhost:29092. Also, with KAFKA_CFG_INTER_BROKER_LISTENER_NAME=CLIENT, it wants Zookeeper to tell other brokers to connect to kafka:9092 if want to talk to it.

But why need 2 ports? Read more here

References:

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Thilo
Solution 2 Giorgos Myrianthous
Solution 3 viji
Solution 4