Exposing Kafka on two different networks

A Big Data setup usually requires multiple network interfaces. Let’s see how to set up Kafka on more than one of them. Kafka is an open-source stream processing platform that works as a distributed publish/subscribe messaging system. It is designed for high throughput, with built-in partitioning, replication, and fault tolerance.

The setup described in this article was done on CDH 5.7.1 with Kafka 2.0.1.5 installed using parcels.

One of the clusters we are working on has the following network configuration:

A “data” network exposing our edge, Kafka and master nodes to the outside world
An “internal” network dedicated to the cluster for our worker nodes

We use Kafka for data ingestion and also to send processed data to another system that exposes UIs for the analysts, so we have:

A Spark Streaming job consuming Kafka topics from YARN (our “internal” network)
The other system’s app consuming Kafka topics from the outside (our “data” network)

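Thus, Kafka must be available on two different networks. Brokers hand the advertised.listeners address to clients, so each client must be able to resolve that hostname to an address on its own network. To achieve this, the Kafka nodes must share the same hostname on both networks, and the following configuration must be applied on each Kafka broker in the kafka.properties safety valve input: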

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://<hostname>:9092

That’s it!
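
One way to sanity-check the setup is to run a small script from a client on each network. The sketch below is not part of the original procedure. It assumes Python 3 is available on the client, and the function name resolve_and_connect and the hostname in the usage comment are hypothetical. It resolves the broker hostname with the local resolver and tries a TCP connection to port 9092 on each address it gets back.

```python
import socket


def resolve_and_connect(hostname, port=9092, timeout=3.0):
    """Resolve hostname, then try a TCP connection to each resolved address.

    Returns a dict mapping each resolved IP address to True (connected)
    or False (connection failed or timed out).
    """
    results = {}
    # getaddrinfo uses this machine's resolver (DNS or /etc/hosts), so the
    # answer depends on which network the client sits on.
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results[ip] = True
        except OSError:
            results[ip] = False
    return results


# Hypothetical usage, with a placeholder broker hostname:
# print(resolve_and_connect("kafka1.example.com"))
```

Run from a YARN worker and from a machine on the “data” network, the same hostname should resolve to a different IP address on each side, and both should connect. A successful TCP connection only shows that the port is reachable; it does not prove that a Kafka client can produce or consume.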

NB: With this configuration, Kafka listens on every interface instead of just the ones you need. Supposedly, Kafka accepts the following configuration to set specific IP addresses:

listeners=PLAINTEXT://<ip1>:9092,PLAINTEXT://<ip2>:9092
advertised.listeners=PLAINTEXT://<hostname>:9092

however, it will throw this exception on startup:

java.lang.IllegalArgumentException: requirement failed: Each listener must have a different port
  at scala.Predef$.require(Predef.scala:219)
  at kafka.server.KafkaConfig.validateUniquePortAndProtocol(KafkaConfig.scala:905)
  at kafka.server.KafkaConfig.getListeners(KafkaConfig.scala:913)
  at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:866)
  at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:698)
  at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:695)
  at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:28)
  at kafka.Kafka$.main(Kafka.scala:58)
  at com.cloudera.kafka.wrap.Kafka$.main(Kafka.scala:76)
  at com.cloudera.kafka.wrap.Kafka.main(Kafka.scala)

and a variation of “Each listener must have a different protocol” when the listeners are given different ports.
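
The two messages suggest that the broker requires every listener to use a distinct port and a distinct protocol, which rules out two PLAINTEXT listeners bound to different IP addresses. The Python sketch below mimics that check for illustration only. It is inferred from the error messages, it is not Kafka's actual Scala code, and the function name and IP addresses are made up.

```python
def validate_listeners(listeners):
    """Mimic the two uniqueness checks implied by the broker's error messages.

    `listeners` is a comma-separated string such as
    "PLAINTEXT://0.0.0.0:9092". Returns a list of (protocol, host, port).
    """
    endpoints = []
    for entry in listeners.split(","):
        protocol, _, rest = entry.strip().partition("://")
        host, _, port = rest.rpartition(":")
        endpoints.append((protocol, host, int(port)))

    ports = {port for _, _, port in endpoints}
    protocols = {protocol for protocol, _, _ in endpoints}

    # Both conditions must hold, so two listeners can never share a protocol.
    if len(ports) != len(endpoints):
        raise ValueError("Each listener must have a different port")
    if len(protocols) != len(endpoints):
        raise ValueError("Each listener must have a different protocol")
    return endpoints


# Placeholder IP addresses standing in for <ip1> and <ip2>.
for config in (
    "PLAINTEXT://0.0.0.0:9092",
    "PLAINTEXT://10.0.0.1:9092,PLAINTEXT://192.168.0.1:9092",
    "PLAINTEXT://10.0.0.1:9092,PLAINTEXT://192.168.0.1:9093",
):
    try:
        validate_listeners(config)
        print("accepted:", config)
    except ValueError as err:
        print("rejected:", config, "->", err)
```

Under these assumptions only the single 0.0.0.0 listener passes, which is consistent with the behaviour described above.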
