K2 Data Science & Engineering


Running Kafka using Docker


I’m going to show how to use Docker to quickly get started with a development environment for Kafka.

Why Docker?

Docker is a very useful tool for packaging software builds and distributing them. It lets you describe an environment in a single configuration file and run it in lightweight, isolated processes called containers, which share the host's kernel instead of booting a full virtual machine.

First, install the version of Docker for your operating system.

Getting Started

After you have downloaded and installed Docker, you can run a container directly from the command line. However, docker-compose offers a better workflow; see its documentation for details.

The simplest docker-compose.yaml file looks as follows:
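A minimal sketch of such a file, consistent with the settings this article walks through. The image names (wurstmeister/zookeeper, wurstmeister/kafka), the IP address 192.168.99.100, and the topic name test are assumptions for illustration; substitute your own host IP.

```yaml
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"            # container port 2181 -> host port 2181
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092"                 # container port 9092 -> random host port
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100   # placeholder: your host IP
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_CREATE_TOPICS: "test:5:2"              # name:partitions:replicas
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

Note that a topic with 2 replicas can only be created once at least two brokers are running, since the replication factor cannot exceed the number of available brokers.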

image — There are a number of Docker images with Kafka, but the one maintained by wurstmeister is the best.

ports — For Zookeeper, the setting maps port 2181 of the container to port 2181 on your host. For Kafka, the setting maps port 9092 of the container to a random port on your host. We can’t pin a single host port, because we may want to start a cluster with multiple brokers, and each broker needs its own unique port.

environment — There are three environment variables. The first two are mandatory, while the third is optional.

  • KAFKA_ADVERTISED_HOST_NAME — Set this to your host's IP address, that is, the IP of your main network adapter.
  • KAFKA_ZOOKEEPER_CONNECT — Tell Kafka to connect to the Zookeeper container on port 2181.
  • KAFKA_CREATE_TOPICS — Create a test topic with 5 partitions and 2 replicas.

volumes — For more details on the binding, see this article.

Run this command:

>> docker-compose up -d

If you want to add more Kafka brokers:

>> docker-compose stop
>> docker-compose scale kafka=3

You should be able to run docker ps and see the 2 containers:

You can use the kafka-python package to set up producers and consumers:
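As a rough sketch of what that can look like, the snippet below sends a few JSON messages and reads them back. The topic name test, the address in BOOTSTRAP_SERVERS, and the helper names encode, decode, produce and consume are assumptions for illustration; use your own host IP and the host port that docker ps reports for the Kafka container.

```python
import json

# Assumption: the topic name matches the one created via KAFKA_CREATE_TOPICS.
TOPIC = "test"
# Assumption: placeholder address. Use your host IP and the random host port
# that Docker mapped to the broker's port 9092 (see `docker ps`).
BOOTSTRAP_SERVERS = "192.168.99.100:32768"


def encode(event):
    """Serialize a dict to UTF-8 JSON bytes for the producer."""
    return json.dumps(event).encode("utf-8")


def decode(raw):
    """Deserialize UTF-8 JSON bytes back into a dict for the consumer."""
    return json.loads(raw.decode("utf-8"))


def produce(count=10):
    """Send `count` small JSON messages to the topic."""
    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(
        bootstrap_servers=BOOTSTRAP_SERVERS,
        value_serializer=encode,
    )
    for i in range(count):
        producer.send(TOPIC, {"number": i})
    producer.flush()  # block until all buffered messages are sent
    producer.close()


def consume():
    """Print every message in the topic, then stop after 10 s of silence."""
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BOOTSTRAP_SERVERS,
        auto_offset_reset="earliest",  # start from the oldest message
        consumer_timeout_ms=10000,     # end iteration when no message arrives
        value_deserializer=decode,
    )
    for message in consumer:
        print(message.partition, message.offset, message.value)
    consumer.close()
```

With the containers running, call produce() followed by consume(), for example from a Python shell. The kafka imports sit inside the functions so the serialization helpers can be tried without a broker.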

You can view the output on your console to confirm it's working.

Become a Kafka Master

Kafka is written in Scala and Java. As such, most of the new features are only accessible through those languages.

If you are interested in learning data engineering, check out the course below.
