Kafka

Pramod Narayana
Nov 25, 2020


This article is part of a series on Kafka. Let's start with a brief introduction to Kafka.

Introduction to Kafka

  1. Kafka is a distributed, circular, persistent message queue.
  2. Each queue (a topic, in Kafka's terms) is divided into partitions that are distributed across the nodes in the cluster.
  3. Old messages are deleted after a certain time to make room for new messages, which is why it is called a circular message queue. By default, old messages are deleted after 1 week (the broker setting log.retention.hours defaults to 168).
  4. All messages are stored on disk, which makes it a persistent message queue. For faster access, recently written messages are also typically served from memory (the operating system's page cache).
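
The properties listed above can be illustrated with a small toy model. This is only a sketch and not how Kafka is implemented: the class name, method names, and the hash-based partition choice are all hypothetical, everything is kept in memory, and messages are expired one by one, whereas Kafka stores messages on disk and deletes whole log segments.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: NOT Kafka's implementation. All names here are hypothetical.
public class RetentionLogSketch {

    // A message together with the time (in milliseconds) it was appended.
    static final class Entry {
        final long timestampMs;
        final String value;

        Entry(long timestampMs, String value) {
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    private final List<Deque<Entry>> partitions = new ArrayList<>();
    private final long retentionMs;

    public RetentionLogSketch(int numPartitions, long retentionMs) {
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayDeque<>());
        }
        this.retentionMs = retentionMs;
    }

    // Pick a partition from the key's hash (a simplification of key-based partitioning).
    public int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Append to the tail of the chosen partition and return the partition index.
    public int append(String key, String value, long nowMs) {
        int p = partitionFor(key);
        partitions.get(p).addLast(new Entry(nowMs, value));
        return p;
    }

    // Drop entries older than the retention window from the head of every partition.
    public int expire(long nowMs) {
        int removed = 0;
        for (Deque<Entry> partition : partitions) {
            while (!partition.isEmpty()
                    && nowMs - partition.peekFirst().timestampMs > retentionMs) {
                partition.removeFirst();
                removed++;
            }
        }
        return removed;
    }

    // Total number of messages currently retained across all partitions.
    public int size() {
        int total = 0;
        for (Deque<Entry> partition : partitions) {
            total += partition.size();
        }
        return total;
    }

    public static void main(String[] args) {
        long oneWeekMs = 7L * 24 * 60 * 60 * 1000;
        RetentionLogSketch log = new RetentionLogSketch(3, oneWeekMs);
        log.append("card-1", "swipe A", 0L);        // written at time 0
        log.append("card-2", "swipe B", oneWeekMs); // written one week later
        int removed = log.expire(oneWeekMs + 1);    // "swipe A" is now older than a week
        System.out.println("removed=" + removed + ", remaining=" + log.size());
    }
}
```

Running `main` expires the first message because it is older than the one-week window, while the second message is kept.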

Producer Application Examples

  1. A credit card application sends credit card swipes as messages to Kafka.
  2. An ad-server application sends ad impressions and ad clicks as messages to Kafka.
  3. A sensor application sends sensor data as messages to Kafka.
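
As a rough idea of what such a producer application needs, here is a minimal sketch of the settings a Java producer is typically given. The broker address `localhost:9092`, the topic name `card-swipes`, and the class and method names are assumptions made for illustration; they do not come from the training repository. The configuration keys themselves are standard Kafka producer settings.

```java
import java.util.Properties;

// Sketch only: builds the settings a Kafka producer would need.
// The broker address and topic name are assumptions, not taken from the article.
public class ProducerConfigSketch {

    // Hypothetical topic for the credit card example.
    public static final String TOPIC = "card-swipes";

    public static Properties producerProperties(String bootstrapServers) {
        Properties props = new Properties();
        // Address of one or more Kafka brokers, e.g. "localhost:9092".
        props.put("bootstrap.servers", bootstrapServers);
        // Keys and values are sent as plain strings in this sketch.
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for all in-sync replicas to acknowledge each message.
        props.put("acks", "all");
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProperties("localhost:9092");
        props.list(System.out);
        // With the kafka-clients library on the classpath, these properties
        // would be passed to the producer, roughly:
        //   KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        //   producer.send(new ProducerRecord<>(TOPIC, "card-1", "swipe: 42.50 USD"));
        //   producer.close();
    }
}
```

The block itself uses only the Java standard library, so it runs without a Kafka installation; the commented lines show where the actual send would happen once the kafka-clients dependency is available.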

Consumer Application Examples

  1. Spark Streaming, Flink, Samza, and Kafka Streams can consume messages from Kafka, apply ETL transformations, and save the transformed messages back to Kafka or to other external storage in real time.
  2. Web frameworks can also consume messages from Kafka and display them on websites in real time.

GitHub repositories

https://github.com/pixipanda/kafkatraining
https://github.com/pixipanda/avro-consumer-app
https://github.com/pixipanda/avro-consumer-app2

Ubuntu image

Download the Ubuntu image (3.86 GB)
username: hduser
password: hadoop123

IntelliJ

IntelliJ is installed in the home directory (/home/hduser/idea-IC-183.6156.11/bin)
Go to IntelliJ’s bin directory
cd idea-IC-183.6156.11/bin/

Start IntelliJ
./idea.sh

First, create a workspace directory in your home directory
mkdir ~/workspace
cd ~/workspace

Clone the kafkatraining repository from GitHub
git clone https://github.com/pixipanda/kafkatraining

Import Code to IntelliJ

Open IntelliJ and click Import Project

Select the cloned project (kafkatraining) and click OK

Select “Import Project From External Model”
Select Maven and click OK

Check “Search for projects recursively” and
“Import Maven projects automatically”, then click Next

By default, your project will be selected. Click Next

Add a new JDK by clicking the “+” button in the top-left corner

By default, the Java installation directory will be selected. Click Next.
JDK 1.8 will be loaded. Click Next

Leave the default project name and project location unchanged and click Next

Finally, all the dependency libraries will be downloaded and the project will be loaded

We will look at the components of Kafka in the next article.

Summary

  1. Brief introduction to Kafka
  2. Set up the kafkatraining repository from GitHub
