Build a real-time data capability through a Kafka message backbone in AWS
In the last Kafka post, we installed a Kafka cluster on Ubuntu Linux. In this post, I will use that cluster to build a Kafka application that produces and consumes messages. The application will be written in Java (8+).
To produce messages, I will use the Twitter API, which lets clients receive Tweets in near real time. Every Twitter account has access to the Streaming API, and any developer can build applications with it.
Generating Twitter API Keys
Go to Twitter Developer: https://developer.twitter.com/en/apps
Create a new Application and fill in all the minimum required information.
Go to the Keys and tokens tab and copy the consumer key and secret pair to a file for later use.
Click on “Create” to generate the Access Token and Secret. Copy both of them to a file. Now you have everything needed to develop the producer.
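Rather than hard-coding the four credentials in source files that may end up in version control, you can keep them in a properties file and load them at startup. Below is a minimal sketch of such a loader; the `twitter.properties` file name and the `consumer.key`/`consumer.secret`/`access.token`/`access.token.secret` property names are assumptions for illustration, not part of the repository.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// Hypothetical helper: loads the four Twitter credentials from a
// properties file instead of hard-coding them in the source.
public class TwitterCredentials {
    public final String consumerKey;
    public final String consumerSecret;
    public final String accessToken;
    public final String accessTokenSecret;

    public TwitterCredentials(Properties props) {
        this.consumerKey = props.getProperty("consumer.key");
        this.consumerSecret = props.getProperty("consumer.secret");
        this.accessToken = props.getProperty("access.token");
        this.accessTokenSecret = props.getProperty("access.token.secret");
    }

    // Load credentials from a file such as twitter.properties
    public static TwitterCredentials load(String path) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(path)) {
            props.load(in);
        }
        return new TwitterCredentials(props);
    }
}
```

Remember to add the credentials file to `.gitignore` so the keys never leave your machine.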
Configure a remote connection to the Kafka Broker
To access your Kafka Broker remotely, make sure to open port 9092 in AWS.
Log in to AWS. Go to your instance's Description tab and click on the security group created for it. Add a new Inbound Rule for port 9092. You can limit it to your IP or leave it accessible to all.
Connect to your Ubuntu server using SSH as described in the previous post.
sudo nano /etc/kafka/server.properties
Uncomment the following line and insert the public IPv4 address of your Kafka server:
advertised.listeners=PLAINTEXT://<IPv4 Public IP>:9092
Restart the Kafka Broker:
sudo systemctl stop confluent-kafka
sudo systemctl start confluent-kafka
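Before moving on, it is worth checking that the broker is actually reachable from your local machine through the advertised listener. A minimal sketch using Kafka's AdminClient is shown below; it assumes the `kafka-clients` dependency is on the classpath and that you replace `<IPv4 Public IP>` with your server's public address.

```java
import org.apache.kafka.clients.admin.AdminClient;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

// Sketch: verify the advertised.listeners change by connecting remotely.
public class ClusterCheck {
    static Properties adminProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // fail fast instead of hanging if port 9092 is still closed
        props.put("request.timeout.ms", "10000");
        return props;
    }

    public static void main(String[] args)
            throws ExecutionException, InterruptedException {
        Properties props = adminProps("<IPv4 Public IP>:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // describeCluster() only succeeds if the broker is reachable
            // on the advertised listener
            System.out.println("Connected to cluster: "
                + admin.describeCluster().clusterId().get());
        }
    }
}
```

If this call times out, revisit the security group rule and the `advertised.listeners` setting.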
Create the topic
Create the Kafka topic for the demo by running the following command:
kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic bigdata-tweets
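The same topic can also be created programmatically with Kafka's AdminClient API, which is handy if you want the application to provision its own topic. This is a sketch, not part of the repository; it assumes `kafka-clients` on the classpath and the broker reachable at `<IPv4 Public IP>:9092`.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

// Sketch: create the demo topic from Java instead of the CLI.
public class TopicCreator {
    static NewTopic demoTopic() {
        // name, partition count and replication factor mirror the CLI command
        return new NewTopic("bigdata-tweets", 1, (short) 1);
    }

    public static void main(String[] args)
            throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "<IPv4 Public IP>:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            admin.createTopics(Collections.singletonList(demoTopic())).all().get();
        }
    }
}
```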
Clone the Repository
Now, let’s build our application on the local machine. The complete code is available in my GitHub here. Please clone this repository.
The Producer API lets you write data to Apache Kafka. I will connect to the remote Kafka broker, fetch Tweets using the Twitter API, and send them to a Kafka topic.
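The core of the producer side boils down to a few lines of the Producer API. The following is a simplified sketch of that pattern, not the repository's actual code: the class name, the hard-coded sample message, and the `<IPv4 Public IP>` placeholder are illustrative assumptions.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

// Sketch: send a message to the bigdata-tweets topic on the remote broker.
public class TweetProducerSketch {
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "1"); // wait for the partition leader's acknowledgement
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps("<IPv4 Public IP>:9092");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>("bigdata-tweets", "{\"text\":\"hello kafka\"}");
            // the callback fires once the broker acknowledges (or rejects) the send
            producer.send(record, (metadata, exception) -> {
                if (exception == null) {
                    System.out.println("Acked: partition=" + metadata.partition()
                        + " offset=" + metadata.offset());
                } else {
                    exception.printStackTrace();
                }
            });
        }
    }
}
```

In the real application, the record value would be the Tweet payload handed over by the Twitter client instead of a hard-coded string.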
Open the IDE of your choice and import the project as a Maven project. I'll name mine kafka-twitter.
Replace your Twitter keys in the TwitterConfiguration.java file.
Execute App.java and the console will show the tweets being fetched and sent by the Kafka producer. Each send will then trigger the callback with a simple acknowledgement response.
Go to the Control Center as explained in the previous post and you will see the messages in your bigdata-tweets topic
Log in to your Kafka Server and consume your messages:
kafka-console-consumer --bootstrap-server localhost:9092 --topic bigdata-tweets --from-beginning
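The console consumer is convenient for a quick check, but the same messages can of course be read from Java with the Consumer API. Below is a minimal sketch, assuming `kafka-clients` on the classpath; the group id and the `<IPv4 Public IP>` placeholder are illustrative.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

// Sketch: consume the bigdata-tweets topic from Java.
public class TweetConsumerSketch {
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        // start from the earliest offset, like --from-beginning
        props.put("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("<IPv4 Public IP>:9092", "tweets-demo-group");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("bigdata-tweets"));
            while (true) {
                ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println("offset=" + record.offset()
                        + " value=" + record.value());
                }
            }
        }
    }
}
```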