{"componentChunkName":"component---src-templates-tag-js","path":"/tags/kafka/","result":{"data":{"site":{"siteMetadata":{"title":"LoginRadius Blog"}},"allMarkdownRemark":{"totalCount":2,"edges":[{"node":{"fields":{"slug":"/engineering/quick-kafka-installation/"},"html":"<p>In this post, we will look at the step-by-step process for Kafka Installation on Windows. Kafka is an open-source stream-processing software platform and comes under the Apache software foundation.</p>\n<h2 id=\"what-is-kafka\" style=\"position:relative;\"><a href=\"#what-is-kafka\" aria-label=\"what is kafka permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a><strong>What is Kafka?</strong></h2>\n<p>Kafka is used for real-time streams of data, to collect big data, or to do real-time analysis (or both). Kafka is used with in-memory microservices to provide durability and it can be used to feed events to complex event streaming systems and IoT/IFTTT-style automation systems. </p>\n<h2 id=\"installation-\" style=\"position:relative;\"><a href=\"#installation-\" aria-label=\"installation  permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a><strong>Installation :</strong></h2>\n<h3 id=\"1-java-setup\" style=\"position:relative;\"><a href=\"#1-java-setup\" aria-label=\"1 java setup permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>1. Java Setup:</h3>\n<p>Kafka requires Java 8 for running. And hence, this is the first step that we should do to install Kafka. To install Java, there are a couple of options. We can go for the Oracle JDK version 8 from the <a href=\"https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html\">Official Oracle Website</a>.</p>\n<h3 id=\"2-kafka--zookeeper-configuration\" style=\"position:relative;\"><a href=\"#2-kafka--zookeeper-configuration\" aria-label=\"2 kafka  zookeeper configuration permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2. Kafka &#x26; Zookeeper Configuration:</h3>\n<p><strong>Step 1:</strong> Download Apache Kafka from its <a href=\"https://kafka.apache.org/downloads\">Official Site</a>.</p>\n<p><strong>Step 2:</strong> Extract tgz via cmd or from the available tool  to a location of your choice:</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"\" data-index=\"0\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">tar -xvzf kafka_2.12-2.4.1.tgz</span></code></pre>\n<p><strong>Step 3:</strong> Copy the path of the Kafka folder. Now go to <em>config</em> inside Kafka folder and open <em>zookeeper.properties</em> file. Copy the path against the field <em>dataDir</em> and add <em>/zookeeper-data</em> to the path.</p>\n<p><span\n      class=\"gatsby-resp-image-wrapper\"\n      style=\"position: relative; display: block; margin-left: auto; margin-right: auto; max-width: 636px; \"\n    >\n      <span\n    class=\"gatsby-resp-image-background-image\"\n    style=\"padding-bottom: 41.35220125786164%; position: relative; bottom: 0; left: 0; background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAICAYAAAD5nd/tAAAACXBIWXMAAA7DAAAOwwHHb6hkAAABZklEQVQoz21Ria6cMAzk/3+xYoHcB0sSjsCy6k7tvPekqmqkkZ0hNmNP93g8MOkJfndQWSFsAVMaMSwj3OoQ9wi/OYgsoYvCfMywq6VcwxHP4DvXiSzQ9X2PUY3wh2+ELJIeWUjKzWrajxxhTAMM8df7wnqtWI4F27Wh3hXnfeIknvNuYIVygo0WPmg8nxZmtlCOFEUNaQXs7CA9RfpW32crLGfG/jrw7+l4ZCFJTTQQVDxTjNTcJY+0Z8QcKSaEErDWguM+SFHFUpem6r8NRzGSQoPRTXCkUpsJMzWq79qw3zvu33dTxmPyHs/32ZBqQia1zG+v/bvhNLRxRBBwiyO1GiEqpCujXIV2xtiaqnIWKl7bHpnnhgc1Yp7ffpkiBuinhkmGooIi2ESu+wG9/QWTTXM77oEcdaSQnfeNY7WB+J979+jJFBpZ06is0lAUfmqGBC+hgsTrvr8W9PkOn0/Dz/k7/wPAdV3BuDwGNAAAAABJRU5ErkJggg=='); background-size: cover; display: block;\"\n  ></span>\n  <img\n        class=\"gatsby-resp-image-image\"\n        alt=\"zookeeper\"\n        title=\"zookeeper\"\n        src=\"/static/726644262b05de677792b4af683311e0/9be90/zookeeper.png\"\n        srcset=\"/static/726644262b05de677792b4af683311e0/9be90/zookeeper.png 636w\"\n        sizes=\"(max-width: 636px) 100vw, 636px\"\n        style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\"\n        loading=\"lazy\"\n      />\n    </span>\n<strong>Step 4:</strong> we have to modify the config/server.properties file. Below is the change:</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"\" data-index=\"1\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">fileslog.dirs=C:\\kafka\\kafka-logs</span></code></pre>\n<p>Basically, we are pointing the log.dirs to the new folder /data/kafka.</p>\n<h2 id=\"run-kafka-server\" style=\"position:relative;\"><a href=\"#run-kafka-server\" aria-label=\"run kafka server permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a><strong>Run Kafka Server:</strong></h2>\n<p><strong>Step 1:</strong> Kafka requires Zookeeper to run. Basically, Kafka uses Zookeeper to manage the entire cluster and various brokers. Therefore, a running instance of Zookeeper is a prerequisite to Kafka.</p>\n<p>To start Zookeeper, we can open a PowerShell prompt and execute the below command:</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"\" data-index=\"2\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">.\\bin\\windows\\zookeeper-server-start.bat .\\config\\zookeeper.properties</span></code></pre>\n<p>If the command is successful, Zookeeper will start on port 2181.</p>\n<p><strong>Step 2:</strong> Now open another command prompt and change the directory to the kafka folder. Run kafka server using the command: </p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"\" data-index=\"3\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">.\\bin\\windows\\kafka-server-start.bat .\\config\\server.properties</span></code></pre>\n<p><strong>Now your Kafka Server is up and running</strong>, you can create topics to store messages. Also, we can produce or consume data directly from the command prompt.</p>\n<h2 id=\"create-a-kafka-topic\" style=\"position:relative;\"><a href=\"#create-a-kafka-topic\" aria-label=\"create a kafka topic permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a><strong>Create a Kafka Topic:</strong></h2>\n<ol>\n<li>Open a new command prompt in the location C:\\kafka\\bin\\windows.</li>\n<li>Run the following command:</li>\n</ol>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"\" data-index=\"4\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test</span></code></pre>\n<h2 id=\"creating-kafka-producer\" style=\"position:relative;\"><a href=\"#creating-kafka-producer\" aria-label=\"creating kafka producer permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a><strong>Creating Kafka Producer:</strong></h2>\n<ol>\n<li>Open a new command prompt in the location C:\\kafka\\bin\\windows</li>\n<li>Run the following command:</li>\n</ol>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"\" data-index=\"5\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">kafka-console-producer.bat --broker-list localhost:9092 --topic test</span></code></pre>\n<h2 id=\"creating-kafka-consumer\" style=\"position:relative;\"><a href=\"#creating-kafka-consumer\" aria-label=\"creating kafka consumer permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a><strong>Creating Kafka Consumer:</strong></h2>\n<ol>\n<li>Open a new command prompt in the location C:\\kafka\\bin\\windows.</li>\n<li>Run the following command:</li>\n</ol>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"\" data-index=\"6\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic test --from-beginning</span></code></pre>\n<p>If you see these messages on consumer console,<em>Congratulations!!!</em> you all done. Then you can play with producer and consumer terminal bypassing some Kafka messages.</p>\n<style class=\"grvsc-styles\">\n  .grvsc-container {\n    overflow: auto;\n    -webkit-overflow-scrolling: touch;\n    padding-top: 1rem;\n    padding-top: var(--grvsc-padding-top, var(--grvsc-padding-v, 1rem));\n    padding-bottom: 1rem;\n    padding-bottom: var(--grvsc-padding-bottom, var(--grvsc-padding-v, 1rem));\n    border-radius: 8px;\n    border-radius: var(--grvsc-border-radius, 8px);\n    font-feature-settings: normal;\n  }\n  \n  .grvsc-code {\n    display: inline-block;\n    min-width: 100%;\n  }\n  \n  .grvsc-line {\n    display: inline-block;\n    box-sizing: border-box;\n    width: 100%;\n    padding-left: 1.5rem;\n    padding-left: var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem));\n    padding-right: 1.5rem;\n    padding-right: var(--grvsc-padding-right, var(--grvsc-padding-h, 1.5rem));\n  }\n  \n  .grvsc-line-highlighted {\n    background-color: var(--grvsc-line-highlighted-background-color, transparent);\n    box-shadow: inset var(--grvsc-line-highlighted-border-width, 4px) 0 0 0 var(--grvsc-line-highlighted-border-color, transparent);\n  }\n  \n  .dark-default-dark {\n    background-color: #1E1E1E;\n    color: #D4D4D4;\n  }\n</style>","frontmatter":{"date":"August 25, 2020","updated_date":null,"title":"Setting Up and Running Apache Kafka on Windows OS","tags":["Kafka","Windows"],"coverImage":{"childImageSharp":{"fluid":{"aspectRatio":1.5037593984962405,"src":"/static/cf0f7f91117e9c259a3ad57b67a89c15/ee604/messagelog.png","srcSet":"/static/cf0f7f91117e9c259a3ad57b67a89c15/69585/messagelog.png 200w,\n/static/cf0f7f91117e9c259a3ad57b67a89c15/497c6/messagelog.png 400w,\n/static/cf0f7f91117e9c259a3ad57b67a89c15/ee604/messagelog.png 800w,\n/static/cf0f7f91117e9c259a3ad57b67a89c15/f3583/messagelog.png 1200w,\n/static/cf0f7f91117e9c259a3ad57b67a89c15/5707d/messagelog.png 1600w,\n/static/cf0f7f91117e9c259a3ad57b67a89c15/7ddcb/messagelog.png 2700w","sizes":"(max-width: 800px) 100vw, 800px"}}},"author":{"id":"Ashish Sharma","github":"ashish8947","avatar":null}}}},{"node":{"fields":{"slug":"/engineering/stream-processing-using-kafka/"},"html":"<p>Kafka Streams is a Java library developed to help applications that do stream processing built on Kafka. To learn about Kafka Streams, you need to have a basic idea about Kafka to understand better.  If you’ve worked with Kafka before, Kafka Streams is going to be easy to understand.</p>\n<h2 id=\"what-are-kafka-streams\" style=\"position:relative;\"><a href=\"#what-are-kafka-streams\" aria-label=\"what are kafka streams permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>What are Kafka Streams?</h2>\n<p>Kafka Streams is a streaming application building library, specifically applications that turn Kafka input topics into Kafka output topics. Kafka Streams enables you to do this in a way that is distributed and fault-tolerant, with succinct code.</p>\n<h2 id=\"what-is-stream-processing\" style=\"position:relative;\"><a href=\"#what-is-stream-processing\" aria-label=\"what is stream processing permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>What is Stream processing?</h2>\n<p>Stream processing is the ongoing, concurrent, and record-by-record real-time processing of data.</p>\n<p><strong>Let us get started with some highlights of Kafka Streams:</strong></p>\n<ul>\n<li>Low Barrier to Entry:  Quickly write and run a small-scale POC on a single instance. You only need to run multiple instances of the application on various machines to scale up to high-volume production workloads.</li>\n<li>Lightweight and straightforward client library:  Can be easily embedded in any Java application and integrated with any existing packaging, deployment, and operational tools.</li>\n<li>No external dependencies on systems other than <a href=\"https://en.wikipedia.org/wiki/Apache_Kafka\">Apache Kafka</a> itself</li>\n<li>Fault-tolerant local state: Enables fast and efficient stateful operations like windowed joins and aggregations</li>\n<li>Supports exactly-once processing: Each record will be processed once and only once, even when there is a failure.</li>\n<li>One-record-at-a-time processing to achieve millisecond processing latency supports event-time-based windowing operations with out-of-order arrival of records.</li>\n</ul>\n<h2 id=\"kafka-streams-concepts\" style=\"position:relative;\"><a href=\"#kafka-streams-concepts\" aria-label=\"kafka streams concepts permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Kafka Streams Concepts:</h2>\n<ul>\n<li><strong>Stream</strong>:  An ordered, replayable, and fault-tolerant sequence of immutable data records, where each data record is defined as a key-value pair.</li>\n<li><strong>Stream Processor</strong>:  A node in the processor topology represents a processing step to transform data in streams by receiving one input record at a time from its source in the topology, applying any operation to it, and may subsequently produce one or more output records to its sinks.</li>\n</ul>\n<p>There are two individual processors in the topology:</p>\n<ul>\n<li><strong>Source Processor</strong>: A source processor is a special type of stream processor that does not have any upstream processors. It produces an input stream to its topology from one or multiple Kafka topics by consuming records from these topics and forwarding them to its down-stream processors.</li>\n<li><strong>Sink Processor</strong>: A sink processor is a special type of stream processor that does not have down-stream processors. It sends any received records from its up-stream processors to a specified Kafka topic.</li>\n</ul>\n<p><span\n      class=\"gatsby-resp-image-wrapper\"\n      style=\"position: relative; display: block; margin-left: auto; margin-right: auto; max-width: 768px; \"\n    >\n      <span\n    class=\"gatsby-resp-image-background-image\"\n    style=\"padding-bottom: 119.53846153846155%; position: relative; bottom: 0; left: 0; background-image: url('data:image/jpeg;base64,/9j/2wBDABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVGC8aGi9jQjhCY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2P/wgARCAAYABQDASIAAhEBAxEB/8QAFgABAQEAAAAAAAAAAAAAAAAAAAIF/8QAFQEBAQAAAAAAAAAAAAAAAAAAAAH/2gAMAwEAAhADEAAAAd1UlASS0LP/xAAYEAACAwAAAAAAAAAAAAAAAAAAECAhMf/aAAgBAQABBQIt7H//xAAVEQEBAAAAAAAAAAAAAAAAAAAQEf/aAAgBAwEBPwEh/8QAFBEBAAAAAAAAAAAAAAAAAAAAIP/aAAgBAgEBPwEf/8QAFBABAAAAAAAAAAAAAAAAAAAAMP/aAAgBAQAGPwIf/8QAGhAAAgMBAQAAAAAAAAAAAAAAAAEQEVFxof/aAAgBAQABPyGtZ1DVmGobovfBH//aAAwDAQACAAMAAAAQ08BB/8QAGBEAAgMAAAAAAAAAAAAAAAAAAAEQETH/2gAIAQMBAT8QuEHp/8QAFhEBAQEAAAAAAAAAAAAAAAAAAQAQ/9oACAECAQE/EMWL/8QAHBABAQEBAAIDAAAAAAAAAAAAAREhABAxQVGB/9oACAEBAAE/EHTfXIzoLSFyngIp6adpYAZvzx60nAijv0XhbqLkPRMv73//2Q=='); background-size: cover; display: block;\"\n  ></span>\n  <img\n        class=\"gatsby-resp-image-image\"\n        alt=\"Toplogy Example\"\n        title=\"Toplogy Example\"\n        src=\"/static/b27b1b16e63c3b3ac479a43ba7c12482/212bf/streams-architecture-topology.jpg\"\n        srcset=\"/static/b27b1b16e63c3b3ac479a43ba7c12482/6aca1/streams-architecture-topology.jpg 650w,\n/static/b27b1b16e63c3b3ac479a43ba7c12482/212bf/streams-architecture-topology.jpg 768w,\n/static/b27b1b16e63c3b3ac479a43ba7c12482/47311/streams-architecture-topology.jpg 1080w\"\n        sizes=\"(max-width: 768px) 100vw, 768px\"\n        style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\"\n        loading=\"lazy\"\n      />\n    </span></p>\n<ul>\n<li><strong>Kstream</strong>: KStream is nothing but that, a Kafka Stream. It’s a never-ending flow of data in a stream. Each piece of data — a record or a fact — is a collection of key-value pairs. Data records in a record stream are always interpreted as an \"INSERT\".</li>\n<li><strong>KTable</strong>: A KTable is just an abstraction of the stream, where only the latest value is kept. Data records in a record stream are always interpreted as an \"UPDATE\".</li>\n</ul>\n<p>There is actually a close relationship between streams and tables, the so-called stream-table duality</p>\n<h2 id=\"implementing-kafka-streams\" style=\"position:relative;\"><a href=\"#implementing-kafka-streams\" aria-label=\"implementing kafka streams permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Implementing Kafka Streams</h2>\n<p>Let's Start with the Setup using Scala instead of Java. The Kafka Streams DSL for Scala library is a wrapper over the existing Java APIs for Kafka Streams DSL.</p>\n<p>To Setup things, we need to create a <code>KafkaStreams</code> Instance. It needs a topology and configuration (<code>java.util.Properties</code>). We also need a input topic and output topic. Let's look through a simple example of sending data from an input topic to an output topic using the Streams API</p>\n<p>You can create a topic using the below commands (need to have Kafka pre installed)</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"shell\" data-index=\"0\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic inputTopic</span>\n<span class=\"grvsc-line\">kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic outputTopic</span></code></pre>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"scala\" data-index=\"1\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">val config: Properties = {</span>\n<span class=\"grvsc-line\">    val properties = new Properties()</span>\n<span class=\"grvsc-line\">    properties.put(StreamsConfig.APPLICATION_ID_CONFIG, &quot;your-application&quot;)</span>\n<span class=\"grvsc-line\">    properties.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, &quot;localhost:9092&quot;)</span>\n<span class=\"grvsc-line\">    properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, &quot;latest&quot;)</span>\n<span class=\"grvsc-line\">    properties.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE)</span>\n<span class=\"grvsc-line\">    properties.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String())</span>\n<span class=\"grvsc-line\">    properties.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String())</span>\n<span class=\"grvsc-line\">    properties</span>\n<span class=\"grvsc-line\">    }</span></code></pre>\n<p>StreamsBuilder provide the high-level Kafka Streams DSL to specify a Kafka Streams topology.</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"scala\" data-index=\"2\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">val builder: StreamsBuilder = new StreamsBuilder</span></code></pre>\n<p>Creates a KStream from the specified topics.</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"scala\" data-index=\"3\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">val inputStream: KStream[String,String] = builder.stream(inputTopic, Consumed.`with`(Serdes.String(), Serdes.String()))</span></code></pre>\n<p>Store the input stream to the output topic.</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"scala\" data-index=\"4\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">inputStream.to(outputTopic)(producedFromSerde(Serdes.String(),Serdes.String())</span></code></pre>\n<p>Starts the Streams Application</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"scala\" data-index=\"5\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">val kEventStream = new KafkaStreams(builder.build(), config)</span>\n<span class=\"grvsc-line\">kEventStream.start()</span>\n<span class=\"grvsc-line\">sys.ShutdownHookThread {</span>\n<span class=\"grvsc-line\">      kEventStream.close(10, TimeUnit.SECONDS)</span>\n<span class=\"grvsc-line\">    }</span></code></pre>\n<p>You can send data to the input topic using </p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"shell\" data-index=\"6\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">kafka-console-producer --broker-list localhost:9092 --topic inputTopic</span></code></pre>\n<p>And can fetch the data from the output topic using</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"shell\" data-index=\"7\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">kafka-console-consumer --bootstrap-server localhost:9092 --topic outputTopic --from-beginning</span></code></pre>\n<p>You can add the necessary dependencies in your build file for sbt or pom file for maven. Below is an example for build.sbt.</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"sbt\" data-index=\"8\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">// Kafka</span>\n<span class=\"grvsc-line\">libraryDependencies += &quot;org.apache.kafka&quot; %% &quot;kafka-streams-scala&quot; % &quot;2.0.0&quot;</span>\n<span class=\"grvsc-line\">libraryDependencies += &quot;javax.ws.rs&quot; % &quot;javax.ws.rs-api&quot; % &quot;2.1&quot; artifacts( Artifact(&quot;javax.ws.rs-api&quot;, &quot;jar&quot;, &quot;jar&quot;)) // this is a workaround. There is an upstream dependency that causes trouble in SBT builds.</span></code></pre>\n<p>Let us modify the code a little bit to try out a WordCount example:\nThe code splits the sentences into words and groups by word as a key and the number of occurences or count as value and is being sent to the output topic by converting the KTable to KStream.</p>\n<pre class=\"grvsc-container dark-default-dark\" data-language=\"scala\" data-index=\"9\"><code class=\"grvsc-code\"><span class=\"grvsc-line\">val textLines: KStream[String, String] = builder.stream[String, String](inputTopic)</span>\n<span class=\"grvsc-line\">val wordCounts: KTable[String, Long] = textLines</span>\n<span class=\"grvsc-line\">\t\t.flatMapValues(textLine =&gt; textLine.toLowerCase.split(&quot;\\\\W+&quot;))</span>\n<span class=\"grvsc-line\">\t\t.groupBy((_, word) =&gt; word)</span>\n<span class=\"grvsc-line\">\t\t.count()(materializedFromSerde(Serdes.String(),Serdes.Long()))</span>\n<span class=\"grvsc-line\">\twordCounts.toStream.to(outputTopic)(producedFromSerde(Serdes.String(),Serdes.Long())</span></code></pre>\n<p>With the above process, we can now implement a simple streaming application or a word count application using Kafka Streams in Scala. If you want to set up kafka on Windows, here is a quick guide to implement apache <a href=\"/quick-kafka-installation/\">kafka on windows OS</a>. </p>\n<style class=\"grvsc-styles\">\n  .grvsc-container {\n    overflow: auto;\n    -webkit-overflow-scrolling: touch;\n    padding-top: 1rem;\n    padding-top: var(--grvsc-padding-top, var(--grvsc-padding-v, 1rem));\n    padding-bottom: 1rem;\n    padding-bottom: var(--grvsc-padding-bottom, var(--grvsc-padding-v, 1rem));\n    border-radius: 8px;\n    border-radius: var(--grvsc-border-radius, 8px);\n    font-feature-settings: normal;\n  }\n  \n  .grvsc-code {\n    display: inline-block;\n    min-width: 100%;\n  }\n  \n  .grvsc-line {\n    display: inline-block;\n    box-sizing: border-box;\n    width: 100%;\n    padding-left: 1.5rem;\n    padding-left: var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem));\n    padding-right: 1.5rem;\n    padding-right: var(--grvsc-padding-right, var(--grvsc-padding-h, 1.5rem));\n  }\n  \n  .grvsc-line-highlighted {\n    background-color: var(--grvsc-line-highlighted-background-color, transparent);\n    box-shadow: inset var(--grvsc-line-highlighted-border-width, 4px) 0 0 0 var(--grvsc-line-highlighted-border-color, transparent);\n  }\n  \n  .dark-default-dark {\n    background-color: #1E1E1E;\n    color: #D4D4D4;\n  }\n</style>","frontmatter":{"date":"July 01, 2020","updated_date":null,"title":"Kafka Streams: A stream processing guide","tags":["Scala","Kafka","Kafka Streams"],"coverImage":{"childImageSharp":{"fluid":{"aspectRatio":1.5037593984962405,"src":"/static/b42ed1fb037f4d27ff3e60a004298488/ee604/header.png","srcSet":"/static/b42ed1fb037f4d27ff3e60a004298488/69585/header.png 200w,\n/static/b42ed1fb037f4d27ff3e60a004298488/497c6/header.png 400w,\n/static/b42ed1fb037f4d27ff3e60a004298488/ee604/header.png 800w,\n/static/b42ed1fb037f4d27ff3e60a004298488/f3583/header.png 1200w","sizes":"(max-width: 800px) 100vw, 800px"}}},"author":{"id":"Priyadarshan Mohanty","github":"priyadarshan1995","avatar":null}}}}]}},"pageContext":{"tag":"Kafka"}},"staticQueryHashes":["1171199041","1384082988","2100481360","23180105","528864852"]}