kafka-streams-on-heroku

A Kafka Streams example on Heroku, organized as a multi-project Gradle build.

Dependencies

Postgres
Kafka (+ ZooKeeper) 0.10+ (this example runs a 1.0 client against 0.11 brokers)
Java 8
Gradle 4.3 (can be installed with sdkman)

Local Development

Building

$ ./gradlew clean build

Testing

$ ./gradlew clean test

Building FatJar Artifacts

$ ./gradlew clean stage

Running Locally

Topologies are organized as subprojects. You can run any or all of them.

Start the backing services first: ZooKeeper and Kafka are required, Postgres is optional.
$ ./gradlew streams-text-processor:run
$ ./gradlew streams-aggregator:run
$ ./gradlew streams-anomaly-detector:run

Deployment: Heroku

Dependencies

Postgres
Kafka
Heroku CLI
Heroku Kafka CLI Plugin

Config Vars

SENDGRID_API_KEY (optional; provided by the SendGrid add-on)
TESTING_EMAIL (optional; address used when sinking to a test email through the SendGrid add-on)

Setup

Install the Heroku CLI: https://devcenter.heroku.com/articles/heroku-cli

Install the Heroku Kafka CLI Plugin:

$ heroku plugins:install heroku-kafka

Clone the application:

$ git clone git@github.com:kissaten/kafka-streams-on-heroku.git

Create the application:

$ cd kafka-streams-on-heroku
$ heroku apps:create <application name>
$ heroku buildpacks:add heroku/ruby
$ heroku buildpacks:add heroku/gradle

Deploy the application:

$ git push heroku master

Run the setup script:

$ ./setup <app name> <plan>

Smoke Testing

$ heroku kafka:topics:write [prefix]textlines "hello world" -a <app>
$ heroku pg:psql -c 'select * from windowed_counts' HEROKU_POSTGRESQL_URL -a <app>

Example Use Cases

Now let's apply Kafka Streams to some example use cases. The data-generators directory contains simple Ruby scripts that generate streams of data. Instructions for using them follow.

Word Count

First, we'll run a word count over a large stream of text. The following command produces lines from Alice's Adventures in Wonderland into Kafka:

$ heroku run ruby data-generators/text-generator/stream-lines-to-kafka.rb data-generators/text-generator/alice-in-wonderland.txt --app <app>

Alternatively, if you have Ruby and Bundler installed, you can run the data generator locally:

$ bundle install --path=vendor/gems
$ cd data-generators/text-generator
$ HEROKU_KAFKA_URL=$(heroku config:get HEROKU_KAFKA_URL) \
HEROKU_KAFKA_CLIENT_CERT=$(heroku config:get HEROKU_KAFKA_CLIENT_CERT) \
HEROKU_KAFKA_CLIENT_CERT_KEY=$(heroku config:get HEROKU_KAFKA_CLIENT_CERT_KEY) \
HEROKU_KAFKA_TRUSTED_CERT=$(heroku config:get HEROKU_KAFKA_TRUSTED_CERT) \
HEROKU_KAFKA_PREFIX=$(heroku config:get HEROKU_KAFKA_PREFIX) \
bundle exec ruby stream-lines-to-kafka.rb alice-in-wonderland.txt
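
The command above passes the Kafka connection settings through environment variables. For readers connecting their own JVM client with the same variables, here is a minimal sketch of turning HEROKU_KAFKA_URL into a value suitable for the Kafka client's bootstrap.servers setting. It assumes the variable holds a comma-separated list of kafka+ssl:// URLs. The class name, method name, and example hostnames are hypothetical and are not taken from this repository.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

public class KafkaUrlSketch {

    // Hypothetical helper: converts
    //   "kafka+ssl://host1:9096,kafka+ssl://host2:9096"
    // into
    //   "host1:9096,host2:9096"
    public static String toBootstrapServers(String herokuKafkaUrl) {
        List<String> hostPorts = new ArrayList<>();
        for (String part : herokuKafkaUrl.split(",")) {
            try {
                URI uri = new URI(part.trim());
                hostPorts.add(uri.getHost() + ":" + uri.getPort());
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException("Invalid broker URL: " + part, e);
            }
        }
        return String.join(",", hostPorts);
    }

    public static void main(String[] args) {
        String url = System.getenv("HEROKU_KAFKA_URL");
        if (url == null) {
            // Placeholder value so the sketch runs without a Heroku app.
            url = "kafka+ssl://broker-1.example.com:9096,kafka+ssl://broker-2.example.com:9096";
        }
        System.out.println(toBootstrapServers(url));
    }
}
```

The certificate variables (HEROKU_KAFKA_CLIENT_CERT and friends) would still need to be loaded into key and trust stores before an SSL connection can be made; that step is out of scope for this sketch.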

Now we can see the word count for specific time windows:

$ heroku pg:psql -c 'select * from windowed_counts order by time_window desc' HEROKU_POSTGRESQL_URL
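
To make "word count for specific time windows" concrete, here is a minimal plain-Java sketch of counting words per fixed-size (tumbling) window. It is an illustration only, not this repository's Kafka Streams topology: the class name, the window size, the key format, and the tokenization rule are all assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

public class WindowedWordCountSketch {

    // Counts words per tumbling window. Keys have the (assumed) form
    // "<windowStartMillis>|<word>". Words are lowercased and split on
    // non-word characters, which is a simplification (for example,
    // "Alice's" becomes "alice" and "s").
    public static Map<String, Long> count(long[] timestampsMs, String[] lines, long windowSizeMs) {
        Map<String, Long> counts = new TreeMap<>();
        for (int i = 0; i < lines.length; i++) {
            // Align the record's timestamp to the start of its window.
            long windowStart = timestampsMs[i] - (timestampsMs[i] % windowSizeMs);
            for (String word : lines[i].toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue;
                }
                counts.merge(windowStart + "|" + word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] timestamps = {1_000L, 2_000L, 11_000L};
        String[] lines = {"hello world", "hello Alice", "hello again"};
        // Assumed window size of 10 seconds.
        count(timestamps, lines, 10_000L)
            .forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

In this sketch the first two lines fall into the same window, so "hello" is counted twice there, while the third line starts a new window with its own counts.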

Anomaly Detection

Let's look at a more interesting use case. It is more realistic, and it better showcases continuously updating calculations based on a stream of data. You'll need two separate terminal windows for this.

In the first one, tail the Heroku application logs:

$ heroku logs --tail --app <app>

In the second one, we'll generate some data. The following command produces fake log data into Kafka at a rate of 10 messages per second with a 20% chance of anomaly:

$ heroku run ruby data-generators/log-generator/stream-logs-to-kafka.rb 10 .2 --app <app>

Alternatively, if you have Ruby and Bundler installed, you can run the data generator locally:

$ bundle install --path=vendor/gems
$ cd data-generators/log-generator
$ HEROKU_KAFKA_URL=$(heroku config:get HEROKU_KAFKA_URL) \
HEROKU_KAFKA_CLIENT_CERT=$(heroku config:get HEROKU_KAFKA_CLIENT_CERT) \
HEROKU_KAFKA_CLIENT_CERT_KEY=$(heroku config:get HEROKU_KAFKA_CLIENT_CERT_KEY) \
HEROKU_KAFKA_TRUSTED_CERT=$(heroku config:get HEROKU_KAFKA_TRUSTED_CERT) \
HEROKU_KAFKA_PREFIX=$(heroku config:get HEROKU_KAFKA_PREFIX) \
bundle exec ruby stream-logs-to-kafka.rb 10 .2

In the Heroku application logs, you will see STDOUT output reporting that an anomaly has been detected.
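
The detection rule used by this repository's topology is not described here. As a rough illustration of a continuously updating calculation, the plain-Java sketch below keeps a running count of flagged events per tumbling window and raises an alert at the moment the count reaches a threshold. The rule itself, the class name, the window size, and the threshold are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class AnomalyAlertSketch {
    private final long windowSizeMs;
    private final int threshold;
    // Running count of flagged events, keyed by window start time.
    private final Map<Long, Integer> flaggedPerWindow = new HashMap<>();

    public AnomalyAlertSketch(long windowSizeMs, int threshold) {
        this.windowSizeMs = windowSizeMs;
        this.threshold = threshold;
    }

    // Updates the running count for the event's window. Returns true only
    // when this event brings the window's count exactly to the threshold,
    // so each window alerts at most once.
    public boolean observe(long timestampMs, boolean flagged) {
        if (!flagged) {
            return false;
        }
        long windowStart = timestampMs - (timestampMs % windowSizeMs);
        int updated = flaggedPerWindow.merge(windowStart, 1, Integer::sum);
        return updated == threshold;
    }

    public static void main(String[] args) {
        // Assumed settings: 1-second windows, alert on the 2nd flagged event.
        AnomalyAlertSketch detector = new AnomalyAlertSketch(1_000L, 2);
        long[] timestamps = {100L, 200L, 300L, 400L, 1_100L};
        boolean[] flagged = {true, false, true, true, true};
        for (int i = 0; i < timestamps.length; i++) {
            if (detector.observe(timestamps[i], flagged[i])) {
                System.out.println("Anomaly detected at t=" + timestamps[i] + "ms");
            }
        }
    }
}
```

Because the count is updated on every event rather than at the end of the window, the alert appears as soon as the threshold is crossed, which mirrors the behavior you watch for in the tailed logs.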
