This project focuses on developing my knowledge and proficiency in Apache Airflow, Apache Kafka, Apache Spark, and Docker. These technologies play a key role in distributed, cloud-based deployments and are widely used for workflow orchestration, stream processing, and containerized application management.
Apache Airflow is an open-source platform used for orchestrating and scheduling complex workflows. It allows you to define, manage, and monitor workflows as directed acyclic graphs (DAGs). With Airflow, you can easily schedule and execute tasks, making it ideal for data pipelines, ETL processes, and workflow automation.
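As a rough illustration, here is a minimal DAG sketch in Python, assuming Airflow 2.x (newer releases use the `schedule` argument instead of `schedule_interval`); the DAG id, schedule, and task callables are placeholders rather than part of this project.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull raw data from a source system.
    return "raw data"


def transform():
    # Placeholder: clean and reshape the extracted data.
    pass


with DAG(
    dag_id="example_etl",            # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",      # run once per day
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # The >> operator defines the DAG edge: extract runs before transform.
    extract_task >> transform_task
```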
Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant, and scalable data streaming. It provides a publish-subscribe model, where producers publish data to topics and consumers subscribe to those topics to consume the data in real time. Kafka is widely used for building real-time data pipelines, event-driven architectures, and streaming applications.
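A minimal publish-subscribe sketch follows, assuming the `kafka-python` client and a broker reachable at `localhost:9092` (both are assumptions for illustration; the topic name is hypothetical):

```python
from kafka import KafkaProducer, KafkaConsumer

# Producer: publish a few messages to a topic.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
for i in range(3):
    producer.send("events", f"event-{i}".encode("utf-8"))  # "events" is a placeholder topic
producer.flush()

# Consumer: subscribe to the same topic and read messages as they arrive.
consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",     # start from the beginning of the topic
    consumer_timeout_ms=5000,         # stop iterating if no message arrives for 5 s
)
for message in consumer:
    print(message.value.decode("utf-8"))
```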
Apache Spark is a fast and general-purpose cluster computing system. It provides an interface for distributed data processing and analytics, supporting various programming languages such as Scala, Java, Python, and R. Spark offers in-memory processing, fault tolerance, and a wide range of libraries for batch processing, stream processing, machine learning, and graph processing.
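For example, a small PySpark batch job might look like the sketch below, assuming `pyspark` is installed and Spark runs in local mode; the dataset is made up for illustration:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start a local Spark session (local[*] uses all available cores).
spark = SparkSession.builder.master("local[*]").appName("example-batch").getOrCreate()

# Hypothetical in-memory dataset; in practice this would be read from files or Kafka.
df = spark.createDataFrame(
    [("alice", 34), ("bob", 29), ("carol", 41)],
    ["name", "age"],
)

# A simple aggregation executed by the Spark engine (here, local threads).
df.agg(F.avg("age").alias("avg_age")).show()

spark.stop()
```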
Docker is an open-source platform that enables developers to automate the deployment and management of applications within containers. Containers provide a lightweight and isolated environment for running applications, ensuring consistency across different environments. Docker simplifies the process of packaging, distributing, and running applications, making it easier to build and deploy software in a reproducible manner.
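As an illustration, the Docker SDK for Python (the `docker` package, an assumption here, not something this project requires) can start an application inside a container programmatically; the image and command below are placeholders:

```python
import docker

# Connect to the local Docker daemon using environment defaults.
client = docker.from_env()

# Run a throwaway container from a public image and capture its output.
output = client.containers.run(
    "python:3.11-slim",                  # hypothetical base image
    ["python", "-c", "print('hello from a container')"],
    remove=True,                         # delete the container after it exits
)
print(output.decode("utf-8"))
```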