
demo - Confluent, Kafka, Spark, Databricks

DISCLAIMER: This repo is based on Sean Coyne's GitHub repository and his YouTube tutorial, and is intended for exercise purposes only.

  1. Get Twitter API credentials.
  2. Create an env file named .env with the lines below, replacing XXX with your actual keys (the file is ignored via .gitignore). A Python loading sketch follows the listing.
    CONSUMER_KEY = "XXX"
    CONSUMER_SECRET = "XXX"
    ACCESS_TOKEN_KEY = "XXX"
    ACCESS_TOKEN_SECRET = "XXX"
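
For reference, a minimal sketch of how a script might load these keys, assuming the python-dotenv package is installed (the variable names match the .env above; this is not the repo's actual code):

    # sketch: load Twitter credentials from .env (assumes python-dotenv)
    import os
    from dotenv import load_dotenv

    load_dotenv()  # reads .env from the current working directory

    consumer_key = os.getenv("CONSUMER_KEY")
    consumer_secret = os.getenv("CONSUMER_SECRET")
    access_token_key = os.getenv("ACCESS_TOKEN_KEY")
    access_token_secret = os.getenv("ACCESS_TOKEN_SECRET")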
    
  3. Create a free Confluent Cloud account.
  4. Create a Kafka cluster in Confluent Cloud.
  5. Create a Kafka topic named streaming_test_6 with 6 partitions.
  6. Get API credentials (an API key and secret) for the cluster from the Confluent UI.
  7. Set up a Databricks Community Edition account.
  8. Create a Databricks cluster.
  9. Import the kafka_test notebook.
  10. In the first cell of the notebook, replace the XXX with your values for confluentApiKey, confluentSecret, and host, which you will find in the Confluent UI (step 6). A sketch of the notebook's read logic follows the config block in step 11.
  11. Create a Kafka config file by running vi ~/.confluent/python.config. In the file, replace HOST, API_KEY, and API_SECRET with the values from Confluent Cloud:
    #kafka
    bootstrap.servers={HOST}:9092
    security.protocol=SASL_SSL
    sasl.mechanisms=PLAIN
    sasl.username={API_KEY}
    sasl.password={API_SECRET}
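
For orientation, a minimal sketch of what the notebook's first cells might look like (step 10). The kafkashaded JAAS prefix is specific to Databricks runtimes; the actual kafka_test notebook may differ:

    # sketch: read the topic as a stream in a Databricks notebook
    # (spark and display() are predefined in Databricks notebooks)
    confluentApiKey = "XXX"  # from the Confluent UI (step 6)
    confluentSecret = "XXX"
    host = "XXX"             # the cluster's bootstrap host

    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", f"{host}:9092")
          .option("kafka.security.protocol", "SASL_SSL")
          .option("kafka.sasl.mechanism", "PLAIN")
          .option("kafka.sasl.jaas.config",
                  "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
                  f'required username="{confluentApiKey}" password="{confluentSecret}";')
          .option("subscribe", "streaming_test_6")
          .option("startingOffsets", "earliest")
          .load())

    display(df)  # view the raw key/value stream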
  12. Build and run the Docker container: bash run.sh. A sketch of the producer logic the container runs appears below.
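
The container runs the producer that pushes tweets into the topic. A minimal sketch of that flow, assuming the confluent-kafka Python package; read_config is an illustrative helper for the file from step 11, not the repo's actual code:

    # sketch: produce a message using the config file from step 11
    import os
    from confluent_kafka import Producer

    def read_config(path):
        # illustrative helper: parse key=value lines, skipping comments
        conf = {}
        with open(os.path.expanduser(path)) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#"):
                    key, _, value = line.partition("=")
                    conf[key.strip()] = value.strip()
        return conf

    producer = Producer(read_config("~/.confluent/python.config"))
    producer.produce("streaming_test_6", value="hello from the producer")
    producer.flush()  # block until delivery completes

Note that the properties in python.config (sasl.mechanisms, etc.) are librdkafka-style, which is what the confluent-kafka client expects.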
