DISCLAIMER: This repo is based on Sean Coyne's GitHub repository and YouTube videos, and was created for exercise purposes.
- Get Twitter API credentials
- Create an env file named `.env` with the following, replacing XXX with your actual keys (the file is ignored via `.gitignore`):

```
CONSUMER_KEY = "XXX"
CONSUMER_SECRET = "XXX"
ACCESS_TOKEN_KEY = "XXX"
ACCESS_TOKEN_SECRET = "XXX"
```
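The repo presumably loads these values at runtime (e.g. with a library such as python-dotenv); purely as an illustration, a minimal stdlib parser for this `.env` format might look like:

```python
def load_env(path=".env"):
    """Parse KEY = "VALUE" lines from a .env-style file into a dict.
    Minimal sketch; real projects typically use python-dotenv instead."""
    env = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # Skip blank lines, comments, and anything without '='
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip('"')
    return env

# Usage (assumed): creds = load_env(); creds["CONSUMER_KEY"]
```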
- Create a free Confluent Cloud account
- Create a Kafka cluster in Confluent Cloud
- Create a Kafka topic named `streaming_test_6` with 6 partitions
- Get API credentials and the cluster host from the Confluent UI
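If you prefer the Confluent CLI over the web UI, the topic and API key can be created with commands along these lines (a sketch only; the cluster ID `lkc-xxxxxx` is a placeholder, and flags assume a current `confluent` CLI):

```shell
# Log in and point the CLI at your cluster (ID is a placeholder)
confluent login
confluent kafka cluster use lkc-xxxxxx

# Create the topic with 6 partitions
confluent kafka topic create streaming_test_6 --partitions 6

# Create an API key scoped to the cluster
confluent api-key create --resource lkc-xxxxxx
```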
- Set up a Databricks Community Edition account
- Create a Databricks cluster
- Import the `kafka_test` notebook
- In the first cell of the notebook, replace the XXX with your values for `confluentApiKey`, `confluentSecret`, and `host`, which you will find in the Confluent UI (step 6)
- Create a Kafka config file by running `vi ~/.confluent/python.config`. In the file, replace HOST, API_KEY, and API_SECRET with the values from Confluent Cloud:
```
#kafka
bootstrap.servers={HOST}:9092
security.protocol=SASL_SSL
sasl.mechanisms=PLAIN
sasl.username={API_KEY}
sasl.password={API_SECRET}
```
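Client code (for example, code using the `confluent-kafka` Python library) would read this file into a config dict before creating a producer or consumer; a minimal stdlib sketch of that parsing, assuming the simple `key=value` format above:

```python
import os

def read_ccloud_config(path):
    """Read key=value lines from a Confluent client config file into a dict,
    skipping blank lines and '#' comments. Minimal sketch."""
    conf = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            conf[key.strip()] = value.strip()
    return conf

# Assumed usage with confluent-kafka:
# producer = Producer(read_ccloud_config(os.path.expanduser("~/.confluent/python.config")))
```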
- Build and run the Docker container:

```
bash run.sh
```
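For reference, the notebook's first cell presumably assembles `confluentApiKey`, `confluentSecret`, and `host` into options for Spark's Kafka source. A hedged sketch (option names follow Spark's Kafka integration docs; the `kafkashaded` JAAS prefix is a Databricks-runtime assumption, and the host value is a placeholder):

```python
confluentApiKey = "XXX"  # from the Confluent UI
confluentSecret = "XXX"  # from the Confluent UI
host = "pkc-xxxxx.region.provider.confluent.cloud"  # placeholder host

# JAAS config for SASL/PLAIN auth; on Databricks the Kafka client classes are shaded.
jaas = (
    "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
    f'username="{confluentApiKey}" password="{confluentSecret}";'
)

kafka_options = {
    "kafka.bootstrap.servers": f"{host}:9092",
    "kafka.security.protocol": "SASL_SSL",
    "kafka.sasl.mechanism": "PLAIN",
    "kafka.sasl.jaas.config": jaas,
    "subscribe": "streaming_test_6",
    "startingOffsets": "earliest",
}

# In the notebook (assumed usage):
# df = spark.readStream.format("kafka").options(**kafka_options).load()
```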