Twitter data streaming using Twitter APIs
Streaming real-time Twitter trending hashtag data using the Twitter API, Kafka, and Python.
- Python
- Twitter Developer Account
- Kafka
https://www.python.org/downloads/
https://developer.twitter.com/en/portal/products/elevated
https://kafka.apache.org/quickstart
Steps to run the code successfully -
- Stop any previously running ZooKeeper instance (only needed if ZooKeeper stops abruptly when you try to start it) -
sudo service zookeeper stop
- Start ZooKeeper (run this and the other bin/ commands from the Kafka installation directory) -
bin/zookeeper-server-start.sh config/zookeeper.properties
- Start the Kafka server in a new terminal -
bin/kafka-server-start.sh config/server.properties
- Create the topic (named trump here) -
bin/kafka-topics.sh --create --partitions 1 --replication-factor 1 --topic trump --bootstrap-server localhost:9092
- Run the Python producer in a new terminal -
python kafka_producer_pandas.py
- Start the Kafka console consumer in a new terminal -
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic trump --from-beginning
- Run the Python consumer and streaming code -
python kafka_consumer_pandas_df.py
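The producer and consumer steps above can be sketched roughly as follows. This is an illustration only, not the contents of kafka_producer_pandas.py or kafka_consumer_pandas_df.py. It assumes the kafka-python package (pip install kafka-python), tweets represented as dicts with a "text" field, and the topic and broker address used in the commands above. The helper names (encode_tweet, decode_tweet, count_hashtags, publish, consume) are hypothetical.

```python
import json
import re
from collections import Counter

TOPIC = "trump"               # topic created with kafka-topics.sh above
BOOTSTRAP = "localhost:9092"  # broker address used in the commands above

HASHTAG_RE = re.compile(r"#(\w+)")


def encode_tweet(tweet):
    """Serialize a tweet dict to UTF-8 JSON bytes for Kafka."""
    return json.dumps(tweet, ensure_ascii=False).encode("utf-8")


def decode_tweet(payload):
    """Inverse of encode_tweet: JSON bytes back to a dict."""
    return json.loads(payload.decode("utf-8"))


def count_hashtags(tweets):
    """Count hashtags (case-insensitive) across an iterable of tweet dicts."""
    counts = Counter()
    for tweet in tweets:
        tags = HASHTAG_RE.findall(tweet.get("text", ""))
        counts.update(tag.lower() for tag in tags)
    return counts


def publish(tweets, topic=TOPIC, bootstrap=BOOTSTRAP):
    """Send each tweet to Kafka. Assumption: kafka-python is installed."""
    from kafka import KafkaProducer  # imported lazily; needs a running broker

    producer = KafkaProducer(
        bootstrap_servers=bootstrap,
        value_serializer=encode_tweet,
    )
    for tweet in tweets:
        producer.send(topic, value=tweet)
    producer.flush()  # block until buffered messages are delivered
    producer.close()


def consume(topic=TOPIC, bootstrap=BOOTSTRAP, timeout_ms=5000):
    """Read the topic from the beginning and return hashtag counts."""
    from kafka import KafkaConsumer  # imported lazily; needs a running broker

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap,
        auto_offset_reset="earliest",    # like --from-beginning
        value_deserializer=decode_tweet,
        consumer_timeout_ms=timeout_ms,  # stop iterating when idle
    )
    tweets = [message.value for message in consumer]
    consumer.close()
    return count_hashtags(tweets)
```

With ZooKeeper and the Kafka server running, calling publish([{"text": "Hello #Kafka"}]) followed by consume() should return a Counter of the hashtags seen on the topic. Fetching tweets from the Twitter API is left out here because it depends on your developer account credentials.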
Streaming real-time Twitter data using the Twitter API, Kafka, and PySpark.
- Python
- Twitter Developer Account
- Kafka
- Spark
https://www.python.org/downloads/
https://developer.twitter.com/en/portal/products/elevated
https://kafka.apache.org/quickstart
https://spark.apache.org/downloads.html
Steps to run the code successfully -
- Stop any previously running ZooKeeper instance (only needed if ZooKeeper stops abruptly when you try to start it) -
sudo service zookeeper stop
- Start ZooKeeper -
bin/zookeeper-server-start.sh config/zookeeper.properties
- Start the Kafka server in a new terminal -
bin/kafka-server-start.sh config/server.properties
- Create the topic (skip this if it already exists from the previous section) -
bin/kafka-topics.sh --create --partitions 1 --replication-factor 1 --topic trump --bootstrap-server localhost:9092
- Run the PySpark producer in a new terminal -
python Kafka_producer_pyspark.py
- Start the Kafka console consumer in a new terminal -
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic trump --from-beginning
- Run the PySpark streaming code -
python kafka-pyspark-twitter-streaming.py
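For orientation, the streaming step above might look roughly like the sketch below. It is not the contents of kafka-pyspark-twitter-streaming.py. It assumes PySpark with Structured Streaming and that the Spark Kafka connector (the spark-sql-kafka-0-10 package matching your Spark and Scala versions) is available to the session, for example through the spark.jars.packages setting. The function names and the app name are hypothetical.

```python
def kafka_source_options(topic="trump", bootstrap="localhost:9092"):
    """Options for Spark's Kafka source, matching the commands above."""
    return {
        "kafka.bootstrap.servers": bootstrap,
        "subscribe": topic,
        "startingOffsets": "earliest",  # like --from-beginning
    }


def build_stream(spark, options):
    """Return a streaming DataFrame with each Kafka value as a string."""
    from pyspark.sql.functions import col  # lazy import; needs pyspark

    raw = spark.readStream.format("kafka").options(**options).load()
    # Kafka delivers keys and values as binary; cast the value to text.
    return raw.select(col("value").cast("string").alias("tweet_json"))


def main():
    """Print incoming messages to the console until interrupted."""
    from pyspark.sql import SparkSession  # lazy import; needs pyspark

    # Assumption: the Kafka connector package is already on the classpath.
    spark = SparkSession.builder.appName("twitter-kafka-sketch").getOrCreate()
    tweets = build_stream(spark, kafka_source_options())
    query = (
        tweets.writeStream
        .format("console")
        .outputMode("append")
        .start()
    )
    query.awaitTermination()
```

Calling main() with the Kafka server and producer running should print each micro-batch of messages to the terminal. The console sink is used only to keep the sketch short; a real job would parse tweet_json and aggregate or store the results.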