Using the Google NLP and Twitter APIs, I wrote a simple program called "Mood Plot" that summarizes people's sentiment on a topic over a given number of the most recent tweets related to that topic. It also plots the sentiment trend and the average sentiment score on a graph for visualization. The source code is in the file moodplot.py; the code and sample I/O are displayed below:
# Copyright 2021 Chuwei Chen [email protected]
# For the use of BU EC601 Project 2, with prof. Osama
import os
from matplotlib import pyplot as plt
from google.cloud import language_v1
import tweepy
# Setting the path for the Google credentials
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "/Users/francischen/Desktop/EC601/google_credential.json"
#Twitter API credentials
consumer_key = ""
consumer_secret = ""
bearer_token = ""
# Instantiate the client
client = language_v1.LanguageServiceClient()
class Twitter_feed:
    def __init__(self, comsumer_Key, consumer_Secret):
        self.consumer_key = comsumer_Key
        self.consumer_secret = consumer_Secret
        self.bearer_token = bearer_token
        self.sentiments = []
        self.invalid = 0

    def sentiment_analysis(self, keyword, count):
        # Authenticate with the credentials stored on this instance
        auth = tweepy.AppAuthHandler(self.consumer_key, self.consumer_secret)
        try:
            api = tweepy.API(auth)
            for tweet in api.search_tweets(q = keyword, count = count):
                text = tweet.text
                document = language_v1.Document(content=text, type_=language_v1.Document.Type.PLAIN_TEXT)
                # Error handling: invalid tweet text for Google NLP
                try:
                    sentiment = client.analyze_sentiment(request={'document': document}).document_sentiment
                except Exception:
                    self.invalid += 1
                    print("ERROR: Current tweet's text is invalid for Google NLP analysis...\n-------------------------")
                    # Skip this tweet: there is no sentiment to report for it
                    continue
                print("Text: {}".format(text))
                print("Sentiment: {:.1f}, {:.1f}\n-------------------------".format(sentiment.score, sentiment.magnitude))
                self.sentiments.append(round(sentiment.score,3))
        except Exception:
            print("ERROR! Current analysis failed due to some reasons...")
def analysis(keyword, count):
    # Type error check
    if (not count.isdigit()):
        print("Sorry, please give a positive number for number of tweets...")
        return ("Sorry, please give a positive number for number of tweets...")
    # Empty input error handling
    if (not keyword) or (int(count) <= 0):
        print ('Sorry, please try again and enter a valid topic and number of tweets...')
        return ('Sorry, please try again and enter a valid topic and number of tweets...')
    # Conduct the analysis
    twitter_acount = Twitter_feed(consumer_key, consumer_secret)
    twitter_acount.sentiment_analysis(keyword,count)
    # Calculate the average sentiment score and Check empty searched tweets
    if len(twitter_acount.sentiments) != 0:
        avg_sentiment = round(sum(twitter_acount.sentiments)/len(twitter_acount.sentiments),3)
    else:
        print ("No results found with the given topic... Please try again with another topic :)")
        return ("No results found with the given topic... Please try again with another topic :)")
    # Print out the results
    print(f"{len(twitter_acount.sentiments)} tweets are analyzed.")
    print(f"{twitter_acount.invalid} tweets are invalid.")
    print("The user's sentiment score trend is: ",*twitter_acount.sentiments)
    print("The user's average sentiment score is: ", avg_sentiment)
    # Plot the trend
    plt.plot(twitter_acount.sentiments, label = "sentiments trend")
    plt.axhline(y=avg_sentiment, color = 'r', linestyle = '-', label = "average sentiment")
    plt.legend(loc = "upper left")
    plt.xlabel("Number of the Tweet")
    plt.ylabel("Sentiment score")
    plt.ylim(-1,1)
    plt.title(f"Sentiment trend of the topic '{keyword}' over {count} most-recent tweets")
    plt.grid(True)
    plt.show()

def main():
    # Ask user for inputs
    keyword = input("Happy to see you here!\nPlease enter a topic that you are interested in: ")
    count = input("Number of tweets you want to analyze for: ")
    # Call the analysis func!
    analysis(keyword, count)

if __name__ == '__main__':
    main()
Sample I/O:
Happy to see you here!
Please enter a keyword that you are interested in: tesla
Number of tweets you want to analyze for: 20
20 tweets are analyzed.
0 tweets are invalid.
The user's sentiment score trend is: -0.2 0.2 0.2 0.0 -0.4 -0.1 -0.4 -0.1 0.2 0.0 0.2 0.4 0.2 0.2 0.2 0.1 0.2 0.2 0.2 0.2
The user's average sentiment score is: 0.07499999999999998
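The average in the sample output is the arithmetic mean of the per-tweet scores. As a minimal sketch (the helper name `average_score` is hypothetical and not part of moodplot.py), rounding the mean to three decimals, as `analysis` does, should turn the floating-point residue shown above into roughly 0.075:

```python
from statistics import fmean

def average_score(scores, ndigits=3):
    """Return the mean of the scores rounded to ndigits, or None if there are none."""
    if not scores:
        return None
    return round(fmean(scores), ndigits)

# Scores copied from the sample run above
sample = [-0.2, 0.2, 0.2, 0.0, -0.4, -0.1, -0.4, -0.1, 0.2, 0.0,
          0.2, 0.4, 0.2, 0.2, 0.2, 0.1, 0.2, 0.2, 0.2, 0.2]
print(average_score(sample))
```

Returning `None` for an empty list keeps the "no results found" case separate from a genuine average of zero.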
** To run moodplot.py properly, please prepare your Google NLP credential file first and replace my credential path with the path to your own credential file in the 'Setting the path for the Google credentials' part of moodplot.py **
- If a tweet's text is not valid for Google NLP (e.g. the language is not supported), the program counts the invalid tweet and continues to retrieve tweets until the user's desired number of tweets is reached. (This case is not tested in test_moodplot.py because recent tweets change constantly, so I can't guarantee that every run will include tweets that are invalid for Google NLP.)
- If the user's input for keyword or count is empty (e.g. the user forgot to give an input), the program returns a message reminding the user to give a valid topic and number of tweets on the next try.
- If the user's topic is so narrow that no recent tweets are found for it, the program returns a message informing the user that no results were found for that topic and asks them to try another topic.
- If the user inputs a non-digit string for count, the program returns a message asking the user for a positive number for count.
- If the user inputs 0 or a negative number for count (i.e. no tweets to analyze), the program returns a message asking the user for a valid number for count.
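The invalid-tweet case in the list above is hard to test against live tweets, but it could be covered deterministically if the scoring step took the analyzer as a parameter. The sketch below assumes a hypothetical helper `score_texts` and a fake analyzer `fake_analyze`; neither exists in moodplot.py, where the Google NLP client is called directly.

```python
def score_texts(texts, analyze):
    """Score each text with analyze() and count the texts it rejects.

    Returns a (scores, invalid_count) tuple.
    """
    scores, invalid = [], 0
    for text in texts:
        try:
            scores.append(round(analyze(text), 3))
        except Exception:
            # A text the analyzer cannot handle is counted and skipped
            invalid += 1
    return scores, invalid

def fake_analyze(text):
    """Stand-in for the Google NLP call: rejects any non-ASCII text."""
    if not text.isascii():
        raise ValueError("language not supported")
    return 0.5 if "good" in text else -0.5

def test_invalidTweet():
    "If a tweet is invalid for the analyzer, count it and keep going"
    scores, invalid = score_texts(["good day", "日本語", "bad day"], fake_analyze)
    assert scores == [0.5, -0.5]
    assert invalid == 1
```

Because the fake analyzer is deterministic, the test does not depend on which tweets happen to be recent.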
The source code of the unit-test script is in the file test_moodplot.py and is also displayed below:
import moodplot as mp

def test_invalidTopic():
    "If the topic's search result is empty, return the error message"
    assert mp.analysis("SOMETEXTthatIfiguredTHEREwillBEnoResults", '1') == \
        "No results found with the given topic... Please try again with another topic :)"

def test_emptyInputs():
    "If the user does not give topic or count, return the error message"
    assert mp.analysis("", '5') == \
        'Sorry, please try again and enter a valid topic and number of tweets...'

def test_badCount():
    "If the user inputs a non-digit string for count, return the error message"
    assert mp.analysis("Olympic", "badinput") == \
        "Sorry, please give a positive number for number of tweets..."

def test_zeroCount():
    "If the user inputs 0 or negative number for count, return the error message"
    assert mp.analysis("Olympic", '0') == \
        "Sorry, please try again and enter a valid topic and number of tweets..."
    assert mp.analysis("Olympic", '-1') == \
        "Sorry, please give a positive number for number of tweets..."
The output of running python -m test is displayed below:
Story #1
- I just heard the news that the iPhone 13 was released and I am unsure whether I should replace my old iPhone. I'm too lazy to read the product reviews, so I use Moodplot to see how people feel about the iPhone 13 and help me decide whether to buy it.
Story #2
- I need help deciding which movie to see: Venom 2 or NO TIME TO DIE (James Bond's new movie!). Moodplot can show me the average sentiment score for each of them, and I will watch the movie with the higher score.
Story #3
- Halloween is here, but I am not sure which costume is trending now. Maybe a costume from the Korean show "Squid Game" would be a good choice? Let me use Moodplot to check how much people like the show, and maybe I'll dress up as the triangle soldier from the show!
- A module that can take the user's inputs for the topic and the number of tweets to analyze
- A module that can analyze a given text and return a sentiment score
- A module that can retrieve text from social media posts
- A module that can plot the sentiment score over the number of tweets
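One way to picture how these four modules fit together is to treat each one as a function and pass them into a small driver. This is only a sketch of the structure: the name `run_mood_plot`, its parameter names, and the stand-in lambdas are all hypothetical and do not appear in moodplot.py.

```python
from typing import Callable, List, Tuple

def run_mood_plot(
    read_inputs: Callable[[], Tuple[str, int]],
    fetch_texts: Callable[[str, int], List[str]],
    score_text: Callable[[str], float],
    plot_scores: Callable[[List[float]], None],
) -> List[float]:
    """Wire the four modules together and return the per-post scores."""
    topic, count = read_inputs()             # input module
    texts = fetch_texts(topic, count)        # retrieval module
    scores = [score_text(t) for t in texts]  # sentiment module
    plot_scores(scores)                      # plotting module
    return scores

# Stand-ins so the sketch runs without Twitter, Google NLP, or matplotlib
plotted = []
scores = run_mood_plot(
    read_inputs=lambda: ("tesla", 2),
    fetch_texts=lambda topic, count: [f"{topic} post {i}" for i in range(count)],
    score_text=lambda text: 0.5,
    plot_scores=plotted.append,
)
print(scores)
```

Keeping each module behind a plain function makes it possible to swap the real services for stand-ins like these.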
** The following sections explain the details of how my app works. **
First, I used Google's Cloud Natural Language API, which provides Google's pre-trained models for general content classification, sentiment analysis, entity recognition, and more. After getting access to the client library, I connected to the Cloud and tested the sentiment of several sentences using my Python script. The script is called language.py in this repository.
Code example
# Imports the Google Cloud client library
from google.cloud import language_v1
# Instantiates a client
client = language_v1.LanguageServiceClient()
# The text to analyze
text = input("Give me a sentence:\n")
document = language_v1.Document(content=text, type_=language_v1.Document.Type.PLAIN_TEXT)
# Detects the sentiment of the text
sentiment = client.analyze_sentiment(request={'document': document}).document_sentiment
print("Text: {}".format(text))
print("Sentiment: {:.1f}, {:.1f}".format(sentiment.score, sentiment.magnitude))
A snippet of the output is attached below:
(base) francischen@Ccws-Macbook-Pro EC601 % python language.py
Give me a sentence:
Professor Osama is the best!
Text: Professor Osama is the best!
Sentiment: 0.9, 0.9
(base) francischen@Ccws-Macbook-Pro EC601 % python language.py
Give me a sentence:
Today's weather sucks
Text: Today's weather sucks
Sentiment: -0.7, 0.7
(base) francischen@Ccws-Macbook-Pro EC601 % python language.py
Give me a sentence:
I hate this world.
Text: I hate this world.
Sentiment: -0.9, 0.9
(base) francischen@Ccws-Macbook-Pro EC601 % python language.py
Give me a sentence:
Could you pass me the pen?
Text: Could you pass me the pen?
Sentiment: -0.3, 0.3
As observed here, the API returns a sentiment score for each input sentence. The scores in these samples range from -0.9 to 0.9 (the API's score scale itself runs from -1.0 to 1.0), and more positive sentences get higher scores.
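To make such scores easier to read, a score and magnitude pair could be mapped to a coarse label. This is only a sketch: `describe_sentiment` and its cutoffs (0.25 for the score, 1.0 for the magnitude) are arbitrary assumptions chosen for illustration, not values defined by the API or used in this project.

```python
def describe_sentiment(score, magnitude, score_cutoff=0.25, mixed_cutoff=1.0):
    """Map a (score, magnitude) pair to a coarse label using assumed cutoffs."""
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # Near-zero score: strong magnitude suggests mixed feelings, weak suggests neutral
    return "mixed" if magnitude >= mixed_cutoff else "neutral"

# (text, score, magnitude) triples taken from the sample runs above
for text, score, magnitude in [
    ("Professor Osama is the best!", 0.9, 0.9),
    ("Today's weather sucks", -0.7, 0.7),
    ("Could you pass me the pen?", -0.3, 0.3),
]:
    print(f"{describe_sentiment(score, magnitude):8} {text}")
```

With these cutoffs the polite request "Could you pass me the pen?" is still labeled negative, which shows how sensitive such labels are to the chosen thresholds.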
Next, I wrote a program to retrieve tweets. The Python script is called Twitter_Retreive.py in this repository.
Code example
#!/usr/bin/env python
# encoding: utf-8
import requests
from requests import session
import tweepy # https://github.com/tweepy/tweepy
import json

#Twitter API credentials
consumer_key = "XXXXXXXX"
consumer_secret = "XXXXXXX"
bearer_token = "XXXXXXX"

class twitter_access:
    def __init__(self,comsumer_Key, consumer_Secret):
        self.consumer_key = comsumer_Key
        self.consumer_secret = consumer_Secret
        self.bearer_token = bearer_token

    def build_connections(self):
        # Authenticate with the credentials stored on this instance
        auth = tweepy.AppAuthHandler(self.consumer_key, self.consumer_secret)
        try:
            api = tweepy.API(auth)
            # Print the text of 10 recent tweets that match the query 'tweepy'
            for tweet in tweepy.Cursor(api.search_tweets, q='tweepy').items(10):
                print(tweet.text)
        # Tweepy 4.x, which provides search_tweets, raises TweepyException
        except tweepy.TweepyException:
            print('Error! Failed to get request token.')

if __name__ == '__main__':
    web_app = twitter_access(consumer_key, consumer_secret)
    web_app.build_connections()
The output is below:
(base) francischen@Ccws-Macbook-Pro EC601 % python Twitter_Retreive.py
tweepy + oauth!
@CoderDrax Latest Covid Updates:-
Total Cases : 233,569,801
Active Cases :18,399,778 Closed… https://t.co/26sbqm3ZJd
RT @chinryu: 日本のツイートトレンドTOP10は['ゴルゴ13', '岸田さん', '決選投票', 'さいとう・たかをさん死去', '高市さん', 'すい臓がんのため死去', '本人の遺志', 'オリコン', 'いとう先生', '#自民党総裁選']です #Pytho…
日本のツイートトレンドTOP10は['ゴルゴ13', '岸田さん', '決選投票', 'さいとう・たかをさん死去', '高市さん', 'すい臓がんのため死去', '本人の遺志', 'オリコン', 'いとう先生', '#自民党総裁選']です #Python #Tweepy
hello tweepy ですよん
@CoderDrax Latest Covid Updates:-
Total Cases : 233,569,801
Active Cases :18,399,778 Closed… https://t.co/1btK1p5Len
hello tweepy tweet
hello tweepy tweet https://t.co/9RsHR0XAIU
日本のツイートトレンドTOP10は['ゴルゴ13', 'さいとう・たかをさん死去', 'オリコン', '本人の遺志', 'すい臓がんのため死去', 'いとう先生', '#出産直後の感想教えて', '#招き猫の日', '… https://t.co/Tfa7oqPGQA
hello tweepy