First and foremost, let me make clear that this is NOT a political post. I wanted to create a simple tutorial on sentiment analysis using comments from a Facebook page. The current POTUS stirs everyone’s emotions, which makes his page a great sample for trying a quick and dirty sentiment analysis algorithm.
If you want to get the data or sentiment code, click here.
If you want the text mining code, click here.
If you just want to see the pretty infographics, scroll all the way down.
If you are curious as to how to scrape data from FB, click here. You will need to become an FB developer. It’s super easy! Simply go to https://developers.facebook.com/ and request developer access. Then, you will need to create an app; the name doesn’t matter. In the app, click on settings and you will see two very important values that YOU SHOULD NEVER SHARE WITH ANYONE: the App ID and the App Secret. Those are the login credentials that allow you to scrape FB.
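One way to avoid sharing those two values by accident is to keep them out of the script entirely. Here is a minimal sketch, assuming you store them in environment variables; the names FB_APP_ID and FB_APP_SECRET and both helper functions are made up for this example. The "app id|app secret" token format is how I understand Facebook's Graph API documentation describes an app access token, so double-check the current docs before relying on it.

```python
import os


def load_fb_credentials(env=os.environ):
    """Read the App ID and App Secret from environment variables.

    FB_APP_ID and FB_APP_SECRET are hypothetical names chosen for this sketch.
    """
    app_id = env.get("FB_APP_ID")
    app_secret = env.get("FB_APP_SECRET")
    if not app_id or not app_secret:
        raise RuntimeError("Set FB_APP_ID and FB_APP_SECRET before scraping.")
    return app_id, app_secret


def app_access_token(app_id, app_secret):
    # Assumption: an app access token can be written as "<app id>|<app secret>".
    return "{}|{}".format(app_id, app_secret)
```

With this in place, the script you share or commit never contains the secret itself, only the names of the variables it expects.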
I have saved you the time of scraping the posts and comments; all the data is located here. Now onto sentiment analysis!
First, let’s import all the necessary libraries.
from textblob import TextBlob
import pandas as pd
import numpy as np
import nltk
import sys
import os
import string
import unicodedata
import csv
from nltk.text import Text
from nltk.probability import FreqDist
from collections import Counter
Now, let’s define our sentiment function and read in the data.
def sentiment(comment):
    analysis = TextBlob(comment)  # create a TextBlob object
    polarity = analysis.sentiment.polarity  # score between -1.0 and 1.0
    if polarity > 0:
        return 'positive'
    elif polarity == 0:
        return 'neutral'
    else:
        return 'negative'

#---Read data---#
# error_bad_lines=False skips malformed rows; on pandas 1.3+ use on_bad_lines="skip" instead
posts = pd.read_csv("POTUS_facebook_statuses.csv", error_bad_lines=False)
comments = pd.read_csv("comment.csv")
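The labelling rule in sentiment() is just a threshold on the polarity score, so it can be pulled out and tried without TextBlob at all. The sketch below is a variation, not part of the original code: label_from_polarity and its neutral_band parameter are names I made up. With the default band of 0.0 it applies the same positive/neutral/negative rule as sentiment(); a wider band treats barely-positive or barely-negative scores as neutral.

```python
def label_from_polarity(polarity, neutral_band=0.0):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    neutral_band is a hypothetical knob: scores whose absolute value is
    at most neutral_band are labelled 'neutral'.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


# Compare the strict rule with a slightly more forgiving one.
for score in (0.8, 0.05, 0.0, -0.3):
    strict = label_from_polarity(score)
    banded = label_from_polarity(score, neutral_band=0.1)
    print(score, strict, banded)
```

Whether a band helps depends on the data; it is one cheap thing to try if too many lukewarm comments end up labelled positive or negative.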
Let’s now see the counts for likes, loves, hahas, angry, and sad reactions on 95 of his posts.
posts.describe() # this will print back to the screen the basic stats of the data.
I made dis pretty.
So it seems like POTUS has quite a fan group. Now, I wonder, are people using “Like”, “Angry”, “Haha” properly? I guess there’s no way to know.
Let’s now call our sentiment function and use it on all the comments we found.
Again, this is super quick and dirty and I know for a fact there are both false positives and false negatives.
#---Get sentiment for all responses---#
sent = []  # one sentiment label per comment
for message in comments["comment_message"]:
    # Python 3 strings are already Unicode, so no .decode() call is needed
    result = sentiment(message)
    print(result)
    # keep the label as a string; float('positive') would raise a ValueError
    sent.append(result)
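If you want a rough idea of how many false positives and false negatives there are, one option is to hand-label a small sample of comments and compare. This is only a sketch of that idea; error_breakdown is a hypothetical helper, and the labels below are made up for illustration, not results from the Facebook data.

```python
from collections import Counter


def error_breakdown(predicted, actual):
    """Count (actual, predicted) label pairs and compute overall accuracy."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = Counter(zip(actual, predicted))
    correct = sum(n for (truth, guess), n in pairs.items() if truth == guess)
    accuracy = correct / float(len(actual)) if actual else 0.0
    return pairs, accuracy


# Made-up labels purely for illustration.
hand_labels = ["negative", "negative", "positive", "neutral"]
model_labels = ["positive", "negative", "positive", "neutral"]

pairs, accuracy = error_breakdown(model_labels, hand_labels)
print(pairs[("negative", "positive")])  # negative comments scored as positive
print(accuracy)
```

Even a few dozen hand-labelled comments would give a feel for which direction the quick and dirty approach tends to err in.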
We can use Counter from the collections module on the list of labels to see the total count for each sentiment.
#---How many negatives and positives?---#
Counter(sent)
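Raw counts are easier to compare as shares of the total. Here is a small sketch of that step; sentiment_shares is a name I made up, and the example list is invented so the snippet runs on its own (in the tutorial you would pass the list of labels instead).

```python
from collections import Counter


def sentiment_shares(labels):
    """Return {label: fraction of all labels}, most common label first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / float(total) for label, n in counts.most_common()}


# Invented labels for illustration only.
example = ["negative", "negative", "positive", "neutral"]
for label, share in sentiment_shares(example).items():
    print("{}: {:.0%}".format(label, share))
```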
I made dis pretty again.
I used my text mining code (click here to jump to it) to pull out a few of the topics being discussed in the comments. So, it seems the majority of the comments are negative, while the posts seem to have many likes. This, once again, makes me wonder if FB users are using the like button correctly. How fun!
Happy coding!