First and foremost, let me make clear that this is NOT a political post.  I wanted to create a simple tutorial on sentiment analysis using comments from a Facebook page.  The current POTUS stirs everyone’s emotions, which makes his page a great sample for trying a quick and dirty sentiment analysis algorithm.

If you want to get the data or sentiment code, click here.

If you want the text mining code, click here.

If you just want to see the pretty info-graphics, scroll all the way down.

If you are curious as to how to scrape data from FB, click here. You will need to become a FB developer.  It’s super easy! Simply go to https://developers.facebook.com/ and request developer access.  Then, you will need to create an app; the name doesn’t matter.  In the app, click on settings and you will see two very important values that YOU SHOULD NEVER SHARE WITH ANYONE: App ID and App Secret.  Those are the credentials that allow you to scrape FB.
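Since the App ID and App Secret should never be shared, one way to avoid leaking them is to keep them out of your script altogether. Here is a minimal sketch that reads them from environment variables and joins them into the `APP_ID|APP_SECRET` string that the Graph API accepts as an app access token. The variable names `FB_APP_ID` and `FB_APP_SECRET` and the helper `build_app_token` are placeholders I made up for this example, not anything Facebook requires.

```python
import os

def build_app_token(env=os.environ):
    """Build an app access token string from environment variables.

    FB_APP_ID and FB_APP_SECRET are hypothetical names chosen for this sketch.
    """
    app_id = env.get("FB_APP_ID")
    app_secret = env.get("FB_APP_SECRET")
    if not app_id or not app_secret:
        raise RuntimeError("Set FB_APP_ID and FB_APP_SECRET before scraping.")
    # The Graph API accepts "<app id>|<app secret>" as an app access token.
    return app_id + "|" + app_secret
```

This way the script can live in a public repo while the two values stay on your machine.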

I have saved you the time of scraping the posts and comments; all the data is located here. Now on to sentiment analysis!

First, let’s import all the necessary libraries.

from textblob import TextBlob
import pandas as pd
import numpy as np
import nltk
import sys
import os
import string
import unicodedata
import csv
from nltk.text import Text
from nltk.probability import FreqDist
from collections import Counter

Now, let’s define our sentiment function and read in the data.

def sentiment(comment):
    analysis = TextBlob(comment)  # create a TextBlob object
    polarity = analysis.sentiment.polarity  # score from -1.0 (negative) to 1.0 (positive)
    if polarity > 0:
        return 'positive'
    elif polarity == 0:
        return 'neutral'
    else:
        return 'negative'

#---Read data---#
# on_bad_lines="skip" replaces error_bad_lines=False, which was removed in pandas 2.0
posts = pd.read_csv("POTUS_facebook_statuses.csv", on_bad_lines="skip")
comments = pd.read_csv("comment.csv")
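TextBlob’s polarity score runs from -1.0 (most negative) to 1.0 (most positive), and the function above treats anything other than exactly 0 as an opinion. As a rough sketch of that mapping, here is the same thresholding pulled out into a helper that takes the score directly, so you can play with it without loading any data. The name `label_from_polarity` and the optional `neutral_band` parameter are my own additions for illustration; with the default of 0.0 it behaves like the function above.

```python
def label_from_polarity(polarity, neutral_band=0.0):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    neutral_band is a hypothetical knob: scores whose absolute value is
    at or below it count as neutral. The default 0.0 matches sentiment(),
    where only an exact 0 is neutral.
    """
    if polarity > neutral_band:
        return 'positive'
    elif polarity < -neutral_band:
        return 'negative'
    else:
        return 'neutral'

# Compare the default behavior with a wider neutral band.
for score in (0.8, 0.05, 0.0, -0.4):
    print(score, label_from_polarity(score), label_from_polarity(score, neutral_band=0.1))
```

A wider band is one possible way to keep barely-there scores from being counted as opinions, but whether it helps on these comments is something you would have to check.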

Let’s now look at the counts of likes, loves, hahas, angrys, and sads on 95 of his posts.

posts.describe() # this will print back to the screen the basic stats of the data.

I made dis pretty.

[Image: Screen Shot 2018-02-11 at 2.36.08 PM]

So it seems like POTUS has quite a fan group.  Now, I wonder, are people using “Like”, “Angry”, “Haha”  properly? I guess there’s no way to know.

Let’s now call our sentiment function and use it on all the comments we found.
Again, this is super quick and dirty and I know for a fact there are both false positives and false negatives.

#---Get sentiment for all responses---#
sent = []  # one label per comment
for message in comments["comment_message"]:
    # Python 3 strings are already Unicode, so no .decode() is needed;
    # str() guards against empty (NaN) cells
    result = sentiment(str(message))
    print(result)
    sent.append(result)
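One way to get a feel for the false positives and false negatives mentioned above is to hand-label a small sample of comments and compare those labels with what the function returned. Here is a minimal sketch of that comparison. The helper `error_breakdown` is hypothetical and the labels are made up for illustration; they are not taken from the real comments.

```python
from collections import Counter

def error_breakdown(predicted, actual):
    """Count (actual, predicted) label pairs and compute overall accuracy."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    pairs = Counter(zip(actual, predicted))
    correct = sum(n for (truth, guess), n in pairs.items() if truth == guess)
    accuracy = correct / len(actual) if actual else 0.0
    return pairs, accuracy

# Made-up labels for illustration only, not taken from the real comments.
hand_labels = ['negative', 'negative', 'positive', 'neutral', 'positive']
model_labels = ['positive', 'negative', 'positive', 'neutral', 'negative']

pairs, accuracy = error_breakdown(model_labels, hand_labels)
print(accuracy)
print(pairs[('negative', 'positive')])  # truly negative, labeled positive
```

Each key in `pairs` is an (actual, predicted) tuple, so the off-diagonal entries tell you which kind of mistake is most common.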

We can use Counter on the list of labels to see the total count for each sentiment.

#---How many negatives and positives?---#
Counter(sent)
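Raw counts are easier to compare as percentages. As a small sketch, here is how the Counter output could be turned into shares of the total. The helper `sentiment_shares` is my own name for this example, and the sample list is made up to stand in for the real `sent` list.

```python
from collections import Counter

def sentiment_shares(labels):
    """Return {label: percentage of all labels}, rounded to one decimal."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}

# Made-up labels standing in for the real `sent` list.
sample = ['negative', 'negative', 'positive', 'neutral']
print(Counter(sample).most_common())  # labels sorted by count, highest first
print(sentiment_shares(sample))
```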

I made dis pretty again.

[Image: Screen Shot 2018-02-11 at 2.37.10 PM]

I used my text mining code (click here to jump to it) to find a few of the topics being discussed in the comments. So, it seems the majority of the comments are negative while the posts seem to have many likes. This, once again, makes me wonder if FB users are using the like button correctly. How fun!

Happy coding!

Posted by: Aisha Pectyo

Astrophysicist turned data rockstar who speaks code and has enough yarn and mod podge to survive a zombie apocalypse.
