Some of you may know that I am obsessed with skincare. Now, you may think I am exaggerating: maybe she only has a couple of products. I have 50-75 products. It’s a problem. It goes without saying that I do not use that many products on a given day. My routine consists of 6-10 steps twice a day. r/SkincareAddiction has been a big part of my love for skincare. It’s such a welcoming and amazing community – it’s what the internet is supposed to be about!

I am not sure why I didn’t do this earlier, but I figured I’d roll up my sleeves, crawl the subreddit comments and paint a picture of what goes on in there, skincare-wise! This post focuses on the findings rather than the code. If you want a link to the full script, please click here. Scroll to the bottom for a summarized infographic.

The data was obtained from the Reddit API and the Pushshift API, using praw and psaw. I removed all comments by the AutoModerator and any duplicates.
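For the curious, here is a minimal sketch of what that cleaning step could look like. It is not the actual script: it assumes each comment was saved as a dictionary with `author` and `body` keys, and it treats two comments as duplicates when their text is identical. Both are my assumptions for illustration.

```python
def clean_comments(comments):
    """Drop AutoModerator comments and exact duplicate bodies (first one wins)."""
    seen_bodies = set()
    cleaned = []
    for comment in comments:
        # Assumption: the author is stored as a plain string under "author".
        if comment["author"] == "AutoModerator":
            continue
        body = comment["body"].strip()
        # Assumption: a duplicate is a comment whose text matches exactly.
        if body in seen_bodies:
            continue
        seen_bodies.add(body)
        cleaned.append(comment)
    return cleaned


# Made-up sample data, only to show the behavior.
sample = [
    {"author": "AutoModerator", "body": "Welcome to the daily thread!"},
    {"author": "user_a", "body": "CeraVe works great for my dry skin."},
    {"author": "user_b", "body": "CeraVe works great for my dry skin."},
    {"author": "user_c", "body": "Which sunscreen do you use?"},
]
cleaned = clean_comments(sample)
print([c["author"] for c in cleaned])
```

Keeping the first occurrence of a duplicate is an arbitrary choice; the counts later on would be the same either way.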

 

first look

The first thing I wanted to look at was word count.  I know I can get carried away when I reply to reddit posts.

[Figure: distribution of word counts per comment]

This is a distribution plot. It is an ideal visualization to understand the shape or the range of your data. Here we can see that most comments are below the 46-word mark, and after that they start trailing off.
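If you want to reproduce the numbers behind a plot like this, a rough sketch could look like the one below. It assumes words are simply whitespace-separated chunks, and the `histogram` helper and its bin width are hypothetical names I made up for this example.

```python
from collections import Counter


def word_counts(comments):
    """Number of whitespace-separated words in each comment."""
    return [len(comment.split()) for comment in comments]


def histogram(values, bin_width=10):
    """Count how many values fall in each bin, keyed by the bin's lower edge."""
    bins = Counter((value // bin_width) * bin_width for value in values)
    return dict(sorted(bins.items()))


# Made-up comments, only for illustration.
comments = [
    "Love this moisturizer",
    "I use sunscreen every single morning, even in winter",
    "Thanks!",
]
counts = word_counts(comments)
print(counts)
print(histogram(counts, bin_width=5))
```

The same two helpers work for word lengths too: swap the comment list for a list of words and count characters instead.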

I wanted to do the same thing for the length of the words used.

[Figure: distribution of word lengths]

On average, the words used contain 3-5 characters. Most of the words in our day-to-day language are short, so no surprise here.

Then, I wanted to look at the overall word frequencies to start getting an idea of what the comments are about.

[Figure: most frequent words]

The plot above can already tell us a lot about the favorite topics of this subreddit. The top two words aren’t surprising and, honestly, I should have removed them from the sample, as we know skin and face will be popular words. Acids, routines, moisturizers, sunscreens and cleansers are incredibly popular, with thousands of mentions. Acne may be a big issue for this subreddit, but at this stage, with just word frequencies, it is too early to tell.
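Counting word frequencies like this is a few lines with Python’s `collections.Counter`. The sketch below is only an illustration: the tiny `STOPWORDS` set is a stand-in I made up, and a real analysis would use a much fuller list.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword list, just for this example.
STOPWORDS = {"i", "my", "the", "a", "and", "for", "is", "it", "to", "on", "of"}


def word_frequencies(comments):
    """Count lowercase word tokens across all comments, skipping stopwords."""
    counter = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z']+", comment.lower())
        counter.update(token for token in tokens if token not in STOPWORDS)
    return counter


# Made-up comments, only for illustration.
comments = [
    "My skin loves this moisturizer",
    "Sunscreen is a must for my face and skin",
    "The acid broke out my skin",
]
freqs = word_frequencies(comments)
print(freqs.most_common(3))
```

Dropping obvious words such as skin and face would just mean adding them to the stopword set.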

deep dive: top brands, key parts of speech

I don’t have a dictionary to tell me the names of each brand out there; hence, I had to come up with a way to flag possible brand names. Traditionally, in most NLP analyses you want to quickly clean your data by lowercasing it and removing stopwords. While I will do that, I wanted to look at the parts of speech before doing any cleaning. Most people will capitalize brand names; hence, if I leave my text as is (hoping reddit peeps have good grammar!), my script should be able to flag those capitalized terms and help me decide what’s a brand or product name. Doing that revealed the following:

[Figure: most common brands and products]

I find this hilarious because it is SO TRUE. r/SkincareAddiction LOVES CeraVe. My husband loves it too, and I like their body stuff a lot. Personally, I am not a huge fan of The Ordinary, but people love it too. My favorite skincare brands are Farmacy, MISSHA and Dr. Jart+.
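The capitalization trick I described for flagging brand names can be approximated without a part-of-speech tagger. The sketch below is a simplified stand-in, not the method from my script: it skips the first word of each sentence (which is capitalized anyway) and counts every other capitalized word as a brand candidate. The function name and the punctuation handling are assumptions, and multi-word names like The Ordinary or names with periods like Dr. Jart+ would need extra care.

```python
import re
from collections import Counter


def brand_candidates(comments):
    """Count capitalized words that do not start a sentence."""
    counter = Counter()
    for comment in comments:
        # Rough sentence split on ., ! and ? (a simplification).
        for sentence in re.split(r"[.!?]+\s*", comment):
            words = sentence.split()
            # Skip the first word: it is capitalized whether or not it is a name.
            for word in words[1:]:
                token = word.strip(",;:()\"'")
                if token[:1].isupper() and token != "I":
                    counter[token] += 1
    return counter


# Made-up comments, only for illustration.
comments = [
    "I switched to CeraVe last month. My skin likes CeraVe a lot!",
    "Has anyone tried Neutrogena or Cetaphil?",
]
candidates = brand_candidates(comments)
print(candidates.most_common())
```

The output is only a list of candidates; deciding what is actually a brand or product name still takes a human pass.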

Once I got this information, I did go ahead and clean up the text data to reveal key nouns, adjectives, verbs and adverbs.

[Figure: key nouns, adjectives, verbs and adverbs]

From this plot, we can start to see a story. It seems that most people in the subreddit struggle with dry and sensitive skin. They’re addressing these issues by incorporating, trying or recommending products, and by asking the subreddit about moisturizers, acids, cleansers, how long it takes to see results, etc.

At this point, I wanted to get a more concise idea of what topics were being discussed so I ran an LDA model to cluster the text into groups. These were the main themes:

  1. Dry and oily skin types going to the subreddit for recommendations or to discuss their routines, specifically moisturizers, oils and cleansers.
  2. Thanking the community for its support along their skincare journey.
  3. Which sunscreens to buy, which one is best for your skin type and how to use them.
  4. Acne journey, products, recommendations.
  5. Incorporating acids, serums, niacinamide into your routine.
  6. Misc. skin issues.

deep dive: skin type differences

I wanted to go a step deeper and look at the two main skin types and see what the subreddit had to say about them. I filtered the comments by checking which comments contained “dry skin” or “oily skin”. Below are the key words for each skin type.

[Figures: key words for dry skin and for oily skin]

I had dry skin, so I can see how terms like hormonal or sensitive apply to dry skin. And yes, as much as I love winter, it is terrible for dry skin. It makes sense that dry-skin-peeps are going to the subreddit to get advice on how to cope with dry skin during the winter time. Also, screw fragrance (lol!).

We see a similar story for oily skin. It can be sensitive and it IS GENETIC. Don’t feel ashamed of your acne. We see a lot of keywords related to pores, redness, oil control, etc.
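Splitting the comments by skin type is the simplest step of the whole analysis. A minimal sketch, assuming a case-insensitive substring match on the phrases “dry skin” and “oily skin” (the function name is made up for this example):

```python
def comments_mentioning(comments, phrase):
    """Return the comments that contain the phrase, ignoring case."""
    phrase = phrase.lower()
    return [comment for comment in comments if phrase in comment.lower()]


# Made-up comments, only for illustration.
comments = [
    "Winter is brutal on my dry skin",
    "Oily skin here, any sunscreen recs?",
    "I have dry skin on my cheeks and oily skin on my nose",
    "Just here to say thanks!",
]
dry = comments_mentioning(comments, "dry skin")
oily = comments_mentioning(comments, "oily skin")
print(len(dry), len(oily))
```

Note that a comment mentioning both phrases lands in both groups, which is why the two keyword lists can overlap.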

The last thing I wanted to explore was brands by skin type. When I looked at the distribution of brands per skin type this is what I found:

Dry Skin:

  • The Ordinary
  • CeraVe
  • Neutrogena
  • Cetaphil
  • La Roche-Posay
  • Vaseline
  • Differin
  • Vanicream
  • Hada Labo

Oily Skin:

  • The Ordinary
  • CeraVe
  • Neutrogena
  • Cetaphil
  • Paula’s Choice
  • Witch Hazel (Brand?)

The Ordinary was the top brand for both skin types, and I wanted to know which products specifically were associated with dry and oily skin. To do this, I combined trigrams and parts of speech. I made sure my trigrams contained the words “dry skin” and/or “oily skin” and that the following word was a noun. This is what it revealed (I may not have the exact names, so I apologize; I tried to get as close as I could):

TO Products for Dry Skin:

  • Retinol Squalane 1%
  • AHA 30% BHA 2% Peeling Solution
  • 100% Plant-Based Squalane
  • 100% Rose Hip Seed Oil
  • Vitamin C
  • Buffet
  • Toner (couldn’t get any more specifics)
  • Hyaluronic Acid

TO Products for Oily Skin:

  • AHA 30% BHA 2% Peeling Solution
  • Caffeine Solution
  • Hyaluronic Acid
  • NMF
  • Retinol Squalane 1%
  • Glycolic Acid 7% Toning Solution
  • Niacinamide
  • Azelaic Acid
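The trigram step behind these product lists can be sketched roughly as follows. This is a simplified illustration, not my actual script: it builds trigrams from lowercase tokens and collects whatever word follows “dry skin” or “oily skin”. The noun check is left out here, since it needs a part-of-speech tagger, and the function names are made up for this example.

```python
import re
from collections import Counter


def trigrams(tokens):
    """All runs of three consecutive tokens."""
    return list(zip(tokens, tokens[1:], tokens[2:]))


def words_after_phrase(comments, first, second):
    """Count the third word of every trigram that starts with (first, second)."""
    counter = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z0-9%']+", comment.lower())
        for w1, w2, w3 in trigrams(tokens):
            # Assumption: no part-of-speech filter, so w3 may not be a noun.
            if (w1, w2) == (first, second):
                counter[w3] += 1
    return counter


# Made-up comments, only for illustration.
comments = [
    "The Ordinary squalane is my go-to dry skin oil",
    "Looking for a dry skin moisturizer",
    "Any oily skin serum recommendations?",
]
after_dry = words_after_phrase(comments, "dry", "skin")
after_oily = words_after_phrase(comments, "oily", "skin")
print(after_dry, after_oily)
```

With a tagger in place, the only change would be an extra condition that keeps `w3` only when it is tagged as a noun.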

 

That’s it for now! Hope you’ve enjoyed this post as much as I enjoyed making it 🙂

infographic summary

[Infographic: Skincare]

Posted by: Aisha Pectyo

Astrophysicist turned data rockstar who speaks code and has enough yarn and mod podge to survive a zombie apocalypse.
