MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

40
MusicMood Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics Sebastian Raschka December 10, 2014

Transcript of MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Page 1: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

MusicMoodMachine Learning in Automatic Music Mood Prediction

Based on Song Lyrics

Sebastian Raschka December 10, 2014

Page 2: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Music Mood Prediction

• We like to listen to music [1][2]

• Digital music libraries are growing

• Recommendation system for happy music

(clinics, restaurants ...) & genre selection

[1]  Thomas Schaefer, Peter Sedlmeier, Christine Sta ̈dtler, and David Huron. The psychological functions of music listening. Frontiers in psychology, 4, 2013.

[2]  Daniel Vaestfjaell. Emotion induction through music: A review of the musical mood induction procedure. Musicae Scientiae, 5(1 suppl):173–211, 2002.

Page 3: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Predictive Modeling

Unsupervised learning

Supervised learning

Regression ClassificationClustering

Reinforcement learning

Ranking Hidden Markov models

DBSCAN on a toy dataset Naive Bayes on Iris (after LDA)

Page 4: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Supervised Learning In a Nutshell

Page 5: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Feature Extraction

Feature Selection

Dimensionality Reduction

Normalization

Raw Data Collection

Pre-processing

Sampling

Test Dataset

Training Dataset

Training Learning Algorithms

Post-Processing

Cross Validation

Final Classification/Regression Model

New DataPre-processing

RefinementPrediction

Split

Supervised Learning

Sebastian Raschka 2014

Missing Data

Prediction-error Metrics

Model Selection

Hyperparameter optimization

This work is licensed under a Creative Commons Attribution 4.0 International License.

Supervised Learning - A Quick Overview

Page 6: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

MusicMood - The Plan

Page 7: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

http://labrosa.ee.columbia.edu/millionsong/

The Dataset

Page 8: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Lyrics available? Lyrics in English?

Sampling

200 songs for validation

1000 songs for training

http://lyrics.wikia.com/Lyrics_Wiki Python NLTK

Page 9: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Mood Labels

Downloading mood labels from Last.fm Manual labeling based on lyrics and listening

• Dark topic (killing, war, complaints about politics, ...) • Artist in sorrow (lost love, ...)

sad if ...

Page 10: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics
Page 11: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Word Clouds

happy:

sad:

Page 12: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

A Short Introduction to Naive Bayes Classification

Page 13: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes - Why?• Small sample size, can outperform the more powerful

alternatives  [1]

• "Eager learner" (on-line learning vs. batch learning)

• Fast for classification and re-training

• Success in Spam Filtering [2]

• High accuracy for predicting positive and negative classes in a sentiment analysis of Twitter data [3]

[1]   Pedro Domingos and Michael Pazzani. On the optimality of the simple bayesian classifier under zero-one loss. Machine learning, 29(2-3):103–130, 1997.

[2]   Mehran Sahami, Susan Dumais, David Heckerman, and Eric Horvitz. A bayesian approach to filtering junk e-mail. In Learning for Text Categorization: Papers from the 1998 workshop, volume 62, pages 98–105, 1998.

[3]  Alec Go, Richa Bhayani, and Lei Huang. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, pages 1–12, 2009.

Page 14: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Bayes Classifiers It’s All About Posterior Probabilities

objective function: maximize the posterior probability

Page 15: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

The Prior Probability

Maximum Likelihood Estimate (MLE)

Page 16: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

The Effect of Priors on the Decision Boundary

Page 17: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Class-Conditional Probability

Maximum Likelihood Estimate (MLE)

"chance of observing feature given that it belongs to class " "

Page 18: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Evidence

just a normalization factor, can be omitted in decision rule:

Page 19: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes ModelsGaussian Naive Bayes

for continuous variables

Page 20: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes ModelsMulti-variate Bernoulli Naive Bayes

for binary features

Page 21: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes ModelsMultinomial Naive Bayes

Page 22: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes and Text Classification

Page 23: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

The Bag of Words Model

Feature Vectors

Page 24: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Tokenization and N-grams

“a swimmer likes swimming thus he swims”

Page 25: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Stemming and Lemmatization

Porter Stemming

Lemmatization

Page 26: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Stop Word Removal

Page 27: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Term and Frequency

Page 28: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Term Frequency - Inverse Document Frequency (Tf-idf)

Page 29: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

• TP = true positive (happy predicted as happy) • FP = false positive (sad predicted as happy) • FN = false negative (happy predicted as sad)

Grid Search and 10-fold Cross Validation to Optimize F1

Page 30: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

K-Fold Cross Validation

Page 31: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-Fold Cross Validation After Grid Search

(final model)

Page 32: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC) Multinomial vs Multi-variate Bernoulli Naive Bayes

Page 33: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC) Multinomial Naive Bayes & Hyperparameter Alpha

Page 34: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC) Multinomial Naive Bayes & Vocabulary Size

Page 35: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC) Multinomial Naive Bayes & Document Frequency Cut-off

Page 36: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC) Multinomial Naive Bayes & N-gram Sequence Length

Page 37: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Contingency Tables of the Final Model

training test

Page 38: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Live Demo

http://sebastianraschka.com/Webapps/musicmood.html

Page 39: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Future Plans• Growing a list of mood labels (majority rule).

• Performance comparisons of different machine learning algorithms.

• Genre prediction and selection based on sound.

Page 40: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Thank you!