MUSI-6201 | Computational Music Analysis · 11/9/2015 · MUSI-6201 | Computational Music Analysis...
Transcript of MUSI-6201 | Computational Music Analysis · 11/9/2015 · MUSI-6201 | Computational Music Analysis...
overview intro similarity mood summary
MUSI-6201 — Computational Music AnalysisPart 9.2: Music Similarity and Mood Recognition
alexander lerch
November 11, 2015
overview intro similarity mood summary
temporal analysisoverview
text book
Chapter 8: Musical Genre, Similarity, and Mood (pp. 156–162)
sources: slides (latex) & Matlab
github repository
lecture contentmusic similarityk-means clustering and SOMsmood and emotionlinear regression
overview intro similarity mood summary
temporal analysisoverview
text book
Chapter 8: Musical Genre, Similarity, and Mood (pp. 156–162)
sources: slides (latex) & Matlab
github repository
lecture contentmusic similarityk-means clustering and SOMsmood and emotionlinear regression
overview intro similarity mood summary
temporal analysisoverview
text book
Chapter 8: Musical Genre, Similarity, and Mood (pp. 156–162)
sources: slides (latex) & Matlab
github repository
lecture contentmusic similarityk-means clustering and SOMsmood and emotionlinear regression
overview intro similarity mood summary
temporal analysisoverview
text book
Chapter 8: Musical Genre, Similarity, and Mood (pp. 156–162)
sources: slides (latex) & Matlab
github repository
lecture contentmusic similarityk-means clustering and SOMsmood and emotionlinear regression
overview intro similarity mood summary
temporal analysisoverview
text book
Chapter 8: Musical Genre, Similarity, and Mood (pp. 156–162)
sources: slides (latex) & Matlab
github repository
lecture contentmusic similarityk-means clustering and SOMsmood and emotionlinear regression
overview intro similarity mood summary
similarity and moodintroduction
commonalities with genre classification
similar set of featuresambiguous ’ground truth’unclear value/impact of low level and high level features
differences to genre classification
mood : often not a classification but regressionsimilarity : distance measure instead of categorizing into classes
overview intro similarity mood summary
similarity and moodintroduction
commonalities with genre classification
similar set of featuresambiguous ’ground truth’unclear value/impact of low level and high level features
differences to genre classification
mood : often not a classification but regressionsimilarity : distance measure instead of categorizing into classes
overview intro similarity mood summary
audio similarityintroduction
perception of music similarity
multi-dimensional (melodic, rhythmic, sound quality, . . . )user dependentassociative, may also depend on editorial datamay be context dependent
genres are clusters of musical similarity
⇒ genre classification is a special case of audio similaritymeasures
instead of assigning (genre) labels, the similarity/distancebetween (pairs) of files is measured
overview intro similarity mood summary
audio similarityintroduction
perception of music similarity
multi-dimensional (melodic, rhythmic, sound quality, . . . )user dependentassociative, may also depend on editorial datamay be context dependent
genres are clusters of musical similarity
⇒ genre classification is a special case of audio similaritymeasures
instead of assigning (genre) labels, the similarity/distancebetween (pairs) of files is measured
overview intro similarity mood summary
audio similarityintroduction
perception of music similarity
multi-dimensional (melodic, rhythmic, sound quality, . . . )user dependentassociative, may also depend on editorial datamay be context dependent
genres are clusters of musical similarity
⇒ genre classification is a special case of audio similaritymeasures
instead of assigning (genre) labels, the similarity/distancebetween (pairs) of files is measured
overview intro similarity mood summary
audio similarityK-Means clustering example
simple k-means example
goal: minimize intra-cluster variancedistance: Euclideanprocedure:
1 initialization:randomly select K points in the feature space as initialization.
2 assignment:assign each observation to the cluster with the mean/centroidof the closest cluster.
3 update:compute mean/centroid for each cluster.
4 iteration:go to step 2 until the clusters converge.
overview intro similarity mood summary
audio similarityK-Means clustering example
simple k-means example
goal: minimize intra-cluster variancedistance: Euclideanprocedure:
1 initialization:randomly select K points in the feature space as initialization.
2 assignment:assign each observation to the cluster with the mean/centroidof the closest cluster.
3 update:compute mean/centroid for each cluster.
4 iteration:go to step 2 until the clusters converge.
overview intro similarity mood summary
audio similarityK-Means clustering example
simple k-means example
goal: minimize intra-cluster variancedistance: Euclideanprocedure:
1 initialization:randomly select K points in the feature space as initialization.
2 assignment:assign each observation to the cluster with the mean/centroidof the closest cluster.
3 update:compute mean/centroid for each cluster.
4 iteration:go to step 2 until the clusters converge.
overview intro similarity mood summary
audio similarityK-Means clustering example
simple k-means example
goal: minimize intra-cluster variancedistance: Euclideanprocedure:
1 initialization:randomly select K points in the feature space as initialization.
2 assignment:assign each observation to the cluster with the mean/centroidof the closest cluster.
3 update:compute mean/centroid for each cluster.
4 iteration:go to step 2 until the clusters converge.
overview intro similarity mood summary
audio similarityK-Means clustering example
data points
ma
tla
bso
urc
e:m
atl
ab
/d
isp
layK
Mea
ns.
m
overview intro similarity mood summary
audio similarityK-Means clustering example
init
assignment
ma
tla
bso
urc
e:m
atl
ab
/d
isp
layK
Mea
ns.
m
overview intro similarity mood summary
audio similarityK-Means clustering example
update centroids
assignment
ma
tla
bso
urc
e:m
atl
ab
/d
isp
layK
Mea
ns.
m
overview intro similarity mood summary
audio similarityK-Means clustering example
update centroids
assignment
ma
tla
bso
urc
e:m
atl
ab
/d
isp
layK
Mea
ns.
m
overview intro similarity mood summary
audio similarityK-Means clustering example
update centroids
assignment
ma
tla
bso
urc
e:m
atl
ab
/d
isp
layK
Mea
ns.
m
overview intro similarity mood summary
audio similarityK-Means clustering example
convergence
ma
tla
bso
urc
e:m
atl
ab
/d
isp
layK
Mea
ns.
m
overview intro similarity mood summary
audio similarityvisualization in a 2D space
problemfeature space is high-dimensional
→ cannot be visualized
find mapping to 2D “preserving” (high-dimensional) distancemetrics
overview intro similarity mood summary
audio similarityvisualization in a 2D space
problemfeature space is high-dimensional
→ cannot be visualized
find mapping to 2D “preserving” (high-dimensional) distancemetrics
overview intro similarity mood summary
audio similarityvisualization example: SOM 1/2
1 create a map with ’neurons’2 train
for each training sample find BMU (best matching unit)adapt BMU and neighbors toward training sample
Wv (t + 1) = Wv (t) + θ(v , t)α(t)(D(t)−Wv (t)
)θ(v , t): depends on distance from BMUα(t): learning restraintD(t) training sample
overview intro similarity mood summary
audio similarityvisualization example: SOM 1/2
1 create a map with ’neurons’2 train
for each training sample find BMU (best matching unit)adapt BMU and neighbors toward training sample
Wv (t + 1) = Wv (t) + θ(v , t)α(t)(D(t)−Wv (t)
)θ(v , t): depends on distance from BMUα(t): learning restraintD(t) training sample
https://en.wikipedia.org/wiki/Self-organizing_map
overview intro similarity mood summary
audio similaritySOM 2/2
from11
E. Pampalk, “Islands of Music,” Diploma Thesis, Technische Universitat Wien, 2001.
overview intro similarity mood summary
mood recognitionintroduction
objective:identify mood/emotion of a song
terminology:
Music Mood Recognition and Music Emotion Recognitionusually used simultaneously
What is the difference between mood andemotion
emotion:
temporary, evanescent(directly) related to eternal stimuli
mood :
longer term, stablediffuse affect state
overview intro similarity mood summary
mood recognitionintroduction
objective:identify mood/emotion of a song
terminology:
Music Mood Recognition and Music Emotion Recognitionusually used simultaneously
What is the difference between mood andemotion
emotion:
temporary, evanescent(directly) related to eternal stimuli
mood :
longer term, stablediffuse affect state
overview intro similarity mood summary
mood recognitionchallenges
ground truth dataverbalization of emotions/moods usually misleadingnot easily quantifiable/categorizablechange over time?
research focuscan established basic emotions (happiness, anger, fear, . . . )used for music perceptionaroused vs. transported moods?
overview intro similarity mood summary
mood recognitionchallenges
ground truth dataverbalization of emotions/moods usually misleadingnot easily quantifiable/categorizablechange over time?
research focuscan established basic emotions (happiness, anger, fear, . . . )used for music perceptionaroused vs. transported moods?
overview intro similarity mood summary
mood recognitionchallenges
ground truth dataverbalization of emotions/moods usually misleadingnot easily quantifiable/categorizablechange over time?
research focuscan established basic emotions (happiness, anger, fear, . . . )used for music perceptionaroused vs. transported moods?
overview intro similarity mood summary
mood recognitionmodels
classification into label clusters
Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5Rowdy Amiable/Good Natured Literate Witty Volatile
Rousing Sweet Wistful Humorous FieryConfident Fun Bittersweet Whimsical VisceralBoisterous Rollicking Autumnal Wry AggressivePassionate Cheerful Brooding Campy Tense/Anxious
Poignant Quirky IntenseSilly
mood model, circumplex model most common2
2J. A. Russel, “A Circumplex Model of Affect,” Journal of Personality and Social Psychology, vol. 39, no. 6,
pp. 1161–1178, 1980, issn: 1939-1315(Electronic);0022-3514(Print). doi: 10.1037/h0077714.
overview intro similarity mood summary
mood recognitionmodels
classification into label clustersmood model, circumplex model most common2
2J. A. Russel, “A Circumplex Model of Affect,” Journal of Personality and Social Psychology, vol. 39, no. 6,
pp. 1161–1178, 1980, issn: 1939-1315(Electronic);0022-3514(Print). doi: 10.1037/h0077714.
overview intro similarity mood summary
mood recognitionmood model: regression modeling
mapping(N-dimensional) observation (feature) to 2-dimensionalcoordinate (valence/arousal)
trainingfind model to minimize error between data points and“prediction”
overview intro similarity mood summary
linear regressionintroduction to regression 1/2
fit a linear function to a series of points (xj , yj)
yn = m · xn + b
https://en.wikipedia.org/wiki/Linear_regression
overview intro similarity mood summary
linear regressionintroduction to regression 2/2
minimize error between model and data (here: least squares)
e2n = (yn −mxn − b)2
E =∑
(yn −mxn − b)2
∂E
∂b=
∑−2(yn −mxn − b) = 0
− 2∑
yn + 2∑
mxn + 2∑
b = 0∑mxn +
∑b =
∑yn
m∑
xn +Nb =∑
yn
∂E
∂m=
∑−2xn(yn −mxn − b) = 0
− 2∑
xnyn + 2∑
mx2n + 2
∑bxn = 0∑
mx2n +
∑bxn =
∑xnyn
m∑
x2n + b
∑xn =
∑xnyn
⇒
m =N
∑xnyn −
∑xn
∑yn
N∑
x2n − (
∑xn)2
b =
∑yn
N−m
∑xn
N
overview intro similarity mood summary
linear regressionintroduction to regression 2/2
minimize error between model and data (here: least squares)
e2n = (yn −mxn − b)2
E =∑
(yn −mxn − b)2
∂E
∂b=
∑−2(yn −mxn − b) = 0
− 2∑
yn + 2∑
mxn + 2∑
b = 0∑mxn +
∑b =
∑yn
m∑
xn +Nb =∑
yn
∂E
∂m=
∑−2xn(yn −mxn − b) = 0
− 2∑
xnyn + 2∑
mx2n + 2
∑bxn = 0∑
mx2n +
∑bxn =
∑xnyn
m∑
x2n + b
∑xn =
∑xnyn
⇒
m =N
∑xnyn −
∑xn
∑yn
N∑
x2n − (
∑xn)2
b =
∑yn
N−m
∑xn
N
overview intro similarity mood summary
linear regressionintroduction to regression 2/2
minimize error between model and data (here: least squares)
e2n = (yn −mxn − b)2
E =∑
(yn −mxn − b)2
∂E
∂b=
∑−2(yn −mxn − b) = 0
− 2∑
yn + 2∑
mxn + 2∑
b = 0
∑mxn +
∑b =
∑yn
m∑
xn +Nb =∑
yn
∂E
∂m=
∑−2xn(yn −mxn − b) = 0
− 2∑
xnyn + 2∑
mx2n + 2
∑bxn = 0
∑mx2
n +∑
bxn =∑
xnyn
m∑
x2n + b
∑xn =
∑xnyn
⇒
m =N
∑xnyn −
∑xn
∑yn
N∑
x2n − (
∑xn)2
b =
∑yn
N−m
∑xn
N
overview intro similarity mood summary
linear regressionintroduction to regression 2/2
minimize error between model and data (here: least squares)
e2n = (yn −mxn − b)2
E =∑
(yn −mxn − b)2
∂E
∂b=
∑−2(yn −mxn − b) = 0
− 2∑
yn + 2∑
mxn + 2∑
b = 0∑mxn +
∑b =
∑yn
m∑
xn +Nb =∑
yn
∂E
∂m=
∑−2xn(yn −mxn − b) = 0
− 2∑
xnyn + 2∑
mx2n + 2
∑bxn = 0∑
mx2n +
∑bxn =
∑xnyn
m∑
x2n + b
∑xn =
∑xnyn
⇒
m =N
∑xnyn −
∑xn
∑yn
N∑
x2n − (
∑xn)2
b =
∑yn
N−m
∑xn
N
overview intro similarity mood summary
linear regressionintroduction to regression 2/2
minimize error between model and data (here: least squares)
e2n = (yn −mxn − b)2
E =∑
(yn −mxn − b)2
∂E
∂b=
∑−2(yn −mxn − b) = 0
− 2∑
yn + 2∑
mxn + 2∑
b = 0∑mxn +
∑b =
∑yn
m∑
xn +Nb =∑
yn
∂E
∂m=
∑−2xn(yn −mxn − b) = 0
− 2∑
xnyn + 2∑
mx2n + 2
∑bxn = 0∑
mx2n +
∑bxn =
∑xnyn
m∑
x2n + b
∑xn =
∑xnyn
⇒
m =N
∑xnyn −
∑xn
∑yn
N∑
x2n − (
∑xn)2
b =
∑yn
N−m
∑xn
N
overview intro similarity mood summary
linear regressionintroduction to regression 2/2
minimize error between model and data (here: least squares)
e2n = (yn −mxn − b)2
E =∑
(yn −mxn − b)2
∂E
∂b=
∑−2(yn −mxn − b) = 0
− 2∑
yn + 2∑
mxn + 2∑
b = 0∑mxn +
∑b =
∑yn
m∑
xn +Nb =∑
yn
∂E
∂m=
∑−2xn(yn −mxn − b) = 0
− 2∑
xnyn + 2∑
mx2n + 2
∑bxn = 0∑
mx2n +
∑bxn =
∑xnyn
m∑
x2n + b
∑xn =
∑xnyn
⇒
m =N
∑xnyn −
∑xn
∑yn
N∑
x2n − (
∑xn)2
b =
∑yn
N−m
∑xn
N
overview intro similarity mood summary
mood recognitionrange of results
5 mood clusters:40–60% classification rate
mood model:0.1–0.4 absolute prediction error (unit circle)
overview intro similarity mood summary
summarylecture content
1 what are typical use cases for music similarity and moodrecognition
2 discuss advantages and disadvantages of using classes vs thecircumplex model for mood recognition
overview intro similarity mood summary
summarylecture content
1 what are typical use cases for music similarity and moodrecognition
2 discuss advantages and disadvantages of using classes vs thecircumplex model for mood recognition