LAM: Musical Audio Similarity
![Page 1: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/1.jpg)
LAM: Musical Audio Similarity
Michael Casey
Centre for Cognition, Computation and Culture
Department of Computing
Goldsmiths College, University of London
![Page 2: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/2.jpg)
Overview
• Machine Music Understanding
• Features / Classes / Clusters
• Real-Time Audio Matching
  – Feature Extraction
  – Feature Similarity (Indexing / Retrieval)
  – PD/MSP Tools
• Music Similarity Applications
  – Sound object matching
  – Texture matching
![Page 3: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/3.jpg)
Sound Understanding
Signal Processing Sound Understanding
![Page 4: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/4.jpg)
Feature Extraction
[Figure: an audio source split into overlapping frames (frame 1, frame 2, frame 3); time axis marked at 10 ms, 20 ms, 30 ms, 40 ms]
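The framing shown in the figure can be sketched as follows. This is a minimal illustration, assuming 20 ms frames advanced by a 10 ms hop (one reading of the tick marks on the time axis). The function name `frame_signal` and its parameters are hypothetical and do not come from the slides.

```python
from typing import List, Sequence


def frame_signal(samples: Sequence[float], sample_rate: int,
                 frame_ms: float = 20.0, hop_ms: float = 10.0) -> List[List[float]]:
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must each be at least one sample long")
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(list(samples[start:start + frame_len]))
        start += hop_len
    return frames


# 40 ms of a dummy signal sampled at 1 kHz (one sample per millisecond).
signal = [float(n) for n in range(40)]
frames = frame_signal(signal, sample_rate=1000)
print(len(frames))  # 3 frames, starting at 0 ms, 10 ms and 20 ms
```

With these assumed settings each frame shares half of its samples with its neighbour, which corresponds to the overlap region marked in the figure.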
![Page 10: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/10.jpg)
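Statistical Learning for Decision Making
$$P(\text{class} \mid \text{features}) = \frac{p(\text{features} \mid \text{class})\, P(\text{class})}{p(\text{features})}$$
Partitioning of feature space
Decision boundary
Music / Speech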
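A minimal sketch of a Bayes decision between the two classes on this slide. The slide does not specify the feature or the class-conditional densities, so the one-dimensional feature, the Gaussian likelihoods and every number in `MODELS` and `PRIORS` are made-up assumptions for illustration only.

```python
import math
from typing import Dict, Tuple


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a 1-D Gaussian, used here as the class-conditional p(x | class)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x: float, models: Dict[str, Tuple[float, float]],
               priors: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(class | x) = p(x | class) * P(class) / p(x)."""
    joint = {c: gaussian_pdf(x, *models[c]) * priors[c] for c in models}
    evidence = sum(joint.values())  # p(x), the normalising term
    return {c: joint[c] / evidence for c in joint}


def classify(x: float, models: Dict[str, Tuple[float, float]],
             priors: Dict[str, float]) -> str:
    """Pick the class with the largest posterior probability."""
    post = posteriors(x, models, priors)
    return max(post, key=post.get)


# Hypothetical (mean, standard deviation) per class; not taken from the slides.
MODELS = {"music": (0.0, 1.0), "speech": (4.0, 1.0)}
PRIORS = {"music": 0.5, "speech": 0.5}

print(classify(1.0, MODELS, PRIORS))  # music
print(classify(3.0, MODELS, PRIORS))  # speech
```

With equal priors and equal variances, the decision boundary of this toy model sits midway between the two means, at x = 2, which partitions the feature axis into a music region and a speech region.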
![Page 11: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/11.jpg)
MPEG-7 Audio Tools
Audio
![Page 12: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/12.jpg)
MPEG-7 Audio Tools
Audio
→ Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
![Page 13: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/13.jpg)
MPEG-7 Audio Tools
Audio
→ Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
→ Log Amplitude
→ Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD)
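The first two stages of this chain (log-frequency bands, then log amplitude) can be sketched for a single frame as below. This is an illustration under stated assumptions, not the MPEG-7 reference extraction: the band edges, the decibel scaling, and the names `power_spectrum` and `log_frequency_envelope` are all choices made for this example. The decorrelating transform and dimension reduction stage is not shown.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Naive DFT power for bins 0..N/2 (adequate for a short illustrative frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_frequency_envelope(frame: Sequence[float], sample_rate: float,
                           low_hz: float = 62.5,
                           bands_per_octave: int = 1) -> List[float]:
    """Sum power in logarithmically spaced bands and convert each band to decibels."""
    power = power_spectrum(frame)
    n = len(frame)
    nyquist = sample_rate / 2.0
    ratio = 2.0 ** (1.0 / bands_per_octave)
    envelope = []
    lo = low_hz
    while lo < nyquist:
        hi = min(lo * ratio, nyquist)
        band = sum(p for k, p in enumerate(power) if lo <= k * sample_rate / n < hi)
        envelope.append(10.0 * math.log10(band + 1e-12))  # small floor avoids log(0)
        lo *= ratio
    return envelope


# A 1 kHz tone sampled at 8 kHz, 64 samples long (exactly 8 cycles).
tone = [math.sin(2.0 * math.pi * 1000.0 * t / 8000.0) for t in range(64)]
envelope = log_frequency_envelope(tone, sample_rate=8000.0)
print(envelope.index(max(envelope)))  # index of the band that contains 1 kHz
```

With one band per octave starting at the assumed 62.5 Hz edge, the bands are 62.5 to 125 Hz, 125 to 250 Hz, and so on up to 2 to 4 kHz, so the test tone falls in the fifth band.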
![Page 14: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/14.jpg)
SoundModelStatePathD
State Path
Use estimated state sequence as a feature
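The slides do not say how the state sequence is estimated. A common choice for hidden Markov models is Viterbi decoding, sketched below as an assumption. The function name `viterbi` and the toy two-state model are hypothetical.

```python
import math
from typing import List, Sequence


def viterbi(log_start: Sequence[float],
            log_trans: Sequence[Sequence[float]],
            log_emit: Sequence[Sequence[float]]) -> List[int]:
    """Most likely state path; log_emit[t][s] is the log-likelihood of frame t in state s."""
    if not log_emit:
        return []
    n_states = len(log_start)
    score = [log_start[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t - 1][s] = best predecessor of state s at time t
    for t in range(1, len(log_emit)):
        prev = score
        pointers = []
        score = []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy model: two "sticky" states; the first three frames favour state 0, the rest state 1.
log = math.log
start = [log(0.5), log(0.5)]
trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
emit = [[log(0.8), log(0.2)]] * 3 + [[log(0.2), log(0.8)]] * 3
state_path = viterbi(start, trans, emit)
print(state_path)  # [0, 0, 0, 1, 1, 1]
```

The returned list of state indices, one per frame, is the kind of sequence that could then be stored or compared as a feature.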
![Page 15: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/15.jpg)
MPEG-7 Audio Tools
Audio
→ Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
→ Log Amplitude
→ Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD)
→ Hidden Markov Model (SoundModelDS)
![Page 16: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/16.jpg)
MPEG-7 Audio Strings
Acoustic Lexicons
Audio
→ Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
→ Log Amplitude
→ Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD)
→ Hidden Markov Model (SoundModelDS)
→ State Path (SoundModelStatePathD)
→ Symbol string: ? 7 1 V 7 1 0 1 ...
![Page 17: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/17.jpg)
![Page 18: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/18.jpg)
State Symbol Sequence (40 State Model)
? 7 1 V 7 1 0 1 ...
![Page 22: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/22.jpg)
SoundModelStateHistogramD
[Figure: two panels plotting state index against time in seconds, with 0.01 s frames]
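A state histogram summarises how often each state occurs in a stretch of the state path. The sketch below assumes relative frequencies (counts divided by the number of frames). The function names and the sliding-window variant are illustrative and are not taken from the slides.

```python
from collections import Counter
from typing import List, Sequence


def state_histogram(state_path: Sequence[int], n_states: int) -> List[float]:
    """Relative frequency of each state index in the given state path."""
    total = len(state_path)
    if total == 0:
        return [0.0] * n_states
    counts = Counter(state_path)
    return [counts.get(s, 0) / total for s in range(n_states)]


def windowed_histograms(state_path: Sequence[int], n_states: int,
                        window: int, hop: int) -> List[List[float]]:
    """One histogram per sliding window over the state path."""
    return [state_histogram(state_path[i:i + window], n_states)
            for i in range(0, len(state_path) - window + 1, hop)]


print(state_histogram([0, 0, 1, 3], n_states=4))  # [0.5, 0.25, 0.0, 0.25]
```

Because a histogram discards the order of states within its window, two segments with the same state content but different timing map to the same vector.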
![Page 23: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/23.jpg)
Self-Similarity Matrix
![Page 25: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/25.jpg)
Self-Similarity Matrix
$$\mathrm{dist}(a, b) = \cos^{-1}\!\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$
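A minimal sketch of a self-similarity matrix, assuming the formula on this slide is the angle between two feature vectors, that is, the inverse cosine of their normalised dot product. The function names are hypothetical, and the feature vectors are toy values.

```python
import math
from typing import List, Sequence


def angular_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle in radians between two vectors: acos(a.b / (|a| |b|))."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("angular distance is undefined for a zero vector")
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp rounding error
    return math.acos(cosine)


def self_similarity_matrix(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Entry (i, j) compares frame i of a recording with frame j of the same recording."""
    return [[angular_distance(a, b) for b in features] for a in features]


# Three toy feature vectors: the first and third are identical.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
S = self_similarity_matrix(features)
for row in S:
    print([round(v, 3) for v in row])
```

Repeated material shows up as near-zero entries away from the main diagonal, here at positions (0, 2) and (2, 0).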
![Page 30: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/30.jpg)
S-Matrix
![Page 31: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/31.jpg)
Efficient Storage / Retrieval
• Real-Time Access
• Large Databases
• Distributed Databases
![Page 32: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/32.jpg)
PostgreSQL Database Representation of State Path “Strings” and Histograms
![Page 33: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/33.jpg)
Similarity
• Compute distance between feature pairs
• Features == SoundModelStateHistogramD
• Similarity Metric
  – dist(a,b) >= 0
  – dist(a,b) == 0 iff a == b
  – dist(a,b) + dist(b,c) >= dist(a,c)
• Vector Dot Product
$$\mathrm{dist}(a, b) = \cos^{-1}\!\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$
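The three properties listed above can be checked numerically on a small set of histograms. The sketch below is an illustration with hypothetical names: it compares the angle between vectors with the simpler "one minus cosine" dissimilarity, which fails the triangle inequality on this example. A metric is normally also required to be symmetric, which the slide does not list, so the check covers only the three listed properties.

```python
import itertools
import math
from typing import Callable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return max(-1.0, min(1.0, dot / norm))


def angular_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.acos(cosine_similarity(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return 1.0 - cosine_similarity(a, b)


def check_metric_axioms(dist: Callable, points: Sequence, tol: float = 1e-6) -> bool:
    """Test non-negativity, identity and the triangle inequality on the given points."""
    for a, b in itertools.product(points, repeat=2):
        d = dist(a, b)
        if d < -tol:
            return False
        if (d <= tol) != (a == b):  # dist == 0 iff a == b
            return False
    for a, b, c in itertools.product(points, repeat=3):
        if dist(a, b) + dist(b, c) < dist(a, c) - tol:
            return False
    return True


# Toy normalised histograms (each sums to 1); values are made up.
HISTOGRAMS = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
print(check_metric_axioms(angular_distance, HISTOGRAMS))  # True
print(check_metric_axioms(cosine_distance, HISTOGRAMS))   # False
```

The identity property holds for the angle only because the histograms are normalised: two unnormalised vectors that differ by a scale factor would be at angle zero without being equal.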
![Page 34: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/34.jpg)
Similarity of Feature Trajectories
![Page 35: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/35.jpg)
Dynamic Time Warping
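Dynamic time warping aligns two feature trajectories that may differ in timing and returns the cost of the best alignment. The sketch below is the textbook dynamic-programming form with unconstrained steps. The local cost (absolute difference of scalars) and the function name are assumptions for illustration.

```python
import math
from typing import Callable, Sequence


def dtw_distance(x: Sequence, y: Sequence,
                 cost: Callable = lambda a, b: abs(a - b)) -> float:
    """Accumulated cost of the cheapest monotonic alignment between x and y."""
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(x[i - 1], y[j - 1])
            # Extend the cheapest of: advance x only, advance y only, advance both.
            acc[i][j] = step + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0: same shape, different timing
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

For vector-valued trajectories, a different `cost` function (for example a distance between feature vectors) can be passed in without changing the recursion.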
![Page 36: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/36.jpg)
Acousticon Strings
• Distance Metric
  – String Edit Distance (Levenshtein)
• Scalable to Large Databases
  – PostgreSQL Implementation
  – Can use built-in Index Structures
• Scalable to Real-Time Implementation
  – matching and audio streaming (< 20 ms)
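The string edit distance named above can be sketched with the standard two-row dynamic programme. This is a generic Levenshtein implementation for illustration, not the PostgreSQL implementation the slide refers to. The example strings reuse the symbol sequence shown earlier, with one symbol changed by hand.

```python
from typing import Sequence


def edit_distance(s: Sequence, t: Sequence) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions turning s into t."""
    previous = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        current = [i]
        for j, b in enumerate(t, start=1):
            current.append(min(previous[j] + 1,              # delete a
                               current[j - 1] + 1,           # insert b
                               previous[j - 1] + (a != b)))  # substitute or match
        previous = current
    return previous[-1]


print(edit_distance("?71V7101", "?71V7101"))  # 0
print(edit_distance("?71V7101", "?71V7001"))  # 1
print(edit_distance("kitten", "sitting"))     # 3
```

The function accepts any sequences, so a state path stored as a list of integers can be compared in the same way as a character string.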
![Page 37: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/37.jpg)
Information Retrieval for Creativity
• Use an existing sound database as a source of new material
• Take the structure of a music clip but replace the content.
• New interfaces for music creativity.
![Page 38: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/38.jpg)
Audio Information Retrieval
MPEG-7 Database
A pre-indexed Collection of Sounds
![Page 39: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/39.jpg)
Audio Information Retrieval
Audio Query → Extract → Segment → Match → Result List
MPEG-7 Database
A sound, a scene, or a list of sounds
![Page 40: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/40.jpg)
Audio Information Retrieval
Audio Query → Extract → Segment → Match → Result List
MPEG-7 Database
Extract: feature extraction from audio.
![Page 41: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/41.jpg)
Audio Information Retrieval
Audio Query → Extract → Segment → Match → Result List
MPEG-7 Database
Segment: partitioning of audio into chunks.
![Page 42: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/42.jpg)
Audio Information Retrieval
Audio Query → Extract → Segment → Match → Result List
MPEG-7 Database
Match: find similar chunks of audio.
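The segment and match stages of this pipeline can be sketched on toy data as below. Everything here is an assumption for illustration: the query is taken to be an already-extracted list of state indices, chunks have a fixed length, each chunk is summarised by a state histogram, and the pre-indexed database is a plain dictionary of made-up entries compared by Euclidean distance.

```python
import math
from typing import Dict, List, Sequence


def histogram(states: Sequence[int], n_states: int) -> List[float]:
    """Relative frequency of each state in a chunk."""
    return [sum(1 for s in states if s == k) / len(states) for k in range(n_states)]


def segment(states: Sequence[int], chunk_len: int) -> List[Sequence[int]]:
    """Partition the query into consecutive fixed-length chunks (remainder dropped)."""
    return [states[i:i + chunk_len]
            for i in range(0, len(states) - chunk_len + 1, chunk_len)]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_states: Sequence[int], database: Dict[str, List[float]],
             n_states: int, chunk_len: int, top_k: int = 1) -> List[List[str]]:
    """For each query chunk, return the names of the top_k closest database entries."""
    results = []
    for chunk in segment(query_states, chunk_len):
        h = histogram(chunk, n_states)
        ranked = sorted(database, key=lambda name: euclidean(h, database[name]))
        results.append(ranked[:top_k])
    return results


# Hypothetical pre-indexed collection: name -> state histogram.
DATABASE = {"drum": [1.0, 0.0, 0.0], "voice": [0.0, 1.0, 0.0], "pad": [0.0, 0.0, 1.0]}
query = [0, 0, 0, 0, 1, 1, 1, 1]
print(retrieve(query, DATABASE, n_states=3, chunk_len=4))  # [['drum'], ['voice']]
```

The result list holds one ranked list per query chunk, so a longer query yields a sequence of matched sounds rather than a single match.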
![Page 43: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/43.jpg)
Real-Time Matching
![Page 44: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/44.jpg)
Musaics
Real-Time Matching
![Page 54: LAM: Musical Audio Similarity](https://reader035.fdocuments.net/reader035/viewer/2022070500/56816851550346895dde5798/html5/thumbnails/54.jpg)