Motif Detection From Audio In
Hindustani Classical Music:
Methods And Evaluation Strategy
Joe Cheri Ross and Preeti Rao
IIT Bombay
Motifs in Hindustani Music
Melodic motifs or signature phrases are essential building
blocks in Indian Classical music.
Apart from the swaras that define the raga, it is the
characteristic phrases that give it a unique identity [1].
Objective of the present work
Identify all occurrences of melodically similar phrases
in the song given a specific instance of the phrase
Audio example: ‘Jag Mein’ Bandish (Composition) Rendered by Pt. Ajoy Chakrabarty
Melodic contour extracted by PolyphonicPDA [3]
An Approach to Motif Detection
Segmentation: find the boundaries (in time) of
candidate phrases. What are the acoustic
cues?
Similarity matching: compute a “melodic
distance” between the given phrase and
candidate phrases. What is a good melodic
distance measure?
A Prominent Motif: Mukhda phrase
Mukhda is the recurring title phrase of a ‘Bandish’
(Composition)
Why did we restrict ourselves to Mukhda phrases?
• The ease of marking ground truth based on lyrical
similarity
• The availability of cues to phrase location from the
rhythmic structure
Mukhda Phrases as seen on the pitch contour
Song: Piya Jag
Swaras: D P G P
Segmentation:
Characteristic of a Mukhda motif
The Mukhda phrase has a specific location in the rhythmic cycle: around the sam
Ex: Phrase 'Guru Bina'
Starts 5 beats before the sam (t1)
Ends at the sam (t2)
This is the cue for identifying the candidate phrases
Candidate phrase length depends on the tempo at that instant
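The sam-based cue can be illustrated with a minimal sketch. It assumes the sam instants and the local beat duration at each sam are already available from a rhythm analysis stage; the function names, the 5-beat default (taken from the 'Guru Bina' example) and the 10 ms hop are illustrative choices, not the authors' implementation.

```python
def candidate_windows(sam_times, beat_durations, beats_before_sam=5):
    """Return (t1, t2) windows in seconds, one per sam.

    sam_times: detected sam instants in seconds (assumed input).
    beat_durations: local beat duration in seconds at each sam (assumed input),
    so the window length follows the tempo at that instant.
    """
    windows = []
    for sam, beat in zip(sam_times, beat_durations):
        t1 = max(0.0, sam - beats_before_sam * beat)  # phrase starts before the sam
        t2 = sam                                      # phrase ends at the sam
        windows.append((t1, t2))
    return windows


def slice_contour(pitch, window, hop=0.01):
    """Cut a candidate phrase out of a pitch contour sampled every `hop` seconds."""
    t1, t2 = window
    return pitch[round(t1 / hop):round(t2 / hop)]


windows = candidate_windows([10.0, 20.0], [0.5, 0.25])
phrase = slice_contour(list(range(3000)), windows[0])
```

A slower tempo (longer beat duration) yields a longer candidate window, which is why the two windows above differ in length.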
Mukhda Phrases on the Pitch Contour
Song: Guru Bina
Swaras: S S N R
Performance of Guru Bina by Pt. Ajoy Chakrabarty
Example
Identification of ‘Guru Bina’ phrase
Positive phrases
Negative phrase
Detects phrases melodically similar to the ‘Guru Bina’ pitch contour
Emphatic beat (sam)
Swaras: S S N R
Example : ‘Piya Jag’ Phrases
Positive phrases
Negative phrase
Similarity Measures for time series
Symbolic Aggregate approXimation (SAX) [7]
Pitch sequence of each phrase is reduced to a uniform length (w)
Euclidean distance between the reduced phrases is computed
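A minimal sketch of the length reduction and distance computation follows. It implements only piecewise aggregate approximation (the frame-averaging step that SAX builds on) followed by a Euclidean distance; the normalization and symbol discretization of full SAX are left out, and the function names and the default w are assumptions made for illustration.

```python
import math


def paa(series, w):
    """Reduce a series to w values by averaging w roughly equal-width frames."""
    n = len(series)
    if not 1 <= w <= n:
        raise ValueError("w must be between 1 and len(series)")
    reduced = []
    for i in range(w):
        frame = series[i * n // w:(i + 1) * n // w]
        reduced.append(sum(frame) / len(frame))
    return reduced


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduced_distance(phrase_a, phrase_b, w=8):
    """Distance between two phrases after both are reduced to length w."""
    return euclidean(paa(phrase_a, w), paa(phrase_b, w))


# The same two-note shape sung at two tempi maps to the same reduced form.
slow = [0, 0, 0, 0, 2, 2, 2, 2]
fast = [0, 0, 2, 2]
d = reduced_distance(slow, fast, w=2)
```

Because both phrases are brought to the same length before comparison, a uniform change of tempo does not by itself increase the distance.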
Dynamic Time Warping (DTW) [6]
Finds similarity between sequences that vary in time or
speed
A Sakoe-Chiba constraint is applied to avoid pathological
warping
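As an illustration, DTW with a Sakoe-Chiba band can be sketched as below. The absolute pitch difference as local cost, the `band` parameter name and the widening of the band to cover the length difference are assumptions of this sketch, not details given in the slides.

```python
import math


def dtw_distance(a, b, band=None):
    """DTW cost between sequences a and b.

    band: Sakoe-Chiba half-width in frames; cells with |i - j| > band are
    excluded so the warping path cannot stray far from the diagonal.
    None means no constraint.
    """
    n, m = len(a), len(b)
    if band is None:
        band = max(n, m)
    band = max(band, abs(n - m))  # the band must at least reach the end cell
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]


stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3], band=1)
```

A narrow band keeps one long held note from being matched against an entire phrase, at the price of tolerating less tempo variation.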
Experiment: to evaluate the performance of similarity measures
• The location of positive phrases is manually annotated in the song.
• The pitch sequence of the song (one pitch value every 10 ms)
1. Extract candidate phrases (same rhythmic structure) from
the song (pitch contour) by automatic detection of the sam
(or similar bols)
2. With the help of the annotated ground truth, find the positive
phrases among the generated candidates
3. Compare each positive candidate phrase with all the
phrases using the similarity measures
Experiments were done with quantized and unquantized pitch
Dataset
Expt  Bandish        Singer             #Phrases (POS)  #Phrases (NEG)
A     Guru Bina      Pt. Bhimsen Joshi  156             715
B     Guru Bina      Ajoy Chakrabarty   1056            9735
C     Jana na na na  Pt. Bhimsen Joshi  272             1649
D     Piya Jaag      Kishori Amonkar    1892            7744
E     Guru Bina      BJ vs AC           429             3835
'Piya Jaag' Distance Distribution
ROC of DTW and SAX
Song: ‘Piya Jaag’
(This work has been reported in Proc. ISMIR 2012 [9])
Hit rate: 87%
False alarm rate: 3.2%
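For reference, hit rate and false alarm rate at a given distance threshold, and the ROC obtained by sweeping that threshold, can be computed as in the sketch below. The distance values are made-up toy numbers used only to exercise the functions; they are not results from the experiments.

```python
def hit_and_false_alarm(pos_distances, neg_distances, threshold):
    """A pair counts as detected when its distance is at or below the threshold.

    pos_distances: distances between pairs of positive phrases.
    neg_distances: distances between a positive and a negative phrase.
    Returns (hit_rate, false_alarm_rate).
    """
    hits = sum(d <= threshold for d in pos_distances)
    false_alarms = sum(d <= threshold for d in neg_distances)
    return hits / len(pos_distances), false_alarms / len(neg_distances)


def roc_points(pos_distances, neg_distances):
    """(false_alarm_rate, hit_rate) for every distinct distance used as threshold."""
    points = []
    for threshold in sorted(set(pos_distances) | set(neg_distances)):
        hit, fa = hit_and_false_alarm(pos_distances, neg_distances, threshold)
        points.append((fa, hit))
    return points


# Toy distances, for illustration only.
pos = [1.0, 2.0, 3.0, 9.0]
neg = [4.0, 8.0, 10.0, 12.0]
hit_rate, false_alarm_rate = hit_and_false_alarm(pos, neg, threshold=5.0)
curve = roc_points(pos, neg)
```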
Why is it Challenging?
Melodically similar motifs may not occur at the same
location in the rhythmic cycle.
This makes it difficult to identify the right candidate phrases to
compare against
It also increases the number of candidate phrases, and thus the
complexity
Extension to other phrases
Mukhda phrase: ‘Jag Mein Kachu’
Emphatic beat sam
Location of Mukhda phrases is consistent w.r.t. the location of the
emphatic beat (sam) in the rhythmic cycle
Swaras: G-R-SNRS-N-D-N-S N-NDS
Non-Mukhda phrase N-D-S
• N-D-S is one of the prominent phrases in this bandish
• Locations of the phrases are not consistent in the rhythmic cycle
• The range of variation due to improvisation is high compared to Mukhda phrases.
Vistar (variations) of the phrase N-D-S
• All these phrases are to be identified as similar motifs
• The phrase ends in the nyas swar (long note) S.
Long note S
Approaches
1. Identify motifs based on repeating patterns
2. Identify motifs based on potential segment
boundary cues and cluster
Approach 1:
A symbolic sequence is derived from the pitch contour
The Crochemore algorithm [4,5] extracts repeating patterns
from the input symbolic sequence.
Complexity of the algorithm: O(n log n), where n is the
length of the sequence
Repeating patterns are found in the symbolic sequence, and
similar patterns are grouped together.
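One plausible way to derive a symbolic sequence from a pitch contour is sketched below. The slides do not say how the quantization was done, so everything here is an assumption: pitch in Hz is rounded to the nearest semitone above a known tonic, the octave is discarded, lower-case letters stand for the komal (flat) swaras and M for tivra Ma, and a note is emitted only if it is held for a minimum number of frames.

```python
import math

# Twelve semitone positions relative to the tonic (assumed naming convention).
SWARAS = ["S", "r", "R", "g", "G", "m", "M", "P", "d", "D", "n", "N"]


def hz_to_swara(freq_hz, tonic_hz):
    """Nearest swara for a pitch value; the octave is discarded."""
    semitones = round(12 * math.log2(freq_hz / tonic_hz))
    return SWARAS[semitones % 12]


def symbolic_sequence(pitch_hz, tonic_hz, min_frames=5):
    """One symbol per run of at least min_frames frames on the same swara.

    Frames with pitch <= 0 are treated as unvoiced and break the current run.
    """
    symbols = []
    run_symbol, run_length = None, 0
    for f in pitch_hz:
        symbol = hz_to_swara(f, tonic_hz) if f > 0 else None
        if symbol == run_symbol:
            run_length += 1
        else:
            run_symbol, run_length = symbol, 1
        if symbol is not None and run_length == min_frames:
            symbols.append(symbol)  # emitted once, when the run becomes long enough
    return symbols


contour = [220] * 6 + [0] * 2 + [247] * 6 + [277] * 3 + [330] * 6
sequence = symbolic_sequence(contour, tonic_hz=220)
```

The three-frame excursion to 277 Hz is too short to count as a note here, so it is dropped as a transition.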
Approach 1:
Crochemore Algorithm
Crochemore algorithm extracts repeating patterns from
symbolic sequence.
Example:
Position:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
Symbol:    S  R  G  S  R  G  P  G  S  R  S  R  G  P  G  P  G  S
Length 1: {1,4,9,11,18}S {2,5,10,12}R {3,6,8,13,15,17}G {7,14,16}P
Length 2: {1,4,9,11}SR {2,5,12}RG {10}RS {3,8,17}GS {6,13,15}GP
Length 3: {1,4,11}SRG {9}SRS {3,8}GSR {6,13,15}GPG
Length 4: {1}SRGS {4,11}SRGP {3}GSRG {8}GSRS {6,15}GPGS {13}GPGP
Length 5: {4,11}SRGPG {6}GPGSR
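The position sets in this example can be reproduced with a naive enumeration, sketched below. This is not Crochemore's O(n log n) algorithm; it simply lists every substring of each length, which is enough to check a small example. The function names are illustrative.

```python
from collections import defaultdict


def pattern_positions(sequence, length):
    """Map each pattern of the given length to its 1-based start positions."""
    positions = defaultdict(list)
    for start in range(len(sequence) - length + 1):
        positions[tuple(sequence[start:start + length])].append(start + 1)
    return dict(positions)


def repeating_patterns(sequence, min_length=2):
    """All patterns of at least min_length symbols that occur more than once."""
    repeats = {}
    length = min_length
    while True:
        level = {pattern: pos
                 for pattern, pos in pattern_positions(sequence, length).items()
                 if len(pos) > 1}
        if not level:  # no repeats at this length implies none at longer lengths
            break
        repeats.update(level)
        length += 1
    return repeats


swaras = "S R G S R G P G S R S R G P G P G S".split()
repeats = repeating_patterns(swaras)
longest = max(repeats, key=len)
```

Here `longest` is the five-symbol pattern S R G P G, which starts at positions 4 and 11.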
Approach 1:
Experiment Method
• Annotation of the location of motifs and the cluster each belongs to.
• Symbolic sequence from the pitch contour
1. The Crochemore algorithm extracts motifs at different levels
from the symbolic sequence
2. Remove short motifs
3. With the help of the annotated ground truth, find the purity
and Rand index of the clustering
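Purity and Rand index can be computed from two label lists, one predicted and one annotated, as in this short sketch. The label values are toy data chosen only to exercise the functions.

```python
from collections import Counter
from itertools import combinations


def purity(predicted, truth):
    """Fraction of items that carry the majority true label of their cluster."""
    clusters = {}
    for p, t in zip(predicted, truth):
        clusters.setdefault(p, []).append(t)
    majority = sum(Counter(members).most_common(1)[0][1]
                   for members in clusters.values())
    return majority / len(truth)


def rand_index(predicted, truth):
    """Fraction of item pairs on which the clustering and the annotation agree.

    A pair agrees when it is grouped together in both, or separated in both.
    """
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((predicted[i] == predicted[j]) == (truth[i] == truth[j])
                for i, j in pairs)
    return agree / len(pairs)


# Toy labels: six motifs, two predicted clusters, two annotated classes.
predicted = [0, 0, 0, 1, 1, 1]
truth = ["a", "a", "b", "b", "b", "b"]
p = purity(predicted, truth)
r = rand_index(predicted, truth)
```

Purity alone rewards over-segmentation (one cluster per motif scores 1.0), which is one reason to report the Rand index alongside it.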
Approach 2:
Find motif boundaries with segmentation cues and cluster
similar motifs
Cues to Segmentation:
1. Pauses (silence) occur at major boundaries (lyrical
phrase boundaries)
2. Nyasa (long notes) occur at most of the boundaries
3. Recurring patterns
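The pause cue can be sketched as follows, assuming unvoiced or silent frames are marked by a pitch value of 0 in the contour. The 30-frame minimum pause (300 ms at a 10 ms hop) is an arbitrary illustrative value, not one taken from the slides.

```python
def split_at_pauses(pitch_hz, min_pause_frames=30):
    """Return (start, end) frame indices of segments separated by long pauses.

    end is exclusive. Silences shorter than min_pause_frames stay inside
    the surrounding segment.
    """
    segments = []
    start = None   # first voiced frame of the current segment
    silence = 0    # length of the current run of silent frames
    for i, f in enumerate(pitch_hz):
        if f > 0:
            if start is None:
                start = i
            silence = 0
        else:
            silence += 1
            if start is not None and silence == min_pause_frames:
                segments.append((start, i - silence + 1))
                start = None
    if start is not None:
        segments.append((start, len(pitch_hz) - silence))
    return segments


contour = [200] * 10 + [0] * 3 + [210] * 5 + [0] * 40 + [220] * 8
segments = split_at_pauses(contour)
```

The 3-frame gap is too short to count as a boundary, so only the 40-frame pause splits the contour.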
Approach 2:
Experiment Method
1. Extract candidate phrases by segmentation from the
song (pitch contour)
2. Find similar motifs using similarity measures and
cluster them (agglomerative clustering)
3. With the help of the annotated ground truth, find the purity
and Rand index of the clustering
• Annotation of the location of motifs and the cluster each belongs to.
• The pitch sequence of the song (one pitch value every 10 ms)
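A minimal agglomerative clustering over a precomputed distance matrix is sketched below. The slides name agglomerative clustering but not the linkage or the stopping rule, so the average linkage and the distance threshold used here are assumptions.

```python
def agglomerative(distance, threshold):
    """Average-linkage agglomerative clustering.

    distance: symmetric matrix (list of lists) of pairwise phrase distances.
    Merging stops once the two closest clusters are farther apart than
    threshold. Returns a list of clusters, each a list of item indices.
    """
    clusters = [[i] for i in range(len(distance))]

    def linkage(a, b):
        # Mean distance over all cross-cluster pairs.
        return sum(distance[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, x, y = min((linkage(a, b), x, y)
                         for x, a in enumerate(clusters)
                         for y, b in enumerate(clusters) if x < y)
        if best > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy example: four items on a line at 0, 1, 10 and 11.
points = [0, 1, 10, 11]
matrix = [[abs(p - q) for q in points] for p in points]
groups = agglomerative(matrix, threshold=3)
```

In practice the matrix would hold the melodic distances between candidate phrases, and the resulting groups would be scored against the annotation with purity and Rand index.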
Conclusion & Future Work
Detecting phrase motifs is challenging due to their inherent
variability. However:
Prominent swaras remain the same (Ex: N D S)
Explicit phrase segmentation cues need to be further explored
Time-series pattern matching methods may be extended
to motif discovery (i.e. when no prior knowledge about motifs is
available)
References
[1] J. Chakravorty, B. Mukherjee and A. K. Datta: “Some Studies in Machine Recognition of Ragas in Indian Classical Music,” Journal of the Acoust. Soc. India, Vol. 17, No. 3&4, 1989.
[2] S. Rao, W. van der Meer and J. Harvey: “The Raga Guide: A Survey of 74 Hindustani Ragas,” Nimbus Records with the Rotterdam Conservatory of Music, 1999.
[3] V. Rao and P. Rao: “Vocal Melody Extraction in the Presence of Pitched Accompaniment in Polyphonic Music,” IEEE Trans. Audio Speech and Language Processing, Vol. 18, No. 8, 2010.
[4] M. Crochemore: “An Optimal Algorithm for Computing the Repetitions in a Word,” Information Processing Letters, Vol. 12, No. 5, 1981.
[5] E. Cambouropoulos: “Musical Parallelism and Melodic Segmentation: A Computational Approach,” Music Perception: An Interdisciplinary Journal, Vol. 23, No. 3, 2006.
[6] D. Berndt and J. Clifford: “Using Dynamic Time Warping to Find Patterns in Time Series,” AAAI-94 Workshop on Knowledge Discovery in Databases, 1994.
[7] J. Lin, E. Keogh, S. Lonardi and B. Chiu: “A Symbolic Representation of Time Series, with Implications for Streaming Algorithms,” Proc. of the Eighth ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2003.
[8] A. Mueen, E. Keogh, Q. Zhu and S. Cash: “Exact Discovery of Time Series Motifs,” Proc. of the SIAM International Conference on Data Mining, 2009.
[9] J. Ross, T. P. Vinutha and P. Rao: “Detecting Melodic Motifs From Audio For Hindustani Classical Music,” Proc. of Int. Soc. for Music Information Retrieval Conf. (ISMIR), 2012.