Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection &...

30
Video-based Vibrato Detection and Analysis for Polyphonic String Music Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan Audio Information Research Lab University of Rochester The 18 th International Society for Music Information Retrieval Oct 23-27, 2017 Suzhou, China 1 18 TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23 - 2 7, 2017, SUZHOU, CHINA

Transcript of Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection &...

Page 1: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Video-based Vibrato Detection and Analysis for Polyphonic String Music

Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan

Audio Information Research LabUniversity of Rochester

The 18th International Society for Music Information Retrieval

Oct 23-27, 2017Suzhou, China

118TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Page 2: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Introduction: Vibrato in Music

218TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Important artistic effect• Pitch modulation of a note in a periodic fashion• Characterized by Rate & Extent Spectrogram

Audio

Vibrato

Non-vibrato

Applications of Vibrato Analysis

• Musicological studies• Sound synthesis• Voice extraction

Page 3: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Introduction: Problem Statement

318TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Note-level vibrato/non-vibrato classification

Vibrato Detection

Vibrato Analysis

• Vibrato rate: speed of pitch variation (1/T Hz)

• Vibrato extent: amount of pitch variation (A cents)

T

A

Pitch

Time

Time

Pitch

Vibrato Detection & Analysis for polyphonic music played by string instruments

Page 4: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Introduction: Prior Audio-based Methods

418TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Score-informed [Abeßer et al. 2015] (Baseline)

• Template-based [Driedger et al. 2016]

• Harmonic partial [Hsu et al. 2010]Major drawbacks

• One source from mixture• Fails in high polyphony

Page 5: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method Overview and Key Contribution

518TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Pitch

PitchSpec

Hand Hand Displacement

Ground-truth

Audio-based, Poly

Video-based

0.2 0.4 0.6 0.8 1.0 1.2 sec

0 0.2 0.4 0.6 0.8 1.0 1.2 sec

0 0.2 0.4 0.6 0.8 1.0 1.2 sec

Page 6: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method Overview

6

Score Alignment

Motion Feature

Extraction

Track Association

Vibrato Detection

VibratoAnalysis

Video-based Method

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Extent

Rate

Page 7: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

7

Score Alignment

Motion Feature

Extraction

Track Association

Vibrato Detection

VibratoAnalysis

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Extent

Rate

Proposed Method

Score Alignment

Page 8: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

818TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Proposed Method

Score Alignment• Chroma feature

• Dynamic Time Warping

Page 9: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

9

Score Alignment

Motion Feature

Extraction

Track Association

Vibrato Detection

VibratoAnalysis

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Extent

Rate

Proposed Method

Track-player Association

Page 10: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

1018TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Proposed Method

Track-player Association• Bow motion <--> Score onset

• Previous work [Li et al. 2017]

Page 11: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

11

Score Alignment

Motion Feature

Extraction

Track Association

Vibrato Detection

VibratoAnalysis

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Extent

Rate

Proposed Method

Track-player Association

Page 12: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method

12

Motion Feature Extraction• Hand tracking

- KLT tracker with 30 feature points

- Bounding box: 70 x 70 pixels

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Page 13: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method

Motion Feature Extraction• Fine-grained motion capture

- Optical flow estimation à pixel-level motion velocities

- Frame-wise average:

- Subtract moving mean:

Original Frame Color-encoded Optical Flow v(t)18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Page 14: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

14

Score Alignment

Motion Feature

Extraction

Track Association

Vibrato Detection

VibratoAnalysis

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Extent

Rate

Proposed Method

Track-player Association

Page 15: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method

15

Vibrato DetectionMethod 1: Supervised framework

• Support Vector Machine (SVM)

• 8-D featureZero-crossing rate (4-D)Frequency (2-D)Auto-correlation peaks (2-D)

• Leave-one-out training strategy

Classifier

Vibrato / Non-vibrato

8-D

t

Note segment

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Page 16: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method

16

Vibrato DetectionMethod 2: Unsupervised framework

• Principal Component Analysis (PCA)

• 1-D Motion Velocity Curve:

• Integration à Motion Displacement Curve:

X (t)

0.2 0.4 0.6 0.8 1.0 1.2 Time

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Page 17: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

17

Score Alignment

Motion Feature

Extraction

Track Association

Vibrato Detection

VibratoAnalysis

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Extent

Rate

Proposed Method

Vibrato Analysis

Page 18: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method

18

Vibrato Analysis

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Motion rate = Vibrato rate

• Quadratic interpolation

• Peak distance on auto-correlation of motion curve X(t)

Rate

Ground-truth pitch contour

Motion displacement

Curve X(t)

0 0.2 0.4 0.6 0.8 1.0 1.2 sec

0 0.2 0.4 0.6 0.8 1.0 1.2 sec

Page 19: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Proposed Method

19

Vibrato Analysis

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Extent • Motion extent ≠ Vibrato extent

• Pixel à Musical cents

• Scale motion curve X(t) to fit pitch contour

Estimated vib extent

Pitch contour

Motion extent

Estimated pitch contour Motion displacement Curve X(t)

Ground-truth pitch contour

Page 20: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Demo of DatasetDataset: URMP Dataset

2018TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Individually recorded in sound booth

• Annotated frame-level / note-level pitch

Page 21: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Demo of DatasetDataset: URMP Dataset

2118TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Assembled together with concert stage background

Page 22: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Experiments: Vibrato Detection Results

22

Overall Evaluation

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Video-based method à 92% F-measure• Improvement over audio-based method• SVM > PCA

Proposed

Page 23: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Experiments: Vibrato Detection Results

23

Impact of Polyphony Number

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Audio-based method: Poly ↗ Performance ↘• Proposed video-based method: Robust

2 3 4 5Poly No.

Baseline Proposed

Page 24: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Experiments: Vibrato Detection Results

24

Variation Based on Type of Instrument

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Audio-based method: Pitch range ↘ Performance ↘• Proposed Video-based method: Robust

Violin Viola Cello BassInstr.

Baseline Proposed

Page 25: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Experiments: Vibrato Analysis Results

25

Vibrato Rate / Extent

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• 2290 vibrato notes• Rate error: 0.38 Hz• Extent error: 3.47 cents

Page 26: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Conclusions

2618TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Proposed video-based vibrato detection/analysis offers significant improvement over conventional audio-only analysis

• Compared to audio-based methods, proposed video-based method is

• Robust for polyphonic sources

• Robust for different types of instruments• Proposed method provides good estimates for vibrato rate

and extent

• A powerful tool for analyzing string ensembles

Page 27: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Thank you!

Page 28: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Experiments: DatasetURMP Dataset• 19 string ensembles (57 tracks)

• 5 duets, 4 trios, 7 quartets, 3 quintets

• Audio: 48k Hz

• Video: 1080P, 29.97 fps

2818TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

URMP Dataset

Page 29: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

Demo of DatasetDataset: URMP Dataset• 14 instruments, 44 piece arrangements

2918TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

Page 30: Video-based Vibrato Detection and Analysisfor Polyphonic ...€¦ · Pitch Vibrato Detection & Analysis for polyphonicmusic played by string instruments. Introduction: Prior Audio-based

ExperimentsResults

30

Potential Application on Musicologies

18TH INTERNATIONAL SOCIETY FOR MUSIC INFORMATION RETRIEVAL, OCT 23-27, 2017, SUZHOU, CHINA

• Test on TPs from Vid-PCA method: 2290 vibrato notes• Average error: 0.38 Hz / 3.47 cents• Double bass à lower rate / extent [1]

Vibrato characteristics for different instruments

[1] James Paul Mick. An analysis of double bass vibrato: Rates, widths, and pitches as influenced by pitch height, fingers used, andtempo. PhD thesis, The Florida State University, 2012.