Digitization of the Lester S. Levy Collection of Sheet Music Ichiro Fujinaga McGill University with...

Digitization of the Lester S. Levy Collection of Sheet Music Ichiro Fujinaga McGill University with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson Levy Project II Digital Knowledge Center Sheridan Libraries Johns Hopkins University


Digitization of the Lester S. Levy Collection of Sheet Music

Ichiro Fujinaga
McGill University

with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson
Levy Project II

Digital Knowledge Center
Sheridan Libraries
Johns Hopkins University


Contents

Levy Project
  Levy Sheet Music Collection
  Digital Workflow Management
  Optical Music Recognition
  Gamera
  Guido / NoteAbility

Current goals
Digitization completed
Under development


Lester S. Levy Collection


Lester S. Levy Collection
levysheetmusic.mse.jhu.edu

North American sheet music (1780–1960)
Digitized 29,000 pieces (130,000 sheets)
Began in 1994
Includes “The Star-Spangled Banner” and “Yankee Doodle”


Database of:
  metadata
  images of music (8-bit gray)
  lyrics (first lines of verse and chorus)
  color images of cover sheets (32-bit)


Digital Workflow Management

Reduce the manual intervention for large-scale digitization projects
Creation of data repository (text, image, sound)
Optical Music Recognition (OMR)
Gamera
XML-based metadata
  composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
  cross-references for various forms of names, pseudonyms
  authoritative versions of names and subject terms
Music and lyric search engines
Analysis toolkit
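
The XML-based metadata described on this slide could look roughly like the record built below. This is a minimal sketch, not the project's actual schema: the element names (`piece`, `title`, `name`, `crossReference`), the `role` and `authoritative` attributes, and the person names are all hypothetical placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_record(title, roles, alt_names):
    """Build one metadata record; all element/attribute names are hypothetical."""
    record = ET.Element("piece")
    ET.SubElement(record, "title").text = title
    # One <name> element per contributor, tagged with its role
    # (composer, lyricist, arranger, publisher, ...).
    for role, name in roles:
        ET.SubElement(record, "name", role=role).text = name
    # Cross-references map a variant spelling or pseudonym
    # to the authoritative form of the name.
    for variant, authoritative in alt_names:
        ref = ET.SubElement(record, "crossReference", authoritative=authoritative)
        ref.text = variant
    return record


# Placeholder contributor names, not data from the collection.
record = build_record(
    "Yankee Doodle",
    [("composer", "Example Composer"), ("publisher", "Example Publisher")],
    [("E. Composer", "Example Composer")],
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping the role as an attribute rather than a separate element per role is only one possible design; it makes it easy to search all names regardless of role.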


Optical Music Recognition (OMR)

Trainable open-source OMR system in development since 1984

Staff recognition and removal
Lyric removal
Stems and notehead removal
Music symbol classifier
Score reconstruction
Lyric classifier?
Optical Character Recognition (OCR)
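
To make the first step concrete, here is a toy sketch of staff recognition and removal using a horizontal projection. It is not the algorithm used in the project's OMR system: it assumes a perfectly horizontal, one-pixel-thick staff line in a small binary image (1 = black), and the function names and the `min_fill` parameter are made up for this example.

```python
def find_staff_rows(image, min_fill=0.8):
    """Return indices of rows whose share of black pixels is at least min_fill."""
    width = len(image[0])
    return [y for y, row in enumerate(image) if sum(row) >= min_fill * width]


def remove_staff_rows(image, staff_rows):
    """Erase staff-row pixels, keeping those where a symbol crosses the line.

    A pixel is kept when the pixel directly above or below it (outside the
    staff rows) is black, which suggests a stem or notehead passing through.
    """
    height = len(image)
    rows = set(staff_rows)
    result = [row[:] for row in image]
    for y in staff_rows:
        for x, pixel in enumerate(image[y]):
            if not pixel:
                continue
            above = y > 0 and (y - 1) not in rows and image[y - 1][x]
            below = y + 1 < height and (y + 1) not in rows and image[y + 1][x]
            if not (above or below):
                result[y][x] = 0
    return result


# A vertical stroke (column 2) crossing one staff line (row 2).
image = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
staff_rows = find_staff_rows(image)
cleaned = remove_staff_rows(image, staff_rows)
for row in cleaned:
    print(row)
```

Real scans need more than this: skewed or curved staves, thick lines, and broken lines all defeat a plain row projection.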


The problem

Suitable OCR for lyrics not found
Commercial OCR systems are often inadequate for non-standard documents
The market for specialized recognition of historical documents is very small
Researchers performing document recognition often “re-invent” the basic image processing wheel


The solution

Provide easy-to-use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
Generalize OMR for structured documents


Introducing Gamera

Framework for the creation of structured document recognition systems
Designed for domain experts
Image processing tools (filters, binarizations, …)
Document segmentation and analysis
Symbol segmentation and classification
Syntactical and semantic analysis

Generalized Algorithms and Methods for Enhancement and Restoration of Archives


Features of Gamera

Portability (Unix, Windows, Mac)
Extensibility (Python and C++ plugins)
Easy-to-use (experts and programmers)
Open source
Graphic User Interface
Interactive / Batchable (scripts)


Gamera: Interface (screenshot in Linux)



Histogram (screenshot in Linux)


Thresholding (screenshot in Linux)
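
The histogram and thresholding screenshots show how an 8-bit gray image is turned into a black-and-white one. The slides do not say which thresholding method was used; the sketch below picks Otsu's method purely as a familiar example, and the pixel values are made up.

```python
def histogram(pixels, levels=256):
    """Count how many pixels fall on each gray level."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    return counts


def otsu_threshold(counts):
    """Return the level t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the light class.
    """
    total = sum(counts)
    sum_all = sum(level * count for level, count in enumerate(counts))
    best_threshold, best_variance = 0, -1.0
    dark_count = 0
    dark_sum = 0
    for level, count in enumerate(counts):
        dark_count += count
        dark_sum += level * count
        light_count = total - dark_count
        if dark_count == 0 or light_count == 0:
            continue  # one class is empty, so the variance is undefined
        dark_mean = dark_sum / dark_count
        light_mean = (sum_all - dark_sum) / light_count
        variance = dark_count * light_count * (dark_mean - light_mean) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(pixels, threshold):
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    return [1 if value <= threshold else 0 for value in pixels]


# Made-up gray values: eight dark ink pixels, eight light paper pixels.
pixels = [10] * 4 + [12] * 4 + [200] * 4 + [210] * 4
threshold = otsu_threshold(histogram(pixels))
onebit = binarize(pixels, threshold)
print(threshold, onebit)
```

A single global threshold like this works on evenly lit scans; stained or faded historical pages may need a locally adaptive method instead.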



Staff removal: Lute tablature


Classifier: Lute (screenshot in Linux)


Staff removal: Neumes


Classifier: Neumes (screenshot in Linux)


Greek example


GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian

“A formal language for score-level representation”

Plain text: readable, platform independent
Extensible and flexible
Adequate representation
NoteServer: Web/Windows
GUIDO/XML
NoteAbility (K. Hamel)


Conclusions

Levy Collection
  Searchable Metadata
  Online images (public domain) of music and cover
Digital Workflow Management
  Optical Music Recognition
Gamera for domain experts
  Includes an easy-to-use interactive environment for experimentation
  Beta version available on Linux
  OS X and Windows versions in preparation


Acknowledgements

National Science Foundation
National Endowment for the Humanities
Institute of Museum and Library Services
The Levy Family


OMR: Classifier

Connected-component analysis
Feature extraction, e.g.:
  Width, height, aspect ratio
  Number of holes
  Central moments
k-nearest neighbor classifier
Genetic algorithm
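
The first two items, connected-component analysis and feature extraction, can be sketched as follows. This is an illustrative toy, not the system's code: it assumes a small binary image (1 = black), uses 8-connectivity, and computes only some of the listed features (the number of holes is left out). The function names and the `mu20`/`mu02` keys are chosen for this example.

```python
from collections import deque


def connected_components(image):
    """Group 8-connected black pixels; return one list of (y, x) per component."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if not image[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited black pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def extract_features(pixels):
    """Bounding-box features plus two second-order central moments."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return {
        "width": width,
        "height": height,
        "aspect_ratio": width / height,
        "mu20": sum((x - mean_x) ** 2 for x in xs),  # horizontal spread
        "mu02": sum((y - mean_y) ** 2 for y in ys),  # vertical spread
    }


# A 2x2 blob (notehead-like) and a 1x3 vertical stroke (stem-like).
image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
components = connected_components(image)
feature_list = [extract_features(c) for c in components]
for f in feature_list:
    print(f)
```

Each dictionary can then be flattened into a feature vector for the classifier.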


Overall Architecture for OMR (diagram; labels only)

Image File
Staff removal, Segmentation
Recognition: K-NN Classifier
Output: Symbol Name
Knowledge Base: Feature Vectors
Off-line Optimization: Genetic Algorithm, K-NN Classifier
Best Weight Vector
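
The recognition path in this diagram, a k-NN classifier that compares feature vectors against a knowledge base using a weight vector, can be sketched as below. The knowledge base entries, symbol names, and weights are invented for illustration. The genetic algorithm itself is not implemented; the `leave_one_out_accuracy` function only shows the kind of fitness score an off-line optimizer could use to compare candidate weight vectors, which is an assumption about how the pieces fit together.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with one weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_classify(knowledge_base, weights, query, k=3):
    """Return the majority symbol name among the k nearest stored vectors.

    knowledge_base is a list of (feature_vector, symbol_name) pairs.
    """
    nearest = sorted(
        knowledge_base,
        key=lambda item: weighted_distance(item[0], query, weights),
    )[:k]
    votes = Counter(name for _, name in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(knowledge_base, weights, k=1):
    """Classify each stored sample using all the others; return the hit rate."""
    hits = 0
    for i, (vector, name) in enumerate(knowledge_base):
        others = knowledge_base[:i] + knowledge_base[i + 1:]
        if knn_classify(others, weights, vector, k) == name:
            hits += 1
    return hits / len(knowledge_base)


# Hypothetical feature vectors: (width, height, aspect ratio).
knowledge_base = [
    ((10, 8, 1.25), "notehead"),
    ((11, 8, 1.375), "notehead"),
    ((9, 7, 1.29), "notehead"),
    ((2, 30, 0.067), "stem"),
    ((2, 28, 0.071), "stem"),
    ((3, 31, 0.097), "stem"),
]
weights = (1.0, 1.0, 1.0)  # a hand-picked stand-in for the "best weight vector"

label = knn_classify(knowledge_base, weights, (10, 9, 1.1))
accuracy = leave_one_out_accuracy(knowledge_base, weights)
print(label, accuracy)
```

Because new labeled symbols only need to be appended to the knowledge base, a classifier of this kind can be retrained by a domain expert without changing any code.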


Architecture of Gamera (diagram; labels only)

Graphic User Interface (wxWindows)
Scripting Environment (Python)
Plugins (Python)
Automatic Plugin Wrapper (Boost)
Plugins (C++)
GAMERA Core (C++)


GUIDO: An example

{ [ \beamsOff |
\clef<"treble"> \key<"D"> f#*1/8. g*1/16 |
a*1/4. d2*1/8 d*1/4. c#*1/8 |
e1*1/2 _*1/4 f#*1/8. g*1/16 |
c#2*1/4. b1*1/8 a*1/4. g*1/8 ||
e#*1/2 f#*1/4 f#*1/8. g*1/16 |
a*1/4. d2*1/8 d*1/4. c#*1/8 |
e1*1/2 _*1/4 f#*1/8 g |
c#2*1/4. b1*1/8 a*1/4. c#*1/8 ],
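
Because GUIDO is plain text, note events can be pulled out of a fragment like the one above with a few lines of code. The sketch below handles only the subset seen in this example (note names, `#`/`&` accidentals, octave numbers, `*n/d` durations, dots, and `_` rests) and skips tags and bar lines. It assumes that a note without an explicit octave or duration inherits the previous one, starting from octave 1 and a quarter note; the parser and its names are hypothetical, not part of any GUIDO tool.

```python
import re
from fractions import Fraction

# name, optional accidentals, optional octave, optional *num/den, optional dots
NOTE = re.compile(
    r"^(?P<name>[a-h_])(?P<acc>[#&]*)(?P<octave>-?\d+)?"
    r"(?:\*(?P<num>\d+)/(?P<den>\d+))?(?P<dots>\.*)$"
)


def parse_notes(text):
    """Return (pitch, octave, duration) tuples; rests have octave None."""
    octave, duration = 1, Fraction(1, 4)  # assumed starting defaults
    events = []
    # Bar lines may touch the next note ("|a*1/4."), so drop them first.
    for token in text.replace("|", " ").split():
        match = NOTE.match(token)
        if not match:
            continue  # tags such as \clef<"treble">, braces, brackets
        if match["octave"] is not None:
            octave = int(match["octave"])
        if match["num"] is not None:
            base = Fraction(int(match["num"]), int(match["den"]))
            dots = len(match["dots"])
            # One dot gives 3/2 of the base value, two dots 7/4, and so on.
            duration = base * (2 - Fraction(1, 2 ** dots))
        if match["name"] == "_":
            events.append(("rest", None, duration))
        else:
            events.append((match["name"] + match["acc"], octave, duration))
    return events


# The pickup and first full bar of the example above.
fragment = r'\clef<"treble"> \key<"D"> f#*1/8. g*1/16 |a*1/4. d2*1/8 d*1/4. c#*1/8 |'
events = parse_notes(fragment)
for event in events:
    print(event)
```

Summing the durations of the four notes after the pickup gives one whole note, which is a quick sanity check that the dotted values are read correctly.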