Digitization of the Lester S. Levy Collection of Sheet Music
Ichiro Fujinaga, McGill University
with Michael Droettboom, Karl MacMillan,
G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson
Levy Project II
Digital Knowledge Center, Sheridan Libraries
Johns Hopkins University
Contents
Levy Project
  Levy Sheet Music Collection
  Digital Workflow Management
  Optical Music Recognition
  Gamera
  Guido / NoteAbility
Current goals
Digitization completed
Under development
Lester S. Levy Collection
levysheetmusic.mse.jhu.edu
  North American sheet music (1780–1960)
  Digitized 29,000 pieces (130,000 sheets)
  Began in 1994
  Includes “The Star-Spangled Banner” and “Yankee Doodle”
  Database of:
    metadata
    images of music (8-bit gray)
    lyrics (first lines of verse and chorus)
    color images of cover sheets (32-bit)
Reduce the manual intervention for large-scale digitization projects
Creation of data repository (text, image, sound)
Optical Music Recognition (OMR)
Gamera
XML-based metadata
  composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
  cross-references for various forms of names, pseudonyms
  authoritative versions of names and subject terms
Music and lyric search engines
Analysis toolkit
Digital Workflow Management
Optical Music Recognition (OMR)
  Trainable open-source OMR system in development since 1984
  Staff recognition and removal
  Lyric removal
  Stems and notehead removal
  Music symbol classifier
  Score reconstruction
  Lyric classifier?
  Optical Character Recognition (OCR)
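As a rough illustration of the first stage listed above (staff recognition and removal), staff lines in a binarized page can be located with a horizontal projection and then erased wherever no symbol crosses them. This is a minimal sketch under stated assumptions, not the project's or Gamera's actual algorithm: the function names, the `min_fill` threshold, and the toy `page` are all hypothetical, and real scans need tolerance for skewed and curved lines.

```python
def find_staff_rows(image, min_fill=0.8):
    """Return indices of rows whose fraction of black (1) pixels is >= min_fill."""
    rows = []
    for y, row in enumerate(image):
        if row and sum(row) / len(row) >= min_fill:
            rows.append(y)
    return rows


def remove_staff_rows(image, staff_rows):
    """Erase staff-row pixels unless a non-staff black pixel touches them vertically."""
    height = len(image)
    staff = set(staff_rows)
    result = [list(row) for row in image]
    for y in staff_rows:
        for x, pixel in enumerate(image[y]):
            if not pixel:
                continue
            # A symbol crosses the line here if the pixel directly above or
            # below is black and does not itself belong to a staff row.
            above = y > 0 and (y - 1) not in staff and image[y - 1][x]
            below = y + 1 < height and (y + 1) not in staff and image[y + 1][x]
            if not (above or below):
                result[y][x] = 0
    return result


# Toy page: two staff lines (rows 1 and 4) joined by a vertical stroke at x = 2.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
staff_rows = find_staff_rows(page)
cleaned = remove_staff_rows(page, staff_rows)
```

Keeping the pixels where a stroke crosses the line is what lets the later symbol classifier see an unbroken stem instead of fragments.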
The problem
  Suitable OCR for lyrics not found
  Commercial OCR systems are often inadequate for non-standard documents
  The market for specialized recognition of historical documents is very small
  Researchers performing document recognition often “re-invent” the basic image processing wheel
The solution
  Provide easy-to-use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
  Generalize OMR for structured documents
Introducing Gamera
  Framework for creation of structured document recognition systems
  Designed for domain experts
  Image processing tools (filters, binarizations, …)
  Document segmentation and analysis
  Symbol segmentation and classification
  Syntactical and semantic analysis
  Generalized Algorithms and Methods for Enhancement and Restoration of Archives
Features of Gamera
  Portability (Unix, Windows, Mac)
  Extensibility (Python and C++ plugins)
  Easy-to-use (experts and programmers)
  Open source
  Graphic User Interface
  Interactive / Batchable (scripts)
Gamera: Interface (screenshot in Linux)
Histogram (screenshot in Linux)
Thresholding (screenshot in Linux)
Staff removal: Lute tablature
Classifier: Lute (screenshot in Linux)
Staff removal: Neumes
Classifier: Neumes (screenshot in Linux)
Greek example
GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian
“A formal language for score-level representation”
  Plain text: readable, platform independent
  Extensible and flexible
  Adequate representation
  NoteServer: Web/Windows
  GUIDO/XML
  NoteAbility (K. Hamel)
Conclusions
Levy Collection
  Searchable Metadata
  Online images (public domain) of music and cover
Digital Workflow Management
Optical Music Recognition
Gamera for domain experts
  Includes an easy-to-use interactive environment for experimentation
  Beta version available on Linux
  OS X and Windows versions in preparation
Acknowledgements
  National Science Foundation
  National Endowment for the Humanities
  Institute of Museum and Library Services
  The Levy Family
OMR: Classifier
  Connected-component analysis
  Feature extraction, e.g.:
    Width, height, aspect ratio
    Number of holes
    Central moments
  k-nearest neighbor classifier
  Genetic algorithm
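The classifier steps above can be sketched in a few lines of plain Python: label connected components, turn each into a small feature vector, and assign a symbol name by k-nearest-neighbor vote. This is an illustrative sketch only, not Gamera's implementation. The training vectors and the labels "stem" and "beam" are made up, and the feature set is an assumption: it covers width, height, and aspect ratio from the slide, substitutes a bounding-box fill ratio, and omits holes and central moments for brevity.

```python
from collections import Counter, deque
import math


def connected_components(image):
    """Return a list of 8-connected black components, each a list of (y, x)."""
    height = len(image)
    width = len(image[0]) if image else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if not image[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def features(pixels):
    """Width, height, aspect ratio, and bounding-box fill ratio of a component."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return [float(width), float(height), width / height,
            len(pixels) / (width * height)]


def knn_classify(sample, training, k=3, weights=None):
    """Majority label among the k training vectors nearest to sample."""
    weights = weights or [1.0] * len(sample)

    def distance(a, b):
        return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))

    nearest = sorted(training, key=lambda item: distance(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy image: a vertical stroke (column 0) and a horizontal stroke (row 2).
image = [
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 1, 1],
    [1, 0, 0, 0, 0],
]
# Hypothetical hand-labelled training data.
training = [
    ([1.0, 5.0, 0.2, 1.0], "stem"),
    ([1.0, 6.0, 1 / 6, 1.0], "stem"),
    ([4.0, 1.0, 4.0, 1.0], "beam"),
    ([5.0, 1.0, 5.0, 1.0], "beam"),
]
components = connected_components(image)
labels = [knn_classify(features(c), training) for c in components]
```

The optional `weights` argument is where a feature-weight vector, such as one tuned by a genetic algorithm, would plug in.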
Overall Architecture for OMR
[Diagram. Recognition path: Image File, Staff removal, Segmentation, Recognition (K-NN Classifier), Output (Symbol Name). Off-line path: Knowledge Base (Feature Vectors), Optimization (Genetic Algorithm with K-NN Classifier), Best Weight Vector.]
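The off-line optimization in the diagram, where a genetic algorithm searches for the feature-weight vector that works best with the k-NN classifier, might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the fitness function (leave-one-out accuracy with k = 1), the population size, uniform crossover, Gaussian mutation, and the toy data, in which only the first feature is informative. It is not the project's actual optimizer.

```python
import random


def weighted_distance(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    return sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b))


def leave_one_out_accuracy(data, weights):
    """Fraction of samples whose nearest other sample has the same label."""
    correct = 0
    for i, (vector, label) in enumerate(data):
        others = [item for j, item in enumerate(data) if j != i]
        nearest = min(others,
                      key=lambda item: weighted_distance(vector, item[0], weights))
        correct += nearest[1] == label
    return correct / len(data)


def optimize_weights(data, generations=30, population_size=20, seed=0):
    """Evolve a weight vector in [0, 1]^n that maximizes leave-one-out accuracy."""
    rng = random.Random(seed)
    n = len(data[0][0])

    def fitness(weights):
        return leave_one_out_accuracy(data, weights)

    # Start from the unweighted baseline plus random candidates.
    population = [[1.0] * n] + [[rng.random() for _ in range(n)]
                                for _ in range(population_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:population_size // 2]  # elitism: best half survives
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(n)                               # mutate one gene
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Hypothetical knowledge base: feature 0 separates the classes, feature 1 is noise.
data = [
    ([0.10, 9.0], "a"), ([0.20, 1.0], "a"), ([0.15, 5.0], "a"),
    ([0.90, 8.5], "b"), ([0.80, 1.2], "b"), ([0.85, 5.1], "b"),
]
best = optimize_weights(data)
baseline = leave_one_out_accuracy(data, [1.0, 1.0])
tuned = leave_one_out_accuracy(data, best)
```

Because the unweighted vector is in the initial population and the best half always survives, the tuned accuracy can never fall below the baseline. The resulting vector would then be used on-line as the "Best Weight Vector" for recognition.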
Architecture of Gamera
  Graphic User Interface (wxWindows)
  Scripting Environment (Python)
  GAMERA Core (C++)
  Plugins (Python)
  Plugins (C++)
  Automatic Plugin Wrapper (Boost)
GUIDO: An example
{ [ \beamsOff | \clef<"treble"> \key<"D"> f#*1/8. g*1/16 |
a*1/4. d2*1/8 d*1/4. c#*1/8 |
e1*1/2 _*1/4 f#*1/8. g*1/16 |
c#2*1/4. b1*1/8 a*1/4. g*1/8 ||
e#*1/2 f#*1/4 f#*1/8. g*1/16 |
a*1/4. d2*1/8 d*1/4. c#*1/8 |
e1*1/2 _*1/4 f#*1/8 g |
c#2*1/4. b1*1/8 a*1/4. c#*1/8 ],
…
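To show how readable the plain-text note tokens in the example above are, the sketch below parses a small subset of them into (pitch, octave, duration) triples. It is a hedged illustration, not a GUIDO implementation: it handles only note names a to h, the rest `_`, `#` and `&` accidentals, an optional octave number, `*num/den` durations, and trailing dots. It assumes that an omitted octave or duration is inherited from the previous event, with octave 1 and 1/4 as starting defaults. Tags such as `\clef` and bar lines are not handled, and `parse_notes` is a hypothetical helper name.

```python
import re
from fractions import Fraction

# name, optional accidental, optional octave, optional *num/den, optional dots
NOTE = re.compile(
    r"^(?P<name>[a-h]|_)(?P<acc>#{1,2}|&{1,2})?(?P<octave>-?\d+)?"
    r"(?:\*(?P<num>\d+)/(?P<den>\d+))?(?P<dots>\.*)$"
)


def parse_notes(tokens, octave=1, duration=Fraction(1, 4)):
    """Parse note tokens into (pitch, octave, duration); rests have pitch None."""
    notes = []
    for token in tokens:
        match = NOTE.match(token)
        if match is None:
            raise ValueError(f"not a note token: {token!r}")
        if match["octave"] is not None:
            octave = int(match["octave"])  # carried over to later notes
        if match["num"] is not None:
            base = Fraction(int(match["num"]), int(match["den"]))
            dots = len(match["dots"])
            # Each dot adds half of the previous increment: 1, 3/2, 7/4, ...
            duration = base * (2 - Fraction(1, 2 ** dots))
        if match["name"] == "_":
            notes.append((None, None, duration))
        else:
            pitch = match["name"] + (match["acc"] or "")
            notes.append((pitch, octave, duration))
    return notes


tokens = ["f#*1/8.", "g*1/16", "a*1/4.", "d2*1/8", "d*1/4.", "_*1/4", "g"]
parsed = parse_notes(tokens)
```

Using `Fraction` keeps dotted values exact (a dotted eighth becomes 3/16), which makes it easy to check that the durations in a measure add up.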