CAREER: Making music documents accessible in musical terms

$506,669 | FY2007 | CSE | NSF

Northwestern University, Evanston IL

Investigators

Abstract

One of the key problems facing us in the 21st century is information retrieval and management. Finding ways to automatically index, label, and access multimedia content (such as music documents) in meaningful ways is an open research question that increases in importance as multimedia databases proliferate and grow. Music collections, such as the 3.5 million recordings in Apple Computer's iTunes repository, comprise one of the most popular categories of on-line multimedia content. For scholars, musicians and even casual listeners, the music document is only the beginning, a tool to initiate the task at hand. Musicians may be interested in remixing a musical recording even though all they have available is the final mix. Scholars may wish to analyze the harmonies in a piece. Others may want karaoke that follows the singer's expressive timing, or a way to remove the sound of an unwanted cell phone ring from a recording of their daughter's flute recital.

The objective of this research is to develop two key facilitating technologies to enable these kinds of interactions: score alignment and source separation. Score alignment involves aligning an audio performance to the events in a machine-readable music score. When aligned to a score, a performance can be addressed by melodic and harmonic content. We propose to advance the state of the art by enabling a machine to follow partially specified scores (such as Jazz lead sheets). This alignment requires significant inference about likely surface structures (the note sequence in an improvised solo) from deeper structural descriptions in the score (the chords in a lead sheet). This will enable alignment of entire classes of music, such as much Jazz, Pop and Rock, that cannot currently be aligned to scores.

The second technology, source separation, is the process of isolating individual source signals from mixtures of those signals. With source separation, individual instruments and sounds can be accessed, identified and manipulated in ways beyond the power of commercial audio search and editing software. We will advance the field through score-informed separation, as well as new iterative methods for approximating source models from acoustic mixtures. The idea is to develop a synergistic system for music information retrieval and interaction that uses multiple document modalities (written scores, audio files, MIDI) to infer more about the music structure than is possible using a single modality.

This research will impact the signal-processing community (source separation), the music information retrieval community (music indexing and search) and the artificial intelligence community (tools for intelligent abstraction of real-world data). To broadly disseminate the work, demonstration tools will be made available over the internet and results will be published in relevant journals and conferences. The PI is committed to involving undergraduates and members of historically underrepresented groups in research, working with the SROP and UROP programs to make this happen. The PI also teaches the course "Machine Perception of Music", where research results will be disseminated to a wide variety of students.
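As a rough illustration of what score alignment means in its simplest form, the sketch below aligns a fully specified score to a performance with dynamic time warping (DTW). DTW is a common baseline and is swapped in here for illustration only; the abstract does not say which alignment method the project uses, and the sketch does not handle the partially specified scores the project targets. The function name `dtw_align` and the representation of both inputs as lists of MIDI pitch numbers are assumptions made for this example; a real system would compare audio features rather than clean pitch values.

```python
from typing import List, Tuple


def dtw_align(
    score: List[float], performance: List[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two pitch sequences with dynamic time warping.

    Hypothetical helper: both inputs are assumed to be MIDI pitch
    numbers, one per score event or per performance frame.
    Returns the total alignment cost and a list of
    (score_index, performance_index) pairs.
    """
    n, m = len(score), len(performance)
    inf = float("inf")

    # cost[i][j] is the minimal cumulative cost of aligning
    # score[:i] with performance[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(score[i - 1] - performance[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in the score only
                cost[i][j - 1],      # advance in the performance only
            )

    # Backtrack from the end to recover the alignment path.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Three score notes against a performance that holds the first
# and last notes for two frames each.
total_cost, alignment = dtw_align([60, 62, 64], [60, 60, 62, 64, 64])
```

Once such a path exists, each stretch of the performance can be looked up by the score event it maps to, which is the sense in which an aligned performance becomes addressable by melodic and harmonic content.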
