School of Electronic Engineering and Computer Science

Meta-learning-based domain adaptation for melody extraction

When: Wednesday, September 25, 2024, 11:00 AM - 12:00 PM
Where: GC205, Graduate Centre Building

Speaker: Kavya Ranjan Saxena

Abstract: Extracting the dominant pitch from polyphonic audio is a central task in music information retrieval. Training machine learning models for this task requires a substantial amount of labelled audio. Models trained on audio from one domain (the source) often fail to extract pitch accurately from audio in a different domain (the target). To improve performance, a model can be adapted using a small amount of labelled data from the target domain, an approach known as supervised domain adaptation. We use meta-learning as the supervised domain adaptation method for melody extraction and propose a novel weighting technique to handle class imbalance when adapting to only a few audio recordings in the target domain. The method can also be extended into an efficient interactive melody extraction method based on active adaptation: it selects the regions of the target audio that require human annotation using a confidence criterion based on normalized true class probability, and the model then uses these annotations to adapt itself to the target domain through meta-learning. The meta-learning-based domain adaptation method is model-agnostic and can be applied to other non-adaptive melody extraction models to boost their performance.
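
The region-selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes a per-frame confidence score in [0, 1] is already available, and the function name `low_confidence_regions`, the `threshold` value, and the `min_len` parameter are all hypothetical choices made for illustration. The sketch groups consecutive low-confidence frames into regions that could be passed to a human annotator.

```python
from typing import List, Tuple


def low_confidence_regions(
    confidence: List[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the talk
    min_len: int = 1,        # hypothetical minimum region length in frames
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges whose confidence is below threshold."""
    regions: List[Tuple[int, int]] = []
    start = None  # index where the current low-confidence run began, if any
    for i, c in enumerate(confidence):
        if c < threshold:
            if start is None:
                start = i  # open a new run
        elif start is not None:
            # The run ended at frame i; keep it only if it is long enough.
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a run that extends to the final frame.
    if start is not None and len(confidence) - start >= min_len:
        regions.append((start, len(confidence)))
    return regions


# Illustrative confidence scores for five frames (made-up values).
scores = [0.9, 0.2, 0.3, 0.8, 0.1]
regions = low_confidence_regions(scores)
```

In an interactive setting, only the frames inside the returned ranges would be sent for annotation, which keeps the labelling effort small compared with annotating the whole recording.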
 
Bio: Kavya Ranjan Saxena is a Ph.D. student at the Indian Institute of Technology Kanpur, India. Her research interests are in machine learning for signal processing with a focus on domain adaptation for melody extraction in the field of music information retrieval. Currently, she is working as an Intern – Speech Research Scientist at Krutrim (an Ola company), where her work focuses on Audio LLMs.
