Publication: Classifying Ragams in Carnatic Music with Machine Learning Models: A Shazam for South Indian Classical Music
Date
2024-11-26
Authors
Narayanan, Hari
Citation
Narayanan, Hari. 2024. Classifying Ragams in Carnatic Music with Machine Learning Models: A Shazam for South Indian Classical Music. Bachelor's thesis, Harvard University Engineering and Applied Sciences.
Abstract
Ragams, analogous to the Western musical concept of mode, are the cornerstone of Indian classical music. Each song in Indian classical music exists within one of a few hundred common ragams. Learning to identify a song’s ragam is a core competency developed during an Indian classical music education, and efforts to develop computational ragam identifiers have thus recently gained traction. In this project, I combine two digital audio signal processing techniques with four deep-learning models to determine whether these models are appropriate tools for classifying ragams in real-world concert settings. I obtained promising results, including 98% model testing accuracy when distinguishing between songs in two different ragams, 94% testing accuracy when classifying songs within a pool of ten ragams, and 86% testing accuracy on a pool of fifteen commonly occurring ragams. My results constitute meaningful strides towards the ultimate goal of engineering a Shazam-like application for reliable real-time song classification in four principal ways:
1. I compiled a dataset of over 70,000 Carnatic songs (over 1 TB of audio files), labeled by ragam, enabling novel features to be extracted and bigger models to be trained.
2. I designed Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), and 2-D Convolutional Neural Network (CNN) ragam-recognition models that surpass the ragam classification accuracy and capability benchmarks of previous studies.
3. I provided model explainability by highlighting the significance of particular audio features in ragam predictions, informed by Carnatic domain knowledge, unlike prior black-box approaches.
4. I introduced transformer (BERT) models for ragam identification in Carnatic music, marking a novel approach in this research area.
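The abstract does not name the two signal-processing techniques used to turn audio into model inputs. A common feature for ragam work is a tonic-relative pitch-class profile, since a ragam is defined by the svaras (scale degrees) it uses relative to the performer's tonic (sruti). The sketch below is an illustrative assumption, not the thesis's actual pipeline: it folds a magnitude spectrum into 12 tonic-relative pitch classes with numpy, and the `pitch_class_profile` function and its parameters are hypothetical names introduced here.

```python
import numpy as np

def pitch_class_profile(signal, sr, tonic_hz, n_classes=12):
    """Fold the magnitude spectrum into pitch classes relative to a tonic.

    Because ragams are defined by scale degrees relative to the tonic
    rather than by absolute pitch, a tonic-relative histogram like this
    is a natural (hypothetical) input feature for a ragam classifier.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    profile = np.zeros(n_classes)
    for f, mag in zip(freqs, spectrum):
        if f < 20.0:  # skip DC and sub-audible bins where pitch is undefined
            continue
        # Distance from the tonic in semitones, folded into one octave.
        semitones = 12.0 * np.log2(f / tonic_hz)
        profile[int(round(semitones)) % n_classes] += mag
    return profile / profile.sum()

# Example: a pure tone one octave above a 440 Hz tonic should land
# almost entirely in pitch class 0 (the tonic class).
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 880.0 * t)
profile = pitch_class_profile(tone, sr, tonic_hz=440.0)
```

A profile like this could be fed to the ANN models directly, or stacked over time into a 2-D chromagram-style image for the CNN models the abstract describes.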
Keywords
Carnatic, Machine Learning, Neural Networks, Ragam, Applied mathematics, Biology, Music
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service