Publication: Classifying Ragams in Carnatic Music with Machine Learning Models: A Shazam for South Indian Classical Music
Abstract
Ragams, analogous to the Western musical concept of mode, are the cornerstone of Indian classical music. Each song in Indian classical music is set in one of a few hundred common ragams. Learning to identify a song’s ragam is a core competency developed during an Indian classical music education, and efforts to develop computational ragam identifiers have recently gained traction. In this project, I combine two digital audio signal processing techniques with four deep-learning models to determine whether these models are appropriate tools for classifying ragams in real-world concert settings. I obtained promising results, including 98% testing accuracy when distinguishing between songs in two different ragams, 94% testing accuracy when classifying songs within a pool of ten ragams, and 86% testing accuracy on a pool of fifteen commonly occurring ragams. This work makes meaningful strides towards the ultimate goal of engineering a Shazam-like application for reliable real-time song classification through four principal contributions:
- Compiled a comprehensive training dataset of over 70,000 Carnatic songs (more than 1 TB of audio files), labeled by ragam, enabling novel features to be extracted and larger models to be trained.
- Designed Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), and 2-D Convolutional Neural Network (CNN) ragam recognition models that surpass benchmarks from previous studies in ragam classification accuracy and capability.
- Provided model explainability by highlighting the significance of particular audio features in ragam predictions, enhanced by Carnatic domain knowledge, unlike prior black-box approaches.
- Introduced transformer (BERT) models for ragam identification in Carnatic music, marking a novel approach in this research area.
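
The abstract does not name the two signal-processing techniques or the audio features used, so the sketch below is only an illustrative assumption rather than the method of this work. It shows one feature often used to describe a ragam: a histogram of pitch classes measured relative to the performer's tonic. The function name `pitch_class_histogram`, its parameters, and the 12-bin equal-tempered approximation of the swara positions are all hypothetical choices made for this example.

```python
import math


def pitch_class_histogram(freqs_hz, tonic_hz, bins=12):
    """Fold pitch estimates (Hz) into `bins` pitch classes relative to the tonic.

    Assumption: pitches are approximated on an equal-tempered grid, which
    only roughly matches the intonation used in Carnatic performance.
    """
    counts = [0] * bins
    for f in freqs_hz:
        if f <= 0:  # treat non-positive values as unvoiced frames and skip them
            continue
        # Distance from the tonic in units of one bin (one semitone when bins=12).
        steps = bins * math.log2(f / tonic_hz)
        # Fold all octaves onto a single octave.
        counts[round(steps) % bins] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * bins
    # Normalize so recordings of different lengths are comparable.
    return [c / total for c in counts]


# Example: tonic at 220 Hz, with the tonic, its octave, a fifth, and one unvoiced frame.
hist = pitch_class_histogram([220.0, 440.0, 330.0, 0.0], tonic_hz=220.0)
```

A vector like `hist` could serve as input to a small classifier, and because each bin corresponds to a note position relative to the tonic, the weight a model assigns to a bin can be read against Carnatic domain knowledge about which notes a ragam uses.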