Unsupervised Neural Transcription of Percussive Audio
Authors : Revanth Reddy Pasula
Volume/Issue : RISEM–2025
Google Scholar : https://tinyurl.com/3cu3uyds
Scribd : https://tinyurl.com/yc8nu34w
DOI : https://doi.org/10.38124/ijisrt/25jun168
Abstract : Transcribing percussive audio without human-labeled data remains a challenging problem in music information retrieval. This work presents a deep learning approach that learns to transcribe drums without human-annotated datasets. The approach pairs a neural transcription model with a fixed synthesizer module; together, they iteratively refine the drum transcription by maximizing the accuracy of the reconstructed audio. Experimental results indicate that the unsupervised system performs on par with fully supervised models while offering better scalability. These results suggest that self-supervised learning can improve drum transcription accuracy, opening the door to wider application in automatic music analysis and generative sound modeling.
Keywords : Drum Transcription; Unsupervised Learning; Self-Supervised Audio; Music Information Retrieval; Neural Networks.
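The reconstruction-driven idea summarized in the abstract can be illustrated with a small sketch. This is not the paper's implementation. It assumes a toy fixed synthesizer that convolves per-drum onset activations with fixed one-shot samples, and it replaces the neural transcription model with freely optimized activations fitted by plain gradient descent. All names (`synthesize`, `reconstruction_loss`, `fit_activations`), the learning rate, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np


def synthesize(activations, samples):
    """Toy fixed synthesizer (an assumption, not the paper's module).

    activations: (n_drums, n_steps) onset strengths per drum.
    samples:     (n_drums, sample_len) fixed one-shot drum waveforms.
    Returns the mix obtained by convolving each activation row
    with its drum sample and summing over drums.
    """
    n_drums, n_steps = activations.shape
    out = np.zeros(n_steps + samples.shape[1] - 1)
    for d in range(n_drums):
        out += np.convolve(activations[d], samples[d])
    return out


def reconstruction_loss(activations, samples, target):
    """Mean squared error between the re-synthesized and target audio."""
    return float(np.mean((synthesize(activations, samples) - target) ** 2))


def loss_gradient(activations, samples, target):
    """Analytic gradient of the MSE with respect to the activations."""
    residual = synthesize(activations, samples) - target
    grad = np.empty_like(activations)
    for d in range(activations.shape[0]):
        # Correlating the residual with sample d is the adjoint of the convolution.
        grad[d] = np.correlate(residual, samples[d], mode="valid")
    return 2.0 * grad / residual.size


def fit_activations(target, samples, n_steps, lr=0.5, iters=200):
    """Stand-in for the transcription model: optimize activations directly.

    No labels are used; only the reconstruction error drives the updates.
    lr=0.5 assumes unit-energy samples; larger sample scales need a smaller step.
    """
    activations = np.zeros((samples.shape[0], n_steps))
    history = []
    for _ in range(iters):
        history.append(reconstruction_loss(activations, samples, target))
        activations -= lr * loss_gradient(activations, samples, target)
    history.append(reconstruction_loss(activations, samples, target))
    return activations, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "drums": decaying noise bursts, normalized to unit energy.
    samples = rng.standard_normal((2, 32)) * np.exp(-np.arange(32) / 8.0)
    samples /= np.linalg.norm(samples, axis=1, keepdims=True)
    truth = np.zeros((2, 64))
    truth[0, 5] = 1.0
    truth[1, 40] = 1.0
    target = synthesize(truth, samples)
    est, history = fit_activations(target, samples, n_steps=64)
    print("initial loss:", history[0], "final loss:", history[-1])
    print("strongest estimated onsets:", est.argmax(axis=1))
```

In a full system, the activations would presumably come from a neural network applied to the input audio, with the reconstruction error backpropagated through the synthesizer into the network weights; the sketch only shows how a label-free reconstruction objective can drive the transcription.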

