Music Language Modeling using Temporal Convolutional Network Architecture for Automatic Music Transcription
Automatic music transcription (AMT) is the task of recovering, from an acoustic representation of music, a symbolic notation of the written musical score that produced the sound. Much existing work in AMT has focused on developing suitable acoustic models, or discriminative combined acoustic/sequence models, that map spectrogram representations to the desired symbolic representation. Music language modeling (MLM), which produces generative models of musical sequences, can be used in conjunction with acoustic methods in graphical models for AMT. Although MLM for AMT is an open research question, and existing work on the subject demonstrates only modest effectiveness, it has been speculated that MLM has the potential to significantly increase the performance of AMT models for at least two reasons. First, an MLM represents statistically learned prior beliefs that can be incorporated by way of Bayesian reasoning into an AMT model. Second, because the training of MLMs does not require acoustic data aligned with symbolic music notations, much more data can be obtained for training. Common autoregressive modeling techniques for this MLM task are Hidden Markov Models and recurrent neural networks such as Long Short-Term Memory (LSTM) models. In contrast to these well-established sequential modeling techniques, the present work studies a feed-forward temporal convolutional neural network (TCNN) architecture for MLM within a local radius of up to 20 seconds. To the author's knowledge, this work is the first to use a TCNN architecture specifically for language modeling of symbolic musical data and to successfully use that model to improve upon the AMT performance of a discriminative model. It is also the only work known to the author that uses focal loss, a technique imported from visual object detection, to efficiently and effectively train such an MLM.
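For readers unfamiliar with the focal loss mentioned in the abstract, the following is a minimal sketch of the standard binary form used in object detection, applied here to one frame of per-note probabilities. The abstract does not state the exact formulation or hyperparameters used in the thesis, so the function name `binary_focal_loss`, the defaults `gamma=2.0` and `alpha=0.25`, and the piano-roll framing are assumptions for illustration only.

```python
import math

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over per-note predictions.

    probs   -- predicted probabilities that each note is active (hypothetical input)
    targets -- 0/1 labels for the same notes
    gamma   -- focusing parameter; gamma=0 reduces to weighted cross-entropy
    alpha   -- weight on the positive class (assumed default, not from the thesis)
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # Clip to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # The (1 - p_t)^gamma factor down-weights well-classified examples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)

# A confidently correct "note off" contributes far less than a missed "note on".
easy = binary_focal_loss([0.05], [0])
hard = binary_focal_loss([0.05], [1])
```

With `gamma` set to 0 the modulating factor disappears and the expression becomes an alpha-weighted cross-entropy. Larger values of `gamma` shift the training signal toward examples the model currently gets wrong.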
Major Advisor: Liang Huang
Committee: Prasad Tadepalli
Committee: Mike Bailey
Wednesday, June 19, 2019 at 9:00am to 11:00am
Kelley Engineering Center, 1007
110 SW Park Terrace, Corvallis, OR 97331