The musicnn library (pronounced as “musician”) employs deep convolutional neural networks to automatically tag songs, and the models that are included achieve the best scores in public evaluation benchmarks. These state-of-the-art models have been released as an open-source library that can be easily installed and used. For example, you can use musicnn to tag this emblematic song from Muddy Waters — and it will predominantly tag it as blues!
Interestingly, although musicnn is quite confident about its blues prediction, it also considers (with less determination!) that some parts of the song can be tagged as jazz, soul, rock or even country, which are all closely related music genres. The taggram of the song above, that is, the evolution of the tag probabilities over time, shows this.
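To make the idea of a taggram concrete, here is a minimal sketch in plain Python. It assumes that a taggram is a table with one row per time frame and one column per tag, and that song-level tags are obtained by averaging each column over time. The tag names and probabilities below are toy values made up for illustration, not real musicnn output, and `song_level_tags` is a hypothetical helper rather than part of the library.

```python
from statistics import mean

# Hypothetical tag names and toy probabilities (NOT real musicnn output).
tags = ["blues", "jazz", "soul", "rock", "country"]

# Toy taggram: one row per time frame, one column per tag.
taggram = [
    [0.80, 0.10, 0.05, 0.03, 0.02],
    [0.70, 0.15, 0.05, 0.05, 0.05],
    [0.60, 0.10, 0.20, 0.05, 0.05],
    [0.75, 0.05, 0.05, 0.10, 0.05],
]


def song_level_tags(taggram, tags):
    """Average each tag's probability over time and rank the tags (hypothetical helper)."""
    columns = zip(*taggram)  # transpose: one tuple of per-frame values per tag
    averages = [mean(column) for column in columns]
    return sorted(zip(tags, averages), key=lambda pair: pair[1], reverse=True)


ranking = song_level_tags(taggram, tags)
for tag, probability in ranking:
    print(f"{tag:8s} {probability:.3f}")
```

With these toy values, blues dominates the ranking while the related genres get small but non-zero averages, which mirrors the behaviour described above.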
This project was developed at the Music Technology Group of the Universitat Pompeu Fabra in Barcelona as part of my PhD research, and we have now open-sourced musicnn. To use it, simply install it and run the tagger on a song:
pip install musicnn
python -m musicnn.tagger your_song.mp3 --print
Note that the choral prelude is well detected, as are the piano and rock sections. A particularly interesting confusion is that Freddie Mercury’s voice was tagged as female voice! In addition, the model is very consistent in its predictions (whether or not the confusions are “reasonable”), to the point where one can clearly identify the different sections of the song in the taggram. For example, “repeated patterns” appear for sections that share the same structure, and the model is quite successful at detecting whether a singing voice is present.
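Reading sections off a taggram can also be sketched in code. The snippet below assumes we already have the per-frame probabilities of a single tag (here a hypothetical “vocal” curve with made-up values) and simply thresholds it to find the stretches where the tag is active. The `active_segments` helper and the 0.5 threshold are illustrative assumptions, not something musicnn provides.

```python
def active_segments(curve, threshold=0.5):
    """Return (start, end) frame-index pairs where curve >= threshold.

    The end index is exclusive. Hypothetical helper for illustration only.
    """
    segments = []
    start = None
    for index, value in enumerate(curve):
        if value >= threshold and start is None:
            start = index  # a new active stretch begins
        elif value < threshold and start is not None:
            segments.append((start, index))  # the active stretch just ended
            start = None
    if start is not None:
        segments.append((start, len(curve)))  # curve ended while still active
    return segments


# Toy per-frame probabilities for a hypothetical "vocal" tag (NOT real output).
vocal_curve = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.9, 0.8]
segments = active_segments(vocal_curve)
print(segments)
```

A real taggram is noisier than this toy curve, so in practice one would probably smooth the curve before thresholding it.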