Have you ever wanted to remove the vocals or lead instrument from a favorite song? I often wish I could extract and remix a specific element of a song for my own musical daydreams. Needless to say, this has been a lifelong dream of mine, and I have thought up all sorts of possible artistic uses.
So imagine my surprise when I found separateLeadStereo, an open-source Python script that does just that! It is by no means a perfect extraction, but I find the algorithmic decisions and the resulting glitchy sound quite fascinating. I really enjoy how the solo and accompaniment affect each other, each ducking in volume against the other while certain shared frequencies subtly leak through. I am in love with sounds that are obviously aleatoric: music that shows its roots in a state of entropy.
This process is known as Automatic Extraction of the Main Melody from Polyphonic Music Signals, or desoloing for short.
Here are a few examples of songs I have processed.
But there is a catch… By my own tests and rough calculations, it takes 24 hours of rendering for every 45 seconds of audio you want desoloed (estimated on a quad-core 3.2 GHz processor). It’s also a fairly RAM-intensive process, with 5 minutes of audio filling up 4 GB of RAM. So you should know what you’re getting into and choose your audio tests wisely.
UPDATE (2013-11-25) – Since writing this post, I’ve processed around one hundred songs. Now a 4-minute song takes only 3 hours to render! That is a huge speed boost. The bottleneck was fixed by installing the MKL build of the NumPy library (built against Intel’s high-performance Math Kernel Library). Also, by installing the 64-bit versions of Python and the required libraries, you can use much more RAM and process longer songs. With 8 GB of RAM, the maximum song length is about 10 minutes.
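If you want a rough idea of what a song will cost before you start, the updated figures above can be turned into a quick back-of-the-envelope calculator. This is only a sketch: the function and constant names are made up for illustration, and it assumes that render time and RAM both scale linearly with song length, which is a guess rather than something measured.

```python
from __future__ import division, print_function

# Rough rates taken from the figures quoted in the update above.
RENDER_HOURS_PER_AUDIO_MINUTE = 3.0 / 4.0   # a 4-minute song takes about 3 hours
RAM_GB_PER_AUDIO_MINUTE = 8.0 / 10.0        # about 10 minutes of audio fits in 8 GB


def estimate_desolo_cost(audio_minutes):
    """Return (render_hours, ram_gb), ASSUMING both scale linearly with length."""
    render_hours = audio_minutes * RENDER_HOURS_PER_AUDIO_MINUTE
    ram_gb = audio_minutes * RAM_GB_PER_AUDIO_MINUTE
    return render_hours, ram_gb


if __name__ == "__main__":
    for minutes in (2, 4, 10):
        hours, ram = estimate_desolo_cost(minutes)
        print("%2d min of audio: ~%.1f h render, ~%.1f GB RAM" % (minutes, hours, ram))
```

Treat the numbers as ballpark only; a different processor or a denser mix could easily throw them off.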
You’ll want to choose songs where the lead vocal or instrument stands out from the rest of the music. But some interesting things can happen when the algorithmically followed lead catches pitches from other instruments. So expect many happy accidents from this process; you might even choose a song where you have no idea what it will try to extract. Only 44.1 kHz WAV files are allowed.
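Since a render can take hours, it may be worth confirming that a file really is a 44.1 kHz WAV before handing it over. Here is a minimal sketch using Python's built-in wave module. The helper names are hypothetical and are not part of separateLeadStereo, and the wave module only reads uncompressed PCM WAVs, so a file it rejects might still be a valid WAV of another kind.

```python
from __future__ import print_function

import sys
import wave

REQUIRED_RATE_HZ = 44100  # the post says only 44.1 kHz WAV files are accepted


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in the WAV file header."""
    wav_file = wave.open(path, "rb")
    try:
        return wav_file.getframerate()
    finally:
        wav_file.close()


def is_desolo_ready(path):
    """True if the file opens as a PCM WAV and is sampled at 44.1 kHz."""
    try:
        return wav_sample_rate(path) == REQUIRED_RATE_HZ
    except (wave.Error, EOFError, IOError):
        # Missing file, empty file, or something the wave module cannot parse.
        return False


if __name__ == "__main__":
    # Usage: python check_wav.py song1.wav song2.wav
    for name in sys.argv[1:]:
        status = "OK" if is_desolo_ready(name) else "not readable as a 44.1 kHz PCM WAV"
        print(name, status)
```

Saved under a name of your choosing (check_wav.py here is just a placeholder), it can be run on any number of files at once.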
Some command-line experience would be very helpful. But if that is foreign to you, I’m going to do my best to outline the required steps below. There are no required settings to set up; you just need to point the Python script at the audio file you want to desolo.
I would like to thank Jean-Louis Durrieu for releasing his work to the open-source community. It is utterly fantastic and fascinating work. Bravo!
1. Install Python (the C:\Python27 path used below assumes Python 2.7)
2. Install these Python libraries – (make sure to download the MKL build of NumPy)
3. Now we need to add Python to the Windows Path variable so that Windows can find it from any folder. This will make it much easier to use Python from the command-line prompt.
— Control Panel > System > Advanced > Environment Variables > System Variables > Path
— Add this to the end of the line (entries are separated by semicolons, so add one first if the line does not already end with one): C:\Python27;
— Click ‘ok’ and close all those windows.
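To double-check the setup, a small script along these lines can report whether Python is 64-bit, whether its folder made it onto the Path, and which NumPy build is installed. It is only a sketch with made-up helper names. Looking for 'mkl' in what numpy.show_config() prints is a rule of thumb I am assuming here, not an official test.

```python
from __future__ import print_function

import os
import struct
import sys


def python_bits():
    """Return 32 or 64 depending on the pointer size of this Python build."""
    return struct.calcsize("P") * 8


def python_dir_on_path():
    """True if the folder holding this interpreter is listed in PATH."""
    python_dir = os.path.normcase(os.path.normpath(os.path.dirname(sys.executable)))
    entries = os.environ.get("PATH", "").split(os.pathsep)
    normalized = [os.path.normcase(os.path.normpath(e)) for e in entries if e]
    return python_dir in normalized


if __name__ == "__main__":
    print("Python %s, %d-bit" % (sys.version.split()[0], python_bits()))
    print("Interpreter folder on PATH:", python_dir_on_path())
    try:
        import numpy
    except ImportError:
        print("NumPy is not installed for this interpreter")
    else:
        print("NumPy", numpy.__version__)
        numpy.show_config()  # assumption: an MKL build mentions 'mkl' in this output
```

Environment variable changes only apply to programs started afterwards, so run it from a freshly opened cmd prompt.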
BEGIN DESOLOING SONGS
1. Download and unzip the separateLeadStereo Python scripts directly into a folder on your desktop called ‘desolo’. Make sure the scripts are not nested within another folder. Also drop the music that you wish to process into this folder (WAV @ 44.1 kHz).
2. Open the start menu. In the start menu search box, type: ‘cmd’ and hit enter.
This is your command-line prompt, where all the rendering will be triggered from.
3. Now we need to navigate the cmd prompt to the desolo folder where the Python scripts and WAV are waiting for us.
Type: ‘cd desktop\desolo’ and hit enter.
4. Now for the last step: telling it to render the WAV!
Type: ‘python separateLeadStereoParam.py name-of-your-audio-file.wav’ and hit enter.
5. And now it’s rendering! It may not look like it’s doing anything for a while, but have no fear and just wait. You will know it’s done rendering when it says ‘Done!’. There is no percentage meter. Just make sure not to close the cmd prompt until it’s done.
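Because each render takes hours, it can be handy to queue up several songs and walk away. The sketch below loops over every WAV in a folder and runs the same command as in the steps above, one file at a time. The wrapper itself (desolo_folder, build_command) is hypothetical and not part of separateLeadStereo. It assumes the scripts and the WAVs sit together in one folder, as described above.

```python
from __future__ import print_function

import glob
import os
import subprocess
import sys

SCRIPT_NAME = "separateLeadStereoParam.py"  # script name used in the steps above


def build_command(wav_path):
    """Command equivalent to: python separateLeadStereoParam.py <file>.wav"""
    return [sys.executable, SCRIPT_NAME, os.path.basename(wav_path)]


def desolo_folder(folder, run=subprocess.call):
    """Run the script once per WAV in `folder`; return {file name: exit code}."""
    results = {}
    # The file list is taken once up front, so anything written into the
    # folder during the run is not picked up as a new input.
    for wav_path in sorted(glob.glob(os.path.join(folder, "*.wav"))):
        name = os.path.basename(wav_path)
        print("Rendering", name, "(this can take hours)")
        results[name] = run(build_command(wav_path), cwd=folder)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python desolo_all.py C:\path\to\desolo
    print(desolo_folder(sys.argv[1]))
```

The songs are rendered one after another rather than in parallel, since a single render can already use several gigabytes of RAM.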