Samples for Semi-supervised Monaural Singing Voice Separation with a Masking Network Trained on Synthetic Mixtures

A comparison of our semi-supervised method with the state-of-the-art supervised method from "Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask" by Mimilakis et al., ICASSP 2018.

All baseline results are obtained from https://js-mim.github.io/mss_pytorch/.

Our results appear cleaner, in the sense that the separated singing contains less instrumental music. However, they also contain more distortion. This is reflected in our somewhat lower SDR (signal-to-distortion ratio) but much higher SIR (signal-to-interference ratio).
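To make the SDR/SIR trade-off concrete, here is a minimal sketch of how the two metrics differ. It is not the evaluation code behind the reported numbers: it assumes a simplified, gain-only variant of the BSS Eval decomposition (no time-invariant distortion filters), and the function name `sdr_sir` and the toy signals are hypothetical. The estimate is split into a target part, an interference part (leaked accompaniment), and an artifact part (everything else). SIR only penalizes interference, while SDR penalizes both.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def sdr_sir(estimate, vocals, accompaniment):
    """Return (SDR, SIR) in dB for an estimated vocal track.

    Simplified, gain-only decomposition (an assumption of this sketch):
    the estimate is projected onto the true sources with scalar gains only.
    Raises ZeroDivisionError if the estimate contains no interference.
    """
    # Target component: projection of the estimate onto the true vocals.
    gain = dot(estimate, vocals) / dot(vocals, vocals)
    s_target = [gain * v for v in vocals]

    # Projection of the estimate onto span{vocals, accompaniment},
    # obtained by solving the 2x2 normal equations in closed form.
    a11, a12 = dot(vocals, vocals), dot(vocals, accompaniment)
    a22 = dot(accompaniment, accompaniment)
    b1, b2 = dot(estimate, vocals), dot(estimate, accompaniment)
    det = a11 * a22 - a12 * a12
    c1 = (b1 * a22 - b2 * a12) / det
    c2 = (a11 * b2 - a12 * b1) / det
    p_all = [c1 * v + c2 * a for v, a in zip(vocals, accompaniment)]

    # Interference: the part explained by the sources but not by the target.
    e_interf = [p - t for p, t in zip(p_all, s_target)]
    # Artifacts: the part not explained by any true source.
    e_artif = [e - p for e, p in zip(estimate, p_all)]
    residual = [i + r for i, r in zip(e_interf, e_artif)]

    target_energy = dot(s_target, s_target)
    sdr = 10 * math.log10(target_energy / dot(residual, residual))
    sir = 10 * math.log10(target_energy / dot(e_interf, e_interf))
    return sdr, sir


# Toy signals (hypothetical), chosen to be mutually orthogonal.
vocals = [1.0, 0.0, 1.0, 0.0]
accompaniment = [0.0, 1.0, 0.0, -1.0]
distortion = [1.0, 0.0, -1.0, 0.0]

# Estimate A: no artifacts, but a lot of leaked accompaniment.
est_a = [v + 0.5 * a for v, a in zip(vocals, accompaniment)]
# Estimate B: little leakage, but added distortion.
est_b = [v + 0.1 * a + 0.6 * d
         for v, a, d in zip(vocals, accompaniment, distortion)]

sdr_a, sir_a = sdr_sir(est_a, vocals, accompaniment)
sdr_b, sir_b = sdr_sir(est_b, vocals, accompaniment)
print(f"A: SDR={sdr_a:.2f} dB, SIR={sir_a:.2f} dB")
print(f"B: SDR={sdr_b:.2f} dB, SIR={sir_b:.2f} dB")
```

In this toy setup, estimate B leaks less accompaniment but carries more distortion than estimate A, so it scores a higher SIR and a lower SDR, which is the same qualitative pattern described above.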

Audio demo on the evaluation subset (please wait for the audio files to load):

Name Mix True Source GRU-RIS-L Ours
Schoolboy Fascination
Back from the Start
The English Actor
Come Around
Kaathaadi
What Have You Done to Me
Rothko

All other music samples used by Mimilakis et al., ICASSP 2018: (please wait for the audio files to load)

Name Mix True Source GRU-RIS-L Ours
Hexvessel - Preacher's Orchard Unavailable
Igorrr - Tout Petit Moineau Unavailable
Kanute - Not Sleeping Unavailable
La Coka Nostra - That's Coke Unavailable
Trentemoller - Even Though You Are with Another Girl Unavailable