These are audio examples from the NUS-48E dataset [1], on which the system was trained and evaluated.
| | Original vocals, re-synthesized using the WORLD vocoder | WGANSing model with L1 loss | NPSS model [2,3] | WGANSing model without L1 loss | WGANSing model with L1 loss, with voice change | WGANSing model with L1 loss, with voice and gender change |
| --- | --- | --- | --- | --- | --- | --- |
| Male Singing Voice | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| Female Singing Voice | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
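The first column is an analysis/re-synthesis reference: the original vocals are decomposed into WORLD vocoder parameters (F0 contour, spectral envelope, aperiodicity) and rendered straight back to audio, which gives an upper bound on the quality any model that predicts WORLD features can reach. Below is a minimal sketch of that copy-synthesis round trip, assuming the `pyworld` and `soundfile` Python packages and a hypothetical input file `vocals.wav`; it is not necessarily the exact pipeline used to produce these examples.

```python
import numpy as np
import pyworld as pw
import soundfile as sf

# Load a vocal take; WORLD expects mono float64 samples.
audio, fs = sf.read("vocals.wav")      # hypothetical file name
if audio.ndim > 1:
    audio = audio.mean(axis=1)         # mix down to mono
audio = audio.astype(np.float64)

# WORLD analysis: F0 contour, spectral envelope, aperiodicity.
f0, t = pw.harvest(audio, fs)          # F0 estimation
sp = pw.cheaptrick(audio, f0, t, fs)   # spectral envelope
ap = pw.d4c(audio, f0, t, fs)          # band aperiodicity

# Re-synthesis from the unmodified parameters ("copy synthesis").
resynth = pw.synthesize(f0, sp, ap, fs)
sf.write("vocals_world_resynth.wav", resynth, fs)
```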
[1] Duan, Zhiyan, et al. "The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech." 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. IEEE, 2013.
[2] Blaauw, Merlijn, and Jordi Bonada. "A Neural Parametric Singing Synthesizer Modeling Timbre and Expression from Natural Songs." Applied Sciences 7.12 (2017): 1313.
[3] Blaauw, Merlijn, et al. "Data efficient voice cloning for neural singing synthesis." 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019.