How do you implement SVoice?
I'm trying to use Facebook's SVoice to separate the different speakers in my audio file using Python. I found the implementation here:
https://github.com/facebookresearch/svoice
However, I'm having trouble running it. The README explains how to train on your own dataset, which I can't really do since I don't have the noises parsed out in my own audio files. It also says I can separate my own file using one of the models in the models folder, but when I follow the README and try to create a model from the toy dataset, I get the following error:
File "/mnt/c/Users/imrea/PycharmProjects/svoice/svoice/data/audio.py", line 34, in find_audio_files
siginfo, _ = torchaudio.info(file)
TypeError: cannot unpack non-iterable AudioMetaData object
How do I run this to test the output on an audio file of my own? Has anyone used this before? Any guidance would be greatly appreciated!
Solution 1:[1]
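You need torchaudio version 0.6.0. In that release torchaudio.info returns a (signal info, encoding info) tuple, which is what svoice unpacks on line 34 of audio.py; newer releases return a single AudioMetaData object, hence the TypeError. Try:
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 torchaudio==0.6.0 -f https://download.pytorch.org/whl/torch_stable.html
This worked for me.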
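If pinning old versions is not an option, another route might be to patch the failing line so it accepts either return shape of torchaudio.info. The sketch below is only an illustration, not the svoice authors' code: `frames_per_channel` is a hypothetical helper, the legacy attribute names (`length`, `channels`) and the `length // channels` computation are my assumption about what the 0.6.0-style signal info exposes and how svoice uses it, and `num_frames` is the field on the newer AudioMetaData object. Stand-in objects are used so it runs without torchaudio installed.

```python
from types import SimpleNamespace


def frames_per_channel(info):
    """Return the per-channel frame count from a torchaudio.info() result.

    Hypothetical helper: handles both the legacy tuple and the newer
    single-object return shape.
    """
    if isinstance(info, tuple):
        # Legacy (0.6.0-style) result: (signal info, encoding info).
        # Assumption: `length` counts samples across all channels.
        siginfo = info[0]
        return siginfo.length // siginfo.channels
    # Newer result: one AudioMetaData-like object with per-channel frames.
    return info.num_frames


# Stand-ins for the two return shapes (not real torchaudio objects).
legacy = (SimpleNamespace(length=32000, channels=2, rate=16000), None)
modern = SimpleNamespace(num_frames=16000, num_channels=2, sample_rate=16000)

print(frames_per_channel(legacy))  # 16000
print(frames_per_channel(modern))  # 16000
```

In a local copy of svoice/data/audio.py, the idea would be to call something like `frames_per_channel(torchaudio.info(file))` in place of the tuple unpacking. Other parts of the repo may depend on the old torchaudio API as well, so pinning the versions as above is the safer fix.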
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | cchoi1022 |
