Multiple Speaker Voices #56

Tenidus · 2024-10-23T21:22:14Z

First, I have to say that this is absolutely fantastic. I've tried many, many different TTS models with RVC and this just works. Not only that, it sounds great. I used to train in XTTS-FineTune, then train in RVC, then run the TTS and import the result into RVC. This saves me a step and simplifies things dramatically. And the ability to import EPUB, PDF, etc. is fantastic! So I greatly appreciate this project.

I do have a question. AllTalk supports 'Character' and 'Narrator' TTS voices: if a character's speech/text is enclosed in "" it switches voices, and if it is enclosed in ** it uses a different voice.
Is that something that might be integrated into this project? In addition, could it support more than 2 different voices? I have written several stories that have 3-6 characters, and I run them through this individually, but it would obviously save time to be able to use different character voices that are marked by various symbols, as AllTalk does (though it only supports 2).
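The delimiter scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how AllTalk or this project implements it: the function name `split_by_voice`, the voice labels, and the choice of `~` as a third delimiter are all assumptions made for the example.

```python
import re

# Hypothetical mapping from delimiter character to a voice label.
# Add more entries to support more than two character voices.
VOICE_BY_DELIMITER = {'"': "character_1", "*": "character_2", "~": "character_3"}


def split_by_voice(text, mapping=VOICE_BY_DELIMITER, default="narrator"):
    """Split text into (voice, text) segments based on enclosing delimiters."""
    voices = list(mapping.values())
    # One alternative per delimiter, each with exactly one capture group,
    # so match.lastindex tells us which delimiter matched.
    parts = []
    for delim in mapping:
        d = re.escape(delim)
        parts.append(f"{d}([^{d}]*){d}")
    pattern = re.compile("|".join(parts))

    segments = []
    pos = 0
    for match in pattern.finditer(text):
        # Anything between delimited spans goes to the default voice.
        before = text[pos:match.start()].strip()
        if before:
            segments.append((default, before))
        spoken = match.group(match.lastindex).strip()
        if spoken:
            segments.append((voices[match.lastindex - 1], spoken))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append((default, tail))
    return segments
```

Each returned segment could then be sent to whichever model and voice is configured for its label. The sketch only handles straight quotes and single-character delimiters; curly quotes and nested markers would need extra handling.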

Thank you again. I really just wanted to post how fantastic this project is ;)

lukaszliniewicz · 2024-10-23T21:40:50Z

Thanks a lot! I'm actually working on this, but I can't say when it will be ready as it requires some changes to both the UI and text processing (I'd like to include automatic speaker attribution via LLM).

Tenidus · 2024-10-23T23:56:44Z

Thank you for the prompt response! I completely understand and can't imagine what it involves but it's nice to hear you are working on that.

I really appreciate all of the effort you've put into this. I greatly appreciate it all being in 1 installable package, without the need to run multiple installs and conda/python environments. I know it would have been much simpler to put this in Gradio and run it all through the browser, but you've gone to the length of making it super Windows friendly.

Will this support multiple GPUs? I have 2 and TTS didn't like it, stating I needed to run a different command, so I had to modify easy_tts_trainer.py to include:
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

I wasn't sure if that was another possible thing you were looking into, or if you know of a simple fix. It would be really cool if I could either train with both GPUs or run training on 1 GPU while inferring on another at the same time.
Not a huge deal, just curious.
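For the "train on one GPU, infer on the other" idea, one common approach is to start each job as its own process with a different `CUDA_VISIBLE_DEVICES` value. The variable has to be set before CUDA is initialized in that process, which is why the line above needs to run before the model libraries are loaded. A minimal sketch follows; the helper `env_for_gpu` and the inference script name in the comment are hypothetical, and whether this project can run both jobs side by side is untested here.

```python
import os


def env_for_gpu(index, base=None):
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = dict(os.environ if base is None else base)
    # Inside the child process, the selected GPU shows up as device 0.
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env


# Usage sketch (script names are placeholders, not confirmed entry points):
# import subprocess, sys
# trainer = subprocess.Popen([sys.executable, "easy_tts_trainer.py"], env=env_for_gpu(0))
# inference = subprocess.Popen([sys.executable, "hypothetical_inference.py"], env=env_for_gpu(1))
```

This keeps the two jobs isolated on separate GPUs without editing the scripts themselves. It does not make a single training run use both GPUs.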

Thank you again!

lukaszliniewicz · 2024-10-24T01:02:34Z

I will have to look into it; I've never done anything with multiple GPUs. I have to check if Coqui TTS supports it.

JohnF51 · 2024-10-30T13:11:17Z

Sorry for going off-topic, but would it be possible to add keyboard shortcuts for Play, Regenerate, Mark and Play as playlist to the script? It would be very useful for me.

lukaszliniewicz · 2024-10-30T13:16:39Z

There is one so far - the m key marks a sentence, and the right mouse button marks the currently playing and the previous sentence (intended for use when listening but not looking at the interface). I will add the others, sure.

Tenidus · 2024-11-05T17:52:30Z

So I figured out a little workaround for this. I have the entire source file/text generated in a 'narrator' voice, then I mark the alternate speaker's lines, switch the XTTS Model, Speaker Voice and RVC Voice to that speaker's voice, and then Regenerate All. I do this for each speaker and it seems to work very well.
To get the sentences properly segmented, I separate them into paragraphs: each spoken line, even if it's a single word, goes in its own paragraph.
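The paragraph-splitting step of this workaround could be automated before importing the text. The sketch below is only an illustration: the function name `one_utterance_per_paragraph` is made up, it assumes dialogue is marked with straight double quotes, and it assumes a blank line is what separates paragraphs on import.

```python
import re


def one_utterance_per_paragraph(text):
    """Put every double-quoted span and every stretch of narration in its own paragraph."""
    # The capturing group makes re.split keep the quoted spans in the result.
    pieces = re.split(r'("[^"]*")', text)
    paragraphs = [piece.strip() for piece in pieces if piece.strip()]
    return "\n\n".join(paragraphs)
```

With each spoken line isolated like this, marking a speaker's lines for regeneration becomes a matter of selecting whole paragraphs.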

lukaszliniewicz · 2024-11-05T18:12:20Z

Good idea, but a lot of work, unfortunately... I'm still working on a robust solution, and it will probably take another two weeks or so. Thanks for the update.
