Multiple Speaker Voices #56

Open
Tenidus opened this issue Oct 23, 2024 · 7 comments
Comments

@Tenidus

Tenidus commented Oct 23, 2024

First, I have to say that this is absolutely fantastic. I've tried many, many different TTS models with RVC, and this just works. Not only that, it sounds great. I used to train in XTTS-FineTune, then train in RVC, then run the TTS and import the result into RVC. This saves me a step and simplifies things dramatically. And the ability to import epub, pdf, etc. is fantastic! So I greatly appreciate this project.

I do have a question... AllTalk supports 'Character' and 'Narrator' TTS voices. If a character's speech is enclosed in "" it switches voices, and if it is enclosed in ** it uses a different voice.
Is that something that might be integrated into this project? In addition, could it support more than 2 different voices? I have written several stories that have 3-6 characters, and I run them through this individually, but it would obviously save time to be able to use different character voices that are marked by various symbols, like AllTalk does (but it only supports 2).

Thank you again. I really just wanted to post how fantastic this project is ;)
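
For readers following the thread, the marker-based switching described above can be sketched roughly as follows. This is only an illustration of the idea, not how AllTalk or this project implements it: the marker table, the voice names, and the `split_by_voice` function are all hypothetical, and it assumes straight quotes and symmetric delimiters.

```python
import re

# Hypothetical marker-to-voice table; delimiters and voice names are illustrative only.
VOICE_MARKERS = {
    '"': "character_1",
    "*": "character_2",
    "~": "character_3",
}
DEFAULT_VOICE = "narrator"


def split_by_voice(text, markers=VOICE_MARKERS, default=DEFAULT_VOICE):
    """Return a list of (voice, segment) pairs in reading order."""
    # One named group per delimiter, e.g. "(?P<g0>.+?)" | \*(?P<g1>.+?)\*
    pattern = "|".join(
        f"{re.escape(m)}(?P<g{i}>.+?){re.escape(m)}" for i, m in enumerate(markers)
    )
    voices = list(markers.values())
    segments = []
    pos = 0
    for match in re.finditer(pattern, text, flags=re.DOTALL):
        # Anything between marked spans is read by the default voice.
        before = text[pos:match.start()].strip()
        if before:
            segments.append((default, before))
        # lastgroup is the name of the delimiter group that matched.
        index = int(match.lastgroup[1:])
        segments.append((voices[index], match.group(match.lastgroup).strip()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append((default, tail))
    return segments


if __name__ == "__main__":
    sample = 'She turned. "Who is there?" *Nobody,* came the reply.'
    for voice, segment in split_by_voice(sample):
        print(f"{voice}: {segment}")
```

Adding a voice is a matter of adding another entry to the table, which is how a scheme like this could go beyond two voices. Curly quotes, nested markers, and asterisks used for emphasis would need extra handling.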

@lukaszliniewicz
Owner

lukaszliniewicz commented Oct 23, 2024

Thanks a lot! I'm actually working on this, but I can't say when it will be ready as it requires some changes to both the UI and text processing (I'd like to include automatic speaker attribution via LLM).

@Tenidus
Author

Tenidus commented Oct 23, 2024

Thank you for the prompt response! I completely understand, and I can't imagine what it involves, but it's nice to hear you are working on that.

I really appreciate all of the effort you've put into this. I greatly appreciate it all being in one installable package, without the need to run multiple installs and conda/python environments. I know it would have been much simpler to put this in Gradio and run it all through the browser, but you've gone to the length of making it super Windows-friendly.

Will this support multiple GPUs? I have 2 and TTS didn't like it, stating that I needed to run a different command, so I had to modify easy_tts_trainer.py to include:
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

I wasn't sure if that was another thing you were looking into, or if you know of a simple fix. It would be really cool if I could either train with both GPUs or run training on 1 GPU while inferring on another at the same time.
Not a huge deal, just curious.

Thank you again!
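
A note on the `CUDA_VISIBLE_DEVICES` line above: the variable only takes effect if it is set before CUDA is initialized in the process, so it is usually set at the very top of a script or in the environment of the process that launches it. The sketch below shows the second approach, which is also one way to pin separate processes to separate GPUs. The helper names `env_for_gpu` and `launch_on_gpu` are hypothetical, and whether the trainer can use both GPUs at once is not addressed here.

```python
import os
import subprocess
import sys


def env_for_gpu(gpu_index):
    """Copy the current environment and expose only one GPU to the child process."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_on_gpu(command, gpu_index):
    """Start a command (a list of arguments) that can see only the given GPU."""
    return subprocess.Popen(command, env=env_for_gpu(gpu_index))


if __name__ == "__main__":
    # Stand-in child command that just reports which GPU index it was given.
    # In practice this could be a training script on GPU 0 and a separate
    # inference process on GPU 1 (hypothetical usage, not tested with this project).
    report = [sys.executable, "-c",
              "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    for gpu in (0, 1):
        launch_on_gpu(report, gpu).wait()
```

Inside each child process the single visible GPU appears as device 0, regardless of its physical index.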

@lukaszliniewicz
Owner

I will have to look into it; I've never done anything with multiple GPUs. I have to check whether Coqui TTS supports it.

@JohnF51

JohnF51 commented Oct 30, 2024

Sorry for going off-topic, but would it be possible to add keyboard shortcuts for Play, Regenerate, Mark and Play as playlist to the script? It would be very useful for me.

@lukaszliniewicz
Owner

There is one so far - the m key marks a sentence, and the right mouse button marks the currently playing and the previous sentence (intended for use when listening but not looking at the interface). I will add the others, sure.

@Tenidus
Author

Tenidus commented Nov 5, 2024

So I figured out a little workaround for this. I have the entire source text generated in a 'narrator' voice, then I mark the lines spoken by another character, switch the XTTS Model, Speaker Voice and RVC Voice to that speaker's voice, and then Regenerate All. I do this for each speaker and it seems to work very well.
To get the sentences properly segmented, I put each spoken line in its own paragraph, even if it's a single word.
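
The bookkeeping for this workaround (which paragraphs to mark for each speaker's pass) can be planned ahead of time. The sketch below assumes a purely hypothetical convention where each spoken paragraph in the draft starts with a `[Name]` tag; the tag format and the `plan_regeneration` function are illustrative and not part of this project.

```python
import re
from collections import defaultdict

# Hypothetical convention: a spoken paragraph starts with "[Speaker]".
TAG = re.compile(r"^\[(\w+)\]\s*(.*)$", flags=re.DOTALL)


def plan_regeneration(text, default="narrator"):
    """Return (clean paragraphs, {speaker: [paragraph indices to mark]})."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    cleaned = []
    passes = defaultdict(list)
    for index, paragraph in enumerate(paragraphs):
        match = TAG.match(paragraph)
        if match:
            speaker, body = match.group(1), match.group(2)
        else:
            speaker, body = default, paragraph
        cleaned.append(body)
        # Narrator paragraphs are covered by the first full generation pass.
        if speaker != default:
            passes[speaker].append(index)
    return cleaned, dict(passes)


if __name__ == "__main__":
    draft = "The door creaked.\n\n[Anna] Hello?\n\nSilence.\n\n[Ben] Yes.\n\n[Anna] Oh."
    cleaned, passes = plan_regeneration(draft)
    print("\n\n".join(cleaned))  # tag-free text to generate in the narrator voice
    for speaker, indices in passes.items():
        print(speaker, "-> mark paragraphs", indices)
```

The tag-free text is what would be generated first in the narrator voice; the index lists show which paragraphs to mark before switching voices and regenerating for each speaker.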

@lukaszliniewicz
Owner

Good idea, but a lot of work, unfortunately... I'm still working on a robust solution, and it will probably take another two weeks or so. Thanks for the update.

3 participants