Support for Other TTS Engines? #93

Open
rjDipcord opened this issue Jan 13, 2024 · 4 comments
Labels
Type: Feature Request This is a feature request.

Comments

@rjDipcord

⚡ Describe the New Feature

Support for alternative TTS engines would be amazing, especially if users could host their own TTS engine and point the bot to it. That would resolve API request cost issues and also provide a much wider range of available voices.
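
To make the "host your own engine and point the bot to it" idea concrete, here is a minimal sketch of what a pluggable engine could look like. Everything in it is an assumption for illustration: the `TTSEngine` interface, the `SelfHostedEngine` class, the `/api/tts` path, and the `text` query parameter are hypothetical and not taken from the bot or from any particular TTS server.

```python
from dataclasses import dataclass
from typing import Protocol
from urllib.parse import urlencode
from urllib.request import urlopen


class TTSEngine(Protocol):
    """Hypothetical interface: any engine that turns text into audio bytes."""

    def synthesize(self, text: str) -> bytes: ...


@dataclass
class SelfHostedEngine:
    """Hypothetical engine that calls a user-hosted HTTP TTS server."""

    base_url: str            # e.g. "http://localhost:5002", set by the user
    path: str = "/api/tts"   # assumed endpoint path; depends on the server
    timeout: float = 30.0    # seconds to wait for the server to respond

    def request_url(self, text: str) -> str:
        # URL-encode the text so spaces and symbols survive the query string.
        query = urlencode({"text": text})
        return f"{self.base_url.rstrip('/')}{self.path}?{query}"

    def synthesize(self, text: str) -> bytes:
        # Plain GET request; the response body is assumed to be the audio.
        with urlopen(self.request_url(text), timeout=self.timeout) as resp:
            return resp.read()
```

With something like this, switching engines would only mean changing the configured `base_url` (and possibly `path`), while the rest of the bot keeps calling `synthesize(text)`.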

@rjDipcord added the Type: Feature Request label on Jan 13, 2024
@moonstar-x
Owner

Hey there, do you have an example of what TTS engines could be self-hosted?

At one point I was considering writing an HTTP wrapper for macOS's say command, but it would only work on a Mac.

I believe there are some other TTS binaries available for Linux too, but I haven't researched them enough.
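
For reference, an HTTP wrapper around `say` could be sketched roughly as below. This is only an illustration under assumptions, not something from the bot: the `build_say_command`, `SayHandler`, and `serve` names, the `?text=`/`?voice=` query parameters, and the port are all made up here. It relies on the `-o` (output file) and `-v` (voice) options of `say`, and it will only produce audio on macOS.

```python
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from typing import List, Optional
from urllib.parse import parse_qs, urlparse


def build_say_command(text: str, out_path: str, voice: Optional[str] = None) -> List[str]:
    """Build the argv for macOS `say`, writing AIFF audio to out_path."""
    cmd = ["say", "-o", out_path]
    if voice:
        cmd += ["-v", voice]
    cmd.append(text)
    return cmd


class SayHandler(BaseHTTPRequestHandler):
    """Handles GET /?text=...&voice=... and responds with AIFF audio."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        text = query.get("text", [""])[0]
        voice = query.get("voice", [None])[0]
        if not text:
            self.send_error(400, "missing text parameter")
            return
        with tempfile.TemporaryDirectory() as tmp:
            out_path = str(Path(tmp) / "speech.aiff")
            try:
                # Passing a list (no shell) avoids shell injection via the text.
                subprocess.run(build_say_command(text, out_path, voice), check=True)
                audio = Path(out_path).read_bytes()
            except (subprocess.CalledProcessError, OSError):
                # `say` failed or is not installed (e.g. not running on macOS).
                self.send_error(500, "speech synthesis failed")
                return
        self.send_response(200)
        self.send_header("Content-Type", "audio/aiff")
        self.send_header("Content-Length", str(len(audio)))
        self.end_headers()
        self.wfile.write(audio)


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Start the wrapper; blocks until interrupted."""
    HTTPServer((host, port), SayHandler).serve_forever()
```

Calling `serve()` on a Mac and then requesting `http://127.0.0.1:8080/?text=hello` should return the spoken audio as an AIFF file, which a bot would still need to convert to whatever format it streams.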

@rjDipcord
Author

> Hey there, do you have an example of what TTS engines could be self-hosted?

Sure, two examples would be ElevenLabs TTS (https://elevenlabs.io/) and FakeYou (https://fakeyou.com/). Perhaps even Amazon Polly.

As for self-hosted options, two of the most used are Mimic-3 (https://mycroft.ai/mimic-3/) and Coqui AI TTS (https://github.com/coqui-ai/TTS).

@moonstar-x
Owner

Hmm, I checked the links you suggested.

Correct me if I'm wrong, but ElevenLabs and FakeYou don't seem to be self-hostable, right?

I remember checking out FakeYou over 2 years ago. I tried to implement an interface for it, but I quickly hit a rate limit after fewer than 10 TTS attempts, so I gave up. Not only that, but it took an enormous amount of time to generate the voices, at least in the free version.

As for the Mycroft one, it sounds like you need some specialized hardware? I didn't look into it very closely, so I may be wrong about this one.

In the case of Coqui, I remember seeing it some time ago too. I tried running it once on my machine and the performance on CPU was very bad. That might be down to my CPU, since I host the bot on an i3-4170 and I don't have a GPU for that server either.

I have thought of integrating some generative TTS service for this but I have yet to find one that can be used for free or doesn't require specific hardware.

I wouldn't mind working to support any of these, at least the Coqui one, but I don't think I have a way to test it with my current server.

@rjDipcord
Author

> I wouldn't mind working to support any of these, at least the Coqui one, but I don't think I have a way to test it with my current server.

If you want to make this a lower priority, that's cool. I submitted the first two just as examples of services that could be supported, although they are subscription-based and not self-hostable.

Some of us, however, do have the hardware to self-host a solution that could perform well. I don't have the programming experience to contribute that way, but if you need someone to test a branch, DM me.
