
GPU lanes requirement - will 4x do? #122


Bus utilization for Willow and WIS use cases is extremely low. What you effectively end up doing is passing a few seconds' worth of audio to the model on the GPU and potentially getting a few seconds of audio back from the TTS model (if configured). We also aggressively cache TTS responses, so any invocation after the first one for a matching text string is served by the nginx frontend proxy and never even reaches WIS itself. In practice these cached responses are nearly instant.

While all of my cards are full x16 PCIe 3.0/4.0, I doubt that even x1 PCIe of any version would make much of a difference, but we'd love to see performance stats from this configuration to validate this theory!
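To put rough numbers on that theory, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration rather than a measurement of WIS: the audio format (16 kHz, 16-bit mono), the 5 second clip length, and the helper names `clip_bytes` and `transfer_ms` are hypothetical. The per-lane figures are the commonly quoted approximate usable rates for PCIe 3.0 and 4.0.

```python
# Back-of-the-envelope estimate only; every figure here is an assumption,
# not a measurement of Willow or WIS.

SAMPLE_RATE_HZ = 16_000   # assumed: 16 kHz mono speech audio
BYTES_PER_SAMPLE = 2      # assumed: 16-bit PCM
CLIP_SECONDS = 5          # assumed stand-in for "a few seconds of audio"

# Approximate usable bandwidth per PCIe lane, in bytes per second
# (8 GT/s and 16 GT/s with 128b/130b encoding).
PCIE_LANE_BYTES_PER_S = {"3.0": 985e6, "4.0": 1969e6}


def clip_bytes(seconds=CLIP_SECONDS):
    """Size of an uncompressed audio clip in bytes."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * seconds


def transfer_ms(num_bytes, gen, lanes):
    """Idealized time to move num_bytes over the link, in milliseconds.

    Ignores per-transfer latency and protocol overhead beyond line encoding.
    """
    return num_bytes / (PCIE_LANE_BYTES_PER_S[gen] * lanes) * 1000


if __name__ == "__main__":
    size = clip_bytes()
    print(f"clip size: {size / 1000:.0f} kB")
    for gen in ("3.0", "4.0"):
        for lanes in (1, 4, 16):
            ms = transfer_ms(size, gen, lanes)
            print(f"PCIe {gen} x{lanes}: {ms:.4f} ms")
```

Under these assumptions a clip is about 160 kB, and even PCIe 3.0 x1 moves it in well under a millisecond, which is small next to typical inference time. The sketch does not cover the one-time cost of loading model weights onto the GPU at startup, which does scale with link width, so real benchmarks are still needed to confirm it.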

Just …

Replies: 2 comments 5 replies

Answer selected by ashishpandey
@kristiankielhofner
@jhergeth
@ashishpandey
@jhergeth
@kristiankielhofner
Category
Q&A
Labels
None yet
3 participants