You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I would like to use my KoboldCPP fork ( https://github.com/Nexesenex/croco.cpp ) for inference with Lollms (with quantized KV cache, MMQ kernels for Cuda, Tensor split, specific modifications, etc), and I'd need one of these two ways, that SillyTavern offers in order to use your impressive work with the full potential of my hardware :
Text completion:
Chat completion:
If such thing is already possible right now in Lollms, forgive me, I'm a beginner with Lollms.
If not, such access to custom inference software / OAI compatible API would be a great addition to your software, because the LlamaCPP Python bindings or Ollama are too restricted, both by themselves and with the settings you allow the access to for most users to not be drowned) in the possibilities they offer for some Enthusiasts users like me.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
Hey Saifeddine.
I would like to use my KoboldCPP fork ( https://github.com/Nexesenex/croco.cpp ) for inference with Lollms (with quantized KV cache, MMQ kernels for Cuda, Tensor split, specific modifications, etc), and I'd need one of these two ways, that SillyTavern offers in order to use your impressive work with the full potential of my hardware :
Text completion:
Chat completion:
If such thing is already possible right now in Lollms, forgive me, I'm a beginner with Lollms.
If not, such access to custom inference software / OAI compatible API would be a great addition to your software, because the LlamaCPP Python bindings or Ollama are too restricted, both by themselves and with the settings you allow the access to for most users to not be drowned) in the possibilities they offer for some Enthusiasts users like me.
Best regards from Nantes!
Beta Was this translation helpful? Give feedback.
All reactions