ggllm.cpp is a fork of llama.cpp that supports running Falcon-architecture models, and it is getting quite good.
I think it would be worthwhile to either maintain a branch for it or add native support in this tool.
I attempted to run it by symlinking the appropriate files and had some success, so it may not be a heavy lift for someone more familiar with the architecture of this API. The main problem seemed to be that ggllm.cpp emits its end-of-stream tokens and closes the stream differently than llama.cpp does, which caused the model to unload. Another issue is that some of the needed arguments are not supported, for example the prompt-ingestion batch size.
To get it to load, I simply built as normal, symlinked the ggllm.cpp folder to llama.cpp, symlinked falcon_main to main inside that folder, and then fired it up with chatbot-ui and the appropriate model and arguments.
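For reference, the workaround looked roughly like this; it's a sketch of my setup, and the repo URL, build step, and binary locations may differ on yours:

```sh
# Clone and build ggllm.cpp as usual (assumes a make-based build; adjust to your setup)
git clone https://github.com/cmp-nct/ggllm.cpp
cd ggllm.cpp && make && cd ..

# Make the tool pick up ggllm.cpp where it expects llama.cpp
ln -s ggllm.cpp llama.cpp

# Inside the (symlinked) folder, expose falcon_main under the name the tool expects
ln -s falcon_main llama.cpp/main
```

After that, I launched the tool as usual with chatbot-ui, pointing it at a Falcon GGML model and the usual arguments.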
Any thoughts here?