LMDeploy Release V0.0.7

lvhan028 released this 04 Sep 06:39
· 855 commits to main since this release
d065f3e

Highlights

  • FlashAttention-2 is now supported, boosting context decoding speed by approximately 45%
  • Token_id decoding has been optimized for better efficiency
  • The GEMM tuning script is now included in the PyPI package

What's Changed

🚀 Features

💥 Improvements

🐞 Bug fixes

📚 Documentation

Full Changelog: v0.0.6...v0.0.7