bump version to v0.6.3 (#2754)
* bump version to v0.6.3

* update supported models
lvhan028 authored Nov 16, 2024
1 parent 9ecc44a commit 0c80baa
Showing 10 changed files with 14 additions and 29 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -167,6 +167,7 @@ For detailed inference benchmarks in more devices and more settings, please refe
<li>Phi-3.5-vision (4.2B)</li>
<li>GLM-4V (9B)</li>
<li>Llama3.2-vision (11B, 90B)</li>
<li>Molmo (7B-D, 72B)</li>
</ul>
</td>
</tr>
1 change: 1 addition & 0 deletions README_ja.md
@@ -163,6 +163,7 @@ The LMDeploy TurboMind engine has outstanding inference capability, and for models of various…
<li>Phi-3.5-vision (4.2B)</li>
<li>GLM-4V (9B)</li>
<li>Llama3.2-vision (11B, 90B)</li>
<li>Molmo (7B-D, 72B)</li>
</ul>
</td>
</tr>
1 change: 1 addition & 0 deletions README_zh-CN.md
@@ -168,6 +168,7 @@ The LMDeploy TurboMind engine has outstanding inference capability; on models of various scales…
<li>Phi-3.5-vision (4.2B)</li>
<li>GLM-4V (9B)</li>
<li>Llama3.2-vision (11B, 90B)</li>
<li>Molmo (7B-D, 72B)</li>
</ul>
</td>
</tr>
2 changes: 1 addition & 1 deletion docs/en/get_started/installation.md
@@ -23,7 +23,7 @@ pip install lmdeploy
The default prebuilt package is compiled on **CUDA 12**. If CUDA 11+ (>=11.3) is required, you can install lmdeploy by running:

```shell
export LMDEPLOY_VERSION=0.6.2
export LMDEPLOY_VERSION=0.6.3
export PYTHON_VERSION=38
pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
```
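
After installation, a quick sanity check is to import the package and print its version; the output should match the `LMDEPLOY_VERSION` set above (a minimal sketch, assuming the install succeeded):

```python
# Minimal post-install check: lmdeploy/version.py defines __version__,
# so this should print 0.6.3 for this release.
import lmdeploy

print(lmdeploy.__version__)
```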
16 changes: 3 additions & 13 deletions docs/en/multi_modal/vl_pipeline.md
@@ -2,24 +2,14 @@

LMDeploy abstracts the complex inference process of multi-modal Vision-Language Models (VLM) into an easy-to-use pipeline, similar to the Large Language Model (LLM) inference [pipeline](../llm/pipeline.md).

Currently, it supports the following models.

- [Qwen-VL-Chat](https://huggingface.co/Qwen/Qwen-VL-Chat)
- LLaVA series: [v1.5](https://huggingface.co/collections/liuhaotian/llava-15-653aac15d994e992e2677a7e), [v1.6](https://huggingface.co/collections/liuhaotian/llava-16-65b9e40155f60fd046a5ccf2)
- [Yi-VL](https://huggingface.co/01-ai/Yi-VL-6B)
- [DeepSeek-VL](https://huggingface.co/deepseek-ai/deepseek-vl-7b-chat)
- [InternVL](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)
- [Mono-InternVL](https://huggingface.co/OpenGVLab/Mono-InternVL-2B)
- [MGM](https://huggingface.co/YanweiLi/MGM-7B)
- [XComposer](https://huggingface.co/internlm/internlm-xcomposer2-vl-7b)
- [CogVLM](https://github.com/InternLM/lmdeploy/tree/main/docs/en/multi_modal/cogvlm.md)

We genuinely invite the community to contribute new VLM support to LMDeploy. Your involvement is truly appreciated.
The supported models are listed [here](../supported_models/supported_models.md). We genuinely invite the community to contribute new VLM support to LMDeploy. Your involvement is truly appreciated.

This article showcases the VLM pipeline using the [liuhaotian/llava-v1.6-vicuna-7b](https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b) model as a case study.
You'll learn about the simplest ways to leverage the pipeline and how to gradually unlock more advanced features by adjusting engine parameters and generation arguments, such as tensor parallelism, context window sizing, random sampling, and chat template customization.
Moreover, we provide practical inference examples tailored to scenarios such as multiple images and batch prompts.
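
As a preview of those knobs, the sketch below configures tensor parallelism, the context window, and random sampling together (the values are illustrative, and it assumes lmdeploy's `TurbomindEngineConfig` and `GenerationConfig` APIs):

```python
from lmdeploy import GenerationConfig, TurbomindEngineConfig, pipeline

# Sketch: tp=2 shards the model across two GPUs; session_len bounds the
# context window. Both values are illustrative, not recommendations.
pipe = pipeline('liuhaotian/llava-v1.6-vicuna-7b',
                backend_config=TurbomindEngineConfig(tp=2, session_len=8192))

# Random sampling is controlled per request via the generation config.
gen_config = GenerationConfig(temperature=0.6, top_p=0.8, top_k=40)
response = pipe('Describe tensor parallelism in one sentence.',
                gen_config=gen_config)
print(response)
```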

Using the pipeline interface to infer other VLM models is similar, with the main difference being each model's configuration and installation dependencies. You can read [here](https://lmdeploy.readthedocs.io/en/latest/multi_modal/index.html) about the environment setup and configuration methods for different models.

## A 'Hello, world' example

```python
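# A minimal sketch of the basic VLM pipeline usage (the image URL is
# illustrative; any reachable URL or local path works).
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('liuhaotian/llava-v1.6-vicuna-7b')

image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
response = pipe(('describe this image', image))
print(response)
```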
1 change: 1 addition & 0 deletions docs/en/supported_models/supported_models.md
@@ -36,6 +36,7 @@ The following tables detail the models supported by LMDeploy's TurboMind engine
| MiniGeminiLlama | 7B | MLLM | Yes | - | - | Yes |
| GLM4 | 9B | LLM | Yes | Yes | Yes | Yes |
| CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - |
| Molmo | 7B-D, 72B | MLLM | Yes | Yes | Yes | No |

"-" means not verified yet.

2 changes: 1 addition & 1 deletion docs/zh_cn/get_started/installation.md
@@ -23,7 +23,7 @@ pip install lmdeploy
The default prebuilt package is compiled on **CUDA 12**. If CUDA 11+ (>=11.3) is required, you can install lmdeploy with the following commands:

```shell
export LMDEPLOY_VERSION=0.6.2
export LMDEPLOY_VERSION=0.6.3
export PYTHON_VERSION=38
pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
```
16 changes: 3 additions & 13 deletions docs/zh_cn/multi_modal/vl_pipeline.md
@@ -2,24 +2,14 @@

LMDeploy abstracts the complex inference process of vision-language models (VLM) into an easy-to-use pipeline. Its usage is similar to the large language model (LLM) inference [pipeline](../llm/pipeline.md).

Currently, the VLM pipeline supports the following models:

- [Qwen-VL-Chat](https://huggingface.co/Qwen/Qwen-VL-Chat)
- LLaVA series: [v1.5](https://huggingface.co/collections/liuhaotian/llava-15-653aac15d994e992e2677a7e), [v1.6](https://huggingface.co/collections/liuhaotian/llava-16-65b9e40155f60fd046a5ccf2)
- [Yi-VL](https://huggingface.co/01-ai/Yi-VL-6B)
- [DeepSeek-VL](https://huggingface.co/deepseek-ai/deepseek-vl-7b-chat)
- [InternVL](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)
- [Mono-InternVL](https://huggingface.co/OpenGVLab/Mono-InternVL-2B)
- [MGM](https://huggingface.co/YanweiLi/MGM-7B)
- [XComposer](https://huggingface.co/internlm/internlm-xcomposer2-vl-7b)
- [CogVLM](https://github.com/InternLM/lmdeploy/tree/main/docs/zh_cn/multi_modal/cogvlm.md)

We sincerely invite the community to add support for more VLM models to LMDeploy.
In [this list](../supported_models/supported_models.md), you can look up the VLM models supported by each inference engine. We sincerely invite the community to add support for more VLM models to LMDeploy.

This article takes the [liuhaotian/llava-v1.6-vicuna-7b](https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b) model as an example to demonstrate the usage of the VLM pipeline. You will learn its most basic usage, and how to gradually unlock more advanced features by adjusting engine parameters and generation arguments, such as tensor parallelism, context window sizing, random sampling, and chat template customization.

In addition, we provide practical inference examples for scenarios such as multiple images and batch prompts.

Using the pipeline interface to infer other VLM models is much the same, with the main difference being each model's configuration and installation dependencies. You can read [here](https://lmdeploy.readthedocs.io/zh-cn/latest/multi_modal/) about the environment setup and configuration methods for different models.

## "Hello, world" 示例

```python
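# A minimal sketch of the basic VLM pipeline usage, mirroring the example
# in the English doc (the image URL is illustrative).
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('liuhaotian/llava-v1.6-vicuna-7b')

image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
response = pipe(('describe this image', image))
print(response)
```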
1 change: 1 addition & 0 deletions docs/zh_cn/supported_models/supported_models.md
@@ -36,6 +36,7 @@
| MiniGeminiLlama | 7B | MLLM | Yes | - | - | Yes |
| GLM4 | 9B | LLM | Yes | Yes | Yes | Yes |
| CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - |
| Molmo | 7B-D, 72B | MLLM | Yes | Yes | Yes | No |

"-" means not verified yet.

2 changes: 1 addition & 1 deletion lmdeploy/version.py
@@ -1,7 +1,7 @@
# Copyright (c) OpenMMLab. All rights reserved.
from typing import Tuple

__version__ = '0.6.2'
__version__ = '0.6.3'
short_version = __version__


