Merge pull request #62 from xrsrke/feature/moe
[Readme] Add contributing guideline
xrsrke authored Dec 11, 2023
2 parents ffc987a + 9565f38 commit 4dc4711
Showing 2 changed files with 10 additions and 7 deletions.
7 changes: 7 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,7 @@
We're building an end-to-end multi-modal MoE that works with 3D parallelism, with pre-training done in a decentralized way as proposed in the paper [DiLoCo](https://arxiv.org/abs/2311.08105).

If you want to contribute, please check the following links:

- High priority tasks [[link]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+label%3A%22High+Priority%22)
- Beginner tasks [[link]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+label%3A%22good+first+issue%22)
- All tasks that need help (including beginner and high-priority tasks) [[link]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22)
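The DiLoCo scheme referenced above alternates many cheap local optimizer steps on each worker with rare synchronizations through an outer optimizer applied to the averaged parameter delta. A minimal single-process sketch of that inner/outer loop, with plain SGD for both optimizers and a hypothetical `grad_fn` callable per worker (DiLoCo itself uses AdamW inner and Nesterov-momentum outer optimizers):

```python
def local_sgd_steps(params, grad_fn, lr, steps):
    # Inner loop: one worker takes `steps` plain SGD steps on its own data shard.
    p = list(params)
    for _ in range(steps):
        g = grad_fn(p)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    return p

def diloco_round(global_params, worker_grad_fns, inner_lr=0.1, inner_steps=20,
                 outer_lr=0.7):
    # Every worker starts the round from the same global parameters.
    finals = [local_sgd_steps(global_params, gf, inner_lr, inner_steps)
              for gf in worker_grad_fns]
    # Outer "pseudo-gradient": average parameter delta across workers.
    deltas = [[gp - fp for gp, fp in zip(global_params, f)] for f in finals]
    avg = [sum(col) / len(deltas) for col in zip(*deltas)]
    # Outer optimizer step (plain SGD here; DiLoCo uses Nesterov momentum).
    return [gp - outer_lr * d for gp, d in zip(global_params, avg)]
```

With two workers whose local losses pull toward different optima, repeated rounds converge to a consensus between them, while only communicating once per `inner_steps` local updates.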
10 changes: 3 additions & 7 deletions README.md
@@ -14,14 +14,9 @@ We're building a library for an end-to-end framework for **training multi-modal
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [[link]](https://arxiv.org/abs/1909.08053)
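The Megatron-LM paper cited above splits a linear layer's weight matrix column-wise across devices: each rank multiplies the full input by its own column shard, and the shard outputs are concatenated (an all-gather in a real distributed run). A minimal single-process sketch with plain Python lists (names are illustrative, not pipegoose's API):

```python
def matmul(a, b):
    # Naive (m x k) @ (k x n) matrix multiply on nested lists.
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def column_parallel_linear(x, weight, world_size):
    # Megatron-style column parallelism: split the weight's output columns
    # across `world_size` ranks; each rank computes X @ W_shard locally.
    n_cols = len(weight[0])
    shard = n_cols // world_size
    outs = []
    for rank in range(world_size):  # simulate each rank's local matmul
        w_shard = [row[rank * shard:(rank + 1) * shard] for row in weight]
        outs.append(matmul(x, w_shard))
    # Concatenate shard outputs column-wise (all-gather in a real setup).
    return [sum((o[i] for o in outs), []) for i in range(len(x))]
```

Because each shard touches disjoint output columns, the concatenated result is exactly equal to the unsharded `X @ W`.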


**If you're interested in contributing, check out [[CONTRIBUTING.md]](./CONTRIBUTING.md) [[good first issue]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) [[roadmap]](https://github.com/users/xrsrke/projects/5). Come join us: [[discord link]](https://discord.gg/s9ZS9VXZ3p)**

⚠️ **The project is actively under development, and we're actively seeking collaborators. Come join us: [[discord link]](https://discord.gg/s9ZS9VXZ3p) [[roadmap]](https://github.com/users/xrsrke/projects/5) [[good first issue]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)**

⚠️ **The APIs are still a work in progress and could change at any time. None of the public APIs are set in stone until we hit version 0.6.9.**

⚠️ **Currently, only parallelizing `bloom-560m` is supported. Support for hybrid 3D parallelism and a distributed optimizer for 🤗 `transformers` will be available in the upcoming weeks (it's basically done, but it doesn't support 🤗 `transformers` yet)**

⚠️ **This library is underperforming when compared to Megatron-LM and DeepSpeed (and not even achieving reasonable performance yet).**
⚠️ **Currently, only parallelizing `transformers`'s `bloom` is supported.**

```diff
from torch.utils.data import DataLoader
```

@@ -94,6 +89,7 @@ We did a small-scale correctness test by comparing the validation losses between
- ~~Tensor Parallelism [[link]](https://wandb.ai/xariusdrake/pipegoose/runs/iz17f50n)~~ (We've found a bug in convergence, and we are fixing it)
- ~~Hybrid 2D Parallelism (TP+DP) [[link]](https://wandb.ai/xariusdrake/pipegoose/runs/us31p3q1)~~
- Distributed Optimizer ZeRO-1 Convergence: [[sgd link]](https://wandb.ai/xariusdrake/pipegoose/runs/fn4t9as4?workspace) [[adam link]](https://wandb.ai/xariusdrake/pipegoose/runs/yn4m2sky)
- Mixture of Experts [[link]](https://wandb.ai/xariusdrake/pipegoose/jobs/QXJ0aWZhY3RDb2xsZWN0aW9uOjExOTU2MTU5MA==/version_details/v20)
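The ZeRO-1 optimizer in the list above shards only the optimizer state (not parameters or gradients) across data-parallel ranks, so each rank stores and updates just its own slice of that state. A minimal single-process sketch using one momentum buffer per parameter as the sharded state (function and names are illustrative, not pipegoose's API):

```python
def zero1_step(params, grads, momenta, world_size, lr=0.1, beta=0.9):
    # ZeRO stage 1: every rank holds full params and grads, but the
    # optimizer state (`momenta`) is partitioned -- rank r only stores
    # and updates shard r.
    n = len(params)
    shard = (n + world_size - 1) // world_size
    new_params = list(params)
    for rank in range(world_size):  # simulate each rank updating its shard
        for i in range(rank * shard, min((rank + 1) * shard, n)):
            momenta[i] = beta * momenta[i] + grads[i]
            new_params[i] = params[i] - lr * momenta[i]
    # A real implementation would now all-gather the updated shards so
    # every rank ends the step with identical full parameters.
    return new_params
```

Since each momentum entry lives on exactly one rank, the combined update matches unsharded momentum SGD while cutting per-rank optimizer memory by roughly `world_size`.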

**Features**
- End-to-end multi-modal training in 3D parallelism, including distributed CLIP.
