Releases: aqlaboratory/openfold
New Documentation for OpenFold
With this release, we include a new home for OpenFold documentation located at: https://openfold.readthedocs.io/.
We hope that the guides provided in the documentation will help users with common workflows, as well as issues that commonly occur.
A few quality of life changes are also included:
- Adds scripts for creating the OpenFold training set from the datasets that are stored on RODA. We will aim to host the processed datasets on RODA as well in the near future.
- Adds a script for converting OpenFold v1 weights into OpenFold v2 weights, see this page for more info
- Adds `--experiment_config_json` option to both `run_pretrained_openfold.py` and `train_openfold.py` to more easily edit model config settings in `openfold/config.py`
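
To illustrate the idea behind a JSON-based config override, here is a minimal sketch of merging a small JSON document into a nested configuration. It is not OpenFold's implementation: the `apply_overrides` helper, the plain-dict config, and the key layout (`globals`, `chunk_size`, `data`, `max_recycling_iters`) are assumptions made for this example. Only the option name `use_deepspeed_evo_attention` comes from these release notes, and its position in the config is assumed.

```python
import json


def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a copy of `config` with `overrides` merged in recursively.

    Hypothetical helper for illustration only; unknown keys raise an error
    so that a typo in the JSON does not silently create a new setting.
    """
    merged = dict(config)  # shallow copy; nested dicts are rebuilt below
    for key, value in overrides.items():
        if key not in merged:
            raise KeyError(f"Unknown config key: {key}")
        if isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Assumed base config; the real settings live in openfold/config.py.
base_config = {
    "globals": {"use_deepspeed_evo_attention": False, "chunk_size": 4},
    "data": {"max_recycling_iters": 3},
}

# Contents of a hypothetical JSON file passed on the command line.
overrides = json.loads('{"globals": {"use_deepspeed_evo_attention": true}}')

updated = apply_overrides(base_config, overrides)
```

In this sketch only the keys named in the JSON change; every other setting keeps its default, and the base config is left unmodified.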
What's Changed
- Fix distributed seeding behavior by @ljarosch in #418
- Fix resolution field in mmcif_parsing by @ljarosch in #420
- Adds mkl version to environment.yml by @jnwei in #437
- Update multi-chain permutation and permutation unittest by @dingquanyu in #406
- Duplicate expansion support by @ljarosch in #419
- Fix Colab by using OF commit from pl_upgrades by @vaclavhanzl in #432
- Adds Documentation and minor quality of life fixes by @jnwei in #439
New Contributors
Full Changelog: v2.0.0...v.2.1.0
v2.0.0
Major Changes
- SoloSeq inference: Single Sequence Inference using ESM-1b embeddings with template features is now supported. Check out SoloSeq in the README for more information.
- Multimer: Inference in multimer mode using the AlphaFold-Multimer weights is now supported. Check out Multimer in the README for more instructions, or try out multimer inference in the Colab notebook.
- Addition of a custom DeepSpeed DS4Sci_EvoformerAttention kernel for 13X reduced peak device memory requirement, leading to 15% faster training and 4x speedup during inference. Test it out using the `use_deepspeed_evo_attention` option in `openfold/config.py`. More information in the README.
All Changes
- add long_sequence_inference option to command line by @decarboxy in #253
- Fix for CPU only install by @p-durandin in #252
- Obsolete parsing and File not found fix by @l-bick in #264
- Load pretrained jax weights by @l-bick in #263
- Fix `--cpus_per_task` argument in README by @awaelchli in #278
- Add a .gitignore file by @awaelchli in #279
- Add missing warmup_num_steps parameters to DeepSpeed config generation script. by @jonathanking in #290
- Fix notebook: Colab now has python 3.8, fix imports, mitigate UTF-8 glitch by @vaclavhanzl in #285
- New option to output in ModelCIF format instead of PDB format by @josemduarte in #287
- Fix comparison for max_seqlen when downloading CAMEO. by @jonathanking in #292
- Resolving torch cuda availability for cpu-only installation by @zrqiao in #275
- Improve TriangularMultiplicativeUpdate stability in fp16 mode by @nikitos9000 in #295
- Advance python version to 3.9 to build docker in Ubuntu Lunar Lobster by @vaclavhanzl in #309
- add comment about the interpretation of ambiguous atoms by @luwei0917 in #314
- Fix Colab again after ModelCIF merge and python version change in Colab by @vaclavhanzl in #308
- Fix Colab: Install all conda packages together by @vaclavhanzl in #320
- Multimer by @dingquanyu in #331
- Added multi-chain permutation steps, multimer datamodule, and training code for multimer by @dingquanyu in #336
- Fix multimer boolean tensor error by @dingquanyu in #337
- Update multi-chain permutation by @dingquanyu in #343
- Update validation by @dingquanyu in #346
- Fixes cuda/float wrapper error in unit tests by @jnwei in #350
- Fixed inference script in multimer mode by @dingquanyu in #348
- Update multi-chain permutation and training codes by @dingquanyu in #353
- fixed the creation of best_align when permutation is turned off by @dingquanyu in #355
- Single-sequence model by @sachinkadyan7 in #354
- move the kabsch rotation step to gpu by @dingquanyu in #359
- Installation updates by @jnwei in #360
- [DNM] Update Docker config by @mattwthompson in #361
- Bump actions/checkout from 2 to 4 by @dependabot in #366
- Bump actions/setup-python from 2 to 4 by @dependabot in #365
- Fixes imports to colab notebook. by @jnwei in #372
- Adds Soloseq parameter download script. by @jnwei in #373
- Fix for MSA block deletion by @christinaflo in #374
- Adds query_multiple to jackhammer.py by @jnwei in #375
- Deepspeed evoformer attention by @christinaflo in #378
- Speed up data loading process by @dingquanyu in #376
- Update data pipeline by @dingquanyu in #385
- Bump actions/setup-python from 4 to 5 by @dependabot in #380
- Readme changes by @jnwei in #389
- supporting newer numpy by @YoelShoshan in #307
- Change test_compare_model in deepspeed test to use mean instead of max by @jnwei in #396
- Fix Miniforge3 download link in Dockerfile by @controny in #402
- Adding multimer support to OpenFold notebook by @jnwei in #401
- Type fixes and README changes for multimer branch by @jnwei in #404
- Full multimer merge by @christinaflo in #405
New Contributors
- @p-durandin made their first contribution in #252
- @l-bick made their first contribution in #264
- @awaelchli made their first contribution in #278
- @vaclavhanzl made their first contribution in #285
- @zrqiao made their first contribution in #275
- @luwei0917 made their first contribution in #314
- @dingquanyu made their first contribution in #331
- @jnwei made their first contribution in #350
- @mattwthompson made their first contribution in #361
- @dependabot made their first contribution in #366
- @christinaflo made their first contribution in #374
- @YoelShoshan made their first contribution in #307
- @controny made their first contribution in #402
Full Changelog: v1.0.1...v2.0.0
OpenFold v1.0.1
OpenFold as of the release of our manuscript. Many new features, including FP16 training + more stable training.
What's Changed
- use multiple models for inference by @decarboxy in #117
- Update input processing by @brianloyal in #116
- adding a caption to the image in the readme by @decarboxy in #133
- Properly handling file outputs when multiple models are evaluated by @decarboxy in #142
- Fix for issue in download_mgnify.sh by @josemduarte in #166
- Fix tag-sequence mismatch when predicting for multiple fastas by @sdvillal in #164
- Support openmm >= 7.6 by @sdvillal in #163
- Fixing issue in download_uniref90.sh by @josemduarte in #171
- Fix propagation of use_flash for offloaded inference by @epenning in #178
- Update deepspeed version to 0.5.10 by @NZ99 in #185
- Fixes errors when processing .pdb files by @NZ99 in #188
- fix incorrect learning rate warm-up after restarting from ckpt by @Zhang690683220 in #182
- Add opencontainers image-spec to `Dockerfile` by @SauravMaheshkar in #128
- Write inference and relaxation timings to a file by @brianloyal in #201
- Minor fixes in setup scripts by @timodonnell in #202
- Minor optimizations & fixes to support ESMFold by @nikitos9000 in #199
- Drop chains that are missing (structure) data in training by @timodonnell in #210
- adding a script for threading a sequence onto a structure by @decarboxy in #206
- Set pin_memory to True in default dataloader config. by @NZ99 in #212
- Fix missing subtract_plddt argument in prep_output call by @mhrmsn in #217
- fp16 fixes by @beiwang2003 in #222
- Set clamped vs unclamped FAPE for each sample in batch independently by @ar-nowaczynski in #223
- Fix probabilities type (`int` -> `float`) by @atgctg in #225
- Small fix for prep_mmseqs_dbs. by @jonathanking in #232
New Contributors
- @brianloyal made their first contribution in #116
- @josemduarte made their first contribution in #166
- @sdvillal made their first contribution in #164
- @epenning made their first contribution in #178
- @NZ99 made their first contribution in #185
- @Zhang690683220 made their first contribution in #182
- @SauravMaheshkar made their first contribution in #128
- @timodonnell made their first contribution in #202
- @nikitos9000 made their first contribution in #199
- @mhrmsn made their first contribution in #217
- @beiwang2003 made their first contribution in #222
- @ar-nowaczynski made their first contribution in #223
- @atgctg made their first contribution in #225
- @jonathanking made their first contribution in #232
Full Changelog: v1.0.0...v1.0.1
OpenFold v1.0.0
OpenFold at the time of the release of our original model parameters and training database. Adds countless improvements over the previous beta release, including, but not limited to:
- Many bugfixes contribute to stabler, more correct, and more versatile training
- Options to run OpenFold using our original weights
- Custom attention kernels and alternative attention implementations that greatly reduce peak memory usage
- A vastly superior Colab notebook that runs inference many times faster than the original
- Efficient scripts for computation of alignments, including the option to run MMSeqs2's alignment pipeline
- Vastly improved logging during training & inference
- Careful optimizations for significantly improved speeds & memory usage during both inference and training
- Opportunistic optimizations that dynamically speed up inference on short (< ~1500 residues) chains
- Certain changes borrowed from updates made to the AlphaFold repo, including bugfixes, GPU relaxation, etc.
- "AlphaFold-Gap" support allows inference on complexes using OpenFold and AlphaFold weights
- WIP OpenFold-Multimer implementation on the `multimer` branch
- Improved testing for the data pipeline
- Partial CPU offloading extends the upper limit on inference sequence lengths
- Docker support
- Missing features from the original release, including learning rate schedulers, distillation set support, etc.
Full Changelog: v0.1.0...v1.0.0
OpenFold v0.1.0
The initial release of OpenFold.
Full Changelog: https://github.com/aqlaboratory/openfold/commits/v0.1.0