Release v0.3.3 · Oneflow-Inc/oneflow

Op fixes and performance optimizations

[enhancement][op] reduce sum half kernel #4110
[enhancement][op] simplify cosface #4107
[enhancement][op] indexed_slices update support weight_decay #4096
[enhancement][op][python] Migrate swish and mish namespace from math to nn #4104
[enhancement][op] Add elementwise maximum/minimum ops #4069
[enhancement][op] Fix Code format warning in hardswish #4105
[enhancement][feature][op] Add Scalar Pow #4082
[bug][op] Fix bug: mut_shape_view of static output may be null in UserKernel::ForwardShape #4094
[enhancement][op][refactor] Migrate cast_to_static_shape to user op #4095
[feature][op] Add GroupNorm op #4089
[feature][op] Distributed partial sampler #3857
[enhancement][op][python] add Relu6 activation #4029
[bug][op] Rename ont_hot_op.cpp to one_hot_op.cpp #4093
[bug][op][python] fix hardtanh CI precision error #4091
[enhancement][op] add remove_img_without_anno api for COCOReader #4088
[enhancement][op] Add Hardtanh activation #4049
[enhancement][op] Add ELU activation #4065
[enhancement][op][python] Update logsoftmax.py #4041
[documentation][op] Fix in_top_k api document #4079
[enhancement][op] Add Hardswish activation #4059
[enhancement][op][python] Add hard sigmoid #4043
[enhancement][op] Dev in top k #3611
[bug][op] Fix argwhere tmp buffer infer #4061
[enhancement][op] Optimize softmax cuda kernel #4058
[feature][op] Add InstanceNorm 1d & 3d implementation #4052
[feature][op] Quantization aware training related ops #3764
[enhancement][op] Generic unfold kernel implementation #4033
[enhancement][op] User op dim_gather support dynamic input and index #4039
[enhancement][op] Reflection pad2d op #3777
[enhancement][op] slice support empty blob #4025
[bug][enhancement][op] Migrate argwhere to user op #4021
[bug][op] Dev rm old tanh #4035
[enhancement][op][refactor] Make MaxWithLogThreshold and SafeLog header only #4030
[op][purge] Tidy up op_conf.proto #3932
[enhancement][op][python] Dev bcewithlogits loss #4024
[feature][op] Add implementation of InstanceNorm2D op #4020
[enhancement][op][refactor] Refactor gpu_atomic_add #4027
[enhancement][op][python] add kldivloss #4012
[enhancement][op][python] Dev oneflow ones #3990
[enhancement][op] Add flatten/squeeze/expand_dims to auto mixed precision clear list and use reshape instead of reshape_like to do reshape grad computation #4015
[enhancement][op][python] add pixel shuffle #4003
[enhancement][op] Scalar kernels use element-wise template #4013
[enhancement][op][python] add zeros api #3991
[enhancement][op] Optimize ComputeEntropyGpu with CUB #3930
[feature][op] CUDA template for element-wise kernels #4007

System components

[enhancement][system] migrate job_build_and_infer api to pybind11 #3940
[feature][system] quantization aware training pass #3817
[eager][enhancement][system] Mig op arg para attr #4102
[feature][system] Tensor Float 32 Support. #4072
[enhancement][system] Mig op arg para attr #4090
[enhancement][system] Mig py cfg sbp #4086
[enhancement][system] Refactor python remote blob #4081
[enhancement][system] remove BlobDef #4071
[bug][system] Fix warning: moving a local object in a return statement prevents copy elision #4067
[enhancement][system] Refactor python blob desc #4063
[feature][system] Add nvtx range and thread naming #4064
[documentation][enhancement][system] Add docs on installing legacy versions of oneflow #4056
[bug][system] support eager empty blob #4047
[enhancement][system] Add err info for ncclGroupEnd check #4048
[enhancement][system] Optimize dynamic loss scale parameters #4045
[purge][system] Remove col_id #4046
[enhancement][system] Scope with symbol #4040
[enhancement][system] Job desc with symbol #4032
[enhancement][system] Parallel desc with symbol #4017
[bug][system] change sbp order value for layer norm #3995
[bug][system] Fix eager test_resume_training test #4023
[bug][system] Fix python cfg error bug #4018
[bug][system] Remove redundant pack_size in GenericLauncher #4014
[enhancement][system] Set default block size to 512 #4011
[feature][system] Remove swig in oneflow #3969
[feature][system] Migrate oneflow internal api to pybind11 #3953
[build][enhancement][system] Bump nccl from 2.7.3 to 2.8.3 #3875

Eager mode

[bug][eager] Fix eager bug of test split like #4004
[bug][eager] add float16 datatype for eager boxing #4092

Python frontend

[feature][python] add stack #3897
[bug][enhancement][python] Fix test kldivloss tolerance #4103
[bug][enhancement][python] Fix "hardsigmoid" eager test error #4085
[bug][documentation][python] Add hardsigmoid #4076
[api][enhancement][python] add deprecate api optimizer.PolynomialSchduler #4038

Toolchain

[feature][tooling] split_cfg_cpp_and_pybind_generator #4002
[enhancement][tooling] Cfg hash #4084
[enhancement][tooling] Finetune cfg tool #4050
[enhancement][tooling] optimize link time #4042

Build

[build][documentation] Add CentOS specific info on README.md #4099
[build][enhancement] Disable CUDA_SEPARABLE_COMPILATION #4036

CI

[bug][ci] Quit docker after making ssh credential #4075
[bug][ci] Fix CI outputs wrong cmd when printing failed cmd due to shadowed var #4031
[ci][enhancement] Upload log of distributed CI #4028
[ci][enhancement] Make oneflow worker docker stay alive for 6 hours #4026
[ci][enhancement] Allow to keep oneflow_worker log in distributed CI #4022
[ci][documentation] userop and general pr templates added #3952
