Releases
v0.3.3
Op fixes and performance optimizations
[enhancement ][op ] reduce sum half kernel #4110
[enhancement ][op ] simplify cosface #4107
[enhancement ][op ] indexed_slices update support weight_decay #4096
[enhancement ][op ][python ] Migrate swish and mish namespace from math to nn #4104
[enhancement ][op ] Add elementwise maximum/minimum ops #4069
[enhancement ][op ] Fix Code format warning in hardswish #4105
[enhancement ][feature ][op ] Add Scalar Pow #4082
[bug ][op ] Fix bug: mut_shape_view of static output may be null in UserKernel::ForwardShape #4094
[enhancement ][op ][refactor ] Migrate cast_to_static_shape to user op #4095
[feature ][op ] Add GroupNorm op #4089
[feature ][op ] Distributed partial sampler #3857
[enhancement ][op ][python ] add Relu6 activation #4029
[bug ][op ] Rename ont_hot_op.cpp to one_hot_op.cpp #4093
[bug ][op ][python ] fix hardtanh CI precision error #4091
[enhancement ][op ] add remove_img_without_anno api for COCOReader #4088
[enhancement ][op ] Add Hardtanh activation #4049
[enhancement ][op ] Add ELU activation #4065
[enhancement ][op ][python ] Update logsoftmax.py #4041
[documentation ][op ] Fix in_top_k api document #4079
[enhancement ][op ] Add Hardswish activation #4059
[enhancement ][op ][python ] Add hard sigmoid #4043
[enhancement ][op ] Dev in top k #3611
[bug ][op ] Fix argwhere tmp buffer infer #4061
[enhancement ][op ] Optimize softmax cuda kernel #4058
[feature ][op ] Add InstanceNorm 1d & 3d implementation #4052
[feature ][op ] Quantization aware training related ops #3764
[enhancement ][op ] Generic unfold kernel implementation #4033
[enhancement ][op ] User op dim_gather support dynamic input and index #4039
[enhancement ][op ] Reflection pad2d op #3777
[enhancement ][op ] slice support empty blob #4025
[bug ][enhancement ][op ] Migrate argwhere to user op #4021
[bug ][op ] Dev rm old tanh #4035
[enhancement ][op ][refactor ] Make MaxWithLogThreshold and SafeLog header only #4030
[op ][purge ] Tidy up op_conf.proto #3932
[enhancement ][op ][python ] Dev bcewithlogits loss #4024
[feature ][op ] Add implementation of InstanceNorm2D op #4020
[enhancement ][op ][refactor ] Refactor gpu_atomic_add #4027
[enhancement ][op ][python ] add kldivloss #4012
[enhancement ][op ][python ] Dev oneflow ones #3990
[enhancement ][op ] Add flatten/squeeze/expand_dims to auto mixed precision clear list and use reshape instead of reshape_like to do reshape grad computation #4015
[enhancement ][op ][python ] add pixel shuffle #4003
[enhancement ][op ] Scalar kernels use element-wise template #4013
[enhancement ][op ][python ] add zeros api #3991
[enhancement ][op ] Optimize ComputeEntropyGpu with CUB #3930
[feature ][op ] CUDA template for element-wise kernels #4007
System components
[enhancement ][system ] migrate job_build_and_infer api to pybind11 #3940
[feature ][system ] quantization aware training pass #3817
[eager ][enhancement ][system ] Mig op arg para attr #4102
[feature ][system ] Tensor Float 32 Support. #4072
[enhancement ][system ] Mig op arg para attr #4090
[enhancement ][system ] Mig py cfg sbp #4086
[enhancement ][system ] Refactor python remote blob #4081
[enhancement ][system ] remove BlobDef #4071
[bug ][system ] Fix warning: moving a local object in a return statement prevents copy elision #4067
[enhancement ][system ] Refactor python blob desc #4063
[feature ][system ] Add nvtx range and thread naming #4064
[documentation ][enhancement ][system ] Add docs on installing legacy versions of oneflow #4056
[bug ][system ] support eager empty blob #4047
[enhancement ][system ] Add err info for ncclGroupEnd check #4048
[enhancement ][system ] Optimize dynamic loss scale parameters #4045
[purge ][system ] Remove col_id #4046
[enhancement ][system ] Scope with symbol #4040
[enhancement ][system ] Job desc with symbol #4032
[enhancement ][system ] Parallel desc with symbol #4017
[bug ][system ] change sbp order value for layer norm #3995
[bug ][system ] Fix eager test_resume_training test #4023
[bug ][system ] Fix python cfg error bug #4018
[bug ][system ] Remove redundant pack_size in GenericLauncher #4014
[enhancement ][system ] Set default block size to 512 #4011
[feature ][system ] Remove swig in oneflow #3969
[feature ][system ] Migrate oneflow internal api to pybind11 #3953
[build ][enhancement ][system ] Bump nccl from 2.7.3 to 2.8.3 #3875
Eager mode
[bug ][eager ] Fix eager bug of test split like #4004
[bug ][eager ] add float16 datatype for eager boxing #4092
Python frontend
[feature ][python ] add stack #3897
[bug ][enhancement ][python ] Fix test kldivloss tolerance #4103
[bug ][enhancement ][python ] Fix "hardsigmoid" eager test error #4085
[bug ][documentation ][python ] Add hardsigmoid #4076
[api ][enhancement ][python ] add deprecate api optimizer.PolynomialSchduler #4038
Toolchain
[feature ][tooling ] split_cfg_cpp_and_pybind_generator #4002
[enhancement ][tooling ] Cfg hash #4084
[enhancement ][tooling ] Finetune cfg tool #4050
[enhancement ][tooling ] optimize link time #4042
Build
[build ][documentation ] Add CentOS specific info on README.md #4099
[build ][enhancement ] Disable CUDA_SEPARABLE_COMPILATION #4036
CI
[bug ][ci ] Quit docker after making ssh credential #4075
[bug ][ci ] Fix CI outputs wrong cmd when printing failed cmd due to shadowed var #4031
[ci ][enhancement ] Upload log of distributed CI #4028
[ci ][enhancement ] Make oneflow worker docker stay alive for 6 hours #4026
[ci ][enhancement ] Allow to keep oneflow_worker log in distributed CI #4022
[ci ][documentation ] userop and general pr templates added #3952