
v0.3.3

jackalcooper released this 13 Jan 15:09

Op fixes and performance optimizations

  • [enhancement][op] reduce sum half kernel #4110
  • [enhancement][op] simplify cosface #4107
  • [enhancement][op] indexed_slices update support weight_decay #4096
  • [enhancement][op][python] Migrate swish and mish namespace from math to nn #4104
  • [enhancement][op] Add elementwise maximum/minimum ops #4069
  • [enhancement][op] Fix Code format warning in hardswish #4105
  • [enhancement][feature][op] Add Scalar Pow #4082
  • [bug][op] Fix bug: mut_shape_view of static output may be null in UserKernel::ForwardShape #4094
  • [enhancement][op][refactor] Migrate cast_to_static_shape to user op #4095
  • [feature][op] Add GroupNorm op #4089
  • [feature][op] Distributed partial sampler #3857
  • [enhancement][op][python] add Relu6 activation #4029
  • [bug][op] Rename ont_hot_op.cpp to one_hot_op.cpp #4093
  • [bug][op][python] fix hardtanh CI precision error #4091
  • [enhancement][op] add remove_img_without_anno api for COCOReader #4088
  • [enhancement][op] Add Hardtanh activation #4049
  • [enhancement][op] Add ELU activation #4065
  • [enhancement][op][python] Update logsoftmax.py #4041
  • [documentation][op] Fix in_top_k api document #4079
  • [enhancement][op] Add Hardswish activation #4059
  • [enhancement][op][python] Add hard sigmoid #4043
  • [enhancement][op] Dev in top k #3611
  • [bug][op] Fix argwhere tmp buffer infer #4061
  • [enhancement][op] Optimize softmax cuda kernel #4058
  • [feature][op] Add InstanceNorm 1d & 3d implementation #4052
  • [feature][op] Quantization aware training related ops #3764
  • [enhancement][op] Generic unfold kernel implementation #4033
  • [enhancement][op] User op dim_gather support dynamic input and index #4039
  • [enhancement][op] Reflection pad2d op #3777
  • [enhancement][op] slice support empty blob #4025
  • [bug][enhancement][op] Migrate argwhere to user op #4021
  • [bug][op] Dev rm old tanh #4035
  • [enhancement][op][refactor] Make MaxWithLogThreshold and SafeLog header only #4030
  • [op][purge] Tidy up op_conf.proto #3932
  • [enhancement][op][python] Dev bcewithlogits loss #4024
  • [feature][op] Add implementation of InstanceNorm2D op #4020
  • [enhancement][op][refactor] Refactor gpu_atomic_add #4027
  • [enhancement][op][python] add kldivloss #4012
  • [enhancement][op][python] Dev oneflow ones #3990
  • [enhancement][op] Add flatten/squeeze/expand_dims to auto mixed precision clear list and use reshape instead of reshape_like to do reshape grad computation #4015
  • [enhancement][op][python] add pixel shuffle #4003
  • [enhancement][op] Scalar kernels use element-wise template #4013
  • [enhancement][op][python] add zeros api #3991
  • [enhancement][op] Optimize ComputeEntropyGpu with CUB #3930
  • [feature][op] CUDA template for element-wise kernels #4007
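
Several entries in the list above add activation ops (Relu6, Hardtanh, Hardswish, hard sigmoid, ELU). For orientation, the sketch below gives the commonly used textbook definitions of these functions as scalar Python. It is not taken from the OneFlow kernels, the function names are illustrative, and the default parameters (`min_val`, `max_val`, `alpha`) are assumptions; the defaults and exact formulas in OneFlow may differ.

```python
import math


def relu6(x):
    # ReLU clipped to the range [0, 6].
    return min(max(x, 0.0), 6.0)


def hardtanh(x, min_val=-1.0, max_val=1.0):
    # Linear inside [min_val, max_val], clipped outside (assumed defaults).
    return min(max(x, min_val), max_val)


def hardsigmoid(x):
    # Piecewise-linear approximation of the sigmoid: 0 below -3, 1 above 3.
    return relu6(x + 3.0) / 6.0


def hardswish(x):
    # x scaled by the hard sigmoid of x.
    return x * relu6(x + 3.0) / 6.0


def elu(x, alpha=1.0):
    # Identity for positive inputs, alpha * (exp(x) - 1) otherwise.
    return x if x > 0 else alpha * math.expm1(x)
```

Each function takes a single float, so a list of inputs can be mapped with a comprehension such as `[hardswish(v) for v in values]`.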

System components

  • [enhancement][system] migrate job_build_and_infer api to pybind11 #3940
  • [feature][system] quantization aware training pass #3817
  • [eager][enhancement][system] Mig op arg para attr #4102
  • [feature][system] Tensor Float 32 Support. #4072
  • [enhancement][system] Mig op arg para attr #4090
  • [enhancement][system] Mig py cfg sbp #4086
  • [enhancement][system] Refactor python remote blob #4081
  • [enhancement][system] remove BlobDef #4071
  • [bug][system] Fix warning: moving a local object in a return statement prevents copy elision #4067
  • [enhancement][system] Refactor python blob desc #4063
  • [feature][system] Add nvtx range and thread naming #4064
  • [documentation][enhancement][system] Add docs on installing legacy versions of oneflow #4056
  • [bug][system] support eager empty blob #4047
  • [enhancement][system] Add err info for ncclGroupEnd check #4048
  • [enhancement][system] Optimize dynamic loss scale parameters #4045
  • [purge][system] Remove col_id #4046
  • [enhancement][system] Scope with symbol #4040
  • [enhancement][system] Job desc with symbol #4032
  • [enhancement][system] Parallel desc with symbol #4017
  • [bug][system] change sbp order value for layer norm #3995
  • [bug][system] Fix eager test_resume_training test #4023
  • [bug][system] Fix python cfg error bug #4018
  • [bug][system] Remove redundant pack_size in GenericLauncher #4014
  • [enhancement][system] Set default block size to 512 #4011
  • [feature][system] Remove swig in oneflow #3969
  • [feature][system] Migrate oneflow internal api to pybind11 #3953
  • [build][enhancement][system] Bump nccl from 2.7.3 to 2.8.3 #3875

Eager mode

  • [bug][eager] Fix eager bug of test split like #4004
  • [bug][eager] add float16 datatype for eager boxing #4092

Python frontend

  • [feature][python] add stack #3897
  • [bug][enhancement][python] Fix test kldivloss tolerance #4103
  • [bug][enhancement][python] Fix "hardsigmoid" eager test error #4085
  • [bug][documentation][python] Add hardsigmoid #4076
  • [api][enhancement][python] add deprecate api optimizer.PolynomialSchduler #4038

Toolchain

  • [feature][tooling] split_cfg_cpp_and_pybind_generator #4002
  • [enhancement][tooling] Cfg hash #4084
  • [enhancement][tooling] Finetune cfg tool #4050
  • [enhancement][tooling] optimize link time #4042

Build

  • [build][documentation] Add CentOS specific info on README.md #4099
  • [build][enhancement] Disable CUDA_SEPARABLE_COMPILATION #4036

CI

  • [bug][ci] Quit docker after making ssh credential #4075
  • [bug][ci] Fix CI outputs wrong cmd when printing failed cmd due to shadowed var #4031
  • [ci][enhancement] Upload log of distributed CI #4028
  • [ci][enhancement] Make oneflow worker docker stay alive for 6 hours #4026
  • [ci][enhancement] Allow to keep oneflow_worker log in distributed CI #4022
  • [ci][documentation] userop and general pr templates added #3952