v0.5rc2
Changelog
v0.5rc2 (28/09/2021)
Highlights
- First class support for eager execution. The deprecated APIs are moved to oneflow.compatible.single_client
- Drop-in replacement of import torch for existing PyTorch projects. You could test it by interchanging import oneflow as torch and import torch as flow.
- nn.Module for eager execution
- nn.Graph for lazy execution
- DDP for data parallel
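One way to try the drop-in replacement highlight is to bind whichever package is installed to the name torch and run the same code against it. The sketch below is only an illustration: the helper pick_backend is hypothetical (not part of OneFlow or PyTorch), and it assumes that ones(2, 3) behaves the same way in both packages.

```python
import importlib
import importlib.util


def pick_backend(candidates=("oneflow", "torch")):
    """Return the name of the first installed candidate package, or None.

    Hypothetical helper for illustration; not an API of either library.
    """
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


backend_name = pick_backend()
if backend_name is None:
    print("neither oneflow nor torch is installed")
else:
    # Bind the chosen package to the name `torch`, mirroring
    # `import oneflow as torch` from the highlights.
    torch = importlib.import_module(backend_name)
    x = torch.ones(2, 3)
    print(backend_name, tuple(x.shape))
```

Swapping the order of the candidates switches the backend without touching the rest of the script, which is the kind of interchange the highlight describes.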
A sneak peek of the new API
Here is a minimal example showcasing how to incorporate a nn.Module in a nn.Graph and have it run in lazy mode.
import oneflow as flow

# model (a nn.Module instance) and x (an input tensor) are assumed to be defined elsewhere
class NeuralGraph(flow.nn.Graph):
    def __init__(self, model):
        super().__init__()
        self.model = model  # model is a nn.Module instance

    def build(self, x):
        y_pred = self.model(x)
        return y_pred

graph = NeuralGraph(model)  # create a nn.Graph instance
y_pred = graph(x)  # run the created nn.Graph
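For a concrete version of the snippet above, the following sketch wraps a small linear layer in a nn.Graph and runs one forward pass. It assumes OneFlow is installed and that flow.nn.Linear and flow.ones are available in this release; the names LinearGraph and run_lazy_linear are made up for illustration. The import is guarded so the script still runs (and returns None) when OneFlow is missing.

```python
try:
    import oneflow as flow
except ImportError:  # OneFlow is optional for reading this sketch
    flow = None


def run_lazy_linear():
    """Wrap a small nn.Module in a nn.Graph and run one forward pass.

    Returns the output shape as a tuple, or None if OneFlow is not installed.
    """
    if flow is None:
        return None

    class LinearGraph(flow.nn.Graph):
        def __init__(self, model):
            super().__init__()
            self.model = model  # an eager nn.Module reused inside the graph

        def build(self, x):
            return self.model(x)

    model = flow.nn.Linear(3, 2)      # assumed: 3 input features, 2 outputs
    graph = LinearGraph(model)
    y_pred = graph(flow.ones(4, 3))   # batch of 4 samples
    return tuple(y_pred.shape)


print(run_lazy_linear())
```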
New in Python API
- [feature][eager][op][test][python][interface] Add test for convtranspose2d #5239
- [enhancement][python][interface] Add GroupNorm #5175
- [enhancement][eager][python][interface] [Add] avgpool1d avgpool3d #5165
- [feature][eager][op][python][interface] Add deconv cpu impl #5224
- [bug][eager][api][python][interface] Fix acosh bug #5221
- [feature][eager][op][python][interface] Dev modules ctc loss #5168
- [bottleneck][bug][documentation][python][interface] Fix meshgrid test bug #5208
- [eager][documentation][python][interface] Rename CosineScheduler to CosineAnnealingLR #5112
- [feature][eager][python][interface] Add meshgrid module #5205
- [enhancement][feature][bug][op][python] support bias in conv2d's parameter list #5322
- [eager][documentation][api][python][interface] add not_equal, greater_equal and less_equal module #5350
- [enhancement][eager][python] refine pow module and its test #5319
- [enhancement][eager][op][python] Add triu op #5329
- [enhancement][bug][python] Fix optimizer for not supporting all kinds of iterables #5355
- [bug][python][interface] raise IndexError in get_canonical_index to support for loop #5345
- [bug][python][interface] tensor slice assign supports broadcasting #5344
- [enhancement][op][python] add cpu group conv logic #5314
- [enhancement][python] Add 'nn.Mish' module and corresponding functions #5310
- [enhancement][build][python] Remove ONNX from setup py #5297
- [enhancement][python][interface] [add] zeropad2d #5278
- [feature][system][python][interface] Lazy nn.Graph FeedInputOpExpr #5458
- [feature][python][interface] integrate nn.image.flip #5411
- [bug][python] Fix issues in point of MultiClientSession #5469
- [enhancement][bug][python] update HasAllMultiClientEnvVars() #5459
- [enhancement][python] Add in_top_k function #5428
- [enhancement][python] Dev add docstring #5449
- [feature][api][python] MultiClientSession #5407
- [documentation][python] remove --user #5431
- [feature][python][interface] nn.Graph python #5309
- [feature][python][interface] Fea/nn graph/graph name #5413
- [bug][python][interface] rm nn.Graph.train #5424
- [op][documentation][api][python][interface] add bernoulli module #5353
- [enhancement][python] flow.S/B/P #5306
- [enhancement][documentation][python] Add instruction on upgrade pip #5400
- [enhancement][python] Rm oneflow export and experimental #5589
- [bug][python] Fix nn.graph.utils module conflict #5598
- [feature][ci][python] Update autotest framework #5520
- [enhancement][python] copy of_proto_python_dir to compatible_single_client_python #5539
- [enhancement][api][python] del default env init #5537
- [enhancement][python] Fix single client using same glog file #5535
- [bug][api][python] Fix Session TryClose #5531
- [enhancement][feature][python] split vector-matrix norm #5478
- [feature][eager][op][python][interface] Add more upsample kernel #5382
- [enhancement][feature][test][python] add torchstyle unittest #5489
- [feature][system][python] nn.Graph with training #5662
- [enhancement][feature][python] Fea/nn graph/block proxy func #5727
- [enhancement][api][python] consistent_tensor_to_api #5703
- [feature][eager][op][python] Dev Align torch avgpool #5610
- [enhancement][python] fix circular deps of sbp python module #5706
- [documentation][python] [part5]Remove singleclient outdated api #5674
- [enhancement][python] [part4]Remove singleclient outdated api #5672
- [bug][op][python] remove outdated code in conv3d #5696
- [enhancement][test][python] enlarge tolerance of dataloader test #5689
- [enhancement][test][python] add autotest for some math ops #5646
- [feature][python] nn.Graph optimizer part 2: add L2, pass job complete, refactor #5604
- [enhancement][python] Add clip_grad_norm #5299
- [purge][python] Remove Single-Client API in oneflow default python #5827
- [bug][python] Fix ddp grad size #5834
- [enhancement][feature][python] Dev RMSprop graph conf #5768
- [enhancement][purge][eager][python] remove scale arg in optimizer #5821
- [enhancement][feature][python] graph/block io check #5803
- [enhancement][feature][python] Dev adam graph conf #5709
- [purge][python] [part10]Remove singleclient outdated api #5756
- [feature][api][python] better repr of nn.Graph for debug #5762
- [bug][python] fix weight decay in RMSprop #5755
- [purge][python] [part9]Remove singleclient outdated api #5752
- [purge][python] [part8]Remove singleclient outdated api #5750
- [documentation][python] add first batch of methods in oneflow.nn.functional namespace #5693
- [purge][python] [part6]Remove singleclient outdated api #5704
- [bug][python] use default_generator.seed() as random_seed in init #5721
- [bug][system][python] ddp broadcast params and buffers #5913
- [enhancement][test][python] Add consistent tensor requires grad test #5925
- [bug][python] wrap flow.nn.init.* with flow.no_grad() #5932
- [feature][api][python][interface] add clip_grad to optimizer #5817
- [enhancement][ci][op][test][python] add randperm with test and docs #5680
- [feature][api][python] Fea/nn graph/ lr_schedule(and cosine lr_sch) and opt_group #5846
- [bug][python] fix bug of SyncOnMasterFn atexit #5909
- [purge][python] Delete single client nn modules #6061
- [enhancement][python] Move framework.distribute to env #6022
- [bug][python] skip sync when abnormally exiting #6025
- [feature][python] Fea/nn graph/warmup amp config #5969
- [documentation][python] add optimizer api docs #6131
- [documentation][python] add_tensor_api_doc #6127
- [bug][python] Fix test_grid_sample.py and test_affine_grid.py threshold #6125
- [documentation][api][python] add doc of graph #6093
- [bug][python] Fix make of_format fail in ubuntu #6120
- [feature][api][python][interface] Fea/graph helpers #6088
- [enhancement][eager][python][interface] Use flow.randint in dataloader #6086
- [feature][eager][api][python][interface] Import oneflow as torch #6076
- [enhancement][test][api][python][refactor] rename OfrecordReader to OFRcordReader #6090
- [purge][python][need-single-client-tests] Delete single client nn modules #6082
- [enhancement][python] flow.load tolerates FileNotFound fault #6083
- [feature][python] Fea/pipeline in graph #6105
- [enhancement][test][python] graph activation checkpointing #6192
- [enhancement][feature][op][python] rnn test #6165
New in Ops:
- [enhancement][op][api][refactor] [Functional] Part2: Add partial unary and math functional apis #5218
- [enhancement][bug][op][interface] Refine deconv kernel #5229
- [enhancement][op][api][interface] add ReflectionPad2d #5172
- [feature][eager][op][api][interface] crossentropyloss and nllloss support ignore_index #5195
- [feature][eager][op][api][interface] Yejiaojiao/dev bcewithlogitsloss #5173
- [bug][ci][op] Dev user op set default is_dynamic #5223
- [enhancement][op] add magic method for pow #5199
- [enhancement][op][interface] add cpu version of upsampling #5194
- [enhancement][bug][op][api][interface] add ReplicationPad2d #5148
- [feature][eager][op][api][interface] add kldivloss module #5155
- [feature][eager][op][documentation][build][api][interface] Add floor module and the corresponding testcases #4964
- [enhancement][feature][op] Dev conv1d module #5280
- [enhancement][op] Add ctc_greedy_decoder op #5294
- [enhancement][op][system] Dev remove default grad func #5320
- [enhancement][op][system] Add pad grad func. #5354
- [enhancement][op][system] Add gradient funcs. #5348
- [feature][purge][bug][eager][op][interface] fix upsample nearest bug #5347
- [enhancement][op][system] [Functional] Part7: Migrate pooling ops #5253
- [enhancement][op] nvjpeg hardware acc #5240
- [enhancement][feature][ci][eager][op][api][interface] Add bmm module #5334
- [enhancement][eager][op] Dev image decode eager #5333
- [enhancement][op] Optimize softmax warp impl #4977
- [enhancement][eager][op] Dev tensor buffer eager #5317
- [enhancement][op][api][refactor] [Functional] Part6: Migrate conv op #5252
- [enhancement][eager][op] Dev sort eager #5284
- [enhancement][bug][op][api] fix bceloss bug in default weight and reduction #5303
- [bug][eager][op] remove redundant assert and check #5264
- [enhancement][bug][ci][op] fix bceloss bug about weight #5269
- [enhancement][op][api][refactor] [Functional] Part5: Migrate nn ops #5249
- [enhancement][eager][op] Dev argsort eager #5273
- [enhancement][op][api][refactor] [Functional] Part4: Migrate array ops #5247
- [enhancement][op][api][refactor] [Functional] Part3: Migrate binary and activation ops #5246
- [bug][ci][op][test] Dev fix rmsprop ci fail #5481
- [enhancement][op] add inplace method: Tensor.sin_ #5471
- [bug][op] hotfix image_batch_align #5461
- [enhancement][eager][op][interface] Dev maxpool series op 123d #5244
- [bug][op] fix pool gpu kernel #5446
- [feature][eager][op][api][interface] add pixelshufflev2 module #5383
- [enhancement][feature][ci][eager][op][documentation][api][interface] Add flow xxx and tensor xxx autotest #5386
- [enhancement][feature][eager][op][api][interface] Modules chunk #5324
- [enhancement][eager][op] add image normalize for eager #5402
- [enhancement][eager][op] Dev batch align module #5401
- [enhancement][eager][op] add coco reader module #5391
- [enhancement][wip][op] Restruct Elementwise kernel #4130
- [bug][op] Fix DecodeRandom reuse mem #5606
- [enhancement][op] Align pytorch maxpool #5525
- [enhancement][bottleneck][eager][op][api] implementation of constantpad-3d op #5529
- [enhancement][eager][op] Add scale size for resize #5509
- [enhancement][op][api][refactor] Dev optimize tensor setitem #5501
- [enhancement][op] register uint8 dtype to support dataloader #5499
- [enhancement][op] Add unique.cuh #5487
- [enhancement][op][api][interface] Dev ofrecord auto truncating #5412
- [feature][op][system][interface] Feat: LazyInterpret::ApplyImpl support SourceUserOpExpr and Copy #5711
- [enhancement][op][interface] Dev logical_and/or modules #5636
- [enhancement][op] support any number positional arguments for ones and zeros op #5698
- [enhancement][feature][eager][op] Add conv3d Module #5327
- [feature][eager][op][api][interface] add batchnorm3d module #5631
- [bug][eager][op] fix reduce min max backward bug #5651
- [enhancement][op] Debug dim scatter #5371
- [enhancement][op][interface] Dev eye #5583
- [enhancement][eager][op] Dev minimum maximum #5576
- [enhancement][op] Restruct activation grad op #5669
- [enhancement][feature][eager][op] Rewrite activation function #5465
- [bug][op][documentation] add oneflow.cat for documentation #5621
- [enhancement][op] Lcy logsoftmax #5746
- [feature][op][need-simple-ci] Feat empty op #5659
- [enhancement][eager][op] Dev split #5714
- [enhancement][op][interface] add index_select op #5661
- [bug][op] fix nvjpeg hw acc #5851
- [enhancement][op] Remove move in conv_cudnn #5828
- [enhancement][op][interface] Dev logical_xor module #5694
- [bug][eager][op] fix squeeze #5808
- [enhancement][op] Get parallel_id and parallel_num through rank and world size in DDP #5717
- [bug][eager][op] delete interpolate int type #5805
- [bug][op] Fix bug in scatter #5743
- [enhancement][op] Refactor: remove module not required, call function directly #5754
- [enhancement][op] Remove modules not required(tan, erfc, log1p, scatter_nd) #5791
- [enhancement][op] Refactor scatter, clamp and pow in cpp instead of in python #5715
- [enhancement][op] Rm useless code in gather files #5687
- [enhancement][eager][op] change flip_code to scalar #5786
- [enhancement][bug][op][interface] fix upsample bug #5753
- [bug][op][interface] Quick fix Lazy nn.Graph input/output OpConf.BlobConf.is_dynamic #5767
- [enhancement][bug][eager][op] fix argwhere 0-dim bug #5760
- [enhancement][eager][op] delete unused code #5744
- [feature][op] Export fused_scale_tril op #5933
- [bug][op] Fix backward bug in 3d #5908
- [bug][op] Fix one_hot api limit #5927
- [enhancement][eager][op] Dev where scalar #5797
- [bug][op] fix grad error #5914
- [feature][bug][op] Fix inplace op circle reference bug #5910
- [enhancement][op] Move the judgment content to c++, And add scalar fmod #5854
- [enhancement][op] Support combined_margin_loss op in flow.nn.modules #5830
- [enhancement][op][api][interface] functional_one_hot #5315
- [enhancement][op] Dev scalar op #5778
- [bug][eager][op] fix gather kernel 0 shape #5888
- [enhancement][op] add l2_normalize for multi-client interfaces #5859
- [feature][op] Export function softmax_cross_entropy #6056
- [enhancement][op] Add int attr for functional adaptive average pool #6059
- [enhancement][op][interface] dev full op #5955
- [bug][eager][op] fix 0dim inplace add #6029
- [feature][op][system][interface] Feat: nn.Graph image gpu decoder #6014
- [enhancement][op][interface] dev optim_optim_lr_scheduler_multisteplr #5975
- [enhancement][op] NopKernel #6035
- [enhancement][eager][op][api] Dev tril op #6005
- [enhancement][op] dev unfold and fold #5675
- [enhancement][op] ResNet CUDA Graphs #6018
- [enhancement][feature][op] add broadcast pow #6013
- [enhancement][op][interface] init of op diag #5298
- [op][documentation][api] Fix api document bug #6009
- [enhancement][op] Dev fused functional #5954
- [bug][op][build] Add nvcc flag -Werror cross-execution-space-call #6002
- [bug][op] Fix Normalization grad function #5993
- [enhancement][feature][eager][op][test][interface] Add fused self attention #5966
- [enhancement][bug][ci][eager][op][api][interface] Try to fix var bug #5973
- [enhancement][feature][eager][op][interface] add prod op #5867
- [enhancement][eager][op][api] add glu op #6065
- [enhancement][op] Align Torch.nn.functional poolXd #6184
- [bug][eager][op] fix backward index for gamma beta #6149
- [bug][op][system] Fix BroadcastMatmulGrad bug #6168
- [enhancement][op][api] Add Int support for functional.avg/maxpool #6174
- [bug][eager][op][api][interface] align dropout api name with pytorch #6170
- [enhancement][op] support inplace operation for hardsigmoid #6137
- [enhancement][bug][op] Fix do bias correction in Adam/AdamW #5960
- [bug][eager][op][api][interface] fix repeat 0-dim tensor bug #6150
- [enhancement][bug][op] Fix select_first_grad bug #6142
- [bug][ci][eager][op][documentation][interface] Add clipgrad doc and contiguous #6130
- [bug][op] Fix eager optim dynamic attr bug #6111
- [enhancement][op] Support grid_sample and affine_grid operator #6038
- [op][documentation] Export apis for documentation #6068
- [enhancement][feature][bug][ci][eager][op][documentation][interface] transfer python function to c++ method #6114
- [op][documentation] Dev functional batch_gather #6233
- [enhancement][op][test] fix cross_entropy_loss and its test #5799
- [bug][op] Use attr nd_sbp to check consistent #6222
- [enhancement][op] Dev fused bn functional #6077
- [enhancement][op] support default value in intlist #6201
- [bug][op] fix sparse_softmax get_nd_sbp #6203
- [bug][op] Fix bug in model fused update #6197
- [enhancement][op][system][refactor] Optimize tensor getitem. #5433
New in Eager:
- [enhancement][eager][interface] Reconstruct module files #5251
- [bug][eager][documentation][interface] Fix conv module bug #5245
- [bug][ci][eager][interface] Fix bce withlogitloss ci error #5237
- [feature][eager][api][interface] module BCELoss #5144
- [enhancement][feature][eager][api][interface] Dev norm op #5178
- [enhancement][bug][eager] Fix stack module #5222
- [enhancement][feature][eager] Support different dtype of equal module #5214
- [enhancement][bug][eager][documentation][api][interface] Add nllloss backward #5210
- [enhancement][eager][api][upload-core] Decouple FileSystem and IOConf #5162
- [enhancement][ci][eager] Set lower precision avoid ci failing #5200
- [eager][documentation] Add hint when apply FunctionNode second time #5369
- [enhancement][feature][bug][ci][eager][documentation][api] Fix upsample bilinear bug #5366
- [bug][eager] Fix not contiguous ndarray to tensor bug #5351
- [enhancement][eager][system] Infer consistent tensor meta #5118
- [feature][eager] Feat graph autograd engine #5296
- [enhancement][eager][interface] Dev type as module #5349
- [feature][eager][documentation][api][interface] Add new ones module #5342
- [enhancement][bug][eager] Fix logical slice assign dtype #5339
- [bug][ci][eager][documentation][api][interface] Fix where module bug #5300
- [bug][ci][eager][documentation][api] Fix l1loss ci error #5307
- [enhancement][bug][eager][documentation][api][interface] Qi's First Edit of deleting "print" and ".numpy" #5129
- [feature][eager][refactor] Separate autograd meta to tensor #5267
- [feature][eager][api][interface] add tile module #5234
- [enhancement][eager] Release lambda function to reuse tensor memory #5266
- [feature][bug][eager][documentation] Fix default value not set bug #5483
- [enhancement][eager][interface] [Add] gather_nd scatter_nd #5422
- [enhancement][bug][eager] fix param #5473
- [bug][eager] Fix Tensor.grad setter bug #5462
- [enhancement][eager] Rename now_grad_arg to current_grad #5466
- [eager][test][documentation][interface] Add autotest part1 #5436
- [enhancement][eager] Use functional copy instead of op_builder #5460
- [bottleneck][bug][eager][interface] fix -1 index not support bug #5448
- [bug][ci][eager][documentation][api] Fix concat backward bug #5443
- [enhancement][bug][ci][eager] Add autograd engine warning #5444
- [feature][eager][api][interface] Smoothl1loss #5256
- [enhancement][bottleneck][eager] remove device dtype params #5434
- [bug][ci][eager][documentation][interface] Delete maxpool failed test #5409
- [enhancement][eager][api] Add tensor grad assignment #5379
- [enhancement][bug][eager] fix-abs #5398
- [enhancement][bug][eager][interface] Fix bn track running stats #5393
- [enhancement][bug][eager] Support uint dtype of constant op #5396
- [enhancement][bug][eager][documentation][interface] Delete useless code upsample #5392
- [enhancement][ci][eager][interface] add flow.view #5301
- [enhancement][bug][ci][eager][api][interface] Add masked select module #5356
- [bug][eager][interface] Fix batchnorm backward bug #5602
- [enhancement][eager] Support weight_decay (l2 actually) #5587
- [feature][eager][documentation][api] Add new autotest #5588
- [enhancement][eager][documentation][api] Dev fmod #5404
- [feature][eager] Support inplace add #5432
- [feature][eager][interface] Feat tensor stride property #5543
- [enhancement][feature][eager][documentation][api] Add flip module #5541
- [feature][eager] Feat module repr #5486
- [enhancement][bottleneck][bug][eager][interface] Fix maxpool1d params #5493
- [enhancement][feature][eager][interface] Dev flow.utils.data part1 #5406
- [bug][eager][api] Fix tensor getitem bug #5474
- [enhancement][eager][need-simple-ci] export datasets interface #5691
- [enhancement][eager][system] rebase #5601
- [enhancement][eager][test] added nn.RecordBytesDecoder with its test #5475
- [enhancement][feature][eager][need-simple-ci] 0-dim tensor support #5552
- [enhancement][bug][eager] rewrite slice_update backward #5677
- [enhancement][bug][eager][interface] align view input style with torch #5676
- [enhancement][eager][interface][need-simple-ci] add autotests for modules #5666
- [enhancement][bottleneck][eager][interface] Dev constantpad1d op #5579
- [enhancement][eager][api][interface] Restruct MathOps AutoTest #5654
- [enhancement][bug][ci][eager] Fix flip bug #5657
- [bug][eager][api][interface] Fix expand module bug #5650
- [enhancement][bug][eager][documentation][api] Fix repeat bug #5633
- [enhancement][eager][test][api][interface] Add new autotest #5617
- [enhancement][eager][api][interface] Dev flow.utils.data part2 #5500
- [enhancement][bug][eager] make setitem device match #5835
- [bug][eager][api][interface] align reshape input param with pytorch #5804
- [feature][bug][eager][api] Align where op with torch #5850
- [enhancement][bug][eager][api] Restruct prelu op #5829
- [bug][eager][need-simple-ci] fix pooling ceil_mode bug #5818
- [enhancement][eager] stateful local kernel supports consistent #5789
- [bug][eager][api][interface] Fix argwhere bug #5816
- [enhancement][eager][documentation][api] dev-nonzero #5809
- [enhancement][feature][eager][api] Add fake quantize op #5690
- [enhancement][bug][eager][documentation][api] Add api #5663
- [enhancement][eager] Refactor consistent infer result #5790
- [bug][eager][need-simple-ci] skip dataloader test #5780
- [bug][eager][need-simple-ci] fix 0-dim tensor.fill_ #5771
- [enhancement][eager] Cpu mpi broadcast #5726
- [feature][eager] Feat grad mode classes #5956
- [enhancement][bug][eager] fix wrong names #5951
- [enhancement][eager][system] Local dep object pool #5953
- [enhancement][eager][interface] rename OpExprInterpState to AutoGradCaptureState #5918
- [bug][eager] Fix linear bug #5945
- [bug][eager] Fix tensor_meta update bug #5924
- [enhancement][eager] use flow.randperm #5928
- [enhancement][eager] consistent init/save/load #5896
- [enhancement][bug][eager][documentation][interface] Restruct sort and argsort op #5911
- [enhancement][bug][eager][interface] Try to fix the problem that the insightface cannot converge. #5906
- [enhancement][bug][eager][interface] Add autotest #5899
- [enhancement][eager] The scheduler thread joins worker threads #5893
- [enhancement][eager] Bugfix async callback #5881
- [feature][eager] Feat tensor to bool #5836
- [bug][eager] Remove inplace broadcast_add #5551
- [enhancement][eager] Broadcast consistent shape and dtype #5784
- [enhancement][eager] Fix optimizer list parameters input bug #5848
- [enhancement][eager][interface] Dev flow.utils.data part3 #5644
- [enhancement][eager][api] Normalize naming of modules #6066
- [enhancement][feature][eager][api][interface] add truncnormal #6051
- [enhancement][bug][eager] AutoMatedTest support test module.parameter.grad #6043
- [enhancement][feature][bug][eager] add module call kwargs #6069
- [enhancement][eager][api][interface] add tensor.item tensor.tolist #6021
- [enhancement][eager][api][interface] Export pool ops api #6047
- [enhancement][bug][eager][test][documentation][interface] Add more autotest sample #6039
- [enhancement][bug][eager][system] disable cuda_h2d stream #6020
- [feature][eager][test][api][interface] Add autotest codegen #6019
- [feature][eager][documentation] Refactor cosine lr scheduler #6000
- [enhancement][eager][interface] tensor.cpu/tensor.cuda #5894
- [enhancement][eager][api] Support consistent_tensor.to(dtype) #5991
- [bug][eager][interface] remove redundant codes in ModuleDict #5961
- [bug][eager] Fix LayerNorm check bug #6196
- [enhancement][eager][api] Change dropout api #6182
- [enhancement][good for pr][eager][api][interface] add: test convert dependency #6023
- [enhancement][bug][eager][interface] Fix autotest codegen bug #6171
- [bug][eager] restore instr_local_dep_object_pool_size for nccl #6160
- [enhancement][eager][api][interface] Align pooling op functional api names with torch #6163
- [feature][bug][eager][api][interface] delete file #6162
- [bug][eager] Fix optim load_state_dict bug #6152
- [enhancement][eager][api] add is_training to dropout functor #6148
- [enhancement][eager] Decompose nd sbp boxing #5800
- [enhancement][eager] support consistent_tensor.to(copy=True) #6122
- [feature][eager] Static grad scaler #6135
- [bug][eager] Fix LayerNorm expr bug #6121
- [bug][eager][api] move numpy c api init in numpy.cpp, make np array contiguous before copying #6117
- [enhancement][eager][refactor] Remove params from ParamGroup getitem #6096
- [enhancement][feature][eager] Support tensor and optimizer serialization #6087
- [enhancement][bug][eager] fix bug about tensor str in nonsymmetric cast and getitem in consist… #6239
- [enhancement][eager] Cpu all reduce #5849
- [feature][eager] Support assign copy interface #6228
- [enhancement][eager][api][interface] Dev reconstruct pad ops #6223
- [enhancement][eager][api][interface] support flow.cuda.is_available #6124
- [bug][eager] make flow._C.local_all_reduce sync launched #6175
- [enhancement][eager] Rename flow to oneflow in user hint #6190
- [bug][eager][tooling][test][api][interface] Autotest generate input tensor #6206
- [enhancement][eager] consistent tensor zeros_() #6202
- [enhancement][eager] Cpu mpi #5865
Build enhancements:
- [bug][build] Fix GRPC compilation failure on CMake 3.20 #5255
- [bug][build] Refine header file copy #5254
- [bug][build] Fix older version CMake doesn't support multiple targets in CLI #5248
- [bug][build] Turn off NCCL_STATIC/CUDNN_STATIC when CUDA_STATIC is OFF #5243
- [feature][build] Fix support for Ninja and add Ninja build in Simple CI #5236
- [enhancement][build] Add cmake option CUDA_STATIC #5164
- [bug][build] Fix protobuf debug postfix #5233
- [enhancement][ci][build] Move default third party dir into build dir #5230
- [enhancement][build] Refine protobuf cmake #5216
- [enhancement][ci][build] Remove transport test main #5215
- [enhancement][ci][build] Speedup opencv build #5213
- [enhancement][build] Support clang #5015
- [enhancement][documentation][build] Add prefix when creating git archive #5201
- [enhancement][build] Add cmake option NCCL_STATIC #5160
- [enhancement][build] Refine CMake CUDA version handling #5192
- [enhancement][build] Use clang plugin to check Maybe variables are used #5358
- [enhancement][build] Add BUILD_BYPRODUCTS for ExternalProject_Add #5316
- [enhancement][build] Add cmake init cache to simplify user onboarding #5311
- [feature][bug][build] Fix macOS support and run macOS build in Simple CI #4947
- [enhancement][build] flatbuffers use mirror #5295
- [enhancement][build] Don't build test by default #5302
- [enhancement][build] Prevent building from scratch when toggle flag BUILD_GIT_VERSION #5259
- [enhancement][build] Refine gRPC, glog, gflags cmake for conda #5276
- [feature][build] Support XLA with CPU-only #5260
- [enhancement][ci][onnx][build] Remove ONNX from CI #5257
- [enhancement][build] Refactor build_wheel to support oneflowinc images #5427
- [enhancement][build] Add arg skip_audit in build wheel #5423
- [bug][build] hwloc disable shared #5388
- [documentation][build] Update readme for autoconf and libtool #5376
- [enhancement][build] remove dir python and compatible_single_client_python #5609
- [bug][build][system] Fix pyyaml version #5594
- [enhancement][ci][build] force release flags #5574
- [bug][build] prevent endless loop #5534
- [enhancement][build] Support sccache #5528
- [enhancement][build] Add definition for CMAKE_BUILD_TYPE and print cmake_build_type in oneflow doctor #5505
- [enhancement][ci][build][need-simple-ci] Fix macOS for recent changes #5705
- [bug][build] fix return type error on gcc 4.8.5 #5660
- [enhancement][build] Check CMAKE_BUILD_TYPE #5656
- [enhancement][build] add -Werror=return-type #5655
- [enhancement][build] Clean and fix for new py dir #5618
- [enhancement][build] cmake: disable array-bounds check & treat warnings as errors for pyextobj and oneflow_internal & fix warnings #5838
- [bug][build] set CMAKE_BUILD_TYPE to Release if undefined #5842
- [enhancement][build][need-simple-ci] Fix all warnings & Add option TREAT_WARING_AS_ERROR to cmake #5751
- [enhancement][build] add CMAKE_INTERPROCEDURAL_OPTIMIZATION in fast cmake cache #5970
- [enhancement][build] add clang tidy target #5957
- [bug][build] cmake: fix cmake cache args in opencv #5959
- [enhancement][build] Add cmake option USE_SYSTEM_NCCL #5897
- [enhancement][build] cmake: include third party headers as system headers to avoid warnings #5879
- [enhancement][build] Ignore opencv-python on machine aarch64 #5884
- [enhancement][build] enable CMake first class cuda support #5858
- [bug][build] Fix compile warning (strict-aliasing) #5872
- [enhancement][bug][build][need-simple-ci] Upgrade gtest and fix some errors raised by clang #6079
- [bug][ci][build] cmake: fix ninja build in CI #6072
- [bug][build] fix files not actually removed when building for multiple python versions #6060
- [bug][build][api] functional_api: fix build error in mac os #6010
- [bug][build][need-simple-ci][need-single-client-tests] Fix recompile from scratch #6036
- [bug][build] Turn on NVCC's warnings #6011
- [bug][build][need-single-client-tests] fix bundle .so of other python version #6034
- [bug][ci][build][need-single-client-tests] use copy_all_files_in_dir to replace copy_files #6033
- [enhancement][build] check compiler version in cmake #6026
- [enhancement][build] Add CUDA_NVCC_THREADS_NUMBER #6017
- [enhancement][build][need-simple-ci] optimize of_include_copy #5978
- [enhancement][ci][build][need-single-client-tests] CI: remove -DTREAT_WARNINGS_AS_ERRORS=OFF #6008
- [enhancement][build][xla] xrt: fix all warnings #5915
- [enhancement][build] Prevent opencv compile failure with std 17 #5997
- [enhancement][build] Use bundled cub #5998
- [enhancement][ci][build] update clang tidy diff warnings-as-errors option #5989
- [enhancement][build] Update run_clang_tidy.py to set return code and add warning-as-errors #5977
- [enhancement][build] check: fix clang-tidy-diff commands #5972
- [bug][build] Suppress NVCC warning #177-D #6094
XLA enhancements:
- [bug][xla] Make the blob header memory aligned. #5286
System:
- [enhancement][system] Refactor Memory Zone #5072
- [enhancement][system] Add interface InferContext::OutputTensorDesc #5219
- [bug][system] Lazy construct functor to make sure that the operators has already been registered. #5225
- [enhancement][system] Refactor infer ctx output isdynamic #5220
- [enhancement][system] Refactor infer ctx input isdynamic #5211
- [enhancement][system] Wake up the heartbeat thread immediately #5081
- [enhancement][system] Fix xla test case fail #5203
- [enhancement][system] Add interface InferContext::InputDType #5153
- [purge][system] delete const_cast in Output #5196
- [feature][system] Add hwloc for topology detection #5291
- [enhancement][system] fix registry may segment #5336
- [enhancement][system] Use functional api instead of op_expr_helper::XXXOp. #5364
- [enhancement][system] move btob to op #5274
- [documentation][system] Add Latest News section in README #5361
- [enhancement][bug][system] fix dropout module: return directly if not training #5346
- [bug][system] add missing JUST #5357
- [documentation][system] Add more communication outlets on README #5359
- [enhancement][feature][system] CommNet dynamic register memory #5281
- [enhancement][system] Use symbol device #5341
- [enhancement][system] fix multithread bug in env #5283
- [bug][system][api] fix bug in cfg_replacement #5335
- [bug][system] Fix create log directory thread-unsafe #5326
- [bug][system] fix_bug_in_make_parallel #5328
- [enhancement][system][cfg] replace train_conf, job_conf using cfg::xx #5263
- [enhancement][system][quantization] support tensorrt in qat #5287
- [enhancement][system][api] Export functional apis for oneflow.experimental. #5313
- [enhancement][system] fix bug check between cfg enum and proto enum #5285
- [enhancement][system] replace CHECK_EQ using CHECK_EQ_OR_RETURN #5279
- [enhancement][system] Refactor SbpXXX to cfg::SbpXXX #5120
- [enhancement][system][api] add detach for LazyMirroredtensorImpl #5270
- [enhancement][system] shorten XXIsDynamic4ArgNameAndIndex to be xxIsDynamic #5265
- [enhancement][system][cfg] job_config to cfg #5235
- [feature][system] Multi-Client LogicalRun degenerate to PhysicalRun #5479
- [enhancement][system] fix ConstructOp without JUST #5480
- [enhancement][system] Output arg modifier return maybe part 1 #5451
- [feature][system][interface] Fea/nn graph/graph build ctx #5420
- [enhancement][system] Throw exception if check failed #5457
- [feature][system] multi client launch #5372
- [enhancement][system][api] Optimize reduce mean #5452
- [enhancement][system] export Tensor only to python #5440
- [enhancement][system] Output arg modifier return maybe part_0 #5447
- [enhancement][system] ThreadMgr support AddPlan #5450
- [enhancement][system] Refactor infer ctx input tensordesc #5226
- [enhancement][system][api] instruction builder return maybe #5442
- [feature][system][interface] MultiClientSessionContext #5421
- [enhancement][feature][system] add launcher, update multi client launch and exit #5414
- [purge][system][refactor] Remove IOConf #5419
- [enhancement][system] Dev refine generator #5426
- [enhancement][system] Support inplace operations #5204
- [enhancement][system][refactor] Dev refactor generator #5397
- [enhancement][system] Add new placement init func #5408
- [enhancement][system] NNGraphIf #5387
- [enhancement][system][refactor] Cast explicitly in unpack call to avoid conflict with Optional. #5380
- [enhancement][system][interface] [Random Generator] Part2: Migrate functional dropout #5378
- [enhancement][system] replace ForeignJobInstance using JobInstance #5374
- [enhancement][system][refactor] Speedup reshape module by 5x. #5381
- [feature][system][interface] [Random Generator] Part1: Dev random generator #5360
- [enhancement][system] Add ONEFLOW_STREAM_CUDA_EVENT_FLAG_BLOCKING_SYNC #5612
- [enhancement][system] [part2]Remove singleclient outdated api #5568
- [feature][system][interface] nn.Graph call and launch impl #5580
- [enhancement][system] remove outdated doctest api and "@experimental_api" #5564
- [feature][system][interface] Register ForeignCallback and Watcher in Multi-Client #5591
- [enhancement][system] [Part-1]remove outdated api and files of multi-client on master branch #5556
- [feature][system][interface] LazyInterpret build LocalTensor if input is local #5582
- [enhancement][system] add job_pass MultiClientAutoSourceAndSinkTick #5507
- [feature][system] Fea/nn graph/optimizer #5533
- [feature][system][interface] New/CloseRuntimeBuffers and RunLazyJob impl #5571
- [feature][system][refactor][interface] NNGraph interface and implement for CompileAndRuntime #5558
- [feature][system] Fea/nn graph/forward graph #5516
- [enhancement][system] Lazy job stream type #5389
- [enhancement][system] Refactor single client autotick #5506
- [enhancement][system] replace underline using dot in single client #5547
- [bug][system] fix return type #5548
- [feature][system][interface] LazyInterpret for UserOpExpr #5544
- [enhancement][system] Add ProfilerStart/ProfilerStop API #5542
- [feature][system][interface] LazyInterpreter for FetchOutputOpExpr and set op parallel_distribution #5527
- [enhancement][system] Multi client push pull #5492
- [enhancement][system] registry_callback_fn return maybe #5456
- [enhancement][system] bw_gen_fn return maybe #5455
- [enhancement][system] gen_bw_fn return maybe #5454
- [enhancement][system] Compatible single client #5417
- [feature][system][interface] GlobalMultiClientEnv and refine EagerExecution #5523
- [enhancement][system] Job pass maybe system #5503
- [enhancement][system] Remove Plan::net_topo #5502
- [feature][system][interface] LazyInterpret for FeedVariableOpExpr #5490
- [enhancement][system] Input arg modifier return maybe #5453
- [feature][system][interface] Fea/nn graph/block scope #5498
- [feature][system] jit_fuse_cast_scale #5332
- [enhancement][system] Remove obsolete Profiler #5747
- [enhancement][system][api] Dev fix batch norm not stats #5733
- [enhancement][system] rename rpc_token to TransportToken #5735
- [enhancement][system][api] Refactor maximum minimum py2cpp #5724
- [enhancement][system] Replace piece_id with comm_net_sequence_number #5731
- [enhancement][system] beautify stack frame #5686
- [enhancement][system] Add env ONEFLOW_KERNEL_DISABLE_BLOB_ACCESS_CHECKER #5728
- [enhancement][system] Add env ONEFLOW_THREAD_ENABLE_LOCAL_MESSAGE_QUEUE #5720
- [enhancement][system][api][refactor] Refactor functional sub, mul and div apis #5713
- [feature][system] ddp #5008
- [enhancement][system][api][refactor] Refactor functional matmul and add apis. #5697
- [bug][system] Fix ClearKV("plan") #5710
- [enhancement][system] Rename cpu to async cpu #5712
- [enhancement][system] Support tensor.to()/to_local() #5271
- [feature][system][refactor][interface] Multi-Runtime for multi nn.Graph #5683
- [bug][system][refactor] Add tag for Optional inplace constructor #5619
- [enhancement][system] Move Global to env scope #5670
- [enhancement][system] add JUST wrapper #5681
- [enhancement][system] New sync consistent meta info #5634
- [enhancement][system][refactor][interface] Refactor RuntimeCtx for multi-runtime #5664
- [feature][system][interface] Feat: memory shared between EagerTensor with VariableRegst #5649
- [enhancement][system] Use functional call directly instead of construct a module and then call-Add #5613
- [enhancement][system] disable eager_op consistent mode #5647
- [enhancement][system] add msg_penddin_list in ibverbs_qp to optimize qp_init_attr.cap.max_send_wr #5485
- [enhancement][system] IBVerbsCommNet add knobs #5626
- [enhancement][system] Prune python tensor #5596
- [feature][system][interface] Feat: LazyInterpret infer op / tensor ParallelDescScope #5625
- [enhancement][system] Replace src tick with wait and send ids #5603
- [enhancement][system] Support symbol placement type in functional. #5627
- [enhancement][system][api][refactor][interface] Dev advanced indexing #5559
- [enhancement][system] Optimize maybe. #5839
- [enhancement][system] Decorator 4 disable recursive boxing call #5796
- [enhancement][system] add_eager_boxing_and_op_interpreter_dispatch_error_info #5819
- [enhancement][system] Kernel CUDA Graphs Support #5725
- [bug][system] Fix placement print bug #5853
- [bug][system] when error msg formatting fails, return error->DebugString #5844
- [enhancement][system][refactor] Rename variables named `*parallel_distribution*` to `*nd_sbp*` (1) #5815
- [feature][system][interface] Support Free EagerTensor caught in nn.Graph build #5777
- [enhancement][system] Reuse CUDA event / Refine BnInOp2Blob / Refine channel #5837
- [enhancement][system][serving] fix bug in AddInputOutputOpsPass: check existence of key in HashMap(inferface_lbi2scope_sym_id) #5653
- [enhancement][system][api] unpack_call: impl new `unpack_call_dispatcher` for better performance #5820
- [feature][system] Feat consistent tensor python constructor #5812
- [feature][system] Support 0shape tensor #5620
- [documentation][system] fix launcher description #5770
- [feature][system][interface] Multi-nn.Graph memory reuse by Chunk manager #5658
- [bug][system] Fix naive b2p error #5806
- [enhancement][system] set created generator with default rng seed #5801
- [enhancement][system] enhance_local_to_consistent #5761
- [feature][system] add flow.randn #5736
- [enhancement][system] Refactor hierarchical parallel cast autograd #5764
- [enhancement][system] Collective boxing executor add_plan delete_plan #5495
- [enhancement][system] Fix throw abort #5795
- [enhancement][system] DECORATE #5794
- [enhancement][system] Inferface eager boxing #5682
- [enhancement][system] extract_consistent_to_consistent_op_expr #5870
- [enhancement][system] disable backward pass consistent tensor meta check. #5871
- [enhancement][system] Add CudaStreamIndexGenerator::GenerateNamedStreamIndex #5940
- [bug][system] Only query PCI bus id when CUDA version >= 11 #5937
- [enhancement][system] maybe: add `JUST_MSG` and `CHECK_JUST_MSG` #5904
- [bug][system] Fix bug scalar #5950
- [enhancement][system] framework: fix rvalue reference warnings #5948
- [purge][system] Remove CudaWorkType #5942
- [enhancement][system] refactor_symbol #5941
- [bug][system] consistent_tensor_infer_cache: fix memory leak #5938
- [feature][system] support to print gpu #5936
- [enhancement][system] Bugfix static check #5935
- [bug][system] fix nccl_version log #5934
- [bug][system] Fix bug of multi-GPU train nn.Graph extra mem cost in rank 0 #5930
- [enhancement][system] Only gradient acc be scheduled in parallel. #5926
- [enhancement][bug][system] fix_ddp_bug_on_8_process #5929
- [enhancement][system] Fix bug error msg format #5866
- [feature][system] print consistent tensor data #5902
- [bug][system] Move parse env to the constructor #5922
- [enhancement][system] Remove GlobalWorkStreamId/GlobalThrdId #5917
- [bug][system] shared_or_scalar: fix alias warnings #5916
- [purge][system] Remove CompActor #5919
- [enhancement][system] Use symbol dtype #5641
- [enhancement][feature][system] Control Graph / Session / Env's python c++ object destruction #5845
- [enhancement][bug][system] Sync access and assign indexing tensor. #5907
- [enhancement][system][api][refactor] Dev consistent arange #5883
- [enhancement][system] Lazy interpreter for new ConsistentToConsistentOpExpr #5903
- [bug][system] Fix BUG of LazyInterpret FreeEagerTensor memory shared with regst #5891
- [bug][system] fix typo in `raise RuntimeError` #5890
- [enhancement][system][refactor] Rename the `ParallelDistribution` class to `NdSbp` #5814
- [feature][system] add flow.rand #5722
- [feature][system] Lazy Interpret support infer default device cpu #5880
- [enhancement][system] Tensor str #5783
- [feature][system][interface] Lazy to_consistent #5774
- [enhancement][system] wait vm empty before exiting #5860
- [enhancement][system] Eager boxing n to 1 #5949
- [enhancement][system] add kernel observer #6052
- [enhancement][ci][system] Optimize ddp broadcast and add speed/memory test in ci #6044
- [enhancement][system] add var to control only print warning once when blocked #6045
- [enhancement][system][refactor] Rewrite pow and logical functional apis #6032
- [enhancement][system] Token seq id #5964
- [enhancement][documentation][system] Remove python function wrapper. #6012
- [feature][system] Add timeout and loc for blocking calls #6007
- [enhancement][system] Eager boxing 1 to n #5943
- [enhancement][system] Boxing expr #6015
- [enhancement][system] new_X_to_B #5987
- [enhancement][system] Add unimplemented return information #5952
- [enhancement][system] Revert "Faster decorator" #6006
- [enhancement][system] Throw exception if using advanced indexing for tensor setitem #6001
- [enhancement][system] Support eager boxing sm 2 sn #5869
- [enhancement][system] Move framework/local_dep_object.* to the eager directory #5988
- [enhancement][system] Fix builtin op arg tuple. #5464
- [feature][system][refactor] Dev functional multiple signatures #5982
- [enhancement][system] Faster decorator #5996
- [enhancement][system] Placed nd sbp #5995
- [feature][system] Support asymmetric input/output/variable tensors in nn.Graph #5983
- [enhancement][system] LightActor #5868
- [bug][system] Prevent running oneflow in forked subprocess #5976
- [bug][system] common/error: fix build error in mac os #5971
- [bug][system] fix_bug_test_tensor_str #5958
- [enhancement][system] Refine StreamContext #6191
- [enhancement][system] container_util: fix VectorAt, remove useless MutMapAt #6172
- [enhancement][system] Typesafe KernelState #6198
- [enhancement][system] Primitive based copy task node #6195
- [feature][system][interface] Lazy support Scalar #6181
- [enhancement][system] Disable implicit boxing when parallel num eq one #6188
- [enhancement][system] Primitive #6183
- [enhancement][system] Remove IDMgr::GetGpuPhyIdFromThrdId/IDMgr::GetDeviceTypeFromThrdId #6169
- [enhancement][system] remove op_expr_helper inside gradient_funcs #6057
- [feature][system][api] Add tensor yaml, support export tensor functional api. #6099
- [feature][system] Plan memory log #6151
- [feature][system] Add dtype bfloat16 #5304
- [enhancement][system] StreamContext #6129
- [bug][system] Fix wrong inplace acc grad #6146
- [enhancement][system] UserKernel remove job_desc #6144
- [enhancement][system][api] Fea/graph/add outputs buffer to enable pipeline #6126
- [enhancement][system] not fuse request for nccl 2.10.3 #6136
- [bug][system] NewUniqueId thread safe #6141
- [enhancement][system] XRT remove job_desc #6139
- [enhancement][system] SystemOpFillJobNamePass #6138
- [enhancement][system] mv_boxing_folder_to_core #6140
- [enhancement][system] Refactor boxing interpreter to boxing expr #6134
- [enhancement][system] Eager boxing one to one #6048
- [enhancement][system] Vm cpu efficiency #6110
- [enhancement][system] Naive generic boxing #6116
- [feature][system] send/recv #5992
- [enhancement][system] disable_print_stack_in_tensor_numpy #6123
- [feature][system] add all_reduce by to_consistent #5963
- [enhancement][system] KernelContext #6084
- [enhancement][bug][system] Fix sync nccl and async nccl deadlock #6071
- [bug][system][refactor] Refactor to local #6098
- [enhancement][system] Replace xor with hash combine (part 1) #6078
- [enhancement][system] Optimize error message #6073
- [enhancement][system] Rename Error::xx to Error::xxError #6049
- [enhancement][system] send formatted msg to glog #5999
- [feature][bottleneck][bug][system][interface] [Feat.] NNGraph new eager tensor for new variable created in JobPass #6091
- [bug][system] Fix bug of multi-GPU eager copy D2H extra mem cost in rank 0 #6092
- [enhancement][system][api] Rename module flow.F to flow._C #6053
- [feature][system][interface] [Feat.] Eager consistent OFRecordReader #6089
- [enhancement][system][api] Dev fix and align interface #6075
- [feature][bottleneck][bug][system][interface] NNGraph input/output valid by register tensors #6240
- [bug][system][interface] Fix bug of Multi-Client src tick output order #6221
- [enhancement][bug][system] Add cast primitive #6234
- [feature][bottleneck][system][interface] Auto FixPipelineStageIdPass #6204
- [enhancement][system] move scalar to oneflow namespace. #6235
- [enhancement][system] UserKernel init CUDA Graphs with state #6230
- [feature][system] Comm broadcast #6213
- [enhancement][system][refactor] Rename opname to optype_name in AutogradEngine #6154
- [enhancement][system] Add memset primitive #6218
- [enhancement][system] Add StreamContext::device_type()/DeviceCtx::device_type() #6217
- [feature][system] add all_gather and fix bug of multi rank doctest #6189
- [feature][system][interface] [Feat.] Lazy interpreter skip hierarchical_parallel_cast #6208
- [purge][system] Cleanup KernelUtil #6212
- [enhancement][system] StreamContextAdapter #6205
- [enhancement][system] Dev eliminate gcc warnings #6199
- [feature][bottleneck][system][interface] [Feat.] nn.Graph support grad acc with input/output tensor #6155
- [enhancement][system] Cpu symmetric s to s #6153
- [enhancement][system][upload-core] Op expr infer tensor meta #5064
- [enhancement][system] Infer consistent tensor meta #5362
CI enhancements:
- [bug][ci][api][interface] Refine module test #5232
- [enhancement][ci] Add Simple CI, runs CPU-only on GitHub hosted servers #5207
- [enhancement][ci] Run exe test in CPU-only #5202
- [enhancement][ci] Cancel all workflow runs but the latest #5206
- [enhancement][ci] Fix master not running Simple CI #5368
- [enhancement][ci] Refine Simple CI and Clang analysis #5367
- [enhancement][feature][bug][ci][documentation][interface] Fix upsample bilinear bug #5363
- [enhancement][ci] Build nightly for py39 #5318
- [enhancement][ci] Try distributed run for 3 times to prevent failure #5305
- [enhancement][ci] Upload Simple CI logs to cloud #5268
- [enhancement][ci] Remove cpu_op_eager and cuda_op_eager #5470
- [bug][ci] fix segfault in clang plugin #5437
- [enhancement][ci] Refine Simple CI error output #5435
- [enhancement][ci] Add conda env to Simple CI #5385
- [enhancement][ci] Fix clang plugin core file not found #5390
- [bug][ci] upload core when build with clang plugin #5384
- [bug][ci] clang plugin skip more files #5373
- [enhancement][ci] Use gh-action-scheduler-v2 #5370
- [enhancement][ci] relax speed threshold #5569
- [bug][ci] Fix wrong test path under compatible #5567
- [enhancement][ci][need-simple-ci] Prevent upload logs automatically #5560
- [enhancement][ci][interface] Add `nn.AdaptiveAvgPool1d` and `nn.AdaptiveAvgPool3d` #5445
- [feature][ci] add speed test in ci #5496
- [enhancement][ci] Reduce usage of Simple CI #5546
- [feature][bug][ci][api] Restruct upsample module #5524
- [feature][ci] multi client launcher test #5488
- [enhancement][ci] Remove automerge if cuda_new_interface failed #5519
- [enhancement][ci] Prevent adding subdir in python/test #5514
- [enhancement][ci] piprepo->pipindex #5517
- [enhancement][ci] add dynamic_loss_scale in ci tests #5337
- [enhancement][ci] Add timeout for wait_gpu_slot #5497
- [enhancement][feature][ci] new static check based on clang-tidy #5476
- [enhancement][ci] Fix url not downloadable in some browsers #5701
- [feature][ci] multi client multi machine test #5685
- [enhancement][ci] Add cpu new interface CI #5639
- [enhancement][ci][need-simple-ci] Mv clangtidy to simple ci #5667
- [enhancement][ci][need-simple-ci] use clang tidy appimage in ci #5841
- [enhancement][ci] Use gcc 7 in release to prevent error #5840
- [enhancement][ci] bn tol 1e-4 => 1e-3 #5811
- [enhancement][ci] fix distributed run on built dir #5810
- [enhancement][ci] fix third party mirror check_sum #5802
- [ci][documentation] find more accurately which files need to be doctested #5782
- [enhancement][ci] Print stack unconditionally #5779
- [enhancement][ci][need-simple-ci] Enable more checkers for clang-tidy in CI #5738
- [enhancement][ci] CI: add clang-tidy check to test.yaml #5920
- [ci][documentation] fix docstring in oneflow.nn.functional namespace #5807
- [enhancement][ci] disable TREAT_WARNINGS_AS_ERRORS in Release CI #5886
- [enhancement][ci] Skip ci jobs by git diff #5863
- [bug][ci] quick fix #5978 #6030
- [enhancement][bug][ci] fix clang tidy diff options and file format #5990
- [enhancement][ci] add flow.relu #5847
- [enhancement][ci] equal => allclose #6164
- [bug][ci][need-simple-ci] CI: fix clang tidy checks in simple ci #6161
- [enhancement][bug][ci][documentation][api] add interpolate and layer_norm docs #6157
- [bug][ci] update speed test #6113
- [enhancement][bug][ci][documentation][api] speed import oneflow #6107
- [bug][ci] Also try install dev deps for CODEGEN_PYTHON_EXECUTABLE #6115
- [bug][ci][need-simple-ci] set gtest_CMAKE_DEBUG_POSTFIX "d" #6085
- [enhancement][ci] add cache init file for clang and CI build with clang #6062
- [enhancement][ci] add emoji in speed test output, make it continue-on-error #6214
Test enhancements:
- [bug][test][interface] Fix acos ci bug #5217
- [feature][test] implement automated test #5321
- [enhancement][test] move generator test into ops folder to accelerate tests #5472
- [feature][test][api] Add autotest part2 #5467
- [enhancement][test][api][interface] Add some tests with the new framework for auto testing #5561
- [bug][test] fix test error when do multi case test on graph #5590
- [enhancement][test] Refine module test using auto test by yaochi #5484
- [enhancement][test] Add autotest for BatchNorm2d #5734
- [enhancement][test] RTH_update_op_test #5823
- [enhancement][test] dev adamw graph config #5745
- [feature][test][api][interface] Add new autotest #5562
- [bug][test] restore test of alexnet graph #5798
- [enhancement][test][interface] add zhangshen op-test #5600
- [feature][bug][tooling][test][interface] Record autotest wrong code #5923
- [enhancement][feature][test][api] add randint #5718
- [bug][test] fix multi machine test #5984
- [enhancement][test][interface] some op test #6095