nccl.torch

Torch7 FFI bindings for the NVIDIA NCCL library.

Installation

Install NCCL from https://github.com/NVIDIA/nccl
Use CUDA 7.0 or later
Make sure libnccl.so is on your library search path

Collective operations supported

allReduce
reduce
broadcast
allGather
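
To make the list above concrete, here is a minimal plain-Lua model of what each collective is generally understood to do in NCCL. It is only a sketch: the function names (modelAllReduce, modelReduce, modelBroadcast, modelAllGather) are hypothetical and are not part of nccl.torch, ordinary Lua arrays stand in for per-device tensors, and summation is assumed as the reduction operator.

```lua
-- Plain-Lua model of the four collectives; NOT the nccl.torch API.
-- Each "device buffer" is an ordinary Lua array of numbers.

local function copy(buf)
   local out = {}
   for i = 1, #buf do out[i] = buf[i] end
   return out
end

-- Elementwise sum over all buffers (summation is an assumed reduction operator).
local function sumAll(bufs)
   local acc = copy(bufs[1])
   for r = 2, #bufs do
      for i = 1, #acc do acc[i] = acc[i] + bufs[r][i] end
   end
   return acc
end

-- allReduce: every device ends up with the reduced result.
local function modelAllReduce(bufs)
   local total = sumAll(bufs)
   local out = {}
   for r = 1, #bufs do out[r] = copy(total) end
   return out
end

-- reduce: only the root device receives the reduced result;
-- this model leaves the other buffers unchanged.
local function modelReduce(bufs, root)
   local out = {}
   for r = 1, #bufs do out[r] = copy(bufs[r]) end
   out[root] = sumAll(bufs)
   return out
end

-- broadcast: the root's buffer is copied to every device.
local function modelBroadcast(bufs, root)
   local out = {}
   for r = 1, #bufs do out[r] = copy(bufs[root]) end
   return out
end

-- allGather: every device receives all buffers concatenated in device order.
local function modelAllGather(bufs)
   local gathered = {}
   for r = 1, #bufs do
      for i = 1, #bufs[r] do gathered[#gathered + 1] = bufs[r][i] end
   end
   local out = {}
   for r = 1, #bufs do out[r] = copy(gathered) end
   return out
end

-- Two simulated devices, two elements each.
local bufs = { {1, 2}, {10, 20} }
print(table.concat(modelAllReduce(bufs)[2], " "))    -- 11 22
print(table.concat(modelReduce(bufs, 1)[1], " "))    -- 11 22
print(table.concat(modelBroadcast(bufs, 2)[1], " ")) -- 10 20
print(table.concat(modelAllGather(bufs)[1], " "))    -- 1 2 10 20
```

The model returns new tables for clarity; the real bindings operate on GPU tensors, and the example in this README uses allReduce in place.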

Example usage

The argument to a collective call should be a table of contiguous tensors, each located on a different device. For example, to perform an in-place allReduce on a table of tensors:

require 'nccl'
nccl.allReduce(inputs)

Here, inputs is a table of contiguous tensors of the same size, each located on a different device.
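
A fuller sketch of how such an inputs table might be built with cutorch is shown below. It assumes cutorch is installed and that more than one GPU is visible. The tensor size and fill values are arbitrary illustration choices, and the explicit cutorch.synchronizeAll() call is a precaution rather than a documented requirement of the binding.

```lua
require 'cutorch'
require 'nccl'

-- Build one contiguous tensor of the same size on each visible GPU.
local nGPU = cutorch.getDeviceCount()
local inputs = {}
for dev = 1, nGPU do
   cutorch.setDevice(dev)
   inputs[dev] = torch.CudaTensor(4):fill(dev)  -- device i holds all i's
end

-- In-place allReduce across the table of tensors.
nccl.allReduce(inputs)

-- Precaution: wait for all devices before reading results on the host.
cutorch.synchronizeAll()

-- After the call every tensor should hold the same reduced values.
-- If the reduction is a sum, each element would be 1 + 2 + ... + nGPU.
for dev = 1, nGPU do
   print(dev, inputs[dev]:float())
end
```

Tensors created with torch.CudaTensor(n) are contiguous; a tensor obtained through a non-contiguous view would need :contiguous() before being placed in the table.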
