Thanks, this should be quite doable. We need to think about what type the `pool_handle` should have in Python; probably cuda-python's `cudart.cudaMemPool_t`?
Hi @leofang and all,

Currently, the librmm library provides a `CudaAsyncMemoryResource` Python interface (https://docs.rapids.ai/api/rmm/stable/python_api/#rmm.mr.CudaAsyncMemoryResource), which lets users access the C++ `cuda_async_memory_resource` class. However, there is no equivalent Python interface for accessing the C++ `cuda_async_view_memory_resource` class.

`CudaAsyncMemoryResource` always creates a new memory pool through `cudaMemPoolCreate` (https://github.com/rapidsai/rmm/blob/branch-24.08/include/rmm/mr/device/cuda_async_memory_resource.hpp#L107), which is problematic when integrating librmm and cuDF into an existing GPU-accelerated application.

Such an application may need to use the default CUDA memory pool for stream-ordered memory allocation and deallocation through `cudaMallocAsync` and `cudaFreeAsync`. If librmm always creates a separate pool, the memory in that pool can only be used by librmm itself (or by CuPy and cuDF), wasting GPU memory.

We hope that librmm can provide a `CudaAsyncViewMemoryResource` Python interface, similar to the one shown below, to access the C++ `cuda_async_view_memory_resource` class. This would allow us to pass the default memory pool handle (obtained from `cudaDeviceGetDefaultMemPool`) to librmm.

Alternatively, librmm could introduce `rmm.mr.CudaAsyncDefaultMemoryResource()`, which obtains the pool handle from `cudaDeviceGetDefaultMemPool` automatically.

Thanks,
Lilo
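The ownership distinction driving this request can be illustrated with a toy sketch in plain Python (no GPU, rmm, or CUDA required; every class name below is an illustrative stand-in, not a real API): an owning resource creates its pool and eventually destroys it, while a view resource wraps a pool owned elsewhere and never destroys it.

```python
class MemPool:
    """Stand-in for a CUDA memory pool (cudaMemPool_t)."""
    def __init__(self):
        self.destroyed = False

    def destroy(self):  # stand-in for cudaMemPoolDestroy
        self.destroyed = True


class OwningAsyncResource:
    """Creates and owns its pool, like cuda_async_memory_resource."""
    def __init__(self):
        self.pool = MemPool()  # stand-in for cudaMemPoolCreate

    def close(self):
        self.pool.destroy()    # the owner tears its pool down


class AsyncViewResource:
    """Wraps an externally owned pool, like cuda_async_view_memory_resource."""
    def __init__(self, pool):
        self.pool = pool       # no pool creation, no ownership taken

    def close(self):
        pass                   # the pool's owner keeps responsibility


owning = OwningAsyncResource()
owning.close()
print(owning.pool.destroyed)   # -> True: the owning resource destroyed its pool

default_pool = MemPool()       # stand-in for the pool from cudaDeviceGetDefaultMemPool
view = AsyncViewResource(default_pool)
view.close()
print(default_pool.destroyed)  # -> False: the view left the default pool alive
```

With view semantics, the same default pool could serve both librmm consumers and other parts of the application that call `cudaMallocAsync` directly, which is the sharing this issue asks for.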