howto:compile_with_cuda
Differences
This shows you the differences between two versions of the page.
howto:compile_with_cuda [2019/04/09 09:49] – [Libcusmm] alazzaro | howto:compile_with_cuda [2020/08/21 10:15] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 4: | Line 4: | ||
* Anything that uses '' | * Anything that uses '' | ||
* FFTs, when compiled with '' | * FFTs, when compiled with '' | ||
- | * If linked against an accelerated scalapack/ | + | * If linked against an accelerated scalapack/ |
To enable all CUDA acceleration options the following lines have to be added to the ARCH-file: | To enable all CUDA acceleration options the following lines have to be added to the ARCH-file: | ||
Line 10: | Line 10: | ||
NVCC = / | NVCC = / | ||
DFLAGS += -D__ACC -D__DBCSR_ACC -D__PW_CUDA | DFLAGS += -D__ACC -D__DBCSR_ACC -D__PW_CUDA | ||
- | LIBS += -lcudart -lcublas -lcufft -lrt | + | LIBS += -lcudart -lcublas -lcufft -lnvrtc |
</ | </ | ||
- | See [[https:// | + | See [[https:// |
As a prerequisite the [[https:// | As a prerequisite the [[https:// | ||
===== Libcusmm ===== | ===== Libcusmm ===== | ||
- | The acceleration of DBCSR is performed by libcusmm. This library provides a number of kernels. Each of these kernels can multiply blocks of specific blocksizes. The blocksizes of a simulation are determined by the employed basis-set. As of DBCSR 1.1, by default libcusmm is able to generate any kernel for {m,n,k}< =80, see [[ https:// | + | The acceleration of DBCSR is performed by libcusmm. This library provides a number of kernels. Each of these kernels can multiply blocks of specific blocksizes. The blocksizes of a simulation are determined by the employed basis-set. As of DBCSR 1.1, by default libcusmm is able to generate any kernel for {m,n,k}≤80, see [[ https:// |
< | < | ||
Line 38: | Line 38: | ||
</ | </ | ||
- | More supported GPUs can be added, please refer to the description | + | More supported GPUs can be added, please refer to [[https:// |
- | + | ||
- | New kernel parameters have to be optimized, which [[howto: | + | |
===== Profiling ===== | ===== Profiling ===== | ||
- | If you are interested in profiling CP2K with nvprof have a look at [[dev: | + | If you are interested in profiling CP2K with nvprof have a look at [[dev: |
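The Libcusmm paragraph above states that, as of DBCSR 1.1, kernels can be generated by default for any {m,n,k}≤80. The following is a minimal sketch of how one might check a set of block sizes against that limit before running. The helper names `within_default_range` and `uncovered_triplets` are hypothetical and not part of DBCSR or CP2K. Treating every combination of block sizes as a possible (m, n, k) triplet is a simplifying assumption.

```python
from itertools import product

# Limit taken from the text above: kernels can be generated for {m,n,k} <= 80.
DEFAULT_MAX_DIM = 80


def within_default_range(m, n, k, max_dim=DEFAULT_MAX_DIM):
    """Return True if every block dimension is positive and at most max_dim.

    Hypothetical helper for illustration; not a DBCSR or CP2K function.
    """
    return all(1 <= d <= max_dim for d in (m, n, k))


def uncovered_triplets(block_sizes, max_dim=DEFAULT_MAX_DIM):
    """List every (m, n, k) built from block_sizes that exceeds the limit.

    Assumption: any combination of the given block sizes may occur as a
    multiplication triplet. Real simulations may use fewer combinations.
    """
    sizes = sorted(set(block_sizes))
    return [
        t for t in product(sizes, repeat=3)
        if not within_default_range(*t, max_dim=max_dim)
    ]


if __name__ == "__main__":
    # Example block sizes chosen for illustration only.
    print(uncovered_triplets([5, 13, 90]))
```

An empty result means that, under the assumptions above, every triplet falls inside the default range. Any triplet that is listed exceeds the default limit stated in the text.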
howto/compile_with_cuda.1554803349.txt.gz · Last modified: 2020/08/21 10:15 (external edit)