Nodes | Cores | Best time, no GPU (s) | Config for best no-GPU time | Time with GPU, 8 OMP threads per MPI task (s) |
---|---|---|---|---|
128 | 1024 | 1240.84 | 4 OMP threads per MPI task | 2390.252 |
256 | 2048 | 464.047 | 1 OMP thread per MPI task | 693.845 |
512 | 4096 | 253.007 | 1 OMP thread per MPI task | 313.34 |
1024 | 8192 | 149.176 | 2 OMP threads per MPI task | 201.779 |
2048 | 16384 | 97.25 | 2 OMP threads per MPI task | - |
4096 | 32768 | 66.051 | 4 OMP threads per MPI task | - |
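
The timings above can be turned into strong-scaling figures with a few lines of Python. This is a minimal sketch, not part of the original benchmark: the choice of the 128-node run as the baseline, the definitions of speedup and efficiency, and the names `RUNS` and `scaling` are all assumptions made for illustration. Missing GPU timings (shown as `-` in the table) are represented as `None`.

```python
# Timings copied from the table above:
# (nodes, best no-GPU time in s, time with GPU in s or None if not reported).
RUNS = [
    (128, 1240.84, 2390.252),
    (256, 464.047, 693.845),
    (512, 253.007, 313.34),
    (1024, 149.176, 201.779),
    (2048, 97.25, None),
    (4096, 66.051, None),
]


def scaling(runs):
    """Return (nodes, speedup, efficiency, gpu_ratio) for each run.

    Assumed definitions (not stated in the source):
      speedup    = baseline time / time, with the first run as baseline
      efficiency = speedup / (nodes / baseline nodes)
      gpu_ratio  = GPU time / no-GPU time (values above 1 mean the GPU run was slower)
    """
    base_nodes, base_time, _ = runs[0]
    rows = []
    for nodes, cpu_time, gpu_time in runs:
        speedup = base_time / cpu_time
        efficiency = speedup / (nodes / base_nodes)
        gpu_ratio = gpu_time / cpu_time if gpu_time is not None else None
        rows.append((nodes, speedup, efficiency, gpu_ratio))
    return rows


if __name__ == "__main__":
    print(f"{'nodes':>6} {'speedup':>8} {'effic.':>7} {'GPU/CPU':>8}")
    for nodes, speedup, efficiency, gpu_ratio in scaling(RUNS):
        ratio = f"{gpu_ratio:8.2f}" if gpu_ratio is not None else f"{'-':>8}"
        print(f"{nodes:>6} {speedup:8.2f} {efficiency:7.2f} {ratio}")
```

Because efficiency is measured against the 128-node run, a value above 1 means a run scaled better than linearly relative to that baseline; a different baseline would shift every figure.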