Nodes | Cores | Best time without GPU (s) | Best config without GPU | Time with GPU, 8 OMP threads per MPI task (s) |
---|---|---|---|---|
64 | 512 | 1059.634 | 1 OMP thread per MPI task | 1297.749 |
128 | 1024 | 552.425 | 1 OMP thread per MPI task | 616.313 |
256 | 2048 | 301.17 | 2 OMP threads per MPI task | 372.706 |
512 | 4096 | 166.933 | 2 OMP threads per MPI task | 208.886 |
1024 | 8192 | 98.523 | 4 OMP threads per MPI task | 122.485 |
2048 | 16384 | 68.129 | 4 OMP threads per MPI task | 121.138 |
4096 | 32768 | 48.15 | 8 OMP threads per MPI task | 76.529 |
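
To make the scaling trend in the table easier to read, the timings can be turned into speedup and parallel efficiency figures. The following is a minimal sketch, not part of the original benchmark: it assumes the 64-node run as the baseline, and the names `RESULTS` and `scaling_summary` are hypothetical. The timing values are copied from the table above.

```python
# (nodes, best time without GPU in s, time with GPU in s), copied from the table.
RESULTS = [
    (64, 1059.634, 1297.749),
    (128, 552.425, 616.313),
    (256, 301.17, 372.706),
    (512, 166.933, 208.886),
    (1024, 98.523, 122.485),
    (2048, 68.129, 121.138),
    (4096, 48.15, 76.529),
]


def scaling_summary(results):
    """Return (nodes, speedup, efficiency, gpu_ratio) for each row.

    speedup    : baseline non-GPU time / this row's non-GPU time
    efficiency : speedup divided by the increase in node count
    gpu_ratio  : GPU time / best non-GPU time (values above 1 mean the GPU run is slower)
    The baseline is the first row (assumed here to be the smallest run).
    """
    base_nodes, base_time, _ = results[0]
    rows = []
    for nodes, cpu_time, gpu_time in results:
        speedup = base_time / cpu_time
        efficiency = speedup / (nodes / base_nodes)
        gpu_ratio = gpu_time / cpu_time
        rows.append((nodes, speedup, efficiency, gpu_ratio))
    return rows


if __name__ == "__main__":
    print("nodes  speedup  efficiency  gpu/cpu")
    for nodes, speedup, efficiency, gpu_ratio in scaling_summary(RESULTS):
        print(f"{nodes:5d}  {speedup:7.2f}  {efficiency:10.2f}  {gpu_ratio:7.2f}")
```

With these numbers, the GPU time exceeds the best non-GPU time at every node count, and efficiency relative to the 64-node run falls as nodes are added. A different baseline would change the efficiency values but not that ordering.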