The entire CP2K code is MPI-parallelized. Some additional loops are also OpenMP-parallelized. You should therefore take advantage of the MPI parallelization first. However, running one MPI-rank per CPU-core will probably lead to a memory shortage.
At this point, OpenMP threads can be used to utilize all CPU-cores without the large memory-footprint of an MPI-process.
The optimal ratio between MPI-ranks and OpenMP-threads depends on the kind of simulation you run. Do your own benchmarks! A ratio of two threads per rank is usually a good starting point.
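
To set up such a benchmark, it can help to list every way of splitting a node's cores between ranks and threads. The following is a minimal sketch, not an official CP2K tool: the helper names `rank_thread_layouts` and `launch_command` are made up for illustration, and the launcher (`mpirun -np`), the binary name (`cp2k.psmp`) and the input file name (`input.inp`) are assumptions that may differ on your system (for example `srun` or `mpiexec` on a cluster).

```python
def rank_thread_layouts(total_cores):
    """Return every (mpi_ranks, omp_threads) pair whose product is total_cores."""
    return [
        (total_cores // threads, threads)
        for threads in range(1, total_cores + 1)
        if total_cores % threads == 0
    ]


def launch_command(ranks, threads, input_file="input.inp"):
    """Build a command line for one layout.

    Assumption: an MPI+OpenMP build named cp2k.psmp started via mpirun.
    Adapt the launcher and binary name to your installation.
    """
    return f"OMP_NUM_THREADS={threads} mpirun -np {ranks} cp2k.psmp -i {input_file}"


if __name__ == "__main__":
    # Example: a node with 16 CPU-cores.
    for ranks, threads in rank_thread_layouts(16):
        print(launch_command(ranks, threads))
```

Running each printed command on the same input and comparing wall time and memory use shows which ratio suits your kind of simulation. The layout with two threads per rank (8 ranks with 2 threads each in this example) is the suggested first candidate.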