howto:biochem_qmmm
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
howto:biochem_qmmm [2017/07/18 12:16] – dvanrompaey | howto:biochem_qmmm [2024/01/03 13:13] (current) – oschuett | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | =====QMMM for biochemical systems===== | + | This page has been moved to: https://manual.cp2k.org/trunk/methods/qm_mm/builtin.html |
- | This tutorial will illustrate the setup process of a QM/MM protein - substrate system. The model system we will be using is chorismate mutase, an enzyme that catalyzes the conversion of chorismate | + | |
- | + | ||
- | {{:howto:substrate_in_protein.png? | + | |
- | + | ||
- | The enzyme Chorismate Mutase with its native substrate | + | |
- | + | ||
- | + | ||
- | Before we can get started on the simulations, | + | |
- | + | ||
- | + | ||
- | ===Generating a topology=== | + | |
- | + | ||
- | Now that we have our coordinates, | + | |
- | + | ||
- | + | ||
- | < | + | |
- | for i in A B C; do antechamber -i Lig${i}.mol2 -o Ligand${i}.mol2 -fi mol2 -fo mol2 -c bcc -pf yes -nc -2 -at gaff2 -j 5 -rn CH${i}; done | + | |
- | for i in A B C; do parmchk2 -i Ligand${i}.mol2 -f mol2 -o Ligand${i}.frcmod -s 2; done | + | |
- | </ | + | |
- | + | ||
- | We will then use tleap to load in all the parameters from antechamber and parmchk2, generate parameters for the protein according to the Amber14 forcefield and combine everything into a complex. The commands for tleap can be placed in an input file and ran using `tleap -f script.in`. | + | |
- | + | ||
- | < | + | |
- | source leaprc.protein.ff14SB | + | |
- | source leaprc.gaff2 | + | |
- | source leaprc.water.tip3p | + | |
- | + | ||
- | ligA = loadmol2 LigandA.mol2 | + | |
- | ligB = loadmol2 LigandB.mol2 | + | |
- | ligC = loadmol2 LigandC.mol2 | + | |
- | + | ||
- | loadamberparams LigandA.frcmod | + | |
- | loadamberparams LigandB.frcmod | + | |
- | loadamberparams LigandC.frcmod | + | |
- | + | ||
- | + | ||
- | protein = loadPDB Protein.pdb | + | |
- | complex = combine {protein ligA ligB ligC} | + | |
- | </ | + | |
- | + | ||
- | Now that we have assembled our complex, we have to solvate it. For simplicity' | + | |
- | + | ||
- | < | + | |
- | solvateBox complex TIP3PBOX 14.0 iso | + | |
- | addIonsRand complex | + | |
- | saveamberparm complex complex.prmtop complex.inpcrd | + | |
- | quit | + | |
- | </ | + | |
- | + | ||
- | These two files contain everything we need to get started. Complex.prmtop contains the topology information and complex.inpcrd contains the coordinates. Using your favourite visualisation tool, check that the Na+ ions did not get placed right next to your ligand - their placement is random. The box size can be found at the bottom of the inpcrd file, followed by the vectors. These dimensions should be specified in the `&CELL` subsection of your input file. | + | |
- | + | ||
- | + | ||
- | ===Energy minimization=== | + | |
- | + | ||
- | We can now move on to actually running CP2K. We will first run 1000 steps of energy minimization (or fewer, depending on the convergence) with the input file `em.inp` to eliminate bad contacts. For quantities like cell size, for instance, CP2K allows you to specify your units by placing them in brackets, for instance `TIMESTEP [fs] 0.5`. | + | |
- | + | ||
- | + | ||
- | < | + | |
- | cp2k can be run using a single process: | + | |
- | cp2k.sopt em.inp > em.out | + | |
- | + | ||
- | or using several processes, where N is the number of processes: | + | |
- | mpirun -np N cp2k.popt em.inp > em.out | + | |
- | </ | + | |
- | + | ||
- | Visualize your simulation to check for any abnormalities. Take a look at the output file, paying extra attention to any warnings you might receive. Note: CP2K will warn you about missing forcefield terms. You can have CP2K output these using `& | + | |
- | + | ||
- | In the next step we will perform 5 ps of dynamics in the NVT ensemble, to equilibrate the temperature of our simulation using this `nvt.inp`. We are using the &EXT RESTART section to restart with the last coordinates of the energy minimization procedure. Once the simulation has finished, take a look at the temperature file `NVT-1.PARTICLES.temp` to check that the temperature is oscillating around the correct value (in this case, 298K). Note that we are using the CSVR thermostat with a low time constant of 10 fs to efficiently and quickly control the system temperature. For production simulations a higher time constant such as 100 fs should be chosen to avoid interfering with the dynamics of the system. | + | |
- | + | ||
- | The next step is simulating our system in the constant pressure NPT ensemble to get the correct dimensions for our cell. `npt.inp` runs 50 ps of NPT, outputting the cell dimensions into a separate file every 100 timesteps. A C-C distance in of the chorismate molecules is constrained, | + | |
- | + | ||
- | Congratulations, | + | |
- | + | ||
- | ====Moving on to QMMM==== | + | |
- | <img src=" | + | |
- | <span class=" | + | |
- | + | ||
- | As chorismate mutase performs its catalytic function through noncovalent interactions, | + | |
- | + | ||
- | < | + | |
- | & | + | |
- | MM_INDEX 5668 5669 5670 5682 5685 5686 | + | |
- | &END QM_KIND | + | |
- | & | + | |
- | MM_INDEX 5663 5666 5667 5671 5673 5675 5676 5678 5680 5684 | + | |
- | &END QM_KIND | + | |
- | & | + | |
- | MM_INDEX | + | |
- | &END QM_KIND | + | |
- | </ | + | |
- | + | ||
- | Now that the QM atoms are defined, CP2K needs to know how to treat these. We will use the semi-empirical AM1 method as our QM method, electrostatically embedded in the MM system: `E_COUPL COULOMB`. This allows for the QM region to be polarized by the MM environment. Mechanical embedding, where the QM region only interacts with the MM region as point charges and no polarization occurs, can be selected through `E_COUPL NONE`. In DFT calculations, | + | |
- | + | ||
- | < | + | |
- | parmed complex.prmtop | + | |
- | changeLJSingleType @5724 0.3019 0.047 | + | |
- | changeLJSingleType @4843 0.3019 0.047 | + | |
- | + | ||
- | parmout complex_LJ_mod.prmtop | + | |
- | quit | + | |
- | </ | + | |
- | + | ||
- | The input file `monitor.inp` will perform 5 ps of simulation in the NVT ensemble. Note that the parameter file should be changed to our modified file with LJ sites on the OH hydrogens. We have also defined a collective variable (CV), i.e. a variable that that describes our process of interest well and allows us to monitor this variable as a surrogate for the process. The CV for this system is the difference in distance between the C-O bond that we will be breaking and the C-C bond that we will be forming. Start the simulation and observe the collective variable output in the MONITOR-COLVAR.metadynLog file. | + | |
- | + | ||
- | + | ||
- | ===Metadynamics=== | + | |
- | + | ||
- | We have now simulated a full QM/MM system for 5 ps. Unfortunately, | + | |
- | + | ||
- | + | ||
- | You can now run the metadynamics simulation using `metadynamics.inp`. The evolution of the collective variables over time can be observed in the `METADYN-COLVAR.metadynLog` file. The substrate can be found near the 3 to 8 bohr CV region, whereas the product can be found near the -3 to -8 bohr region. For this tutorial, we will stop the simulation once we have sampled the forward and reverse reaction once. In a production setting you should observe several barrier crossings and observe the change in free energy profile as a function of simulation time. The free energy landscape for the reaction can be obtained by integrating all the gaussians used to bias the system. Fortunately, | + | |
- | + | ||
- | < | + | |
- | graph.sopt -cp2k -file METADYN-1.restart -out fes.dat | + | |
- | </ | + | |
- | + | ||
- | This outputs the `fes.dat` file, containing a one-dimensional energy landscape. Plot it to obtain a graph charting the energy as a function of the distance difference. | + | |
- | + | ||
- | + | ||
- | + | ||
- | | + | |
- | + |
howto/biochem_qmmm.1500380175.txt.gz · Last modified: 2020/08/21 10:15 (external edit)