howto:biochem_qmmm
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
howto:biochem_qmmm [2017/08/02 07:06] – dvanrompaey | howto:biochem_qmmm [2024/01/03 13:13] (current) – oschuett | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ======QMMM for biochemical systems====== | + | This page has been moved to: https://manual.cp2k.org/trunk/methods/qm_mm/builtin.html |
- | *Original author: Dries Van Rompaey | + | |
- | *Input files: {{ : | + | |
- | + | ||
- | This tutorial will illustrate the setup process of a QM/MM protein - substrate system. The model system we will be using is chorismate mutase, an enzyme that catalyzes the conversion of chorismate to prephenate, which is a critical reaction in the shikimate pathway. One attractive feature of this enzyme is that it catalyzes the reaction in an electrostatic manner, without forming covalent bonds to the substrate. This allows us to treat the complete enzyme at the computationally cheap MM level, saving us the headache of setting up bonds across the QM/MM border. In the production simulations we will simulate the substrate at the QM level (using the semi-empirical AM1 method), but we will first equilibrate our system with all components at the MM level. | + | |
- | + | ||
- | {{ : | + | |
- | The enzyme Chorismate Mutase with its native substrate | + | |
- | + | ||
- | + | ||
- | Before we can get started on the simulations, | + | |
- | + | ||
- | + | ||
- | =====Generating a topology===== | + | |
- | + | ||
- | Now that we have our coordinates, | + | |
- | + | ||
- | + | ||
- | < | + | |
- | for i in A B C; do antechamber -i Lig${i}.mol2 -o Ligand${i}.mol2 -fi mol2 -fo mol2 -c bcc -pf yes -nc -2 -at gaff2 -j 5 -rn CH${i}; done | + | |
- | for i in A B C; do parmchk2 -i Ligand${i}.mol2 -f mol2 -o Ligand${i}.frcmod -s 2; done | + | |
- | </ | + | |
- | + | ||
- | We will then use tleap to load in all the parameters from antechamber and parmchk2, generate parameters for the protein according to the Amber14 forcefield and combine everything into a complex. The commands for tleap can be placed in an input file and ran using '' | + | |
- | + | ||
- | < | + | |
- | source leaprc.protein.ff14SB | + | |
- | source leaprc.gaff2 | + | |
- | source leaprc.water.tip3p | + | |
- | + | ||
- | ligA = loadmol2 LigandA.mol2 | + | |
- | ligB = loadmol2 LigandB.mol2 | + | |
- | ligC = loadmol2 LigandC.mol2 | + | |
- | + | ||
- | loadamberparams LigandA.frcmod | + | |
- | loadamberparams LigandB.frcmod | + | |
- | loadamberparams LigandC.frcmod | + | |
- | + | ||
- | + | ||
- | protein = loadPDB Protein.pdb | + | |
- | complex = combine {protein ligA ligB ligC} | + | |
- | </ | + | |
- | + | ||
- | Now that we have assembled our complex, we have to solvate it. For simplicity' | + | |
- | + | ||
- | < | + | |
- | solvateBox complex TIP3PBOX 14.0 iso | + | |
- | addIonsRand complex | + | |
- | saveamberparm complex complex.prmtop complex.inpcrd | + | |
- | quit | + | |
- | </ | + | |
- | + | ||
- | These two files contain everything we need to get started. Complex.prmtop contains the topology information and complex.inpcrd contains the coordinates. Using your favourite visualisation tool, check that the Na+ ions did not get placed right next to your ligand - their placement is random. The box size can be found at the bottom of the inpcrd file, followed by the vectors. These dimensions should be specified in the '' | + | |
- | + | ||
- | + | ||
- | =====Equilibration at MM level===== | + | |
- | + | ||
- | We can now move on to actually running CP2K. We will first run 1000 steps of energy minimization (or fewer, depending on the convergence) with the input file '' | + | |
- | + | ||
- | + | ||
- | < | + | |
- | cp2k can be run using a single process: | + | |
- | cp2k.sopt em.inp > em.out | + | |
- | + | ||
- | or using several processes, where N is the number of processes: | + | |
- | mpirun -np N cp2k.popt em.inp > em.out | + | |
- | </code> | + | |
- | + | ||
- | Visualize your simulation to check for any abnormalities. Take a look at the output file, paying extra attention to any warnings you might receive. Note: CP2K will warn you about missing forcefield terms. You can have CP2K output these using '' | + | |
- | + | ||
- | In the next step we will perform 5 ps of dynamics in the NVT ensemble, to equilibrate the temperature of our simulation using this '' | + | |
- | + | ||
- | The next step is simulating our system in the constant pressure NPT ensemble to get the correct dimensions for our cell. '' | + | |
- | + | ||
- | Congratulations, | + | |
- | + | ||
- | =====Moving on to QMMM===== | + | |
- | {{ : | + | |
- | Chorismate(left) is converted into prephenate(right) by Chorismate Mutase | + | |
- | + | ||
- | As chorismate mutase performs its catalytic function through noncovalent interactions, | + | |
- | + | ||
- | < | + | |
- | & | + | |
- | MM_INDEX 5668 5669 5670 5682 5685 5686 | + | |
- | &END QM_KIND | + | |
- | & | + | |
- | MM_INDEX 5663 5666 5667 5671 5673 5675 5676 5678 5680 5684 | + | |
- | &END QM_KIND | + | |
- | & | + | |
- | MM_INDEX | + | |
- | &END QM_KIND | + | |
- | </ | + | |
- | + | ||
- | Now that the QM atoms are defined, CP2K needs to know how to treat these. We will use the semi-empirical AM1 method as our QM method, electrostatically embedded in the MM system: '' | + | |
- | + | ||
- | < | + | |
- | parmed complex.prmtop | + | |
- | changeLJSingleType @5724 0.3019 0.047 | + | |
- | changeLJSingleType @4843 0.3019 0.047 | + | |
- | + | ||
- | outparm complex_LJ_mod.prmtop | + | |
- | quit | + | |
- | </ | + | |
- | + | ||
- | The input file '' | + | |
- | + | ||
- | + | ||
- | =====Metadynamics===== | + | |
- | + | ||
- | We have now simulated a full QM/MM system for 5 ps. Unfortunately, | + | |
- | + | ||
- | + | ||
- | You can now run the metadynamics simulation using '' | + | |
- | + | ||
- | < | + | |
- | graph.sopt -cp2k -file METADYN-1.restart -out fes.dat | + | |
- | </code> | + | |
- | + | ||
- | This outputs the '' | + | |
- | {{ : | + | |
- | + | ||
- | + | ||
- | Observe the two energy minima for substrate and product, and the maximum associated with the transition state. From this graph we can estimate the activation energy in the enzyme, amounting to roughly 44 kcal/mol. Results from literature generally estimate the energy to be around 20 kcal/mol. While we are in the right order of magnitude, our estimate is not very accurate. One contributing factor might be our choice of QM/MM subsystems. Up to this point we have only treated our substrate with QM and everything else with MM. This is a significant simplification, | + | |
- | + | ||
- | =====QM for enzyme residues===== | + | |
- | High-level calculations in literature have shown Arg90 (take care, this is residue 89 in parmed) to be the most important residue for catalysis, so we will treat this residue at the QM level. This presents us with a bit of trouble. Earlier, we could just add all of the chorismate atoms to the QM system, because it is not covalently bound to the protein. This situation is of course different for enzyme sidechains, which are bound to the main chain. We thus have to choose where to cut the residue into a QM part and an MM part. Try to cut across the most boring aliphatic C-C bond in your residue, and definitely avoid cutting across heavily polarized bonds. In our case, we will cut the Cα - Cβ bond. The Cα will be a part of the MM subsystem while the Cβ will be a part of the QM subsystem. To do this, replace the '' | + | |
- | + | ||
- | < | + | |
- | & | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | & | + | |
- | & | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | & | + | |
- | & | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | & | + | |
- | & | + | |
- | | + | |
- | | + | |
- | | + | |
- | & | + | |
- | & | + | |
- | | + | |
- | | + | |
- | | + | |
- | & | + | |
- | </ | + | |
- | + | ||
- | Note the '' | + | |
- | + | ||
- | < | + | |
- | change charge @1409 -0.3419 | + | |
- | </ | + | |
- | + | ||
- | Save the modified parameters to a new .prmtop file and run the metadynamics simulation. Don’t forget to modify the sections '' | + | |
- | + | ||
- | {{ : | + | |
- | + | ||
- | When we compare the two energy profiles, we can see that the activation energy for the reaction is around 35 kcal/mol, which is quite a bit closer to the literature values. Of course, there are a number of ways we can improve our estimate of the barrier, for instance sampling several transitions, | + |
howto/biochem_qmmm.1501657573.txt.gz · Last modified: 2020/08/21 10:15 (external edit)