GROMACS Tutorial

Step Two: Prepare the Ligand Topology

We must now deal with the ligand. But how does one come up with parameters for some species that the force field does not automatically recognize? Proper treatment of ligands is one of the most challenging tasks in molecular simulation. Force field authors spend years of their lives deriving self-consistent force fields, and it is no small task to introduce a new species into this framework. Force field parameters for any new species must be derived and validated in a manner that is consistent with the original force field.

For the OPLS, AMBER, and CHARMM force fields, this derivation often takes the form of various quantum mechanical calculations. The primary literature for these force fields describes the required procedure. For GROMOS force fields, parameterization methodology is less clear, relying on empirical fitting of condensed-phase behavior. That is, some initial charges and Lennard-Jones parameters are calculated for each atom type, evaluated for their accuracy, and refined. While the end result is very satisfactory, i.e. fluids resemble their real-world counterparts, the derivation process can be laborious and frustrating.

For this reason, automated tools are greatly preferred. For each force field, there are methodologies or software programs that purport to give parameters compatible with various force fields. Not all of them are equally accurate. For a few examples, refer to the following table:

AMBER Antechamber Applies the GAFF force field to the molecule
ACPYPE A Python interface to Antechamber, writes GROMACS topologies

CHARMM CGenFF A generalized force field for CHARMM

GROMOS87/GROMOS96 PRODRG 2.5 An automated server for topology generation
ATB A newer server for topology generation

OPLS-AA Topolbuild Converts a Tripos .mol2 file into a topology
TopolGen A Perl script to convert an all-atom .pdb file to a topology[Note]

Note: I wrote TopolGen. It is very rudimentary, and its output should not be trusted at face value. The proper charge calculation methods for OPLS-AA are described very thoroughly in the literature. Thus, you should use TopolGen to create a skeleton topology, and do proper charge calculations yourself.

According to this table, it would seem that GROMOS parameterization is rather easy. Upload your coordinate file to PRODRG (or draw it using their JME editor), and presto! you have a topology. The unfortunate fact is that, while PRODRG is an extremely handy tool, the output topologies contain numerous inaccuracies that render the results of simulations that use them questionable, at best. Our group did a thorough analysis of these effects. If you are interested, you can find the article here.

For this tutorial, we will use PRODRG to generate a starting topology for our ligand, JZ4. Go to the PRODRG site and upload your JZ4.pdb file. The server presents you with several options for how to treat your ligand:

  • Chirality (yes/no): maintain chiral centers in the input molecule (yes), or reconstruct them (no). In our case, we have no chirality, so leave this option set to its default.
  • Charges (full/reduced): full charges are for condensed-phase systems (43A1 force field); reduced are for "vacuum" simulations (43B1 force field). Use full charges for nearly all applications. The accuracy of 43B1 is debatable, anyway.
  • EM (yes/no): perform energy minimization (yes), or leave coordinates alone (no). In our case, we want to construct our protein-ligand complex from existing coordinates, so we do not want PRODRG to minimize our molecule in vacuo. Choose "no" for this option.

We will make use of two files that PRODRG gives us. Save the output of the field "The GROMOS87/GROMACS coordinate file (polar/aromatic hydrogens)" into a text file called "jz4.gro" and "The GROMACS topology" into a file called "drg.itp."

Let's take a look at the [ atoms ] section of our JZ4 topology:

[ atoms ]
;   nr      type  resnr resid  atom  cgnr   charge     mass
     1       CH3     1  JZ4      C4     1    0.000  15.0350
     2       CH2     1  JZ4     C14     2    0.059  14.0270
     3       CH2     1  JZ4     C13     2    0.060  14.0270
     4         C     1  JZ4     C12     2   -0.041  12.0110
     5       CR1     1  JZ4     C11     2   -0.026  12.0110
     6        HC     1  JZ4     H11     2    0.006   1.0080
     7       CR1     1  JZ4      C7     2   -0.026  12.0110
     8        HC     1  JZ4      H7     2    0.006   1.0080
     9       CR1     1  JZ4      C8     2   -0.026  12.0110
    10        HC     1  JZ4      H8     2    0.007   1.0080
    11       CR1     1  JZ4      C9     2   -0.026  12.0110
    12        HC     1  JZ4      H9     2    0.007   1.0080
    13         C     1  JZ4     C10     3    0.137  12.0110
    14        OA     1  JZ4     OAB     3   -0.172  15.9994
    15         H     1  JZ4     HAB     3    0.035   1.0080

I immediately notice three things that are wrong with this topology:

  1. Charge group 2 encompasses the entire aromatic ring, nearly all of the atoms in this molecule.
  2. Charges of equivalent functional groups (C-H) are not uniform. In some cases, H atoms are +0.006e, and in other cases they are +0.007e.
  3. Charges on these atoms bear no resemblance whatsoever to the expected charges prescribed by the force field.

So what do we do? If you read the paper linked above, you would know what I'm going to recommend already. Assign charges and charge groups from equivalent functional groups in a building block-style manner. For any unknown groups, perform the charge calculations suggested in the paper. Since JZ4 involves only functional groups found in proteins (hydrophobic chain, aromatic ring, and an alcohol), life is pretty easy. We can re-assign charges and charge groups from these moieties. I will not go into proper topology validation in this tutorial; it would be far too time-consuming. Suffice it to say that you must satisfy reviewers that the parameters applied to your molecule are reliable. They should reproduce some well-known observable, such as logP or ΔGhydr, as in the GROMOS96 literature. If you cannot satisfy these requirements for your own system, you may want to reconsider your choice of force field.

I would construct a topology that has something like this for an [ atoms ] directive:

[ atoms ]
;   nr      type  resnr resid  atom  cgnr   charge     mass
     1       CH3     1  JZ4      C4     1    0.000  15.0350
     2       CH2     1  JZ4     C14     2    0.000  14.0270
     3       CH2     1  JZ4     C13     2    0.000  14.0270
     4         C     1  JZ4     C12     2    0.000  12.0110
     5       CR1     1  JZ4     C11     3   -0.100  12.0110
     6        HC     1  JZ4     H11     3    0.100   1.0080
     7       CR1     1  JZ4      C7     4   -0.100  12.0110
     8        HC     1  JZ4      H7     4    0.100   1.0080
     9       CR1     1  JZ4      C8     5   -0.100  12.0110
    10        HC     1  JZ4      H8     5    0.100   1.0080
    11       CR1     1  JZ4      C9     6   -0.100  12.0110
    12        HC     1  JZ4      H9     6    0.100   1.0080
    13         C     1  JZ4     C10     7    0.150  12.0110
    14        OA     1  JZ4     OAB     7   -0.548  15.9994
    15         H     1  JZ4     HAB     7    0.398   1.0080

Note that equivalent functional groups have equivalent charges, hydrophobic groups are completely uncharged, and the charge groups are of a sensible size (2-3 atoms). The other elements of the topology are sufficiently accurate. Bonded parameters assigned by PRODRG are generally reliable. We are now ready to construct the system topology and reconstruct our protein-ligand complex.

Build the Complex

From pdb2gmx, we have a file called "3HTB_processed.gro" that contains the processed, force field-compliant structure of our protein. We also have "jz4.gro" from PRODRG that has included all of the necessary H atoms. Copy 3HTB_processed.gro to a new file to be manipulated, for instance, call it "complex.gro," as the addition of JZ4 to the protein will generate our protein-ligand complex. Next, simply copy the coordinate section of jz4.gro and paste it into complex.gro, below the last line of the protein atoms, and before the box vectors, like so:

  163ASN      C 1691   0.621  -0.740  -0.126
  163ASN     O1 1692   0.624  -0.616  -0.140
  163ASN     O2 1693   0.683  -0.703  -0.011
   5.99500   5.19182   9.66100   0.00000   0.00000  -2.99750   0.00000   0.00000   0.00000

becomes (added text in bold green)...

  163ASN      C 1691   0.621  -0.740  -0.126
  163ASN     O1 1692   0.624  -0.616  -0.140
  163ASN     O2 1693   0.683  -0.703  -0.011
    1JZ4  C4       1   2.429  -2.412  -0.007
    1JZ4  C14      2   2.392  -2.470  -0.139
    1JZ4  C13      3   2.246  -2.441  -0.181
    1JZ4  C12      4   2.229  -2.519  -0.308
    1JZ4  C11      5   2.169  -2.646  -0.295
    1JZ4  H11      6   2.135  -2.683  -0.199
    1JZ4  C7       7   2.155  -2.721  -0.411
    1JZ4  H7       8   2.104  -2.817  -0.407
    1JZ4  C8       9   2.207  -2.675  -0.533
    1JZ4  H8      10   2.199  -2.738  -0.621
    1JZ4  C9      11   2.267  -2.551  -0.545
    1JZ4  H9      12   2.306  -2.516  -0.640
    1JZ4  C10     13   2.277  -2.473  -0.430
    1JZ4  OAB     14   2.341  -2.354  -0.434
    1JZ4  HAB     15   2.369  -2.334  -0.528
   5.99500   5.19182   9.66100   0.00000   0.00000  -2.99750   0.00000   0.00000   0.00000

Since we have added 15 more atoms into the .gro file, increment the second line of complex.gro to reflect this change. There should be 1708 atoms in the coordinate file now.

Build the Topology

Including the parameters for the JZ4 ligand in the system topology is very easy. Just insert a line that says #include "drg.itp" into after the position restraint file is included. The inclusion of position restraints indicates the end of the "Protein" moleculetype section.

; Include Position restraint file
#ifdef POSRES
#include "posre.itp"

; Include water topology
#include "gromos43a1.ff/spc.itp"


; Include Position restraint file
#ifdef POSRES
#include "posre.itp"

; Include ligand topology
#include "drg.itp"

; Include water topology
#include "gromos43a1.ff/spc.itp"

The last adjustment to be made is in the [ molecules ] directive. To account for the fact that there is a new molecule in complex.gro, we have to add it here, like so:

[ molecules ]
; Compound        #mols
Protein_chain_A     1
JZ4                 1

The topology and coordinate file are now in agreement with respect to the contents of the system.

Back: pdb2gmx Next: Solvation

Bevan Lab Homepage

Virginia Tech Homepage

Virginia Tech Biochemistry

Site design and content copyright 2008-2015 by Justin Lemkul
Problems with the site? Send them to the Webmaster