9. Judging the Quality of Models

Revised 8/21/98

Most of the macromolecular models in the PDB are derived from analysis of x-ray diffraction by crystals (a method called x-ray crystallography), or by nuclear magnetic resonance (NMR) spectroscopy. The few others are theoretical models. All of these studies produce more than merely the lists of coordinates that SPdbV uses to produce graphics models. Among other information, these studies yield evidence about the amount of disorder or mobility in various regions of the molecule, and they yield statistics about the precision of the atom positions in the model. If you plan to use a PDB model to help you interpret the results of chemical, kinetic, thermodynamic, or other kinds of studies on a molecule of interest to you, you need to use wisely all the information that accompanies a new molecular structure, and you need to be sure that you are working with a high-quality model.

In the two parts of this section, you will learn to use some features of SPdbV that help with this task, including, for crystallographic models, coloring the model by B-factors and examining the quality of the electron-density map that was used to produce your model; and for NMR models, examining the extent to which multiple models agree with each other.

9a. Judging the Quality of Crystallographic Models

First, read this Technical Note: How Structures Are Determined From Diffracted X-Rays.

For this section, you will need two new files. The first, 1HEL.pdb, contains the coordinates of lysozyme without tri-NAG. The second, 1HEL.dn6, is an electron-density map, the molecular image obtained from x-ray crystallography. To obtain these files, click the appropriate one of these links:

You will receive a folder named HEL. Inside the folder, you will find the files described above.

Start SPdbV, but click Cancel on the dialog box for loading files. Even though you did not load a file, you still see the SPdbV menus at the top of the screen.

Prefs: Loading Protein...
On the resulting dialog box, uncheck the last box, Ignore Solvent.... This action tells SPdbV not to ignore, but to include, solvent (usually water) molecules when it loads a file. Not all PDB files contain solvent molecules, but as described in the technical note above, the better models do.
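
If you would like to see what "ignoring solvent" amounts to at the file level, here is a minimal Python sketch. It is not SPdbV's own code. It assumes that waters are stored as records with the residue name HOH in columns 18-20, which is the usual PDB convention; the function name split_solvent is made up for this illustration.

```python
def split_solvent(pdb_lines):
    """Return (non_water_lines, water_lines) from an iterable of PDB lines."""
    kept, waters = [], []
    for line in pdb_lines:
        is_atom_record = line.startswith(("ATOM", "HETATM"))
        # Residue name occupies columns 18-20 (0-based slice 17:20).
        if is_atom_record and line[17:20] == "HOH":
            waters.append(line)
        else:
            kept.append(line)
    return kept, waters
```

A file with no HOH records simply returns an empty water list, which is one quick way to tell whether a model includes ordered solvent.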

Now load the pdb file 1HEL.pdb. Select, display, and center the full model.

Notice the red crosses all around the protein. Each cross shows the location of a water molecule.

Display: Layer Infos
Click to remove the checkmark under HOH. Click again to replace it. This is a quick way to turn on and off the display of solvent molecules. For now, leave the water molecules off.

Color: B Factor
You should see the model in colors ranging from blue to orange, as you did when you colored by accessibility. This time, colors are based on numbers called temperature factors or B factors found in the PDB file. These numbers result from crystallographic structure determination. They tell, for each atom in the model, how well its position is determined. Positions of atoms colored dark blue are the most certain, while those in red are least certain. Atom positions can be uncertain because of disorder in the crystal from which the structure was determined. In a high-quality model, the B factors reflect the mobility or flexibility of various parts of the molecule. Red residues are the most mobile or "hot," while blue are the most immobile or "cold."

If you color backbone+side by B factor, mainchain color is determined by the highest B factor of mainchain N, CA, or C, while sidechain color is determined by the highest B factor in each sidechain. As you can see, B factors are lower for main-chain atoms than for side-chain atoms, especially those on the surface. Identify some of the side chains with high B-factors. Are most of these "hot" residues buried or on the surface? Use a slab with a depth of 8 angstroms to help you answer this question.
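
The coloring rule just described (highest B factor among N, CA, and C for the mainchain; highest B factor in the side chain) can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not SPdbV's code. Treating the carbonyl O and terminal OXT as belonging to neither group is an assumption, and the function name is hypothetical.

```python
# Atoms the tutorial names as setting the mainchain color.
MAINCHAIN_FOR_COLOR = {"N", "CA", "C"}
# Assumption: O and OXT are backbone atoms and are not counted as side chain.
BACKBONE = MAINCHAIN_FOR_COLOR | {"O", "OXT"}

def residue_color_values(atoms):
    """atoms: list of (atom_name, b_factor) pairs for one residue.

    Returns (mainchain_b, sidechain_b). A value is None when the residue
    has no atoms in that group (for example, glycine has no side chain).
    """
    main = [b for name, b in atoms if name in MAINCHAIN_FOR_COLOR]
    side = [b for name, b in atoms if name not in BACKBONE]
    return (max(main) if main else None, max(side) if side else None)
```

Because each group takes the maximum, a single poorly determined atom at the tip of a side chain is enough to color the whole side chain "hot."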

Use the label button (marked "LEU 41?") to identify a residue with high B-factors. Remember the residue number. Now click the file button (the dog-eared page symbol below the screen attributes button -- first button in upper left) to view the PDB file. Scroll to the residue number you selected (residue numbers are in the fifth column, following the residue name). Look at the B-factors for atoms in the residue -- they are in the last column before "1HEL". You will most likely see lower B-factors for mainchain atoms than for sidechain atoms. Compare the B-factors for atoms in the side chain you chose with those of nearby residues. Notice the full range of values. The lowest values correspond to the least mobile or best determined parts of the molecule. Values above 60 may signify disorder or errors in the model, or in NMR models, large discrepancies among the various models that fit the NMR data.
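
If you prefer to scan the B-factors with a script instead of scrolling, the following Python sketch pulls them out of the ATOM lines. It relies on the fixed-width layout of standard PDB files (atom name in columns 13-16, residue name in 18-20, residue number in 23-26, B-factor in 61-66) rather than on counting whitespace-separated columns. The function names are hypothetical, and the default threshold of 60 is simply the rule of thumb mentioned above.

```python
def read_b_factors(pdb_lines):
    """Yield (residue_name, residue_number, atom_name, b_factor) per ATOM record."""
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip header, HETATM, and other records
        yield (
            line[17:20].strip(),   # residue name, columns 18-20
            int(line[22:26]),      # residue number, columns 23-26
            line[12:16].strip(),   # atom name, columns 13-16
            float(line[60:66]),    # temperature (B) factor, columns 61-66
        )

def atoms_above(pdb_lines, threshold=60.0):
    """List the ATOM records whose B-factor exceeds the threshold."""
    return [rec for rec in read_b_factors(pdb_lines) if rec[3] > threshold]
```

For example, calling atoms_above(open("1HEL.pdb")) would list the atoms to inspect first, assuming the file is in the current folder.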

Close the file window. Remove any labels by choosing Display: Labels: Clear User Labels. Turn off slabbing if you have been using it. Color the full model in CPK colors.

File: Open DN6 map...
In the dialog, find the file 1HEL.dn6, select it, and click Open. A large dialog appears. Click to darken the button labeled Display Around CA. You should find the number 7.5 in all three boxes adjacent to this button. If not, use tab to place the cursor in these boxes and replace the value shown with 7.5. You are limiting the display of the electron density map to within 7.5 angstroms of the CA atom in the currently centered group. Then click OK. There will be a delay while SPdbV loads this large file. When the graphics window is active, press help or = to center the view. You will see a cloudy, deep red image on part of the model. This is the electron-density map (EDM).

In the Control Panel, find PHE38. Option-click (right mouse button on PC) on its name. This action centers CA of this residue in the display, and displays a section of the map centered on this CA atom. Zoom in to get a close view of the map and model (stereo recommended). The map is the image of the molecule that was obtained from x-ray crystallography. The model you are viewing was actually built to conform to this map. Because electrons, but not atomic nuclei, diffract x-rays, the x-ray image is that of the electron clouds around the atoms. The map is drawn (or contoured) to define the surface at which the electron density has a constant value, the same way that atomic orbital diagrams in your general chemistry textbook are drawn to show the volume within which the probability of finding the electron is constant, say, 90%.

Use an 8-angstrom slab (Display: Slab) to simplify the view, and zoom in close. Rotate the model to see the PHE ring edgewise. This will show you how the model conforms to the map. In this region, the map is well-defined. Select the entire model and color it by B-factor. The model colors in this region are blue, signifying small B-factors. Note that SPdbV turns off the map to allow faster movement. You can change this by choosing Prefs: Real Time Display... but rotations with the map displayed will be choppy unless you have a very fast computer.

Press the cursor-left key (labeled <--). This action moves the center of the display, including the map, down one residue to ASN37. (You can use cursor-left and cursor-right to move through the model one residue at a time, as crystallographers would if they were systematically checking and rebuilding the model to fit the map better.) The side chain of ASN37 has a higher B-factor (colored yellow-green). Notice that part of the model extends out of the map. In other words, the map does not clearly define the positions of atoms with higher B-factors.

Remember that a crystallographic model shows the average molecule in a crystal. If, for example, an ASN side chain is in one conformation in some of the lysozyme molecules in the crystal, and in another conformation in other molecules, the observed ASN electron density will be weakened. When there are two well-defined conformations, the map may show both conformations as low density. Some PDB files will contain coordinates for both conformations.

Press the cursor-down key once and watch the map, especially at the tip of the ASN37 side chain, where the model extends beyond the map. Press again and again until the map encloses the tip of the chain. With each press of the key, you are contouring the map at a lower value of electron density; that is, you are moving the map surface to show lower values of electron density. Where the molecule is disordered, the electron density appears to be lower.
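
The idea of lowering the contour level can be sketched in Python. An atom is "inside" the map surface when the electron density at its position is at or above the contour level. The code below is a toy model under stated assumptions: the density values are supplied by you (SPdbV does not export them this way), each key press is assumed to lower the level by a fixed step, and the function names are hypothetical.

```python
def enclosed(density_at_atoms, level):
    """Names of atoms whose local density is at or above the contour level."""
    return [name for name, rho in density_at_atoms.items() if rho >= level]

def level_that_encloses(density_at_atoms, atom, start, step):
    """Lower the level from `start` by `step` until `atom` is enclosed.

    Returns (level, presses), where presses counts the lowering steps taken.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    level, presses = start, 0
    while density_at_atoms[atom] < level:
        level -= step
        presses += 1
    return level, presses
```

Atoms in disordered regions have low local density, so they need more steps before the surface reaches them, which is what you see at the tip of the ASN37 side chain.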

The active-site residues of lysozyme are GLU35 and ASP52. Are these important residues well-defined by this map? This would be an important question if you were planning to base an interpretation of lysozyme's action upon the precise positions of atoms in this region of the model.

Center the display on GLU35. Notice small balls of electron density floating in the vicinity of the GLU side chain. Use the Layer Infos window to turn on display of water molecules (HOH column). Each red cross appears at the proposed location of the oxygen atom of a water molecule. The presence of water is inferred from the balls of density. Their appearance in the electron-density map implies that these waters must be immobilized on the protein surface in most of the molecules in the crystal.

Now compute H-bonds and use the Layer Infos window to turn on the display of hydrogen-bond distances (Hdst column). You can see that the water molecules in this area are within reasonable hydrogen-bonding distances of atoms on the protein surface, or of other waters. If you see any waters that are not surrounded by electron density, lower the contour level of the map by pressing the cursor-down key. Keep pressing it until electron density appears. A water molecule that exhibits weak density may be present at the indicated location in only a small percentage of the lysozyme molecules in the crystal.
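
The distance check behind "reasonable hydrogen-bonding distances" can be sketched in Python. This is a simplified illustration, not the criterion SPdbV applies: it looks only at distance, it expects you to pass in just the polar (N and O) atoms, and the 3.5 angstrom cutoff is an assumed, commonly quoted upper limit. The function name is hypothetical.

```python
import math

def close_partners(water_xyz, polar_atoms, max_dist=3.5):
    """Find polar atoms near a water oxygen.

    water_xyz: (x, y, z) of the water oxygen, in angstroms.
    polar_atoms: list of (label, (x, y, z)) for candidate N and O atoms.
    Returns [(label, distance)] for atoms within max_dist, nearest first.
    """
    hits = []
    for label, xyz in polar_atoms:
        d = math.dist(water_xyz, xyz)
        if d <= max_dist:
            hits.append((label, round(d, 2)))
    return sorted(hits, key=lambda hit: hit[1])
```

A water with no partners inside the cutoff and only weak density is a candidate for the low-occupancy case described above.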

Lysozyme hydrolyzes its substrate (cleaves it with water). GLU35 is proposed to participate in this cleavage. Perhaps one of the waters in this region of the model occupies the site of the water molecule involved in catalysis.

Finally, examine the PDB file for 1HEL by clicking on the document icon just above the graphics display. In the header (comments before the ATOM lines), find the R-factor for this model, and the average deviation (RMSD) of bond lengths and angles from expected values.
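
To make the R-factor less abstract: it is conventionally defined as the sum of the absolute differences between observed and calculated structure-factor amplitudes, divided by the sum of the observed amplitudes. The Python sketch below computes that ratio. It assumes the calculated amplitudes have already been placed on the same scale as the observed ones; the function name is hypothetical, and the amplitudes themselves are not stored in the PDB coordinate file.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over matched reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must list the same reflections")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator
```

A model that reproduced the measured amplitudes perfectly would give R = 0, so lower values in the header indicate better agreement between model and data.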

In one of the advanced tutorials (Section 12), you can explore EDMs further.

9b. Judging the Quality of NMR Models

First, read this Technical Note: How Structures Are Determined By NMR Spectroscopy.

Click these links to obtain two files, 1BBN, and a shortened version of 1BCN. The full 1BCN file contains an ensemble of 30 NMR models of interleukin 4, a small protein involved in regulating immune and inflammatory responses. All 30 models fit a set of NMR-derived constraints that were used in structure determination. The excerpt you are downloading contains only 5 of the models. File 1BBN contains one model of interleukin 4, derived by averaging the 30 models in the ensemble file, and then energy-minimizing the result.

File: Open PDB File...
Find and open 1BCN. SPdbV will tell you that this file contains more than one model, and ask you to specify how many you want to load. Specify 5 and click OK. The result resembles a porcupine. You are looking at 5 similar models, all superimposed on each other. SPdbV loads the models into separate layers.
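
What SPdbV does when it loads each model into its own layer can be approximated with a short Python sketch. In standard PDB files, each member of an NMR ensemble is bracketed by MODEL and ENDMDL records. The function below is an illustration under that assumption, not SPdbV's loader, and its name and max_models parameter are hypothetical.

```python
def split_models(pdb_lines, max_models=None):
    """Group ATOM/HETATM lines by model.

    Returns a list with one list of lines per MODEL ... ENDMDL block.
    Files without MODEL records yield an empty list.
    """
    models, current = [], None
    for line in pdb_lines:
        record = line[:6].strip()   # record name occupies columns 1-6
        if record == "MODEL":
            if max_models is not None and len(models) >= max_models:
                break               # enough models collected
            current = []
        elif record == "ENDMDL":
            if current is not None:
                models.append(current)
            current = None
        elif record in ("ATOM", "HETATM") and current is not None:
            current.append(line)
    return models
```

Passing max_models=5 mirrors the choice you made in the dialog box: later models in the file are simply not read.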

Use the Layer Infos window to turn off H in all layers (shift click a check mark in the H column), and observe the result. Unlike most crystallographic models, NMR models contain hydrogen atoms. (The resolution of most protein electron-density maps is too low to reveal hydrogen atoms.)

Use Layer Infos to display all models as alpha-carbon traces (CA column). Then use the Control Panel to hide all side chains (as of this writing, there's no one-step way to do this for all layers at once).

Color: Layer
SPdbV assigns a different color to each layer, making it easy to see the similarities and differences between the models. Rotate the model to get familiar with it. Interleukin 4 is an unusual four-helix bundle with two long loops that connect consecutive helices at opposite ends.

In the helical areas, the five models are very similar. In the loops, there are more differences, while in the chain termini, the models vary dramatically. This means that very different conformations can fit the NMR-derived constraints for the termini, while only similar conformations fit the constraints in the helical regions.

We might say that the structure is well determined in the helical regions, poorly determined at the termini. Or we might say that the termini are very flexible, moving rapidly from one conformation to another. As a result, many atoms show weak or no spectral signs of nearness to other atoms, and NMR spectra provide fewer constraints than needed to define a single conformation. Roughly speaking, the termini are analogous to "hot" regions in a crystallographic model, and the helices are analogous to "cold" regions.

Now load the file 1BBN. SPdbV will automatically place this new model in the same orientation as the 1BCN models. Use Layer Infos to display 1BBN only. Remove H and side chains, and reduce the mainchain to CA only.

Prefs: General...
On the General Preferences dialog, add a checkmark beside Scale B-factor colors so min = dark blue and max = red. This provides for the maximum contrast in B-factor colors.
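
The effect of this preference can be pictured as a linear rescaling: the smallest value in the model is pinned to blue, the largest to red, and everything else falls in between. The Python sketch below shows one way to do that with the standard colorsys module. The particular ramp (a sweep through HSV hue) is an assumption chosen for illustration; SPdbV's actual color ramp may differ. The function name is hypothetical.

```python
import colorsys

def b_factor_color(b, b_min, b_max):
    """Map b linearly onto a blue (b_min) to red (b_max) ramp.

    Returns (red, green, blue) as floats between 0 and 1.
    """
    if b_max == b_min:
        t = 0.0                       # no spread: color everything blue
    else:
        t = (b - b_min) / (b_max - b_min)
        t = min(1.0, max(0.0, t))     # clamp values outside the range
    hue = (1.0 - t) * (2.0 / 3.0)     # HSV hue 2/3 is blue, 0 is red
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)
```

Without this rescaling, a model whose values span a narrow range would appear in nearly one color, which is why the option gives maximum contrast.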

Color: B-Factor
As you know, PDB files of averaged NMR models contain no B-factors. For a given atom, the number in the B-factor column gives the average (actually, RMS) distance from that atom in the averaged model to that same atom in all the other models. Thus this display shows you, on the averaged model, how well or how poorly all the ensemble models agree.
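
The RMS distance described here is easy to compute for one atom, as the Python sketch below shows. It assumes the ensemble models have already been superimposed on the averaged model, and it is an illustration of the definition rather than the procedure the depositors used. The function name is hypothetical.

```python
import math

def rms_to_average(avg_xyz, ensemble_xyz):
    """RMS distance from one atom in the averaged model to its ensemble copies.

    avg_xyz: (x, y, z) of the atom in the averaged model.
    ensemble_xyz: list of (x, y, z) for the same atom in each ensemble model.
    """
    squared = [math.dist(avg_xyz, p) ** 2 for p in ensemble_xyz]
    return math.sqrt(sum(squared) / len(squared))
```

Atoms in the helices, where the ensemble models nearly coincide, give small values; atoms at the chain termini give large ones.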

If you want to base an interpretation of the action of interleukin 4 upon the precise positions of atoms in this model, you can do so with more confidence if the region of interest to you is in the blue regions of this model. But if you know that the chain termini are involved in an interaction between interleukin 4 and another molecule, this model would give you very little information about that interaction.

Now add side chains to the display of 1BBN. Your previous actions have colored the side chains by the same criteria as the main chain. Notice that, in general, side chains vary more among the ensemble models (remember that you are looking at a property of the ensemble displayed on the averaged model), and that side chains in the interior vary less than those on the surface. These observations fit with what you saw in crystal structures: interior residues appear more ordered than do surface residues.

Finally, make sure 1BBN is the active layer, and examine the PDB file by clicking on the document icon just above the graphics display. Read lines 50-57 of the file to learn the number and types of constraints from which this model was derived.

Take time to PLAY with the tools introduced in this section.


Next Section: 10. Working With Oligomeric Proteins
