| Consensus |

1. Invoke Insight II and Homology
| Type biosym_tutorial at the UNIX prompt. |
Wait a few moments while Insight II loads.
| Select Homology from the Module pulldown by picking the MSI icon with the mouse. |
Even if you have not purchased the Homology product, a subset of its commands is available to Consensus.
| Select the Restore_Folder command from the File pulldown. Pick consensus_lesson1.psv from the value-aid. Leave all the other parameters at their default values. Select Execute. |
Three proteins, chymotrypsin (CHA), trypsinogen (TGN), and elastase (EST), are read in along with their corresponding sequences. Each protein is colored magenta, except for the structurally conserved regions (SCRs), which are colored yellow.
| Select Get from the Sequences pulldown. Select pka.seq from the value-aid. The Get Sequence Name parameter is automatically set to PKA. Select Execute. |
The sequence of porcine kallikrein (PKA) appears in the sequence display at the bottom of the screen. The one-letter sequence codes are in lowercase to indicate that no coordinates have been assigned to the residues of the protein.
(This step assumes that you have the Homology module as well as Consensus.)
| Select the Alignment/Pairwise_Sequence command. Choose Automatic as the Seq Align Mode, then Identity as the Scoring Matrix. From the value-aid, select PKA and TGN. Select Execute. |
(Ignore the error message stating that the summary boxes are not correct. Simply select Done in the message window.)
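As a rough illustration of what an automatic pairwise alignment with an Identity scoring matrix involves, the sketch below runs a textbook Needleman-Wunsch global alignment in which a match scores 1 and a mismatch scores 0. It is not the algorithm Insight II uses internally. The function names, the gap penalty of -1, and the toy lowercase fragments are all assumptions made for the example.

```python
def identity_score(a: str, b: str) -> int:
    """Identity scoring: 1 for an exact residue match, 0 otherwise."""
    return 1 if a == b else 0


def global_align(seq1: str, seq2: str, gap: int = -1):
    """Needleman-Wunsch global alignment under identity scoring.

    Returns (aligned_seq1, aligned_seq2, score); '-' marks a gap.
    The linear gap penalty is an assumption of this sketch.
    """
    n, m = len(seq1), len(seq2)
    # score[i][j] is the best score for aligning seq1[:i] with seq2[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + identity_score(seq1[i - 1], seq2[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + identity_score(seq1[i - 1], seq2[j - 1])):
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return "".join(reversed(out1)), "".join(reversed(out2)), score[n][m]


# Toy fragments (hypothetical, not the real PKA or TGN sequences).
aligned_a, aligned_b, total = global_align("ivggyt", "ivggt")
```

With identity scoring, only exact matches contribute to the score, so the alignment is driven entirely by identical residues. That suits closely related family members, and it is also why manual gap adjustments like those in the following steps can still be needed.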
| In the sequence window, locate residue EST:ASP_97. Using the right mouse button, click and drag the residue to the right to insert a gap of two residues for all three reference proteins. |
| Similarly, insert a gap of two residues by dragging residue PKA:THR_112 two positions to the right. |
This is now the final sequence alignment used to build a structure for PKA.
Most often, all proteins of a family possess the same SCRs in exactly the same locations along the peptide chain. After all, that is the very essence of what is meant by a conserved region. Where the reference proteins show this, it is best to relate the residues of the model protein to those of all the reference proteins simultaneously. The distance restraints that are calculated then reflect the overall trend for the family as a whole, rather than a single reference protein or a small subset of them, which makes the predicted conformation of the model protein less biased and more reliable.
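To make the idea of a family-wide restraint concrete, the sketch below derives a lower and upper distance bound for one pair of aligned positions by measuring the same pair in every reference protein. This is only an illustration of the principle. The min/max rule, the `padding` parameter, the function name, and the toy coordinates are assumptions, not the procedure Consensus actually applies.

```python
import math


def family_bounds(ref_coords, pos_i, pos_j, padding=0.0):
    """Return (lower, upper) distance bounds for one pair of aligned positions.

    ref_coords maps a reference protein name to a list of (x, y, z)
    coordinates, one per aligned position. The bounds span the distances
    seen across all reference proteins, optionally widened by `padding`.
    """
    dists = [math.dist(coords[pos_i], coords[pos_j])
             for coords in ref_coords.values()]
    lower = max(0.0, min(dists) - padding)
    upper = max(dists) + padding
    return lower, upper


# Toy coordinates (hypothetical, chosen so the distances are 5, 6, and 4).
refs = {
    "CHA": [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    "TGN": [(0.0, 0.0, 0.0), (0.0, 0.0, 6.0)],
    "EST": [(1.0, 1.0, 1.0), (1.0, 1.0, 5.0)],
}
lower, upper = family_bounds(refs, 0, 1)
```

Using all three reference proteins widens the allowed range to cover the variation seen across the family. With a single reference, the lower and upper bounds would collapse onto that one protein's geometry.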
Now, check to see what happened.
| Select List as the Activation mode, and select Execute again. |
The textport pops forward, and a list of associated residues appears, sorted by the residues of the model protein. Note that each model residue has three reference protein residues. That is because a summary box, that is, an SCR that spans the entire family of reference proteins, contains the same number of reference proteins at every position.
| Select the Initialize Boxes command. With the mouse, pick the first residue in both PKA and EST. PKA:1 and EST:16 appear in the parameters. Select Execute. |
A green (active) sequence box appears enclosing the two residues.
| Select List as the Activation parameter and select Execute. |
Again, the textport pops forward, and the list of associated residues is displayed. Note that for model residues PKA:1-20, only a single reference residue (from EST) is given.
| Set the Activate parameter back to Add and the Box Type parameter back to Summary. Turn On the All_Summary_Boxes parameter and select Execute. |
The SCRs will be added one at a time in sequence from N- to C-terminus.
| Set the Activation parameter to List and select Execute again. |
Now all the model residues have three associated reference residues except for PKA:1 and PKA:2, since these were the only two residues that extended beyond the first summary box.
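The bookkeeping behind these lists can be pictured as a mapping from each model residue to the reference proteins whose active boxes cover it. The sketch below is a hypothetical representation for illustration only. The column numbers, the box tuples, and the function name are assumptions and do not reflect how Insight II stores boxes.

```python
from collections import defaultdict


def associate(model_columns, boxes):
    """Map each model residue to the reference proteins associated with it.

    model_columns: dict of model residue label -> alignment column.
    boxes: list of (first_column, last_column, [reference protein names]).
    A residue is associated with every reference in every box covering it.
    """
    assoc = defaultdict(set)
    for residue, column in model_columns.items():
        for first, last, refs in boxes:
            if first <= column <= last:
                assoc[residue].update(refs)
    return {residue: sorted(refs) for residue, refs in assoc.items()}


# Toy layout (hypothetical): an EST-only box over columns 0-3 and a
# summary box over columns 2-3 that covers all three reference proteins.
model_columns = {"PKA:1": 0, "PKA:2": 1, "PKA:3": 2, "PKA:4": 3}
boxes = [
    (0, 3, ["EST"]),
    (2, 3, ["CHA", "TGN", "EST"]),
]
associations = associate(model_columns, boxes)
```

In this toy layout, PKA:1 and PKA:2 fall outside the summary box and keep a single reference (EST), while PKA:3 and PKA:4 pick up all three, mirroring the pattern described above.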
| Again, select the DGII_Setup command. Select Parameters as the Setup Operation. Leave all the parameters at their default values and select Execute, then select Cancel. |
9. Make the DGII calculation a batch background job
10. Create the DGII input files
| Select the DGII_Params command from the Consensus pulldown. Turn On the three major boolean parameters: Smooth, Embed, and Optimize. Accept the default values for all the other parameters. Select Execute. |
The DGII_Run command is automatically activated.
This step of the calculation takes considerable time, on the order of one hour on an Indigo R3000 workstation. At the end of that time, several DGII database files are output. These include PKA_DGII.car and PKA_DGII.mdf, which contain the coordinates of a hypothetical protein with the same sequence as the model PKA protein. This protein was built automatically to establish covalent restraints for the model protein. (The hypothetical protein and the final results contain hydrogen atoms or not, depending on whether the Include_Hydrogens parameter was turned On or Off.) Also generated is the PKA_DGII.geom file, a binary file containing the restraints for all the interatomic distances and angles obtained from the analysis of the aligned reference proteins. The file bkgd_job_pka_dgii0.csh is the shell script that controls the execution of the various images that comprise a complete DGII calculation.
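Before submitting the background job, it can be worth confirming that the setup files named above were actually written. The following is a minimal sketch of such a check. The helper name `missing_files` is hypothetical, and the list holds only the files mentioned in this lesson, not necessarily every file DGII produces.

```python
from pathlib import Path

# Files this lesson says are present after setup (not an exhaustive list).
EXPECTED_FILES = [
    "PKA_DGII.car",
    "PKA_DGII.mdf",
    "PKA_DGII.geom",
    "bkgd_job_pka_dgii0.csh",
]


def missing_files(directory, names=EXPECTED_FILES):
    """Return the expected file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]
```

Calling `missing_files(".")` in the working directory returns an empty list when all four files are present.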
| Select Quit from the Session pulldown and select Execute. |
12. Submit the DGII background job
| After the DGII files are set up, at the UNIX prompt type:
> bkgd_job_pka_dgii0.csh & |
The programs begin to execute. The time required is approximately 4.5 hours per requested structure on an Indigo R4000 workstation, or roughly overnight. The results are written to the file PKA_DGII.arc, which contains three proposed conformations for kallikrein, each consistent with the structural information obtained from the three reference proteins. Because this job was run with the Include_Hydrogens parameter set Off, the molecules in PKA_DGII.arc have no hydrogen atoms. They must be added back using the Hydrogens command in the Modify pulldown of the Biopolymer module before any further energy refinement is done using Discover.
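The "roughly overnight" figure follows from simple arithmetic on the per-structure time. The sketch below assumes the run time scales linearly with the number of requested structures, which the text implies but does not state outright. The function name is hypothetical.

```python
def estimated_hours(n_structures: int, hours_per_structure: float = 4.5) -> float:
    """Estimate total DGII run time, assuming linear scaling per structure."""
    if n_structures < 0:
        raise ValueError("n_structures must be non-negative")
    return n_structures * hours_per_structure


# Three requested structures at about 4.5 hours each.
total_hours = estimated_hours(3)
```

For the three conformations requested in this lesson, the estimate comes to 13.5 hours on the quoted hardware.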