CDiscover



4       Standalone CDiscover


Input files required

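A CDiscover run can be set up in a standalone manner, completely outside the Cerius2 or Insight environment. To do this, you need three files: a command input file (run_name.inp) and two model description files (run_name.car and run_name.mdf).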

It is usually easiest to create the model description files with the Cerius2 or Insight interface. If you write out the command input files from the C2·Discover or Insight·Discover_3 module, the graphical interface program also writes out an input file reflecting the calculation set up via the menus of the module. This input file can serve as a convenient starting point, which you can modify with a text editor.


Overview of BTCL commands

Calculation commands

The primary BTCL commands are calculation commands: energy, minimize, and dynamics. As the names suggest, energy performs a single energy and energy derivative evaluation, minimize modifies atom coordinates to minimize energy, and dynamics performs a simulation of molecular dynamics.

Commands for calculation setup and analysis

Calculation setup is accomplished through a number of auxiliary commands for input (begin, readFile, reset), output (output, print, writeFile), and control (atomMovability, energyContribution, forcefield, restraint, pseudoAtom). Analysis commands (diffraction and vibrationalAnalysis) allow you to perform additional analysis of output results. The peek command can be used to monitor the progress of minimization and dynamics calculations.

Input commands

The BTCL commands begin, readFile, and reset all read information into the Discover program from external files. The begin command is generally placed at the start of a run_name.inp file to read in the molecular data files and a forcefield file. The readFile command is used to read in a data file created by a previous CDiscover, Cerius2, or Insight run (e.g., a coordinate snapshot from an archive file). It also allows you to work with multiple systems. The reset command is used to reread the molecular data files and forcefield. All information associated with the current molecular system (e.g., coordinates, energy expression) is lost with this command. However, the settings of command defaults (e.g., for dynamics) are unchanged.

Output commands

The output command specifies the amount of output desired. The print command (which must be part of a minimize or dynamics command) generates output of system properties to various file types. The writeFile command is used to write a data file that can be read by subsequent CDiscover, Cerius2, or Insight runs.

Control commands

The atomMovability command allows you to fix the position of an atom or disregard its contribution to total energy throughout a calculation. The energyContribution command allows you to specify energy contributions of your own design. The forcefield command allows you to select a forcefield and tailor the contributions of built-in energy terms. The restraint command allows you to impose distance, torsion, or angle restraints of various forms on selected atoms. The pseudoAtom command allows you to create collections of atoms which are to be treated as individual atoms.

Exiting a BTCL session

If you want to end a BTCL session without starting a CDiscover run, enter exit, quit, or <Ctrl> d at the BTCL prompt.

Defining and using your own procedures

One of the most useful features of BTCL is that you can quickly define and use your own procedures (Btcl procedures). Procedures defined in files named filename and located in the current directory of the Discover run are automatically accessible. Procedures defined in any other files may be used by including:


source filename 


in the run_name.inp file. Here, filename is the full path to the procedure-definition file.

Performance and convenience

Interpretive script languages are convenient for user applications. It is generally true, however, that they do not provide high performance. To address this issue, BTCL provides low-level commands, geometry and vector, that support high-performance, low-level geometric and mathematical manipulations. These commands operate on arbitrary lists of scalars or coordinates with performance comparable to that of compiled code. This allows you to construct your own analysis and energy terms for incorporation into the Discover program. Vector operations currently employ a function-call interface.

If you would like to avoid dealing with geometric and mathematical details, a higher-level command (molGeom) allows you to work directly with models, monomers, atoms, and their relationships. molGeom employs some of the same performance enhancement techniques as geometry and vector. There is a bit more overhead involved in the use of molGeom, but its performance should be fine for most purposes. You will probably want to work first with molGeom, and employ the lower-level commands only as necessary. The molGeom command can be used to get and set geometric properties of models, such as the coordinates of atoms, bond lengths, bond angles, etc.

Databases

Information that is read into the Discover program is organized into databases for subsequent use. Each distinct molecular system resides in a separate database with the same name as the molecular system. You can make your own databases and perform various operations on them (or on the system databases) with the database command and database handle operations (see $dbHandle). The database command supports database creation, deletion, and other high-level operations. (Use of the database command to create or delete system databases is not recommended.) The database handle operations are an extension of the database command. They are used to access the data within a particular database.

In addition to standard database search operations provided by the database handle operations, there are special data acquisition operations supported by the following commands:

The subset command is used to create or access subsets in a database. A subset is a particular type of database table which is designed to be compatible with the mechanism of the select command as well as standard database operations.

Please see Databases and Tables for additional information on databases.

Objects

The object command is used to add information to or extract information from a BTCL object. A BTCL object is, in essence, a list of entries of arbitrary type. BTCL objects (of differing types) can be created with various commands, including object, vector, geometry, molGeom, database, database handle operations ($dbHandle), select, and subStructure. A BTCL object is deleted when the BTCL variable used to store its identity (or handle) is unset. As noted previously, objects are used to avoid processing burdensome amounts of data at the BTCL level. The object name is passed from BTCL to compiled code which deals with the underlying data.

Please see Databases and Tables for additional information on objects.


Initiating a run

Given that these files exist, the Discover job can be started by several methods, depending on where you want its output to appear and the format of the command input file.

Direct method

Enter the following at the UNIX prompt:


>	discovery run_name

where run_name is your name for the calculation.

The first line of the command input file run_name.inp must be:


#BIOSYM dsl 3


or (preferably):


#BIOSYM btcl 3


which identifies it as a BTCL command input file for CDiscover.
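A missing or misspelled header can be caught before a job is launched by checking the first line of the file. The following is a minimal sketch in Python, run outside CDiscover. The function name has_btcl_header is hypothetical, and ignoring trailing whitespace on the header line is an assumption rather than documented CDiscover behavior.

```python
# Header lines that the text above says identify a BTCL command input file.
ACCEPTED_HEADERS = ("#BIOSYM btcl 3", "#BIOSYM dsl 3")


def has_btcl_header(text):
    """Return True if the first line of the input-file text is an accepted header."""
    lines = text.splitlines()
    if not lines:
        return False
    # Assumption: trailing whitespace on the header line is harmless.
    return lines[0].rstrip() in ACCEPTED_HEADERS


# Check the text of a small example input file.
example = "#BIOSYM btcl 3\nbegin\n"
print(has_btcl_header(example))
```

To check a real file, read its contents with open("run_name.inp").read() and pass the result to the function.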

The output from the calculations goes to the file run_name.out, and run_name.err contains messages from any serious errors that may have occurred.

Using the interactive utility

If no run name is specified after discovery, an interactive BTCL utility is started:


>	discovery

The BTCL prompt appears, and you need to enter the following:


BTCL >	set PROJECT run_name


BTCL >	source run_name.inp

Output from the calculations goes to standard output.

If you decide not to start a Discover run, you can end the BTCL session by entering exit, quit, or <Ctrl> d at the BTCL prompt.

File redirection method

Alternatively, the BTCL command set PROJECT run_name can be the first BTCL command in the file run_name.inp. In this case, initiate the run by entering:


>	discovery < run_name.inp

Output from the calculations goes to standard output.

Running CDiscover in parallel

CDiscover can be run in parallel for certain operations on some platforms. At present, parallelism is supported on SGI multi-processors, the IBM SP2 series, and Convex Exemplar computers. To run in parallel in standalone mode, add the following options anywhere on the command line:


>	discovery -mpi "-np N"

where N is the number of processors.

To run CDiscover jobs submitted from the Insight program in parallel, set the environment variable MPI_ARGS to '-mpi "-np N"' before running CDiscover. This can be done with the Session/Env_var command in the Insight interface or via the UNIX shell before invoking the Insight program:


>	setenv MPI_ARGS '-mpi "-np <N>"'


>	insightII

The efficiency of parallel computation depends on the system size and the calculation method chosen. At present, moderate numbers of processors work best (e.g., 4 CPUs).

Scripts and program initialization

CDiscover automatically initializes with a default discoverRc script. By modifying this script, the site administrator can make site-specific BTCL libraries available to users.

The script directory has been reorganized to include more example and gift scripts which demonstrate how to build scientific functionality with CDiscover.


An example CDiscover run

Input files

Below is a relatively simple example command input file which could be used to initialize a molecular system, minimize it to improve the molecular structure, and then perform a dynamics simulation on the minimized structure.

A more complicated series of command statements is demonstrated in the file named $BIOSYM/gifts/discover/tcl/acenm.inp. (The file sets up a calculation that compares the results obtained with several different forcefields.) Since the filename in this example is acenm.inp, the molecular data are read from the files acenm.car and acenm.mdf, which must be present in the same directory as the acenm.inp file. These example input files can also be found in the directory $BIOSYM/gifts/discover/tcl. To test them, first move to a directory in which you have write permission and copy the files by entering at the UNIX prompt:


>	cp $BIOSYM/gifts/discover/tcl/acenm.* .

A short example command input file


#BIOSYM btcl 3
begin
minimize \
    iteration_limit = 20 \
    newton method = newton_raphson \
    execute \
        frequency = 1 \
        before = 1 \
        after = 1 \
        command = {
              print output energy_summary = 1 \
              internal_energy = 1 nonbond_energy = 1}
dynamics \
    boltzmann = 1 \
    time = 100.000000 \
    timestep = 0.250000 \
    ensemble = nvt \
    execute \
        frequency = {10 * 0.250000} \
        command = {print output energy_summary = 1 state = 1}
dynamics \
    time = 200.000000 \
    timestep = 0.250000 \
    ensemble = nve \
    execute \
        frequency = {10 * 0.250000} \
        command = {print output energy_summary = 1 state = 1}


This file can be written entirely by hand, using any text editor, or a file from some previous run could be edited to produce this input file.

Alternatively, the input file can be constructed using the Cerius2 or Insight interface to CDiscover.

These Cerius2·Discover control panels would be used to construct the above file:

The following Insight·Discover_3 pulldowns and commands would be used to construct the above file:

The file is now ready to use in a standalone run.

The sample acenm.car and acenm.mdf files

The model in these files (in the $BIOSYM/gifts/discover/tcl directory) can also be constructed with the Cerius2 or Insight interface, by using the Builder facilities, followed by assigning forcefield atom types. Alternatively, model description files output by other software can be used, provided they have the appropriate formats, contents, and filename extensions.

Explanation of command input file

Setting up the computational system

The first command to be executed is the begin command. This loads the molecular system having the specified name acenm (i.e., the files acenm.car and acenm.mdf). The begin command also opens and writes some initialization information to the default output file, acenm.out.

The computational steps

The minimize command assumes that minimization should be complete within 20 steps.

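The first dynamics command is then executed, to perform a simulation of 100 femtoseconds in the NVT ensemble (since the timestep parameter is set to 0.25 fs, this amounts to 400 time steps). The second dynamics command then performs a simulation of 200 femtoseconds (800 time steps) in the NVE ensemble.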

Output of results

Each dynamics command includes an execute subcommand to print energy information every 2.5 fs, as well as at the beginning and end of the dynamics run. This information is appended to the file acenm.out (which was opened earlier, by the begin command), although it is possible to specify some other filename.
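The step counts and print interval quoted above follow directly from the time, timestep, and frequency values in the example input file. The short Python sketch below reproduces the arithmetic outside CDiscover. The function name step_counts is hypothetical, and the printout count covers only the regular interval printouts, not any extra output at the beginning or end of a run.

```python
def step_counts(total_time_fs, timestep_fs, print_every_steps):
    """Return (time steps, print interval in fs, number of interval printouts)."""
    n_steps = round(total_time_fs / timestep_fs)
    interval_fs = print_every_steps * timestep_fs
    n_prints = n_steps // print_every_steps
    return n_steps, interval_fs, n_prints


# Values taken from the two dynamics commands in the example input file:
# time = 100 or 200 fs, timestep = 0.25 fs, frequency = {10 * 0.25}.
for label, total_fs in (("nvt", 100.0), ("nve", 200.0)):
    n_steps, interval_fs, n_prints = step_counts(total_fs, 0.25, 10)
    print(label, n_steps, interval_fs, n_prints)
```

For the 100 fs run this gives 400 steps with a 2.5 fs print interval, matching the figures in the text. The 200 fs run gives 800 steps at the same interval.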

If any errors are encountered during the run, error messages also appear in the acenm.out file, and the run is aborted.

Initiating the run

The commands contained in the acenm.inp file are executed by simply changing to the directory that contains the acenm.inp, acenm.car, and acenm.mdf files and entering, at the system prompt:


>	discovery acenm


History file support

CDiscover can read and write history files. These files are in the same format as those of FDiscover and can be read by the Cerius2 and Insight programs. The history file is written by using a print command during a minimization or dynamics simulation.

The readFile command may be used to read a particular frame of a history file into the CDiscover program. In this way a history file might be converted into an archive file, for instance, by using the writeFile archive command. The return value of the readFile command, when it is applied to a history or archive file, is the potential energy of that frame. This would allow you to, for instance, construct scripts that sort the frames in an archive or history file based on energy.
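The sorting step mentioned above is simple once each frame's energy has been collected. The sketch below shows it in Python rather than BTCL. The frame numbers and energies are made-up illustrative values standing in for the values a script would gather from readFile, and the function name frames_by_energy is hypothetical.

```python
# Hypothetical (frame number, potential energy) pairs. In a real script the
# energies would be the values returned as each frame is read.
frame_energies = [(1, -120.4), (2, -133.9), (3, -128.1), (4, -133.9)]


def frames_by_energy(pairs):
    """Return the pairs sorted by energy, lowest first.

    sorted() is stable, so frames with equal energy keep their original order.
    """
    return sorted(pairs, key=lambda pair: pair[1])


for frame, energy in frames_by_energy(frame_energies):
    print(frame, energy)
```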


Editing the ESFF forcefield parameters

The explicit parameters of the ESFF forcefield may be modified for a particular molecular system. To do this, you first have to run a job for your system in the default mode (with the edit option flag esffOverrideParameters set to 0). An explicit parameter file called run_name.epa is output when the run finishes. You can then edit this run_name.epa file to modify the explicit parameters. Only the explicit parameters, such as reference values, force constants, etc., may be changed. To rerun the job with the modified explicit parameters you need to insert one command:


set esffOverrideParameters 1


after the command:


begin


in the input command file run_name.inp. With the edit option flag esffOverrideParameters now set to 1, only the parameters in the run_name.epa file are used. To use the default ESFF parameters to run the same molecular system once again, you need to reset the edit option flag esffOverrideParameters back to 0 by deleting the command line:


set esffOverrideParameters 1


from the input command file run_name.inp.


Tutorial--Standalone Mode

This tutorial is not available as a Pilot logfile, since Pilot functions only within the Insight interface. An additional lesson is available: Understanding & Creating BTCL Scripts.

Overview of tutorial lesson

Lesson 1: Using BTCL and TCL commands to manipulate the geometry of two helices is now supplied electronically (html-hypertext only--this lesson is also printed in this section, Lesson 1: Using BTCL and TCL commands to manipulate the geometry of two helices). In this lesson, we have two helices and want to find the relative orientation that has the lowest energy. The two helices are identical. We will first find their axes, then rotate one of them so that the axes are parallel. Then we will translate one of them so that the line joining the centroids of the two axes is perpendicular to both axes at a distance you define. After that, the second helix will be moved along the line joining the centroids. It will be spun around its own axis and then it will also be rotated around the first helix. At each orientation, the energy will be calculated. The data will be written in a table file readable by the Insight program, so we can sort the data using the spreadsheet. All the different conformations are stored in an archive file and can be replayed within the Insight interface using the Analysis/Trajectory command.

This lesson assumes you are already familiar with using the Insight program.

In this lesson you will learn specific syntax and commands for:

The angle between two lines using geometry ang1 angle line1 line2 (You can examine the input file for the relevant commands after you copy it in Step 1 of the lesson.)

A point on a line using geometry point1 point line1

The direction vector of a line using geometry vec vector line1

The distance between a point and a line using geometry dist distance point1 line2

The distance between two lines using geometry dist distance line1 line2

The cross product between two vectors using vector vec3 cross vector1 vector2

The negation of a scalar or vector using vector argout negate argin

The difference (subtraction) between two scalars or vectors using vector output subtract inarg1 inarg2

How to change an angle from radians to degrees using vector angdegree degree angradian (This is necessary because the geometry command works in radians but the molGeom command works in degrees)
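For readers who want to see the mathematics behind the operations listed above, the following Python sketch implements them with plain lists. It is an illustration only, not BTCL. All function names are hypothetical, lines are represented as a point plus a direction vector, and the angle is computed between the two direction vectors as given, which is an assumption about how a line angle is defined.

```python
import math


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def negate(a):
    return [-x for x in a]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def norm(a):
    return math.sqrt(dot(a, a))


def angle_between(u, v):
    """Angle in radians between two direction vectors."""
    cosine = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.acos(max(-1.0, min(1.0, cosine)))


def point_line_distance(p, line_point, line_dir):
    """Distance from point p to the line through line_point along line_dir."""
    return norm(cross(subtract(p, line_point), line_dir)) / norm(line_dir)


def line_line_distance(p1, d1, p2, d2):
    """Shortest distance between two lines, each given as point and direction."""
    n = cross(d1, d2)
    if norm(n) < 1e-12:
        # Parallel lines: measure from a point on one line to the other line.
        return point_line_distance(p1, p2, d2)
    return abs(dot(subtract(p2, p1), n)) / norm(n)


# Example: the x axis, and a line parallel to the y axis passing through z = 3.
angle_rad = angle_between([1, 0, 0], [0, 1, 0])
print(math.degrees(angle_rad))  # radians converted to degrees
print(line_line_distance([0, 0, 0], [1, 0, 0], [0, 0, 3], [0, 1, 0]))
```

The final conversion with math.degrees corresponds to the radians-to-degrees step needed when passing a geometry result to molGeom.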

Lesson 1: Using BTCL and TCL commands to manipulate the geometry of two helices

1. Obtaining the required files

Copy the files helix.inp, helix.car, and helix.mdf from the directory $BIOSYM/tutorial/discover/ to a directory in which you have write permission.

2. Examining the command input file

Read the command input file to understand it--the commands are associated with explanations in comment lines (starting with #).

A few hints to understanding the script are:

Assigning values to variables.

Choosing a subset of atoms and the atom specification syntax. (The most general syntax is "model_name:residue_name:atom_name".)

Writing and calling a procedure.

Writing a for loop.

Using an if statement.

Opening a file and writing data to a file.

Using various molGeom and geometry statements to do geometry manipulation.

Using various vector statements to do vector and scalar operations.

3. Running the CDiscover program in standalone mode

Run CDiscover by entering at the UNIX prompt (indicated as >):

>	discovery helix

Wait for the run to end.

4. Finding the output files

List the files in your working directory by entering:

>	ls -lt

Output files of the run should include helix.out, helix.cor, helix.arc, and helix.tbl.

5. Examining the output files

Examine the helix.out file (for example, with the UNIX more command). This is the default output name for the file that contains all the input lines in addition to the relevant output information.

Search for these output lines:

axis1 = line {{11.0926 17.2649 3.27548}} {{-0.821539 0.123481 0.55662}}
axis2 = line {{5.90624 11.5912 -3.12062}} {{-0.821539 0.123481 0.55662}}
line distance = 10
point distance = 10

The lines for axis1 and axis2 show the centroid (the first { } in each line) and the direction vector (the second { } in each line) of the axes. Note that axis1 and axis2 have the same direction, indicating that they are parallel after execution of the fixorientation procedure. The line distance is the distance between the two lines, and the point distance is the distance between the two centroids. These two distances are the same, indicating that the centroids are lined up perpendicular to both axes at the predefined distance apart.
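The reported distances can be checked by hand from the centroid and direction values printed in helix.out. The Python sketch below does so outside CDiscover, using the numbers shown in the output lines. The variable names are hypothetical, and the line-distance formula assumes the direction vector has unit length, which holds for the printed values to within rounding.

```python
import math

# Values copied from the axis1 and axis2 output lines.
axis1_point = (11.0926, 17.2649, 3.27548)
axis2_point = (5.90624, 11.5912, -3.12062)
direction = (-0.821539, 0.123481, 0.55662)  # identical for both axes

# Vector joining the two centroids.
separation = [a - b for a, b in zip(axis1_point, axis2_point)]

# Distance between the centroids.
point_distance = math.sqrt(sum(c * c for c in separation))

# Component of the separation along the common axis direction.
along_axis = sum(s * d for s, d in zip(separation, direction))

# Distance between the parallel lines: remove the along-axis component.
line_distance = math.sqrt(point_distance ** 2 - along_axis ** 2)

print(round(point_distance, 3), round(line_distance, 3), round(along_axis, 3))
```

Both distances come out at about 10 and the along-axis component is close to zero, which is what the text means by the centroids being lined up perpendicular to both axes.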

6. Displaying the structures with the Insight modeling program

The helix.cor file is the coordinate file written out after fixing the orientation. It should contain two helices parallel to each other and 10.0 Å apart.

You can display the structures in the helix.cor file with the Insight program.

Start up the Insight program from the directory in which you ran the Discover program by entering insightII at the UNIX prompt. Read in the helix.cor file with the File/Import or Molecule/Get command, and use the Geometrics/Vector command of the DeCipher module to create the axes of the two helices by finding the least-squares fit line of the backbone atoms subset specified by model_name:*:ca,n,c.

Before doing this, you might also want to display the models in the helix.car file to show that the helices in the original file were randomly oriented.

7. Displaying all the configurations that were found

Then use the Trajectory pulldown of the Analysis module to display all configurations of the different orientations of the two helices:

Trajectory/Get command: to get the trajectory for the assembly called HELIX

Trajectory/Animate command: to display all the conformations in sequence

The trajectory file is in the helix.arc file.

8. Sorting the data to find the lowest-energy configurations

Use the Spreadsheet functionality to read in the helix.tbl file and then sort the energies in ascending order.

Select the Spreadsheet icon, then select the Open command from the list that appears. Set the following parameter values:

File Type = Graph

File Name = helix.tbl

Object Name = Sheet

Select Execute.

Grab a corner of the spreadsheet window with the cursor and enlarge it to have a clearer view.

Use the Data/Sort command to sort the data, by first highlighting the Total energy column and then setting the following parameters:

Extent Choice = Column

Primary Key = E0 (type it in or select the Total energy column by clicking the E column label)

Secondary Key = None

Tertiary Key = None

Descending Order = Off

Select Execute.

9. Displaying the lowest-energy conformations

Go back to the animation process and display the configurations that have the lowest energies. You need to load only those conformations, with the Trajectory/Get command. Set Selection Mode to Specified and enter the desired frame numbers in the Frame Spec parameter box, separated by commas.

10. Applying what you have learned

You could try writing your own script to reorient a three-helix system or a four-helix bundle.


Command Summary--Standalone Mode

The BTCL language and its commands are documented in full in Btcl Language and Commands--Standalone Mode, where the commands are listed in alphabetical order. Below is a list of the BTCL commands grouped by function, each followed by a brief description.

Molecular operations and i/o

begin                 Read in the model and choose the forcefield
molGeom               Get and set geometric properties of models
readFile              Read in a data file
reset                 Reread the structural data files and forcefield
subStructure          List all atoms connected to an atom that are not also bonded to a second atom
writeFile             Write a data file

Calculations

atomMovability        Set and unset movability characteristics of atoms
cellParameter         Get and set cell parameters
dynamics              Perform molecular dynamics simulations
energy                Perform a single energy evaluation and calculate first and second derivatives
energyContribution    Pass energy and energy gradient contributions from user-defined TCL routines to the CDiscover program
forcefield            Select a forcefield, change and scale the contributions from various energy terms
minimize              Refine the structure of a molecular system
rattle                Set up constraints in bonds, angles, or water fragments for a dynamics run
restraint             Create, delete, and scale restraints on relationships among atoms

Analysis and output

analyzeNonbond        Calculate and examine repulsive, dispersive, and electrostatic energy components of individual atoms and total nonbond interaction energies between sets of atoms
diffraction           Calculate the X-ray, neutron, or electron scattering pattern
discoverHistory       Manipulate the history information from a dynamics run
output                Control the amount of printing to the output file
print                 Write out information for various system properties
vibrationalAnalysis   Perform normal mode analysis

Database operations

database              Perform operations on databases
$dbHandle             Access data within a database
object                Extract or print information from a BTCL object variable
pseudoAtom            Create, access, and update pseudoatoms
select                Create a list of row numbers from a hierarchical database
subset                Create or access subsets of objects in a database

Geometry and math operations

geometry              Create, review, and manipulate geometry objects
vector                Perform vector manipulation

Miscellaneous

help                         Print information about commands
Discover Btcl IPC commands   Commands for interprocess control
peek                         Control the monitoring of iterative processes in minimization and dynamics calculations




Last updated September 26, 1997 at 03:18PM PDT.
Copyright © 1997, Molecular Simulations, Inc. All rights reserved.