Electron density modifications
Electron density can be manipulated by any mean that will yield a more interpretable electron density map, finally resulting in a completly resolved and refined structure of a macromolecule. Most widely used and verified methods are electron density averaging and solvent flattening (Bricogne, Wang). These methods improve phases by including extra information from two sources, the local symmetry within an asymetric unit and the demarkation of areas occupied by protein and solvent. These then alter electron density maps, from which the whole crystal cell is built and structure factors are calculated, beginning a new cycle. The procedure is repeated until the apparent R-factor of the electron density map converges The aim is to produce an interpretable electron density map.
In last years number of programs dealing with electon density modifications and phase improvements is increasing (see for example papers in Acta Cryst. 1993 D49), so MAIN certainly is not the only choice, however, it offers a relatively simple command language structure and does not limit a user to a particular method or their combination or to a paticular asymetric unit or space group or number of crystal forms. MAIN allows imediate verification of procedures through its graphical facilities - thus making life in this field relatively easy.
MAIN offers a map manipulation language, which allows you to do much more as can be possibly described here. If you want to master more than the usual essentials, you are advised to read the 'MAKE MAP' commands from the "Reference manual".
Section "Solvent flattening" describes how to run the standard reals space procedure (Wang, ) and the fast one, which uses the Parseval's correlation theorem to calculate the molecular envelope in reciprocal space. Section "Electron density averaging" brings you essentialls of how to setup an electron density averaging procedure. Two procedures in two subsections are described. Subsection "Averaging of electron density: 2 molecules" explains a procedure within a single unit cell and is easy expandable to any number of equivelent density regions. Subsection "Averaging density between 2 different crystal forms" explains how to average electron density of two equal molecules in two different crystal forms. Section "Electron density combination" explains how and why combine different origin electron density maps in order to construct better maps. Its extension partial model refinement is described in chapter "Crystallographic refinement" in section "Partial model refinement".
Besides these there are some auxiliary sections, which help you to deal with certain concepts and tools connected with MAIN map manipulations. Section "Making envelopes" shows how envelopes can be constructed with MAIN. Section "Local histogram matching" introduces s short reciprocal space based histogram correlation procedure, which can be built in into any kind of map calculation. Section "Map concepts" gives insight into map structure and map manipulation command language. Section "Phase combination" describes how to combine phase combination into an electron density modification procedure using CCP4 package.
Map concepts
In the case of proper averaging one molecular envelope (mask) suffices for all molecules, while for improper averaging the procedure should distinguish between different molecular areas. For this reason the concept of labeled masks was introduced by Bricogne. In MAIN this is solved by storing each molecule's mask in a separate map. Since on each map only a single operation can be performed at a time, there are no principle differences between improper and proper averaging procedures.
The usual way to generate a complete crystal cell from an asymmetric unit was to write a procedure that applies the building rules. Building rules tell which grid point in a cell is equivalent to which one from the asymmetric unit. This approach has several drawbacks which can turn averaging (and solvent flattening as well) to a complicated procedure full of errors. First, for each different space group, different rules should be applied and second, in almost any space group it is possible to choose different definitions of an asymmetric unit. The third complication sometimes arises from differing numbers of grid points in a unit cell. These routines are, however, not available for every possible case and have to be, when necessary, programmed. Also when they are available their correctness should be verified for each single case. It happened quite often that there were whole empty layers left. At this point real problems may begin specially for an unskilled programmer. Therefore in MAIN another approach is applied: A crystal cell is generated from an asymmetric unit by applying crystal symmetry operations. The consequence of this generality is that there are no limitations on placement of an asymmetric unit and the routines work generally for any space group. Besides the asymetric unit may consist of several independent parts, each one stored in a separate map. The MAIN approach has also another advantage: since only empty points can be modified, there is no danger of having multiple density in certain regions.
Maps: enevelopes, density filled regions
A map is a 3-dimensional array of grid points, each with a value. According to the value, they are treated as empty, density or mask points. The grid points with values inside the density interval are the density points, the ones with values below the density interval are empty points, and the ones above are mask points. In the case of character*1 maps 0 is an empty point and 255 is a masked point. In the case of real*4 maps the empty points have values below -9999.0 and masked above 9999.0. The region inbetween comprises the density points.
Each map has size, starting coordinates and cell constants (cell constants are needed for transforming maps from differents cells). Grid points lying in different unit cells with the same fractional coordinates are identical. That means that, when a whole unit cell is defined in a map, the program can expand the density through the whole space.
There are 9 elementary operations types that can be done with maps:
The smallest map contains only a header where cell constants and crystal symmetry operations and number of grid points along the cell axes are stored. MAIN can read PROTEIN, X-PLOR and ASCII CCP4 as well as its own native formats of electron density maps. The maps can be written in Lyn Ten Eyck, X-PLOR and MAIN native formats. The native formats are ASCII files with record length 80 so that they can be edited and changed with a text editor. MAIN map files will be changed to unformatted, since some UNIX systems (Hewlett Packard and Silicon Graphics) cause problems when dealing with fixed length ASCII files. CONVEX FORTRAN causes however no problems. Besides a variety of smaller conversion programs were written to enable conversion of maps between PROTEIN, X-PLOR, P1SF, FRODO and native MAIN formats.
Operations on a map grid point can be applied when the point is empty or masked. (The exception is the SET command that can set a value to a grid point in any specified range.)
A new map can be created from an already existing one by taking its cell constants and size or from scratch. The map size, origin, and number of grid points per cell length and cell constants can be taken from an already existing map. The grid points are initialized to a specified value. The map origin and size can also be defined from an atom selection so that selected atoms plus some boundary grid points lie inside it.
Masked or empty points can be defined in several ways:
A map can be transformed (rotated, translated or copied) into the mask points of another map by linear interpolation. This is done so that the position of a mask point is transformed into the space of the map with density. The masked point density value is then obtained by interpolation from the eight surrounding grid points.
Electron density created from atoms can be added to density of a grid point or it can replace the value of an empty grid point. Empty points of a map can be filled with values of the density points of another map by applying crystal symmetry operations. This routine is independent of crystal symmetry. In order to find the position of the grid point into which the density value should be copied rotation matrices and translation vectors are applied.
Two maps at a time can be added or merged together. Adding differs from merging in the way, that only two equipositioned grid points having a density valu can be added together, while merging two maps allows scaling of both map contributions prior to their adding together. When a grid point has no density value its value taken from the other map.
Solvent flattening
It is assumed that the areas of the electron density map where protein lies have areas of greater positive electron density than the solvent areas. The solvent occupied areas inside the crystal have no rigid structure, so they can be `flattened' (all grids points in the solvent region obtain a single density value - usually zero). (B.C. Wang, Methods in Enzymology, 1985). Solvent flattening, when successfully applied, defines clear boundaries between solvent and protein regions and improves interpretability of an electron density map.
There are several variations of the score map calculations available. They can be performed either in real or reciprocal space using linear or gaussian distance dependent density or RMS deviations of density weighting. I recommand the use of WEIGHT GAUSSIAN.
First variables use later on throught the procedures are defined.
> set vari MAP_DENS = 1 > set vari MAP_FO = 2 > set vari MAP_FOFC = 3 > set vari MAP_2FOFC = 4 > set vari MAP_MASK = 5 > set vari SOLV_RAD = 8.0 > set vari HIST_TEMP = 0.0 > set vari SOLV_CUT 0.40
and symmetry operators read.
> read file >symm/c2.symm symm
The MAP_ variables assign different maps, SOV_RAD is the sphere radius, HIST_TEMP is the B value of the atom used for local histogram matching and SOLV_CUT is the proportion of the map that will be flattened.
You can either continue by reading an initial map
> read file ../prodef_mir.xmap map xpl > read file ../refl/catl.fobs refl init resol 3.0 100. limit -99 0 -99
or calculating it from FOBS and MIR phases
> read file cell.dat cell > read file ../refl/catl.fobs refl init resol 3.0 100. limit -99 0 -99 > set vari AGRID = 70 > set vari BGRID = 50 > set vari CGRID = 120 > make map MAP_DENS from 0 init 0 real grid AGRID BGRID CGRID cell > make map MAP_DENS conv complex > refl fill MAP_DENS defined > four map map MAP_DENS back > make map MAP_DENS rescale
and then create all neccessary maps for the solvent falttening procedure. It is assumed that the map 1 (MIR_MAP) contains the whole unit cell density. The MIR_MAP is here never altered in order to allow variations of paramaters during different runs. At each restart the MAP_DENS map density should be COPIED into the MAP_FO (2) map.
> make map MAP_FO from MAP_DENS init 9999 real cell copy > make map 3 from MAP_DENS init -9999 real cell > make map 4 from MAP_DENS init -9999 real cell
The calculation can be significantly accelerated if only points in an asymmetric unit are solvent flattened, however the BOX should be at least one grid larger then the actual asymetric unit.
> make map MAP_MASK from MAP_FO init 9999 box 0 0 0 37 50 62
The procedure is faster then the original programs (Wang, Methods in Enzymology, 1985). The summ of density above the specified threshold value (CUT 0.0) at all grid points FROM map MAP_FO within the specified radius (SOLV_RAD) around the central point WEIGHTED LINEARLY by distance is stored into the equivalent central point in map MAP_MASK.
> make map MAP_MASK from MAP_FO cut 0.0 radi SOLV_RAD \ > weight linear
WEIGHT GAUSSIAN 2.0 replaces the linear weighting, it however introduces additional parameter: a gaussian sigma value exp( - dist**2 / radius**2 * sigma**2). The default sigma is 2.0.
An additional keyword RMS calculates instead of density its RMS deviations within the specified sphere (Abrahams Acta. Cryst. D52, 30-42, 1996).
The reciprocal space (fast) procedure
The difference between the standard solvent flattening procedure described by Wang and the herein described fast procedure is in the scrore map calculation. The standard procedure calculates the weighted sum of positive electron density within a sphere. The fast procedure is based on Parseval's theorem which allows to calculate the scoring map (basis of a molecular envelope) in reciprocal space. Fourier coefficients of the weighting scheme P1 map and Fourier coefficients of the positive parts of the map to be flattened are multiplied and then backtransformed. The resulting map is a score map: convolution of the weighting scheme and the posiive parts of the map (Leslie, Acta Cryst. A43, 134-136).
The fast procedure is not numerically identical to the standard one because of Foruier series truncation error, however, it is sufficiently accurate to be used. The major advantage of the fast procedure is that it is incomparable faster than the standard one. It can be performed interactively at a workstation display. Its speed is not sphere radius dependent.
WARNING: Accuracy of the reciprocal space procedure depends on the completness of the low resolution data. If substantial part of low resolution data is missing use the real space procedures only.
You can generate the Leslie's analytical structure factors for the convlution sphere directly
> reflect solv_flat SOLV_RAD
or use your own form of weighting scheme by generating its density via atoms. I recommand use of the GAUSSIAN form.
Initiate a unit cell with the density set to EMPTY and a density of an atom that is placed at coordinate system origin and which density FUNCTION uses LINEAR or GAUSSIAN distance dependance (1. - d / SOLV_RAD). Set the empty points to zero, Fourier transform the map and save its Fourier coeficients in the FWORKSET array.
> make map MAP_SOLV from MAP_FO init -9999 cell real > make map MAP_SOLV atom function SOLV_RAD gaussian 2. > make map MAP_SOLV set -100000 0 0.0 > fourier map MAP_SOLV
And then store in both cases the obtained FCALC structure factors in the FWORK set:
> reflect set ampl phase fwork = fcalc * 1.0
Copy the MAP_FO into the MAP_SOLV, set all negative density values to zero and Fourier transform it. R-value calculation is performed in order to follow convergence of a further procedure.
> make map MAP_SOLV from MAP_FO init 9999 cell real > make map MAP_SOLV from MAP_FO copy > make map MAP_SOLV set -99999 0 0.0 > fourier map MAP_SOLV > reflect shells 10 r-values
Multiply FWORKSET (sphere model) and FCALCULATE (truncated MIR density) Fourier coeficients, convert the MAP_SOLV map to a convex map, fill it with the multiplied Fourier coefficients and Fourier BACK transform it. (The DEFINE option and application of all symmetry operators in the reflection fill-map command is OK too.)
> reflect set ampl phase fwork = fwork * 1.0 * fcalc * 1.0 > make map MAP_SOLV conv complex > make map MAP_SOLV zero > refl fill-map MAP_MASK > four map MAP_SOLV back > make map MAP_SOLV rescale
The map MAP_MASK is further analyzed - a histogram of density points in the 256 density intervals is prepared. (Number of intervals can vary from 1 to the maximal number of atoms your MAIN can accept). This histogram serves for the envelope cut. Note that in the case of convoultion theorem based score MAP_SOLV should be equal to MAP_MASK.
> analyze init > analyze map MAP_MASK step 256
> make map MAP_MASK analyzes range SOLV_CUT
Filling holes and removeing clouds
> set vari HOLE_SIZE = 6
At this stage you can in addition influence the maps either by FILLING the holes or remove the cloudes of the scattered mask. The command can be iterated.
> make map MAP_MASK + 1 from MAP_MASK init 9999 copy > make map MAP_MASK from MAP_MASK + 1 remove HOLE_SIZE
> make map MAP_MASK + 1 from MAP_MASK init 9999 copy > make map MAP_MASK from MAP_MASK + 1 fill HOLE_SIZE
The map MAP_SOLV is converted to a real*4 map empty map, with density values set to -9999. The regions occupied with molecule obtain density, while the solvent region remains empty.
> make map MAP_MASK from MAP_FO copy > make map MAP_SOLV conv real > make map MAP_SOLV zero > make map MAP_SOLV set -1 1 -9999. > make map MAP_SOLV from MAP_MASK cell
Note that in the case of convolution theorem based score map calculation no extra map is needed (MAP_MASK should be equal to MAP_SOLV) and that the unit cell generation step is superficial, so you only need to copy the density into the masked MAP_SOLV:
> make map MAP_SOLV from MAP_FO copy
Flattening and flippig of the solvent region
In order to flatten the solvent (empty) region its value can be simply set to zero or to some small negative value as -0.1.
> make map MAP_SOLV set -10000 -9900 0.0
For any other solvent region maipulations the solvent needs first to be converted into a mask
> make map MAP_SOLV set -10000 -9900 9999.0
and then flipped
> make map MAP_SOLV from MAP_FO invert reshift
or set to its average value
> make map MAP_SOLV from MAP_FO invert reshift scale 0.0
For further possibilities see "Command Reference Manual" MAKE MAP commands.
Fourier transformation of the solvent flattened map.
> fourier map MAP_SOLV > reflect shells 10 r-values
> reflect set ampl phase fwork = fcalc * 1.0
Use FOBS
> reflect set ampl fwork = fobs * 1.0
or 2FOBS - FCALC map for further calculation
> reflect set ampl scale fcalc fwork = fobs * 2.0 - fcalc * 1.0
At this point you can call the MAIN atomic probability function (See below the section "Local Histogram Matching".)
> if ( HIST_TEMP .gt. 2.0 ) <>utils/hist_match
or include phase combination (see the section "Phase Combination")
> make map MAP_FO zero > make map MAP_FO conv complex > refl fill-map MAP_FO defined > four map MAP_FO back > make map MAP_FO rescale
At this stage the procedure ends. The results can be stored or a new cycle started.
> write over file FILE_MAP map MAP_FO xpl
Solvent flattening procedure based on a skeletonized map
See menu items SKELETON and SOLV_MAS in menu block MAP_ATOM. In a while a better description will appear also here.
Generating molecular envelopes
Molecular envelope or a mask is a region within a map, which is accounted to one or several molecules. Points of a molecular mask in the real*4 map presentations have values larger than 9998.0.
Masks can be created with help of atoms by masking all grid points within a ceratin radii from the selected atoms and write it to a file.
> make map MAP_MASK from 1 init -9999 around 10 select segm name A end > make map MAP_MASK select segm name A end atom mask 4.0 > write fils model_mask.xmap map MAP_MASK xplor
In the simplest case a molecule is already available, so we can simply used it for the mask. In cases when this is not, a mask should be created out of a given electron density map. As soon as the electron density electron density interpretation proceeds, include the current models in mask editing procedures.
Filling holes and removing lonely clouds
Remove cloudes, which have less than a HOLE_SIZE in diameter, from a mask:
> make map MAP_MASK + 1 from MAP_MASK init 9999 copy > make map MAP_MASK from MAP_MASK + 1 remove HOLE_SIZE
Enlarge the mask for the holes which have less than HOLE_SIZE in diameter.
> make map MAP_MASK + 1 from MAP_MASK init 9999 copy > make map MAP_MASK from MAP_MASK + 1 fill HOLE_SIZE
Avoiding mask overlap(s) in a unit cell
Since the masks are based on atoms, the simplest is to check the symmetry overlap interactively on the screen by adapting a "re_image.cmds" and "symmetry.cmds" files so that they will generate images of the atoms used for a mask generation and their symmetry equivalents. Fiddle with the atoms until no overlap is seen (display maps and use them for defining an asymmetric unit when they are available).
The mask atoms in contact their symmatry mates need a reduce radius. The simplest way, however, is, to generate the symmetry mates around the masked atoms and then used them to remove the regions, they occupy, from a mask. Of course their radii should be reduced when compared to the radii used for the mask creation.
> delete atom sele segm name #* end > symmetry select mask_atoms end around select mask_atoms end dist 10. cut > make map MASK_MAP atom clear 1.8 sele segm name #* end
You can inspect the mask overlap also by using the masks to generate the unit cell.
Using secondary structure elements (helices)
A polypeptide chain folded into a secondary structure (a helix) can be used to efficiently mask substantial regions of space. With this models it is easy to check possible overlaps with a symmetry related regions. It is enough to master the elementary model building features and the way to ab initio envelope creation is open. The disadvantage of maps based on secondary structure models is that such models only approximately follow a map.
Skeleton is a way of an electron density map presentation, where skeleton atoms are conected with covalent bonds and displayed. The implemented algorithm is based on S. Swansons ideas. Skeleton atoms are placed at local density maxima and on saddle points connecting them. The maxima as well the saddle points have to be above a chosen density threshold level. The file "get_skel.cmds" creates and displayes skeleton atoms.
MAP_DENS provides density for skelotisation algorithm and MAP_SKEL is the map where the algorithm is performed. SKEL_CUT is the density threshold value.
> set vari MAP_DENS = 3 > set vari MAP_SKEL = 4 > set vari SKEL_CUT = 1.0
Here its is assumed that the MAP 1 includes the whole unit cell. The MAP_DENS becomes a copy of the unit cell density around the segment name MADL, you can also use a BOX (the commented example) to move the skeleton region out of the unit cell.
> make map MAP_DENS from 1 init 9999 around 10 sele segm name AMDL end > make map MAP_DENS from 1 copy >
Before the start, all old skeleton (skel) and symmetry (segm name #*) are deleted.
> delete atom sele skel .or. segm name #* end > key old sele all end
The follwoing block of commands performs the map skeletonization.
> make map MAP_SKEL from MAP_DENS init 0 integer > make point init from map MAP_DENS extreme SKEL_CUT 100. > make map MAP_SKEL from MAP_DENS local SKEL_CUT 100. > make map MAP_SKEL cloud > make point from map MAP_DENS saddle MAP_SKEL > make atom from point reindex > key skel sele .not old end
The skeleton atoms, which are not attached to any other atoms, are deleted and skeleton atoms are then reorganized so, that each covalent bond network forms a segment. Bonds crossing the unit cell (longer than 10. AA) as well as too small segments (containing less than 15 atoms) are deleted.
> dele atom sele .not by bond all .and skel end > make segm from atom sele skel end
> dele bond sele bond range 10. 1000. end > dele atom sele .not old .a segm range 1 15 end > rena sele skel end atom CX > set temp sele skel end 50. > set weigh sele skel end 1.
After a skeleton is created, it remains to choose the region that correspond to one or several molecules. A skeleton is composed of segments. A skeleton segment is a group of atoms attached into a covalent bond network. Segments include different numbers of atoms.
Segments including less than 15 atoms were already deleted. Small segments include between 10 and 200 atoms. Large segments are composed of more than 200 atoms, even a few thousand is possible. The small and large segments get the colors according to their size and are displayed.
> set col > sele skel .a segm range 10 100 end col 200 > sele skel .a segm range 100 10000 end col 160 > exit > image sele skel end set bond > return
It remains to select the molecular regions. The easiest way to prepare masks is by using the SELECT menu block items and when needed to break connection and delete one or several atoms. For this purpose the key 'active' is exploited.
Use only the default flags of the SELECT (light blue) menu block. Items 'KEY_ACTI', 'SEL_LAST', 'SELE_nn ' and 'SEL_MOLE' should be activated. With this flags on, you will choose only whole networks of covalent bonds (molecules). By picking atoms and functions 'SEL .or.', 'SEL.and.' and 'SEL.not.' enable you to select whole segments with help of logical operations: with 'SEL .or.' you include into the key 'active' additional network of skeleton atoms, with 'SEL.and.' you keep only the last picked network and with 'SEL.not.' you exclude the last network from the key. If a network is too large you can break it into smaller pieces by breaking bonds ('DELE_BON') or deleting some of its atoms ('DELE_ATO').
After the process is finished you can write the selected atoms to a file or immediately create a mask.
Procedures based on map statistics
The simplest approach is mask definition during a solvent flattening cycle.
Placing atoms on a 10 grid lattice
Electron density averaging
Electron density averaging requires clear boundaries between protein and solvent volumes, it should be almost inevitably preceded by a solvent flattening in the case of multiple isomorphous replacement phase evaluation.
There are two types of non-crystallographic or local symmetry: proper (also called spherical) and improper symmetry. Molecules of an asymetric unit are related by proper symmetry when they can be superimposed upon each other by a single rotation about a local symmetry axis, while in the case of improper local symmetry, superposition of the molecules require other operations (rotation usually combined with translation). Therefore, procedures for proper and improper symmetry averaging differ. For proper symmetry averaging, equivalent areas do not have to be separated, while for improper averaging it is necessary to distinguish between them.
Besides improving the phases (and the electron density maps) of the starting resolution range, it is possible to evaluate phases of higher resolution reflections by gradually increasing the resolution range. The procedure is called phase extension. The larger the number of molecules in an asymmetric unit, the better the results which can be obtained with phase extension, with however the condition that the initial phases are sufficiently correct for the procedure to converge properly ( Bricogne, 1974, Podjarny, 1990).
To perform real space electron density averaging some initial set of phases, equivalent areas and geometric transformations (rotation and translation parameters) are required.
In a molecular replacement procedure, the equivalent areas are defined from the initial model placement. Transformations between them can be easily constructed by superimposing the molecular models (as demonstrated in the cases of cathepsin B and riboflavin synthase).
In the case of a single or multiple isomorphous replacement procedure, it is possible to construct the transformation parameters from the heavy atom positions when they fulfill the local symmetry conditions. When they do not, it is necessary to construct the rotational parameters by an autocorrelation of a Patterson map and then to find the center of rotation and corresponding translational components by autocorrelating electron density (as demonstrated in the case of carbamoylsarcosine hydrolase). To recognize the boundaries of equivalent areas, it is recommended first to average the initial electron density map, whereby the uncorrelated areas are supposed to smear out, and in this averaged density to then find the borders (envelope) of the equivalent areas interactively at a graphical display by going through all the layers of an asymmetric unit (2-dimensional construction using a program such as X-CONTOUR (Buchberger, 1990)) or by preparing a 3-dimensional map representation of a whole asymmetric unit and determining the equivalent parts from its 3-dimensional skeletonized image (this can be done with MAIN).
The theoretical basis for electron density averaging or real space averaging was established in late 1960s and early 1970s (Rossman, 1972). Bricogne has written a review (Bricogne, 1974) of the method, its application and limitations. There it was shown that the averaging in direct space is equivalent to the procedure in reciprocal space and that averaging in direct space has advantages over reciprocal space procedures. The reason seems to lie in the greater computational inaccuracies in reciprocal space calculations. In 1976 he published a description of his program (Bricogne, 1976). Besides Bricogne's program, other attempts have also been made, but none of them (Johnson in Rayment et al., 1978; Nordman, 1980) is used so widely and in so many variations as Bricogne's. Recently, Lawrence (Lawrence, 1991) reviewed the method and its applications to {\it de novo} determined structures.
Averaging of electron density: 2 molecules
This procedure describes and performs averaging of electron density of two locally equivalent molecules that together form an asymmetric unit of a single crystal form. The procedure is intentionally not optimized, so it is easy to adapt it to any number of locally equivalent molecules. It is also assumed that your envelope as well as translational and rotational parameters can be obtained from molecular models.
The file do_all.com controls the run of the whole procedure. It first reads all neccessary data and set variables (file names, total number of avraging cycles (CYCLE_N), reads the data (cell constants, symmetry operations, molecular model and reflection file).
> set lev 2 > set vari FILE_CELL cb-inh-p21.cell > set vari FILE_SYMM >symm/p21.symm > set vari FILE_REFL cb-inh-p21.fobs > set vari FILE_ATOM hucathb_10t_2.xpl > set vari CYCLE_N = 4 > set vari MAT_ROT =3
> read file FILE_CELL cell > read file FILE_SYMM symm > read file FILE_REFL reflect init resol 3. 8. limit -99 -99 0 > read file FILE_ATOM atom pdb
The further procedure refers to MOL1 and MOL2 as segment names so be cautious when renaming your segment names. Segment name can contain only up to four characters (MOL1 is equal to MOL10)
> rename segm MOL1 sele segm name CBA1 CBB1 end > rename segm MOL2 sele segm name CBA2 CBB2 end
The rms_fit.com file determines the local symmetry transformations by optimizing the superposition of the two molecules and writes results (rotation matrix and translation vector in a form of MAIN command file that is later on applied within the averaging loop).
>
Here the unit cell is created.
> set vari MAP_CELL = 1 > set vari GRID_A = 128 > set vari GRID_B = 48 > set vari GRID_C = 128 > make map MAP_CELL from 0 init -9999 \ > grid GRID_A GRID_B GRID_C cell real
On the unit cell basis molecular envelopes for each of two molecular models are created within the file make_masks.com.
>
The unit cell is first filled with molecular model density and then (calc_a_map) retransformed into the first 2Fobs - Fcalc electron density map.
> make map MAP_CELL \ > atom dens select segm name MOL1 MOL2 end \ > atom symmetry >
The variable CYCLE is the electron density averaging cycle counter. The aver_loop.com procedure is completed when the CYCLE reached the total number of cycles (CYCLE_N).
> set vari CYCLE = 0 >
At the end the final map and structure factors are stored to a file.
> write over file aver_dens.xmap map MAP_CELL xpl > write over file aver_fcalc.refl reflect fobs fcalc polar defined
> quit
The rms_fit.com and make_masks.com only generate the neccessary data based on molecular models for the subsequent averaging. When molecular models are not available, it is neccesary to determine geometrical supperposition of localy equivalent regions by methods that optimize electron density superposition, and to create molecular envelopes on procedures based on solvent flattening and map skeletonisation (see description above and description of menu itmes SKELETON, SOLV_FLAT).
The 'aver_loop.com' is easily applicable to any kind of electron density procedure, It is relatively easy to expand the procedure to a larger number of molecules in asymmetric unit.
> set vari CYCLE = CYCLE + 1 > sh vari CYCLE > if CYCLE .gt. CYCLE_N return
> set vari MAP_DENS = MAP_CELL + 1
Rotating and adding density maps together. Remember that each molecular envelope should be an average of electron density of all of them. So call all 'mol_1_to_2.com' files followed by add_a_map.com' and store the averaged map on a disk. If you number of maps is small you may think of keeping them in memory all teh time, howvere, the procedure should be then slightly modified. (See below to the two crystal forms averaging.)
> set vari FILE_MASK mol1_mask.xmap > set vari FILE_DENS mol1_dens.xmap > read file FILE_MASK map xpl over MAP_DENS > make map MAP_DENS from MAP_CELL copy >make map MAP_DENS rescale > write over file FILE_DENS map MAP_DENS xplor
> set vari FILE_MASK mol2_mask.xmap > set vari FILE_DENS mol2_dens.xmap > read file FILE_MASK map xpl over MAP_DENS > make map MAP_DENS from MAP_CELL copy >make map MAP_DENS rescale > write over file FILE_DENS map MAP_DENS xplor
Unit cell density generation should call each locally averaged map one after another. The number of points which are incorporated into the unit cell should be more or less equal for all envelopes. If that is not the case, then you have substantial envelope overlap and you should return to the 'make_mask.com' procedure.
> make map MAP_CELL zero > make map MAP_CELL set -99999 99999 -9999.
Repeat this part until all averaged maps are processed.
> set vari FILE_DENS = mol1_dens.xmap >> set vari FILE_DENS = mol2_dens.xmap >
Calculate a new map and jump to the top ('rewind file').
>rewind file
The file 'add_a_map.com' read current molecular envelope, gets its rotated and translated density and adds it to the base map which is, when the density summ is complete, averaged.
> read file FILE_MASK map xpl over MAP_DENS + 1 > make map MAP_DENS + 1 from MAP_CELL \ > rota matr MAT_ROT \ > tran vect XTRAN YTRAN ZTRAN \ > center coor 0 0 0 > make map MAP_DENS from MAP_DENS + 1 add > return
The file 'aver_map_to_cell' incorporates an averaged density map into the unoccupied grid points of the unit cell by applying all symmetry operations.
> read file FILE_DENS map xpl over MAP_DENS > make map MAP_CELL from MAP_DENS cell > return
The file 'calc_a_map' calculates a new structure factor set, its r-value and then the map which eneters the next cycle of averaging:
> make map MAP_CELL set -10000 -9900 0.0 > fourier map MAP_CELL > reflect shell 12 r-values > refl set ampl phase fwork = fcalc * 1 > refl set ampl scale fcalc fwork = fobs * 2 - fcalc * 1 > make map MAP_CELL zero > make map MAP_CELL convert complex > reflect fill-map MAP_CELL defined > fourier map MAP_CELL back > make map MAP_CELL rescale > return
Averaging density between 2 different crystal forms
All command files you can fine in the directory 'main/doc/dens_mod/aver_dens/aver_2_crys/'. This procedure differs from the above described single crystal form averaging in the following:
Since MAIN is not capable to deal with several refelction sets and symmetry groups simoultaneously these have to be read each time when they are referenced. To make file manipulation easier (and error free) 'get_data_hex.com' and 'get_data_ortho.com' files were created. They are only seting values of string variable to apprpriate file names.
> set vari FILE_CELL MY_PROJECT:p6522/model/cell_1.dat > set vari FILE_SYMM >symm/p65_2_2.symm > set vari FILE_REFL MY_PROJECT:p6522/model/p622all.fob > set vari FILE_ATOM MY_PROJECT:p6522/model/model_26.xpl > set vari FILE_MASK mask_hex.xmap > set vari FILE_MAP cell_hex.xmap > return
The 'rms_fit.com' calculates the rotation and translational parameters of the procedure and saves the parameters into appropriate files referenced in the averageing part below.
>read file FILE_ATOM coor pdb > key MOL1 sele all end > > read file FILE_ATOM coor pdb > key MOL2 sele .not MOL1 end > > key ca sele atom name CA .a segm name CB_1 end > > rms coor all \ > sele ca .a MOL1 end \ > sele ca .a MOL2 end > save over file mol_ortho_to_hex.com rms > > rms coor all init 180. 0. 0. 10. 0. 0. \ > sele ca .a MOL2 end \ > sele ca .a MOL1 end > save over file mol_hex_to_ortho.com rms > > return
The 'make_masks.com' file is replaced by two command files 'make_mask_hex.com' and 'make_mask_ortho.com', where each of them defined molecular envelopes and also generates the firts molecular model based 2Fobs-Fcalc maps and writes them to files.
>
> read file FILE_CELL cell > read file FILE_SYMM symm > read file FILE_REFL reflect init resol 3.3 100. limit 0 0 0 friedel > read file FILE_ATOM atom pdb > set weight sele atom name H* end 0.0 > > key KEY_MASK sele all end
> set vari MAP_CELL = 1 > set vari MAP_MASK = 2 > set vari HOLE_SIZE = 8 > set vari ATOM_RAD = 3.5 > set vari GRID_ARO = 10
> make map MAP_CELL from 0 init -9999 grid 80 80 162 cell real > make map MAP_MASK from MAP_CELL init -9999 \ > around GRID_ARO sele KEY_MASK end real > make map MAP_MASK atom mask ATOM_RAD select KEY_MASK end
Fill the holes in envelope that have HOLE_SIZE grid points in diameter
> make map MAP_MASK + 1 from MAP_MASK init 9999 copy > make map MAP_MASK from MAP_MASK + 1 fill HOLE_SIZE
Generates symmetry equivelent atoms and subtracts their volume from the original model.
> symm sele all end around sele all end dist 6. cut > make map MAP_MASK set 9000 100000 100. > make map MAP_MASK atom mask ATOM_RAD / 2. sele segm name #* end > make map MAP_MASK set 9000 100000 -9999. > make map MAP_MASK set 0 1000 9999. > write over file FILE_MASK map MAP_MASK xplor
Finaly the electron density map is calculated. You can also shift this part into the 'do_aver.com' file as in the above case.
> make map MAP_CELL zero > make map MAP_CELL set -1 1 -9999. > make map MAP_CELL atom dens \ > atom symm select segm name MOL2 end > make map MAP_CELL rescale >write over file FILE_MAP map MAP_CELL xplor > > return
The 'do_aver.com' reads this files, initiates and performs the averaging.
> read file cell_hex.xmap map xpl over 1 > read file cell_ortho.xmap map xpl
> set vari RESOL_HEX = 3.3 > set vari RESOL_ORT = 3.3 > set vari CYCLE_N = 6 > set vari CYCLE = 0 > set vari MAT_ROT = 3 > set lev 2 >
> write over file cell_hex_aver.xmap map 1 xpl > write over file cell_ortho_aver.xmap map 2 xpl > write over file dens_hex_aver.xmap map 3 xpl > write over file dens_ortho_aver.xmap map 4 xpl
> set lev 0 > return
The 'aver_loop.com' starts with the loop couning and exits when sufficient number of cycleshas been performed.
> set vari CYCLE = CYCLE + 1 > sh vari CYCLE > if CYCLE .gt. CYCLE_N return
Rotating and adding density maps together:
> read file mask_hex.xmap map xpl over 3 > make map 3 from 1 copy > > read file mask_hex.xmap map xpl over 4 >make map 4 from 2 \ > rota matr 3 \ > tran vect XTRAN YTRAN ZTRAN \ > center coor 0 0 0 > > make map 3 from 4 add > make map 3 rescale >
> read file mask_ortho.xmap map xpl over 4 > make map 4 from 2 copy > read file mask_ortho.xmap map xpl >make map 5 from 1 \ > rota matr MAT_ROT \ > tran vect XTRAN YTRAN ZTRAN \ > center coor 0 0 0 > > make map 4 from 5 add > make map 4 rescale >
The new unit cell generation and a new 2Fobs-Fcalc map calculation for each crystal form independently:
>read file FILE_CELL cell > read file FILE_SYMM symm > read file FILE_REFL reflect init resol RESOL_HEX 100. limit 0 0 0 friedel > set vari MAP_CELL = 1 > make map MAP_CELL zero > make map MAP_CELL set -1 1 -9999. > make map MAP_CELL from 3 cell >
>read file FILE_CELL cell > read file FILE_SYMM symm > read file FILE_REFL reflect init resol RESOL_ORT 100. limit 0 0 0 friedel > set vari MAP_CELL = 2 > make map MAP_CELL zero > make map MAP_CELL set -1 1 -9999. > make map MAP_CELL from 4 cell > rewind file
Local histogram matching in contrary to global history matching, where histogram of the whole map is matched, calculates match of an density map with an atom locally for each point of the whole unit cell map.
The following procedure ">utils/dens_comb.com" generates electron density of a single atom (sulphur) placed at the coordinate system origin in a P1 cell with its' b-value equal to HIST_TEMP and calculates convolution of the single atom with the underlaying map using the convoluion theorem. (Structure factors of single atom map are multiplied with structure factors of the underlaying map and then backtransformed into a map.)
This procedure is intended to be called from other procedures as a subprocedue, which, after a structure factor calculation and its storage in the "fwork" array, multiplies it with the structure factors of the histogram. Product can be afterwards (after the return) transformed into a map that serves for inspection or further calculations.
This file is made to be called from a solvent flattening procedure and utilizes the MAP_SOLV for its structure factor calculation. In your own cases change the variable name or assign accordingly the MAP_SOLV and HIST_TEMP values.
main/utils/hist_match.com
> read atom > * > SGX
> set weight sele numb atom natom end 1.0 > set temp sele numb atom natom end HIST_TEMP > make map MAP_SOLV zero > make map MAP_SOLV conv real > make map MAP_SOLV atom dens sele numb atom natom end > delete atom sele numb atom natom end > four map MAP_SOLV > reflect set ampl phase fwork = fwork * 1.0 * fcalc * 1.0 > return
MAIN still does not include phase cmbination procedure. Herein a link to CCP4 SIGMAA program written by R. Read is described. First the phase information needs to be stored in the FCALC.
When your phase information is present in a molecular model then calculate its stracture factors:
make map MAP_FOFC conv real
make map MAP_FOFC zero
make map MAP_FOFC set -10 10 -9999.
make map MAP_FOFC atom symm \
atom dens \
sele work_set end
make map MAP_FOFC set -100000 -100 0.
fourier map MAP_FOFC
reflection r-values
If your phase information is stored in FWORK then generate FCALC.
> refl set ampl phase fcalc = fwork * 1.0
The FCALC are written to a file for further processing with CCP4 programs. The spawn command starts a shell script "main_2_sigmaa_2_main.ccp4" and preforms the phase combination. The example file is in "main/doc/dens_mod/phase_comb" directory.
> write over file main_fcalc.phase refl polar select defined end > spawn main_2_sigmaa_2_main.ccp4 > sigmma.out > read file sigmaa.phase refl over resol RESOL_LOW RESOL_HIGH friedel
The retrieved phases are stored in FOBS, so copy them to FWORK before a map calculation.