SeqFold



1       Introduction


What is SeqFold?

SeqFold is a fold recognition program that, given a novel protein sequence, helps identify the most compatible structure from a library of known protein structures. The similarity measure is based on both sequence and structural compatibility. This makes the method sensitive to distant sequence relationships between homologous proteins that conventional sequence comparison software cannot pick up.

The query sequence can be read into Insight II in any Insight-readable sequence format. The library of protein structures was compiled from structures in the Brookhaven Protein Data Bank (Bernstein, 1997).

On output, SeqFold reports an ordered list of the most compatible hits from the library. This list can be imported into an Insight II table. The corresponding alignments can be read into the Insight II sequence window to facilitate alignment manipulation and automatic model building.


Invoking SeqFold

In Insight II, click the MSI icon in the upper right corner and select Homology from the list of modules. The Homology pulldowns will appear on the lower menu bar. SeqFold is the last pulldown on the right and consists of three commands: Fold_Search, Fold_Browse and Fold_Load, designed to guide you through consecutive steps of the fold recognition process. Regular Insight II rules and conventions apply when you work with the SeqFold interface (e.g., all SeqFold commands may be run from the command line). Because SeqFold uses standard Insight objects, other relevant Insight II modules may be used to prepare objects for SeqFold input or to help analyze SeqFold output.


Running SeqFold

Saving intermediate files

SeqFold creates and deletes a number of files on the fly. However, all important input and output files created for or by SeqFold are preserved for user inspection. SeqFold can be executed in the background, with or without Insight II running.

MultiView

MultiView, a SeqFold analysis tool, can also be run independently of Insight II. It can be configured to access Internet or intranet information related to high-scoring SeqFold hits. See the Methodology chapter for more information.

Command logging and batch processing

Insight II keeps a record of every command you execute during a session. This log is written to the file WBLOGFILE in your current working directory and is renamed to insight.log when you exit Insight II. The insight.log file may be used as a template for running fold recognition on multiple target sequences.
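
The template idea can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes the recorded commands refer to the query sequence by a literal name, so that plain text substitution is enough, which this guide does not confirm. The helper name scripts_from_template and its parameters are hypothetical and are not part of Insight II or SeqFold.

```python
def scripts_from_template(template_text, recorded_name, target_names):
    """Return a dict mapping each target name to its own command script.

    Assumption (not confirmed by this guide): the name of the sequence used
    in the recorded session appears literally in the log text, so replacing
    that name is all that is needed to retarget the commands.
    """
    if recorded_name not in template_text:
        # Fail loudly instead of silently producing identical scripts.
        raise ValueError(f"{recorded_name!r} does not occur in the template")
    return {
        target: template_text.replace(recorded_name, target)
        for target in target_names
    }
```

In practice, template_text would be the contents of a saved copy of insight.log, and each returned script would be reviewed before being run, since the log may also record commands that should not be repeated for every target.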

Hardware and installation

For information regarding hardware, operating system level, and installation, please refer to the System Guide for Insight II Products.

Note that to run the MultiView application, you must install the JRE (Java Runtime Environment). On the SGI platform, the JRE is installed during the installation of Insight II. On IBM platforms, however, the JRE must be set up separately.


A note about implementation

SeqFold is implemented as a standalone program, but it is seamlessly integrated with the Insight II interface. This makes it convenient to run SeqFold jobs from Insight II and to load the results back for analysis and model building. However, when speed or the number of sequences to be processed is critical, SeqFold may be used as a command line program. The output file format is identical to that produced when SeqFold is run from within Insight II.

Note that if you execute the standalone command "seqfold" at the command line without parameters, a comprehensive description of all available options is printed.
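
When preparing batch jobs, it can be handy to save that description for reference. The following Python sketch runs the command with no parameters and returns whatever it prints. The function name seqfold_usage is hypothetical, and the sketch assumes only what is stated above: that the executable is called "seqfold" and can be found on your PATH. Which output stream the description is written to, and what exit status the program returns, are not documented here, so both streams are captured and the exit status is ignored.

```python
import shutil
import subprocess


def seqfold_usage(executable="seqfold"):
    """Run the executable with no parameters and return what it prints.

    Returns None when the executable cannot be found on PATH.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path],                    # no parameters, as described above
        stdin=subprocess.DEVNULL,  # never wait for interactive input
        capture_output=True,
        text=True,
        check=False,               # the exit status is not documented
    )
    # Assumption: the description may go to either stream, so keep both.
    return result.stdout + result.stderr
```

A typical use would be to call seqfold_usage() once and write the returned text to a file kept alongside the batch scripts.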

By default, SeqFold expects the sequence similarity matrix files and fold library files to be located in the Insight II release tree. These path locations are fully customizable, as are the output directory and the output file name.




Last updated December 10, 1998 at 12:46PM PST.
Copyright © 1998, Molecular Simulations, Inc. All rights reserved.