Help page for

0. Help for aligning and consensus deriving
1. Parameters and tips for mapping
2. More about the accessibility data
3. View result in 3D context interactively

0. Help for aligning and consensus deriving

0.1 Parameters for aligning mimotopes

MIMOX uses ClustalW to align mimotopes. ClustalW aligns multiple sequences in three stages: first, it aligns the sequences pairwise; then it constructs a dendrogram describing the approximate grouping of the sequences by similarity; finally, it performs the multiple alignment using the dendrogram as a guide. The parameters for pairwise alignment control the speed and sensitivity of the initial alignments, and the parameters for multiple alignment control the gaps in the final multiple alignment. ClustalW offers two methods for pairwise alignment. One is dynamic programming, which is slow but accurate; the other is the method of Wilbur and Lipman, which is extremely fast but approximate. The Wilbur and Lipman method uses two techniques to make pairwise alignment very fast: 1) only exactly matching fragments (k-tuples) are considered; 2) only the 'best' diagonals (the ones with the most k-tuple matches) are used. MIMOX uses only the fast pairwise alignment method to prevent the server from being overloaded. For a more accurate alignment, run ClustalW or similar software locally, or use the EBI ClustalW server. The parameters used in the MIMOX mimotope alignment interface are briefly explained in the table below.

Parameters	Brief Explanation
Word Size	The size of the exactly matching fragments (k-tuples) that are used. Increase for speed, decrease for sensitivity.
Top Diag	Top Diagonals. The number of best diagonals (those with the most k-tuple matches) that are used. Decrease for speed, increase for sensitivity.
Window	Window Size. The number of diagonals around each of the 'best' diagonals that will be used. Decrease for speed, increase for sensitivity.
Pair Gap	Gap penalty for each gap in the fast pairwise alignments. It has little effect on speed or sensitivity except at extreme values.
ScoreType	Type of similarity score calculated: PERCENT or ABSOLUTE.
Weight Matrix	Weight matrix used to determine the similarity of non-identical amino acids. BLOSUM (by Henikoff): probably the best available for database similarity (homology) searches. PAM (by Dayhoff): extremely widely used since the late 1970s. GONNET: derived using almost the same procedure as the Dayhoff matrices, but much more up to date and based on a far larger data set. ID: identity matrix, which gives a score of 1.0 to two identical amino acids and a score of zero otherwise; not very useful.
Gap Open	Gap opening penalty for multiple alignment.
GapExt	Gap extension penalty for multiple alignment.
OutOrder	List the output sequences in INPUT or ALIGNED order.
Seq No	Turn OFF or ON the residue count shown for each aligned sequence segment.

Tips: try the default parameters first, then adjust them if necessary.
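
To make the roles of Word Size and Top Diag more concrete, here is a minimal Python sketch of the k-tuple idea behind the fast pairwise method. It is an illustration only, not ClustalW's actual code: the function name ktuple_diagonals and its arguments are hypothetical, and the window and gap penalty steps are left out.

```python
from collections import Counter, defaultdict

def ktuple_diagonals(seq_a, seq_b, word_size=1, top_diag=5):
    """Count exact k-tuple matches per diagonal and return the best diagonals.

    Hypothetical helper for illustration; not part of ClustalW or MIMOX.
    """
    # Index every k-tuple of seq_a by its start positions.
    positions = defaultdict(list)
    for i in range(len(seq_a) - word_size + 1):
        positions[seq_a[i:i + word_size]].append(i)

    # A k-tuple match starting at (i, j) lies on diagonal i - j.
    diagonals = Counter()
    for j in range(len(seq_b) - word_size + 1):
        for i in positions.get(seq_b[j:j + word_size], ()):
            diagonals[i - j] += 1

    # Keep only the 'best' diagonals, i.e. those with the most matches.
    return diagonals.most_common(top_diag)

if __name__ == "__main__":
    print(ktuple_diagonals("CSGLRNETFLRC", "CEFFQQHMLRVPRC", word_size=1, top_diag=3))
```

A larger word_size yields fewer matches to examine (faster, less sensitive), while a larger top_diag keeps more diagonals for the later alignment step (slower, more sensitive).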

0.2 Requirements for input sequence or sequence file

The MIMOX mimotope alignment interface only accepts input sequences or uploaded sequence files in FASTA format. Briefly, each record in a FASTA file starts with a line beginning with a ">" symbol, followed by the name and description of the sequence; the lines between two ">" lines hold the sequence data. Here is an example:

>mimoseq1
CSGLRNETFLRC

>mimoseq2
CEFFQQHMLRVPRC

>mimoseq3
CNMKLKLREMTQRC

You can find more real examples on the test data page. Because the purpose of the MIMOX server is to provide a web tool for analyzing mimotopes derived from phage display, we set further requirements for the input sequences or the uploaded sequence file. These requirements also protect the MIMOX server from overloading and abuse.
(1) only 3~100 mimotope sequences are accepted for aligning
(2) each ">" header line should be within 25 characters
(3) the longest sequence should be within 60 residues
(4) only the residues A C D E F G H I K L M N P Q R S T V W Y are supported.
Tips: you can convert your sequence file into FASTA format through the online service powered by READSEQ.
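
If you want to check a file before uploading it, the four requirements above can be sketched in Python as follows. This is not the server's own code. The function names are hypothetical, and two points are assumptions: the 25-character limit is applied to the whole header line including ">", and lower-case letters are reported as unsupported because this section does not say whether they are accepted.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            records.append([line, ""])
        elif records:
            records[-1][1] += line
        else:
            raise ValueError("sequence data found before the first '>' line")
    return [(header, seq) for header, seq in records]

def check_mimotope_set(records):
    """Return a list of problems; an empty list means all four checks pass."""
    problems = []
    if not 3 <= len(records) <= 100:
        problems.append(f"expected 3 to 100 sequences, got {len(records)}")
    for header, seq in records:
        if len(header) > 25:  # assumption: the '>' counts toward the limit
            problems.append(f"{header}: header line longer than 25 characters")
        if len(seq) > 60:
            problems.append(f"{header}: sequence longer than 60 residues")
        unsupported = sorted(set(seq) - VALID_RESIDUES)
        if unsupported:
            problems.append(f"{header}: unsupported residues {''.join(unsupported)}")
    return problems

EXAMPLE = """>mimoseq1
CSGLRNETFLRC

>mimoseq2
CEFFQQHMLRVPRC

>mimoseq3
CNMKLKLREMTQRC
"""

if __name__ == "__main__":
    print(check_mimotope_set(parse_fasta(EXAMPLE)))
```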

0.3 Deriving consensus sequence from alignment

We use a simple statistical method to derive the consensus sequence from a set of aligned mimotopes. Briefly, the program counts the occurrences of each amino acid at each position and calculates its percent frequency. If the frequency is above a threshold, the corresponding residue is considered a motif residue at that position. The frequency threshold is 25% by default. The motif residues are concatenated to form the "consensus sequence"; if no motif residue is found at a given position, X is used. If two or more motif residues are found at a given position, they are put into [ ]. For example: CXX[LR][PR]TNXTTLRRC. A 3D bar figure is also created accordingly. The Z axis represents the position of the aligned sequences. You can then map the whole mimotope consensus sequence, or a segment of it, back to the corresponding PDB structure.
Tips: check the result table carefully, and revise the consensus sequence manually if necessary. For example, you can group ILV, DE, RK, QN, ST and re-evaluate the motif residues at each position.
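
The counting procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MIMOX source: the function name derive_consensus is hypothetical, "above a threshold" is read as strictly greater than, gaps ("-") are not counted as residues, and frequencies are taken relative to the total number of aligned sequences.

```python
from collections import Counter

def derive_consensus(aligned, threshold=25.0, gap="-"):
    """Derive a consensus string from equal-length aligned sequences."""
    n_seqs = len(aligned)
    parts = []
    for column in zip(*aligned):
        # Count each residue in this column, ignoring gap characters.
        counts = Counter(res for res in column if res != gap)
        # Motif residues are those whose percent frequency exceeds the threshold.
        motif = sorted(res for res, n in counts.items()
                       if 100.0 * n / n_seqs > threshold)
        if not motif:
            parts.append("X")            # no motif residue at this position
        elif len(motif) == 1:
            parts.append(motif[0])
        else:
            parts.append("[" + "".join(motif) + "]")
    return "".join(parts)

if __name__ == "__main__":
    print(derive_consensus(["CAL", "CAR", "CGL", "CTR"]))
```

In the small example, C occurs in all four sequences at position 1, A in half of them at position 2, and L and R in half each at position 3, so the result is CA[LR].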

0.4 About JalView

We utilize JalView (Release 2.08.1) to manage the alignment. Jalview is a multiple alignment editor written in Java (Clamp M, Cuff J, Searle SM, Barton GJ. The Jalview Java Alignment Editor. Bioinformatics 2004, 20: 426-427). We embed the applet version of Jalview into the web page. You must have Java (version 1.2 or above) installed on your computer and have enabled your browser to run Java applets. If so, it is quite easy to view and edit the alignment. Converting the format of the alignment file, drawing the phylogenetic tree, and extracting the consensus sequence are also convenient in JalView. For more help, go to the help page of JalView.

1. Parameters and tips for mapping

1.1 Requirement for input sequence

When mapping a mimotope sequence or a consensus sequence from a set of mimotopes, the sequence must be entered according to IUPAC's one-letter system for amino acids. However, MIMOX only supports A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y. Letters such as B, U, X, Z are illegal in MIMOX at present. The sequence length MIMOX supports is 2~18. Tips: it doesn't matter if you enter the sequence in lower case or mix spaces into it. MIMOX will remove all spaces and convert the sequence to upper case automatically. Besides, you can run MIMOX with the sequence of just one mimotope, a consensus sequence derived from a set of mimotopes, or even a virtual sequence you compose. It is up to you.
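
The clean-up and checks described above might look like the following Python sketch. The function name normalize_query is hypothetical and the code is not taken from MIMOX; it only mirrors the stated rules (remove spaces, convert to upper case, length 2~18, 20 standard residues).

```python
SUPPORTED = set("ACDEFGHIKLMNPQRSTVWY")

def normalize_query(raw):
    """Strip whitespace, upper-case, and validate a sequence for mapping."""
    seq = "".join(raw.split()).upper()   # removes all whitespace, not only at the ends
    if not 2 <= len(seq) <= 18:
        raise ValueError(f"sequence length {len(seq)} is outside 2~18")
    illegal = sorted(set(seq) - SUPPORTED)
    if illegal:
        raise ValueError(f"illegal letters: {''.join(illegal)}")
    return seq

if __name__ == "__main__":
    print(normalize_query(" csg lrn etf "))
```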

1.2 Requirement for protein structure file

The protein structure file must be in PDB format, with the file extension 'pdb' or 'ent'. At present, MIMOX only supports a single chain. If your PDB file has several chains (for example, if the protein you are interested in is part of an antigen-antibody complex), you should edit the PDB file first. Tips: if the protein you are interested in has no structure file in the PDB database, you can try to construct one through homology modeling.

1.3 Select candidate residue pickup mode

When MIMOX runs, it searches the uploaded PDB structure for residues that match those of the input sequence and collects them into an array of arrays. MIMOX provides 3 modes (Strict, Similar, and Arbitrary) for picking up candidate residues. In strict mode, only exactly matched residues are picked up. This is the default. In similar mode, conservatively matched residues are picked up as candidate residues in addition to exactly matched ones. In arbitrary mode, all residues in the PDB file are taken as candidate residues; in effect, mismatches are allowed. However, this option is disabled at present, because it takes too much time to complete the task. Tips: when you get no result or an unreasonable result with strict mode, try similar mode.
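
As a rough illustration of the strict and similar modes, the Python sketch below builds one list of candidate residue numbers per query position (an "array of arrays"). It is not the MIMOX implementation. The function name pick_candidates is hypothetical, and the similarity groups are an assumption borrowed from the grouping tip in section 0.3 (ILV, DE, RK, QN, ST); the groups MIMOX actually uses are not stated here.

```python
# Assumption: conservative groups taken from the tip in section 0.3.
SIMILAR_GROUPS = ["ILV", "DE", "RK", "QN", "ST"]

def pick_candidates(query, chain, mode="strict"):
    """Return, for each query residue, the residue numbers that match it.

    chain is a list of (residue_number, one_letter_code) tuples.
    """
    if mode not in ("strict", "similar"):
        raise ValueError("mode must be 'strict' or 'similar'")
    candidates = []
    for q in query:
        allowed = {q}
        if mode == "similar":
            for group in SIMILAR_GROUPS:
                if q in group:
                    allowed |= set(group)   # accept any member of the same group
        candidates.append([num for num, res in chain if res in allowed])
    return candidates

if __name__ == "__main__":
    chain = [(1, "D"), (2, "E"), (3, "K"), (4, "L")]
    print(pick_candidates("DK", chain, "strict"))
    print(pick_candidates("DK", chain, "similar"))
```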

1.4 Select method for calculating distance

MIMOX currently provides 3 methods (CA, CB, AHA) to calculate neighbors. By 'calculate neighbors', we mean the process of computing the distance between two amino acids and determining whether they are near enough to be neighbors. CA means that the distance between 2 residues is calculated based on their C alpha atoms. This is the default method. CB means that the distance between 2 residues is calculated based on their C beta atoms. However, the C alpha atom is still used for glycine, because it has no C beta atom at all. AHA means that the distance between 2 residues is calculated based on all heavy atoms. Tips: when you get no result with CA, give the CB and AHA methods a try.
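
The CA and CB methods can be sketched in Python as below. This is a minimal illustration, not MIMOX code: the residue representation (a dict with an "atoms" mapping from atom name to coordinates) and the function names are hypothetical, and the 7.0 default is used only as an example threshold in Angstroms.

```python
import math

def reference_atom(residue, method):
    """Return the coordinates used to represent a residue for CA or CB."""
    atoms = residue["atoms"]
    if method == "CA":
        return atoms["CA"]
    if method == "CB":
        # Fall back to CA when there is no CB, as is the case for glycine.
        return atoms.get("CB", atoms["CA"])
    raise ValueError("method must be 'CA' or 'CB'")

def are_neighbors(res1, res2, method="CA", threshold=7.0):
    """True if the two reference atoms are within the distance threshold."""
    d = math.dist(reference_atom(res1, method), reference_atom(res2, method))
    return d <= threshold

if __name__ == "__main__":
    gly = {"name": "GLY", "atoms": {"CA": (0.0, 0.0, 0.0)}}
    ala = {"name": "ALA", "atoms": {"CA": (6.0, 0.0, 0.0), "CB": (7.5, 0.0, 0.0)}}
    print(are_neighbors(gly, ala, "CA"), are_neighbors(gly, ala, "CB"))
```

The same pair of residues can be neighbors under one method and not under another, which is why trying a different method can help when a search returns nothing.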

1.5 Select distance factor or threshold

As stated previously, MIMOX needs a criterion to determine whether 2 residues are neighbors. This criterion can be either a distance threshold given directly or a distance factor, which produces a variable distance threshold for each pair of atoms according to the following formula:

DT = DF × (vdwAtom1 + vdwAtom2)

DT: Distance Threshold, DF: Distance Factor, vdwAtom: Van der Waals radius of the atom. When the distance between 2 residues is calculated based on C alpha or C beta atoms, only the distance threshold is valid; it ranges from 4.0Å to 12.0Å with a default value of 7.0Å. When the distance between 2 residues is calculated based on all heavy atoms, only the distance factor is valid; it ranges from 1.00 to 1.33 with a default value of 1.11. Tips: when you get no result with a short distance threshold, try a longer one. The same applies to the distance factor.
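
As a worked example of the formula, the Python sketch below computes DT for a pair of atoms and applies it to two residues under the AHA method. Several things here are assumptions rather than facts about MIMOX: the van der Waals radii are the commonly cited Bondi values, the function names are hypothetical, and two residues are treated as neighbors if any pair of their heavy atoms lies within that pair's threshold.

```python
import math

# Assumption: Bondi van der Waals radii in Angstroms; MIMOX may use other values.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

def distance_threshold(element1, element2, factor=1.11):
    """DT = DF x (vdwAtom1 + vdwAtom2)."""
    return factor * (VDW_RADII[element1] + VDW_RADII[element2])

def are_neighbors_aha(atoms1, atoms2, factor=1.11):
    """atoms1 and atoms2 are lists of (element, (x, y, z)) for heavy atoms."""
    return any(
        math.dist(p1, p2) <= distance_threshold(e1, e2, factor)
        for e1, p1 in atoms1
        for e2, p2 in atoms2
    )

if __name__ == "__main__":
    # With these radii, a C-O pair at DF = 1.11 gives 1.11 x (1.70 + 1.52) = 3.5742.
    print(round(distance_threshold("C", "O"), 4))
    print(are_neighbors_aha([("C", (0.0, 0.0, 0.0))], [("O", (3.5, 0.0, 0.0))]))
```

Under these assumed radii, a carbon and an oxygen 3.5Å apart count as neighbors at the default factor of 1.11 but not at a factor of 1.00, where the threshold drops to 3.22Å.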

2. More about the accessibility data

2.1 Where does the accessibility data come from?

We thank Dr. Hubbard for providing us with a free copy of NACCESS (version 2.1.1). MIMOX calls it directly to compute the accessibility and then simply parses the result file. Briefly, the naccess program calculates the atomic accessible surface defined by rolling a probe of a given size around a van der Waals surface. This program is an implementation of the method of Lee and Richards (1971) J. Mol. Biol. 55, 379-400. For more information, please visit the homepage of NACCESS.

2.2 What do all those abbreviations in table header mean?

Briefly, ABS stands for absolute accessibility and REL stands for relative accessibility. So:

Abbreviation	Explanation
AllABS	absolute accessibility of all atoms of a residue
AllREL	relative accessibility of all atoms of a residue
SideABS	absolute accessibility of sidechain atoms of a residue
SideREL	relative accessibility of sidechain atoms of a residue
BoneABS	absolute accessibility of backbone atoms of a residue
BoneREL	relative accessibility of backbone atoms of a residue
NonpoABS	absolute accessibility of non-polar sidechain atoms of a residue
NonpoREL	relative accessibility of non-polar sidechain atoms of a residue
PolarABS	absolute accessibility of polar sidechain atoms of a residue
PolarREL	relative accessibility of polar sidechain atoms of a residue

Note that alpha carbons are classed as sidechain atoms, so that glycine can have a sidechain accessibility.

2.3 What is the difference between ABS and REL?

Absolute accessibility is an atomic accessible surface area, or the direct sum of atomic accessible surface areas over some or all atoms of a residue or even of the whole chain. Absolute accessibility is therefore expressed in square Angstroms. Relative accessibilities are calculated for each amino acid in the protein by expressing the various summed residue accessible surfaces as a percentage of that observed in an ALA-X-ALA tripeptide. The tripeptides were built using the QUANTA molecular graphics package, in extended conformations, so as to expose the central X residue in the tripeptide as much as would normally be possible in a protein. Because of unusual bond angles, bond lengths and distorted geometry in real proteins, these values can often exceed 100% (especially for proline, where we used a real ala-pro-ala tripeptide built from a real protein). Residues next to a chain break, such as at the N- or C-terminus of the protein, will also have very large %asa values.
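
The relation between ABS and REL amounts to a single division. The Python sketch below shows it with made-up numbers. The function name is hypothetical, and the reference area must come from the ALA-X-ALA tripeptide values used by NACCESS, which are not reproduced here.

```python
def relative_accessibility(abs_area, reference_area):
    """Express an absolute area (square Angstroms) as a percentage of a reference area."""
    if reference_area <= 0:
        raise ValueError("reference area must be positive")
    return 100.0 * abs_area / reference_area

if __name__ == "__main__":
    # Illustrative numbers only, not real NACCESS reference values.
    print(relative_accessibility(54.0, 108.0))
    print(relative_accessibility(120.0, 100.0))
```

The second call shows how a residue that is more exposed than in the reference tripeptide gets a value above 100%.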

3. View result in 3D context interactively

3.1 Brief introduction

The results of MIMOX can be viewed interactively in the context of the corresponding PDB structure. The visualization function of MIMOX is powered by Jmol, a free, open source, cross-platform molecule viewer. MIMOX uses the Jmol Java applet to display results in web pages; the applet runs the scripts attached to the Jmol checkboxes and buttons. These scripts change the rendering of the candidate neighbor cluster and its context to illustrate important structural features. By default, all structures are displayed as backbone colored by secondary structure. Tips: if prompted, you should allow your browser to view blocked content.

3.2 Turn candidate neighbor cluster on/off

The checkboxes below the section "Residues and locations of the candidate cluster" are off by default. Each residue of the candidate neighbor cluster has a corresponding checkbox. To display the candidate neighbor cluster partly or completely, check some or all of the checkboxes and the corresponding residues will "blink" into view. The candidate neighbor cluster is displayed as spacefill and colored by CPK. Unchecking the checkboxes turns it off again. Tips: be patient! Check or uncheck a checkbox only after the script execution has completed.

Figure 1: Turn candidate neighbor cluster on

3.3 Turn protein context on/off

The checkboxes below the section "Protein context of the candidate cluster" are off by default. To display the protein context in spacefill rather than backbone, check the corresponding checkbox and the corresponding protein chain will "build" up. Uncheck the checkbox and the chain will dissolve again.

Figure 2: Turn protein context on

3.4 Zoom in/out

Put your mouse on the structure image. Click the left mouse button (to get focus) and then roll the mouse wheel down and up. If your mouse has no wheel, you can hold SHIFT on the keyboard and drag down and up to zoom in and out.

3.5 Spin and rotate

Put your mouse on the structure image. Hold down the left mouse button and then move your mouse around.

3.6 Move structure

Put your mouse on the structure image. Hold SHIFT, double click the left mouse button, and then drag.

3.7 Reset structure

Click the command button "Reset to Original Size and Position". Alternatively, put your mouse on the structure image, click the left mouse button (to get focus), and then hold SHIFT and double click the left mouse button.

3.8 Get information of atom, residue, and chain

When an atom is in spacefill mode, move your mouse over the atom you are interested in and hover for a while. Information such as the name and number of the atom, the residue the atom belongs to and its number, and the chain the residue belongs to will then appear at the pointer.

Figure 3: Get information of atom

3.9 Measure distance

When the candidate neighbor cluster and its protein context are checked on, they are displayed in spacefill mode. In this state, you can measure the distance between 2 atoms. Double click on the starting atom, and then move your pointer to the second atom. The distance between the 2 atoms will be displayed in magenta. Moving the pointer out of the image cancels the measurement. Double clicking on the second atom labels the measured distance. To delete the label, just do the same measurement again!

3.10 Measure angle

Measuring the angle between 3 atoms is similar to measuring a distance. Briefly, double click on the starting atom, and then click on the second atom. Finally, point to the third atom to show the angle, or double click on it to label the angle. Moving the pointer out of the image cancels the measurement. To delete the label, just do the same measurement again!

Figure 4: Measure distance and angle

3.11 Measure dihedral

Measuring a dihedral (torsion angle) is similar to measuring an angle. Briefly, double click on the starting atom, and then click on the second and the third atoms. Finally, point to the fourth atom to show the dihedral, or double click on it to label the dihedral. Moving the pointer out of the image cancels the measurement. To delete the label, just do the same measurement again!

3.12 More complicated operations

Use the Jmol menu. Put your mouse on the structure image and click the right mouse button to open the Jmol menu. Clicking the Jmol logo (at the bottom right) with the left mouse button also opens the Jmol menu.

HLAB | Center of Bioinformatics | UESTC | Chengdu, 610054, China [Feedback]