AGONOTES


The AgoNotes search algorithm

BLASTP

BLASTP is used to search query protein sequences against a protein database built with the MAKEBLASTDB program. AgoNotes uses version 2.6 of the BLAST similarity search algorithm, downloaded from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/, to search against a database of proteins containing the PIWI domain.
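As an illustration only, the two steps described above (building a protein database, then searching it with BLASTP) can be sketched as the following command lines, assembled in Python. The file names, the database name piwi_db and the tabular output format are assumptions made for this example; the exact options used by the AgoNotes server are not documented here.

```python
def build_blast_commands(db_fasta, query_fasta, db_name="piwi_db", evalue=10.0):
    """Return (makeblastdb, blastp) command lines as argument lists.

    All file and database names are hypothetical placeholders.
    """
    # Step 1: format a protein FASTA file as a BLAST database.
    make_db = ["makeblastdb", "-in", db_fasta, "-dbtype", "prot", "-out", db_name]
    # Step 2: search the queries against it, reporting hits up to the E-value cut-off.
    search = ["blastp", "-query", query_fasta, "-db", db_name,
              "-evalue", str(evalue), "-outfmt", "6"]
    return make_db, search


make_db, search = build_blast_commands("piwi_proteins.fasta", "query.fasta")
print(" ".join(make_db))
print(" ".join(search))
```

If BLAST+ is installed locally, each list can be passed to subprocess.run(cmd, check=True) to execute it.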

MUSCLE

MUSCLE is used to perform multiple sequence alignment and can be downloaded from http://www.drive5.com/muscle/downloads.htm. AgoNotes uses MUSCLE version 3.8.31 to align the proteins containing the PIWI domain.
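For readers who want to reproduce an alignment locally, a minimal sketch of a MUSCLE 3.8.31 call is shown below. The input and output file names are hypothetical, and the -in/-out options are those of the 3.8 series (later major versions of MUSCLE use a different command-line syntax).

```python
def build_muscle_command(input_fasta, output_alignment):
    """Return a MUSCLE v3.8-style command line as an argument list."""
    # -in: unaligned sequences in FASTA format; -out: aligned FASTA output.
    return ["muscle", "-in", input_fasta, "-out", output_alignment]


print(" ".join(build_muscle_command("piwi_proteins.fasta", "piwi_aligned.afa")))
```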

Input Data

Only the one-letter codes for the 20 common amino acids and the letter X are accepted; any other character triggers an illegal-character message. The input can be one or multiple proteins. Figure 1 shows the input field and the parameters that users can choose for the BLASTP program.
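Users who want to check their sequences before submission could apply a test along the following lines. This is a sketch of the rule stated above, not the server's own code; treating lowercase letters as acceptable and ignoring whitespace are assumptions of this example.

```python
# The 20 common amino acids plus X, as one-letter codes.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY" + "X")


def illegal_characters(sequence):
    """Return the sorted list of characters in `sequence` that are not accepted."""
    # Assumption: whitespace is ignored and case does not matter.
    cleaned = "".join(sequence.split()).upper()
    return sorted(set(cleaned) - ALLOWED)


print(illegal_characters("MKTAYIAKQRX"))  # accepted letters only
print(illegal_characters("MKB*Z"))        # contains B, * and Z
```

An empty list means the sequence contains only accepted characters.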

Figure 1. Screenshot of the input page


Input Examples

We have provided an example file on the homepage containing six proteins: two prokaryotic argonaute proteins (Natronobacterium gregoryi argonaute and Rhodobacter sphaeroides argonaute), two eukaryotic argonaute proteins (Mouse argonaute-2 and Kluyveromyces polysporus argonaute) and two non-argonaute proteins (Human probable ATP-dependent RNA helicase DDX5 and Homo sapiens alanine--tRNA ligase). These examples were chosen to produce a set of results that demonstrates the features available on the result pages.

Search Parameters

The E-value (expectation value) is used as the cut-off for BLASTP, and the default value is 10. The E-value of a hit is the number of hits that would be expected to have a score equal to or better than this one by chance alone. A good E-value is much smaller than 1; for example, an E-value of 0.01 means that on average about 1 false positive would be expected in every 100 searches with different query sequences. Hits with E-values near the default cut-off of 10 are what we expect to see by chance alone. E-values are widely used because the E-value is all you need to decide on the significance of a match, but note that they vary according to the size of the target database.
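To make the arithmetic concrete, the sketch below uses the standard relation between a BLAST bit score S' and the E-value, E = m * n * 2^(-S'), where m is the query length and n is the total length of the database. BLAST itself applies corrected "effective" lengths, so the numbers here are approximations for illustration, and the function names are hypothetical.

```python
def evalue_from_bit_score(bit_score, query_len, db_len):
    """Approximate E-value: E = m * n * 2**(-S'), using raw (uncorrected) lengths."""
    return query_len * db_len * 2.0 ** (-bit_score)


def expected_false_positives(evalue, n_searches):
    """Expected number of chance hits at this E-value over n independent searches."""
    return evalue * n_searches


# An E-value of 0.01 corresponds to about one false positive per 100 searches.
print(expected_false_positives(0.01, 100))

# The same bit score is less significant against a database twice as large.
small = evalue_from_bit_score(50, query_len=800, db_len=1_000_000)
large = evalue_from_bit_score(50, query_len=800, db_len=2_000_000)
print(large / small)
```

The second calculation shows why E-values from searches against databases of different sizes are not directly comparable.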

Results

Identification of Ago proteins and classification

Click the Example button to load the six example proteins provided on the homepage into the input field, select an E-value cut-off and run AGONOTES to obtain a result page (Figure 3). Figure 2 shows the waiting page, which displays an animated sequence resembling a rainbow wave. Figure 3 shows the summary table, which consists of three parts. The first and second parts are the summary tables of prokaryotic argonaute protein hits and eukaryotic argonaute protein hits, respectively. In these two parts, users can click the corresponding figures to get detailed information on each protein. The last part is the list of non-argonaute proteins. The bottom left of each table shows the number of pAgo, eAgo or non-Ago proteins identified. If a query sequence is annotated as one of these types, the content between the ">" character and the first space of its FASTA header is displayed in the Protein column of the tables. Users can also download the summary results by clicking the Download button. However, there are three prerequisites for using this function:

1. Proteins must be submitted in FASTA format;

2. Every protein must have its own name; the name must not be blank;

3. The name of each submitted protein must be unique.
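The three prerequisites above can be checked before submission with a short script such as the following. It is a sketch under the naming rule stated earlier (the protein name is the text between ">" and the first space); the function name and messages are hypothetical and do not reflect the server's implementation.

```python
def check_fasta_names(fasta_text):
    """Return (names, problems) for a block of FASTA-formatted text."""
    names, problems, seen = [], [], set()
    # Prerequisite 1: FASTA input starts with a ">" header line.
    if not fasta_text.strip().startswith(">"):
        problems.append("input is not in FASTA format (no '>' header line)")
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence line, not a header
        # The name is the content between ">" and the first space.
        name = line[1:].split(" ", 1)[0]
        if not name:
            problems.append("blank protein name")        # prerequisite 2
        elif name in seen:
            problems.append(f"duplicate protein name: {name}")  # prerequisite 3
        else:
            seen.add(name)
            names.append(name)
    return names, problems


example = ">NgAgo Natronobacterium gregoryi\nMKT\n>RsAgo\nMAA\n>NgAgo copy\nMKK\n"
print(check_fasta_names(example))
```

An empty problems list indicates that the input satisfies all three prerequisites.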

Figure 2. Waiting page


Figure 3. Summary table of results

Details of a single Ago protein

Figure 4, Figure 5 and Figure 6 show the detailed description of a single Ago protein. Figure 4 shows the six domains, in the order N, L1, PAZ, L2, MID and PIWI, as the first six colored tracks; each domain bar shows the domain name and its relative position. The seventh track represents the protein and its length. A ruler, whose length adjusts to the protein size, is placed at the bottom of the figure. In addition, users can save the figure as a local file. Figure 5 shows detailed information on the protein: the protein name, the domains and the sequence. Because the sequence may be very long, it is shown in a popup window. Users can download these data by clicking the Download button. Figure 6 shows the content of the downloaded file, which is best opened with a text editor.

Figure 4. The graph of argonaute protein domains


Figure 5. List of domain hits and detailed information of every domain


Figure 6. The download file of Ago proteins


Acknowledgements

We thank the authors of BLAST and MUSCLE, which provide the core computational algorithms. We thank the authors of the Perl graphics programming tools used to draw the domain figures. We also thank the authors who provided the Ago proteins and the domain information that serve as our data.