ISPD

 

Background

Phage is short for bacteriophage, a virus that infects bacteria. Many phages, such as M13 and fd, are good expression vectors. In 1985, George P. Smith displayed foreign peptides on the virion surface by inserting foreign DNA fragments into gene III of a filamentous phage [1]. He also demonstrated that the foreign peptides in the fusion proteins were displayed in an immunologically accessible form, and that specific fusion phages could be enriched and isolated from a phage library of random inserts in a fusion-phage vector by one or more rounds of affinity selection, also known as biopanning [2]. In 1990, the technology was extended from displaying and screening polypeptides to displaying and screening antibodies, opening an era in which specific high-affinity antibodies can be obtained without immunization [3]. Because these procedures simulate natural selection and evolution in the test tube, the approach is referred to as in vitro phage display. In 1996, Dr. Pasqualini intravenously injected a phage-displayed peptide library into mice and successfully screened out polypeptides targeting specific organs or tumors [4]. The method resembles classic phage display but is conducted in living animals, so it is known as in vivo phage display. However, conventional identification of target binders by Sanger sequencing is low-throughput and labor-intensive, and it covers only a limited fraction (<0.01%) of the complete sequence space of a library. Moreover, such a small sample can be dominated by target-unrelated peptides (TUPs) [5]. To overcome these challenges, deep sequencing has been employed to analyze phage display selections, giving more comprehensive access to the sequence space of the library.
Since the pioneering work described above, phage display technology has been further developed, refined and improved by many scientists from various fields, and its applications have extended from epitope mapping [6], antibody engineering [3] and organ targeting [4] to studies of new materials and new energy sources [7, 8].


Although traditional phage display technology has proved powerful, it suffers from the following problems: (1) the experiments are labor-intensive, time-consuming and expensive; (2) there are relatively severe biases in the naive library; (3) a large number of real binders may be lost during the experiment, and TUPs may appear in the panning results. Computer simulations and experimental results have shown that a difference in growth rate among phage clones of as little as 10% can cause significant variation in clone abundance after several rounds of amplification [9]. This loss of library diversity during amplification can hinder the identification of useful ligands for targets [9]. Next-generation sequencing of a naive library after one round of amplification showed that some peptides reached an abundance of up to 30 while others had an abundance of only one, suggesting that propagation advantage is an important source of false-positive hits [9]. In brief, traditional phage display suffers from information loss and noise inclusion due to several intrinsic faults of phage libraries and panning systems.
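The effect of a small growth advantage can be illustrated with a toy calculation. The sketch below is not the simulation reported in [9]; it assumes simple exponential growth, and the function name, the ten doublings per amplification round and the three rounds are hypothetical values chosen only for illustration.

```python
def abundance_ratio(growth_advantage=0.10, doublings_per_round=10, rounds=3):
    """Fold difference in abundance between a fast and a slow clone.

    Assumptions (illustrative only): both clones start at equal abundance,
    the slow clone doubles `doublings_per_round` times in each round, and
    the fast clone's exponential growth rate is higher by `growth_advantage`.
    """
    total_doublings = doublings_per_round * rounds
    slow_clone = 2.0 ** total_doublings
    fast_clone = 2.0 ** ((1.0 + growth_advantage) * total_doublings)
    return fast_clone / slow_clone


if __name__ == "__main__":
    for rounds in (1, 2, 3):
        ratio = abundance_ratio(rounds=rounds)
        print(f"after {rounds} round(s): fast/slow ratio = {ratio:.1f}")
```

Under these assumptions the faster clone roughly doubles its lead in every round, so a 10% advantage grows into an approximately 8-fold difference in abundance after three rounds, without any difference in binding.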


Interestingly, computational methods can relieve these pains of phage display. We propose constructing phage libraries in silico, which can easily be made non-redundant, complete and unbiased. Furthermore, panning can be simulated in silico using Support Vector Machines (SVMs). By integrating in silico phage libraries with virtual panning programs, two model systems of in silico phage display have been implemented as free web servers. This highlights that phage display technology has expanded from in vitro and in vivo to in silico and online. It also indicates that the paradigm of phage display is shifting from purely wet-lab work, "experiment first, then computation", to "computation first, then experimental validation".


ISPD

The tools for in silico phage display are hosted in the in silico phage display (ISPD) suite. Among them, iLib generates a random peptide library, iPLib generates a random peptide library based on position, and iPan is a virtual panning program that identifies streptavidin-binding peptides.


iLib

The iLib tool in the ISPD suite is designed to generate a random peptide library with a specific amino acid composition. You can select a composition for the library or set the amino acid composition in percent (for example, 6.60). You can also set the length of the peptides and the size of the library.
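The idea behind this kind of library generation can be sketched in a few lines of Python. This is a minimal illustration and not the code behind iLib; the function name `generate_library`, its parameters and the example percentages are assumptions made for the sketch.

```python
import random

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def generate_library(composition, peptide_length, library_size, seed=None):
    """Draw random peptides whose residues follow a given composition.

    composition: dict mapping one-letter residue codes to percentages,
    e.g. {"A": 6.60, ...}. Residues that are not listed are never drawn.
    """
    unknown = set(composition) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    if sum(composition.values()) <= 0:
        raise ValueError("composition must contain a positive percentage")

    rng = random.Random(seed)  # seeded generator for reproducible libraries
    residues = sorted(composition)
    weights = [composition[r] for r in residues]
    return [
        "".join(rng.choices(residues, weights=weights, k=peptide_length))
        for _ in range(library_size)
    ]


if __name__ == "__main__":
    # Hypothetical composition: equal percentages for all 20 residues.
    uniform = {residue: 5.0 for residue in AMINO_ACIDS}
    for peptide in generate_library(uniform, peptide_length=7, library_size=5, seed=1):
        print(peptide)
```

Because residues are sampled independently, duplicates can occur in small sequence spaces; a non-redundant library would need an extra de-duplication step.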


iPLib

The iPLib tool in the ISPD suite generates a peptide library with a specific amino acid composition or dipeptide composition at each position. You just need to upload your amino acid composition or dipeptide composition in percent (for example, 6.60) in positional order. For each position, the composition values are given in alphabetical order.
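Position-based generation can be sketched as follows for the amino acid composition case. This is an illustration only, not the iPLib implementation. It assumes that "alphabetical order" means the alphabetical order of the one-letter residue codes, and the function name and input layout (one row of 20 percentages per position) are hypothetical.

```python
import random

# One-letter codes in alphabetical order; each input row follows this order
# (an assumption about the expected input layout).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def generate_positional_library(position_percents, library_size, seed=None):
    """Draw peptides with a separate amino acid composition per position.

    position_percents: list of rows, one per peptide position; each row holds
    20 percentages ordered like AMINO_ACIDS.
    """
    for index, row in enumerate(position_percents, start=1):
        if len(row) != len(AMINO_ACIDS):
            raise ValueError(f"position {index}: expected 20 values")
        if sum(row) <= 0:
            raise ValueError(f"position {index}: no positive percentage")

    rng = random.Random(seed)
    library = []
    for _ in range(library_size):
        # Sample one residue per position from that position's own weights.
        peptide = "".join(
            rng.choices(AMINO_ACIDS, weights=row, k=1)[0]
            for row in position_percents
        )
        library.append(peptide)
    return library


def fixed_position(residue):
    """Helper for the example: 100% of a single residue at one position."""
    return [100.0 if aa == residue else 0.0 for aa in AMINO_ACIDS]


if __name__ == "__main__":
    uniform = [5.0] * 20
    rows = [fixed_position("H"), uniform, uniform]  # position 1 is always H
    print(generate_positional_library(rows, library_size=5, seed=1))
```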


iPan

The iPan tool in the ISPD suite is a virtual panning program that predicts whether phages bearing your peptides might bind to streptavidin. The tool is based on an ensemble model trained with support vector machines (SVMs) using optimized dipeptide composition (ODPC) features. The datasets were built through the following steps:

   Streptavidin-binding peptides:

  1. Collect all streptavidin-binding peptides obtained from completely random libraries in the MimoDB database, version 4.0.
  2. Delete the terminal cysteine (C) of peptides that come from cysteine-restricted libraries.
  3. Remove redundant peptides.
  4. Exclude sequences harboring ambiguous residues ("X", "B" and "Z") or non-alphabetic characters.

   Non-streptavidin-binding peptides:

  1. Collect all peptides obtained from completely random libraries in the MimoDB database, version 4.0.
  2. Delete the terminal cysteine (C) of peptides that come from cysteine-restricted libraries.
  3. Remove redundant peptides.
  4. Exclude sequences harboring ambiguous residues ("X", "B" and "Z") or non-alphabetic characters.
  5. Remove sequences that are identical to sequences in the positive dataset.
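The filtering steps listed above can be sketched in Python. This is an illustration rather than the pipeline actually used for iPan: the input format (pairs of sequence and a flag for cysteine-restricted libraries), the function names and the example sequences are all made up, and it assumes that one flanking cysteine is removed from each end of a cysteine-restricted peptide.

```python
AMBIGUOUS = set("XBZ")


def clean_peptides(records):
    """Apply the cysteine, redundancy and ambiguity filters.

    records: iterable of (sequence, from_cysteine_restricted_library) pairs.
    Returns unique cleaned sequences in their original order.
    """
    cleaned, seen = [], set()
    for sequence, cys_restricted in records:
        seq = sequence.strip().upper()
        if cys_restricted:
            # Drop one flanking cysteine at each end, if present.
            if seq.startswith("C"):
                seq = seq[1:]
            if seq.endswith("C"):
                seq = seq[:-1]
        # Exclude empty, non-alphabetic or ambiguous sequences.
        if not seq.isalpha() or AMBIGUOUS & set(seq):
            continue
        # Remove redundant peptides.
        if seq in seen:
            continue
        seen.add(seq)
        cleaned.append(seq)
    return cleaned


def build_negative_set(records, positives):
    """Clean the candidate negatives and drop any that are also positives."""
    positive_set = set(positives)
    return [seq for seq in clean_peptides(records) if seq not in positive_set]


if __name__ == "__main__":
    # Made-up example records, not MimoDB data.
    binders = [("CHPQGPPC", True), ("HPQGPP", False), ("AXHPQ", False)]
    others = [("HPQGPP", False), ("ACDEFG", False), ("AC-EF", False)]
    positives = clean_peptides(binders)
    print(positives, build_negative_set(others, positives))
```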

After the above procedures, 199 streptavidin-binding peptides and 15,266 non-streptavidin-binding peptides were obtained. Because the negative samples greatly outnumbered the positive samples, a down-sampling strategy was used: 199 peptides were randomly picked from the negative samples. To reduce random error, this procedure was repeated ten times. The single positive dataset of 199 peptides was paired with each of the ten negative sub-datasets, giving ten pairs of sub-datasets, each made up of 199 peptides with specific affinity to streptavidin and 199 peptides without affinity to streptavidin. Four types of features, namely amino acid composition (AAC), optimized amino acid composition (OAAC), dipeptide composition (DPC) and optimized dipeptide composition (ODPC), were used to encode each peptide sequence. Five-fold cross-validation showed that the ODPC feature achieved the best performance, with an accuracy of 89.2% and a Matthews correlation coefficient (MCC) of 0.79.
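The down-sampling scheme, the plain dipeptide composition (DPC) encoding and the MCC metric can be sketched as follows. This is a minimal illustration and not the iPan code: the function names and the seed are assumptions, DPC is taken to be the fraction of each of the 400 possible dipeptides among the overlapping residue pairs of a peptide, and the optimization step that turns DPC into ODPC is not shown, nor is the SVM training.

```python
import math
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400


def dipeptide_composition(peptide):
    """Encode a peptide as 400 dipeptide frequencies (plain DPC)."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(peptide) - 1):
        pair = peptide[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[dp] / total for dp in DIPEPTIDES]


def balanced_pairs(positives, negatives, n_repeats=10, seed=0):
    """Pair the positive set with n_repeats random negative subsets of equal size."""
    rng = random.Random(seed)
    return [
        (list(positives), rng.sample(list(negatives), len(positives)))
        for _ in range(n_repeats)
    ]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


if __name__ == "__main__":
    features = dipeptide_composition("HPQGPP")
    print(len(features), sum(features))
```

With 199 positives and 15,266 negatives, `balanced_pairs` would return ten pairs of 199 and 199 peptides, and each pair could then be encoded and used to train one member of the ensemble.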


References

  1. Smith GP: Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science 1985, 228(4705): 1315-1317.
  2. Pande J, Szewczyk M, Grover A: Phage display: concept, innovations, applications and future. Biotechnol Adv 2010, 28(6): 849-858.
  3. McCafferty J, Griffiths AD, Winter G, Chiswell DJ: Phage antibodies: filamentous phage displaying antibody variable domains. Nature 1990, 348(6301): 552-554.
  4. Pasqualini R, Ruoslahti E: Organ targeting in vivo using phage display peptide libraries. Nature 1996, 380(6572): 364-366.
  5. Menendez A, Scott JK: The nature of target-unrelated peptides recovered in the screening of phage-displayed random peptide libraries with antibodies. Anal Biochem 2005, 336(2): 145-157.
  6. Scott JK, Smith GP: Searching for peptide ligands with an epitope library. Science 1990, 249(4967): 386-390.
  7. Lee YJ, Yi H, Kim WJ, Kang K, Yun DS, Strano MS, Ceder G, Belcher AM: Fabricating genetically engineered high-power lithium-ion batteries using multiple virus genes. Science 2009, 324(5930): 1051-1055.
  8. Nam YS, Magyar AP, Lee D, Kim JW, Yun DS, Park H, Pollom TS, Jr., Weitz DA, Belcher AM: Biologically templated photocatalytic nanostructures for sustained light-driven water oxidation. Nat Nanotechnol 2010, 5(5): 340-344.
  9. Derda R, Tang SK, Li SC, Ng S, Matochko W, Jafari MR: Diversity of phage-displayed libraries of peptides during panning and amplification. Molecules 2011, 16(2): 1776-1803.