CD47Binder is an ensemble predictor, built from support vector machine (SVM) submodels, for identifying CD47 binding peptides. The framework of CD47Binder is illustrated in the following figure. A given peptide is scored by each of the ten submodels separately. CD47Binder then averages the ten predicted probabilities and makes the final prediction based on the average probability value.
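The averaging step described above can be sketched as follows. This is a minimal illustration, not the CD47Binder implementation: the `Submodel` type, the `ensemble_probability` function, and the toy submodels with fixed outputs are all hypothetical stand-ins for the ten trained SVM submodels.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical submodel type: maps a peptide string to a probability in [0, 1].
Submodel = Callable[[str], float]


def ensemble_probability(peptide: str, submodels: Sequence[Submodel]) -> float:
    """Average the probabilities that the submodels assign to one peptide."""
    if not submodels:
        raise ValueError("at least one submodel is required")
    return mean(model(peptide) for model in submodels)


# Toy stand-ins for the ten submodels: each one ignores the peptide and
# returns a fixed, invented probability (illustrative values only).
toy_outputs = (0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.3, 0.7, 0.8, 0.2)
toy_submodels = [lambda pep, p=p: p for p in toy_outputs]

avg = ensemble_probability("ACDEFGHIK", toy_submodels)
print(f"average probability: {avg:.2f}")
```

In a real setting each callable would wrap a trained classifier that exposes a probability estimate for the encoded peptide.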
We evaluated the generalization ability of CD47Binder on four independent test datasets using the independent-samples t-test, the chi-square test, and the Mann-Whitney U test. First, we used CD47Binder to predict the independent test dataset TestDataset_1 and saved the predicted probabilities. We then performed the t-test, U test, and chi-square test on the top 5% versus bottom 5%, top 10% versus bottom 10%, top 20% versus bottom 20%, top 30% versus bottom 30%, top 40% versus bottom 40%, and top 50% versus bottom 50% of the predicted scores. We repeated this process for TestDataset_2, TestDataset_3, and TestDataset_4. Across all three tests, the predicted scores of peptides with large copy numbers were significantly greater than those of peptides with small copy numbers, and the p values of all three tests were less than 0.001.
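The top-versus-bottom comparison can be sketched as below. The text does not spell out the ranking key, so this sketch assumes that peptides are ranked by copy number and that the predicted scores of the two tails are compared. The function names and the ten data points are invented for illustration, and only the Mann-Whitney U statistic is computed, not its p value.

```python
from typing import List, Sequence, Tuple


def tail_groups(copy_numbers: Sequence[int], scores: Sequence[float],
                percent: int) -> Tuple[List[float], List[float]]:
    """Scores of the top and bottom `percent` of peptides, ranked by copy number.

    Assumption: ranking is by copy number (largest first).
    """
    if len(copy_numbers) != len(scores):
        raise ValueError("copy_numbers and scores must have the same length")
    if not 0 < percent <= 50:
        raise ValueError("percent must be in (0, 50]")
    order = sorted(range(len(scores)), key=lambda i: copy_numbers[i], reverse=True)
    k = max(1, len(order) * percent // 100)  # keep at least one peptide per tail
    top = [scores[i] for i in order[:k]]
    bottom = [scores[i] for i in order[-k:]]
    return top, bottom


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for x: pairs with x_i > y_j count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Invented toy data: copy numbers and predicted scores for ten peptides.
copy_numbers = [120, 95, 60, 40, 12, 8, 5, 3, 2, 1]
scores = [0.91, 0.85, 0.77, 0.64, 0.58, 0.45, 0.40, 0.33, 0.21, 0.10]

top, bottom = tail_groups(copy_numbers, scores, percent=20)
u = mann_whitney_u(top, bottom)
print(top, bottom, u)
```

For p values, SciPy provides `scipy.stats.ttest_ind`, `scipy.stats.mannwhitneyu`, and `scipy.stats.chi2_contingency`; whether these match the exact settings used in the evaluation is not stated in the text.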
CD47Binder allows users to submit peptide sequences in FASTA or plain text format. The prediction results are displayed in a table. To produce them, the predictor first conducts hard voting. A sequence that receives more votes than the vote threshold (default 5, i.e. 0.5*10) is predicted as a CD47 binding peptide; a sequence that receives fewer votes is predicted as a non-CD47 binding peptide. When the number of votes equals the vote threshold set by the user, soft voting is conducted: if the average probability value is greater than or equal to the probability threshold set by the user (default 0.5), the sequence is predicted as a CD47 binding peptide; otherwise it is predicted as a non-CD47 binding peptide. You can adjust both thresholds according to your own needs.
Number: the serial number of the query sequence;
Query Sequence: the sequence of the query peptide;
Length: the length of the query sequence;
Voting: the number of SVM-based submodels that identify the query peptide as a CD47 binding peptide;
Probability: the probability that the query sequence is a CD47 binding peptide;
Yes/No: the prediction result. When "Probability" is greater than or equal to the probability threshold "tp", the column shows "Yes", which indicates that the sequence is predicted to be a CD47 binding peptide; otherwise it shows "No".
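The hard-then-soft voting rule and the table columns listed above can be sketched as follows. This is an illustration under stated assumptions rather than the server's code: `ResultRow`, `decide`, and `build_row` are hypothetical names, and the sketch assumes that a submodel casts a vote when its own probability is at least 0.5, which the text does not specify.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class ResultRow:
    number: int          # serial number of the query sequence
    query_sequence: str
    length: int
    voting: int          # number of submodels voting "binder"
    probability: float   # average probability over the submodels
    label: str           # "Yes" or "No"


def decide(votes: int, probability: float,
           vote_threshold: int = 5, prob_threshold: float = 0.5) -> str:
    """Hard voting first; soft voting only when votes equal the threshold."""
    if votes > vote_threshold:
        return "Yes"
    if votes < vote_threshold:
        return "No"
    return "Yes" if probability >= prob_threshold else "No"


def build_row(number: int, sequence: str, submodel_probs: Sequence[float],
              vote_threshold: int = 5, prob_threshold: float = 0.5) -> ResultRow:
    # Assumption: a submodel votes "binder" when its probability is >= 0.5.
    votes = sum(1 for p in submodel_probs if p >= 0.5)
    probability = mean(submodel_probs)
    label = decide(votes, probability, vote_threshold, prob_threshold)
    return ResultRow(number, sequence, len(sequence), votes, probability, label)


# Invented submodel outputs for one peptide (illustrative values only).
row = build_row(1, "ACDEFGHIK", [0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.3, 0.7, 0.8, 0.2])
print(row)
```

Keeping `decide` separate from `build_row` makes the tie-breaking behavior easy to check on its own, for example `decide(5, 0.49)` versus `decide(5, 0.5)`.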