The benchmark dataset contained 155 Cas protein sequences as positive samples and 155 non-Cas protein sequences as negative samples. The sequence similarity between positive and negative samples was less than 40%. After feature selection on the dipeptide composition features, the optimal feature subset contained 167 features.
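To illustrate the kind of feature mentioned above, here is a minimal sketch of dipeptide composition, assuming the common definition: the relative frequency of each of the 400 ordered pairs of the 20 standard amino acids in a sequence. The function name `dipeptide_composition`, the handling of non-standard residues, and the example sequence are illustrative assumptions and are not taken from the CASPredict code; the feature selection step that reduces the set to 167 features is not shown.

```python
from itertools import product

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered dipeptides: "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional list of dipeptide frequencies.

    Assumption: overlapping pairs containing a non-standard residue
    (e.g. 'X') are skipped, and frequencies are normalised by the
    number of pairs that were actually counted.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or contains no valid pairs.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical example sequence, for illustration only.
features = dipeptide_composition("MKVLAAGIVK")
print(len(features))
```

Each protein is thereby mapped to a fixed-length vector regardless of its sequence length, which is what makes a subsequent feature selection step over the 400 dimensions possible.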
CASPredict is released under the MIT License.
Contact us
If you find a bug in CASPredict, please let us know. For any other problems or questions, do not hesitate to contact Bifang He (bfhe@gzu.edu.cn). We greatly appreciate your feedback.