IV-KAPhE performs well in cross-validation and significantly outperforms previously published methods on an external validation set.
a) Naïve Bayes+ posterior probability, GO BP semantic similarity, and STRING experimental score had the greatest importance when training the Random Forest models. Error bars show standard error across cross-validation runs. b) Predictive performance of individual quantitative features, assessed by average macro-precision and macro-recall across 10 folds of the training data, shows that GO BP semantic similarity and STRING experimental score are the most predictive individual features. c) Cross-validation results, evaluated via macro-averaged precision, recall, and F1, all reflect strong performance by IV-KAPhE. d) IV-KAPhE's coverage of the external test data set is similar to LinkPhinder's but lower than that of GPS 5.0. e) Kinase-specific F1 scores show that, compared to other methods, IV-KAPhE performs consistently well across most kinases, with similar performance for S/T and Y kinases. f) IV-KAPhE outperforms the simpler PSSM-based and Naïve Bayes+ methods, as well as other previously published methods, in kinase-substrate assignment on an external validation set. Points indicate the scores for simple assignments (GPS) or the scores at nominal cutoffs for quantitative predictions (cutoffs: IV-KAPhE, 0.5; PSSM, 0.75; Naïve Bayes+, 0.5; LinkPhinder, 0.672; NetworKIN 3.0, 1.0). Error bars show the 95% confidence intervals at these points. g) IV-KAPhE has a higher macro-averaged F1 score than the other methods. Points and color assignments are as in (e). Bands indicate the 95% confidence interval. h) IV-KAPhE similarly outperforms the other methods in Receiver Operating Characteristic (ROC) curve analysis for this balanced test set. Points and color assignments are as in (e). Error bars show 95% confidence intervals. i) Focusing on multi-label assignment for sites in the test set with known kinases, the macro-averaged false discovery rate (FDR; i.e., rate of novel assignments) dominates the average true positive rate (TPR). The curves are similar for most methods.
At its nominal cutoff, IV-KAPhE has the highest FDR, but it is matched by the highest TPR.
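As an illustration of the macro-averaged metrics referred to in panels (b), (c), and (g), the following minimal sketch computes precision, recall, and F1 per kinase and then takes the unweighted mean over kinases. It is not the authors' evaluation code: the function name `macro_scores`, the `(kinase, site)` pair representation, the placeholder kinase and site labels, and the convention of scoring an undefined ratio as 0 are all assumptions made for this example.

```python
from collections import defaultdict


def macro_scores(true_pairs, predicted_pairs):
    """Return macro-averaged (precision, recall, F1) over kinases.

    Both arguments are iterables of (kinase, site) pairs. This pair
    representation is an assumption made for illustration only.
    """
    true_by_kinase = defaultdict(set)
    pred_by_kinase = defaultdict(set)
    for kinase, site in true_pairs:
        true_by_kinase[kinase].add(site)
    for kinase, site in predicted_pairs:
        pred_by_kinase[kinase].add(site)

    # Every kinase that appears in either set counts as one class.
    kinases = sorted(set(true_by_kinase) | set(pred_by_kinase))
    if not kinases:
        return 0.0, 0.0, 0.0

    precisions, recalls, f1s = [], [], []
    for kinase in kinases:
        truth = true_by_kinase[kinase]
        pred = pred_by_kinase[kinase]
        tp = len(truth & pred)
        fp = len(pred - truth)
        fn = len(truth - pred)
        # Assumed convention: an undefined ratio is scored as 0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    # Macro-averaging: each kinase contributes equally, whatever its size.
    n = len(kinases)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with placeholder labels (not data from the study).
truth = [("KINASE_A", "s1"), ("KINASE_A", "s2"), ("KINASE_B", "s3")]
preds = [("KINASE_A", "s1"), ("KINASE_B", "s3"), ("KINASE_B", "s4")]
macro_p, macro_r, macro_f1 = macro_scores(truth, preds)
```

Because each kinase is weighted equally, kinases with few known substrates influence the macro-average as much as well-studied ones, which is the usual motivation for reporting macro-averaged rather than pooled scores.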