Direct validation of imputed non-synonymous SNP alleles.

A) Genetically variant peptides (GVPs) containing single amino-acid polymorphisms (SAPs) were identified in both European-American cohorts (EA1 and EA2) and collated for each subject. Imputed nsSNP alleles (Gene Name = GN, SNP accession number = rs#, allele nucleotide = nuc) were compared with the genotype obtained by direct Sanger sequencing (S1 Methods). Correctly imputed nsSNP alleles (TP, true positives) are indicated by blue squares. Incorrectly imputed alleles (FP, false positives) are indicated by red squares. Alleles that were identified by Sanger sequencing but for which no corresponding GVP was detected in the matching proteomic dataset (FN, false negatives) are indicated by light green squares. Alleles absent from both the subject's DNA and the corresponding proteomic dataset (TN, true negatives) are indicated by white squares [49]. Alleles for which Sanger sequencing failed to determine nsSNP allelic status are indicated by grey squares. B) The effectiveness of each SAP-containing peptide in imputing nsSNP alleles was also quantified. The sensitivity of each genetically variant peptide, measured as the proportion of nsSNP alleles that were correctly detected and imputed (TP/(TP + FN)), was calculated as a percentage (log10(%)). The positive predictive value (PPV) of genetically variant peptide-based SNP imputations was calculated as the percentage of all imputations that were validated as correct (TP/(TP + FP); log10(%)) [49]. C)