Universal dependence of RP and DFT on protein length.
The relationship, on a log-log scale, between the CO measures RP, RC and DFT and protein length, L. The upper panels (A–C) show human proteins: RP (A) and DFT (C) correlate strongly with L, whereas RC (B) does not; ρ denotes the Pearson correlation coefficient. The clear linear boundary in RC is due to its lower bound of 3/L. Linear regression analysis shows excellent power-law fits for the dependence of RP and DFT on L. Data were binned into 50 equally spaced intervals along the y-axis. ‘X’ symbols denote the average of L in each bin; the error (SD) on the mean is comparable to the size of the symbol and is therefore not shown. The blue line is the linear regression fit. The middle panels (D–F) show a superposition of the RP-L data for all species (D), together with the slopes (E) and goodness of fit (F) of the corresponding linear regressions. Slopes increase from eukaryotes to prokaryotes (E), coupled with a decrease in the goodness of fit (F). The lower panels (G–I) show the same analysis for the DFT-L dependence. Note that the slope trends are opposite. The ratio of the RP-L to DFT-L slopes is close to −1 in all species: it is −1.11±0.05 in eukaryotes and, excluding 9 outliers, −0.85±0.05 in prokaryotes.
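
The binning and fitting procedure described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `binned_loglog_fit` and its arguments are hypothetical, and several details are assumptions not stated in the caption: that bins are equally spaced in log10 of the measure, that L is averaged in log space, that the line is fitted to the binned points without weights, and that ρ is computed on the log-transformed raw data.

```python
import numpy as np

def binned_loglog_fit(length, measure, n_bins=50):
    """Bin along the (log) y-axis, average (log) L per bin, fit a straight line.

    Returns (slope, intercept, rho); the slope is the power-law exponent.
    Binning and averaging in log space are assumptions of this sketch.
    """
    x = np.log10(np.asarray(length, dtype=float))   # log10 of protein length L
    y = np.log10(np.asarray(measure, dtype=float))  # log10 of RP or DFT

    # n_bins equally spaced intervals along the y-axis
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    idx = np.clip(np.digitize(y, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # One point per non-empty bin: (mean log L in the bin, bin center)
    occupied = [b for b in range(n_bins) if np.any(idx == b)]
    bin_x = np.array([x[idx == b].mean() for b in occupied])
    bin_y = centers[occupied]

    # Straight-line fit on the binned points
    slope, intercept = np.polyfit(bin_x, bin_y, 1)

    # Pearson correlation coefficient of the raw log-log data
    rho = np.corrcoef(x, y)[0, 1]
    return slope, intercept, rho
```

Under these assumptions, applying the function to the RP-L and DFT-L data of one species and dividing the two slopes would give the slope ratio quoted above.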