Restriction sites are associated with elevated GC content.
(a) Restriction enzymes tend to target sequences with GC content higher than the genomic average. (b) Bases immediately flanking AT-rich restriction sites (≥ 75% AT, n = 214 genomes) have elevated mean GC content. The elevation is most pronounced at the first flanking base and mostly decays within 50 bp of the recognition site. Error bars represent bootstrapped 95% confidence intervals of the mean.
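
The quantities in panel (b) can be illustrated with a minimal sketch. It assumes a single genome held as a plain string, exact-match site detection on one strand, and a percentile bootstrap. The function names, the example site `AATT`, and the resampling settings are hypothetical choices for illustration and are not taken from the analysis behind the figure.

```python
import random
from statistics import mean


def find_sites(seq, site):
    """Return start indices of every (possibly overlapping) occurrence of site."""
    starts = []
    i = seq.find(site)
    while i != -1:
        starts.append(i)
        i = seq.find(site, i + 1)
    return starts


def flank_gc_indicators(seq, site, max_offset):
    """Map each offset d (1 = first flanking base) to a list of 1/0 GC flags.

    Both the upstream and the downstream flank of each site contribute.
    Positions that fall off the end of the sequence are skipped.
    """
    seq, site = seq.upper(), site.upper()
    flags = {d: [] for d in range(1, max_offset + 1)}
    for start in find_sites(seq, site):
        end = start + len(site)  # index of the first downstream base
        for d in range(1, max_offset + 1):
            for pos in (start - d, end + d - 1):
                if 0 <= pos < len(seq):
                    flags[d].append(1 if seq[pos] in "GC" else 0)
    return flags


def bootstrap_ci_mean(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    n = len(values)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    toy_genome = "GAATTCCGAATTGCAATTC"  # made-up sequence, not real data
    flags = flank_gc_indicators(toy_genome, "AATT", max_offset=3)
    for d, values in flags.items():
        print(d, mean(values), bootstrap_ci_mean(values))
```

In a multi-genome setting such as the one in the caption, one plausible design is to compute the per-offset mean for each genome first and bootstrap over genomes, so that large genomes do not dominate the interval. Whether the figure was produced that way is not stated here.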