posted on 2020-05-12, 17:26, authored by Joël Klein, Manon Neilen, Marcel van Verk, Bas E. Dutilh, Guido Van den Ackerveken
The yellow bar indicates the region between 40% and 55% GC, based on reads >1 kb. (A) PacBio reads before CAT-filtering show a bimodal distribution, with a presumed peak of contaminating sequences at a GC content of ~40%. (B) PacBio reads after CAT-filtering show a single peak at a GC content of ~46%. (C) GC content of the Pfs1 contigs from the pre-assembly, before filtering, shows additional peaks at around 30% and 60% GC, indicating that there are many contaminant contigs. (D) GC content of the Pfs1 contigs after filtering the reads with the CAT tool shows that the additional peaks are no longer present, indicating that the contaminating sequences have been filtered out.
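As an illustration of how a GC-content distribution like the ones in these panels could be computed, here is a minimal Python sketch. It is not the authors' pipeline: the function names (`gc_percent`, `gc_histogram`, `fraction_in_window`), the 1% bin width, and the choice to count ambiguous bases such as N toward sequence length are all assumptions made for this example. Only the >1 kb length cutoff and the 40 to 55% GC window come from the caption.

```python
from collections import Counter


def gc_percent(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Assumption: every character (including N) counts toward the length.
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def gc_histogram(reads, min_length=1000, bin_width=1):
    """Count sequences longer than min_length per GC-percentage bin.

    Keys are the lower edge of each bin; bin_width is a hypothetical default.
    """
    counts = Counter()
    for seq in reads:
        if len(seq) > min_length:
            bin_start = int(gc_percent(seq) // bin_width) * bin_width
            counts[bin_start] += 1
    return dict(sorted(counts.items()))


def fraction_in_window(reads, low=40.0, high=55.0, min_length=1000):
    """Fraction of sequences longer than min_length whose GC% lies in [low, high]."""
    kept = [gc_percent(seq) for seq in reads if len(seq) > min_length]
    if not kept:
        return 0.0
    return sum(low <= gc <= high for gc in kept) / len(kept)
```

Passing the same set of sequences before and after a filtering step to `gc_histogram` would give the two distributions to compare, and `fraction_in_window` summarizes how much of each falls inside the highlighted 40 to 55% GC region.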