CRISPRbuilder-TB: "CRISPR-builder for tuberculosis". Exhaustive reconstruction of the CRISPR locus in Mycobacterium tuberculosis complex using SRA

Title: CRISPRbuilder-TB: "CRISPR-builder for tuberculosis". Exhaustive reconstruction of the CRISPR locus in Mycobacterium tuberculosis complex using SRA
Publication type: Journal Article
Year of Publication: 2021
Authors: Guyeux C, Sola C, Nous C, Refregier G
Journal: PLOS Computational Biology
Volume: 17
Pagination: e1008500
Date Published: March
Type of Article: Article
ISSN: 1553-734X
Abstract

Mycobacterium tuberculosis complex (MTC) CRISPR locus diversity has long been studied solely by investigating the presence or absence of a known set of spacers. Unveiling the genetic mechanisms of its evolution requires a more exhaustive reconstruction in a large number of representative strains. In this article, we point out and resolve, with a new pipeline, the problem of CRISPR reconstruction based directly on short read sequences in M. tuberculosis. We first show that the process we set up, which we name "CRISPRbuilder-TB", allows an efficient reconstruction of simulated or real CRISPRs, even when they include complex evolutionary steps such as insertions of mobile elements. Compared to more generalist tools, the whole process is much more precise and robust, and requires only minimal manual investigation. Second, we show that more than one third of the complete genomes currently available for this complex in public databases contain largely erroneous CRISPR loci. Third, we highlight how both the classical experimental in vitro approach and the basic in silico spoligotyping provided by existing analytic tools miss a whole range of diversity at this locus in MTC, by not capturing duplications, spacer and direct repeat variants, and IS6110 insertion locations. This description is extended in a second article that describes MTC-CRISPR diversity and suggests general rules for its evolution. This work opens perspectives for an in-depth exploration of M. tuberculosis CRISPR locus diversity and of the mechanisms involved in its evolution and functionality, as well as for adapting the approach to other CRISPR locus-harboring bacterial species.

Author summary

In this article, we tackle the bioinformatics problem of reconstructing the Mycobacterium tuberculosis complex CRISPR locus from short read sequences without requiring genome assembly. We first show that many complete genomes found in public databases, often reconstructed by de novo assembly, contain errors at this locus as well as in other repeated sequences. We provide an in-depth description of our new method, designated "CRISPRbuilder-TB", and show that it provides much more exhaustive and reliable information (on DR variants, spacer diversity, global structure) than Crass and CRISPR_detector. The new and unsuspected genomic diversity we detected is described in a companion paper. Scripts are available to adapt the tool to other species.
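
The limitation of presence/absence spoligotyping noted in the abstract can be illustrated with a minimal Python sketch. This is not the CRISPRbuilder-TB pipeline: the spacer sequences, the stand-in direct repeat `DR`, and the helper names (`spoligotype`, `make_reads`) are hypothetical toy values chosen for illustration only. The sketch shows that a locus carrying a duplicated spacer yields the same presence/absence pattern as a locus without the duplication.

```python
from typing import Dict, List


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def make_reads(locus: str, length: int = 20) -> List[str]:
    """Simulate error-free short reads as every window of `length` bases."""
    return [locus[i:i + length] for i in range(len(locus) - length + 1)]


def spoligotype(reads: List[str], spacers: Dict[str, str]) -> Dict[str, bool]:
    """For each known spacer, report whether it occurs in any read (either strand)."""
    pattern = {}
    for name, spacer in spacers.items():
        rc = reverse_complement(spacer)
        pattern[name] = any(spacer in read or rc in read for read in reads)
    return pattern


# Hypothetical toy sequences, NOT real MTC spacers or direct repeats.
SPACERS = {"sp1": "ACGTACGGTA", "sp2": "TTGACCATGC", "sp3": "GGATCCTTAA"}
DR = "CCCGGG"

# Two toy loci: the second one carries a tandem duplication of sp1.
locus_single = DR + SPACERS["sp1"] + DR + SPACERS["sp3"] + DR
locus_duplicated = (
    DR + SPACERS["sp1"] + DR + SPACERS["sp1"] + DR + SPACERS["sp3"] + DR
)

pattern_single = spoligotype(make_reads(locus_single), SPACERS)
pattern_duplicated = spoligotype(make_reads(locus_duplicated), SPACERS)

# The two loci differ in structure but give the same presence/absence pattern.
print(pattern_single)
print(pattern_duplicated)
```

Because the pattern records only whether each known spacer is seen, the order of spacers, their copy number, and any unknown spacer or repeat variant are invisible to it, which is why a reconstruction of the full locus is needed to recover them.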

DOI: 10.1371/journal.pcbi.1008500