RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections - Aix-Marseille Université
Journal article, Nucleic Acids Research, Year: 2017

Abstract

Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduces motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or from the command line for integration in pipelines.
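
The idea of grouping similar TFBMs into several separate trees and deriving one representative per group can be illustrated with a small Python sketch. This is not the RSAT implementation: the toy motifs, the names `pearson`, `cluster_motifs` and `merge_cluster`, the use of Pearson correlation as the similarity measure, and the 0.8 threshold are all assumptions made for illustration. The sketch also assumes motifs of equal width that are already aligned, and it ignores reverse complements.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def flatten(matrix):
    """Turn a list of columns (A, C, G, T frequencies) into one flat list."""
    return [value for column in matrix for value in column]


def cluster_motifs(motifs, threshold=0.8):
    """Average-linkage agglomerative clustering that stops merging once the
    best pair of clusters falls below the (hypothetical) similarity threshold,
    so the result is a set of separate groups rather than a single tree."""
    sim = {
        frozenset((a, b)): pearson(flatten(motifs[a]), flatten(motifs[b]))
        for a, b in combinations(motifs, 2)
    }
    clusters = [[name] for name in motifs]

    def linkage(c1, c2):
        scores = [sim[frozenset((a, b))] for a in c1 for b in c2]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


def merge_cluster(motifs, members):
    """Average the member matrices position by position into one motif."""
    width = len(motifs[members[0]])
    return [
        [sum(motifs[m][pos][base] for m in members) / len(members)
         for base in range(4)]
        for pos in range(width)
    ]


# Toy frequency matrices: one [A, C, G, T] column per position.
motifs = {
    "m1": [[0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]],
    "m2": [[0.1, 0, 0.9, 0], [0.9, 0.1, 0, 0], [0, 0, 0.1, 0.9], [1, 0, 0, 0]],
    "m3": [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
}

clusters = cluster_motifs(motifs, threshold=0.8)
# One averaged motif per cluster: a toy "non-redundant collection".
representatives = {"+".join(c): merge_cluster(motifs, c) for c in clusters}
print(clusters)
```

With these toy matrices, the two near-identical motifs end up in one group and the unrelated one stays alone. A real workflow would rely on the tool itself, which handles alignment, motifs of different widths and multiple input collections.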
Main file
gkx314.pdf (8.65 MB)

Dates and versions

hal-01624366, version 1 (10-11-2021)

Licence

Attribution - NonCommercial

Identifiers

HAL Id: hal-01624366
DOI: 10.1093/nar/gkx314

Cite

Jaime Abraham Castro-Mondragon, Sébastien Jaeger, Denis Thieffry, Morgane Thomas-Chollier, Jacques van Helden. RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections. Nucleic Acids Research, 2017, 45 (13), e119. ⟨10.1093/nar/gkx314⟩. ⟨hal-01624366⟩