Inferred regulons are consistent with regulator binding sequences in E. coli

Qiu, Sizhe and Wan, Xinlong and Liang, Yueshan and Lamoureux, Cameron R. and Akbari, Amir and Palsson, Bernhard O. and Zielinski, Daniel C. and Laxman, Sunil (2024) Inferred regulons are consistent with regulator binding sequences in E. coli. PLOS Computational Biology, 20 (1). e1011824. ISSN 1553-7358

journal.pcbi.1011824.pdf - Published Version

Abstract

The transcriptional regulatory network (TRN) of E. coli consists of thousands of interactions between regulators and DNA sequences. Regulons are typically determined either from resource-intensive experimental measurement of functional binding sites, or inferred from analysis of high-throughput gene expression datasets. Recently, independent component analysis (ICA) of RNA-seq compendia has been shown to be a powerful method for inferring bacterial regulons. However, it remains unclear to what extent regulons predicted by ICA structure have a biochemical basis in promoter sequences. Here, we address this question by developing machine learning models that predict inferred regulon structures in E. coli based on promoter sequence features. Models were constructed successfully (cross-validation AUROC ≥ 0.8) for 85% (40/47) of ICA-inferred E. coli regulons. We found that: 1) the presence of a high-scoring regulator motif in the promoter region was sufficient to specify regulatory activity in 40% (19/47) of the regulons; 2) additional features, such as DNA shape and extended motifs that can account for regulator multimeric binding, helped to specify regulon structure for the remaining 60% of regulons (28/47); and 3) investigating regulons where initial machine learning models failed revealed new regulator-specific sequence features that improved model accuracy. Finally, we found that strong regulatory binding sequences underlie both the genes shared between ICA-inferred and experimental regulons and the genes in the E. coli core pan-regulon of Fur. This work demonstrates that the structure of ICA-inferred regulons can largely be understood through the strength of regulator binding sites in promoter regions, reinforcing the utility of top-down inference for regulon discovery.

Item Type: Article
Subjects: Asian STM > Biological Science
Depositing User: Managing Editor
Date Deposited: 23 Mar 2024 10:28
Last Modified: 23 Mar 2024 10:28
URI: http://journal.send2sub.com/id/eprint/3185
