Motif analysis in co-expression networks reveals regulatory elements in plants: The peach as a model

2020 
Identification of functional regulatory elements encoded in plant genomes is fundamental to understanding gene regulation. While much attention has been given to model species such as Arabidopsis thaliana, little is known about regulatory motifs in other plant genera. Here, we describe an accurate bottom-up approach using the online workbench RSAT::Plants for versatile ab initio motif discovery, taking Prunus persica as a model. These predictions rely on the construction of a co-expression network to generate modules with similar expression trends; we also assess the effect of increasing upstream region length on the sensitivity of motif discovery. Applying two discovery algorithms, 18 out of 45 modules were found to be enriched in motifs typical of well-known transcription factor families (bHLH, bZip, BZR, CAMTA, DOF, E2FE, AP2-ERF, Myb-like, NAC, TCP, WRKY) and a novel motif. Our results indicate that a small number of input sequences and a short promoter length are preferable for minimizing the amount of uninformative signal in peach. The spatial distribution of TF binding sites was uneven, with motifs tending to lie around the transcription start site. The reliability of this approach was also benchmarked in Arabidopsis thaliana, where it recovered the expected motifs from promoters of genes containing ChIP-seq peaks. Overall, this paper presents a glimpse of the peach regulatory components at genome scale and provides a general protocol that can be applied to many other species. Additionally, an RSAT Docker container was released to facilitate similar analyses on other species or to reproduce our results.