A Neural Model of RNA Splicing: Learning Motif Distances with Self-Attention and Toeplitz Max Pooling
2021
Alternative RNA splicing is an important regulator of tissue development and specificity, and a relevant mechanism in cancer progression. There is strong motivation to disentangle the rules that govern RNA splicing, in part because this knowledge may one day yield new clinically relevant diagnostic tools and therapeutics. It is no easy task to reverse engineer how the spliceosome machinery choreographs the removal and addition of RNA elements following transcription. Here, we propose an interpretable neural network called the Toeplitz ATtention Architecture (TATA), which learns distance-dependent motif interactions through a novel Toeplitz max pool layer that captures the relative distance between interacting CNN filters. TATA is a completely transparent "clear-box" solution: every model parameter is human-interpretable. We validate TATA on simulated data, then apply it to real data to identify putative cis-regulatory elements that interact with primary RNA splice sites.
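The abstract does not define the Toeplitz max pool layer, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes a square position-by-position score matrix (for example, the outer product of two motif-filter activation tracks) and takes the maximum along each diagonal, so that every output entry corresponds to one relative distance between the two positions. The function name `diagonal_max_pool` and the toy activation tracks are hypothetical.

```python
import numpy as np


def diagonal_max_pool(scores: np.ndarray) -> np.ndarray:
    """Max over each diagonal of a square (L, L) score matrix.

    Assumption (not from the paper): entry k of the result is the max of
    scores[i, i + d] over all valid i, where d = k - (L - 1) is the
    relative offset between the two positions.
    """
    length = scores.shape[0]
    if scores.ndim != 2 or scores.shape != (length, length):
        raise ValueError("scores must be a square 2-D array")
    offsets = range(-(length - 1), length)
    return np.array([np.diagonal(scores, offset=d).max() for d in offsets])


# Toy example: two hypothetical motif activation tracks over 10 positions.
seq_len = 10
motif_a = np.zeros(seq_len)
motif_b = np.zeros(seq_len)
motif_a[2] = 1.0  # motif A fires at position 2
motif_b[7] = 1.0  # motif B fires at position 7

# Pairwise interaction scores between every position of A and of B.
scores = np.outer(motif_a, motif_b)

profile = diagonal_max_pool(scores)
best_offset = int(np.argmax(profile)) - (seq_len - 1)
print(best_offset)  # 5: motif B sits 5 positions downstream of motif A
```

Pooling along diagonals discards where in the sequence the pair occurs and keeps only how far apart the two motifs are, which is the kind of distance-dependent signal the abstract describes.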