Completing the ENCODE3 compendium yields accurate imputations across a variety of assays and human biosamples.

Jacob Schreiber, Jeff A. Bilmes, William Stafford Noble

2020

Recent efforts to describe the human epigenome have yielded thousands of epigenomic and transcriptomic datasets. However, primarily because of cost, the total number of such assays that can be performed is limited. Accordingly, we applied an imputation approach, Avocado, to a dataset of 3814 tracks derived from the ENCODE compendium, including measurements of chromatin accessibility, histone modification, transcription, and protein binding. Avocado shows significant improvements in imputing protein binding compared to the top models in the ENCODE-DREAM challenge. Additionally, we show that the Avocado model allows new assays and biosamples to be added efficiently to a pre-trained model.

Keywords:

Transcription (biology)
Compendium
Imputation (statistics)
ENCODE
Epigenomics
Genetics
Epigenome
Biology
Chromatin
Histone
Human genetics
Computational biology
