On the efficient digital code representation in DNA-based data storage

2020 
Deoxyribonucleic acid (DNA), the molecule of life, is composed of four nucleotides: Adenine, Guanine, Cytosine, and Thymine. Combinations of these nucleotides in DNA encode the 20 amino acids that build the structure of living organisms. These discrete components, together with the characteristics and functions of DNA, allow DNA to be understood as a digital component. When DNA is treated as an organic digital memory, it becomes a compelling data storage medium given its superior density, stability, energy efficiency, longevity, and lack of foreseeable technical obsolescence compared with conventional electronic media. Various challenging experiments have demonstrated that digital information can be written in DNA, stored, and accurately read. Moreover, because of the digital nature of DNA, there is a trend to associate DNA information (6 bits per amino acid, that is, three nucleotides at 2 bits each) with typical digital codes for information representation (8 bits). Therefore, we propose to use a series of 48 bits, which corresponds to six 8-bit units or eight 6-bit units, to encode the digital information of a host into a DNA representation. This representation is appropriate in end-to-end digital communication systems since (i) it introduces a digital code that is independent of the computer's architecture, and (ii) it can be used as a "common format" for "bio host-bio transmitter" communication, combining the advantages of DNA as a storage medium with effective methods for compressing DNA information to save transmission bandwidth.
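The arithmetic behind the 48-bit block can be illustrated with a minimal sketch. It assumes a simple 2-bit-per-nucleotide mapping (A=00, C=01, G=10, T=11), which the abstract does not specify; the names `encode_block`, `decode_block`, and `codons` are hypothetical and are not the authors' scheme. Under that assumption, one 48-bit block holds six bytes and maps to 24 nucleotides, that is, eight three-nucleotide groups of 6 bits each.

```python
# Assumed mapping (not given in the abstract): A=00, C=01, G=10, T=11.
NUCLEOTIDES = "ACGT"

BLOCK_BITS = 48
BLOCK_BYTES = BLOCK_BITS // 8    # six 8-bit units
BLOCK_CODONS = BLOCK_BITS // 6   # eight 6-bit units (three nucleotides each)


def encode_block(block: bytes) -> str:
    """Map one 48-bit block (6 bytes) to 24 nucleotides, 2 bits per base."""
    if len(block) != BLOCK_BYTES:
        raise ValueError(f"expected {BLOCK_BYTES} bytes, got {len(block)}")
    value = int.from_bytes(block, "big")
    # Walk the 48 bits from most to least significant, two bits at a time.
    return "".join(
        NUCLEOTIDES[(value >> shift) & 0b11]
        for shift in range(BLOCK_BITS - 2, -1, -2)
    )


def decode_block(dna: str) -> bytes:
    """Inverse of encode_block: 24 nucleotides back to 6 bytes."""
    if len(dna) != BLOCK_BITS // 2:
        raise ValueError(f"expected {BLOCK_BITS // 2} bases, got {len(dna)}")
    value = 0
    for base in dna:
        # str.index raises ValueError for any character outside "ACGT".
        value = (value << 2) | NUCLEOTIDES.index(base)
    return value.to_bytes(BLOCK_BYTES, "big")


def codons(dna: str) -> list[str]:
    """Split a nucleotide string into three-base groups (6 bits each)."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


if __name__ == "__main__":
    dna = encode_block(b"DNA123")
    print(dna, codons(dna), decode_block(dna))
```

The block size works because 48 is a common multiple of 8 and 6, so byte boundaries and three-nucleotide boundaries line up at the start and end of every block.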