Encoding Based on htDNA-chip®

Overview of encoding in DNA storage

In the era of global information, traditional storage technologies are being put to the test. For DNA to become a viable competitor to these technologies, it must first support reliable information encoding, and nucleotide strings must be produced cheaply and quickly. DNA is well suited to storing information for two reasons. First, DNA is a biological macromolecule: a polymer of many small monomers joined in a defined sequence. Each monomer is a deoxyribonucleotide, built on a sugar-phosphate backbone and carrying one of four bases (A, T, G, C). Because a DNA chain contains different types of monomers, each type can be assigned a different value, so DNA is naturally able to store information. Second, DNA is the genetic material of most organisms and packs the genome of a biological species into a small cell. This genetic information is carried in two ways: by the type of each base and by the relative order of the bases. In vitro, if the bases are assigned A and T = 0 and G and C = 1, the chemical signal can be converted into a digital signal, so DNA can be used as a binary data storage material.
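
The base assignment above (A and T = 0, G and C = 1) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the rule for choosing between the two bases that share a bit value is an assumption added for illustration (it simply avoids repeating the previous base); the text does not prescribe one.

```python
# Mapping taken from the text: A and T read as 0, G and C read as 1.
BIT_OF_BASE = {"A": "0", "T": "0", "G": "1", "C": "1"}
BASES_OF_BIT = {"0": ("A", "T"), "1": ("G", "C")}


def bits_to_dna(bits: str) -> str:
    """Write a string of '0'/'1' characters as a base sequence."""
    strand = []
    for bit in bits:
        first, second = BASES_OF_BIT[bit]
        # Assumption: if the preferred base would repeat the previous
        # base, use the other base that carries the same bit value.
        base = second if strand and strand[-1] == first else first
        strand.append(base)
    return "".join(strand)


def dna_to_bits(strand: str) -> str:
    """Read a base sequence back into a string of '0'/'1' characters."""
    return "".join(BIT_OF_BASE[base] for base in strand)


print(bits_to_dna("0011"))   # ATGC
print(dna_to_bits("ATGC"))   # 0011
```

Because two bases share each bit value, this scheme stores one bit per base and leaves the writer free to choose which of the two bases to synthesize.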

How is data encoded in DNA in vitro?

In vivo, DNA regulates life activities automatically: the genetic material it encodes is transcribed and translated into proteins. So how does in vitro DNA storage encode the data we need? Scientists use many short DNA sequences rather than a few long ones to encode data, which is similar to the way a hard disk is written. On a hard disk, data is written to small blocks called sectors, which reduces the difficulty and cost of writing and reading data. First, the files to be stored are put into HTML format and then converted into computer-readable binary data composed of 0s and 1s. Subsequently, the binary data is converted into the four bases (A, T, G, C), and the sequence of these bases is determined. Depending on the scheme, each base encodes one or two bits. The specific encoding method is shown in Fig. 1.

Fig. 1 Encoding in the process of DNA data storage (Rutten, M.; et al. 2018)
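
The file-to-binary-to-base conversion described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses a two-bits-per-base table (A = 00, C = 01, G = 10, T = 11) chosen for the example, and the function names are hypothetical. Real schemes may use a different table or additional constraints.

```python
# Assumed two-bit table for illustration; not specified by the text.
BASE_OF_PAIR = {"00": "A", "01": "C", "10": "G", "11": "T"}
PAIR_OF_BASE = {base: pair for pair, base in BASE_OF_PAIR.items()}


def bytes_to_dna(data: bytes) -> str:
    """Convert file bytes to bits, then map every two bits to one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_OF_PAIR[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Map every base back to two bits, then regroup the bits into bytes."""
    bits = "".join(PAIR_OF_BASE[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


sequence = bytes_to_dna(b"Hi")
print(sequence)                 # CAGACGGC
print(dna_to_bytes(sequence))   # b'Hi'
```

With two bits per base, each 8-bit character occupies four bases, which is half the DNA needed by a one-bit-per-base assignment.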

What can the htDNA-chip® platform do in DNA encoding?

Our silicon-based htDNA-chip® technology platform can help with your coding design, giving you the freedom to design DNA of any size and spacing. The DNA encoding process corresponds to the algorithm of a program: constructing the algorithm means designing the DNA strands used to store the information. At present, because of the limitations of traditional DNA synthesis and sequencing methods, the encoding process must meet the following requirements. First, the code assigned to each DNA strand must be unambiguous and include methods for error detection and correction. Second, the algorithm must use DNA efficiently, because both the price and the length of DNA are constrained. Third, the encoding algorithm must use simple and straightforward codes, because it is impossible to encode all the data on a single piece of DNA: the chain length of synthetic DNA is limited, so the data must be divided into small segments, each stored on its own short strand. Unlike traditional DNA synthesis approaches, htDNA-chip® gives you more design space for your coding, including the following aspects.

Fig. 2 htDNA-chip® in the process of DNA encoding in DNA data storage

  • Ensure the accuracy of DNA products
  • Provide high-throughput DNA synthesis 
  • Realize the synthesis of long DNA chains
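
The encoding requirements discussed in this section (dividing data into small segments, one per short strand, with a way to identify errors) can be sketched in Python as follows. This is not the htDNA-chip® algorithm. It is a hypothetical illustration that assumes a two-bits-per-base table (A = 00, C = 01, G = 10, T = 11), a 2-byte segment index, and a CRC-32 checksum from Python's standard `zlib` module. The checksum only detects errors; correcting them would require an error-correcting code, which is not shown.

```python
import zlib

BASES = "ACGT"  # assumed table: A = 00, C = 01, G = 10, T = 11


def to_dna(data: bytes) -> str:
    """Map each byte to four bases, two bits per base."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))


def from_dna(strand: str) -> bytes:
    """Map every four bases back to one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def split_into_strands(data: bytes, payload_size: int = 16) -> list[str]:
    """Divide data into segments; each strand is index + payload + CRC-32."""
    strands = []
    for index, start in enumerate(range(0, len(data), payload_size)):
        body = index.to_bytes(2, "big") + data[start:start + payload_size]
        checksum = zlib.crc32(body).to_bytes(4, "big")
        strands.append(to_dna(body + checksum))
    return strands


def reassemble(strands: list[str]) -> bytes:
    """Verify each strand, then rebuild the data in index order."""
    chunks = {}
    for strand in strands:
        raw = from_dna(strand)
        body, checksum = raw[:-4], raw[-4:]
        if zlib.crc32(body).to_bytes(4, "big") != checksum:
            raise ValueError("checksum mismatch: strand is corrupted")
        chunks[int.from_bytes(body[:2], "big")] = body[2:]
    return b"".join(chunks[i] for i in sorted(chunks))


message = b"DNA storage splits data into short strands"
strands = split_into_strands(message, payload_size=8)
# Strands can be read back in any order; the index restores the sequence.
print(reassemble(list(reversed(strands))) == message)   # True
```

The index and checksum add 6 bytes (24 bases) of overhead to every strand, so the payload size is a trade-off between strand length and efficient use of DNA.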

All services are available on a 24/7/365 basis. If you are interested in CD BioSciences' htDNA-chip® platform, please contact us.

Reference

  1. Rutten, M.; et al. Encoding information into polymers. Nature Reviews Chemistry. 2018, 2: 365-381.
For research use only. Not intended for any clinical use.