DNA As A Flash Drive: Why Write Digital Data Into The Genome - Alternative View

The growth in the volume of digital information prompts scientists to look for more compact ways of recording and storing it. And what could be more compact than DNA? RIA Novosti, together with an expert, figured out how to encode words with nucleotides and how much data one molecule contains.

Bases as codes

DNA is a sequence of nucleotides of four kinds: adenine, guanine, thymine and cytosine. To encode information, each of them is assigned a digit, for example thymine = 0, guanine = 1, adenine = 2, cytosine = 3. Encoding starts by converting all letters, numbers and images into binary code, that is, a sequence of zeros and ones, which is then converted into a sequence of nucleotides, that is, into quaternary (base-4) code.
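The two-step conversion described above (text to binary, binary to quaternary) can be sketched in a few lines of Python. The digit-to-base table is the example from the text; the function names, the use of UTF-8 and the choice to read each byte most significant bits first are assumptions made only for illustration.

```python
# Digit-to-base mapping taken from the example in the text.
DIGIT_TO_BASE = {0: "T", 1: "G", 2: "A", 3: "C"}
BASE_TO_DIGIT = {base: digit for digit, base in DIGIT_TO_BASE.items()}


def encode(text: str) -> str:
    """Turn text into a nucleotide string, two bits per base."""
    bases = []
    for byte in text.encode("utf-8"):  # assumption: UTF-8 as the binary form
        # Split each byte into four 2-bit digits, most significant first.
        for shift in (6, 4, 2, 0):
            bases.append(DIGIT_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def decode(dna: str) -> str:
    """Reverse of encode: every four bases become one byte again."""
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASE_TO_DIGIT[base]
        data.append(byte)
    return data.decode("utf-8")


print(encode("A"))            # the letter A is 01000001 in binary
print(decode(encode("DNA")))  # round trip back to the original text
```

With this mapping every byte takes exactly four bases, so a 52,000-word book would need a few million nucleotides before any error protection is added.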

Before data can be encoded in DNA, it has to be translated into a digital code / Illustration by RIA Novosti. Alina Polyanina

Alternatively, only three nucleotides are used to build the code (a ternary code), while the fourth serves to split sequences into parts. Another option treats the bases as a binary code, in which two of them stand for zero and the other two for one.
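The binary variant, with two bases per bit value, can be sketched as follows. Which bases are grouped together (A or C for zero, G or T for one) is an assumption, as is the rule for choosing between them. The text does not say why two bases per bit are useful; one plausible reason, assumed here, is that the encoder gains the freedom to avoid repeating the same base twice in a row.

```python
# Assumed grouping: A or C stands for 0, G or T stands for 1.
BIT_TO_BASES = {"0": "AC", "1": "GT"}
BASE_TO_BIT = {"A": "0", "C": "0", "G": "1", "T": "1"}


def encode_bits(bits: str) -> str:
    """Encode a string of '0'/'1' characters without repeating a base."""
    out = []
    for bit in bits:
        first, second = BIT_TO_BASES[bit]
        # Prefer the first base; use the second if it would repeat the last one.
        out.append(second if out and out[-1] == first else first)
    return "".join(out)


def decode_bits(dna: str) -> str:
    """Each base maps back to exactly one bit, whichever alternative was used."""
    return "".join(BASE_TO_BIT[base] for base in dna)


print(encode_bits("0000"))  # alternates between the two bases for zero
print(decode_bits(encode_bits("0011")))
```

The price of this flexibility is capacity: each base carries one bit instead of the two bits available in a quaternary code.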

Several techniques are used for reading. In one of the most common, a strand of the DNA molecule is copied using bases that each carry a colored label. A very sensitive detector then registers the labels, and a computer uses the colors to reconstruct the nucleotide sequence.

“The DNA molecule is very capacious. Even in bacteria it usually contains about a million bases, and in humans as many as three billion. That is, each human cell carries a volume of information comparable to the capacity of a flash drive. And we have trillions of such cells. A huge amount of data can be recorded in DNA, but writing to and reading from such a medium is still too slow and costly,” says Alexander Panchin, Ph.D., senior researcher at the A. A. Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences.
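The comparison with a flash drive can be checked with simple arithmetic. The sketch below assumes the theoretical maximum of two bits per base (four possible bases) and ignores any overhead; the function name is hypothetical.

```python
BITS_PER_BASE = 2  # four possible bases, so log2(4) = 2 bits each


def genome_capacity_bytes(num_bases: int) -> float:
    """Upper bound on the data a sequence of this length could hold."""
    return num_bases * BITS_PER_BASE / 8


human = genome_capacity_bytes(3_000_000_000)  # about three billion bases
bacterium = genome_capacity_bytes(1_000_000)  # about a million bases

print(f"human genome: about {human / 1e6:.0f} MB")
print(f"bacterial genome: about {bacterium / 1e3:.0f} kB")
```

Under these assumptions a human genome corresponds to roughly 750 megabytes, which is indeed in the range of a small flash drive.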

Recording density grows

In June 1999, the journal Nature published an article by American scientists who had developed a technique for sending secret messages using DNA. They synthesized a molecule containing a nucleotide sequence formed using a quaternary code. The secret DNA, hidden in a mixture, was sent to another laboratory, whose staff used special chemical keys to find the desired molecule and extract the information from it.

“In general, there are two approaches to recording data in DNA. The first is to synthesize completely new DNA using a chemical synthesizer. At the computer's command, nucleotides are added to the solution in a certain order, and the required chain of bases gradually ‘grows’. In the second, data is encoded in the already existing DNA of an organism,” explains Panchin.

In May 2010, the group of Craig Venter, who led one of the first efforts to sequence the human genome, published a paper on the creation of an artificial bacterium. They took a bacterial cell stripped of its own genome and placed the synthesized base sequence inside it. The result was a new bacterium, fully active and alive, which differed from an ordinary one only in that its DNA had been made artificially. The team also showed some artistic flair by writing their names and quotations from classic authors into the bacterial DNA using a quaternary code.

In 2012, a group led by molecular biologist George Church took a more thorough approach and encoded in DNA the 52,000-word book Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves, several pictures and one Java program. They used binary code. The total amount of data was 658 kilobytes. The information density came to almost 10^18 bytes per gram of molecules. For comparison, a 10^12-byte hard drive weighs about a hundred grams. The main disadvantage of this method is the instability of the recorded information.
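The density comparison is easier to appreciate as a ratio. The short calculation below takes the figures quoted above at face value, reading them as 10^18 bytes per gram for DNA and a 10^12-byte hard drive weighing about 100 grams; the variable names are illustrative only.

```python
dna_bytes_per_gram = 1e18         # density quoted for the 2012 experiment
disk_bytes_per_gram = 1e12 / 100  # 10^12-byte drive weighing about 100 g

ratio = dna_bytes_per_gram / disk_bytes_per_gram
print(f"DNA is denser by a factor of about {ratio:.0e}")

# Mass of DNA needed for the 658 kilobytes recorded in the experiment.
grams_needed = 658e3 / dna_bytes_per_gram
print(f"658 kB would occupy about {grams_needed:.1e} grams of DNA")
```

On these numbers DNA stores about a hundred million times more data per gram than the hard drive.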

“The DNA molecule tends to mutate, which reduces the reliability of data storage. This is especially true if the DNA is carried by a living cell capable of division: errors creep in particularly often when DNA is duplicated. Reliability increases if you have thousands of copies of the same message. Or you can simply store the DNA in, say, a freezer. At low temperatures the molecule's tendency to mutate is significantly reduced,” explains the expert.
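Why many copies help can be shown with a simple majority vote: if each copy picks up a few random errors, the most common base at each position is very likely the original one. This is a minimal sketch that assumes all copies have the same length and that errors are substitutions only (no inserted or deleted bases); the function name and the sample sequences are made up for illustration.

```python
from collections import Counter


def consensus(copies: list[str]) -> str:
    """Take the most common base at each position across equal-length copies."""
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*copies)
    )


copies = [
    "GTTGACCA",  # intact
    "GTTGACCA",  # intact
    "GTAGACCA",  # one substituted base
    "GTTGACGA",  # one substituted base elsewhere
    "GTTGACCA",  # intact
]
print(consensus(copies))
```

The vote recovers the original sequence as long as, at every position, the correct base still appears in more copies than any single wrong one.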

In addition, information is sometimes lost during reading. The errors can be chemical in nature, when an incorrect base is incorporated, or purely computational, that is, introduced by the computer.

Expensive, reliable

In March 2017, Science magazine published an article by American scientists who managed to write 2 × 10^17 bytes per gram of DNA. The authors emphasize that they did not lose a single byte. Simply put, what was written is exactly what was read back.

For an ordinary user, a "genetic flash drive" is not yet available, because storing information on it is very expensive and the read/write speed is low. Scientists estimate that reading just one megabyte costs about three and a half thousand dollars and takes several hours.

The undoubted advantages of recording information in DNA include the enormous storage density of data, as well as the stability of the carrier, although only at low temperatures.