July 20, 2020

Technologies series: DNA data storage

BY Soyon Park, Preston Llewellyn

Vast volumes of data can be stored, condensed into DNA.


What it is

DNA data storage is the archiving and retrieval of data to and from synthetic strands of DNA. Data centres are considered by many, including the world’s top asset managers, to be the ‘beating heart’ of global cities. 1 However, they face a challenge: new data is being generated faster than the capacity to store it can grow. 2 Fortunately, the solution is within us (or in us). Offering unrivalled storage density (the capacity to store all the world’s information in the space of a shoebox 3) as well as stability and longevity, 4 relevancy, 5 and energy efficiency, 6 DNA data storage has attracted the attention of multinational corporations and governments. 7

How it works

The first step is to convert information 8 into binary digits (bits). Storing and retrieving the digital data then involves five principal stages:

  1. Encoding. A computer algorithm maps strings of bits onto DNA sequences by translating the binary numbers into the nucleotides A, T, C, and G, which form the basic structure of DNA. 9 10
  2. Synthesis. The sequences are written into DNA molecules, and physical strands are built.
  3. Storage. The DNA is then physically conditioned and organised into a library for long-term storage. It is stored either in vivo 11 or, more commonly, in vitro, 12 whether in a frozen solution, dried, or eluted into a storage vessel. 13
  4. Retrieval. Upon a data request, the selected DNA is physically retrieved from the DNA pool in a process called random access, similar to that employed in traditional digital storage. 14
  5. Read. Finally, the selected DNA is sequenced using automated sequencing instruments to generate a set of corresponding ‘reads’, 15 from which the original digital data is recovered.
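
The encoding and read stages can be illustrated with a minimal Python sketch. It assumes a simple mapping of two bits per nucleotide (00=A, 01=C, 10=G, 11=T); that mapping, the function names, and the dictionary standing in for the DNA pool are all hypothetical choices made for illustration, not the scheme used by any particular system. Practical schemes typically also add error correction and avoid long runs of the same base.

```python
# Assumed mapping: two bits per nucleotide (illustrative only).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Stage 1 (encoding): translate each pair of bits into one nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Stage 5 (read): translate sequenced nucleotides back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Stages 3-4 stand-in: a dictionary plays the role of the DNA pool, and a
# key lookup plays the role of random access (hypothetical simplification).
pool = {"greeting.txt": encode(b"Hi")}

strand = pool["greeting.txt"]
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Under this assumed mapping each byte becomes four nucleotides, so the two-byte message above yields an eight-base strand.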

Applications

  • Commercial data storage. With declining production costs, increasing automation, and growing scale of use, DNA storage will become increasingly commercially viable 16 (see Figs 1 and 2).
  • Life sciences and medicine. In vivo DNA data storage permits logging of historical molecular events and improved monitoring of cellular behaviour. 17
  • Storing heritable information. Time-tested by nature, DNA offers a half-life of 500 years. 18
  • Computer architecture. Given the approaching limits to Moore’s Law, 19 the use of biochemical systems as integral parts of computing systems is worth exploring. 20
  • DNA steganography and cryptography. Data privacy can be preserved by concealing or encrypting information.

Implications and issues

  • ESG. Storing data in DNA is energy- and space-efficient, 22 using less power per year than a light bulb. 23 Once the technology is commercially viable, pressure to migrate to this storage method is set to grow.
  • Social wealth. Even if DNA does not become the ubiquitous storage medium, it stands to be used for long-term, new-scale information generation and preservation, enabling richer intergenerational knowledge transfer. 24
  • Data-driven innovation and the digital industry. The next-generation data storage market, estimated to be worth over $144bn by 2022, 25 will spur the digital economy, 26 upon which so many products and services are built 27 and which accounts for some 15.5% of the world economy.
  • Infrastructure. The world’s 500,000 data centres cover some 285 million square feet. With such large storage-density differentials (one gram of DNA stores as much as 42 billion USB sticks 28), the building of new recovery-enhancing infrastructure beckons. 29
  • Employment. The genomics industry is already growing rapidly, and new types of job will be created. ‘Technological unemployment’ will, however, be a risk. 30 31
  • Business costs. The technology stands to reduce logistics and other costs significantly. 32
  • Risk of obsolescence. Current systems may become obsolete; businesses with legacy systems may find it difficult to capitalise on new opportunities.
