Futurism is powered by Vocal.
Vocal is a platform that provides storytelling tools and engaged communities for writers, musicians, filmmakers, podcasters, and other creators to get discovered and fund their creativity.
How does Vocal work?
Creators share their stories on Vocal’s communities. In return, creators earn money when they are tipped and when their stories are read.
How do I join Vocal?
Vocal welcomes creators of all shapes and sizes. Join for free and start creating.
To learn more about Vocal, visit our resources.Show less
The computing memory required to store ongoing data in biomedicine is now estimated to exceed the computing challenges of running YouTube and Twitter.
Data storing technology has constantly evolved to keep up with advancing human knowledge and scientists are now pioneering next-generation efforts for the future of archive technology:
The Evolving Storage Systems
In response to ongoing demands for massive data repositories in genomics (study of genes) alone, researchers have shown possibility of creating DNA-based archives for large-scale, long-term information deposition. This unusual technique has its origins in the serendipitous findings of intact mammoth and horse genomes decoded from bone, almost 60,000 to 700,000 years after its accidental preservation - to indicate a measure of just how long-term.
The current global data crunch reiterates the need for an alternative repository - with storage capacities set to reach 44 trillion Gigabytes by 2020 and exceed the supply of microchip-grade silicon by 2040. As a memory material, DNA ideally stores information of life for life, is able to pack information densely and enable data retrieval. The quaternary code of DNA written with an alphabet of four nucleotides: cytosine (C), thymine (T), adenine (A) and guanine (G) can be utilized for programmable genetic engineering with a storage density that exceeds electronic memory.
Architecture of DNA and Comparative Storage
Architecturally, four nucleotide bases of DNA form complementary base pairs to self-assemble a double helix. In theory, 2 bits of data can be stored per nucleotide with observed half-life over 500 years. Comparative storage capacities indicate genomic density within a bacteria is equivalent to a storage capacity of ~10^19 bits per cubic centimeter; if this capacity were transferred to a realistic information storage system, the projected global storage needs of ~10^22 bits would be met by just ~1 kg of DNA. Of course achieving that potential isn’t easy, so ongoing research in DNA-based architecture of archival systems has formed a new field, with key contributions made to its advancement.
The method was first demonstrated in 1988 by artist Joe Davis and Harvard researchers, when the team encoded an artificial DNA sequence with an ancient Germanic rune representing life and the female earth named Microvenus. The 2D image of interest was first mathematically converted to a 35-bit binary code, DNA bases assigned to a specific pattern translating bits to nucleotides and a DNA sequence synthesized. The sequence was stored in Escherichia coli bacteria for multiplication and decoded to reconstruct the image, as detailed in the Art Journal :
Recap Method 1: DNA-Based Information Storage
Subsequent storage systems included the key cyclical steps outlined, with methods optimized as technology advanced. Bacteria assisted multiplication for instance was replaced with in-lab instrumentation such as polymerase chain reaction (PCR), to multiply synthetic DNA. Also, synthesize = DNA writing process and sequence = DNA reading.
The small scale of work (35-bits) stemmed from the difficulty of writing and decoding long perfect DNA sequences. In 2011, George Church, a molecular geneticist at Harvard Medical School, began DNA data storage experiments based on this concept, but used a novel encoding scheme that stored a larger scale, html draft of the entire book Regenesis into DNA. In total, the content for storage was a 5.27-megabit stream encoded into DNA and amplified via the polymerase chain reaction to enable reading:
Recap Method: Archiving a Book in DNA
- All of Shakespeare’s sonnets
- An audio clip of Martin Luther King’s ‘I have a dream’ speech
- A classic scientific paper on the structure of DNA published in Nature in 1953
- A photo of the research institute in which this work was done (EBI)
- A file of the code used to enable the complex process
Recap Method: Goldman Encoding
Demonstrating New Protocol - Test Images Stored
The larger 200 MB archive created by the same team reportedly includes 1) a video by the band OK Go! 2) Universal Declaration of Human Rights in 100 languages, 3) the top 100 books of Project Gutenberg and 4) a seed database.
At this point, if a user wished to read a single file of the DNA-based archive the entire system had to be decoded. Computer scientist Olgica Milenkovic at the University of Illinois developed a new method that enabled random-access to the preferred data in storage and also enabled re-writing of stored information. As proof-of-concept, they encoded Wiki pages of 6 Universities into DNA and selectively edited context of 3 schools.
The main barrier to commercialization of this process is its extremely high cost, largely due to the DNA writing process (synthesis) than the reading process (sequencing). It costs less to “read” the genome since completion of the Human Genome Project (HGP) in 2003. Some improvements in cost reduction may similarly result from the Human Genome write project (HGP-write), launched by Church and others with the aim of reducing costs by more than 1000-fold within the next 10 years.
The most recent developments in the field were in March 2017, when scientists at the Columbia University and New York Genome Center developed a DNA archive via a new coding strategy named DNA fountain.
The archive consisted of 1) a full computer operating system, 2) an 1895 French film “Arrival of a train at La Ciotat”, 3) an Amazon gift card of $50, 4) a computer virus, 5) a pioneer plaque and 6) a 1948 study by information theorist Claude Shannon.
Movie Stored and Retrieved from DNA Molecules
This archive has the highest storage capacity reported so far at 215 petabytes (PB), 100x the magnitude of storage density reported by previous research teams. This storage system can be copied an unlimited number of times for virtually unlimited data retrieval. Computer scientist Yaniv Erlich recaps the newest advancement in conclusion:
This article is primarily based on a report by George M. Church, Yuan Gao and Sriram Kosuri published in Science Magazine September 2012, available via | doi 10.1126/science.1226355