
Scientists encode the novel ‘Wonderful Wizard of Oz’ in DNA

 

A few years ago, Harvard scientists encoded a low-resolution GIF of a galloping horse into the DNA of E. coli bacteria. Now, researchers have shown off the next level of DNA encoding by storing the entire L. Frank Baum novel “The Wonderful Wizard of Oz” (the basis for the classic 1939 Hollywood movie of almost the same name) in the form of DNA.

“We start with the digital version of the text,” Stephen Jones, a research scientist who collaborated on the project, told Digital Trends. “We send that information to our program, which spits out a bunch of DNA sequences, made of As, Cs, Gs and Ts. Each sequence is used to make actual pieces of DNA. Those pieces could be stored in some pretty rough conditions for thousands to even millions of years, much like we’ve seen with sequenced dinosaur DNA.”

Coding and decoding the text

Should someone, as Jones said, get the “burning desire” to read the novel in Esperanto, the constructed international auxiliary language it was translated into, they would take these DNA pieces and read back their sequence using a DNA sequencer. The sequence would then go through the algorithm developed by the team, which would translate it back into a digital version readable on a computer. “So basically, a computer’s zeros and ones get turned into DNA’s As, Cs, Gs and Ts for storage, then the process is reversed when you’re ready to read,” Jones said.
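To make the zeros-and-ones-to-bases idea concrete, here is a minimal Python sketch. It assumes the simplest possible mapping, two bits per base (00 to A, 01 to C, 10 to G, 11 to T). That mapping and the function names `text_to_dna` and `dna_to_text` are illustrative assumptions, not the scheme the team published, and the sketch has none of the error correction described in the article.

```python
# Illustrative only: a naive 2-bits-per-base mapping (an assumption),
# not the encoding described in the paper.
BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11


def text_to_dna(text: str) -> str:
    """Turn text into a string of A/C/G/T, four bases per byte."""
    bases = []
    for byte in text.encode("utf-8"):
        # Walk the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def dna_to_text(dna: str) -> str:
    """Reverse text_to_dna: rebuild each byte from four bases."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")


if __name__ == "__main__":
    encoded = text_to_dna("Oz")
    print(encoded)               # CATTCTGG
    print(dna_to_text(encoded))  # Oz
```

A mapping this direct is fragile: a single inserted or deleted base shifts every byte after it, which is the kind of problem the rest of the article is about.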

Digital-to-DNA conversion has been possible for a long time. What makes this work exciting is the way the conversion takes place. Digital and DNA storage have different weaknesses: digital storage is sensitive to electricity, temperature, water, and more. DNA is more robust in these areas, but pieces of a sequence can be accidentally deleted or added during the encoding process.

“Academics and big companies like Google and Microsoft have been trying to figure a way around this for a long time,” Jones explained. “Usually, people just read enough copies of the DNA information that if one gets messed up, they can depend on another. You can imagine that process is very inefficient.”
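The "read enough copies" approach Jones describes can be sketched as a per-position majority vote. The Python below is a simplified illustration under stated assumptions: the function name `consensus` is hypothetical, every read is assumed to have the same length, and only substituted bases are handled. Real pipelines have to align reads first, because insertions and deletions change a read's length.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Return the most common base at each position across the reads.

    Assumes all reads are the same length, so this handles substituted
    bases only, not insertions or deletions.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) yields one tuple of bases per position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


if __name__ == "__main__":
    reads = ["CATTCTGG", "CATTCTGG", "CAGTCTGG"]  # third read has one error
    print(consensus(reads))  # CATTCTGG
```

The inefficiency is visible even here: three full copies are needed to recover from a single bad base.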

An algorithm to overcome the problems

To overcome this, the team’s encoding algorithm has some neat qualities. To begin with, the information in each DNA sequence helps correct errors in every other DNA sequence’s information, so that they build upon each other. The method also accounts for those deletions or additions. It is flexible enough that it can be made stronger when a piece of information is really important (a character name in “Wizard of Oz,” for example) and weaker when the information doesn’t matter so much (a random word in the novel). It will also specifically avoid DNA sequences known to be problematic, such as a long run of A’s in a row. Finally, the method encrypts the information as it’s converted to DNA sequence, adding a layer of protection and privacy that could be useful with data more sensitive than a 120-year-old public domain novel.
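One of those qualities, steering clear of long runs of the same base, is easy to illustrate with a screening check. The Python sketch below is an assumption-laden example rather than the team's method: the helper names `longest_run` and `is_acceptable` and the threshold of three repeated bases are made up for illustration. It only flags a problematic sequence; it does not show how an encoder would avoid producing one.

```python
from itertools import groupby


def longest_run(sequence: str) -> int:
    """Length of the longest stretch of one repeated base."""
    return max((len(list(group)) for _, group in groupby(sequence)), default=0)


def is_acceptable(sequence: str, max_run: int = 3) -> bool:
    """True if no base repeats more than max_run times in a row.

    max_run=3 is an arbitrary illustrative threshold.
    """
    return longest_run(sequence) <= max_run


if __name__ == "__main__":
    print(longest_run("CATTCTGG"))    # 2
    print(is_acceptable("CATTCTGG"))  # True
    print(is_acceptable("AAAAC"))     # False
```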

“A top [real-world] use would be for long-term storage when you must keep the information, but use it infrequently,” Jones said, giving the example of historical banking data for years past. “Tech companies would see value for dormant accounts that no one’s using, but they don’t want to delete. There could [additionally] be a huge cost savings during storage. Storing DNA takes almost no energy — especially compared to keeping data servers plugged in and happy.”

At least one DNA storage company is working on this problem, although a viable product is likely several years away. Nonetheless, work like this is a reminder that the science is getting closer all the time.

A paper describing the work was recently published in the journal PNAS.

Luke Dormehl