Scientists are using A.I. to create artificial human genetic code

Since at least 1950, when Alan Turing’s famous “Computing Machinery and Intelligence” paper was first published in the journal Mind, computer scientists interested in artificial intelligence have been fascinated by the notion of coding the mind. The mind, so the theory goes, is substrate independent, meaning that its processing ability does not, by necessity, have to be attached to the wetware of the brain. We could upload minds to computers or, conceivably, build entirely new ones wholly in the world of software.

This is all familiar stuff. While we have yet to build or re-create a mind in software, outside of the lowest-resolution abstractions that are modern neural networks, there is no shortage of computer scientists working on this effort right now.

What is altogether less familiar is the work being carried out by researchers at Estonia’s University of Tartu and France’s Paris-Saclay University.

Rather than just trying to re-create an approximation of the mind in software, they’ve turned to a different problem: Can you use an algorithm to generate genetic code for people who have never existed? Could you take the same generative adversarial network (GAN) technology that allows A.I. models like BigSleep to spit out compellingly realistic generated images and use it, instead, to create fake DNA that, in the vein of Turing’s work, is indistinguishable from that of a flesh-and-blood person?

Artificial genetic data

“Creating artificial genetic data that are realistic enough, without directly copying the sequences, is a very hard problem,” Flora Jay, a researcher specializing in machine learning and population genetics at Paris-Saclay University, told Digital Trends. “Genetic data is of high dimension, and you cannot just eyeball what’s important or not. We thus turned to cutting-edge techniques [being] applied to the computer vision, text, music, or protein world. These generative networks — GANs and [restricted Boltzmann machines] — are designed so that they can progressively and automatically learn how to create artificial genetic sequences.”

A GAN, a class of machine-learning framework introduced by researcher (and current Apple employee) Ian Goodfellow, uses a combative, tug-of-war approach to improve its generative outcomes. It consists of two neural networks, a “generator” and a “discriminator,” which pass outputs back and forth.

GAN model
Yelmen et al. 2021

The generator’s job is to create something, be it an A.I. painting or a chunk of code representing an artificial genome in the form of ones and zeroes. The discriminator, like a bot version of J.K. Simmons’ perfectionist music instructor in the movie Whiplash, then critiques its efforts and sends this back to the generator. The generator learns from this feedback, while the discriminator similarly gets ever better at guessing what’s been created by the generator and what is the genuine article. Eventually, the generator is so good at creating fake versions of whatever it is attempting that the discriminator can be fooled. It’s no longer able to differentiate real from fake.
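The back-and-forth between generator and discriminator can be sketched in a few dozen lines of plain Python. This is a toy illustration only, not the networks used in the study: the class name `ToyGAN`, the made-up allele frequencies in `freqs`, the single linear layer per network, the learning rate, and the final rounding to ones and zeroes are all assumptions chosen to keep the example small.

```python
import math
import random


def sigmoid(t):
    # Clamp the input so math.exp cannot overflow.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


class ToyGAN:
    """Hypothetical one-layer generator and discriminator over binary sites."""

    def __init__(self, n_sites, seed=0):
        self.rng = random.Random(seed)
        self.n = n_sites
        # Generator: maps one noise value z to a per-site probability.
        self.gw = [self.rng.gauss(0.0, 0.1) for _ in range(n_sites)]
        self.gb = [0.0] * n_sites
        # Discriminator: logistic regression scoring "real" vs "fake".
        self.dw = [0.0] * n_sites
        self.dc = 0.0

    def generate(self, z):
        return [sigmoid(w * z + b) for w, b in zip(self.gw, self.gb)]

    def discriminate(self, x):
        return sigmoid(sum(w * xi for w, xi in zip(self.dw, x)) + self.dc)

    def train_step(self, real, lr=0.05):
        z = self.rng.gauss(0.0, 1.0)
        fake = self.generate(z)

        # 1) Discriminator update: push D(real) toward 1 and D(fake) toward 0.
        d_real = self.discriminate(real)
        d_fake = self.discriminate(fake)
        for j in range(self.n):
            self.dw[j] -= lr * ((d_real - 1.0) * real[j] + d_fake * fake[j])
        self.dc -= lr * ((d_real - 1.0) + d_fake)

        # 2) Generator update: use the critic's feedback to push D(fake) toward 1.
        d_fake = self.discriminate(fake)
        for j in range(self.n):
            grad = (d_fake - 1.0) * self.dw[j] * fake[j] * (1.0 - fake[j])
            self.gw[j] -= lr * grad * z
            self.gb[j] -= lr * grad

    def sample_genome(self):
        # Round the generator's probabilities to a string of ones and zeroes.
        z = self.rng.gauss(0.0, 1.0)
        return [1 if p >= 0.5 else 0 for p in self.generate(z)]


def sample_real(freqs, rng):
    # Stand-in for real data: each site is 1 with its (made-up) frequency.
    return [1 if rng.random() < f else 0 for f in freqs]


freqs = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]  # invented values, not real genetics
gan = ToyGAN(len(freqs), seed=1)
data_rng = random.Random(2)
for _ in range(2000):
    gan.train_step(sample_real(freqs, data_rng))

fake = gan.sample_genome()
print(fake, round(gan.discriminate(fake), 3))
```

With networks this small, the generator can at best pick up rough per-site tendencies, so the output should not be read as realistic. The point is the alternating pair of updates inside `train_step`, which mirrors the critique-and-improve loop described above.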

“One of the main problems here is assessing the quality of artificial genomes,” Burak Yelmen, a Ph.D. student at the University of Tartu’s Institute of Genomics, told Digital Trends. “You can look at an image and decide if it looks real, but this is not possible for genomes. [The] majority of the analyses we performed in our study was to see whether the artificial genome chunks we generated really looked like the real ones.”

Don’t worry, though. Despite a growing mass of articles about highly dubious gene tampering designed to rewrite the human code, this work is not about trying to “write” new parentless humans who could be created with the aid of supercomputers.

A chromosome emerges from random digital noise
Burak Yelmen

“To be clear, the objective of our work is to better understand and encode the existing genetic diversity of thousands or millions of people around the world, not to create artificial cells,” Jay said. “The neural networks are trained on this existing diversity, so the generated genomic regions do not carry additional novel mutations that could easily disrupt the functionality of a sequence — and they include, untouched, the segments that are conserved across human populations.”

Jay noted that, at the whole genome scale, it is “difficult to say” whether a specific combination of millions of generated nucleotides could indeed be “functional.” In other words, don’t expect to compile and run this code and have a fully formed person (or their blueprints) emerge at the other end. Instead, the purpose is something altogether less sinister and, potentially, more useful.

All about data privacy

“There is an immense amount of data in biobanks and it keeps increasing every day,” said Yelmen. “However, genomic data is sensitive data and accessing these biobanks can be difficult for researchers due to ethical concerns. The main goal of our work is to create high-quality surrogates of existing genome banks and provide a solution to this accessibility barrier within a safe ethical framework. It is important to note that our study was a first step: There is still work to do.”

Added Jay: “The idea behind our study is to start investigating whether releasing artificial genomes instead of the real ones could preserve the privacy of genome donors, while providing useful information to the population genetics community. [Possible] applications of artificial genomes could range from better understanding of our evolutionary past to providing insights in medical genetics, including a wider range of diversity.”

In some ways, the work is reminiscent of the trend, seen a couple of years ago, in which GANs were used to create images of imaginary people, animals, and more as epitomized by the generative website ThisPersonDoesNotExist.com. Only this time, of course, it involves actual genetic code, rather than simple pictures.

A paper describing the project, titled “Creating artificial human genomes using generative neural networks,” was recently published in the journal PLOS Genetics.

Luke Dormehl