
Marimba-playing robot uses deep-learning AI to compose and perform its own music

When the inevitable robot invasion happens, we now know what the accompanying soundtrack will be — and we have to admit that it’s way less epic than the Terminator 2: Judgment Day theme. Unless you’re a massive fan of the marimba, that is!

That assertion is based on research coming out of the Georgia Institute of Technology, where engineers have developed a marimba-playing robot with four arms and eight sticks that is able to write and perform its own musical compositions. To do this, it uses a dataset of 5,000 pieces of music, combined with deep-learning neural networks.

“This is the first example of a robot composing its own music using deep neural networks,” Ph.D. student Mason Bretan, who first began working on the so-called Shimon robot seven years ago, told Digital Trends. “Unlike some of the other recent advances in autonomous music generation from research being done in academia and places like Google, which is all simulation done in software, there is an extra layer of complexity when a robotic system that lives in real physical three-dimensional space generates music. It not only needs to understand music in general, but also to understand characteristics about its embodiment and how to bring its musical ‘ideas’ to fruition.”


Training Shimon to generate new pieces of music involves first coming up with a numerical representation of small chunks of music, such as a few beats or a single measure, and then learning how to sequence these chunks. Two separate neural networks do the work: an “autoencoder” that comes up with a concise numerical representation of each chunk, and a long short-term memory (LSTM) network that models sequences of these chunks.
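To make the first of those two stages concrete, here is a minimal sketch of turning a measure into numbers and compressing it into a short code. It is not the Georgia Tech team's implementation. The piano-roll encoding, the 16-step by 12-pitch grid, the code size of 8, and the names `measure_to_vector` and `LinearAutoencoder` are all assumptions made for illustration, and a linear encoder fit with SVD stands in for the neural autoencoder the article describes.

```python
import numpy as np

def measure_to_vector(notes, steps=16, pitches=12):
    """Flatten one measure into a binary piano-roll vector.

    notes: iterable of (step, pitch_class) pairs. The grid size is an
    assumption for this sketch, not a detail from the article.
    """
    roll = np.zeros((steps, pitches))
    for step, pitch in notes:
        roll[step, pitch] = 1.0
    return roll.ravel()

class LinearAutoencoder:
    """Stand-in for a neural autoencoder: linear encode/decode fit with SVD."""

    def __init__(self, code_size):
        self.code_size = code_size

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # Rows of vt are the directions that capture the most variance.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[:self.code_size]
        return self

    def encode(self, X):
        # Long piano-roll vector -> short numerical code.
        return (X - self.mean_) @ self.components_.T

    def decode(self, Z):
        # Short code -> approximate piano-roll vector.
        return Z @ self.components_ + self.mean_

# Toy data: 20 random measures with up to six notes each (not real music).
rng = np.random.default_rng(0)
measures = np.stack([
    measure_to_vector(zip(rng.integers(0, 16, 6), rng.integers(0, 12, 6)))
    for _ in range(20)
])
ae = LinearAutoencoder(code_size=8).fit(measures)
codes = ae.encode(measures)
print(measures.shape, "->", codes.shape)  # (20, 192) -> (20, 8)
```

Each 192-value measure is reduced to an 8-value code. A sequence model would then work on those short codes rather than on raw notes.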

“These sequences come from what is seen in human compositions such as a Chopin concerto or Beatles’ piece,” Bretan continued. “The LSTM is tasked with predicting forward, which means given the first eight musical chunks, it must predict the ninth. If it is able to successfully do this, then we can provide the LSTM a starting seed and let it continue to predict and generate from there. When Shimon generates, it makes decisions that are not only based off this musical model, but also include information about its physical self so that its musical decisions are optimized for its specific physical constraints.”
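The "predict the ninth chunk from the first eight" setup Bretan describes can be sketched as a sliding window plus a generation loop. This is an illustration under stated assumptions, not Shimon's actual code. `make_training_pairs`, `generate`, and `predict_next` are hypothetical names, and a simple averaging function stands in for a trained LSTM. The step that factors in the robot's physical constraints is not modeled here.

```python
import numpy as np

WINDOW = 8  # from the article: eight chunks in, the ninth out

def make_training_pairs(codes, window=WINDOW):
    """Slide a window over one piece's chunk codes.

    Returns (inputs, targets): inputs[i] holds `window` consecutive chunks,
    and targets[i] is the chunk that follows them.
    """
    if len(codes) <= window:
        raise ValueError("piece is too short for the chosen window")
    inputs = np.stack([codes[i:i + window] for i in range(len(codes) - window)])
    targets = codes[window:]
    return inputs, targets

def generate(predict_next, seed, n_new, window=WINDOW):
    """Start from a seed and feed each prediction back in as context.

    predict_next is a hypothetical stand-in for the trained sequence model:
    it maps an array of recent chunk codes to the next chunk code.
    """
    sequence = [np.asarray(chunk, dtype=float) for chunk in seed]
    for _ in range(n_new):
        context = np.stack(sequence[-window:])
        sequence.append(predict_next(context))
    return np.stack(sequence)

# Toy piece: ten chunks, each a 2-value code (not real music).
codes = np.arange(20, dtype=float).reshape(10, 2)
X, y = make_training_pairs(codes)  # X: (2, 8, 2), y: (2, 2)

# Placeholder predictor: the average of the context, in place of an LSTM.
toy_predictor = lambda context: context.mean(axis=0)
piece = generate(toy_predictor, seed=codes[:WINDOW], n_new=4)
```

In a real system the predicted codes would be passed back through the autoencoder's decoder to recover playable notes.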

It’s pretty fascinating stuff. And while a music-generating bot might sound like it would interest only people studying music, the bigger questions it raises about computational creativity are only going to get more important as time goes on.

“Though we are focusing on music, the more general questions and applications pertain to understanding the processes of human creativity and decision-making,” Bretan said. “If we are able to replicate these processes, then we are getting closer to having a robot successfully survive in the real world, in which creative decision-making is a must when encountering new scenarios and problems each day.”
