Marimba-playing robot uses deep-learning AI to compose and perform its own music

When the inevitable robot invasion happens, we now know what the accompanying soundtrack will be — and we have to admit that it’s way less epic than the Terminator 2: Judgment Day theme. Unless you’re a massive fan of the marimba, that is!

That assertion is based on research coming out of the Georgia Institute of Technology, where engineers have developed a marimba-playing robot with four arms and eight sticks that is able to write and perform its own musical compositions. To do this, it draws on a dataset of 5,000 pieces of music, combined with deep neural networks.

“This is the first example of a robot composing its own music using deep neural networks,” Ph.D. student Mason Bretan, who first began working on the so-called Shimon robot seven years ago, told Digital Trends. “Unlike some of the other recent advances in autonomous music generation from research being done in academia and places like Google, which is all simulation done in software, there is an extra layer of complexity when a robotic system that lives in real physical three-dimensional space generates music. It not only needs to understand music in general, but also to understand characteristics about its embodiment and how to bring its musical ‘ideas’ to fruition.”

Training Shimon to generate new pieces of music involves first coming up with a numerical representation of small chunks of music, such as a few beats or a single measure, and then learning how to sequence these chunks. Two separate neural networks are used for the work: one is an “autoencoder” that comes up with a concise numerical representation of each chunk, and the other is a long short-term memory (LSTM) network that models sequences of these chunks.
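
To make the idea of a "numerical representation of small chunks" concrete, here is a minimal Python sketch. It is not the Georgia Tech team's code: the note format, the four-beat measure, and the helper names are all assumptions, and the learned autoencoder is replaced by a simple hand-built pitch-class histogram purely for illustration.

```python
from typing import List, Tuple

# Assumed note format (not from the article): (onset in beats, MIDI pitch).
Note = Tuple[float, int]


def split_into_measures(notes: List[Note], beats_per_measure: float = 4.0) -> List[List[Note]]:
    """Group notes by the measure their onset falls in."""
    if not notes:
        return []
    n_measures = int(max(onset for onset, _ in notes) // beats_per_measure) + 1
    measures: List[List[Note]] = [[] for _ in range(n_measures)]
    for onset, pitch in notes:
        measures[int(onset // beats_per_measure)].append((onset, pitch))
    return measures


def pitch_class_vector(measure: List[Note]) -> List[float]:
    """Hand-built 12-number summary of a measure.

    This is a stand-in for the autoencoder: a real autoencoder would
    learn its compact representation from data instead.
    """
    counts = [0.0] * 12
    for _, pitch in measure:
        counts[pitch % 12] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# A made-up six-note melody spanning two measures.
melody = [(0.0, 60), (1.0, 64), (2.0, 67), (3.0, 60), (4.0, 62), (5.0, 65)]
chunks = [pitch_class_vector(m) for m in split_into_measures(melody)]
```

Each entry of `chunks` is one fixed-length vector per measure, which is the kind of input a sequence model can then learn to order.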

“These sequences come from what is seen in human compositions such as a Chopin concerto or Beatles’ piece,” Bretan continued. “The LSTM is tasked with predicting forward, which means given the first eight musical chunks, it must predict the ninth. If it is able to successfully do this, then we can provide the LSTM a starting seed and let it continue to predict and generate from there. When Shimon generates, it makes decisions that are not only based off this musical model, but also include information about its physical self so that its musical decisions are optimized for its specific physical constraints.”
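
The "given eight chunks, predict the ninth" setup Bretan describes can be sketched in a few lines of Python. Everything here is illustrative: chunks are plain integers rather than learned vectors, `repeat_oldest` is a hypothetical stand-in for the trained LSTM, and the robot's physical constraints are not modeled at all.

```python
from typing import Callable, List, Sequence, Tuple

WINDOW = 8  # the article's example: eight chunks in, the ninth out


def make_training_pairs(sequence: Sequence[int], window: int = WINDOW) -> List[Tuple[List[int], int]]:
    """Slide a window over a piece to get (context, next chunk) pairs."""
    return [
        (list(sequence[i:i + window]), sequence[i + window])
        for i in range(len(sequence) - window)
    ]


def generate(
    seed: Sequence[int],
    predict_next: Callable[[List[int]], int],
    steps: int,
    window: int = WINDOW,
) -> List[int]:
    """Start from a seed and repeatedly append the model's prediction."""
    out = list(seed)
    for _ in range(steps):
        out.append(predict_next(out[-window:]))
    return out


def repeat_oldest(context: List[int]) -> int:
    """Hypothetical predictor standing in for the LSTM: echo the oldest chunk."""
    return context[0]


piece = list(range(10))            # a made-up piece of ten chunk IDs
pairs = make_training_pairs(piece)  # two (eight-chunk context, target) pairs
generated = generate(piece[:8], repeat_oldest, steps=4)
```

The training pairs are what a sequence model would learn from; `generate` shows the seeding step, where each new prediction is fed back in as context for the next one.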

It’s pretty fascinating stuff. And while the idea of a music-generating bot might sound of interest only to people studying music, the bigger questions it raises about computational creativity are only going to get more important as time goes on.

“Though we are focusing on music, the more general questions and applications pertain to understanding the processes of human creativity and decision-making,” Bretan said. “If we are able to replicate these processes, then we are getting closer to having a robot successfully survive in the real world, in which creative decision-making is a must when encountering new scenarios and problems each day.”

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…