Newly developed AI system can accurately judge a book by its cover

The tech world sure loves to disrupt conventional wisdom. Its latest victim? The old adage that you should never judge a book by its cover.

Setting out to disprove that sentiment, researchers at Japan’s Kyushu University have trained a neural network to predict which genre a book falls into simply by studying its cover.

“The purpose of this work is to determine if machines can learn the meaning behind book covers without textual clues,” researcher and paper co-author Brian Kenji Iwana told Digital Trends. “For this study, we took book cover images and classified them by genre using an artificial neural network. We also look at some of the hidden design rules of the covers found by the network.”

[Figure: judging-a-book graph. Credit: Kyushu University]

For their dataset, Iwana and colleague Seiichi Uchida used a total of 137,788 book covers for titles available for sale on Amazon. These fell into 20 different categories. Where a book was listed under multiple genre headings, the researchers simplified the task by using only its primary category.

Eighty percent of this data was then used to train the pair's four-layer neural network, leaving 20 percent for validating and testing it.
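For readers curious what an 80/20 split looks like in practice, here is a minimal sketch in Python. It is not the researchers' code: the function name `split_dataset`, the fixed random seed, and the use of integer IDs in place of real cover images are all assumptions made for illustration.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle the items and split them into (train, held_out) lists.

    Hypothetical helper for illustration only; the study's actual
    splitting procedure is not described in this article.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integer IDs stand in for the 137,788 cover images.
cover_ids = range(137_788)
train, held_out = split_dataset(cover_ids)
print(len(train), len(held_out))
```

The held-out portion would then be divided again between validation and testing. The article does not say how, so that step is left out here.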

More than 40 percent of the time, the algorithm placed the correct genre within its three best guesses, and it predicted the right genre on its first guess more than 20 percent of the time.
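These two figures are what machine learning researchers usually call top-3 and top-1 accuracy. The sketch below shows one way such scores could be computed. The function `top_k_accuracy`, the genre names, and the four toy predictions are invented for illustration and do not come from the study.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of examples whose true label is among the first k guesses."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy data (hypothetical): each row is a model's guesses, best guess first.
ranked = [
    ["cookbooks", "travel", "history"],
    ["romance", "biography", "history"],
    ["travel", "history", "cookbooks"],
    ["science", "romance", "travel"],
]
truth = ["cookbooks", "history", "biography", "romance"]

top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
print(top1, top3)  # 0.25 0.75
```

In this toy example only the first cover is right on the first guess, while three of the four have the correct genre somewhere in the top three.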

Unfortunately, the pair didn’t research how well humans do at the classification task (which is relatively straightforward for a genre like cookery books, but tougher when it comes to broader genres like biographies or memoirs). However, the algorithm performs significantly better than random guessing, which with 20 categories would pick the right genre only about 5 percent of the time.
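To put the comparison with random guessing in numbers, the following sketch works out the chance baseline for 20 categories, both exactly and by simulation. It assumes the guesser picks distinct genres uniformly at random. The function names are hypothetical and this is not part of the researchers' evaluation.

```python
import random
from fractions import Fraction


def random_top_k_baseline(num_classes, k):
    """Exact chance that the true class is among k distinct random guesses."""
    return Fraction(k, num_classes)


def simulate_random_guessing(num_classes, k, trials=100_000, seed=0):
    """Estimate the same chance by drawing random guesses many times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        truth = rng.randrange(num_classes)
        guesses = rng.sample(range(num_classes), k)  # k distinct classes
        hits += truth in guesses
    return hits / trials


print(float(random_top_k_baseline(20, 1)))  # 0.05
print(float(random_top_k_baseline(20, 3)))  # 0.15
print(simulate_random_guessing(20, 3))
```

Against baselines of 5 percent and 15 percent, the network's reported first-guess rate of more than 20 percent and top-three rate of more than 40 percent are both well above chance.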

“The idea came from our previous work with font and document recognition,” Iwana said. “We are particularly interested in pushing the field of machine learning into tasks that traditionally require human feelings, such as impression and design.”

There are multiple possible applications for this research. It could, for instance, be used to help classify digitized books in cases where labelled data is lacking. It could also (creative-minded designers beware!) be used to help find “rules” that more easily visually describe what a book is about — helpful for both machines and bookstore-browsing humans alike.

Longer term, it even opens up the possibility of algorithms being able to generate cover concepts by themselves.

“Our work shows that it’s possible to use machines to learn the relationship between book covers and genre,” Iwana concluded. “This can lead to tools used to help authors design book covers or to automate genre prediction. It’s one step closer to bringing machine learning into the field of design.”

Luke Dormehl
Former Digital Trends Contributor
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…