What is an AI token?

A presenter at Google I/O shows information on a new AI project. (Image: Google)

Google recently announced that Gemini 1.5 Pro's context window would increase from 1 million tokens to 2 million. That sounds impressive, but what in the world is a token, anyway?

Contents

What is an AI token?
How do AI tokens work?
What are the different types of AI tokens?
What are the benefits of tokens?

Even chatbots need help processing the text they receive before they can work with concepts and communicate with you in a human-like fashion. Generative AI handles this with tokens, a system that breaks data down into pieces that AI models can digest more easily.

An infographic highlighting Gemini's 1 million token long context window capability. (Image: Google)

An AI token is the smallest unit that text is broken down into when it is processed by a large language model (LLM). A token can be a whole word, a punctuation mark, or a subword. Working in these units lets a model analyze and interpret text efficiently and then generate content in the same way, one token at a time. This is loosely similar to how a computer converts data into binary zeros and ones for easier processing. Tokens allow a model to find patterns and relationships among words and phrases so it can predict the terms likely to come next and respond in the context of your prompt.

When you input a prompt, a chatbot cannot interpret the raw text as is. The words and phrases must first be broken down into smaller pieces before the LLM can process the request. The prompt is converted into tokens, the request is submitted and analyzed, and a response is returned to you.

The process of turning text into tokens is called tokenization. There are many tokenization methods, which differ based on factors such as dictionary instructions, word combinations, and language. For example, the space-based tokenization method splits text up based on the spaces between words. The phrase “It’s raining outside” would be split into the tokens ‘It’s’, ‘raining’, ‘outside’.
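The space-based method described above can be sketched in a few lines of Python. This is only an illustration of that one method: the function name `space_tokenize` is made up for this example, and production LLM tokenizers generally rely on learned sub-word vocabularies rather than a plain split on spaces.

```python
def space_tokenize(text: str) -> list[str]:
    """Split text into tokens wherever whitespace occurs."""
    # str.split() with no arguments splits on any run of whitespace
    # and drops empty strings, so extra spaces do not create tokens.
    return text.split()


tokens = space_tokenize("It’s raining outside")
print(tokens)       # ['It’s', 'raining', 'outside']
print(len(tokens))  # 3
```

Note that punctuation stays attached to its word here, which is one reason simple space-based splitting is rarely used on its own.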

How do AI tokens work?

As a general rule of thumb in the generative AI space, one token equals approximately four characters of English text (about 3/4 of a word), so 100 tokens equals approximately 75 words. Other common conversions put one to two sentences at about 30 tokens, one paragraph at about 100 tokens, and 1,500 words at about 2,048 tokens.
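These conversions are simple arithmetic, so they are easy to sketch in Python. The constants below come straight from the rule of thumb in the text; the function names are hypothetical, and the results are rough estimates for English, not what any particular tokenizer would actually report.

```python
CHARS_PER_TOKEN = 4     # rule of thumb: ~4 English characters per token
WORDS_PER_TOKEN = 0.75  # rule of thumb: ~3/4 of a word per token


def estimate_tokens_from_words(word_count: int) -> int:
    """Estimate the token count for a given number of English words."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_tokens_from_chars(char_count: int) -> int:
    """Estimate the token count for a given number of English characters."""
    return round(char_count / CHARS_PER_TOKEN)


print(estimate_tokens_from_words(75))    # 100
print(estimate_tokens_from_words(1500))  # 2000
print(estimate_tokens_from_chars(400))   # 100
```

The 1,500-word estimate comes out at 2,000 tokens, which is in the same range as the 2,048 figure often quoted.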

Whether you’re a general user, a developer, or an enterprise, the AI program you’re using employs tokens to perform its tasks. Once you begin paying for generative AI services, what you are paying for is largely tokens, since usage is commonly metered and billed by the number of tokens processed.

Most generative AI brands also have basic rules around how tokens function on their AI models. Many companies set token limits, which cap the number of tokens that can be processed in one turn. If a request is larger than an LLM's token limit, the tool won’t be able to complete it in a single turn. For example, if you input a 10,000-word article for translation into a GPT model with a 4,096-token limit, it won’t be able to process the article fully and give a detailed answer. By the usual rule of thumb, the article alone comes to roughly 13,000 tokens, before counting the translated output.
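A rough way to see what a token limit means in practice is to estimate a text's token count and split it into pieces that each fit under the cap. The sketch below assumes the 0.75 words-per-token rule of thumb and uses hypothetical helper names. A real application would count tokens with the model's own tokenizer and leave room for the response as well.

```python
WORDS_PER_TOKEN = 0.75  # assumption: rule-of-thumb ratio for English text


def estimate_tokens(text: str) -> int:
    """Estimate tokens from the number of whitespace-separated words."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


def split_into_chunks(text: str, token_limit: int) -> list[str]:
    """Split text into word-based chunks that each fit the estimated limit."""
    # Convert the token budget into a word budget (at least one word).
    words_per_chunk = max(1, int(token_limit * WORDS_PER_TOKEN))
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


article = "word " * 10_000  # stand-in for a 10,000-word article
print(estimate_tokens(article))  # 13333

chunks = split_into_chunks(article, token_limit=4096)
print(len(chunks))  # 4
print(all(estimate_tokens(c) <= 4096 for c in chunks))  # True
```

Splitting on word boundaries keeps the example simple. It can cut a sentence in half, so a more careful version would break on sentence or paragraph boundaries instead.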

However, companies have quickly been advancing the capabilities of their LLMs, raising token limits with each new version. Google’s research-based BERT model had a maximum input length of 512 tokens. OpenAI’s GPT-3.5 LLM, which runs the free version of ChatGPT, has a max of 4,096 input tokens, while its GPT-4 LLM, which runs the paid version of ChatGPT, has a max of 32,768 input tokens. This equates to approximately 25,000 words or 50 pages of text.

Google’s Gemini 1.5 Pro, which provides audio functionality to the brand’s AI Studio, has a standard 128,000-token context window. The Claude 2.1 LLM has a context limit of up to 200,000 tokens, which equates to approximately 150,000 words or 500 pages of text.

What are the different types of AI tokens?

There are several types of tokens used in the generative AI space that allow LLMs to identify the smallest units available for analysis. Here are some of the main tokens that are of interest to an AI model.

Word Tokens are whole words that stand as single units on their own, such as “bird,” “house,” or “television.”
Sub-word Tokens are pieces of a word that has been split into smaller units, such as splitting Tuesday into “Tues” and “day.”
Punctuation Tokens represent punctuation marks, including commas (,), periods (.), and others.
Number Tokens represent numerical figures, such as the number “10.”
Special Tokens carry unique instructions rather than ordinary text, such as markers used when executing queries or structuring training data.
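To make the categories above concrete, here is a toy Python tokenizer that labels each piece of a sentence with its type. Everything in it is illustrative: the sub-word table, the `<start>` and `<end>` markers, and the function name are invented for this sketch, whereas real LLM tokenizers learn their vocabularies and special tokens from data.

```python
import re

# Hypothetical sub-word table; real tokenizers learn these splits from data.
SUBWORD_SPLITS = {"Tuesday": ["Tues", "day"]}

# Match runs of digits, runs of ASCII letters, or single punctuation marks.
PIECE_PATTERN = re.compile(r"\d+|[A-Za-z]+|[^\w\s]")


def tokenize_with_types(text: str) -> list[tuple[str, str]]:
    """Return (token, type) pairs for a piece of English text."""
    typed = []
    for piece in PIECE_PATTERN.findall(text):
        if piece.isdigit():
            typed.append((piece, "number"))
        elif piece.isalpha():
            if piece in SUBWORD_SPLITS:
                typed.extend((part, "sub-word") for part in SUBWORD_SPLITS[piece])
            else:
                typed.append((piece, "word"))
        else:
            typed.append((piece, "punctuation"))
    # Hypothetical special tokens marking where the sequence begins and ends.
    return [("<start>", "special")] + typed + [("<end>", "special")]


for token, kind in tokenize_with_types("Tuesday, 10 birds."):
    print(f"{token!r}: {kind}")
```

Running it on “Tuesday, 10 birds.” yields two sub-word tokens, a punctuation token, a number token, a word token, and a final punctuation token, wrapped in the two special markers.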

What are the benefits of tokens?

There are several benefits to tokens in the generative AI space. Primarily, they act as a connector between human language and computer language when working with LLMs and other AI processes. Tokens help models process large amounts of data at once, which is especially beneficial in enterprise spaces that use LLMs. Companies can work within token limits to optimize the performance of AI models. As future LLM versions are introduced, higher token limits, or larger context windows, will give models a larger memory.

Other benefits of tokens lie in the training of LLMs. Because tokens are small units, they make it easier to optimize data-processing speed. Because models are trained to predict tokens, they build up a better grasp of concepts and produce better sequences over time. Tokens also help bring multimodal inputs such as images, videos, and audio into LLMs and text-to-speech chatbots alongside text.

Tokens also have some data security and cost-efficiency benefits, since their encoded representation can help protect vital data and longer text can be truncated into a simplified version.