TL;DR OpenAI has unveiled GPT-4o mini, a scaled-back model variant based on GPT-4o.

The new model is smarter, more knowledgeable, and cheaper than GPT-3.5 Turbo.

Free ChatGPT users stand to benefit the most as ChatGPT will now use the GPT-4o mini model instead of GPT-3.5 Turbo.

Ever since ChatGPT’s release in late 2022, the chatbot has relied on the GPT-3.5 language model for its responses by default. That finally changes today as OpenAI has announced a new GPT-4o mini model that replaces GPT-3.5 Turbo as the base ChatGPT model. As its name suggests, GPT-4o mini is a scaled-back variant of the full-fledged GPT-4o model released this May. Despite the apparent size reduction, GPT-4o mini delivers better responses than GPT-3.5 Turbo at a lower cost.

OpenAI’s announcement says that GPT-4o mini performs universally better than the GPT-3.5 Turbo it replaces. In benchmarks shared by the AI startup, the new lightweight model delivers anywhere from 10% to 70% better scores.

The biggest accuracy jumps can be observed in math, coding, and chain-of-thought or reasoning benchmarks. This should result in fewer hallucinations and more grounded responses from the chatbot. It also sports a more recent knowledge cut-off date of October 2023.

Even though GPT-4 was released in early 2023, access to it was locked behind the ChatGPT Plus paywall for about a year. It wasn’t until the release of GPT-4o in May 2024 that free users could use a GPT-4 class model with additional features like web browsing for up-to-date information.

If you’re a free user today, ChatGPT will offer a handful of GPT-4o responses before switching to a less sophisticated (and cheaper to operate) language model. We’ve seen several AI companies adopt a similar strategy with their flagship products: Perplexity’s Pro Search and Anthropic’s Claude 3.5 Sonnet offer similarly limited responses for free every few hours.

Today’s announcement means that ChatGPT will continue to offer a limited number of GPT-4o responses for free. After you exhaust that limited quota, however, the chatbot will now default to GPT-4o mini instead of GPT-3.5 Turbo.

While OpenAI hasn’t revealed how it achieved GPT-4o mini’s impressive performance, the new model is clearly much more efficient than any of the company’s previous models.

The new language model will cost developers 60% less than GPT-3.5 Turbo, or about 60 cents per one million output tokens and 15 cents per one million input tokens. GPT-4o mini is also an order of magnitude cheaper than GPT-4o, which starts at $5 per one million input tokens. A token is the smallest unit of data understood by language models. In English text, one token works out to roughly three-quarters of a word, though the ratio varies depending on the language.
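To put those prices in perspective, here is a rough sketch in Python of how a developer might estimate a bill from the figures quoted above. The function name, the workload numbers, and the reading of GPT-4o's $5 figure as an input-token price are assumptions made for illustration, not anything taken from OpenAI's announcement.

```python
# Prices in USD per one million tokens, as quoted in this article.
GPT_4O_MINI_PRICES = {"input": 0.15, "output": 0.60}

# Assumption: the $5 figure for GPT-4o is treated as its input-token price.
GPT_4O_INPUT_PRICE = 5.00

TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens, output_tokens, prices=GPT_4O_MINI_PRICES):
    """Return the estimated cost in USD for a given number of tokens."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / TOKENS_PER_MILLION


# Hypothetical workload: 10,000 requests, each with 2,000 input tokens
# and 500 output tokens.
requests = 10_000
cost = estimate_cost(2_000 * requests, 500 * requests)
print(f"Estimated GPT-4o mini cost: ${cost:.2f}")

# How many times cheaper GPT-4o mini is than GPT-4o on input tokens.
input_price_ratio = GPT_4O_INPUT_PRICE / GPT_4O_MINI_PRICES["input"]
print(f"GPT-4o input tokens cost about {input_price_ratio:.0f}x more")
```

Under these assumptions, the 20 million input tokens and 5 million output tokens come to about $3 each, so the hypothetical workload costs roughly $6 in total. Actual bills depend on how a given text is split into tokens.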

