Back in April 2023, Google revealed it was merging two of its research teams, Brain Team and DeepMind, into a single unit called Google DeepMind. This new team would be responsible for working on Google’s next-generation AI model, Gemini. The firm is now launching three versions of Gemini, two of which are available starting today.

In a blog post, Google officially introduced its new AI model, Gemini. Google describes Gemini as having state-of-the-art performance and says it was built from the ground up to be multimodal. As the company explains:

Until now, the standard approach to creating multimodal models involved training separate components for different modalities and then stitching them together to roughly mimic some of this functionality. These models can sometimes be good at performing certain tasks, like describing images, but struggle with more conceptual and complex reasoning. We designed Gemini to be natively multimodal, pre-trained from the start on different modalities. Then we fine-tuned it with additional multimodal data to further refine its effectiveness. This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state of the art in nearly every domain.

Gemini comes in three sizes, each tuned for different needs. The largest and most capable version, Gemini Ultra, is designed for highly complex tasks. Below that is Gemini Pro, which Google describes as its best model for scaling across a wide range of tasks. The third version, Gemini Nano, is meant to be the most efficient model for on-device tasks. Google says it optimized these three sizes for the first version of Gemini, which suggests other sizes may follow in the future.