What is Midjourney AI and how does it work?
What if you could conjure an image straight out of your imagination? You now can, within a matter of minutes, thanks to AI image generators like Midjourney. It doesn’t matter if you lack artistic skills or have never held a paintbrush in your life. Artificial intelligence can create digital art in a minute or two, and all you need is a bit of text that describes the image you have in mind. But how does Midjourney actually work? Here’s everything you need to know.
What is Midjourney?
Midjourney is an example of generative AI that can convert natural language prompts into images. It’s only one of many machine learning-based image generators that have emerged of late. Despite that, it has risen to become one of the biggest names in AI alongside DALL-E and Stable Diffusion.
With Midjourney, you can create high-quality images from simple text-based prompts. You don’t need any specialized hardware or software to use Midjourney either, as it works entirely through the Discord chat app. The only downside? You’ll have to pay at least a little bit before you can start generating images. That’s unlike much of the competition, which generally provides at least a few image generations for free.
Still, the barrier to entry with Midjourney is quite low, and anyone can use it to generate realistic-looking images within a matter of minutes. The results can range from uncanny to visually stunning, depending on the prompt.
Midjourney can generate stunning and convincing-looking images from a simple text description.
In some cases, images from Midjourney have even deceived experts in photography and other domains. Likewise, you may have seen some extremely convincing AI-generated images on social media. Examples range from Pope Francis dressed in a puffer jacket to Trump supposedly getting arrested days before the actual event. But we’ve also seen some creative generations like a Star Wars scene in the style of Wes Anderson (pictured above).
Unlike DALL-E, which is backed by ChatGPT’s creator OpenAI, Midjourney describes itself as a self-funded and independent project that hasn’t received any external funding to date. OpenAI, on the other hand, has raised as much as $10 billion from Microsoft and a handful of other investors. Given Midjourney’s humble roots, its results are quite impressive.
How does Midjourney work?
We don’t know everything about Midjourney’s inner workings because it’s closed-source and runs on proprietary code. That said, we know enough about the underlying technology to offer a general explanation.
Midjourney relies on two relatively new machine learning technologies, namely large language models and diffusion models. You may already be familiar with the former if you’ve used generative AI chatbots like ChatGPT. A large language model first helps Midjourney understand the meaning of the words you type into your prompts. That meaning is then converted into what is known as a vector, which you can imagine as a numerical version of your prompt. Finally, this vector helps guide another complex process known as diffusion.
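To make the idea of a “numerical version of your prompt” more concrete, here is a deliberately simple Python sketch. It is not how Midjourney or any large language model encodes text: the function name toy_prompt_vector and the word-hashing trick are assumptions made purely for illustration, and the result captures no real meaning. It only shows that a prompt of any length can be turned into a fixed-size list of numbers.

```python
import hashlib
import math

def toy_prompt_vector(prompt, dims=8):
    """Map a prompt to a fixed-length list of numbers (illustrative only)."""
    vec = [0.0] * dims
    for word in prompt.lower().split():
        # Hash each word into one of `dims` buckets and count it there.
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    # Scale the vector to length 1 so long and short prompts are comparable.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

print(toy_prompt_vector("white cats set in a post-apocalyptic Times Square"))
```

A real language model produces vectors in which similar ideas land close together, which is what lets the vector steer the image generation step.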
Midjourney uses a diffusion model to turn random noise into beautiful art.
Diffusion has only become popular within the past decade or so, which explains the sudden barrage of AI image generators. In a diffusion model, you have a computer gradually add random noise to its training dataset of images. Over time, it learns how to recover the original image by reversing the noise. The idea is that with enough training, such a model can learn how to generate entirely brand-new images.
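The noise-adding half of that training process can be sketched in a few lines of Python. This is a minimal illustration that assumes the noise schedule commonly used in published diffusion research. Midjourney’s actual settings are not public, so the step count, the beta_start and beta_end values, and the five-pixel “image” are all assumptions.

```python
import math
import random

def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original image's signal that survives t noising steps."""
    product = 1.0
    for i in range(t):
        # Assumed linear schedule: each step adds slightly more noise than the last.
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        product *= 1.0 - beta
    return product

def add_noise(pixels, t, num_steps, rng):
    """Return a noisy copy of `pixels` at step t, plus the noise that was mixed in."""
    keep = alpha_bar(t, num_steps)
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(keep) * p + math.sqrt(1.0 - keep) * n
             for p, n in zip(pixels, noise)]
    return noisy, noise

rng = random.Random(0)
image = [0.0, 0.25, 0.5, 0.75, 1.0]  # a toy 5-pixel "image"
for t in (0, 250, 500, 1000):
    noisy, _ = add_noise(image, t, 1000, rng)
    print(t, round(alpha_bar(t, 1000), 4), [round(v, 2) for v in noisy])
```

During training, a model is shown the noisy version and asked to work out what noise was added. By the last step almost none of the original image is left, which is why a trained model can later start from pure static.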
So what does it look like from the perspective of an AI image generator? When you enter a text prompt like “white cats set in a post-apocalyptic Times Square,” it starts off with a field of visual noise. You can think of this first step as equivalent to television static. The image doesn’t look like anything you’ve asked for at this point. However, a trained AI model then uses latent diffusion to subtract the noise in steps. Eventually, it will yield a picture that resembles objects and ideas in the real world.
As a side note, this is also why you typically need to wait a minute or two for an AI-generated image to fully develop. If you stop the process earlier, you’ll get a noisy image that hasn’t gone through enough denoising steps.
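The step-by-step denoising described above, and what happens when it stops early, can be imitated with a toy Python loop. This sketch assumes a simple deterministic update of the kind used by some published diffusion samplers, and it cheats in one important way: the stand-in “model” already knows the finished picture, whereas a real generator has to predict it from the prompt. The names signal_kept, denoise and error are illustrative, not part of any Midjourney API.

```python
import math
import random

def signal_kept(t, num_steps):
    """Toy schedule: 1.0 at t=0 (clean image), 0.0 at t=num_steps (pure noise)."""
    return 1.0 - t / num_steps

def denoise(target, num_steps, stop_at=0, seed=0):
    """Start from static and remove noise step by step, stopping at `stop_at`."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # television-static starting point
    for t in range(num_steps, stop_at, -1):
        # Stand-in for the trained, prompt-guided network: it simply "knows" the target.
        predicted_clean = target
        keep_now = signal_kept(t, num_steps)
        keep_next = signal_kept(t - 1, num_steps)
        # Noise implied by the current image and the predicted clean image.
        noise = [(xi - math.sqrt(keep_now) * c) / math.sqrt(1.0 - keep_now)
                 for xi, c in zip(x, predicted_clean)]
        # Re-mix with slightly more signal and slightly less noise.
        x = [math.sqrt(keep_next) * c + math.sqrt(1.0 - keep_next) * n
             for c, n in zip(predicted_clean, noise)]
    return x

def error(a, b):
    """Largest per-pixel difference between two images."""
    return max(abs(p - q) for p, q in zip(a, b))

target = [0.0, 0.25, 0.5, 0.75, 1.0]
finished = denoise(target, num_steps=50)
stopped_early = denoise(target, num_steps=50, stop_at=25)
print(error(finished, target), error(stopped_early, target))
```

The run that goes through all 50 steps lands on the target, while the run that stops halfway is still a blend of picture and static.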
How much does Midjourney cost?
While we’ve seen chatbots like ChatGPT and Bing Chat offer nearly unlimited usage for free, the same cannot be said for image generators. Virtually all of them have some limits in place, with Midjourney not even offering a free trial. This is because each image generation task requires a lot of computing power, specifically graphics processing units (GPUs). Furthermore, each GPU has finite video memory, which is used in large amounts for the denoising process.
So with that in mind, it’s not surprising that a state-of-the-art AI image generator will cost you some money. We have a dedicated guide on Midjourney’s pricing and subscription tiers, but you’ll have to pay a minimum of $10 per month. That nets you 3.3 hours of GPU time, good for roughly 200 image generations. The most expensive plan, meanwhile, gets you 60 hours of fast GPU time at $120 per month.
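A quick back-of-the-envelope calculation shows what those figures work out to. The short Python snippet below uses only the numbers quoted above. The 200-image figure is a rough estimate, so treat the per-image results as approximate.

```python
# Plan figures as quoted in the article.
plans = {
    "cheapest": {"price_usd": 10, "gpu_hours": 3.3},
    "most expensive": {"price_usd": 120, "gpu_hours": 60},
}
images_on_cheapest = 200  # the article's rough estimate

minutes_per_image = plans["cheapest"]["gpu_hours"] * 60 / images_on_cheapest
usd_per_image = plans["cheapest"]["price_usd"] / images_on_cheapest
usd_per_gpu_hour = {name: p["price_usd"] / p["gpu_hours"] for name, p in plans.items()}

print(f"About {minutes_per_image:.2f} GPU minutes per image")
print(f"About ${usd_per_image:.2f} per image on the cheapest plan")
for name, rate in usd_per_gpu_hour.items():
    print(f"{name}: ${rate:.2f} per GPU hour")
```

On these numbers, each image takes about a minute of GPU time, and an hour of GPU time costs roughly a third less on the most expensive plan than on the cheapest one.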
Midjourney costs a minimum of $10 per month, but you'll find better value in the higher-end plans.
Midjourney’s higher-end plans grant you unlimited images in Relaxed mode, but you’ll have to wait as long as 10 minutes. If you don’t need the absolute best quality, we recommend checking out the many Midjourney alternatives. While most free options haven’t caught up to Midjourney yet, they’re still plenty of fun to use.
Midjourney was trained on existing image samples, including art from various sources, to generate brand-new pictures. Some artists believe that AI image generators have infringed on their copyright by using their work for training. However, the other side argues that the training process falls under the category of fair use.
Midjourney cannot create a full video. But if you only want a short video of Midjourney’s image generation process, you can add the --video parameter to the end of your prompts.
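If you assemble prompts in a script before pasting them into Discord, adding the parameter is a one-line job. The Python helper below is a hypothetical convenience function, not part of any Midjourney tool. It only places --video at the end of the prompt text, as described above.

```python
def with_video_flag(description):
    """Append the --video parameter to the end of a prompt if it is not there yet."""
    prompt = description.strip()
    if not prompt.endswith("--video"):
        prompt += " --video"
    return prompt

print(with_video_flag("white cats set in a post-apocalyptic Times Square"))
```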
Midjourney uses a machine learning technique known as diffusion, but it’s unclear if it’s partially based on the open-source Stable Diffusion model.
Midjourney is not open source. It is a closed-source, proprietary tool developed by a San Francisco-based research startup that aims to become profitable.
Midjourney is owned by an independent research firm with the same name. The image generator was founded in San Francisco by David Holz, who also co-founded the hand-tracking company Leap Motion a decade prior.