How accurate is ChatGPT? Should you trust its responses?

ChatGPT can deliver mostly accurate responses, if you have deep enough pockets.
Published on April 22, 2024

Modern chatbots like ChatGPT can output dozens of words every second, making them invaluable tools for researching and analyzing large amounts of information. With over 500GB of training data and an estimated 300 billion words under its belt, the AI language model can answer many factual questions too. But as human as ChatGPT’s responses may sound, one crucial question remains: how accurate is the information it provides?

While ChatGPT can be impressively informative most of the time, you’ve probably heard of the many controversies surrounding generative AI. From racial biases to harmful content, there’s a troubled history to consider before trusting any AI-generated output.

Is ChatGPT accurate?

Yes, ChatGPT has the potential to be accurate, especially for factual queries with clear answers. When talking about long-established information, ChatGPT can fetch relevant data from its training and deliver truthful responses. For a question like “What is the capital of France?”, you’re very likely to get the correct answer.

However, chatbots like ChatGPT often fabricate information when they encounter a novel or difficult question. This is because generative language models are designed to mimic the way humans write, not the way we think. Consequently, they have limited logical reasoning capabilities.

ChatGPT hallucinates less often than a year ago, but you still need to watch out.

The problem with ChatGPT’s accuracy runs deeper than you’d think. It often weaves in entirely fictional details and invents convincing-sounding factoids in response to certain prompts. The chatbot’s creator has put several safeguards in place to prevent hallucinations, but as our tests will show later in this article, they aren’t completely effective.

If you’re after empirical data, several studies have tested ChatGPT’s accuracy extensively, and one clear trend emerges: ChatGPT is surprisingly accurate on typical questions. In one medical study, for example, the chatbot’s answers received a median accuracy rating of 5.5 on a 6-point scale.

However, the routine updates ChatGPT receives can also harm its accuracy and usefulness. In a separate study, UC Berkeley and Stanford University researchers found that the chatbot’s accuracy at identifying prime numbers dropped from an impressive 84% to just 51% within three months. In short, you cannot and should not trust ChatGPT’s responses, at least not without fact-checking them first.

How to improve ChatGPT’s accuracy

If you’re only an occasional ChatGPT user, you may have never considered upgrading to the chatbot’s paid tier. However, doing so can improve its accuracy considerably and should top your priority list if you rely on the chatbot’s responses. This is because the $20-per-month ChatGPT Plus subscription unlocks access to the GPT-4 Turbo language model.

The GPT-4 language model is far more capable than its predecessor, GPT-3.5, which powers the base chatbot experience even today. According to OpenAI, the newer model scored in the 89th percentile on SAT Math, the 90th percentile on the Uniform Bar Exam, and the 80th percentile on the GRE Quantitative. Almost all of these results are significantly better than those of GPT-3.5.

ChatGPT-4 delivers much more accurate results, but still falls behind some human experts.

Results in the 80th to 90th percentile mean that GPT-4’s accuracy doesn’t surpass human experts in their respective fields. However, ChatGPT Plus also unlocks web browsing support, which allows the chatbot to consult Wikipedia and other online sources. You can think of it as live research since it’s similar to how we find the right answer through a Google search. So just how accurate is ChatGPT and is the Plus tier worth paying for? Let’s find out.

ChatGPT 4 accuracy tested: Free vs Plus compared

As I mentioned earlier, ChatGPT can deliver significantly more accurate responses with GPT-4 and Browsing enabled. I asked the chatbot a handful of factual questions, some particularly obscure, to test whether or not I could get a reliably accurate answer.

  • Question 1: Is 17077 a prime number? Think step by step and then answer [Yes] or [No].

A recent ChatGPT update added chain-of-thought reasoning to the chatbot, allowing it to mimic human reasoning. That seems to have paid off, as both versions of ChatGPT correctly identified 17077 as a prime number. However, the paid version of the chatbot wrote a piece of custom Python code to perform the calculations. While the code didn’t change the result, it did make the answer feel more trustworthy.
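For readers who want to check the answer themselves, here is a minimal sketch of a primality test in Python. It is not the code ChatGPT generated, which isn't reproduced in this article. It assumes plain trial division, and the function name `is_prime` is our own.

```python
import math


def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division (illustrative helper)."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Any composite n has a divisor no larger than its integer square root,
    # so only odd candidates up to that bound need checking.
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


print(is_prime(17077))  # prints True
```

Trial division is slow for very large numbers, but for a five-digit input like this one it only has to test odd divisors up to 130.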

  • Question 2: Does the Setouchi Area Pass cover any local transport in Osaka?

With many of us using ChatGPT for travel advice, I decided to ask a relatively obscure question in that domain. Unfortunately, the base GPT-3.5 model responded inaccurately and only admitted fault when I suggested the correct answer. However, switching to ChatGPT-4 changed the outcome, immediately giving me the correct answer. Still, can the chatbot replace manual research entirely? I’m on the fence, especially since rival chatbots like Perplexity AI cite their sources.

  • Question 3: Select two random integers between 2459 and 3593 and multiply them

Asking a mathematical question will almost always trip up ChatGPT, and that’s exactly what happened with GPT-3.5, the free version of the chatbot. It delivered a plausible-sounding response (2865×3035 = 8,697,975), but that was off by 2,700 from the true answer (8,695,275). ChatGPT-4 used Python code once again to find the right answer, but chances are that it would’ve failed without outside help too.
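The arithmetic is easy to verify independently. The short Python sketch below recomputes the product and compares it with the figure GPT-3.5 gave. The variable names are our own, and the two integers are the ones the chatbot picked in our test.

```python
# The two integers GPT-3.5 chose, and the product it claimed.
a, b = 2865, 3035
claimed = 8_697_975

actual = a * b            # exact integer multiplication
error = claimed - actual  # how far off the chatbot's answer was

print(actual)  # prints 8695275
print(error)   # prints 2700
```

A wrong answer that lands within a few thousand of the real one is easy to miss at a glance, which is why a quick check like this is worth the few seconds it takes.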


In summary, remember that ChatGPT will almost always try to deliver a solution to your problem or question without caring much about its accuracy. It will only sometimes admit that it cannot answer a question or doesn’t know enough about the subject matter. Otherwise, it can just as easily hallucinate information without any obvious indication.
