Google's latest AI model would rather bluff than admit confusion
- Gemini 3 Flash often invents answers instead of admitting when it doesn’t know something
- The problem arises with factual or high‑stakes questions
- But it still tests as the most accurate and capable AI model
Gemini 3 Flash is fast and clever. But if you ask it something it doesn’t actually know – something obscure or tricky or just outside its training – it will almost always try to bluff its way through, according to a recent evaluation from the independent testing group Artificial Analysis.
Gemini 3 Flash scored 91% on the "hallucination rate" portion of the AA-Omniscience benchmark. That means when it didn't have the answer, it gave one anyway almost all the time, and that answer was entirely fictional.
AI chatbots making things up has been an issue since they first debuted. Knowing when to stop and say "I don't know" is just as important as knowing how to answer in the first place, and right now Gemini 3 Flash doesn't do it very well. That's what this test measures: whether a model can tell actual knowledge from a guess.
To be clear, Gemini's high hallucination rate doesn't mean 91% of its total answers are false. It means that in situations where the correct answer would be "I don't know," it fabricated an answer 91% of the time. That's a subtle but important distinction, and one with real-world implications, especially as Gemini is integrated into more products like Google Search.
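That distinction between overall accuracy and hallucination rate is easy to get wrong, so here is a minimal sketch of the arithmetic. The function name, data shape, and sample figures are hypothetical illustrations, not the actual AA-Omniscience methodology: the metric only counts questions the model could not genuinely answer, and measures how often it fabricated a response instead of abstaining.

```python
def hallucination_rate(responses):
    """responses: list of dicts with keys:
         'model_knows' - True if the correct answer was within the model's knowledge
         'abstained'   - True if the model declined to answer ("I don't know")
    Returns the fraction of unanswerable questions that got a fabricated answer.
    """
    # Only questions the model could not actually answer count toward the metric.
    unknowns = [r for r in responses if not r["model_knows"]]
    if not unknowns:
        return 0.0
    fabricated = sum(1 for r in unknowns if not r["abstained"])
    return fabricated / len(unknowns)

# Toy example: 10 unanswerable questions, and the model abstains on only 1 of them.
sample = [{"model_knows": False, "abstained": (i == 0)} for i in range(10)]
print(hallucination_rate(sample))  # 0.9
```

Note that a model could score well on overall accuracy and still post a high number here, since the denominator excludes every question it answered correctly.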
"Ok, it's not only me. Gemini 3 Flash has a 91% hallucination rate on the Artificial Analysis Omniscience Hallucination Rate benchmark!? Can you actually use this for anything serious? I wonder if the reason Anthropic models are so good at coding is that they hallucinate much… https://t.co/b3CZbX9pHw" — December 18, 2025
This result doesn't diminish the power and utility of Gemini 3. The model remains the highest-performing in general-purpose tests and ranks alongside, or even ahead of, the latest versions of ChatGPT and Claude. It just errs on the side of confidence when it should be modest.
Overconfident answering crops up in Gemini's rivals as well. What makes Gemini's number stand out is how often it happens in these uncertainty scenarios: cases where there's simply no correct answer in the training data and no definitive public source to point to.
Hallucination Honesty
Part of the issue is that generative AI models are largely word-prediction tools, and predicting the next word is not the same as evaluating truth. That makes their default behavior to keep producing words, even when saying "I don't know" would be more honest.
OpenAI has started addressing this, training its models to recognize what they don't know and say so clearly. It's a tough thing to train, because reward models don't typically value a blank response over a confident (but wrong) one. Still, OpenAI has made it a goal for the development of future models.
And Gemini does usually cite sources when it can. But even then, it doesn’t always pause when it should. That wouldn’t matter much if Gemini were just a research model, but as Gemini becomes the voice behind many Google features, being confidently wrong could affect quite a lot.
There's also a design choice here. Many users expect their AI assistant to respond quickly and smoothly, and saying "I'm not sure" or "Let me check on that" might feel clunky in a chatbot context. But it's probably better than being misled. Generative AI still isn't always reliable, so double-checking any AI response is always a good idea.
Eric Hal Schwartz is a freelance writer for TechRadar with more than 15 years of experience covering the intersection of the world and technology. For the last five years, he served as head writer for Voicebot.ai and was on the leading edge of reporting on generative AI and large language models. He's since become an expert on the products of generative AI models, such as OpenAI's ChatGPT, Anthropic's Claude, Google Gemini, and every other synthetic media tool. His experience runs the gamut of media, including print, digital, broadcast, and live events. Now, he's continuing to tell the stories people want and need to hear about the rapidly evolving AI space and its impact on their lives. Eric is based in New York City.