I wish I had a nickel for every time ChatGPT, Claude, or Gemini told me I’d hit the nail on the head, stumbled onto a genius idea, or otherwise patted me on the back for a half-formed thought or ill-conceived plan.
Flattery and premature congratulations are common foibles of generative AI chatbots, with some models more prone to being “yes-bots” than others. But even as LLM providers have become aware of AI sycophancy and are training their models to be more critical, it’s still easy to get an AI to enthusiastically endorse a shaky idea that doesn’t deserve it.
Luckily, there’s a way of prompting that can make even the most obsequious AI models stop in their tracks. This kind of prompting goes by various names (I’ve heard it called “failure-first” prompting as well as “inversion” prompting), and it’s frequently used by coders looking to “pressure-test” the dubious answers of an AI coding agent.
There are many different variations of it, but they all follow roughly the same formula: asking the AI to first consider possible points of failure before offering its solution, recommendation, or plan.
Here’s one example from the /r/ChatGPTPromptGenius subreddit:
Before answering, list what would break this fastest, where the logic is weakest, and what a skeptic would attack. Then give the corrected answer.
Here’s another variation, proposed by a member of the University of Iowa’s AI Support Team:
Pretend you disagree with this recommendation. What’s the strongest counterargument?
And here’s yet another, as proposed by my own custom-built AI personal assistant:
Before providing your final recommendation, identify 3-5 specific ways your proposed solution could fail or where the logic is most likely to break. Act as a harsh skeptic or a “Red Team” auditor. Only after listing and explaining these failure modes should you provide the final solution, incorporating safeguards against these specific risks.
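If you find yourself pasting one of these prompts over and over, the pattern is easy to wrap in a small helper. The Python sketch below is only an illustration: the function name `failure_first_prompt`, the `low` and `high` parameters, and the exact preamble wording are my own assumptions, and it deliberately stops at building the prompt text rather than calling any particular chatbot API.

```python
# Illustrative sketch only: the names and wording here are assumptions,
# not part of any chatbot's official API.
PREAMBLE_TEMPLATE = (
    "Before providing your final recommendation, identify {low}-{high} specific "
    "ways your proposed solution could fail or where the logic is most likely "
    "to break. Act as a harsh skeptic or a 'Red Team' auditor. Only after "
    "listing and explaining these failure modes should you provide the final "
    "solution, incorporating safeguards against these specific risks."
)


def failure_first_prompt(request: str, low: int = 3, high: int = 5) -> str:
    """Prefix a request with a failure-first instruction."""
    cleaned = request.strip()
    if not cleaned:
        raise ValueError("request must not be empty")
    if low < 1 or high < low:
        raise ValueError("expected 1 <= low <= high")
    # The skeptic instructions come first so the model reads them
    # before it reads the request itself.
    preamble = PREAMBLE_TEMPLATE.format(low=low, high=high)
    return f"{preamble}\n\nRequest: {cleaned}"


if __name__ == "__main__":
    # Paste the printed text into the chatbot of your choice.
    print(failure_first_prompt("Review my plan to migrate our database this weekend."))
```

Putting the skeptic instructions ahead of the request mirrors the “failure-first” ordering of the prompts above; whether that ordering matters for a given model is something to test rather than assume.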
Interestingly, many of those who’ve adopted “pressure-testing” or “inverse prompting” credit the mental models championed by investor Charlie Munger, the longtime Berkshire Hathaway vice chairman and business partner of Warren Buffett.
One of Munger’s favorite mental models was “invert, always invert.” Boiled down, it says that rather than first considering how to achieve a goal, you should instead focus on how you might fail at it.
I’ve tried this “stress test” prompt plenty of times myself, and it almost always makes my AI companion hit the brakes and poke holes in its own arguments before proceeding.
“Let’s put the initial plan through the wringer,” Gemini said when I challenged it with a “failure-first” prompt recently, although not before gushing that “I love this approach.”
Seems I hit the nail on the head yet again.

