

When Israel and the United States launched coordinated strikes on Iran on Saturday, another narrative exploded online within minutes: Grok had “predicted” the date.
The claim stemmed from a February 25 methodological exercise by The Jerusalem Post, which asked four major AI platforms to do something they are generally designed to avoid — pick a single day for a hypothetical US strike on Iran. Each model was given the same prompt and repeatedly pushed to narrow its answer.
The four systems were Anthropic’s Claude, Google’s Gemini, xAI’s Grok and OpenAI’s ChatGPT. Their responses revealed not foresight, but the varying ways large language models (LLMs) behave under pressure.
Israel announced a pre-emptive strike against Iran early Saturday, with US military action accompanying the operation. Explosions were reported in Tehran, sirens sounded in Israel, and Iran’s Supreme Leader Ali Khamenei was moved to a secure location.
A senior Israeli defence official was quoted as saying the operation had been coordinated with the United States, planned over several months, and that the launch date had been fixed weeks earlier.
Those details are critical to the “AI predicted it” narrative. An AI chatbot did not trigger the strikes, influence the decision, or access classified planning. Each model generated a probabilistic guess — and in one case, the guess matched the eventual date.
Claude: caution first, then a weekend window
Claude initially refused to name a specific date, warning that any answer would be speculative. After repeated prompting, it shifted to scenarios and probabilities, flagging early-to-mid March as a higher-risk period and eventually narrowing its estimate to March 7 or March 8.
Gemini: trigger points and an operational window
Gemini framed the issue around diplomatic and military “triggers”, mapping potential decision points. In a deeper follow-up run cited in the article, it offered a tighter window — the evening of March 4 through the evening of March 6 — and suggested that any initial strike would likely begin at night.
ChatGPT: March 1, then March 3
ChatGPT produced a specific date early in the exercise, selecting March 1 (Israel time). After further prompting, it revised its answer to March 3 (US time), while maintaining a broader risk window extending to March 6.
Grok: February 28 — twice
Grok gave the clearest single-day answer in the original run: Saturday, February 28, linking it to the outcome of diplomatic talks. In a later check, it acknowledged uncertainty but repeated the same date, while noting that developments could push action into early March.
On social media scorecards, Grok “won” because its date matched the day the strikes began.
But that does not make it a reliable forecasting engine, nor does it validate its reasoning. In a period of escalating tensions, the range of plausible windows narrows, so the odds of any single model landing on the right date rise accordingly. One model happened to land on the date that ultimately materialised.
The February 25 article itself underlined the broader lesson: when users push harder for certainty, language models tend to offer increasingly specific answers — even when real-world uncertainty remains unchanged.
Grok is built by xAI, the artificial intelligence company founded by Elon Musk, and is closely integrated with X, the social media platform he owns.
That structural advantage matters. Breaking news, speculation and screenshots circulate rapidly on X. A prediction generated within that ecosystem can move across it instantly, amplified by the same dynamics that fuel viral content, market chatter and misinformation.
In that sense, Grok’s apparent “win” was partly about probability and partly about distribution. The model generated a date. The platform around it turned that coincidence into a headline.