Google launches new AI agent shortly after GPT-5.2

Google on Thursday rolled out a fully “reimagined” version of its research agent, Gemini Deep Research, now rebuilt on its flagship foundation model, Gemini 3 Pro.

Unlike earlier versions that were primarily focused on generating long-form research reports, the upgraded Deep Research agent can now be embedded directly into third-party apps. This major shift is enabled by Google’s new Interactions API, designed to give developers far more control in the emerging era of agentic AI.

A Research Agent Built for Massive Context and Complex Tasks

The revamped Deep Research tool is engineered to process huge volumes of information and reason through large, multi-step prompts. According to Google, enterprises are already using the system for high-stakes tasks ranging from corporate due diligence to drug toxicity and safety research.

Google also plans to integrate this deep research capability across its own suite of products, including Google Search, Google Finance, the Gemini app, and NotebookLM. The move signals Google’s vision for a future where users don’t manually “Google” information anymore—their AI agents do it for them.

Minimizing Hallucinations With Gemini 3 Pro

A major part of the upgrade hinges on Gemini 3 Pro, which Google describes as its “most factual” model to date, trained specifically to reduce hallucinations in long-running, multi-step reasoning tasks.

Hallucinations pose a critical challenge for agentic workflows: a single incorrect step in a decision chain can compromise an entire research output. Reducing that risk is central to Google’s pitch for Deep Research as a reliable, end-to-end reasoning agent.

New Benchmark: DeepSearchQA

To demonstrate its progress, Google introduced yet another benchmark—DeepSearchQA—built to evaluate agents on complex, multi-step information-seeking tasks. The company has open-sourced the benchmark for public testing.

It also tested the agent on:

Humanity’s Last Exam—an independent benchmark of obscure and challenging knowledge tasks
BrowserComp—a test for browser-based autonomous agent behavior

As expected, Google’s Deep Research led on Google’s own benchmark and performed strongly on Humanity’s Last Exam. But OpenAI’s ChatGPT 5 Pro remained a surprisingly close competitor and edged out Google on BrowserComp.

But the Benchmarks Became Outdated Within Hours

The competitive comparisons didn’t stay current for long. On the same day Google announced Deep Research, OpenAI launched GPT-5.2, internally codenamed Garlic. OpenAI claims the new model outperforms rivals—including Google—across a suite of standard and proprietary benchmarks.

A Telling Moment in the Google–OpenAI Rivalry

The timing raised eyebrows across the industry. As anticipation builds around OpenAI’s Garlic release, Google strategically dropped a major AI announcement of its own. The move underscores the escalating rivalry between the two companies as they race to define the future of agentic AI.

Google unveils its most advanced AI research agent just hours after OpenAI launches GPT-5.2

A Research Agent Built for Massive Context and Complex Tasks

Minimizing Hallucinations With Gemini 3 Pro

New Benchmark: DeepSearchQA

But the Benchmarks Became Outdated Within Hours

A Telling Moment in the Google–OpenAI Rivalry

Written by Hajra Naz

Iran Tech Expo Goes Viral After ‘Advanced AI Robots’ Turn Out To Be Humans In Costumes

Trump Approves AI Chip Sales to China, Balancing Security and Trade