Google Agent Gives AI Crawling a Clearer Identity

For years, we have viewed the web through a relatively simple lens: there are humans who visit our sites to consume content, and there are bots that visit to index it. We’ve built our entire infrastructure, from SEO strategies to security firewalls, around this binary. But that boundary is dissolving.

The shift isn't just about AI writing content or summarizing pages; it's about AI acting as a proxy for the human. We are moving from an era of "search and click" to an era of "delegate and receive." When a user asks an AI to handle a task for them, the AI becomes the visitor. This changes the fundamental relationship between the website owner and the visitor, and Google's latest move makes this shift official. Two related pieces cover neighboring ground: Agentic Web Is Splitting into Two Bets looks at the operating decisions this signal forces, and Agent Transactions Change AI Visibility looks at how the signal becomes visible.

The Arrival of the AI Agent

On March 20, 2026, Google updated its official list of web fetchers with a new entry that deserves our attention. It isn't another crawler designed to map the web, nor is it a training bot designed to feed a Large Language Model. It is called Google Agent.

To understand what this is, we have to distinguish it from Googlebot. Googlebot is the librarian; it moves through the web continuously, indexing pages so they can be found in search results. Google Agent, however, is more like a personal assistant. It only visits a website when a human specifically asks it to perform a task.

Imagine a user asking their AI assistant to research a specific product, compare pricing across three different stores, or even fill out a registration form. In these scenarios, Google Agent is the entity that actually lands on the page. It is the technical manifestation of a user's intent. We are already seeing this in action with Project Mariner, Google's experimental AI browsing tool, which serves as the first primary product utilizing this user agent.

This distinction is subtle but critical. One is an autonomous process of discovery (crawling), and the other is a targeted action on behalf of a person (agency). When the visitor is an agent, the goal isn't to "index" the page for the future; it's to "use" the page in the present.

Why Robots.txt No Longer Applies

This is where things get complicated for those of us who rely on traditional access controls. For decades, robots.txt has been the standard way to tell bots, "You are not welcome here." However, Google has categorized Google Agent as a user-triggered fetcher.

This category isn't new. It includes tools like Feedfetcher for RSS, NotebookLM for document analysis, and Google Read Aloud for text-to-speech. The common thread is that a human initiated the request. Google’s logic here is straightforward: if you type a URL into a browser like Chrome, the browser fetches the page regardless of what the robots.txt file says. Because Google Agent is acting as the user's proxy, Google believes it should operate under the same rules as a human browser.

Consequently, Google Agent generally ignores robots.txt directives. If you have blocked AI bots in your robots file, you might find that Google Agent is still visiting your pages because it views itself as a representative of a human user, not an autonomous crawler.
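The reason this is possible is that robots.txt is advisory: the check runs on the client side, and the server enforces nothing. A minimal sketch of that check, using Python's standard urllib.robotparser, is below. The rules and URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a *compliant* client with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler calls something like this before every request.
# A user-triggered fetcher that skips the call is never stopped by the file.
print(is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/pricing"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/pricing"))
```

Nothing in this flow involves the web server. A client that never calls is_allowed simply fetches the page.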

It is worth noting that Google is taking a different path here than other AI labs. For instance, OpenAI's ChatGPT-User and Anthropic's Claude-User also act as user-triggered fetchers, but they generally respect robots.txt. If you block them, the AI will typically tell the user it cannot access the page. Google has made a different call, prioritizing the user's ability to access information via their agent over the site owner's robots.txt preferences.

For website owners, this creates a significant gap in control. If you have content that absolutely must be restricted from AI agents, you can no longer rely on a simple text file in your root directory. You will need to implement server-side authentication or stronger access controls, the same kind of tools you would use to keep an unauthorized human visitor out of a private area of your site.
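As a rough illustration of what server-side restriction means in practice, here is a minimal WSGI sketch that requires a bearer token for a protected path, whoever or whatever is asking. The path prefix, the token, and the function names are hypothetical. A real deployment would use your framework's or CDN's authentication rather than a hard-coded secret.

```python
import hmac

# Hypothetical values for illustration only.
PROTECTED_PREFIX = "/members/"
EXPECTED_TOKEN = "replace-with-a-real-secret"


def require_token(app):
    """Wrap a WSGI app so protected paths need a bearer token."""
    def guarded(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.startswith(PROTECTED_PREFIX):
            header = environ.get("HTTP_AUTHORIZATION", "")
            scheme, _, token = header.partition(" ")
            # Constant-time comparison; the user agent is never consulted.
            token_ok = hmac.compare_digest(
                token.encode("utf-8"), EXPECTED_TOKEN.encode("utf-8")
            )
            if scheme.lower() != "bearer" or not token_ok:
                start_response(
                    "401 Unauthorized",
                    [("WWW-Authenticate", "Bearer"),
                     ("Content-Type", "text/plain")],
                )
                return [b"Authentication required\n"]
        return app(environ, start_response)
    return guarded


def site(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"content\n"]


application = require_token(site)
```

The design point is that the decision depends on a credential, not on what the visitor claims to be, so it applies equally to crawlers, agents, and humans.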

Cryptographic Identity and Web Bot Auth

While the user agent string is the visible part of this update, there is a deeper, more technical development happening in the background. Google is experimenting with a protocol called Web Bot Auth, using the identity https://agent.bot.goog.

To understand why this matters, we have to acknowledge a fundamental flaw in how the web works: user agent strings are easily spoofed. Anyone can tell their script to identify as "Googlebot" or "Google Agent" to bypass security filters. This has led to a constant arms race between scrapers and firewall administrators.

Web Bot Auth, an IETF draft standard, attempts to solve this by introducing a digital passport for bots. Instead of just claiming to be an agent, the bot holds a private key and publishes a public key in a directory. Every HTTP request is cryptographically signed. When the request hits your server, the website can verify the signature against the published public key and confirm that the request was signed by the holder of the matching private key, something a spoofed user agent string cannot fake.

Google isn't the only player here. Companies like Amazon (via AgentCore Browser), Cloudflare, and Akamai already support this protocol. However, Google's adoption provides the critical mass necessary for this to become a web standard. As AI agent traffic increases, the ability to distinguish between a legitimate AI agent acting for a real human and a malicious scraper pretending to be an agent becomes a security necessity.
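To make the signing step less abstract, here is a simplified sketch of the receiving side. It assumes the draft builds on HTTP Message Signatures (RFC 9421), with a Signature-Input header listing the covered components and a Signature-Agent header naming the bot's identity. The header values, key id, and function names are illustrative assumptions. The sketch only rebuilds the string that was signed and checks the validity window. The actual signature verification needs a cryptography library and the bot's published public key, and is deliberately left out.

```python
import re
import time


def parse_signature_input(header_value):
    """Split 'label=(components);params' into label, components, raw params.

    Simplified: one signature, no component parameters, and no full
    Structured Fields parsing.
    """
    label, _, params = header_value.partition("=")
    match = re.match(r"\(([^)]*)\)", params)
    if not match:
        raise ValueError("no covered-components list found")
    components = re.findall(r'"([^"]+)"', match.group(1))
    return label.strip(), components, params.strip()


def build_signature_base(components, params, authority, headers):
    """Rebuild the RFC 9421 signature base for the covered components.

    `headers` maps lowercase header names to their raw values.
    """
    lines = []
    for name in components:
        if name == "@authority":
            value = authority.lower()
        elif name.startswith("@"):
            raise ValueError(f"derived component {name} not handled here")
        else:
            value = headers[name].strip()
        lines.append(f'"{name}": {value}')
    lines.append(f'"@signature-params": {params}')
    return "\n".join(lines)


def is_fresh(params, now=None):
    """Check the created/expires window carried in the signature params."""
    now = int(time.time()) if now is None else now
    created = re.search(r";created=(\d+)", params)
    expires = re.search(r";expires=(\d+)", params)
    if not created or not expires:
        return False
    return int(created.group(1)) <= now < int(expires.group(1))
```

A server would compute the base, confirm the window is still open, look up the public key for the stated key id, and then verify the Signature header over the base. Only that last step proves identity. Everything shown here is bookkeeping around it.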

What This Means For Your Website

We are now looking at a three-tier model of web traffic. Understanding these tiers matters for anyone managing a digital presence:

Crawlers (e.g., Googlebot, GPTBot): These are autonomous. Their goal is indexing and training. They are governed by robots.txt.

Agents (e.g., Google Agent, ChatGPT-User): These are proxies. Their goal is task completion (comparing prices, booking appointments, researching). They may ignore robots.txt.

Humans: The end users who interact with the browser directly.

Because agents are task-oriented, they interact with your site differently than a crawler does. A crawler wants to read your text; an agent might try to fill out a contact form or navigate a multi-step checkout process. This introduces several practical requirements for site owners.
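A first pass at sorting traffic into these three tiers can be sketched from the user agent string alone. The token lists below are assumptions based on the examples named above. They are not exhaustive, and because user agent strings can be spoofed, the result is a hint for reporting rather than proof of identity.

```python
# Assumed substrings for each tier; extend to match what your logs show.
CRAWLER_TOKENS = ("googlebot", "gptbot")
AGENT_TOKENS = ("google-agent", "chatgpt-user", "claude-user")


def classify_visitor(user_agent):
    """Return 'agent', 'crawler', or 'human' based on the claimed user agent.

    'human' here only means 'no known bot token was found'.
    """
    ua = user_agent.lower()
    if any(token in ua for token in AGENT_TOKENS):
        return "agent"
    if any(token in ua for token in CRAWLER_TOKENS):
        return "crawler"
    return "human"
```

For example, classify_visitor("Mozilla/5.0 (compatible; Googlebot/2.1)") returns "crawler", while an ordinary browser string falls through to "human".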

Practical Takeaways for Site Management

1. Audit Your Logs
Start monitoring your server logs for user agent strings containing compatible; Google-Agent. You need to understand how often these agents are visiting, which specific pages they are targeting, and what they are attempting to do. This data will tell you if your site is being "used" by AI assistants more than it is being "read" by humans.
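A log audit like this can start as a short script. The sketch below assumes access logs in the common "combined" format and assumes the agent identifies itself with the token Google-Agent. Both are assumptions to adjust against what your own logs actually contain.

```python
import re
from collections import Counter

# Combined log format: ip - - [time] "METHOD path PROTO" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed token; change it if your logs show a different spelling.
AGENT_TOKEN = "Google-Agent"


def summarize_agent_hits(lines, token=AGENT_TOKEN):
    """Count hits per path and per HTTP method for one user agent token."""
    by_path = Counter()
    by_method = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or token.lower() not in match["agent"].lower():
            continue
        by_path[match["path"]] += 1
        by_method[match["method"]] += 1
    return by_path, by_method
```

To run it against a real file, pass an open file object: summarize_agent_hits(open("access.log")). A high share of POST requests in by_method suggests the agent is submitting forms rather than only reading pages.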

2. Review CDN and Firewall Rules
Many security tools are configured to aggressively block any traffic that doesn't look like a standard browser. If your firewall is too strict, you might be blocking Google Agent before it even reaches your server. If you want your site to be accessible to AI assistants, ensure that Google's published IP ranges are permitted.
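Checking a client address against published ranges is straightforward with Python's ipaddress module. The sketch below assumes the ranges arrive as a JSON document with a "prefixes" list of ipv4Prefix and ipv6Prefix entries, which is the shape Google uses for its published range files. The addresses shown are reserved documentation ranges, not Google's real ones.

```python
import ipaddress


def load_networks(ranges_doc):
    """Turn a parsed ranges document into a list of network objects."""
    networks = []
    for entry in ranges_doc.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_in_ranges(client_ip, networks):
    """Return True if client_ip falls inside any of the networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


# Placeholder data using documentation-only ranges (RFC 5737 / RFC 3849).
EXAMPLE_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv6Prefix": "2001:db8::/32"},
    ]
}

networks = load_networks(EXAMPLE_RANGES)
print(is_in_ranges("192.0.2.15", networks))
print(is_in_ranges("198.51.100.9", networks))
```

Combining this check with the user agent match is more reliable than either signal alone, since a spoofed user agent will usually come from an address outside the published ranges.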

3. Test Your Functional Flows
Since agents can submit forms and navigate processes, you should test your critical paths. If your booking or checkout flows rely on complex JavaScript patterns or non-standard interactions that confuse automated systems, the agent will fail. While the agent is sophisticated, semantic HTML and clear labels remain the best way to ensure these "proxy visitors" can successfully complete tasks for your customers.

4. Re-evaluate Access Control
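One cheap check along these lines is to scan a form's HTML for fields that have no label an automated visitor could read. The sketch below uses Python's standard html.parser and treats a field as labeled if a label element points at its id or if it carries an aria-label or aria-labelledby attribute. It is a rough heuristic: it does not detect inputs wrapped inside a label element, and it cannot see fields that JavaScript adds at runtime.

```python
from html.parser import HTMLParser


class FormLabelAudit(HTMLParser):
    """Collect form fields and the ids that <label for="..."> points at."""

    def __init__(self):
        super().__init__()
        self.label_targets = set()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            self.label_targets.add(attrs["for"])
        elif tag in ("input", "select", "textarea"):
            # Hidden fields and buttons do not need a visible label.
            if attrs.get("type") in ("hidden", "submit", "button"):
                return
            self.fields.append(attrs)

    def unlabeled(self):
        missing = []
        for attrs in self.fields:
            has_label = attrs.get("id") in self.label_targets
            has_aria = bool(
                attrs.get("aria-label") or attrs.get("aria-labelledby")
            )
            if not (has_label or has_aria):
                missing.append(
                    attrs.get("name") or attrs.get("id") or "<unnamed>"
                )
        return missing


def find_unlabeled_fields(html):
    """Return the names of fields with no label or ARIA label."""
    audit = FormLabelAudit()
    audit.feed(html)
    audit.close()
    return audit.unlabeled()
```

Running find_unlabeled_fields over the rendered HTML of a checkout or booking page gives a quick list of fields worth fixing before testing the flow end to end.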
Accept that robots.txt is no longer a complete solution for access control. For any content that is truly sensitive or proprietary, move toward authenticated access or server side restrictions.

The Hybrid Web is Already Here

Not long ago, the idea of AI agents browsing the web alongside humans was a theoretical prediction discussed at tech conferences. Now, it has a formal user agent string, a set of published IP ranges, and a cryptographic identity protocol. It is no longer a prediction; it is a logged reality.

The web hasn't split into two separate versions, one for humans and one for machines. Instead, they have merged. Every page you publish now serves both audiences simultaneously. The challenge for us is to stop thinking of "bots" as a monolithic group of intruders or indexers, and start seeing them as a new class of visitor.

Google Agent is a signal that the web is becoming a place of execution, not just a place of information. When the "visitor" to your site is an AI acting on behalf of a customer, the quality of your technical infrastructure becomes just as important as the quality of your content. The hybrid web is here, and it's time we started optimizing for it.
