How Real Prompt Behavior Changes GEO Strategy

Shalin Siriwardhana

Summary

How Real Prompt Behavior Changes GEO Strategy is best read as a search operating signal. This is no longer only a scraping debate. The real issue is whether automated access creates enough discovery, citation, or business value...

How Real People Actually Prompt AI — and What It Means for GEO: the Practical Angle

Most people aren't using AI the way GEO discussions often assume. Two surveys of AI users conducted by Stella Rising found that many prompts still look remarkably similar to traditional search queries.

(Disclosure: I'm the VP of SEO at Stella Rising.) One survey focused on a beauty-oriented consumer panel in August 2025, while the other surveyed a broader general audience in January 2026. Across both studies, prompts were short, often keyword-driven, and much closer to a Google search than the elaborate prompt templates popular in AI marketing circles.

A lot of people are still typing like it's 2008

The biggest takeaway across both surveys is that the median AI user is still throwing a keyword over the wall and hoping for the best. In the general audience study from January: Two thirds of respondents reported writing prompts of 15 words or fewer.

Only 12% wrote something that would qualify as a "real" prompt by the standards of an AI influencer thread. About 60% phrased their queries as questions, while only 9% gave a direct command. That mirrors what Pew Research has been seeing more broadly: 34% of all U.S.

This is no longer only a scraping debate. The real issue is whether automated access creates enough discovery, citation, or business value to justify the server cost, analytics noise, and operational risk it introduces.

The decision layer should separate what is directly measurable from what is directional. Clean referrals, branded demand, assisted conversions, citations, and sales notes all matter, but they should not be blended into one overconfident number.

The shift between the two surveys

In the August 2025 survey, we classified roughly 50% of the free-text prompts as "SEO keyword shaped," meaning short, ambiguous, and brand- and attribute-driven. By the time the January 2026 survey came back, that share had dropped closer to 30%.

The remaining 70% had grown longer and more contextualized. A few findings are worth carrying forward: 24.5% of all prompts include the word "best." If you're not appearing in "best [category]" responses, you're missing one of the highest-intent slots. 28% of prompts mention price or budget constraints.
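To make the idea of a "keyword shaped" prompt concrete, here is a minimal Python sketch of how a team might tag its own prompt logs. It is not the coding scheme used in the surveys, which is not described in detail here. The word limit, the regular expressions, and the two short sample prompts are all assumptions chosen for illustration; the two longer sample prompts are quoted from the survey responses cited in this article.

```python
import re

# All thresholds and patterns below are illustrative assumptions,
# not the rules used to code the survey responses.
MAX_KEYWORD_WORDS = 6
PERSONAL_CONTEXT = re.compile(r"\b(i|i'm|my|me|we|our)\b", re.IGNORECASE)
QUESTION_START = re.compile(
    r"^(what|which|how|why|where|who|can|should|is|are|do|does)\b",
    re.IGNORECASE,
)
PRICE_TERMS = re.compile(
    r"(\$\d|\b(under|budget|cheap|affordable|cost|price)\b)", re.IGNORECASE
)


def is_keyword_shaped(prompt: str) -> bool:
    """Short, not phrased as a question, and carrying no first-person context."""
    text = prompt.strip()
    if len(text.split()) > MAX_KEYWORD_WORDS:
        return False
    if text.endswith("?") or QUESTION_START.match(text):
        return False
    return PERSONAL_CONTEXT.search(text) is None


def mentions_best(prompt: str) -> bool:
    """True when the prompt contains the standalone word 'best'."""
    return re.search(r"\bbest\b", prompt, re.IGNORECASE) is not None


def mentions_price(prompt: str) -> bool:
    """True when the prompt contains a dollar amount or a budget-style term."""
    return PRICE_TERMS.search(prompt) is not None


def share(prompts, predicate) -> float:
    """Fraction of prompts for which predicate(prompt) is true."""
    if not prompts:
        return 0.0
    return sum(1 for p in prompts if predicate(p)) / len(prompts)


sample_prompts = [
    "best running shoes wide feet",  # hypothetical keyword-style prompt
    "nike pegasus price",            # hypothetical keyword-style prompt
    "What shoes would you recommend for daily standing at work?",
    "Find me a cost effective pair of running shoes that I can order on "
    "Amazon. My size is men's 10.",
]

for label, predicate in [
    ("keyword shaped", is_keyword_shaped),
    ("mentions 'best'", mentions_best),
    ("mentions price", mentions_price),
]:
    print(f"{label}: {share(sample_prompts, predicate):.0%}")
```

A heuristic like this will misfire on edge cases (the word "under" is not always about budget, for example), so it is better suited to a first pass over a prompt log than to reporting a headline number.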

This makes AI visibility less like one ranking report and more like a representation audit. The same brand can be current, stale, or absent depending on how each assistant blends memory with live sources.

The profile, local page, and review pattern should tell the same story. When those sources conflict, the assistant has to reconcile the conflicting details about the business instead of confidently representing it.

The user embedding layer is where this gets interesting

The 32% figure (prompts containing real personal context) is the most under-discussed finding in the dataset. Nearly one third of users are willingly handing LLMs information that no Google query would normally carry, such as their size, job, training plan, living situation, or kids' ages. A useful companion note is prompt-level SEO, because it looks at a nearby part of the same system.

We see prompts in the data like: "What shoes would you recommend for daily standing at work?" "Find me a cost effective pair of running shoes that I can order on Amazon. My size is men's 10." "Please tell me the top five shoes for wide feet in a size eight for women that are comfortable, stylish, under $120, and that younger people won't make fun of for a Gen X person like me." That last one alone packs in gender, foot width, size, budget, style intent, generational identity, and a real social anxiety. No traditional search query was ever going to surface all of that.

Fresh content still matters, but it does not instantly correct older model beliefs. That is why teams need to monitor citations, answer wording, and source freshness as separate signals.

The maintenance habit matters. Profile data should be reviewed like a conversion path, not treated as a one-time setup task.

Where synthetic prompts fit, and where they don't

A common tactic in GEO prompt research is to construct synthetic personas ("I'm a 38 year old product manager training for a half marathon in Boston who prefers brands focused on sustainability…") and then use those personas to stress test which brands an LLM surfaces under different scenarios. There's real merit to the approach.

If the user embedding layer is doing the heavy lifting in the answer, the only way to simulate the answer is to simulate the user. But synthetic prompts don't capture everything. Real prompts are messy, layered, and influenced by recent conversation history, persistent memory, and signals the model has picked up over weeks of use.
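As a rough sketch of the synthetic persona approach, the Python snippet below expands a small grid of persona attributes into test prompts. The attribute values, the template wording, and the function name `build_synthetic_prompts` are hypothetical and only illustrate the mechanics. As noted above, prompts generated this way carry no conversation history or persistent memory, so they approximate only part of what a real user brings.

```python
from itertools import product

# Hypothetical persona attributes; swap in the segments that matter
# for your own category.
PERSONA_ATTRIBUTES = {
    "age": ["38 year old", "55 year old"],
    "role": ["product manager", "nurse"],
    "goal": ["training for a half marathon", "standing all day at work"],
}

# Hypothetical template: persona statement first, then the shared question.
TEMPLATE = "I'm a {age} {role} {goal}. {question}"


def build_synthetic_prompts(question, attributes=PERSONA_ATTRIBUTES):
    """Return one prompt per combination of persona attribute values."""
    keys = list(attributes)
    prompts = []
    for combo in product(*(attributes[key] for key in keys)):
        persona = dict(zip(keys, combo))
        prompts.append(TEMPLATE.format(question=question, **persona))
    return prompts


prompts = build_synthetic_prompts("Which running shoes would you recommend?")
for prompt in prompts:
    print(prompt)
```

With two values per attribute, the grid yields eight prompts. The count grows multiplicatively with every attribute added, so it helps to keep the grid small and to pair it with real prompts from your own data.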

The practical risk is false confidence. A strong answer in one engine does not prove the broader AI search layer understands the brand accurately.

The next useful move is to audit the evidence already available: the page, the internal links, the entity signals, the supporting sources, and the behavior data that shows whether users actually find the answer useful. This connects with structured data when the same signal needs a clearer operating decision.

What to actually track

This naturally leads to the next question: Should you track SEO keywords in your AI visibility platform if one third of real prompts look like SEO keywords? The answer is yes, with one filter.

Across the last quarter, our team has seen web retrieval rates on tracked prompts climb sharply. On several client accounts, more than 90% of monitored prompts now trigger live web search inside ChatGPT or Google's AI Mode. When that happens, the LLM is effectively running a real-time SERP and synthesizing the result.
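One way to compute a web retrieval rate like this from your own monitoring data is sketched below in Python. The `PromptRun` record, its field names, and the sample rows are assumptions for illustration. They do not reflect the export format of any particular AI visibility platform, and the sample values are not client data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRun:
    """One monitored prompt execution (hypothetical record shape)."""
    engine: str
    prompt: str
    used_web_search: bool


def retrieval_rate_by_engine(runs):
    """Share of monitored runs per engine that triggered a live web search."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for run in runs:
        totals[run.engine] += 1
        if run.used_web_search:
            hits[run.engine] += 1
    return {engine: hits[engine] / totals[engine] for engine in totals}


# Made-up rows, used only to show the calculation.
runs = [
    PromptRun("chatgpt", "best running shoes for wide feet", True),
    PromptRun("chatgpt", "running shoes under $120", True),
    PromptRun("ai_mode", "best running shoes for wide feet", True),
    PromptRun("ai_mode", "what is a stability shoe", False),
]

for engine, rate in retrieval_rate_by_engine(runs).items():
    print(f"{engine}: {rate:.0%} of tracked prompts triggered web search")
```

Keeping the rate per engine, instead of one blended figure, makes it easier to see where an answer is being built from live results and where it is coming from model memory.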

The important shift is that AI visibility depends on both live retrieval and older model memory. A brand can be accurate in one system and outdated in another, so testing needs to separate fresh web evidence from what the model already believes.

The smallest useful improvement is usually the best starting point. Strengthen the page, clarify the entity, improve the supporting link, or fix the measurement gap before expanding the topic.

What the broader data tells us about AI search

A handful of additional findings from the January 2026 survey help explain why these prompt patterns matter.

The useful way to handle these findings is to connect each observation to a clear signal, then decide whether it changes content quality, crawlability, measurement, brand evidence, or the user's decision path.

The stronger interpretation is to connect the headline to an operating habit. If the signal cannot guide a content, technical, brand, or measurement decision, it is not yet useful enough.

The check should be repeatable. A one-time observation becomes more valuable when it turns into a review habit the team can apply before publishing or refreshing related content.

Users increasingly trust AI recommendations

Up to 68% of users trust ChatGPT's recommendations more than Google's, with most citing detail, lack of ads, and personalization as the reasons. NIM's research has found ChatGPT often produces more efficient, accurate consumer decisions than Google.

A January 2026 study reported that trust in Google weakens as AI usage rises. Yext's 2026 research puts ChatGPT's share of "top quality source" responses at 35% among heavy AI users.

Recommendation systems need corroboration. Owned pages, independent sources, reviews, communities, and comparison content all shape the confidence around a brand.

AI search is becoming a daily habit

Half of active AI users use these tools daily or several times per day to complete tasks they used to do on Google. Search Engine Land reported 37% of consumers now start searches with an AI tool instead of Google.

OpenAI's February 2026 numbers put ChatGPT's weekly active users at 900 million, more than double a year earlier.

Citations still drive traffic

85% of users click through to cited sources at least some of the time; 21.9% always do. The mention is not the end of the funnel.

Conductor's 2026 benchmarks showed AI referral traffic up 357% year over year. Semrush reported outbound referrals from ChatGPT up 206% in 2025. Emarketed saw AI referred visitors converting at 4.4x the rate of standard organic.

The traffic story should be read as a planning signal, not a reason to abandon search. If fewer searches produce open web clicks, owned pages need stronger intent fit, clearer brand demand, and better measurement of influence beyond the last session. The same pattern also shows up in "Why AI Search Measurement Needs Better KPIs," where the practical question is how the signal becomes visible.

A useful dashboard should explain what changed and what action follows. Otherwise it becomes another view that looks impressive but does not improve the next decision.

Voice may finally be having its moment

34% of users are now using voice chat daily or more often. This is the first dataset I've seen that actually delivers on the "voice search will matter" promise we've been hearing for a decade.

It's worth pairing all of this with Ahrefs' latest AI Overviews CTR research: The presence of an AI Overview correlates with a 58% lower clickthrough rate for the top-ranking page. The traffic that does come through is qualified. The traffic that doesn't is gone. AI search is settling into a richer, more personalized form.

Open web clicks are now a scarcer outcome, which makes page quality and brand recall more important. The visit has to earn more because fewer searches may create one.

What changes, and what doesn't

Here are three things you can do with this information if you're an SEO lead, content lead, or strategy lead:

Audit your prompt tracking setup: If it's all synthetic prompts or all keyword-shaped prompts, you're missing half the picture. Build the layered framework outlined above.

Map your content to the user embedding layer: For your top categories, list the personas (e.g., age, life stage, profession, condition, budget) most likely to carry real prompts into AI search. Then check whether your PDPs, blog content, and FAQs actually answer those people's questions.

Don't abandon the SEO keyword work: Roughly one third of real prompts still look like classic search queries.

Methodology

Both studies referenced in this article were conducted by the Stella Rising team. You can read more in "New Data: How Consumers Use LLMs for Search in 2026 (And What It Means for GEO)."

The August 2025 study surveyed 178 members of Stella's Glimmer Insights community, 113 of whom were active LLM users. The January 2026 study surveyed 524 active LLM users via Centiment, defined as having used ChatGPT, Copilot, or Gemini in the previous 30 days, with a margin of error of approximately ±4.3% at the 95% confidence level. Given its smaller size and category-specific composition, the August 2025 panel should be viewed as directional rather than statistically representative of the broader U.S.
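For readers who want to check the ±4.3% figure, the short Python sketch below applies the standard worst-case margin of error formula for a proportion, z * sqrt(p * (1 - p) / n), with p = 0.5 and z = 1.96 for a 95% confidence level. It assumes a simple random sample, which is an assumption of this sketch and not a statement about how either panel was recruited or weighted.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Worst-case margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)


# January 2026 sample: 524 active LLM users.
print(f"n=524: ±{margin_of_error(524) * 100:.1f} points")

# August 2025 sample: 113 active LLM users, shown for comparison only.
print(f"n=113: ±{margin_of_error(113) * 100:.1f} points")
```

For 524 respondents this works out to roughly ±4.3 points, matching the figure quoted above. The same formula gives roughly ±9.2 points for the 113 active users in the August 2025 panel, which is one more reason to read that panel as directional.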

The cost side matters because server load is not abstract. If bot activity slows pages, inflates analytics, or forces infrastructure spend, the visibility benefit has to be proven more carefully.
