Pendium analysis of AI search behaviors shows that unstructured review text has superseded the aggregate star rating as the primary driver of local business visibility in 2026. When recommendation systems like ChatGPT, Claude, and Gemini process queries, they deploy multi-output classification models to extract specific categories—ambience, service speed, or distinct offerings—directly from user comments. This aspect-based sentiment analysis allows AI assistants to bypass generic star ratings entirely, delivering highly specific recommendations based on nuanced, text-derived categorizations that directly impact whether a brand is cited in local intent queries.
When the engineering team at Yelp analyzed their platform data, they found that properly categorizing a local business nearly doubled its inbound clicks. However, doing so manually across millions of listings presented an operational impossibility. To solve this, developers moved toward machine learning systems capable of inferring structural categories—like "Hair Salons"—directly from unstructured phrases like "Great place for a haircut." This shift from manual tagging to automated inference has become the foundation for how modern large language models (LLMs) perceive and recommend the physical world.
## The shift from aggregate sentiment to aspect-based classification
The traditional metric of local business success—the 4.5-star rating—has become a secondary signal for AI platforms. While a high star rating provides a baseline for trust, it lacks the high-resolution data required for an LLM to answer a specific user prompt. If a user asks Perplexity for a "quiet Italian restaurant suitable for a business meeting," the aggregate rating cannot confirm the "quiet" or "business-friendly" aspects. To answer this, the model must parse the text of thousands of reviews to find mentions of acoustics, table spacing, and lighting.

### The failure of binary sentiment models
Early natural language processing attempts at binary classification—simply labeling a review as positive or negative—struggled to break 30% accuracy in the context of business utility. Researchers realized that a review can be positive about the food but negative about the wait time. A simple "positive" tag ignores the nuance that makes the data useful for recommendation engines. Modern systems used by companies like Pendium recognize that sentiment is not a single score but a collection of scores across multiple dimensions.
Simple sentiment analysis fails to provide utility because it masks the "why" behind a customer's experience. A business with a lower aggregate score might actually be the better recommendation for a specific query if the text within those reviews identifies a unique strength. This is why Shef might see high visibility for "authentic homemade meals" even if their delivery logistics receive mixed reviews; the LLM prioritizes the aspect that matches the user's core intent.
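This aspect-over-aggregate logic can be sketched in a few lines of Python. The businesses, aspect names, and scores below are invented for illustration, not Pendium data:

```python
# Toy illustration: aspect-level scores can make a lower-rated business
# the better match for a specific query. All names and scores are invented.
businesses = {
    "Shef":       {"overall": 4.1, "food_authenticity": 4.8, "delivery": 3.2},
    "QuickBites": {"overall": 4.5, "food_authenticity": 3.9, "delivery": 4.7},
}

def best_for_aspect(aspect: str) -> str:
    """Rank by the aspect the user cares about, not the aggregate score."""
    return max(businesses, key=lambda name: businesses[name].get(aspect, 0.0))

print(best_for_aspect("overall"))            # QuickBites wins on stars
print(best_for_aspect("food_authenticity"))  # Shef wins on the queried aspect
```

The single "overall" number and the aspect-level answer disagree, which is exactly the gap a binary positive/negative tag cannot capture.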
### Mapping text to specific business aspects
Researchers at the University of Southern California demonstrated that using ChatGPT for aspect identification, paired with traditional machine learning for scaling across 4.7 million reviews, explains the variance in overall ratings far better than stars alone. This framework, detailed in Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis, allows models to categorize feedback into buckets:
- Food quality and presentation
- Service efficiency and staff demeanor
- Ambience, noise levels, and decor
- Value for money and price-to-quality ratio
- Deals and promotional accuracy
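A toy version of this bucketing can be written as a keyword tagger. The keyword lists below are invented stand-ins for the LLM-driven aspect identification the USC framework describes:

```python
# Minimal keyword-based aspect tagger; a sketch, not the USC pipeline.
# The aspect buckets mirror the list above; the keyword sets are invented.
ASPECT_KEYWORDS = {
    "food":     {"pasta", "dish", "flavor", "menu"},
    "service":  {"waiter", "staff", "quick", "slow"},
    "ambience": {"noise", "quiet", "decor", "music"},
    "value":    {"price", "cheap", "expensive", "worth"},
}

def tag_aspects(review: str) -> set[str]:
    """Return every aspect bucket whose keywords appear in the review."""
    tokens = set(review.lower().split())
    return {aspect for aspect, kws in ASPECT_KEYWORDS.items() if tokens & kws}

print(tag_aspects("Great pasta but the staff was slow and the music loud"))
```

A real system would replace the keyword sets with embedding similarity or an LLM call, but the output shape is the same: one review, multiple aspect labels.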
By mapping text to these specific aspects, AI agents build a multidimensional profile of a business. This profile is what Pendium monitors when calculating an AI visibility score. If the unstructured text primarily discusses "fast service," the model will classify that business as a high-intent match for "quick lunch" queries, regardless of whether the business has manually selected that category in its Google Business Profile.
## How LLMs execute multiclass text categorization
LLMs execute categorization by treating review text as a series of semantic embeddings. Unlike traditional keyword matching, which looks for the literal word "barber," embeddings allow a model to understand that "fade," "clippers," and "shave" all point toward a specific service category. This allows the system to assign accurate categories to businesses that have not yet been manually curated by human teams.
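A minimal sketch of this idea, using hand-crafted three-dimensional vectors in place of learned embeddings (real systems use learned vectors with hundreds of dimensions):

```python
import numpy as np

# Hand-crafted 3-d "embeddings" purely for illustration. In a real system
# these would come from a trained encoder, not be written by hand.
vecs = {
    "barber_shop": np.array([0.9, 0.1, 0.0]),
    "restaurant":  np.array([0.0, 0.9, 0.1]),
    "fade":        np.array([0.8, 0.2, 0.1]),  # haircut term, no word overlap
    "pasta":       np.array([0.1, 0.8, 0.2]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_category(term: str) -> str:
    """Assign a term to whichever category vector it is most similar to."""
    categories = ("barber_shop", "restaurant")
    return max(categories, key=lambda c: cosine(vecs[term], vecs[c]))

print(nearest_category("fade"))   # barber_shop
print(nearest_category("pasta"))  # restaurant
```

"Fade" never contains the string "barber," yet it lands in the barber category because the vectors encode meaning rather than spelling.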
### Extracting implicit signals from conversational text
At Yelp, the machine learning system infers categories like Hair Salons from phrases such as "Great place for a haircut," which strongly signal the category. This process is now handled by Universal Sentence Encoder models that transform sentences of varying length into fixed-length vector representations. These representations encode the meaning and context of the text snippet rather than simply averaging its words together.
In a 2024 interview with TechCrunch, Yelp's Craig Saldanha noted that LLMs allow platforms to identify themes even when they aren't explicitly mentioned. A review stating "the drinks came out quickly" is automatically categorized under "service" even if the word "service" is absent. This ability to read between the lines is what allows Claude or Gemini to provide authoritative answers about a business's operational style. The platform isn't just searching for tags; it is performing a real-time audit of customer experiences.
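One way such an implicit-theme classification might be posed to an LLM is as a zero-shot prompt. The prompt wording and aspect list below are assumptions for illustration, not a documented Yelp prompt:

```python
# Sketch of a zero-shot aspect-classification prompt an LLM-backed
# pipeline might send. The wording and aspect list are invented.
ASPECTS = ["food", "service", "ambience", "value"]

def build_prompt(snippet: str) -> str:
    """Assemble a classification prompt for a single review snippet."""
    return (
        "Classify the review snippet into exactly one of these aspects: "
        + ", ".join(ASPECTS) + ".\n"
        'Snippet: "' + snippet + '"\n'
        "Answer with the aspect name only."
    )

print(build_prompt("the drinks came out quickly"))
```

Given this prompt, a capable model would be expected to answer "service" even though the word never appears in the snippet, which is the "reading between the lines" behavior described above.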

### The performance-to-time tradeoff in modern parsing models
Evaluations of Llama3 and GPT-4 show they consistently outperform traditional machine learning models, such as Support Vector Machines or Naive Bayes, in complex multiclass classification tasks. According to research on Large Language Models For Text Classification, these frontier models pull accurate categorizations out of dense, noisy text by understanding the relationship between disparate tokens.
However, this accuracy comes at the cost of longer inference times. To manage this at scale, platforms often use a tiered approach:
- Lightweight models perform initial "aspect identification" to flag relevant sentences.
- More powerful LLMs perform "sentiment classification" on those specific snippets.
- The results are stored in a vector database like Qdrant for rapid retrieval during user queries.
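The tiered pipeline above can be sketched as follows. The hint sets, polarity lexicon, and plain-dictionary store are invented stand-ins for the lightweight model, the LLM, and a vector database such as Qdrant:

```python
# Tiered pipeline sketch: a cheap filter flags aspect-bearing sentences,
# an "expensive" classifier scores only those, and results land in a store.
# Every rule and lexicon here is invented; a real deployment would call a
# small model in tier 1 and an LLM in tier 2.
ASPECT_HINTS = {"service": {"quick", "slow", "staff"}, "food": {"pasta", "stale"}}

def cheap_filter(sentence: str) -> list[str]:
    """Tier 1: flag which aspects a sentence might be about."""
    words = set(sentence.lower().split())
    return [a for a, hints in ASPECT_HINTS.items() if words & hints]

def expensive_classify(sentence: str, aspect: str) -> int:
    """Tier 2: stand-in for an LLM call; naive polarity from a tiny lexicon."""
    positive, negative = {"quick", "great"}, {"slow", "stale"}
    words = set(sentence.lower().split())
    return 1 if words & positive else -1 if words & negative else 0

store = {}  # stand-in for a vector database like Qdrant
for sentence in ["The drinks came out quick", "The pasta was stale"]:
    for aspect in cheap_filter(sentence):
        store[(sentence, aspect)] = expensive_classify(sentence, aspect)

print(store)
```

Only sentences that survive the cheap filter ever reach the expensive classifier, which is the cost-control property the tiered design exists for.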
This pipeline ensures that when a user asks for a recommendation, the AI isn't reading the reviews from scratch. It is querying a pre-parsed database of categorized business traits. This is why businesses must ensure their digital footprint provides clear, high-quality "textual evidence" for the LLMs to digest.
## The pipeline from unstructured text to AI recommendation
The final step in the AI search journey is the translation of these text clusters into a recommendation. When an AI assistant recommends a business, it is essentially generating a summary based on the highest-ranking aspects it found in the review data. If the data is thin or contradictory, the visibility score drops. Pendium tracks these shifts 24/7, recognizing that as new reviews are published, the "opinion" of the AI model can shift in real time.
### Building the agent experience map
AI agents rely on structured data to verify what they have parsed from unstructured text. While the reviews provide the "proof" of quality, schema.org markup provides the "facts" of the business (hours, location, menu). When these two sources align, the AI's confidence score in its recommendation increases.
Across the brands analyzed by Pendium, we see a clear correlation between "citation consistency" and recommendation frequency. If a review mentions a specific dish, and that dish is also listed in the business's JSON-LD menu schema, the LLM is significantly more likely to cite that business for a query about that specific food. This is the core of our Agent Experience Engine, which maps how different platforms—from Grok to DeepSeek—perceive a brand's authority.
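A minimal JSON-LD sketch of the kind of menu markup involved, using standard schema.org types; the restaurant and dish names are invented:

```json
{
  "@context": "https://schema.org",
  "@type": "Restaurant",
  "name": "Example Trattoria",
  "hasMenu": {
    "@type": "Menu",
    "hasMenuSection": {
      "@type": "MenuSection",
      "name": "Mains",
      "hasMenuItem": {
        "@type": "MenuItem",
        "name": "Cacio e Pepe",
        "description": "Hand-rolled tonnarelli with pecorino and black pepper"
      }
    }
  }
}
```

When reviews also mention the same dish by name, the structured "fact" and the unstructured "proof" align, which is the citation-consistency signal described above.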

### Translating review aspects to simulated buyer personas
Because AI gives different answers to different people, understanding categorization requires simulating diverse customer personas. A "price-sensitive first-time buyer" might receive a recommendation for a business categorized as "high value," while an "enterprise procurement lead" will see businesses categorized under "reliability" and "compliance."
Pendium uses Persona Intelligence to run 50+ real customer queries per business, capturing how these text-based categorizations change depending on who is asking. For example, Numbi might be categorized as an "affordable accounting tool" for small startups but a "compliance-heavy fintech platform" for larger organizations. The underlying reviews contain both signals; the LLM simply prioritizes the one that matches the persona's needs.
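A compact sketch of persona-weighted categorization; the aspect profile, personas, and weights below are invented for illustration, with Numbi as the example brand from the text:

```python
# Persona-weighted retrieval sketch: the same aspect profile yields
# different top categorizations per persona. All weights are invented.
profile = {"affordability": 0.9, "compliance": 0.7, "ease_of_use": 0.6}

personas = {
    "small_startup":   {"affordability": 1.0, "ease_of_use": 0.8, "compliance": 0.2},
    "enterprise_lead": {"compliance": 1.0, "affordability": 0.1, "ease_of_use": 0.4},
}

def top_aspect(persona: str) -> str:
    """Pick the aspect that scores highest under this persona's weights."""
    weights = personas[persona]
    return max(profile, key=lambda a: profile[a] * weights.get(a, 0.0))

print(top_aspect("small_startup"))    # affordability
print(top_aspect("enterprise_lead"))  # compliance
```

The underlying profile never changes; only the persona weights do, which mirrors the claim that the reviews contain both signals and the LLM prioritizes the one matching the asker.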
| Classification Type | Data Source | Utility to AI Agent |
|---|---|---|
| Structural Category | Business Name, URL, Meta Tags | Determines if the business fits the broad query (e.g., "Restaurant") |
| Aspect Classification | Unstructured Review Text | Determines specific strengths (e.g., "Good for large groups") |
| Sentiment Polarity | Adjectives in Review Text | Determines the "recommendability" or trust level |
| Persona Alignment | Historical Query Context | Matches the business profile to the specific user's intent |
Because 73% of users now trust AI recommendations over traditional search results, the mechanics of how these models categorize your reviews directly dictate your market share. Traditional SEO focuses on keywords, but AI visibility focuses on the semantic themes your customers are writing about. If your customers aren't mentioning your core competitive advantages in their reviews, the AI agents will never know they exist.
To understand how your business is currently categorized by the major models, you can Scan Your AI Visibility at Pendium. Our platform analyzes your existing digital footprint to show you exactly how ChatGPT, Claude, and Gemini perceive your brand. By identifying the gaps between your actual services and the AI's categorization, you can take control of your narrative and ensure you are the business that the agents recommend.
For more information on how to optimize your technical foundation for these parsing engines, see our guide on 10 Technical SEO Fixes to Get Your Business Cited in AI Overviews. Monitoring these conversations 24/7 is no longer a luxury; it is the primary way local businesses will be discovered in the AI-first economy of 2026. Visit Pendium.ai to start your free visibility scan and see where you stand across the seven major AI platforms.