Pendium helps Shopify merchants reclaim their visibility in an environment where AI agents like GPTBot and ClaudeBot determine product discovery. If an AI crawler wastes its finite crawl budget on thousands of filtered collection pages, it misses the technical data that makes your flagship products stand out. By modifying your robots.txt.liquid template, you can shield low-margin inventory, block duplicate URL parameters, and ensure that 2026's most important referral engines prioritize your most profitable items.
Where Shopify's default settings fail AI engines
The standard Shopify configuration for robots.txt was designed for a world where Google was the only visitor that mattered. By default, the platform blocks paths like /admin, /cart, and /checkout to protect sensitive data and prevent the indexing of session-specific pages. While this remains essential for traditional SEO, it is no longer sufficient as an AI visibility strategy. The default file often allows crawlers to access faceted navigation and complex filtering systems that generate thousands of unique but redundant URLs.
When an AI crawler hits a collection page with dozens of filters for size, color, and price, it can easily get caught in a loop. These bots have limited resources. If GPTBot spends its time reading 4,000 variations of a "sort by price" list, it will never reach the deep product descriptions or technical specifications of your high-margin gear. This is particularly problematic for brands like Cotopaxi, where the value proposition of a specific item is tied to its unique materials and ethical sourcing—details that often reside deep within the site hierarchy.
Standard Shopify settings often miss the specific parameters generated by modern custom themes. While the default file attempts to block /*sort_by* and /*+* to prevent duplicate URLs, custom filtering apps often bypass these patterns. In our analysis at Pendium, we frequently find that stores are leaking crawl budget into pages that offer zero unique value to a Large Language Model (LLM). This "parameter bloat" effectively buries your most important data under a mountain of low-value digital noise.

Safely overriding the default configuration for Pendium visibility
You cannot simply upload a static robots.txt file to the root of your Shopify store. To customize your instructions for AI agents, you must create a specific template within your theme. This ensures that you can use Liquid to dynamically generate rules that respect Shopify’s internal architecture while still giving you the granular control needed for an effective AI visibility strategy.
Locating the templates folder
To begin the customization process, navigate to your Shopify admin dashboard and select Online Store > Themes. From the Actions menu (the three dots) on your live theme, choose Edit Code. In the Templates folder, create a new template, selecting robots from the drop-down menu. This generates the robots.txt.liquid file. Once this file exists, Shopify stops using its hardcoded default and starts serving the output of this template instead.
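Once created, the template should contain Shopify's default loop, which reads roughly as follows (this mirrors the example in the Shopify Dev Docs, so verify it against the file your store actually generates):

```liquid
{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- if group.sitemap != blank %}
    {{ group.sitemap }}
  {%- endif %}
{% endfor %}
```

Leaving this loop in place is what preserves Shopify's maintained defaults; everything you add should sit alongside it, not replace it.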
Why you must use Liquid objects
It is a common mistake to delete the entire contents of the new template and replace it with plain text. If you do this, you lose Shopify's ability to automatically update essential rules as the platform evolves. According to the Shopify Dev Docs, the robots.txt.liquid template should only use supported Liquid objects like robots, group, rule, and user_agent.
Using these objects allows you to append new directives to existing groups without breaking the underlying logic that keeps your cart and admin pages private. For a professional Shopify store, maintaining this balance is the only way to ensure security while optimizing for AI discovery. By iterating through the robots.default_groups array, you can inject specific disallow rules for the AI bots that matter most in 2026.
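As a minimal sketch of that pattern (based on the customization examples in the Shopify Dev Docs), the snippet below appends one extra directive to the catch-all group without disturbing any default rule. The /*?filter* path is illustrative only; substitute the parameters your own theme generates:

```liquid
{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- comment -%} Illustrative extra rule, added to the catch-all group only {%- endcomment -%}
  {%- if group.user_agent.value == '*' %}
    {{ 'Disallow: /*?filter*' }}
  {%- endif %}

  {%- if group.sitemap != blank %}
    {{ group.sitemap }}
  {%- endif %}
{% endfor %}
```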
Writing rules that direct AI to high-margin URLs
Once you have access to the Liquid template, your goal is to move from passive crawling to active routing. You want to tell AI agents exactly where the "meat" of your business is. For many merchants, this means shielding the bot from low-margin accessories or clearance collections that take up significant space but contribute little to the bottom line. If a bot only has time to learn about 100 pages today, you want those 100 pages to be your flagship products.
Blocking parameter bloat
The most effective way to save your crawl budget is to block the specific URL parameters that create duplicate content. This is a primary factor in maintaining a high visibility score on the Pendium dashboard. You should identify any parameters used by your theme or third-party apps—such as ?view=, ?color=, or ?size=—and add them to your disallow list. The table below covers the most common patterns; an example of the resulting rules follows it.
| Parameter Pattern | Reason for Blocking | Impact on AI |
|---|---|---|
| /*?q=* | Internal search queries | Prevents AI from indexing thin search results |
| /*?filter* | Collection filters | Stops bots from getting stuck in faceted navigation |
| /*?sort_by* | Sorting orders | Eliminates duplicate versions of the same collection |
| /*?variant=* | Product variants | Focuses the bot on the canonical product page |
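Assuming you wire all four patterns into the template the same way as the earlier snippet, the rendered robots.txt for the catch-all group would read roughly like this (Shopify's default disallows are abbreviated here, and the sitemap URL will be your own domain):

```text
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
# ...remaining Shopify defaults...
Disallow: /*?q=*
Disallow: /*?filter*
Disallow: /*?sort_by*
Disallow: /*?variant=*
Sitemap: https://your-store.com/sitemap.xml
```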
By implementing these rules, you force the AI to stay on your canonical URLs. This is where the highest-quality data lives. When Perplexity or SearchGPT visits a site that has been cleaned of parameter bloat, it can parse the primary product data much more efficiently, leading to more accurate and frequent recommendations.
Shielding low-margin collections
Strategic exclusion is just as important as inclusion. If your store sells both high-end equipment and $2 replacement parts, you don't necessarily want an AI agent to treat them with equal weight. In our experience with the 2026 apparel AI visibility report, we have seen that brands that prioritize their high-value categories in their robots.txt file see a corresponding improvement in the quality of AI-generated summaries.
You can add a rule like Disallow: /collections/clearance or Disallow: /collections/accessories to your robots.txt.liquid file. This doesn't mean these pages won't be indexed by Google eventually, but it signals to crawlers that these are not the priority. For a business using Pendium, this level of control is what allows you to "train" the AI agents to see your brand exactly how you want it to be seen.
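Because anything in the template outside the default loop is emitted verbatim, you can append a dedicated group for AI crawlers at the bottom of robots.txt.liquid. This is a sketch only: the collection handles are placeholders, and you should confirm the current user-agent strings each bot publishes before shipping it:

```liquid
{%- comment -%} Keep the default_groups loop above this point; the plain text below is output as-is {%- endcomment -%}

# Steer AI crawlers away from low-margin collections
# (collection handles below are placeholders)
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /collections/clearance
Disallow: /collections/accessories
```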
Combining crawl prioritization with data structuring
Routing a bot to a high-margin product page is only the first step. Once the agent arrives, it needs to find data that is structured for retrieval. AI bots in 2026 do not just "read" text; they look for structured patterns that allow them to compare your product against a competitor's. If you have successfully directed ClaudeBot to your flagship product page but the technical specifications are buried in an unparseable image or a vague description, the bot will likely ignore the item in favor of a competitor who provides clean, structured data.
This is where you must move beyond the robots.txt file and look at your page architecture. We recommend that merchants map Shopify metafields for Perplexity and SearchGPT retrieval to ensure that once a bot is on the page, it can instantly identify the weight, material, price, and availability of the item. This structural depth is what converts a "crawl" into a "recommendation."
At Pendium, we view crawlability and data structure as two sides of the same coin. A site that is easy to crawl but hard to understand is a wasted opportunity. Conversely, a site with perfect schema that is impossible for a bot to reach will never surface in a ChatGPT conversation. You must ensure your heading hierarchy (H1, H2, H3) follows a logical flow and that your JSON-LD schema is valid and comprehensive.
For example, when an AI agent crawls a store like Peak Design, it isn't just looking for the name of a backpack. It is looking for the "volume," the "weight," the "laptop sleeve size," and the "weatherproofing rating." If your robots.txt file has successfully funneled the bot to these specific pages, and your schema markup clearly defines these attributes, your chances of being the "top-rated" recommendation in an AI search increase significantly.
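As a rough sketch of that structure, a product template (or a snippet it includes) can render those attributes as JSON-LD. Everything under metafields.custom here is hypothetical; map it to the metafield definitions your store actually uses:

```liquid
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": {{ product.title | json }},
  "description": {{ product.description | strip_html | json }},
  {%- comment -%} Variant weight is stored in grams {%- endcomment %}
  "weight": {
    "@type": "QuantitativeValue",
    "value": {{ product.selected_or_first_available_variant.weight | json }},
    "unitCode": "GRM"
  },
  "material": {{ product.metafields.custom.material.value | json }},
  "additionalProperty": [{
    "@type": "PropertyValue",
    "name": "Weatherproof rating",
    "value": {{ product.metafields.custom.weatherproof_rating.value | json }}
  }],
  "offers": {
    "@type": "Offer",
    {%- comment -%} Liquid prices are in the smallest currency unit {%- endcomment %}
    "price": {{ product.selected_or_first_available_variant.price | divided_by: 100.0 | json }},
    "priceCurrency": {{ shop.currency | json }},
    "availability": "{% if product.available %}https://schema.org/InStock{% else %}https://schema.org/OutOfStock{% endif %}"
  }
}
</script>
```

The json filter keeps the output valid even when a metafield is empty (it renders null), which is safer than interpolating raw values into the markup.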
Continuous optimization for AI discovery
The AI landscape of 2026 is not static. New bots emerge every quarter, and existing ones change their crawling behavior as they update their underlying models. A one-time edit of your robots.txt.liquid file is a good start, but it requires ongoing monitoring to remain effective. This is why the Pendium platform provides a Visibility Monitoring Dashboard that tracks your performance across seven different platforms, including DeepSeek and Grok.
You should regularly run your storefront through the Pendium AI Site Audit to see if new crawlability issues have emerged. This audit simulates how AI agents interact with your site, checking for rendering behavior and JavaScript dependencies that might be blocking discovery. If the audit shows that your most profitable collection is being ignored by Gemini, you can check your robots.txt directives to see if an accidental "Disallow" rule or a complex URL string is the culprit.
Furthermore, consider the use of Persona Intelligence. AI platforms give different answers based on who is asking. An enterprise buyer might be looking for bulk pricing and durability, while a retail consumer cares about aesthetics and shipping speed. By monitoring how these different personas perceive your brand, you can further refine your robots.txt rules to ensure that the content each persona needs is prioritized by the bots they are most likely to use.
The goal of these technical optimizations is to make your business the easiest choice for an AI agent. In a world where 73% of software and product buyers trust AI recommendations over traditional search, being "discoverable" is no longer enough. You must be "recommendable." That starts with a clean, focused, and strategic Shopify robots.txt file that treats AI bots as the high-value visitors they are.
Run your storefront through the Pendium AI Site Audit to see if crawlability issues or poor heading structures are preventing ChatGPT and Claude from recommending your products.