Amazon COSMO (Common Sense Knowledge Generation and Serving System) is Amazon's AI-powered knowledge graph that maps the real-world relationships between products, customer intentions, and shopping contexts - enabling Amazon's search and recommendation systems to understand why customers buy, not just what they search for. Presented by Amazon researchers at the ACM SIGMOD 2024 conference in Santiago, Chile, COSMO spans 18 major product categories and contains millions of knowledge assertions generated by large language models and validated through human-in-the-loop annotation. In the published experiments it delivered a 60% improvement in search relevance (macro F1, with frozen encoders) and, in an A/B test on 10% of US traffic, a 0.7% sales uplift - a figure the paper's authors say translates to hundreds of millions of dollars in annualised revenue. For Amazon sellers, COSMO represents the shift from keyword-based discoverability to intent-based discoverability: listings that explain who a product is for, what problem it solves, and in what real-world situation it belongs are better positioned than listings that simply match search terms.

This guide explains how COSMO works under the hood, how it relates to Amazon's Rufus AI shopping assistant and the traditional A9/A10 algorithm, what data sources it draws from, and what sellers need to do differently in response.

What Does COSMO Stand For and What Is Its Purpose?

COSMO stands for Common Sense Knowledge Generation and Serving System. It is a large-scale AI framework developed by Amazon's applied science team - led by researchers Changlong Yu and Zheng Li (Applied Scientists, Amazon Stores), in collaboration with Professor Yangqiu Song at the Hong Kong University of Science and Technology (HKUST) - to bridge the gap between how traditional e-commerce search algorithms understand products and how real humans think about their purchases.

The core problem COSMO solves is straightforward. Traditional e-commerce knowledge graphs catalogue factual product attributes: brand, colour, size, material, weight. These are useful for matching exact queries like "Nike Air Max size 10 black." But they fail to capture the commonsense reasoning that drives most real shopping decisions. When a customer searches for "shoes for a wedding," they mean formal, hard-soled footwear - not trainers. When a pregnant woman searches for shoes, she needs slip-resistant options, even if she never types the words "slip-resistant" or "safety." These are commonsense inferences that any human shop assistant would make instantly, but that keyword-based search engines cannot.

COSMO exists to encode these commonsense connections at industrial scale. It builds a vast, interconnected knowledge graph of entity-relation-entity triples - such as <slip-resistant shoes, used_for_audience, pregnant women> or <camera case + screen protector, capable_of, protecting camera> - and feeds this contextual intelligence into Amazon's search relevance, product recommendation, and navigation systems. The COSMO research paper, published in the Companion Proceedings of SIGMOD 2024, describes it as the first industry-scale knowledge system to leverage large language models for constructing high-quality commonsense knowledge graphs in e-commerce.
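To make the triple format concrete, here is a minimal sketch of how such assertions could be stored and looked up by intent. The `Triple` class, the `index_by_tail` helper, and the index layout are illustrative assumptions, not Amazon's actual schema; only the two example assertions come from the text above.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One commonsense assertion: head --relation--> tail (hypothetical layout)."""
    head: str
    relation: str
    tail: str


# The two example assertions quoted in the text above.
TRIPLES = [
    Triple("slip-resistant shoes", "used_for_audience", "pregnant women"),
    Triple("camera case + screen protector", "capable_of", "protecting camera"),
]


def index_by_tail(triples):
    """Group heads by (relation, tail) so an intent can be looked up directly."""
    index = defaultdict(list)
    for t in triples:
        index[(t.relation, t.tail)].append(t.head)
    return dict(index)


intent_index = index_by_tail(TRIPLES)
print(intent_index[("used_for_audience", "pregnant women")])
# prints ['slip-resistant shoes']
```

Indexing by relation and tail mirrors the direction a search system would need: start from the inferred intent and retrieve the products linked to it.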

How Does Amazon COSMO Work? The Technical Architecture Explained

COSMO operates through a four-stage pipeline that converts raw customer behaviour data into structured commonsense knowledge. Understanding this pipeline is essential for sellers because it reveals exactly what types of information COSMO values - and, by extension, what your listings need to communicate.

Stage 1: User Behaviour Sampling. COSMO begins by analysing two types of customer behaviour data from the Amazon Store: search-buy pairs (what customers searched for and subsequently purchased) and co-buy pairs (products purchased together in the same shopping session). The system samples millions of these behaviour pairs across Amazon's most popular product categories. To reduce noise, COSMO applies initial filtering - for example, removing co-purchase pairs where the two products belong to categories that are too far apart in Amazon's product taxonomy, which would suggest the co-purchase was coincidental rather than intentional.
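The category-distance filter in Stage 1 can be sketched as follows. The paper does not specify the distance measure or the cut-off, so the path-based `taxonomy_distance`, the `max_distance=2` threshold, and the example category paths (other than Electronics > Accessories > Camera Cases) are assumptions for illustration only.

```python
def taxonomy_distance(path_a, path_b):
    """Steps between two category paths via their deepest shared ancestor."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    return (len(path_a) - shared) + (len(path_b) - shared)


def filter_co_buy_pairs(pairs, category_paths, max_distance=2):
    """Keep only pairs whose categories are at most max_distance apart."""
    kept = []
    for left, right in pairs:
        distance = taxonomy_distance(category_paths[left], category_paths[right])
        if distance <= max_distance:
            kept.append((left, right))
    return kept


# Hypothetical category paths for three products.
category_paths = {
    "camera case": ("Electronics", "Accessories", "Camera Cases"),
    "screen protector": ("Electronics", "Accessories", "Screen Protectors"),
    "garden hose": ("Patio & Garden", "Watering", "Hoses"),
}
pairs = [("camera case", "screen protector"), ("camera case", "garden hose")]
kept_pairs = filter_co_buy_pairs(pairs, category_paths)
print(kept_pairs)
# prints [('camera case', 'screen protector')]
```

The camera case and garden hose share no ancestor, so the pair is treated as a coincidental co-purchase and dropped before any LLM prompting happens.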

Stage 2: LLM-Powered Knowledge Generation. The filtered behaviour pairs are fed to large language models, which are prompted to explain the commonsense relationship between the query and the purchase (or between co-purchased products). The LLM uses four foundational relationship types drawn from ConceptNet, a well-established commonsense knowledge base: usedFor, capableOf, isA, and cause. From the initial outputs, Amazon's researchers identify recurring finer-grained relationships and codify them using canonical formulations: used_for_function, used_for_event, used_for_audience, used_in_location, and others. The LLM is then re-prompted using these more specific relationship templates to produce higher-quality assertions.

Stage 3: Human-in-the-Loop Filtering and Refinement. LLMs frequently generate empty or circular rationales - explanations like "customers bought them together because they like them." COSMO addresses this through a multi-layered quality control pipeline. First, rule-based filtering removes incomplete or grammatically broken outputs. Second, semantic similarity filtering removes cases where the LLM's answer is simply paraphrasing the input question. Third, a representative subset is sent to human annotators, who evaluate each assertion on two criteria: plausibility (is the inferred relationship reasonable?) and typicality (is the target product commonly associated with the query or source product?). These human judgements are then used to train machine learning classifiers that score the remaining candidates, keeping only those above a quality threshold.
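The three filtering layers can be sketched as a simple chain. This is a toy version under stated assumptions: `difflib.SequenceMatcher` is a lexical stand-in for the semantic similarity model, `toy_score` is a hypothetical placeholder for the classifier trained on annotator labels, and all thresholds are invented for the example.

```python
from difflib import SequenceMatcher


def passes_rules(answer, min_words=4):
    """Rule-based check: reject empty or very short outputs."""
    return len(answer.split()) >= min_words


def is_paraphrase(question, answer, threshold=0.8):
    """Lexical stand-in for semantic similarity: flag answers that echo the question."""
    ratio = SequenceMatcher(None, question.lower(), answer.lower()).ratio()
    return ratio >= threshold


def toy_score(question, answer):
    """Hypothetical stand-in for the classifier trained on annotator labels."""
    return 0.0 if "because they like them" in answer else 1.0


def filter_candidates(candidates, score_fn, min_score=0.5):
    """Apply the three checks in order and keep the surviving answers."""
    kept = []
    for question, answer in candidates:
        if not passes_rules(answer):
            continue
        if is_paraphrase(question, answer):
            continue
        if score_fn(question, answer) < min_score:
            continue
        kept.append(answer)
    return kept


question = "why were a camera case and a screen protector bought together"
candidates = [
    (question, "they like them"),  # too short
    (question, question.capitalize()),  # echoes the question
    (question, "customers bought them together because they like them"),  # circular
    (question, "both items protect a camera from damage"),
]
kept = filter_candidates(candidates, toy_score)
print(kept)
# prints ['both items protect a camera from damage']
```

Ordering the cheap checks first means the learned scorer, the most expensive step in a real pipeline, only sees candidates that are already well formed.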

Stage 4: Instruction Tuning and Scaled Generation (COSMO-LM). The filtered, human-validated knowledge is used to create instruction-tuning datasets. Amazon fine-tunes an efficient language model - dubbed COSMO-LM - on approximately 30,000 annotated instructions. This COSMO-LM can then faithfully generate millions of high-quality commonsense knowledge assertions at scale, expanding the knowledge graph across all 18 major categories on Amazon without requiring proportional human annotation effort. The result is a knowledge graph comprising millions of entity-relation-entity triples assembled from this process.

What Types of Knowledge Does COSMO's Graph Contain?

COSMO's knowledge graph encodes relationships that go far beyond traditional product attributes. The graph uses a structured set of relationship types, each capturing a different dimension of how products connect to human needs and real-world contexts. Based on the SIGMOD 2024 paper and Amazon Science's technical blog, there are 15 relationship categories; the ones most relevant to sellers include:

used_for_function - What practical purpose does the product serve? The paper's Table 2 lists example tails across categories including "dry face," "hold snacks," and "build a fence." A closely related example from the paper, expressed with the capable_of relation, is that a long-sleeve puffer coat is capable_of providing high-level warmth. This is the most direct product-to-need mapping.

used_for_audience - Who is the intended user? Example: slip-resistant shoes are used_for_audience pregnant women. This is the relationship type Amazon Science uses most frequently to illustrate COSMO's capabilities, because it captures knowledge that keyword search simply cannot infer.

used_for_event - What occasion or event is the product associated with? Example: formal hard-soled shoes are used_for_event attending a wedding. This allows COSMO to surface products for event-driven queries like "gift for baby shower" or "supplies for camping trip" even when listings don't contain those exact phrases.

used_in_location - Where is the product typically used? The paper's Table 2 lists "bedroom" as an example tail for this relationship type. This enables location-aware product matching.

capable_of - What can the product do or enable? Example: a co-purchase of a camera case and screen protector is capable_of protecting a camera. This powers COSMO's co-purchase recommendation logic.

isA - Taxonomic classification beyond Amazon's standard browse node hierarchy. The paper's example is that a product is_a "normal suit" - classifying it by concept or product type rather than by rigid category tree alone.

cause - What outcome does the product lead to? This captures causal chains that drive purchasing intent.

The practical significance of these relationship types is that they define the commonsense dimensions COSMO uses to connect products to customer intentions. COSMO builds its knowledge graph primarily from customer behaviour data (search-buy and co-purchase patterns), not by directly parsing your listing text. However, product catalogue information (titles, descriptions, and attributes) is used as input when COSMO's knowledge is applied to downstream tasks like search relevance scoring. Listings that clearly communicate who a product is for, what it does, and where and when it's used provide better raw material for both COSMO's behaviour-based inference and the relevance models that consume COSMO's output.

How Does COSMO Relate to Amazon Rufus?

COSMO and Rufus are complementary systems that operate at different layers of Amazon's AI search stack - though it is important to note that Amazon has not explicitly confirmed a direct technical integration between them. The COSMO paper (SIGMOD 2024) does not mention Rufus, and Rufus documentation does not reference COSMO by name. However, both systems serve Amazon's search and product discovery infrastructure, and the architectural logic strongly suggests they are connected. Understanding their likely relationship helps sellers think about optimisation holistically.

COSMO is the knowledge layer. It builds and maintains the commonsense knowledge graph - the structured web of relationships between products, customer intentions, use cases, audiences, events, and locations. COSMO does not interact with customers directly. It operates behind the scenes, feeding contextual intelligence into Amazon's search relevance models, recommendation engines, and navigation systems.

Rufus is the conversational interface. Launched in February 2024 and now available to over 300 million customers across the US, UK, Germany, France, Italy, Spain, Canada, and India, Rufus is the generative AI shopping assistant that customers interact with directly. Built on Amazon Bedrock using multiple LLMs (including Anthropic's Claude Sonnet and Amazon Nova), Rufus draws on Amazon's product catalogue, customer reviews, community Q&As, and other data sources to generate conversational product recommendations.

The most likely relationship is a two-layer system:

COSMO builds the understanding: "A pregnant woman searching for shoes probably needs slip-resistant options."

Rufus delivers the recommendation: "Based on your search, here are some highly-rated slip-resistant shoes that other expectant mothers have found comfortable."

Both systems are designed to close the same gap - between what customers type and what they actually need - which makes their integration architecturally logical even if Amazon has not publicly documented it. For sellers, the practical takeaway is the same regardless: listings that are rich in contextual information about who, what, why, where, and when serve both COSMO's knowledge graph construction (which is behaviour-driven) and Rufus's retrieval-augmented generation (which is content-driven).

How Does COSMO Differ From Amazon's Traditional A9/A10 Algorithm?

COSMO differs from Amazon's A9/A10 search algorithm in three fundamental ways, though it is important to understand that COSMO does not replace A9/A10 - it augments it.

From keyword matching to intent understanding. A9 (and its A10 evolution) is fundamentally a lexical matching system. It indexes the words in product listings and matches them against the words in a customer's search query. Ranking factors include keyword relevance, conversion rate, sales velocity, and review quality - but the foundational logic is text-string matching. COSMO operates on a different paradigm entirely: it maps the intent behind the search to the purpose of the product, regardless of whether the exact keywords overlap. A customer searching for "furniture for a small apartment" through A9 sees listings containing those keywords. COSMO understands that the customer likely wants multi-functional, space-saving furniture - a sofa bed, a storage ottoman - even if those products are not listed under "small apartment furniture."

From product-centric to customer-centric taxonomy. A9 organises products using Amazon's browse node hierarchy - a rigid, product-centric taxonomy (Electronics > Accessories > Camera Cases). COSMO introduces what the researchers call "multi-turn navigation" - a dynamic, customer-centric approach where search refinements adapt to the inferred intent of the session. A search for "camping" might lead to "air mattress," which refines to "camping air mattress," then COSMO offers contextual subtypes: lakeside camping, mountain camping, 4-person camping. This navigation tree is constructed from COSMO's knowledge graph rather than from Amazon's static category structure.
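The camping refinement sequence above can be pictured as a small lookup structure. This is only a sketch: the `NAVIGATION` dictionary, `refinements`, and `walk` are hypothetical names, and the real navigation tree is derived from the knowledge graph rather than hand-written.

```python
# Hypothetical refinement map built from the camping example in the text.
NAVIGATION = {
    "camping": ["air mattress"],
    "air mattress": ["camping air mattress"],
    "camping air mattress": [
        "lakeside camping",
        "mountain camping",
        "4-person camping",
    ],
}


def refinements(query):
    """Return the suggested refinements for a query, or an empty list."""
    return NAVIGATION.get(query, [])


def walk(start):
    """Follow the first suggestion at each turn until no refinements remain."""
    path = [start]
    while refinements(path[-1]):
        path.append(refinements(path[-1])[0])
    return path


print(walk("camping"))
# prints ['camping', 'air mattress', 'camping air mattress', 'lakeside camping']
```

Each step is keyed on the shopper's current query rather than on a fixed category node, which is the difference from a static browse hierarchy.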

From broad application to targeted deployment. A critical nuance that many seller-education articles miss: COSMO is not applied uniformly to every search. Amazon's own research states that COSMO is most valuable for broad or ambiguous queries where there is a semantic gap between what the customer typed and what they actually need. Specific, unambiguous queries - "Apple AirPods Pro 2nd generation" - are still handled primarily by A9's keyword matching and ranking logic. COSMO's power is greatest in the long-tail and conversational query space where customer intent is implicit rather than explicit.

A9/A10 has not been switched off. Keyword relevance, conversion rates, sales velocity, and review quality remain foundational ranking signals. COSMO adds a semantic intelligence layer on top of this foundation. The SIGMOD paper reports deployment on approximately 10% of US traffic for search navigation; Amazon has not publicly confirmed the current scope of deployment beyond that initial test.

What Data Sources Does COSMO Use?

COSMO draws on several data sources to construct and refine its knowledge graph, and understanding these sources helps sellers identify which touchpoints matter most for optimisation.

Search-buy behaviour pairs. The primary input: what customers searched for and what they subsequently purchased within a defined time window or number of clicks. This is COSMO's strongest signal for inferring purchase intent.

Co-purchase behaviour pairs. Products purchased together in the same shopping session. COSMO uses these to infer complementary relationships - why a customer who bought a camera case also bought a screen protector, and what common purpose both purchases serve.

Product catalogue data. The SIGMOD paper describes how, for search relevance tasks, product titles, descriptions, and attributes are concatenated into a single text span and fed into the relevance models alongside COSMO's generated knowledge. This means the text content on your listing contributes to how COSMO's knowledge is applied to your product - though the knowledge graph itself is built from behaviour data, not from directly parsing listing text.
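As a rough illustration of that concatenation step, the sketch below joins catalogue fields into one text span and pairs it with a query and a piece of generated knowledge. The separator, the `[KNOWLEDGE]` marker, and both function names are assumptions; the paper does not publish the exact input format.

```python
def build_product_text(title, description, attributes, separator=" | "):
    """Join title, description, and attributes into a single text span."""
    attribute_text = ", ".join(f"{key}: {value}" for key, value in attributes.items())
    parts = [title, description, attribute_text]
    return separator.join(p.strip() for p in parts if p and p.strip())


def build_relevance_input(query, product_text, knowledge):
    """Pair the query with the product text plus generated knowledge (hypothetical format)."""
    return {"query": query, "document": f"{product_text} [KNOWLEDGE] {knowledge}"}


product_text = build_product_text(
    "Slip-Resistant Walking Shoes",
    "Cushioned sole with a wide fit for all-day comfort.",
    {"closure": "slip-on", "sole": "rubber"},
)
model_input = build_relevance_input(
    "shoes for pregnant women",
    product_text,
    "slip-resistant shoes used_for_audience pregnant women",
)
print(model_input["document"])
```

The point for sellers is that every catalogue field ends up in the same text span, so context written in any of them is visible to the relevance model.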

A+ Content (formerly Enhanced Brand Content). Amazon's published COSMO research does not explicitly confirm whether COSMO parses A+ Content modules. The SIGMOD paper describes COSMO processing product titles, descriptions, and attributes - standard catalogue fields. However, given that Amazon's broader AI search infrastructure (including Rufus) uses retrieval-augmented generation across product page data, it is reasonable to assume that richer on-page content contributes to the overall data environment these systems draw from - even if a direct COSMO-to-A+ Content pathway is unconfirmed.

Customer reviews and Q&A. The COSMO paper does not list customer reviews or community Q&A as direct inputs to its knowledge graph construction pipeline. COSMO's knowledge is built from search-buy and co-purchase behaviour patterns. However, Amazon's Rufus assistant separately uses retrieval-augmented generation that draws on reviews and Q&A content, and the language customers use in reviews - describing who uses a product, in what situations, and why - likely influences the broader search ecosystem. Reviews remain a valuable signal in Amazon's overall discovery infrastructure, even if their role in COSMO specifically is unconfirmed.

Image content. Amazon's published COSMO research does not describe image analysis as part of the knowledge graph construction pipeline. COSMO's architecture, as documented in the SIGMOD 2024 paper, is text-based - processing product titles, descriptions, attributes, and customer behaviour data. Amazon does use computer vision across its broader platform, and Rufus draws on product page data including images through retrieval-augmented generation. Sellers should treat product imagery as a contributing signal to Amazon's wider AI ecosystem, but a direct COSMO-to-image-analysis pathway has not been confirmed.

What Were COSMO's Performance Results?

COSMO's published performance results, drawn from the SIGMOD 2024 paper and Amazon's A/B testing program, demonstrate significant impact across three areas:

Search relevance improvement. In experiments using the Amazon Shopping Queries Data Set (created for KDD Cup 2022), COSMO-enhanced cross-encoder models achieved a 60% increase in macro F1 score over the best baseline when encoders were frozen (meaning the only variable was the addition of COSMO knowledge graph data). When encoders were fine-tuned, the COSMO model still maintained a 28% edge in macro F1 and a 22% edge in micro F1 over baselines. These are substantial margins in information retrieval research, demonstrating that commonsense knowledge provides complementary information that even task-specific fine-tuning cannot fully replicate.
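For readers unfamiliar with the two metrics, this small sketch shows how macro F1 and micro F1 diverge on imbalanced labels. The label names and toy predictions are invented for the example and are not taken from the paper's experiments.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for a single class, treating it as the positive label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


def micro_f1(y_true, y_pred):
    """With exactly one label per example, micro F1 equals plain accuracy."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: a model that always predicts the majority class.
y_true = ["exact"] * 8 + ["irrelevant"] * 2
y_pred = ["exact"] * 10
print(round(micro_f1(y_true, y_pred), 3))  # prints 0.8
print(round(macro_f1(y_true, y_pred), 3))  # prints 0.444
```

Macro F1 penalises a model that ignores rare classes, which is why a gain on that metric suggests better handling of the harder, less common query-product relationships.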

Session-based recommendation. COSMO-GNN (a graph neural network extension incorporating COSMO's knowledge) achieved up to a 5.82% improvement in Hits@10 on session-based recommendation benchmarks in the electronics domain, and 4.05% in clothing. The paper notes that electronics sessions involved more diverse and complex search sequences (averaging 2.47 unique queries per session versus 1.36 for clothing), suggesting that COSMO's commonsense knowledge is most beneficial when customers revise their searches multiple times - indicating harder-to-express intent.
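Hits@10 measures how often the item a customer actually chose next appears among the model's top ten suggestions. A minimal sketch of the metric, with made-up sessions and a smaller k so the example stays readable:

```python
def hits_at_k(ranked_lists, targets, k=10):
    """Fraction of sessions whose true next item appears in the top-k ranking."""
    hits = sum(target in ranked[:k] for ranked, target in zip(ranked_lists, targets))
    return hits / len(targets)


# Two toy sessions: model rankings and the item each shopper actually picked.
ranked_lists = [
    ["tent", "air mattress", "lantern"],
    ["phone case", "charger", "earbuds"],
]
targets = ["air mattress", "screen protector"]

print(hits_at_k(ranked_lists, targets, k=2))  # prints 0.5
```

The first session counts as a hit because the chosen item is ranked second; the second is a miss because the chosen item was never suggested.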

Live A/B testing. In Amazon's online A/B experiment on 10% of US traffic, COSMO produced an 8% increase in navigation engagement rates and a 0.7% increase in product sales. A 0.7% sales uplift sounds modest in isolation, but the paper's authors note it translates to hundreds of millions of dollars in annual revenue. The paper further states that extending COSMO to all US traffic for navigation alone could generate a revenue increase "in the billions."

When Is COSMO Used - And When Is It Not?

COSMO is not applied to every search query on Amazon. Understanding when COSMO's knowledge graph is activated helps sellers prioritise which types of listings and content need the most contextual optimisation.

COSMO is most active for broad, ambiguous, or intent-driven queries. Searches like "gifts for someone who likes cooking," "furniture for small apartments," "shoes for a wedding," or "supplies for camping" - where the customer's underlying need is not fully expressed in the query text - are where COSMO adds the most value. These queries have a significant semantic gap between the words typed and the products that would actually satisfy the customer's intent.

COSMO is less active for specific, unambiguous queries. A search for "Instant Pot Duo 7-in-1 6 quart" is already precise enough for A9 to handle effectively. There is no semantic gap to bridge, no commonsense inference required. COSMO's processing resources are not wasted on queries where keyword matching alone produces strong results.

COSMO powers search navigation refinements. Even when not directly altering ranked results, COSMO drives the dynamic category refinement suggestions that appear on search results pages - the "you might also want to filter by..." prompts that guide customers toward more specific intent.

The practical implication: sellers competing in categories where customer queries tend to be broad and intent-driven (Home & Kitchen, Clothing, Sports & Outdoors, Baby Products, Patio & Garden) stand to gain the most from COSMO-aligned optimisation. Sellers in categories where queries are typically brand- and model-specific (Electronics, Video Games) will see a smaller COSMO effect - though the long-tail and gifting use cases within these categories still benefit.

What Does COSMO Mean for Amazon Sellers?

COSMO's deployment changes the optimisation equation for Amazon sellers in several concrete ways:

Listings need to communicate context, not just features. The single most important strategic shift is moving from feature-specification listings to intent-rich listings. COSMO's knowledge graph encodes relationships between products and human needs, activities, life situations, and real-world contexts - derived from analysing customer behaviour patterns. When the relevance models that consume COSMO's knowledge evaluate your product against a customer query, your listing's text (titles, descriptions, attributes) is concatenated into a single input alongside COSMO's generated knowledge. A bullet point that says "BPA-free plastic" provides one signal. A bullet point that says "BPA-free plastic makes this safe for daily use with hot beverages - no chemical leaching even when microwaved" provides richer context that aligns with COSMO's used_for_function relationship type and gives relevance models more to work with.

Answer the implicit question behind the search. COSMO's knowledge graph is structured around understanding why customers buy products. The relationship types it encodes - who is this product for, what does it do, when and where would someone use it, what occasion or event does it suit, what lifestyle or life situation does it serve - are the same questions your listing should answer. Listings that address these questions in titles, bullet points, descriptions, and A+ Content are better positioned within the relevance models that use COSMO's knowledge as an input, because the product's text representation will align more closely with the commonsense intent COSMO has mapped for the relevant query.

Customer reviews and Q&A contribute to Amazon's broader AI ecosystem. While the COSMO paper does not list reviews or Q&A as direct inputs to its knowledge graph, Amazon's wider search and recommendation infrastructure - including Rufus - does draw on this content. If your reviews consistently mention that your product is "perfect for new moms" or "great for camping trips," this language feeds into the data environment that Amazon's AI systems use to understand your product's real-world context. Actively cultivating detailed, context-rich reviews - and responding thoughtfully to Q&A questions - remains a sound discovery strategy even if its specific pathway through COSMO is unconfirmed.

Image content supports the broader AI discovery ecosystem. While the COSMO paper does not describe image analysis in its pipeline, Amazon's wider search infrastructure uses computer vision across product pages. Product images that clearly depict the product in context - being used by the target audience, in the target environment, for the target purpose - provide additional signals to Amazon's AI systems beyond COSMO. Infographic images with clear, legible text overlays communicate information that both customers and AI models can interpret.

Product categorisation precision matters more. COSMO's paper describes using Amazon's browse node hierarchy (product categories) as a foundational input for sampling products and behaviour pairs during knowledge graph construction. Products are sampled from specific browse nodes, and co-purchase pairs are filtered based on category proximity. A product in the wrong category may be excluded from relevant behaviour samples or paired with unrelated products during COSMO's knowledge generation process. Precise categorisation ensures your product sits in the correct neighbourhood of COSMO's knowledge graph.

A+ Content likely contributes to AI-driven discovery. While Amazon's published COSMO research does not confirm that A+ Content modules are parsed by the knowledge graph pipeline specifically, A+ Content forms part of the product detail page that Amazon's broader AI systems - including Rufus - can access. A+ Content that includes detailed use-case scenarios, audience descriptions, and contextual imagery provides additional information that can serve Amazon's AI discovery ecosystem, even if the direct COSMO pathway is unconfirmed.

Frequently Asked Questions About Amazon COSMO

Has Amazon officially announced COSMO's integration into its search algorithm? Amazon has not made a formal public announcement equivalent to the Rufus launch. The information about COSMO comes primarily from the SIGMOD 2024 research paper published by Amazon's applied science team and the accompanying Amazon Science blog post. The paper reports that COSMO has been deployed in Amazon search navigation applications and tested on approximately 10% of US traffic. Amazon has not publicly disclosed whether deployment has expanded beyond that initial scope.

Does COSMO replace the A9/A10 algorithm? No. COSMO is an additional intelligence layer that augments Amazon's existing search infrastructure. A9/A10's keyword matching, conversion-based ranking, and sales velocity signals remain active. COSMO adds a semantic understanding layer that is particularly influential for broad, ambiguous, and intent-driven queries. Think of it as A9 handling the "what" and COSMO adding the "why."

What is the difference between COSMO and Rufus? COSMO is the backend knowledge graph - the structured database of commonsense relationships between products, intents, audiences, events, and contexts. Rufus is the frontend conversational AI assistant that customers interact with. Although Amazon has not explicitly confirmed a direct technical link between the two systems, both serve Amazon's search and product discovery infrastructure, and their architectures are complementary. Optimising for the principles behind both - contextual richness and intent clarity - serves sellers regardless of the exact internal wiring.

How quickly does COSMO index new listing content? Amazon has not published official indexing timelines for COSMO. The SIGMOD paper describes an asynchronous cache-based deployment architecture where COSMO-LM processes queries in batches and refreshes daily, rather than generating knowledge in real time at query time. The paper explicitly acknowledges that this architecture is "limited in processing real-time information, such as flash sales." This batch-processing approach suggests that changes to listing content or customer behaviour patterns would take some time to flow through into COSMO's knowledge graph, though the exact lag is not specified.
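The batch-and-cache pattern described in that answer can be sketched as follows. `DailyKnowledgeCache` and its methods are hypothetical names, and the lambda stands in for a call to COSMO-LM; the actual serving system is not public beyond the paper's high-level description.

```python
from datetime import date


class DailyKnowledgeCache:
    """Toy cache: knowledge is generated in a batch job, never at query time."""

    def __init__(self, generate_fn):
        self._generate = generate_fn  # stand-in for a call to the language model
        self._cache = {}
        self.refreshed_on = None

    def refresh(self, queries, today):
        """Batch job: regenerate knowledge for the given queries."""
        self._cache = {query: self._generate(query) for query in queries}
        self.refreshed_on = today

    def lookup(self, query):
        """Serving path: read from the cache only, returning None on a miss."""
        return self._cache.get(query)


cache = DailyKnowledgeCache(lambda query: f"knowledge for {query}")
cache.refresh(["camping air mattress"], date(2024, 6, 1))

print(cache.lookup("camping air mattress"))  # prints knowledge for camping air mattress
print(cache.lookup("flash sale tent"))  # prints None
```

A query that first appears after the last refresh gets no cached knowledge until the next batch run, which illustrates the limitation with real-time events that the paper acknowledges.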

Are keywords still important for Amazon SEO? Yes. Highly relevant, high-search-volume keywords remain essential - particularly for specific, unambiguous queries that A9 handles directly. The change is that keyword relevance alone is no longer sufficient for broad and intent-driven queries. Sellers need both: keyword coverage for A9 and contextual richness for COSMO. The shift is from keyword stuffing to keyword context - using keywords naturally within sentences that also communicate who, what, why, where, and when.

Which product categories are most affected by COSMO? COSMO spans 18 major Amazon product categories, as documented in the SIGMOD paper's Table 3. Categories where customer queries tend to be broad and intent-driven - such as Home & Kitchen, Clothing, Sports & Outdoors, Baby Products, Patio & Garden, and Toys & Games - are most affected because these are the categories where COSMO's commonsense inference adds the most value over traditional keyword matching. Categories with primarily brand-specific and model-specific search behaviour (Electronics, Video Games) see a smaller but still meaningful COSMO effect, particularly for long-tail and gifting queries.

Summary: What Sellers Need to Know About Amazon COSMO

Amazon COSMO is the commonsense knowledge graph that powers the semantic intelligence layer of Amazon's search, recommendation, and navigation systems. It was presented at the ACM SIGMOD 2024 conference, demonstrated a 60% improvement in search relevance and a 0.7% live sales uplift on 10% of US traffic, and is architecturally complementary to Amazon's Rufus AI shopping assistant - though a direct technical integration between the two has not been publicly confirmed by Amazon.

The core strategic shift for sellers is from keyword-centric listings to intent-centric listings. COSMO's knowledge graph encodes relationships between products and real-world human contexts - functions, audiences, events, locations, and causes - derived from customer behaviour patterns. Listings that communicate this context across titles, descriptions, attributes, and likely A+ Content are better served by the relevance models that use COSMO's knowledge, because the product's text representation aligns more closely with the commonsense connections COSMO has mapped. Sellers who only optimise for keyword matching will find their approach increasingly incomplete as Amazon's AI-driven search infrastructure continues to mature.
