How product search works

MyChatBot's product search is a smart, multi-step search: a single query is matched several ways at once (by meaning, by keywords, by category, and optionally by product image), then blended and reranked so your agent gets one clean, ranked product list back. Read this page when you want the whole flow at a glance before tuning filters, prompts, or image search.

Your agent reaches this search over the Product (commerce) MCP. The catalog it searches is whatever product feed you connected in the app.

In the app: connect a catalog

Product search runs against a product feed you connect under Knowledge base → Products (app.mychatbot.app/knowledge-base). Add or refresh a feed there, and MyChatBot indexes it so your agent can search it. The Connect AI tools button on the Products card gives you the ready-made MCP URL to point an agent (or Claude / Cursor) at that catalog.

Knowledge base → Products: where a catalog feed is connected, refreshed, and exposed via Connect AI tools

There are two ways your agent can search a catalog, chosen per integration by the Agentic Search toggle in the app. You'll find it when you configure a Products, Product Feed, Product Spreadsheet, or Instagram products source. The toggle governs only product search; FAQ and document knowledge is always simply referenced, with no modes.

| | Direct (toggle OFF, default) | Agentic (toggle ON) |
|---|---|---|
| How it works | One semantic lookup against your catalog; the matching products are handed straight to the assistant | A small search agent works the query in steps (discovers a category's filters/attributes, runs several refined searches, and compares) before answering |
| Filters | Supported | Supported, and the agent chooses them itself |
| Custom search prompt | - | You can give it guidance on how to search your catalog |
| Speed | Fastest | A bit slower (it does more work) |
| Best for | Straightforward "find me X" queries | Vague, comparative, or attribute-heavy questions on large / complex catalogs |

Rule of thumb

Start with direct (toggle off) for speed. Turn on Agentic Search when your catalog is large and attribute-rich, or when customers ask nuanced, multi-constraint questions ("a warm waterproof jacket under $150 for hiking"). The step-by-step pipeline below is a single direct lookup; agentic mode simply runs that lookup several times, refining as it goes.

You flip this toggle where you connect the catalog; see Getting inventory in. For Instagram catalogs, the same toggle lives in the Instagram products setup. See also the Search tools reference, Filters, attributes & params, and Troubleshooting.

Cheat sheet

The search runs top-to-bottom. Each step below operates on the results of the step above it.

| # | Step | What it does | Skipped when |
|---|---|---|---|
| 1 | Understand the query | Matches the query both by meaning (semantic) and by exact words | Query is code-like (SKU/barcode) or a pasted URL → semantic match dropped so exact matching wins |
| 2 | Search the whole catalog | Blends meaning + product-name + keyword + exact-phrase matches into one ranked list | - |
| 3 | Attribute-filter fallback | Retries a filtered search a different way if the first pass came back empty | No filters used, or the first pass already returned results |
| 4 | Pick the best categories | Groups results by category and ranks the most relevant categories | - |
| 5 | Rank inside each category | Re-runs the same blended search inside each top category, in parallel | - |
| 6 | Rerank | A relevance model reorders the list for a cleaner top result | enable_reranking=false, ≤1 result, or a code/URL query |
| 7 | Blend in an image match | Mixes a product-image match into the text results | No image_url supplied |

You build filters: the search never parses natural language

semantic_product_search matches the whole query string, but it only enforces the filters your agent constructs. There is no natural-language-to-filter extraction: nothing turns "red", "summer", or "under $50" into constraints for you. Splitting the shopper's sentence into query text plus structured filters is the agent's job; see From a shopper sentence to a search call.

How the search index is organized

Each catalog you connect gets its own search index. The search reads it three ways:

| View | What it holds | Used by |
|---|---|---|
| Whole catalog | Every product, matchable by meaning, name, and keywords, plus your filterable attributes | Steps 2 to 4 (blend + category discovery) |
| Per-category slice | The same products, scoped to a single leaf category | Step 5 (per-category ranking) |
| Image index | A visual fingerprint of each product image | Step 7 (image blending) |

Exact names and phrases outrank loose semantic matches

Steps 2 and 5 blend four kinds of match: meaning (semantic), product-name keywords, general keywords, and exact phrases. Name and exact-phrase matches are deliberately weighted to win over loosely semantic ones, which is why a specific product name usually surfaces its exact item at rank 1. MyChatBot tunes this relevance balance for your catalog; contact support if you need it adjusted.
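
To make the weighting idea concrete, here is a minimal sketch of a weighted blend. It is an illustration only, not MyChatBot's actual scoring: the `WEIGHTS` values, the `blend` function, and the input shape are all assumptions invented for this example.

```python
# Hypothetical weights: name and exact-phrase matches count for more than
# a loose semantic match. The real values are tuned by MyChatBot per catalog.
WEIGHTS = {"semantic": 1.0, "keyword": 1.5, "name": 3.0, "phrase": 4.0}


def blend(matches):
    """Blend per-kind scores into one ranked list of product IDs.

    matches: {match_kind: {product_id: score between 0 and 1}}
    """
    totals = {}
    for kind, scores in matches.items():
        for product_id, score in scores.items():
            totals[product_id] = totals.get(product_id, 0.0) + WEIGHTS[kind] * score
    # Highest blended score first.
    return sorted(totals, key=lambda product_id: totals[product_id], reverse=True)
```

With weights like these, a product that matches on name and exact phrase can outrank one with a stronger semantic score alone, which is the behavior described above.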

Where each filter kind resolves

Filter operators do not all behave the same way. Know which path a constraint takes before you build it:

| Filter kind | Example filter | Works on a broad, category-less search? |
|---|---|---|
| Equality / membership | color eq red, stickers in […] | ✅ Yes |
| Range | price lt 50, size gte 40 | ❌ No: narrow to a single leaf category first, then filter_category_products |

Step by step

1. Understand the query

The query is matched both semantically (by meaning) and by exact words. Two query shapes skip the semantic step to protect exact matching:

  • Code-like: SKUs, barcodes, model numbers.
  • URL-like: a pasted product link.

For these, the semantic match is dropped so exact/code matching dominates, and reranking (step 6) is skipped, because they already have a decisive exact answer.
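
On the agent side, you can apply a similar check before calling the tool, for example to set enable_reranking=false on obvious code lookups. The sketch below is a client-side heuristic only: the regular expression and the helper names (`is_code_like`, `is_url_like`, `build_search_args`) are assumptions for illustration, and the rules the search itself uses to detect these query shapes are not documented here.

```python
import re
from urllib.parse import urlparse

# Hypothetical heuristic: a single token made of letters, digits, and - _ . /
# that contains at least one digit (e.g. "SKU-10482-B" or a barcode).
_CODE_RE = re.compile(r"^(?=.*\d)[A-Za-z0-9][A-Za-z0-9\-_./]*$")


def is_url_like(query: str) -> bool:
    """True for a pasted http(s) link."""
    parsed = urlparse(query.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def is_code_like(query: str) -> bool:
    """True for a single SKU-, barcode-, or model-number-shaped token."""
    return bool(_CODE_RE.match(query.strip()))


def build_search_args(query: str, limit: int = 20) -> dict:
    """Build semantic_product_search arguments, turning reranking off for exact lookups."""
    exact = is_url_like(query) or is_code_like(query)
    return {"query": query, "limit": limit, "enable_reranking": not exact}
```

A heuristic like this will misclassify some product names that contain digits, so treat it as a starting point rather than a rule.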

2. Search the whole catalog

The four match types above run against the whole-catalog index and are blended into one list. Any caller-supplied filters are applied as a hard constraint on every match type. On a whole-catalog search, filters match equality / membership only (eq / in). The search fetches extra depth internally so later steps have headroom to rerank.

3. Attribute-filter fallback

Defense-in-depth for catalogs indexed before searchable attributes existed. If a filtered search comes back empty or errors, the search retries once using a different attribute path that supports all operators. It only engages when filters are present, so it can never change a healthy, non-empty result.
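
The retry logic described above can be pictured roughly as follows. This is a sketch of the control flow only: `primary` and `fallback` are hypothetical callables standing in for the two internal attribute paths, which are not exposed to callers.

```python
def search_with_fallback(primary, fallback, query, filters):
    """Run a filtered search; retry once on the fallback path if it is empty or errors."""
    if not filters:
        # The fallback only engages when filters are present.
        return primary(query, filters)
    try:
        results = primary(query, filters)
    except Exception:
        results = []
    # A non-empty first pass is returned untouched.
    return results if results else fallback(query, filters)
```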

Filters suddenly return nothing on an older catalog

If a filter that used to work starts returning [] on an older catalog, its searchable attributes were probably indexed before this feature existed. The fallback auto-heals most requests, but the permanent fix is a full re-index, which is not a self-serve toggle. In the app, remove and re-add the product integration (Knowledge base → Products), or contact MyChatBot support with your integration and they'll run a full re-index. See Filters, attributes & params.

4. Pick the best categories

Results are grouped by category and each category is scored by how highly its products ranked. Category order is then:

  1. Term-match categories first: categories whose top products contain all query terms in their name.
  2. Frequency-ranked categories: the remaining highest-scoring categories.
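
The ordering above can be sketched as below. The scoring formula (summing 1 / rank per product) and the `rank_categories` function are assumptions chosen for illustration; the page only says that categories are scored by how highly their products ranked. For simplicity the sketch checks every returned product for a term match, not only each category's top products.

```python
def rank_categories(results, query, top_n=3):
    """Order categories: term-match categories first, then by score.

    results: ranked list of dicts with 'name' and 'category' keys (best first).
    """
    terms = query.lower().split()
    scores = {}
    term_match = set()
    for rank, product in enumerate(results):
        category = product["category"]
        # Assumed scoring: higher-ranked products contribute more.
        scores[category] = scores.get(category, 0.0) + 1.0 / (rank + 1)
        name = product["name"].lower()
        if all(term in name for term in terms):
            term_match.add(category)
    by_score = sorted(scores, key=lambda c: scores[c], reverse=True)
    ordered = [c for c in by_score if c in term_match]
    ordered += [c for c in by_score if c not in term_match]
    return ordered[:top_n]
```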

5. Rank inside each category

The selected top categories are searched concurrently, each against its own per-category slice, using the identical blended search from step 2.
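
As a rough picture of the fan-out, the sketch below runs one search per category concurrently with asyncio. The `search_category` coroutine is a hypothetical stand-in for the per-category blended search; the real search does this internally and does not expose such a function.

```python
import asyncio


async def search_top_categories(search_category, query, category_ids):
    """Search every selected category at the same time and collect results per category."""
    tasks = [search_category(category_id, query) for category_id in category_ids]
    # gather preserves input order, so results line up with category_ids.
    results = await asyncio.gather(*tasks)
    return dict(zip(category_ids, results))
```

Because every selected category adds one concurrent search, the number of categories drives both load and latency.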

Big catalogs fan out wide

A query that spans many categories runs many category searches at once, which can be slow on a very large catalog. If searches feel slow or time out, keep queries specific (or split a huge catalog into more focused integrations), and contact support to tune how many categories are searched.

6. Rerank

If reranking is on (the default), the relevance service is available, and there's more than one result, the blended list is reordered by a relevance model for a cleaner top result. Reranking is skipped for code-like and URL-like queries, because they already have a decisive exact ranking, and a rerank failure falls back to the pre-rerank order.
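
Those conditions amount to a guard around the rerank call. The sketch below shows the control flow under assumed names: `maybe_rerank` and its parameters are hypothetical, and `rerank=None` is used here to model an unavailable relevance service.

```python
def maybe_rerank(results, rerank, enable_reranking=True, exact_query=False):
    """Rerank only when every condition holds; otherwise keep the blended order."""
    if not enable_reranking or exact_query or rerank is None or len(results) <= 1:
        return results
    try:
        return rerank(results)
    except Exception:
        # A rerank failure falls back to the pre-rerank order.
        return results
```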

7. Blend in an image match

When image_url is supplied, an image-fingerprint search runs against the image index and is blended with the text results. If the image step fails, the search falls back to the text results. Deep dive: Image search.

Copy-paste: a full agentic query

The agent-facing tool is semantic_product_search on the Product (commerce) MCP. A text + filter call:

json
{
  "tool": "semantic_product_search",
  "arguments": {
    "query": "waterproof winter boots",
    "limit": 20,
    "enable_reranking": true,
    "truncate_description": true,
    "filters": [
      { "attribute": "color",    "operator": "eq", "value": "black" },
      { "attribute": "stickers", "operator": "in", "value": ["Подарунок", "Sale"] }
    ]
  }
}

Returns { count, products[], categories[] }. The categories[] list reflects the category selection from step 4, so your agent can offer to drill down.

A text + image call (triggers step 7):

json
{
  "tool": "semantic_product_search",
  "arguments": {
    "query": "red evening dress",
    "image_url": "https://example.com/photo.jpg",
    "limit": 10
  }
}

In the app: get the MCP URL

You don't hand-write the endpoint. Open Knowledge base → Products, click Connect AI tools on the catalog's card, and copy the generated Product MCP URL (https://product.mychatbot.app/mcp/<account>/<integration>/stream) or the ready-made claude mcp add … command / Cursor config. Full setup: Product (commerce) MCP.

From a shopper sentence to a search call

A sentence like "red summer dresses under $50" is not a single call: the agent has to split it into free-text intent (query) and structured constraints (filters), and route the range part correctly. Map each fragment to where it belongs:

| Fragment | Goes to | Why |
|---|---|---|
| dress (the thing) + summer (descriptor) | query: "summer dress" | Drives the blended search and category discovery; summer stays in text unless the catalog exposes a filterable season attribute (check with get_category_attributes) |
| red | filter: { "attribute": "color", "operator": "eq", "value": "red" } | A hard equality constraint; works on a whole-catalog search |
| under $50 | filter: { "attribute": "price", "operator": "lt", "value": 50 } | A range constraint (see below); it only works once a leaf category is known |

Because under $50 is a range, this becomes a two-call flow.

Step 1, broad search: apply the eq constraint and let the search surface candidate categories.

json
{
  "tool": "semantic_product_search",
  "arguments": {
    "query": "summer dress",
    "limit": 20,
    "filters": [
      { "attribute": "color", "operator": "eq", "value": "red" }
    ]
  }
}

Read categories[] from the response and pick the dresses leaf category (say its id is dresses_042).

Step 2: enforce the price ceiling on that leaf category, where range filters work:

json
{
  "tool": "filter_category_products",
  "arguments": {
    "category_id": "dresses_042",
    "filters": [
      { "attribute": "color", "operator": "eq", "value": "red" },
      { "attribute": "price", "operator": "lt", "value": 50 }
    ],
    "limit": 100
  }
}
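
The two calls above can be wrapped in one helper. This is a minimal sketch under several assumptions: `call_tool(name, arguments)` is a hypothetical function that invokes an MCP tool and returns its parsed response, `pick_category` is a hypothetical callback that chooses a leaf category ID from categories[], and filter_category_products is assumed to return a `products` list like the broad search does.

```python
BROAD_OPS = {"eq", "in"}  # operators that resolve on a whole-catalog search


def two_call_search(call_tool, query, filters, pick_category, limit=20):
    """Broad search with eq/in filters, then a category-scoped call for the rest."""
    broad_filters = [f for f in filters if f["operator"] in BROAD_OPS]
    broad = call_tool(
        "semantic_product_search",
        {"query": query, "limit": limit, "filters": broad_filters},
    )
    if len(broad_filters) == len(filters):
        return broad["products"]  # nothing needs a leaf category

    category_id = pick_category(broad["categories"])
    if category_id is None:
        return broad["products"]  # could not narrow; range filters not applied

    narrowed = call_tool(
        "filter_category_products",
        {"category_id": category_id, "filters": filters, "limit": limit},
    )
    return narrowed["products"]
```

When no category can be picked, the helper returns the broad results without the range constraint, so the caller should decide whether to ask the shopper to narrow down instead.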

A price ceiling can't be enforced on a broad, category-less search

On a whole-catalog search, filters match only eq / in. Range operators (gt, gte, lt, lte) only resolve once you've narrowed to a single leaf category. So passing price lt 50 on a broad semantic_product_search call (no category_id yet) will not reliably clip the price. Narrow first (from the returned categories[], get_all_categories, or a category search), then call filter_category_products with the range filter.

Confirm price is actually a filterable attribute

price can be used as a filter only when your feed exposed it as a real, typed numeric attribute. Verify per category with get_category_attributes(category_id) (is price in attribute_names[]?) and get_category_attribute_values(category_id, "price") (is type numeric?). If the feed buried price somewhere unusual and it wasn't picked up as a filterable attribute, the agent can still see it in the product details but can't filter by it, and a price lt 50 filter silently matches nothing. Fixing this needs a full re-index (not a self-serve toggle): remove and re-add the integration in Knowledge base → Products, or contact MyChatBot support with your integration.
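
That two-part check can be expressed as a small predicate over the two tool responses. The sketch assumes each response is a dict with `attribute_names` and `type` at the top level; the field names come from the text above, but the exact response nesting is an assumption.

```python
def is_range_filterable(attributes_response, values_response, attribute="price"):
    """True when the attribute is listed for the category and typed as numeric."""
    listed = attribute in attributes_response.get("attribute_names", [])
    numeric = values_response.get("type") == "numeric"
    return listed and numeric
```

Run it once per category before building a range filter, and skip the filter (or tell the shopper) when it returns False.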

Broad searches filter by equality only

On a broad, category-less search, filters can only match eq and in. Range operators (gt, gte, lt, lte) and ne / startswith / endswith only work once you've narrowed to a single category. Reach for filter_category_products (ops: eq, gt, gte, lt, lte, in) once you know the category_id, as in the two-call flow above.

Best practices

Do

  • Decompose the shopper's sentence yourself: send free-text intent as query, and build hard constraints (color, size, in-stock) as structured filters. The search does no natural-language-to-filter extraction.
  • Let the search pick categories: send a clean natural-language query and read categories[] back to guide follow-ups (and to get the category_id you need for range filters).
  • Use filters for hard constraints (color, size, in-stock), not for describing intent; intent belongs in query.
  • Set enable_reranking=false for pure SKU / code lookups; reranking adds latency with no benefit there.
  • Reach for get_products_by_ids when you already know the exact IDs, and skip the whole ranking flow.

Don't

  • Don't pass a range filter (lt, gte, …) on a broad, category-less semantic_product_search and expect it to clip results; ranges only resolve inside a single leaf category. Narrow first, then filter_category_products.
  • Don't stuff SKUs into range filters; code-like attributes only resolve inside a category, not on a broad search.
  • Don't assume rank stability with generic product names. A common word in many product names can flip rank 1; filter by category or vendor first, or contact support to tune relevance.
  • Don't paste an image expecting it to override a text-disambiguated answer; image matches are blended, not authoritative (see the rank-1 guard in Image search).

Test it

Regression-test with the known-hits pattern: define a query plus the product IDs you expect, run semantic_product_search, and assert the IDs appear. Introspect a catalog first with get_category_attributes, get_category_attribute_values, and get_available_filters (facets) to learn what's actually filterable, including whether price resolved to a numeric attribute. Full walkthrough: Testing search.
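
A known-hits check can be as small as the sketch below. `call_tool(name, arguments)` is a hypothetical function that invokes the MCP tool and returns its parsed response, and the `id` field on each product is an assumption about the response shape.

```python
def check_known_hits(call_tool, cases, limit=20):
    """Return (query, missing_ids) for every case whose expected IDs did not all appear."""
    failures = []
    for case in cases:
        response = call_tool(
            "semantic_product_search",
            {"query": case["query"], "limit": limit},
        )
        found = {product["id"] for product in response["products"]}
        missing = [pid for pid in case["expected_ids"] if pid not in found]
        if missing:
            failures.append((case["query"], missing))
    return failures
```

An empty return value means every expected product appeared within the first `limit` results; it does not check rank position.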

See also