Cleaner Data
Catalog One
Catalog One is Topsort’s catalog standardization system designed to help marketplaces, retailers, and brands maintain consistent, searchable, and duplicate-free product data. By identifying and unifying brands, categories, and products across different catalogs, Topsort enhances product discovery, reporting, and ad targeting.
Key Benefits
Improved UX
Better Attribution
Ad Targeting Ready
API Integration
Requirements
Customers send their product catalog to the Topsort Catalog API. The API accepts product data in a structured format that includes key attributes such as title, description, brand name, category name, and product ID (see Schema Description).
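As an illustration, a single product record and a basic check before submission might look like the sketch below. The field names (product_id, title, description, brand_name, category_name) and the "products" envelope are assumptions made for this example; the Schema Description is the authority on the actual names and structure.

```python
import json

# Hypothetical field names mirroring the attributes listed above;
# the real names are defined in the Schema Description.
REQUIRED_FIELDS = ("product_id", "title", "description", "brand_name", "category_name")


def missing_fields(product):
    """Return the required fields that are absent or blank in a product record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = product.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


product = {
    "product_id": "sku-12345",
    "title": "Trail Running Shoe, Men's, Size 10",
    "description": "Lightweight trail shoe with a cushioned midsole.",
    "brand_name": "Acme Outdoors",
    "category_name": "Shoes > Running",
}

problems = missing_fields(product)
if problems:
    raise ValueError(f"Product is missing required fields: {problems}")

# Assumed request envelope: a JSON object holding a list of products.
payload = json.dumps({"products": [product]})
```

Checking for blank attributes before sending is worthwhile because brand, category, and title are the inputs the standardization features rely on.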
Endpoint Engineering Features
Asynchronous Processing: Once a catalog is received, Topsort triggers an offline standardization task. This task processes the catalog using advanced matching logic and machine learning models. Total processing time depends on the size of the original catalog; as a rough estimate, throughput is around 10 products per second.
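To get a feel for what that throughput means in practice, the arithmetic can be sketched as follows. The function name is hypothetical, and the result is only as accurate as the rough 10 products per second figure above.

```python
from datetime import timedelta

PRODUCTS_PER_SECOND = 10  # rough estimate quoted above, not a guarantee


def estimate_processing_time(num_products, rate=PRODUCTS_PER_SECOND):
    """Estimate how long standardization takes for a catalog of a given size."""
    if num_products < 0:
        raise ValueError("num_products must be non-negative")
    return timedelta(seconds=num_products / rate)


# A catalog of 100,000 products: 100,000 / 10 = 10,000 seconds.
print(estimate_processing_time(100_000))  # 2:46:40
```

A catalog of 100,000 products would therefore take a little under three hours, which is why the task runs offline rather than inside the upload request.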
Endpoint ML Features
Brand Recognition: Matches each input brand name to a canonical brand ID using a mix of fuzzy matching, pre-trained models, and large language models.
Category Classification: Automatically maps free-text categories into a standardized category taxonomy, enabling consistent browsing and reporting. Topsort uses the Google taxonomy as its reference for categories.
Product Deduplication: Detects and links duplicate products across the catalog. Duplicates are grouped under a unified master_product_id using techniques like similarity scoring and vector-based clustering.
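The ideas behind fuzzy brand matching and similarity-based deduplication can be illustrated with a toy sketch. This is not Topsort's implementation: it swaps in plain string similarity from Python's standard difflib module in place of trained models and vector clustering, and the brand table, thresholds, and master ID format are all invented for the example.

```python
from difflib import SequenceMatcher, get_close_matches

# Hypothetical canonical brand table (normalized name -> canonical brand ID).
CANONICAL_BRANDS = {"nike": "brand_001", "adidas": "brand_002"}


def match_brand(raw_brand, cutoff=0.8):
    """Fuzzy-match a raw brand string to a canonical brand ID, or None."""
    hits = get_close_matches(
        raw_brand.strip().lower(), list(CANONICAL_BRANDS), n=1, cutoff=cutoff
    )
    return CANONICAL_BRANDS[hits[0]] if hits else None


def similarity(a, b):
    """String similarity score in [0, 1]; a stand-in for a learned score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def assign_master_ids(products, threshold=0.85):
    """Greedy grouping: a product joins the first group whose representative
    title is similar enough, otherwise it starts a new group."""
    representatives = []  # (master_product_id, representative title)
    assignment = {}
    for product in products:
        for master_id, title in representatives:
            if similarity(product["title"], title) >= threshold:
                assignment[product["product_id"]] = master_id
                break
        else:
            master_id = f"master_{len(representatives) + 1}"
            representatives.append((master_id, product["title"]))
            assignment[product["product_id"]] = master_id
    return assignment


catalog = [
    {"product_id": "a1", "title": "Nike Air Max 90 Mens Shoe"},
    {"product_id": "b7", "title": "Nike Air Max 90 Men's Shoe"},
    {"product_id": "c3", "title": "Adidas Ultraboost 22"},
]
master_ids = assign_master_ids(catalog)
```

Here the two near-identical Nike listings end up under the same master_product_id, while the Adidas product gets its own. Greedy grouping against one representative per group is order-dependent and compares each product to every existing group, so it only serves to show the idea on small inputs.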
Output Description
The output is a clean, deduplicated, and enriched catalog where each product is linked to recognized brands, standardized categories, and master product groups. Topsort can also include scores for each inference task.
To consume the standardized catalog, refer to the endpoint documentation. The endpoint is paginated and subject to a rate limit of 10 requests per second.
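A client that walks the pages while staying under the rate limit could be structured roughly as below. The pagination scheme (a next_cursor field), the response shape, and the fetch_page callable are assumptions for this sketch; the endpoint documentation defines the real request and response format. The HTTP call is injected as a function so the example runs without network access.

```python
import time

MAX_REQUESTS_PER_SECOND = 10  # rate limit stated above
MIN_INTERVAL = 1.0 / MAX_REQUESTS_PER_SECOND


def iter_standardized_products(fetch_page, sleep=time.sleep, clock=time.monotonic):
    """Yield products page by page, spacing requests at least MIN_INTERVAL apart.

    fetch_page(cursor) is a hypothetical callable that performs one request and
    returns a dict like {"products": [...], "next_cursor": "..." or None}.
    """
    cursor = None
    last_request = None
    while True:
        if last_request is not None:
            wait = MIN_INTERVAL - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        page = fetch_page(cursor)
        yield from page["products"]
        cursor = page.get("next_cursor")
        if not cursor:
            break


# In-memory stand-in for the real endpoint, keyed by cursor.
FAKE_PAGES = {
    None: {"products": [{"product_id": "a1"}, {"product_id": "b7"}], "next_cursor": "p2"},
    "p2": {"products": [{"product_id": "c3"}], "next_cursor": None},
}


def fake_fetch(cursor):
    return FAKE_PAGES[cursor]


product_ids = [p["product_id"] for p in iter_standardized_products(fake_fetch)]
```

Spacing requests by a fixed minimum interval is the simplest way to respect a per-second limit from a single client; several concurrent clients would need a shared limiter instead.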