The schema of the standardized catalog is as follows:
Core Identification
Field | Type | Description |
---|---|---|
id | UUID | Unique identifier for the standardized product record |
source_product_id | STRING | Product ID from the original catalog submission |
marketplace_id | UUID | Identifier for the submitting marketplace or vendor |
Product Content
Field | Type | Description |
---|---|---|
title | STRING | Cleaned and standardized product title |
description | STRING | Standardized product description |
Brand Data
Field | Type | Description |
---|---|---|
raw_brand_name | STRING | Brand name as received in the original catalog |
brand_id | UUID | Recognized canonical brand ID |
standard_brand_name | STRING | Standardized brand name associated with the brand ID |
standard_brand_score | FLOAT | Standardized brand score associated with the brand ID |
Category Classification
Field | Type | Description |
---|---|---|
raw_category_name | STRING | Original category label from the catalog |
category_id | UUID | Mapped category ID from the standardized taxonomy |
standard_category_path | STRING | Full category path (e.g., Electronics → Phones → Smartphones) |
standard_category_score | FLOAT | Standardized category score associated with the category path |
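
Because `standard_category_path` stores the whole hierarchy in one string, consumers will usually want to split it into levels. The following is a minimal sketch, assuming the levels are separated by the `→` character as in the example above; the helper name `split_category_path` is hypothetical and the real delimiter may differ.

```python
def split_category_path(path: str, sep: str = "→") -> list[str]:
    """Split a standard_category_path into its levels.

    Assumption: levels are separated by `sep` (the arrow used in the
    schema's example). Whitespace around each level is trimmed and
    empty segments are dropped.
    """
    return [level.strip() for level in path.split(sep) if level.strip()]


levels = split_category_path("Electronics → Phones → Smartphones")
top_level = levels[0]   # broadest category
leaf = levels[-1]       # most specific category
print(levels, top_level, leaf)
```

Splitting on the bare arrow and trimming afterwards keeps the helper tolerant of inconsistent spacing around the separator.
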
Deduplication and Quality
Field | Type | Description |
---|---|---|
master_product_id | UUID | Unified product ID representing deduplicated variants of the same product |
is_duplicate | BOOLEAN | Indicates whether the product is a duplicate of an existing item |
confidence_score | FLOAT | Confidence score (0–1) indicating certainty in the standardization |
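
Since `master_product_id` represents the deduplicated variants of one product, grouping records by it recovers each product's variants. The sketch below illustrates this on plain dictionaries; the sample records, the short placeholder IDs (real values would be UUIDs), and the `0.8` review threshold are all assumptions made for illustration, not part of the schema.

```python
from collections import defaultdict


def group_by_master(records):
    """Group records that share the same master_product_id."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["master_product_id"]].append(rec)
    return dict(groups)


# Hypothetical sample data with placeholder IDs instead of real UUIDs.
records = [
    {"id": "a1", "master_product_id": "m1", "is_duplicate": False, "confidence_score": 0.97},
    {"id": "a2", "master_product_id": "m1", "is_duplicate": True, "confidence_score": 0.91},
    {"id": "b1", "master_product_id": "m2", "is_duplicate": False, "confidence_score": 0.62},
]

groups = group_by_master(records)
for master_id, variants in groups.items():
    print(master_id, [v["id"] for v in variants])

# Flag records whose standardization confidence falls below an assumed threshold.
REVIEW_THRESHOLD = 0.8  # hypothetical cut-off
low_confidence = [r["id"] for r in records if r["confidence_score"] < REVIEW_THRESHOLD]
print(low_confidence)
```
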
Timestamps
Field | Type | Description |
---|---|---|
created_at | TIMESTAMP | Timestamp when the product was processed |
updated_at | TIMESTAMP | Last updated time for the standardized record |
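
For readers consuming the catalog in code, the fields above can be mirrored in a typed record. This is only a sketch under stated assumptions: the class name `StandardizedProduct` is hypothetical, the mapping of `UUID`, `STRING`, `FLOAT`, `BOOLEAN`, and `TIMESTAMP` to Python types is a guess at a natural fit, and every field is treated as required because the schema does not say which ones are nullable.

```python
from dataclasses import dataclass
from datetime import datetime
from uuid import UUID


@dataclass
class StandardizedProduct:
    # Core identification
    id: UUID
    source_product_id: str
    marketplace_id: UUID
    # Title and description
    title: str
    description: str
    # Brand data
    raw_brand_name: str
    brand_id: UUID
    standard_brand_name: str
    standard_brand_score: float
    # Category classification
    raw_category_name: str
    category_id: UUID
    standard_category_path: str
    standard_category_score: float
    # Deduplication and quality
    master_product_id: UUID
    is_duplicate: bool
    confidence_score: float
    # Timestamps
    created_at: datetime
    updated_at: datetime

    def __post_init__(self) -> None:
        # The schema documents confidence_score as a value in the range 0 to 1.
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence_score must be between 0 and 1")
```

Only `confidence_score` is range-checked here, because it is the only score whose range the schema states; the brand and category scores are left unvalidated.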