Historical Data Migration
Historical data migration transfers performance metrics and event data from a client’s previous ad platform into Topsort. The imported data gives Topsort’s machine learning models a head start and shortens the initial learning period during the platform transition.
Problem
When clients migrate to Topsort, their campaigns face a cold start problem where:
- No Performance History: New campaigns start without any historical performance data
- Learning Period: Machine learning models require 1-4 weeks to accumulate sufficient data for optimization
- Suboptimal Performance: During cold start, campaigns may underperform due to lack of training data
- Advertiser Frustration: Advertisers may experience reduced campaign effectiveness in the initial weeks
Solution
We provide a historical data ingestion solution that imports performance metrics and event data from the client’s previous platform. This data serves as initial training material for Topsort’s machine learning models, significantly reducing the cold start period.
How Historical Data Helps
Model Training Acceleration:
- Provides immediate training data for machine learning algorithms
- Reduces the cold start period from up to 4 weeks to 1-2 weeks
- Enables faster campaign optimization and bidding decisions
Performance Continuity:
- Campaigns can leverage historical performance patterns
- Better initial bid recommendations based on past data
- Improved targeting decisions from historical user behavior
Risk Reduction:
- Minimizes performance dip during platform transition
- Maintains advertiser confidence with familiar performance levels
- Provides baseline metrics for comparison and optimization
Technical Implementation
Our historical data integration:
- Ingests event data including organic impressions, clicks, and purchases
- Processes performance metrics at campaign, product, and user levels
- Trains initial models using imported historical data before go-live
- Calibrates algorithms during initial operation for optimal performance
- Updates embeddings for users, products, and placements based on historical patterns
Migration Process
1. Data Assessment and Scope Definition
Evaluate Historical Data Availability
- Assess what performance data is available from the previous platform
- Determine data quality and completeness
- Define time range for historical data (typically 3-6 months)
- Identify key metrics that align with Topsort’s tracking
2. Data Export and Preparation
Required Historical Data Types:
- Campaign performance metrics (impressions, clicks, conversions, spend)
- Product-level performance data (click-through rates, conversion rates)
- User behavior events (searches, views, purchases)
- Organic traffic patterns and seasonal trends
- Bidding and budget utilization history
3. Data Validation and Processing
Quality Assurance Steps:
- Validate data completeness and accuracy
- Normalize metrics to match Topsort’s data schema
- Clean and process data for model training
- Identify and handle data anomalies or outliers
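The quality assurance steps above can be sketched in a few lines of Python. This is a minimal illustration, not Topsort’s actual validation pipeline: the function names `row_problems` and `flag_outliers` are hypothetical, the rows are assumed to be dictionaries keyed by the campaign performance field names, and the outlier rule (a modified z-score based on the median absolute deviation, with the conventional 3.5 cutoff) is one common choice among many.

```python
from statistics import median


def row_problems(row):
    """Return a list of problems found in one campaign performance row (a dict)."""
    problems = []
    for field in ("impressions", "clicks", "conversions", "spend"):
        value = row.get(field)
        if value is None:
            problems.append(f"missing {field}")
        elif value < 0:
            problems.append(f"negative {field}")
    # Only compare the counts when every field is present and non-negative.
    if not problems and row["clicks"] > row["impressions"]:
        problems.append("clicks exceed impressions")
    return problems


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median absolute deviation) exceeds threshold."""
    values = list(values)
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return [False] * len(values)
    return [abs(0.6745 * (v - center) / mad) > threshold for v in values]


# Example: the last daily CTR is far outside the usual range.
daily_ctr = [0.05, 0.05, 0.06, 0.04, 0.9]
print(flag_outliers(daily_ctr))
```

Flagged rows are better reviewed than silently dropped, since a spike can reflect a real seasonal event rather than a tracking error.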
4. Model Training and Calibration
Initial Training Process:
- Import historical data into Topsort’s training pipeline
- Train initial machine learning models using historical patterns
- Calibrate algorithms for optimal performance
- Validate model accuracy against known historical outcomes
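Validating against known historical outcomes is usually done as a backtest: hold out the most recent slice of the historical data, train on the rest, and compare predictions with what actually happened. The sketch below illustrates the idea only. The helper names `time_split` and `mean_absolute_error` are hypothetical, and the "model" is a deliberately naive stand-in (the average CTR of the training window), not Topsort’s models.

```python
from datetime import date


def time_split(rows, cutoff):
    """Split rows into (train, holdout): strictly before the cutoff date vs. on or after it."""
    train = [r for r in rows if r["date"] < cutoff]
    holdout = [r for r in rows if r["date"] >= cutoff]
    return train, holdout


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and observed values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


rows = [
    {"date": date(2024, 1, 13), "impressions": 1000, "clicks": 40},
    {"date": date(2024, 1, 14), "impressions": 1000, "clicks": 60},
    {"date": date(2024, 1, 15), "impressions": 1000, "clicks": 50},
    {"date": date(2024, 1, 16), "impressions": 1000, "clicks": 70},
]
train, holdout = time_split(rows, cutoff=date(2024, 1, 15))

# Naive stand-in model: predict the pooled CTR of the training window.
predicted_ctr = sum(r["clicks"] for r in train) / sum(r["impressions"] for r in train)
actual_ctr = [r["clicks"] / r["impressions"] for r in holdout]
error = mean_absolute_error([predicted_ctr] * len(holdout), actual_ctr)
```

Splitting by date rather than at random keeps future data out of the training set, which matters when the goal is to estimate how a model will behave after go-live.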
5. Production Deployment and Monitoring
Go-Live Process:
- Deploy trained models to the production environment
- Monitor initial performance against historical baselines
- Fine-tune algorithms based on new real-time data
- Gradually shift from historical to real-time data optimization
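One simple way to picture the gradual shift from historical to real-time data is to treat the historical rate as a prior that live traffic progressively outweighs. The sketch below does this for click-through rate with a Beta-binomial posterior mean. It is an illustrative assumption, not a description of how Topsort’s models blend data; the function name `blended_ctr` and the `prior_strength` parameter (how many pseudo-impressions the historical rate is worth) are hypothetical.

```python
def blended_ctr(hist_clicks, hist_impressions, live_clicks, live_impressions,
                prior_strength=1000.0):
    """Posterior mean CTR, using the historical CTR as a prior worth
    `prior_strength` pseudo-impressions."""
    if hist_impressions <= 0:
        raise ValueError("historical impressions must be positive")
    hist_rate = hist_clicks / hist_impressions
    prior_clicks = hist_rate * prior_strength
    return (prior_clicks + live_clicks) / (prior_strength + live_impressions)


# Before go-live the estimate equals the historical rate ...
day_zero = blended_ctr(50, 1000, live_clicks=0, live_impressions=0)
# ... and after enough live traffic it is dominated by the new data.
later = blended_ctr(50, 1000, live_clicks=10_000, live_impressions=100_000)
```

With no live traffic the estimate is the historical 5% CTR; after 100,000 live impressions at a 10% CTR it sits close to 10%, so the historical data fades out without a hard switch-over.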
Data Requirements
Required Performance Metrics
Metric Category | Required Fields | Example Format |
---|---|---|
Campaign Performance | campaign_id, date, impressions, clicks, conversions, spend | campaign-123, 2024-01-15, 1000, 50, 5, 25.00 |
Product Performance | product_id, campaign_id, date, impressions, clicks, ctr, conversion_rate | prod-456, campaign-123, 2024-01-15, 100, 10, 0.10, 0.02 |
User Events | user_id (anonymized), event_type, product_id, timestamp, value | user-789, purchase, prod-456, 2024-01-15T10:30:00Z, 49.99 |
Organic Traffic | product_id, date, organic_impressions, organic_clicks, search_terms | prod-456, 2024-01-15, 500, 25, "summer shoes" |
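Before exporting, it can help to check each file’s header against the required fields in the table above. The following is a minimal sketch: the field lists are copied from the table, while the category keys and the `missing_fields` helper are hypothetical names chosen for illustration.

```python
REQUIRED_FIELDS = {
    "campaign_performance": ["campaign_id", "date", "impressions", "clicks",
                             "conversions", "spend"],
    "product_performance": ["product_id", "campaign_id", "date", "impressions",
                            "clicks", "ctr", "conversion_rate"],
    "user_events": ["user_id", "event_type", "product_id", "timestamp", "value"],
    "organic_traffic": ["product_id", "date", "organic_impressions",
                        "organic_clicks", "search_terms"],
}


def missing_fields(category, header):
    """Return the required fields for `category` that are absent from a CSV header."""
    present = {name.strip().lower() for name in header}
    return [field for field in REQUIRED_FIELDS[category] if field not in present]


header = ["campaign_id", "date", "impressions", "clicks", "conversions", "spend"]
print(missing_fields("campaign_performance", header))
```

Extra columns are ignored by this check, so a file that also carries derived metrics such as `ctr` still passes.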
CSV Format Examples
Campaign Performance Data:
campaign_id,date,impressions,clicks,conversions,spend,ctr,conversion_rate
campaign-123,2024-01-15,1000,50,5,25.00,0.05,0.10
campaign-124,2024-01-15,800,40,3,20.00,0.05,0.075
Product Performance Data:
product_id,campaign_id,date,impressions,clicks,conversions,revenue
prod-456,campaign-123,2024-01-15,100,10,2,49.98
prod-457,campaign-123,2024-01-15,150,8,1,24.99
User Event Data:
user_id,event_type,product_id,timestamp,value,campaign_id
user-789,view,prod-456,2024-01-15T10:00:00Z,,
user-789,click,prod-456,2024-01-15T10:05:00Z,,campaign-123
user-789,purchase,prod-456,2024-01-15T10:30:00Z,49.99,campaign-123
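User event rows in this shape can be read with Python’s standard `csv` module. The sketch below is one way to do it, assuming that an empty `value` or `campaign_id` means "not applicable" (for example, an organic view) and that timestamps are ISO 8601 in UTC; `parse_events` is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime

SAMPLE = """user_id,event_type,product_id,timestamp,value,campaign_id
user-789,view,prod-456,2024-01-15T10:00:00Z,,
user-789,click,prod-456,2024-01-15T10:05:00Z,,campaign-123
user-789,purchase,prod-456,2024-01-15T10:30:00Z,49.99,campaign-123
"""


def parse_events(text):
    """Parse user event CSV text into a list of dicts with typed fields."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            "user_id": row["user_id"],
            "event_type": row["event_type"],
            "product_id": row["product_id"],
            # Older Python versions do not accept a trailing "Z", so spell out the offset.
            "timestamp": datetime.fromisoformat(row["timestamp"].replace("Z", "+00:00")),
            # Empty cells become None instead of 0 or "".
            "value": float(row["value"]) if row["value"] else None,
            "campaign_id": row["campaign_id"] or None,
        })
    return events


events = parse_events(SAMPLE)
```

Keeping empty cells as `None` preserves the difference between an event with no monetary value and an event worth 0.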
Model Training Process
Onboarding Training
Initial Data Processing:
- Historical event data is integrated into training pipelines
- Models are trained using 3-6 months of historical performance data
- Initial embeddings are created for users, products, and campaigns
- Baseline performance predictions are established
Ongoing Optimization
Continuous Learning:
- Daily Updates: ID lookup embeddings updated with new data
- Weekly Retraining: Full model retraining incorporating both historical and new data
- Real-time Adaptation: User behavior embeddings updated continuously
- Performance Monitoring: Historical vs. current performance comparison
Success Metrics
Historical data migration success is measured by:
- Reduced Cold Start Period: Learning time decreased from up to 4 weeks to 1-2 weeks
- Performance Continuity: Campaign performance within 10-15% of historical levels from day one
- Model Accuracy: Prediction accuracy improved by 20-30% compared to cold start scenarios
- Advertiser Satisfaction: Maintained or improved advertiser confidence during transition
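The performance continuity target above amounts to a relative-difference check against the historical baseline. A minimal sketch follows, assuming the comparison is made on a single metric such as CTR and using the upper end of the 10-15% range as the default tolerance; the function names are hypothetical.

```python
def relative_change(current, baseline):
    """Signed relative change of `current` versus `baseline` (0.10 means +10%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / abs(baseline)


def within_baseline(current, baseline, tolerance=0.15):
    """True when `current` is within `tolerance` (as a fraction) of `baseline`."""
    return abs(relative_change(current, baseline)) <= tolerance


# A post-migration CTR of 4.5% against a historical 5% is a 10% drop: within target.
print(within_baseline(0.045, 0.05))
# A CTR of 4% is a 20% drop: outside the target range.
print(within_baseline(0.04, 0.05))
```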
Integration with Campaign Migration
Complementary Processes
Historical data migration works alongside Campaign Migration:
- Campaign Structure: Basic campaign migration handles settings, budgets, and targeting
- Performance Data: Historical data migration provides the performance foundation
- Combined Benefit: Together, they ensure both functional campaigns and optimized performance from day one
Recommended Sequence
1. Complete Campaign Migration first to establish the campaign structure
2. Run historical data migration in parallel during the testing phase
3. Deploy the campaign structure and the trained models together
4. Monitor performance against historical baselines
Next Steps
For clients interested in historical data migration:
- Assess data availability from your current platform
- Bring your technical teams together to review the historical data requirements
- Plan data extraction alongside the campaign migration timeline
- Coordinate with the machine learning team on model training requirements