Overview
Kroo transforms complex construction API data into clean, queryable tables in your data warehouse. This page explains our data modeling methodology and sync strategies.Data Modeling Approach
The Four-Step Process
1
Fetch Raw Data
Retrieve complete JSON payloads from your construction management APIs
2
Flatten Main Resources
Convert primary objects (RFIs, change orders, etc.) into main database tables
3
Extract Nested Data
Transform nested arrays and objects into separate sub-tables
4
Create Relationships
Link sub-tables back to main tables using defined join keys for easy querying
Real Example: RFI Data Transformation
Here’s how we process a typical RFI response from Procore: Source API Response:- Main Table: kroo_procore.rfis
- Sub-table: kroo_procore.rfi_logs
| procore_id (string) | due_date (date) | status (string) |
|---|---|---|
| “abc” | 2024-11-22 | ”open” |
procore_idin the main table comes from the API’sidfieldrfi_idin the sub-table links back to the main RFI record- This structure enables easy JOINs for comprehensive reporting
Why This Approach? This methodology ensures your data warehouse follows relational database best practices while maintaining all the rich detail from your construction management systems.
Understanding Internal Metadata Columns
Common Internal Metadata Columns:_partner_id- Internal identifier used by Kroo’s sync system_data_type- Internal classification used for pipeline processing_fivetran_deleted- Soft delete flag (exception: this field MUST be used in WHERE clauses)
_fivetran_deleted = 0 to filter out soft-deleted records.
Sync Efficiency & Performance
Smart Sync Strategies
Kroo uses several optimization techniques to ensure reliable, efficient data integration:Rate Limiting
Client-side rate limits with back-off
- Respects source system availability
- Maximizes sync efficiency
- Prevents API overload
Incremental Updates
Only sync what’s changed
- Reduces data transfer volume
- Minimizes warehouse load
- Faster sync completion times
System Load Management
Our approach minimizes impact on your source systems:- Intelligent scheduling during low-usage periods
- Gradual back-off when systems are under load
- Monitoring and alerts for sync health
- Error recovery with automatic retry logic
Result: Your construction management systems continue running smoothly while Kroo keeps your data warehouse up-to-date with the latest project information.
Data Quality Assurance
Validation & Consistency
Every sync includes:- Schema validation against expected data structures
- Data type checking for consistency
- Relationship integrity between main and sub-tables
- Duplicate detection and handling
Monitoring & Alerts
Track data health with:- Pipeline sync status monitoring
- Data freshness alerts
- Error notifications and resolution
- Performance metrics and trends
Next Steps
Ready to monitor your data pipelines? Check out our Pipeline Status Monitoring guide to set up real-time dashboards.