What you’ll build
A workflow that synchronizes data between systems, for example keeping a Google Sheet in sync with BigQuery, or pushing CRM data to your data warehouse.

Time to build: 20 minutes
Nodes used: Google Sheets, Loop, BigQuery Writer, Conditional

The problem
Data lives in many places:
- Marketing data in Google Sheets
- Sales data in CRM
- Product data in databases
- Analytics in BigQuery
Workflow overview
Common sync patterns
Sheet to BigQuery
Sync marketing spreadsheets to your data warehouse.

CRM to warehouse
Keep sales data centralized.

Multi-source aggregation
Combine data from multiple sources.

Step-by-step guide: Sheet to BigQuery
Step 1: Set up your source
Create a Google Sheet with structured data:

| date | campaign | spend | impressions | clicks |
|---|---|---|---|---|
| 2024-01-15 | Brand | 1000 | 50000 | 2500 |
| 2024-01-15 | Search | 2000 | 30000 | 3000 |
Step 2: Create the workflow
- Click New Workflow
- Name it “Marketing Data Sync”
Step 3: Fetch source data
1. Add Google Sheets Reader
Configure:
- Spreadsheet ID: Your marketing sheet
- Range: Marketing!A:E
- Include Headers: Yes
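Outside the workflow builder, the same read can be sketched in Python. The `gspread` call and sheet layout are assumptions; the header-mapping helper mirrors what "Include Headers: Yes" does, turning the first row into column names:

```python
def rows_to_records(values):
    """Map a raw 2-D range (first row = headers) to a list of dicts."""
    if not values:
        return []
    headers, *rows = values
    return [dict(zip(headers, row)) for row in rows]

def fetch_marketing_rows(spreadsheet_id):
    """Hypothetical fetch via gspread; needs service-account credentials."""
    import gspread  # assumed installed: pip install gspread
    gc = gspread.service_account()
    ws = gc.open_by_key(spreadsheet_id).worksheet("Marketing")
    return rows_to_records(ws.get_values("A:E"))

if __name__ == "__main__":
    sample = [
        ["date", "campaign", "spend", "impressions", "clicks"],
        ["2024-01-15", "Brand", "1000", "50000", "2500"],
    ]
    print(rows_to_records(sample))
```

The helper is the portable part; your workflow tool performs the equivalent mapping when headers are enabled.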
Step 4: Transform data
1. Add Loop node
Process each row:
- Items: {{GoogleSheets_0.data}}
2. Add transformation logic
Use an LLM or utility nodes to:
- Convert date formats
- Calculate derived metrics
- Clean text fields
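As a concrete illustration of the transform step, here is a small sketch that covers all three bullets for the sample sheet: it coerces types, normalizes the date, trims text, and derives a click-through rate (the ISO date target and the `ctr` field name are assumptions):

```python
from datetime import datetime

def transform_row(row):
    """Normalize one raw sheet row: coerce numeric strings, reformat
    the date, clean text, and add a derived CTR metric."""
    spend = float(row["spend"])
    impressions = int(row["impressions"])
    clicks = int(row["clicks"])
    return {
        "date": datetime.strptime(row["date"], "%Y-%m-%d").date().isoformat(),
        "campaign": row["campaign"].strip(),
        "spend": spend,
        "impressions": impressions,
        "clicks": clicks,
        "ctr": round(clicks / impressions, 4) if impressions else 0.0,
    }

if __name__ == "__main__":
    raw = {"date": "2024-01-15", "campaign": " Brand ", "spend": "1000",
           "impressions": "50000", "clicks": "2500"}
    print(transform_row(raw))  # ctr = 0.05
```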
Step 5: Validate data
1. Add Conditional node
Check for valid data. Conditions:
- date is not empty
- spend is a number
- campaign is not empty
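The same three conditions, expressed as a standalone Python check (collecting all errors rather than stopping at the first, which is a design choice, not something the Conditional node requires):

```python
def validate_row(row):
    """Return a list of validation errors; an empty list means the row passes."""
    errors = []
    if not str(row.get("date", "")).strip():
        errors.append("date is empty")
    try:
        float(row.get("spend", ""))
    except (TypeError, ValueError):
        errors.append("spend is not a number")
    if not str(row.get("campaign", "")).strip():
        errors.append("campaign is empty")
    return errors

if __name__ == "__main__":
    good = {"date": "2024-01-15", "campaign": "Brand", "spend": "1000"}
    bad = {"date": "", "campaign": "Brand", "spend": "oops"}
    print(validate_row(good))  # []
    print(validate_row(bad))   # ['date is empty', 'spend is not a number']
```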
Step 6: Write to destination
1. Add BigQuery Writer node
Configure:
- Project: Your GCP project
- Dataset: marketing_data
- Table: campaign_metrics
- Write Mode: Append (or Replace for full refresh)
- Data: Transformed row data
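For reference, the equivalent append outside the workflow builder might look like this sketch using the `google-cloud-bigquery` streaming API. The project ID is a placeholder, and the target table schema (the five sheet columns) is an assumption:

```python
def to_bigquery_rows(transformed_rows):
    """Shape transformed rows into the JSON payload for the target table.
    Assumes the table schema matches the five sheet columns."""
    return [
        {"date": r["date"], "campaign": r["campaign"], "spend": r["spend"],
         "impressions": r["impressions"], "clicks": r["clicks"]}
        for r in transformed_rows
    ]

def append_to_bigquery(rows, table_id="my-gcp-project.marketing_data.campaign_metrics"):
    """Append rows via the streaming insert API; requires
    google-cloud-bigquery and application-default credentials."""
    from google.cloud import bigquery  # assumed installed
    client = bigquery.Client()
    errors = client.insert_rows_json(table_id, to_bigquery_rows(rows))
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```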
Step 7: Log sync results
1. Add Google Sheets Writer for logging
Configure:
- Spreadsheet ID: Your sync log
- Data: Sync start/end times, record counts, and success/failure status
Handling sync modes
Full refresh
Replace all data on each sync.

Incremental sync
Only sync new/changed records.

Upsert (Update or Insert)
Update existing records, insert new ones.

Implementing incremental sync
Track last sync time
Store the last successful sync timestamp:

1. Fetch last sync time
Read from a tracking sheet or database.

2. Filter source data
Only fetch new records.

3. Update sync time on success
After a successful sync, write the new timestamp back.
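Assuming each source row carries its date (or an updated-at column), the three steps above reduce to filtering on a stored watermark. A minimal sketch, with the write function injected so a failure leaves the watermark untouched:

```python
from datetime import datetime

def filter_new_rows(rows, last_sync_iso):
    """Keep only rows strictly newer than the stored watermark."""
    cutoff = datetime.fromisoformat(last_sync_iso)
    return [r for r in rows if datetime.fromisoformat(r["date"]) > cutoff]

def run_incremental_sync(rows, last_sync_iso, write_fn):
    """Fetch, filter, write, then advance the watermark only on success."""
    new_rows = filter_new_rows(rows, last_sync_iso)
    if new_rows:
        write_fn(new_rows)  # raises on failure, so the watermark stays put
        last_sync_iso = max(r["date"] for r in new_rows)
    return last_sync_iso

if __name__ == "__main__":
    rows = [{"date": "2024-01-14"}, {"date": "2024-01-15"}, {"date": "2024-01-16"}]
    written = []
    new_watermark = run_incremental_sync(rows, "2024-01-14", written.extend)
    print(new_watermark, len(written))  # 2024-01-16 2
```

ISO-8601 dates sort lexicographically, which is why `max` on the date strings is safe here.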
Error handling
Retry logic
For transient failures, retry automatically before giving up.

Dead letter queue
Store failed records for manual review.

Alert on failures
Send notifications when sync fails.

Scheduling sync jobs
| Sync Type | Frequency | Timing |
|---|---|---|
| Real-time | Every 5-15 min | Continuous |
| Near real-time | Every hour | On the hour |
| Daily | Once per day | Off-peak (2-4 AM) |
| Weekly | Once per week | Weekend |
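The retry and dead-letter-queue patterns from the error handling section can be combined in one small sketch; the attempt count and backoff delays are illustrative, not prescriptive:

```python
import time

def sync_with_retry(write_fn, rows, max_attempts=3, base_delay=1.0):
    """Retry each row with exponential backoff on transient failures;
    return rows that still failed so they can be parked in a
    dead-letter sheet for manual review."""
    dead_letter = []
    for row in rows:
        for attempt in range(max_attempts):
            try:
                write_fn(row)
                break
            except Exception:
                if attempt == max_attempts - 1:
                    dead_letter.append(row)  # give up; park for review
                else:
                    time.sleep(base_delay * 2 ** attempt)
    return dead_letter
```

Anything returned in `dead_letter` is a natural trigger for the failure notifications mentioned above.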
Best practices
Idempotency
Make syncs safe to re-run:
- Use upsert instead of insert
- Include timestamps for deduplication
- Track sync batches with IDs
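An upsert keyed on a stable natural key makes re-runs idempotent. A minimal in-memory sketch, assuming `(date, campaign)` uniquely identifies a row as in the sample sheet:

```python
def upsert(existing, incoming, key=("date", "campaign")):
    """Merge incoming rows into existing ones: update on key match,
    insert otherwise, so re-running the same batch changes nothing."""
    by_key = {tuple(r[k] for k in key): r for r in existing}
    for row in incoming:
        by_key[tuple(row[k] for k in key)] = row
    return list(by_key.values())

if __name__ == "__main__":
    table = [{"date": "2024-01-15", "campaign": "Brand", "spend": 1000}]
    batch = [{"date": "2024-01-15", "campaign": "Brand", "spend": 1200},
             {"date": "2024-01-15", "campaign": "Search", "spend": 2000}]
    once = upsert(table, batch)
    twice = upsert(once, batch)   # re-run: same result
    print(len(once), len(twice))  # 2 2
```

In BigQuery itself the same idea is usually expressed with a `MERGE` statement rather than application code.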
Data validation
Validate before writing:
- Check required fields
- Verify data types
- Validate ranges
- Check referential integrity
Logging
Log everything:
- Start/end times
- Record counts
- Success/failure status
- Error details
Monitoring
Track sync health:

| Metric | Alert Threshold |
|---|---|
| Sync duration | > 2x normal |
| Error rate | > 5% |
| Missing syncs | > 1 missed |
| Data lag | > 1 hour |
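The thresholds in the table translate directly into a health check; the metric field names below are assumptions about how you record each run:

```python
def check_sync_health(metrics, baseline_duration_s):
    """Compare one sync run's metrics to the alert thresholds above
    and return the list of alerts to fire (empty means healthy)."""
    alerts = []
    if metrics["duration_s"] > 2 * baseline_duration_s:
        alerts.append("sync duration > 2x normal")
    if metrics["error_rate"] > 0.05:
        alerts.append("error rate > 5%")
    if metrics["missed_syncs"] > 1:
        alerts.append("missing syncs > 1")
    if metrics["data_lag_s"] > 3600:
        alerts.append("data lag > 1 hour")
    return alerts
```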
Results you can expect
| Metric | Manual | Automated |
|---|---|---|
| Sync frequency | Daily (maybe) | Every 15 min |
| Error rate | 5-10% | < 1% |
| Time spent | 2-4 hours/week | 0 |
| Data freshness | Hours/days | Minutes |

