Objective
This article explains the architectural order of operations for Source Insert Functions and Segment replays. It clarifies why replays do not apply new function transformations to historical data, why certain properties might remain null after a replay, and what happens when events are permanently dropped at the ingestion edge due to function errors (such as an EventNotSupported error).
Product
Twilio Segment
Environment
Segment Console
User Account Permission/Role(s) Required
Workspace Owner
Procedure
To understand how Source Insert Functions impact your data replays and historical archives, please review the following architectural behaviors:
- Understand the Order of Operations for Live Events: When a live event hits Segment, it passes through the Source Insert Function first. After the function mutates the payload, Segment saves that final, transformed event into our historical S3 archives.
- Understand How Replays Bypass Functions: Replays completely bypass Source Insert Functions. When a replay is triggered, it pulls the JSON payloads exactly as they were saved in the S3 archives (subject to your workspace's data retention period), meaning as they looked after the function version that was live at ingestion time had run, and sends them directly to your destination.
- Identify Why Replayed Data Lacks New Transformations: Because historical events were saved to our archives before you deployed a corrected or updated function, a replay overwrites old destination rows with the same archived data, without any of the new function's changes. A property that the earlier function left null therefore remains null after the replay. Replays cannot dynamically apply new Source Insert Function logic to old data.
- Recognize the Impact of Dropped Events: If a misconfigured Source Insert Function throws an error (for example, an EventNotSupported error raised for every non-track event), those payloads are completely rejected at the Segment ingestion edge.
- Understand Permanent Data Loss: Because dropped events are never saved to historical S3 archives, they cannot be recovered. Replays rely entirely on these archives, meaning we cannot replay data that was never successfully stored. Those specific events are permanently lost.
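The behaviors above can be illustrated with a small toy model. This is only a sketch, not Segment's implementation: the `ToyPipeline` class, the `old_fn` and `fixed_fn` functions, the `plan` property, and the local `EventNotSupported` exception are all hypothetical stand-ins chosen for illustration.

```python
import copy


class EventNotSupported(Exception):
    """Hypothetical stand-in for the error a function raises to reject an event."""


class ToyPipeline:
    """Toy model: function runs first, then the result is archived and delivered."""

    def __init__(self, insert_fn):
        self.insert_fn = insert_fn
        self.archive = []      # stand-in for the historical archive
        self.destination = {}  # stand-in for destination rows, keyed by messageId

    def ingest(self, event):
        try:
            transformed = self.insert_fn(copy.deepcopy(event))
        except EventNotSupported:
            return False  # rejected at the edge: never archived, never delivered
        self.archive.append(transformed)
        self.destination[transformed["messageId"]] = copy.deepcopy(transformed)
        return True

    def replay(self):
        # A replay reads the archive as stored; insert_fn is NOT called here.
        for stored in self.archive:
            self.destination[stored["messageId"]] = copy.deepcopy(stored)


def old_fn(event):
    """Misconfigured function: rejects non-track events and leaves `plan` null."""
    if event["type"] != "track":
        raise EventNotSupported(event["type"])
    event.setdefault("properties", {})["plan"] = None
    return event


def fixed_fn(event):
    """Corrected function: accepts every event type and populates `plan`."""
    event.setdefault("properties", {})["plan"] = "pro"
    return event


pipeline = ToyPipeline(old_fn)
pipeline.ingest({"messageId": "m1", "type": "track", "properties": {}})
pipeline.ingest({"messageId": "m2", "type": "identify"})  # dropped by old_fn

pipeline.insert_fn = fixed_fn  # deploy the corrected function
pipeline.replay()              # replay the archived history

# m1 still has plan=None: the replay re-sent the archived payload unchanged.
# m2 is absent everywhere: it was never archived, so it cannot be replayed.
pipeline.ingest({"messageId": "m3", "type": "track", "properties": {}})
# m3 arrives live after the fix, so it carries the new transformation.
```

In this model, only events ingested after the corrected function is deployed pick up the new logic, which mirrors the order of operations described in the list above.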
Additional Information
Because Segment replays cannot dynamically transform historical data through updated Source Insert Functions, the best path forward to backfill missing properties is to handle the mutation directly within the data warehouse.
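As a rough sketch of what such a warehouse-side backfill could look like, the example below uses Python's built-in sqlite3 module as a stand-in for a real warehouse. The `tracks` and `users` tables, their columns, and the `plan` property are assumptions made for illustration; your actual schema and SQL dialect will differ.

```python
import sqlite3

# In-memory database standing in for the warehouse (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tracks (message_id TEXT PRIMARY KEY, user_id TEXT, plan TEXT);
CREATE TABLE users  (user_id TEXT PRIMARY KEY, plan TEXT);
INSERT INTO tracks VALUES ('m1', 'u1', NULL), ('m2', 'u2', 'free');
INSERT INTO users  VALUES ('u1', 'pro'), ('u2', 'free');
""")


def backfill_plan(connection):
    """Fill null `plan` values on historical rows from the users table."""
    cursor = connection.execute("""
        UPDATE tracks
        SET plan = (SELECT u.plan FROM users AS u WHERE u.user_id = tracks.user_id)
        WHERE plan IS NULL
    """)
    connection.commit()
    return cursor.rowcount  # number of rows that were backfilled


updated = backfill_plan(conn)
rows = conn.execute(
    "SELECT message_id, plan FROM tracks ORDER BY message_id"
).fetchall()
```

Restricting the update to rows where the property is null keeps the backfill from touching rows that already carry a value, so it can be re-run safely.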