Why Source Insert Functions Bypass Segment Data Replays

Objective

This article explains the architectural order of operations for Source Insert Functions and Segment replays. It clarifies why replays do not apply new function transformations to historical data, why certain properties might remain null after a replay, and what happens when events are permanently dropped at the ingestion edge due to function errors (such as an EventNotSupported error).

 

Product

Twilio Segment

 

Environment

Segment Console

 

User Account Permission/Role(s) Required 

Workspace Owner

 

Procedure 

To understand how Source Insert Functions impact your data replays and historical archives, please review the following architectural behaviors:

  1. Understand the Order of Operations for Live Events: When a live event hits Segment, it passes through the Source Insert Function first. After the function mutates the payload, Segment saves that final, transformed event into our historical S3 archives.
  2. Understand How Replays Bypass Functions: Replays completely bypass Source Insert Functions. When a replay is triggered, it pulls the raw JSON payloads exactly as they were saved in the S3 archives (subject to your workspace's data retention period) and sends them directly to your destination.
  3. Identify Why Replayed Data Lacks New Transformations: Historical events were saved to our archives before you deployed a corrected or updated function, so they reflect only the function logic that was live when they were ingested. A replay therefore replaces old destination rows with that same historical data, which is why properties added by the updated function can remain null after a replay. Replays cannot dynamically apply new Source Insert Function logic to old data.
  4. Recognize the Impact of Dropped Events: If a misconfigured Source Insert Function throws an error (for example, an EventNotSupported error for every non-track event), those payloads are rejected outright at the Segment ingestion edge.
  5. Understand Permanent Data Loss: Because dropped events are never saved to historical S3 archives, they cannot be recovered. Replays rely entirely on these archives, meaning we cannot replay data that was never successfully stored. Those specific events are permanently lost.
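
The behaviors above can be illustrated with a small toy model. This is a sketch only, not Segment's implementation or API: the ToyPipeline class, the EventNotSupported exception class, the two example functions, and the "plan" property are all hypothetical names chosen for illustration. It shows that the function runs before archiving, that a replay reads the archive and skips the function, and that a rejected event never reaches the archive.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]


class EventNotSupported(Exception):
    """Hypothetical stand-in for the error a function raises to reject an event."""


class ToyPipeline:
    """Toy model of the order of operations; not Segment's real architecture."""

    def __init__(self, insert_function: Callable[[Event], Event]) -> None:
        self.insert_function = insert_function
        self.archive: List[Event] = []      # stand-in for the historical archive
        self.destination: List[Event] = []  # stand-in for destination rows

    def ingest(self, event: Event) -> bool:
        """Live path: the function runs first, then its output is archived."""
        try:
            transformed = self.insert_function(dict(event))
        except EventNotSupported:
            return False  # rejected at the edge: never archived, never delivered
        self.archive.append(transformed)
        self.destination.append(transformed)
        return True

    def replay(self) -> List[Event]:
        """Replay path: archived payloads are resent as stored; the function is skipped."""
        self.destination = [dict(e) for e in self.archive]
        return self.destination


def old_function(event: Event) -> Event:
    # Misconfigured version: rejects non-track events and never sets "plan".
    if event["type"] != "track":
        raise EventNotSupported(str(event["type"]))
    return event


def fixed_function(event: Event) -> Event:
    # Corrected version: accepts every event type and adds the "plan" property.
    event.setdefault("properties", {})["plan"] = "pro"
    return event


pipeline = ToyPipeline(old_function)
track_ok = pipeline.ingest({"type": "track", "event": "Signup", "properties": {}})
identify_ok = pipeline.ingest({"type": "identify", "userId": "u1"})

pipeline.insert_function = fixed_function  # deploy the corrected function
replayed = pipeline.replay()               # replay does not call it
```

In this model, the replayed list contains only the archived track event, still without a "plan" property, and the rejected identify event is absent because it was never stored. Only events ingested after the fix pick up the new logic.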

 

Additional Information 

Because Segment replays cannot apply updated Source Insert Function logic to historical data, the best path forward to backfill missing properties is to perform the transformation directly within your data warehouse.
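
As a rough illustration of a warehouse-side backfill, the sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a real warehouse. The table names (signup, accounts), the column names, and the sample rows are assumptions made for this example; your schema and SQL dialect will differ. It fills in a null "plan" column from a lookup table and leaves already-populated rows untouched.

```python
import sqlite3

# In-memory database standing in for a data warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")

# Hypothetical event table where older rows are missing the "plan" property.
conn.execute("CREATE TABLE signup (id TEXT PRIMARY KEY, user_id TEXT, plan TEXT)")
# Hypothetical lookup table holding the value the function should have added.
conn.execute("CREATE TABLE accounts (user_id TEXT PRIMARY KEY, plan TEXT)")

conn.executemany(
    "INSERT INTO signup VALUES (?, ?, ?)",
    [("e1", "u1", None), ("e2", "u2", None), ("e3", "u3", "pro")],
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("u1", "free"), ("u2", "pro")],
)

# Backfill only rows where the property is null and a source value exists.
cursor = conn.execute(
    """
    UPDATE signup
    SET plan = (SELECT a.plan FROM accounts a WHERE a.user_id = signup.user_id)
    WHERE plan IS NULL
      AND EXISTS (SELECT 1 FROM accounts a WHERE a.user_id = signup.user_id)
    """
)
conn.commit()

updated = cursor.rowcount
rows = conn.execute("SELECT id, plan FROM signup ORDER BY id").fetchall()
```

Restricting the update to null rows keeps the backfill safe to rerun, since rows that already carry a value are not overwritten.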

 
