Question
I'd like to be able to see when a destination row was updated in my connected data warehouse (like Snowflake). Is there a way to append an "updated_at" column or a similar timestamp on a table to know, on a row-level basis, when it was last updated?
Product
Twilio Segment
Environment
Segment Console
Answer
Segment does not explicitly append a column named updated_at to your warehouse tables. Instead, the Segment API associates four timestamps with every call: timestamp, original_timestamp, sent_at, and received_at. All four timestamps pass through to your Warehouse for every ETL'd event.
Depending on the type of table, you can use these existing columns to determine when a row was last updated:
Upserted Tables (
usersandgroups): These tables hold the latest state of a user or group. Calls are upserted into these tables, meaning they are updated if an existing entry exists, or appended otherwise. Whenever a new call updates an existing record, the timestamp columns update with the new data. You can rely on thereceived_atortimestampcolumn to have a rough estimate of when a row was last modified.Append-only Tables (
tracks,pages,screens): These tables contain records of specific events. Because these rows are not updated after they are initially inserted, thereceived_atandtimestampcolumns will always roughly represent the time the event was received by Segment and timestamped.
When choosing which timestamp to query, consider the following:
received_at: Segment recommends using thereceived_attimestamp for all time-based queries. All tables usereceived_atfor the sort key, which means queries will execute much faster if usingreceived_at.timestamp: Use this column for queries based on event chronology.
Additional Information
You may also notice a uuid_ts column in your tables. This column is used to keep track of when the specific event was last processed by Segment's connector, specifically for deduping and debugging purposes. You can generally ignore this column.