Objective
This article helps users understand how Identity Resolution works within Segment Profiles Sync. It explains the core logic behind profile merges, the difference between a segment_id and a canonical_segment_id, and how these relationships are recorded in your data warehouse tables to maintain a unified customer identity graph.
Product
Twilio Segment
Environment
Segment Console
Procedure
1. Understand the different Segment Identifiers
To accurately query your identity graph, you must first understand the difference between the two primary identifiers Segment uses:
- segment_id: A unique identifier representing Segment's understanding of who performed an action at the exact moment that action occurred.
- canonical_segment_id: A unique identifier representing Segment's current, fully-merged understanding of that individual. The canonical ID is the oldest segment ID that has never been merged into another profile. Profiles that Segment merges away are no longer considered canonical.
2. Learn how recursive profile merges work
As new information arrives (for example, when an anonymous user logs in and their anonymous session is connected to a known email), Segment updates the mapping so that the segment_id of the merged-away profile points to the surviving canonical_segment_id.
This logic supports recursive mapping for complex, multi-step identity journeys. If Profile C is merged into Profile B, and Profile B is later merged into Profile A, any event originally associated with Profile C or Profile B can still be resolved to Profile A by following the chain of merge entries.
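The chain-following idea in the Profile C, B, A example can be sketched in a few lines of Python. This is an illustration only, not Segment's implementation: the `merged_into` dictionary, the `resolve_canonical` helper, and the profile IDs are all hypothetical stand-ins for merge entries you would read from your warehouse.

```python
def resolve_canonical(segment_id, merged_into):
    """Follow merge entries until reaching an ID that was never merged away."""
    seen = set()
    current = segment_id
    while current in merged_into:
        if current in seen:
            # Guard against malformed input; a real identity graph should not loop.
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        current = merged_into[current]
    return current


# Hypothetical merge history: C was merged into B, then B was merged into A.
merged_into = {"profile_c": "profile_b", "profile_b": "profile_a"}

resolved = {
    sid: resolve_canonical(sid, merged_into)
    for sid in ("profile_a", "profile_b", "profile_c")
}
print(resolved)  # every ID resolves to "profile_a"
```

An event recorded against profile_c or profile_b would then be attributed to profile_a, which matches the behavior described above.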
3. Track historical identity changes using Raw Tables
Profiles Sync provides append-only raw "update" tables that serve as the system's ledger, preserving every state change the identity graph has ever undergone.
- id_graph_updates: Records every instance where a segment_id is linked to a canonical_segment_id, capturing initial profile creations and any subsequent merges.
- external_id_mapping_updates: Tracks the association between Segment's internal IDs and external identifiers provided by the user (such as user_id, email, or phone_number).
- profile_traits_updates: Captures every update to a user's traits.
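Because the raw tables are append-only, the current state of the identity graph can be derived by replaying rows in order. The Python sketch below shows that idea on rows shaped like id_graph_updates. It assumes each row carries `segment_id`, `canonical_segment_id`, and `timestamp` fields and that the newest row for a segment_id reflects its current mapping. Verify these field names and semantics against your own warehouse schema before relying on them.

```python
def current_graph(id_graph_updates):
    """Replay append-only rows in timestamp order; the latest row per segment_id wins."""
    state = {}
    for row in sorted(id_graph_updates, key=lambda r: r["timestamp"]):
        state[row["segment_id"]] = row["canonical_segment_id"]
    return state


# Hypothetical ledger rows: two profiles are created, then seg_2 is merged into seg_1.
rows = [
    {"segment_id": "seg_1", "canonical_segment_id": "seg_1", "timestamp": "2024-01-01T00:00:00Z"},
    {"segment_id": "seg_2", "canonical_segment_id": "seg_2", "timestamp": "2024-01-02T00:00:00Z"},
    {"segment_id": "seg_2", "canonical_segment_id": "seg_1", "timestamp": "2024-01-03T00:00:00Z"},
]

graph = current_graph(rows)
print(graph)
```

The earlier rows are never overwritten, so the same ledger can also be replayed up to any past timestamp to see how the graph looked at that moment.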
4. View the current identity state using Materialized Views
For simplified analytical reporting, Profiles Sync also provides materialized views that offer an up-to-date representation of the customer. These views are updated automatically:
- profile_merges: An audit log of identity graph merges that shows when two or more segment_ids were merged and maps them to the canonical_segment_id.
- user_identifiers: A mapping table that associates a canonical_segment_id with an external identifier. Upon a merge, Segment deletes the identifiers associated with the "merge-from" profile and re-assigns them to the "merge-to" profile.
- user_traits: Contains the most recently updated trait values for each canonical profile. When a merge occurs, the "merge-from" profile is deleted from the view, and its traits are merged into the "merge-to" profile using "most recent wins" logic.
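The "most recent wins" rule for user_traits can be illustrated with a small Python sketch. This is a simplified model under stated assumptions, not Segment's implementation: each profile's traits are represented as a hypothetical dictionary of trait name to `(value, updated_at)` pairs, and `merge_traits` is an illustrative helper.

```python
def merge_traits(merge_to, merge_from):
    """Combine two trait maps; when both define a trait, the newer value wins."""
    merged = dict(merge_to)
    for name, (value, updated_at) in merge_from.items():
        if name not in merged or updated_at > merged[name][1]:
            merged[name] = (value, updated_at)
    return merged


# Hypothetical traits: trait name -> (value, updated_at as an integer timestamp).
merge_to = {"plan": ("free", 1), "city": ("Austin", 5)}
merge_from = {"plan": ("pro", 3), "city": ("Denver", 2), "tier": ("gold", 4)}

merged = merge_traits(merge_to, merge_from)
print(merged)
```

Here `plan` takes the newer value from the "merge-from" profile, `city` keeps the newer value already on the "merge-to" profile, and `tier` is carried over because only the "merge-from" profile had it.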
Additional Information
Segment lets you configure identifier limits (for example, a maximum of five email addresses per profile) to prevent "identity leakage", where multiple users are incorrectly merged into one profile. With Profiles Sync, teams can proactively find profiles that have reached these limits by querying the external ID mapping tables.
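As a rough illustration of the limit check, the Python sketch below counts distinct identifier values per profile and identifier type, then flags profiles at or above a configured limit. The row fields (`canonical_segment_id`, `external_id_type`, `external_id_value`), the `limits` dictionary, and the `profiles_at_limit` helper are assumptions made for this example. Map them to the actual columns and limits in your own workspace.

```python
from collections import defaultdict


def profiles_at_limit(mappings, limits):
    """Return (profile, id_type, count) for profiles at or above a configured limit."""
    values = defaultdict(set)
    for row in mappings:
        key = (row["canonical_segment_id"], row["external_id_type"])
        values[key].add(row["external_id_value"])  # a set ignores duplicate rows
    return sorted(
        (profile, id_type, len(vals))
        for (profile, id_type), vals in values.items()
        if id_type in limits and len(vals) >= limits[id_type]
    )


# Hypothetical limit of two emails per profile, and hypothetical mapping rows.
limits = {"email": 2}
mappings = [
    {"canonical_segment_id": "seg_1", "external_id_type": "email", "external_id_value": "a@example.com"},
    {"canonical_segment_id": "seg_1", "external_id_type": "email", "external_id_value": "b@example.com"},
    {"canonical_segment_id": "seg_1", "external_id_type": "email", "external_id_value": "b@example.com"},
    {"canonical_segment_id": "seg_2", "external_id_type": "email", "external_id_value": "c@example.com"},
    {"canonical_segment_id": "seg_2", "external_id_type": "user_id", "external_id_value": "u_42"},
]

flagged = profiles_at_limit(mappings, limits)
print(flagged)
```

In a warehouse, the same logic would typically be a grouped count of distinct identifier values per profile and type, filtered by the limit.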