Issue
When using Profiles Sync to backfill the identifies event table, users may want to estimate the number of records with distinct email addresses, or to sync only records where an email address is present.
Product
Segment Profiles Sync
Environment
Segment Console & Connected Warehouse
Cause
Profiles Sync transfers all records for the selected event type within the specified timeframe. It does not support conditional row-level filtering or trait-based restrictions (e.g., filtering out records missing an email address) during the sync process. In addition, the count of identify events with an email trait is not the same as the count of distinct email addresses, because the same address can appear in many identify events.
Resolution
To filter or count distinct email addresses, allow Profiles Sync to fully load the data into your warehouse. From there, you must perform your analytical filtering directly via SQL queries.
Tip: To find distinct email addresses, query the materialized user_identifiers table or the profile_traits table rather than the raw identifies event table, as Segment automatically resolves and deduplicates these records for you.
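As an illustration of the post-sync approach, the sketch below runs the kind of SQL you might use once the data has landed. It is not Segment's schema or API: an in-memory SQLite database stands in for the connected warehouse, and the column names (id, user_id, email, received_at) and sample rows are assumptions made for the example. Check the actual columns in your warehouse, and adapt the same pattern to the user_identifiers or profile_traits tables if you query those instead.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the connected warehouse.
conn = sqlite3.connect(":memory:")

# Assumption: a simplified, hypothetical shape for the identifies table.
conn.execute(
    "CREATE TABLE identifies (id TEXT, user_id TEXT, email TEXT, received_at TEXT)"
)
conn.executemany(
    "INSERT INTO identifies VALUES (?, ?, ?, ?)",
    [
        ("e1", "u1", "ana@example.com", "2024-01-01"),
        ("e2", "u1", "Ana@Example.com ", "2024-01-02"),  # same address, different case/spacing
        ("e3", "u2", "bo@example.com", "2024-01-03"),
        ("e4", "u3", None, "2024-01-04"),                # email trait missing
        ("e5", "u4", "", "2024-01-05"),                  # email trait empty
    ],
)

# Row-level condition applied after the sync: an email value is present.
EMAIL_PRESENT = "email IS NOT NULL AND TRIM(email) <> ''"

# Number of identify events that carry an email.
events_with_email = conn.execute(
    f"SELECT COUNT(*) FROM identifies WHERE {EMAIL_PRESENT}"
).fetchone()[0]

# Number of distinct addresses, normalized for case and surrounding spaces.
distinct_emails = conn.execute(
    f"SELECT COUNT(DISTINCT LOWER(TRIM(email))) FROM identifies WHERE {EMAIL_PRESENT}"
).fetchone()[0]

# Only the records where an email is present.
filtered_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM identifies WHERE {EMAIL_PRESENT} ORDER BY id"
    )
]

print(events_with_email, distinct_emails, filtered_ids)
# prints: 3 2 ['e1', 'e2', 'e3']
```

With this sample data, three events carry an email but only two distinct addresses exist, which is why an event count should not be read as an address count. The LOWER and TRIM normalization is a design choice for the example; drop it if you need exact string matching.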
Additional Information
While Profiles Sync features Selective Sync (which allows you to disable entire tables or specific property columns), it does not support partial or criteria-based row filtering. All event data within the sync timeframe will be included. For advanced filtering or analysis, use SQL queries in your warehouse after the sync.