Fill the gaps in your profiles using data from your warehouse

In this recipe, you’ll set up rETL to automatically pull in both backfill customer data and events from your warehouse to help complete your identity graph and provide a holistic view of your customer’s journey over an extended period of time.

Made by Donnie Wallar

What do you need?

  • Segment Reverse ETL

  • Any Data Warehouse (We used Snowflake)

Easily personalize customer experiences with first-party data

With a huge integration catalog and plenty of no-code features, Segment provides easy-to-maintain capability to your teams with minimal engineering effort. Great data doesn't have to be hard work!

On this page

Getting up and running with Segment is easy.  In only a few steps, Segment automatically begins collecting activities on your website and within your application.  But what about information that is locked in Snowflake tables that doesn’t get automatically pulled into your workspace? They contain full customer profiles and years of web and app activity that you don’t want to exclude.

You're in luck; Segment offers a solution for this!

Segment’s libraries allow you to stream data in from a variety of sources to handle many backfill requests. But sometimes that requires additional work on your side to expose that data and code correct payloads.  With reverse ETL, Segment allows you to connect directly to those views or tables and pull data in using a point and click mapping interface.  You no longer need to build JSON payloads with code for backfill because Segment provides you with the tools to streamline this process and immediately add relevant data to your workspace.

You’ll learn how to connect to the Snowflake views containing events and profile data and map that data to identify and track calls all within the intuitive interface.

Step 1: Set up Snowflake as a source for Reverse ETL

  • In your Segment workspace, navigate to Reverse ETL -> Sources and click Add Source. 

  • Select Snowflake and then name your source on the Basic Settings tab.

  • Have your Snowflake Account ID, Database Name, Warehouse, Username, and Password available and follow the instructions on the Connection Settings tab to connect your Snowflake source.

Step 2: Build a model to pull data from your Snowflake table

  • We will start with profile data and send identify calls directly into your workspace.  In your Segment workspace, under Reverse ETL -> Sources, click into your newly created Snowflake source and select Add Model -> SQL Editor.

Example 1: Identify Events (updating and adding customer profiles)

  • Create your SQL query to pull the data from your Snowflake table.  An example query for this model would look like this for a table with customer records: SELECT userId, email, firstName, lastName, plan, street, city, state, postalCode, country, phone, birthday, createdAt, updatedAt, gender FROM  ACME_MDWHO.ACME_PRODUCTION.CUSTOMERS

  • Enter userID as the unique identifier column below the query.

  • Preview the results to ensure there are no errors.

  • Enter a name for the model (i.e. Customer Records), and set how frequently you would like it to be performed. If you want this only to happen once, you can choose 1 hour and then disable the connection once the initial pull has successfully completed.

Example 2: Track Events (adding customer order completed events)

  • Create your SQL query to pull the data from your Snowflake table.  An example query for this model would look like this for a table of order completed events: SELECT orderId, userId, email, orderDate, total, shipping, tax, currency, discount, affiliation FROM  ACME_MDWHO.ACME_PRODUCTION.ORDERS

  • Enter orderID as the unique identifier column below the query.

  • Preview the results to ensure there are no errors.

  • Enter a name for the model (i.e. Order Completed), and set how frequently you would like it to be performed. If you want this only to happen once, you can choose 1 hour and then disable the connection once the initial pull has successfully completed.

Step 3: Setup Segment as a destination and map the table data as Identify or Track event.

  • In your Segment workspace, navigate to Reverse ETL -> Destinations and select Add Destination.

  • Select Segment and connect it to the Snowflake source.

Name the Conversion and add the Source Write Key which you can find by opening a new tab and navigating to an existing source within your workspace, selecting the Settings tab and copying the Write Key.  In this example we are using an HTTP API source.  You can create a new source specifically for rETL or use an existing source.  It is recommended that you test this with a new HTTP API source and while testing you don’t connect to Engage because you will be making real time updates and entries to your customer profiles.

 

dwa1
  • Select your Endpoint Region.

  • Enable the destination and Save Changes.

  • Navigate to the Mappings tab of the Segment destination and select Add Mapping. 

Example 1: Identify Events (updating and adding customer profiles)

  • Select your model at the top, and then select the Send Identify mapping.

  • For this example we will select Added records in section 1, next, choose a test record in section 2, toggle “Show test record preview” to the on position in section 3 and then map each of the columns to the relevant Segment fields.

  • If these are all identified users with a canonical userId, map properties.userId to User ID.  In most cases you can skip Anonymous ID.  The exception would be if you have profile information and a different matching identifier such as email address or phone number but that table is missing the canonical userId.  In that case you would want to generate a unique anonymousId and include the email address and/or phone number in the identify call to allow Segment to merge the profile based on Segment identify resolution.  Here is a way that you can create a unique anonymousId using Snowflake

  • Map properties.updatedAt to Timestamp.  If your table doesn’t include a timestamp, you can skip this step and the current timestamp will be used.

  • At the bottom of section 3, click + Show all fields and move down the bottom of section 3 and either Select Object and choose properties. If necessary, rename the fields to comply with your naming convention and the Segment specification. You now have the option to fix casing and update trait labels if your database columns don’t exactly match the traits you use in Segment.

  • Once mapping is complete in sections 1-3, click Test Mapping in section 4.  If there are any issues with your mapping, you will receive the information you need to correct the issues within section 4.  Once your test is successful, click Create Mapping.

  • Test the mapping and once you confirm it’s working, click Save.

  • Toggle Mapping State to Enabled to enable the mapping.

Example 2: Track Events (adding order completed events)

  • Select your model at the top, and then select the Send Track mapping.

  • For this example we will select Added records in section 1, next, choose a test record in section 2, toggle “Show test record preview” to the on position in section 3 and then map each of the columns to the relevant Segment fields.

  • Either use the event name from your query or if you are sending in the same event from a table such as Email Clicked, you can enter the event name following the naming conversions of your workspace and the Segment specification

  • If these are all identified users with a canonical userId, map properties.userId to User ID.  If you have tables that don’t contain the canonical userId you will need to use Anonymous ID.  The key to sending in backfill events is that you have at least one identifier that is part of Segment’s identity resolution so that the events you send in can be mapped to existing users.  A good example would be email clicked events or form completed events where you have the email address but don’t have a matching canonical user id.  You can use Segment identify resolution to automatically apply those events to an existing profile. Or in some cases, you can create new profiles when the email address hasn’t been seen before.  This also can be used if you have other identity elements such as device id, phone, or even ga_client_id.  Here is a way that you can create a unique anonymousId using Snowflake

  • Map properties.timestamp to Timestamp.  

  • At the bottom of section 3, click + Show all fields and move down the bottom of section 3 and either Select Object and choose properties. If you need to rename the fields to comply with your naming convention and the Segment specification, you have the option to fix casing and update property labels if your database columns don’t exactly match the properties you use in Segment.

  • Once mapping is complete in sections 1-3, click Test Mapping in section 4.  If there are any issues with your mapping, you will receive the information you need to correct the issues within section 4.  Once your test is successful, click Create Mapping.

  • Test the mapping and once you confirm it’s working, click Save.

  • Toggle Mapping State to Enabled to enable the mapping.

Wrapping up

In this growth recipe, we’ve reviewed how to use Reverse ETL to backfill into Segment. We connected Segment to our table in Snowflake, and mapped the data from our table to a Segment Identify and/or Track event.

Getting started is easy

Start connecting your data with Segment.