How to connect Batch to Snowflake?

Here's a step-by-step guide on how to sync your Snowflake data with Batch, for segmentation, personalization and orchestration purposes.

Written by Selma
Updated this week

Modern data architectures demand seamless integration between storage, processing, and actionable platforms. Connecting Batch to Snowflake enables a unified, efficient, and scalable approach to leveraging customer data for real-time engagement. Here’s why this integration is essential:

  1. Centralized Data as the Single Source of Truth
    Snowflake consolidates customer data across platforms, ensuring Batch can access a single, accurate source of truth. This reduces discrepancies and latency in activating customer engagement strategies.

  2. Event-Driven Architecture
    The integration leverages views and streams to cascade data changes efficiently. At the same time, it keeps data movement to a minimum: data is prepared for Batch by querying Snowflake's compute directly, without unnecessary replication or intermediaries, keeping the process secure and streamlined.

  3. Scalability and Future-Readiness
    Snowflake’s elastic design easily handles growing datasets, ensuring Batch can engage with millions of profiles in real time. This setup also supports future innovations, such as AI-driven personalization or advanced segmentation workflows.

By connecting Batch to Snowflake, you unlock a data architecture that prioritizes responsiveness, accuracy, and scalability. It’s the foundation for real-time, personalized customer engagement, built on modern principles of data-driven efficiency.

Syncing Snowflake Data with Batch

Batch’s Snowflake integration makes it easy to sync your customer data from Snowflake into Batch for activation. With our prepackaged containerized job and a few simple Snowflake configurations, you can go from raw data to a live sync in no time.

This guide walks you through the setup, step by step.

Requirements

  • Make sure the role you're using has been granted the required permissions mentioned in the steps below (see the sketch after this list for the kind of grants typically involved).

  • Snowpark Container Services need to be available on your Snowflake account.
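As an illustration only, here is a minimal sketch of the kind of grants such a role might need. The role, database, and schema names are hypothetical, and the exact privileges depend on your environment; refer to the Snowflake documentation linked in each step for the authoritative list.

-- Hypothetical example: grants a sync role might need (verify against Snowflake's documentation)
GRANT CREATE COMPUTE POOL ON ACCOUNT TO ROLE my_sync_role;
GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE my_sync_role;
GRANT CREATE IMAGE REPOSITORY, CREATE NETWORK RULE, CREATE TASK ON SCHEMA my_db.my_schema TO ROLE my_sync_role;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE my_sync_role;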

Configuration steps

1. Identify the data you want to sync

Start by deciding which Profile attributes you want to synchronize from Snowflake to Batch.

To make the data available for syncing, create a SQL view in Snowflake containing the data you want to send, and then set up a Snowflake stream on top of it.

The stream will capture changes to this dataset (new or updated rows), which will serve as the input to Batch’s sync Job.

CREATE OR REPLACE VIEW batch_sync_view AS
-- Your SQL query here
;

CREATE OR REPLACE STREAM batch_sync_stream ON VIEW batch_sync_view;

Important:

  • Your view must contain a User ID column to be used as a mapping key with Batch’s Profile base. This must be the same identifier you use as Batch’s Custom User ID.

  • Batch’s Sync Job (mentioned in Step 2 below) uses Batch’s Profile API to send profile data. Your view’s data must therefore comply with the API’s requirements (e.g. attribute max length). Non-compliant data will be ignored during synchronization.
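For illustration, here is a minimal sketch of what such a view and stream could look like. The table and column names (CUSTOMERS, USER_ID, FIRST_NAME, LAST_ORDER_DATE) are hypothetical; adapt them to your own schema and to the attributes you want to sync. Depending on your setup, Snowflake may also require change tracking to be enabled on the underlying tables for a stream on a view to capture changes.

-- Hypothetical example: expose the attributes to sync, keyed by the same identifier used as Batch's Custom User ID
CREATE OR REPLACE VIEW batch_sync_view AS
SELECT
  USER_ID,
  FIRST_NAME,
  LAST_ORDER_DATE
FROM my_db.my_schema.CUSTOMERS;

-- Capture new or updated rows from the view
CREATE OR REPLACE STREAM batch_sync_stream ON VIEW batch_sync_view;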

2. Deploy the Batch sync job with Snowpark Container Services

Batch provides a prepackaged job service designed to run on Snowpark Container Services. This job reads data from your configured Snowflake stream and sends it to your specified Batch project.

To deploy the sync job:

  • Build the container image and push it to a Snowflake image repository

    You will need to download the files available here. Then, using Docker, build a container image and push it to a Snowflake image repository (create it if one doesn’t already exist; see the SQL sketch after this list) as described in this documentation.

    See required privileges on the image repository to complete this step.

  • Create a compute pool, if one doesn’t already exist:

    This will be used to run the sync Job.
    See required privileges to complete this step.

    CREATE COMPUTE POOL my_compute_pool 
    MIN_NODES = 1
    MAX_NODES = 1
    INSTANCE_FAMILY = CPU_X64_XS;
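If you don’t have an image repository yet, here is a minimal sketch of how one could be created and how to retrieve its URL to use as the push target for your Docker image. The database, schema, and repository names are hypothetical.

-- Hypothetical example: create an image repository for the sync job's container image
CREATE IMAGE REPOSITORY IF NOT EXISTS my_db.my_schema.batch_image_repo;

-- The repository_url column of the output is the push target for the image
SHOW IMAGE REPOSITORIES IN SCHEMA my_db.my_schema;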

3. Create an External Access Integration

Since Batch's sync Job will be calling Batch APIs, you'll need to create a network rule authorizing calls to Batch APIs and apply it when executing the job. To do so:

  • Create the network rule:

CREATE OR REPLACE NETWORK RULE batch_api_access_rule 
TYPE = HOST_PORT
MODE = EGRESS
VALUE_LIST = ('api.batch.com:443');

See required privileges to execute this step.

  • Create an external access integration using the network rule:

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION batch_api_integration
ALLOWED_NETWORK_RULES = (batch_api_access_rule)
ENABLED = TRUE;

See required privileges to execute this step.

4. Automate the sync with a scheduled task

To keep the data flowing, you can automate the job using a Snowflake task.

→ Create a task executing your Job

You’ll need a task that triggers the sync job with the following parameters:

  • project-key (String) Your Batch project key (Available in your project General Settings).

  • source-stream (String) The Snowflake stream you created in Step 1.

  • id-column (String) Name of the column in your stream containing the User ID (used for mapping with Batch’s Custom User ID).

  • date-columns (String) List of column names containing date attributes, separated by a comma.

  • url-columns (String) List of column names containing URL attributes, separated by a comma.

  • api-credentials-table (String) A Snowflake table (keep it access-restricted) containing your Batch credentials, with a PROJECT_KEY and a REST_API_KEY column (see the sketch below).
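For illustration, here is a minimal sketch of what that credentials table could look like. The table name and placeholder values are hypothetical; keep access to it restricted, since it holds your REST API key.

-- Hypothetical example: table holding the Batch credentials read by the sync job
CREATE TABLE IF NOT EXISTS MY_DB.MY_SCHEMA.BATCH_API_CREDENTIALS (
  PROJECT_KEY VARCHAR,
  REST_API_KEY VARCHAR
);

INSERT INTO MY_DB.MY_SCHEMA.BATCH_API_CREDENTIALS (PROJECT_KEY, REST_API_KEY)
VALUES ('MY_BATCH_PROJECT_KEY', 'MY_BATCH_REST_API_KEY');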

Here is what your task should look like:

CREATE OR REPLACE TASK BATCH_SYNC_TASK
WAREHOUSE = MY_WAREHOUSE
SCHEDULE = '60 MINUTE'
AS
EXECUTE JOB SERVICE IN COMPUTE POOL MY_POOL
NAME = MY_JOB_NAME
EXTERNAL_ACCESS_INTEGRATIONS = (batch_api_integration)
FROM SPECIFICATION $$
spec:
  containers:
  - name: batch-sync-container
    image: "MY_IMAGE_URL"
    args:
    - "--project-key"
    - "MY_BATCH_PROJECT_KEY"
    - "--source-stream"
    - "MY_DB.MY_SCHEMA.MY_STREAM"
    - "--id-column"
    - "MY_ID_COLUMN"
    - "--date-columns"
    - "MY_DATE_COLUMNS"
    - "--url-columns"
    - "MY_URL_COLUMNS"
    - "--api-credentials-table"
    - "MY_DB.MY_SCHEMA.BATCH_API_CREDENTIALS"
    env:
      SNOWFLAKE_ROLE: "MY_ROLE"
$$;

Once that is done, all that's left is activating your task:

ALTER TASK BATCH_SYNC_TASK RESUME;

See required privileges to execute this step.

And that’s it, your sync is live!

5. Test and monitor the workflow

Monitor Snowflake tasks and logs for execution status and debug issues if needed.
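As a sketch, assuming the task name from the previous step, you can trigger a run manually and check recent executions with queries like these:

-- Trigger a run immediately instead of waiting for the next scheduled execution
EXECUTE TASK BATCH_SYNC_TASK;

-- Check recent executions of the task
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(TASK_NAME => 'BATCH_SYNC_TASK'))
ORDER BY SCHEDULED_TIME DESC;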

You can use Batch’s Profile View to verify that the customer data has reached its destination, on the basis of a customer ID: How to find a user's profile on the dashboard?

You’re now ready to use your data to send personalized and engaging messages to your audience!


This article belongs to Batch's FAQ. Need more help? Find insightful articles, documentation, case & market studies, guides, and even more in our website's Resources section on batch.com and our blog.
