Connecting BigQuery to Churney

Suela Isaj, Jonas Dahlbæk | September 18, 2023 · 5 min

Here you can read our guide for connecting your BigQuery data warehouse with Churney.

We follow the Google standard for sharing data in a secure way, described in https://cloud.google.com/bigquery/docs/share-access-views

This guide covers these two steps:

  1. Create dataset which will contain the hashed views

  2. Assign permissions for Churney to access the views

What kind of data is required?

The short answer is as much as possible. The long answer is that Churney requires data about:

  • Payments

  • Trials (if applicable)

  • User demographic (if available)

  • User activity

Additionally, we need to know the location (region) of your data warehouse.

Create the hashed views

To create a new dataset with the views, go on your project where the raw data is and click on the three dots to create a new dataset, for example, if your project is called churney_playground, click as below:

To create the views, we would need to create queries to hash the PII columns.

For context, Facebook has a guide on how to hash contact information for their conversion api: https://developers.facebook.com/docs/marketing-api/conversions-api/parameters/customer-information-parameters. Google has a similar guide for enhanced conversions using their ads api https://developers.google.com/google-ads/api/docs/conversions/enhance-conversions. Basically, we want to hash columns that contain data which would allow one to determine the identity of a user: First name, last name, birth day, street address, phone number, email address etc.

Creating hashed views through a script

For each potentially relevant dataset, we would ask you to run the following query:

SELECT
  table_schema, table_name, ddl
FROM
  <dataset_name>.INFORMATION_SCHEMA.TABLES;

Export the result, either as json or csv and share it with Churney

In the case of json fields, if there is private information in those fields like email, phone, last name, first name, etc, then we should skip the column from the view, and unpack the elements we need. The private information will also be included in the view for us, but we can modify the create view statement to unpack and hash the private field. In these cases, what we need from you is the name of the json column that has private information, and then the key that we should search for to unpack and hash. For the rest of the json fields that do not have any private user identifier, we can read them as they are.

Example of creating a hashed view

Create a dataset named churney .

If the raw data contains email, phone, birthday or other identifiers, the columns need to be excluded and hashed in the view.

Example, if the raw data lies under raw_data.sensitive and you would like to create a view churney.sensitive_data_view

CREATE OR REPLACE <your_project>.churney.sensitive_data_view as ( 
select 
TO_HEX(SHA256(LOWER(TRIM(email)))) as email,
TO_HEX(SHA256(LOWER(TRIM(name)))) as name,
TO_HEX(SHA256(LTRIM(regexp_replace(phone, '[()-+-]', ''), '0'))) as phone,
TO_HEX(SHA256(FORMAT_DATE('%Y%m%d',birthday))) as birthday,
* EXCEPT(email, name, phone, birthday)
from <your_project>.raw_data.sensitive
)

if you want to test it how the script looks with some data:

with sensitive as
(
  select 1 as id,
  "abc" as name,
  "[email protected]" as email,
  current_date() as birthday,
  '+4511111111' as phone
)
select 
TO_HEX(SHA256(LOWER(TRIM(email)))) as email,
TO_HEX(SHA256(LOWER(TRIM(name)))) as name,
TO_HEX(SHA256(LTRIM(regexp_replace(phone, '[()-+-]', ''), '0'))) as phone,
TO_HEX(SHA256(FORMAT_DATE('%Y%m%d',birthday))) as birthday,
* EXCEPT(email, name, phone, birthday)
from sensitive

Assign permissions for Churney

Churney will need permissions to read the hashed views. We base our guide on the best practices published on Google documentation https://cloud.google.com/bigquery/docs/share-access-views

Churney will give you a service account and below you can find which permissions to assign to it:

1. Assign BigQuery User role to Churney service account https://cloud.google.com/bigquery/docs/share-access-views#assign_a_project-level_role_to_your_data_analysts

For this, you need to go to the IAM page of the project where you created the views for Churney and add BigQuery User to the service account. Note that this does not give Churney permission to access the data under your project.

2. Give Churney service account permission to access the dataset with the views created previously https://cloud.google.com/bigquery/docs/share-access-views#assign_access_controls_to_the_dataset_containing_the_view 

This requires that you go to the dataset and click on the three dots on the side to select Share

Continue on Add Principal and add the Churney service account with the permission BigQuery Data Viewer

Authorize views

Authorize the view to access the source data. This means that Churney, only through the view, would be able to read the data. https://cloud.google.com/bigquery/docs/share-access-views#authorize_the_view_to_access_the_source_dataset 

For this step, go on the raw data dataset (in this example, that is called synthetic), and click on the dataset itself (not the 3 dots) and choose Sharing and then on Authorize Views


Then finally, type the name of the authorized view that you created above