Connecting BigQuery to Churney
Here you can read our guide for connecting your BigQuery data warehouse with Churney.
We follow the Google standard for sharing data in a secure way, described in https://cloud.google.com/bigquery/docs/share-access-views
This guide covers these two steps:
Create dataset which will contain the hashed views
Assign permissions for Churney to access the views
What kind of data is required?
The short answer is as much as possible. The long answer is that Churney requires data about:
Payments
Trials (if applicable)
User demographic (if available)
User activity
Additionally, we need to know the location (region) of your data warehouse.
Create the hashed views
To create a new dataset with the views, go on your project where the raw data is and click on the three dots to create a new dataset, for example, if your project is called churney_playground
, click as below:
To create the views, we would need to create queries to hash the PII columns.
For context, Facebook has a guide on how to hash contact information for their conversion api: https://developers.facebook.com/docs/marketing-api/conversions-api/parameters/customer-information-parameters. Google has a similar guide for enhanced conversions using their ads api https://developers.google.com/google-ads/api/docs/conversions/enhance-conversions. Basically, we want to hash columns that contain data which would allow one to determine the identity of a user: First name, last name, birth day, street address, phone number, email address etc.
Creating hashed views through a script
For each potentially relevant dataset, we would ask you to run the following query:
SELECT
table_schema, table_name, ddl
FROM
<dataset_name>.INFORMATION_SCHEMA.TABLES;
Export the result, either as json or csv and share it with Churney
In the case of json fields, if there is private information in those fields like email, phone, last name, first name, etc, then we should skip the column from the view, and unpack the elements we need. The private information will also be included in the view for us, but we can modify the create view statement to unpack and hash the private field. In these cases, what we need from you is the name of the json column that has private information, and then the key that we should search for to unpack and hash. For the rest of the json fields that do not have any private user identifier, we can read them as they are.
Example of creating a hashed view
Create a dataset named churney
.
If the raw data contains email, phone, birthday or other identifiers, the columns need to be excluded and hashed in the view.
Example, if the raw data lies under raw_data.sensitive
and you would like to create a view churney.sensitive_data_view
CREATE OR REPLACE <your_project>.churney.sensitive_data_view as (
select
TO_HEX(SHA256(LOWER(TRIM(email)))) as email,
TO_HEX(SHA256(LOWER(TRIM(name)))) as name,
TO_HEX(SHA256(LTRIM(regexp_replace(phone, '[()-+-]', ''), '0'))) as phone,
TO_HEX(SHA256(FORMAT_DATE('%Y%m%d',birthday))) as birthday,
* EXCEPT(email, name, phone, birthday)
from <your_project>.raw_data.sensitive
)
if you want to test it how the script looks with some data:
with sensitive as
(
select 1 as id,
"abc" as name,
"[email protected]" as email,
current_date() as birthday,
'+4511111111' as phone
)
select
TO_HEX(SHA256(LOWER(TRIM(email)))) as email,
TO_HEX(SHA256(LOWER(TRIM(name)))) as name,
TO_HEX(SHA256(LTRIM(regexp_replace(phone, '[()-+-]', ''), '0'))) as phone,
TO_HEX(SHA256(FORMAT_DATE('%Y%m%d',birthday))) as birthday,
* EXCEPT(email, name, phone, birthday)
from sensitive
Assign permissions for Churney
Churney will need permissions to read the hashed views. We base our guide on the best practices published on Google documentation https://cloud.google.com/bigquery/docs/share-access-views
Churney will give you a service account and below you can find which permissions to assign to it:
1. Assign BigQuery User role to Churney service account https://cloud.google.com/bigquery/docs/share-access-views#assign_a_project-level_role_to_your_data_analysts
For this, you need to go to the IAM page of the project where you created the views for Churney and add BigQuery User to the service account. Note that this does not give Churney permission to access the data under your project.
2. Give Churney service account permission to access the dataset with the views created previously https://cloud.google.com/bigquery/docs/share-access-views#assign_access_controls_to_the_dataset_containing_the_view
This requires that you go to the dataset and click on the three dots on the side to select Share.
Continue on Add Principal and add the Churney service account with the permission BigQuery Data Viewer
Authorize views
Authorize the view to access the source data. This means that Churney, only through the view, would be able to read the data. https://cloud.google.com/bigquery/docs/share-access-views#authorize_the_view_to_access_the_source_dataset
For this step, go on the raw data dataset (in this example, that is called synthetic), and click on the dataset itself (not the 3 dots) and choose Sharing and then on Authorize Views
Then finally, type the name of the authorized view that you created above