Google BigQuery Source

Sourcesible Product Documentation · Data Sources

Last updated: 2026-03-01

Overview

The Google BigQuery Source lets you connect a BigQuery dataset to Sourcesible and stream customer data — including events, traits, and identifiers — directly into your pipeline for unification and downstream activation. It is designed for teams who already manage customer data in BigQuery and want to enrich Sourcesible profiles without duplicating storage.

Before you begin, ensure you have a GCP project with BigQuery enabled, billing configured, and a dataset that contains the tables you intend to sync. You will also need sufficient IAM permissions in GCP to create and manage service accounts.

This Source pulls data from BigQuery into Sourcesible on a scheduled or event-driven basis. Sourcesible does not write back to your BigQuery dataset — this is a read-only integration.

BigQuery Credential / Permission Setup

You must create a GCP service account and grant it the appropriate BigQuery IAM roles before configuring the connection in Sourcesible. Two permission tiers are supported, depending on your organisation's data governance policy.

Minimum Access

This tier grants Sourcesible read access to your dataset and the ability to run query jobs. It follows the principle of least privilege.

Required IAM roles:

IAM Role | Scope | Purpose
roles/bigquery.dataViewer | Dataset-level | Read tables and views in the target dataset
roles/bigquery.jobUser | Project-level | Run query and export jobs on behalf of Sourcesible

Run the following commands in Google Cloud Shell or your local terminal to create the service account and assign roles:

bash
# 1. Create the service account
gcloud iam service-accounts create sourcesible-bq-source \
  --display-name="Sourcesible BigQuery Source" \
  --project=YOUR_PROJECT_ID

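# 2. Grant BigQuery Data Viewer at dataset level
#    (if your bq version rejects a dataset identifier here, grant the role
#    with a SQL GRANT ... ON SCHEMA statement instead)
bq add-iam-policy-binding \
  --member="serviceAccount:sourcesible-bq-source@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer" \
  YOUR_PROJECT_ID:YOUR_DATASET_ID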

# 3. Grant BigQuery Job User at project level
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:sourcesible-bq-source@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"
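
The dataset-level grant in step 2 can also be expressed as a BigQuery SQL GRANT statement, which may be convenient if you manage permissions through queries. The sketch below only builds the statement text. The helper name dataset_viewer_grant is hypothetical and is not part of Sourcesible or the Google Cloud SDK.

```bash
# Hypothetical helper: prints a BigQuery GRANT statement that gives a
# service account the Data Viewer role on a single dataset.
# Arguments: project ID, dataset ID, service account email.
dataset_viewer_grant() {
  printf 'GRANT `roles/bigquery.dataViewer` ON SCHEMA `%s.%s` TO "serviceAccount:%s";\n' \
    "$1" "$2" "$3"
}

# Example (needs an authenticated bq CLI and permission to change dataset access):
# bq query --use_legacy_sql=false "$(dataset_viewer_grant YOUR_PROJECT_ID YOUR_DATASET_ID \
#   sourcesible-bq-source@YOUR_PROJECT_ID.iam.gserviceaccount.com)"
```

Printing the statement first lets you review it before anything is executed against your project.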

Full Access (Optional)

Use this tier if you need Sourcesible to query across multiple datasets in the same project, or if your pipeline requires metadata access (e.g., schema introspection across the entire project).

Required IAM role:

IAM Role | Scope | Purpose
roles/bigquery.admin | Project-level | Full read/write access across all datasets in the project

Warning: roles/bigquery.admin grants write access to every dataset in your BigQuery project, even though Sourcesible only reads data. Use this role only if strictly required, and prefer the Minimum Access tier where possible.

Download the Service Account Key

  1. In the GCP Console, navigate to IAM & Admin → Service Accounts.
  2. Click the service account you created (sourcesible-bq-source).
  3. Select the Keys tab and click Add Key → Create new key.
  4. Choose JSON format and click Create.
  5. Save the downloaded .json key file securely — you will upload it to Sourcesible in the next section.

Treat the service account JSON key as a secret credential. Do not commit it to source control or share it via email. Store it in a secrets manager if possible.
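
As one way to act on this advice, the sketch below restricts a downloaded key file to its owner and confirms the result. The function name lock_key_file is hypothetical, and the check assumes a POSIX-style filesystem where chmod takes effect.

```bash
# Hypothetical helper: make a key file readable and writable by its owner
# only, then confirm the resulting mode. Returns non-zero if the file is
# missing or the mode could not be tightened.
lock_key_file() {
  key_path=$1
  [ -f "$key_path" ] || { echo "no such file: $key_path" >&2; return 1; }
  chmod 600 "$key_path" || return 1
  # ls -l prints the mode string first, for example -rw-------
  mode=$(ls -l "$key_path" | cut -c1-10)
  [ "$mode" = "-rw-------" ]
}

# Example:
# lock_key_file ~/Downloads/YOUR_KEY_FILE.json
#
# To keep the key in Google Secret Manager instead (the secret name below
# is an illustrative assumption):
# gcloud secrets create sourcesible-bq-key --data-file=YOUR_KEY_FILE.json
```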

Connection Configuration (Inside Sourcesible)

Follow these steps to set up the Google BigQuery Source inside the Sourcesible platform.

Add a New Source

  1. In the Sourcesible sidebar, click Sources.
  2. Click Add Source in the top-right corner.
  3. In the source catalogue, search for or select Google BigQuery.
  4. Click Connect.

Enter GCP Connection Details

  1. In the Project ID field, enter your GCP project ID exactly as it appears in the GCP Console.
  2. In the Dataset ID field, enter the BigQuery dataset you want to sync from.
  3. Under Service Account Key, click Upload JSON Key and select the .json file you downloaded in the previous section.

The Project ID and Dataset ID fields are case-sensitive. You can find your Project ID in the GCP Console header bar or by running: gcloud config get-value project
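
Because both fields are case-sensitive, a quick local format check can catch typos before you paste the values. The sketch below encodes the standard GCP project ID rules (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen) and a conservative ASCII-only dataset ID check. The function names are hypothetical, and passing these checks does not prove that the project or dataset exists.

```bash
# Hypothetical pre-flight checks for the Project ID and Dataset ID fields.

# Project ID: 6-30 chars, lowercase letters, digits, hyphens;
# must start with a letter and must not end with a hyphen.
valid_project_id() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

# Dataset ID: letters, digits, and underscores, 1-1024 chars.
# (Conservative: only ASCII letters are accepted here.)
valid_dataset_id() {
  [ "${#1}" -ge 1 ] && [ "${#1}" -le 1024 ] &&
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9_]+$'
}

# Example:
# valid_project_id "$(gcloud config get-value project)" || echo "unexpected project ID format"
```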

Configure Source Settings

  1. Enter a descriptive name in the Source Name field (e.g., BigQuery - Production Events).
  2. Optionally add a Description to help your team identify this connection.
  3. Click Save & Continue to proceed to the connection test.

Test Your Connection

After you save your configuration, Sourcesible automatically runs a connection test. To rerun it at any time, click Test Connection on the confirmation screen.

During the test, Sourcesible validates the following:

  • Service account credentials are valid and have not expired.
  • The specified Project ID exists and is accessible by the service account.
  • The specified Dataset ID exists within the project.
  • The service account holds at least roles/bigquery.dataViewer on the dataset.
  • The service account holds at least roles/bigquery.jobUser on the project.
  • At least one table or view is accessible within the selected dataset.
  • A test query (SELECT 1) executes successfully to confirm job creation rights.

The connection test runs a lightweight query and does not sync any data. If the test times out after 30 seconds, check whether VPC Service Controls or other network restrictions on your GCP project block requests that originate from Sourcesible's IP ranges.
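
Several of the checks listed above can be approximated from your own terminal before you run the test in Sourcesible. The sketch below is not Sourcesible's implementation. It is a hypothetical preflight function that uses the gcloud and bq CLIs, with a DRY_RUN switch that prints the commands instead of running them.

```bash
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical pre-check. Arguments: project ID, dataset ID, key file path.
# Stops at the first failing step.
preflight() {
  project=$1 dataset=$2 key_file=$3
  # Authenticate as the service account (this changes the active gcloud account).
  run gcloud auth activate-service-account --key-file="$key_file" &&
  # The dataset exists and its metadata is readable.
  run bq show "$project:$dataset" &&
  # List tables and views; the output should not be empty.
  run bq ls "$project:$dataset" &&
  # Job creation rights: the same trivial query the connection test uses.
  run bq --project_id="$project" query --use_legacy_sql=false 'SELECT 1'
}

# Example:
# DRY_RUN=1 preflight YOUR_PROJECT_ID YOUR_DATASET_ID YOUR_KEY_FILE.json
```

Afterwards, switch back to your own identity with gcloud config set account YOUR_USER_ACCOUNT so later commands do not run as the service account.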

Next Steps

Once your BigQuery Source connection is active and verified, the next step is to create a Dataset in Sourcesible. A Dataset lets you select the specific tables from your BigQuery source that you want to make available for building Audiences, Traits, and Pipelines.

  • Create a Dataset — Navigate to Datasets and click Add Dataset. Select your BigQuery Source, then choose the tables you want to include.
  • Define Traits — Map columns from your Dataset to Sourcesible user or account Traits for profile enrichment.
  • Build Audiences — Use your Dataset to create Audience segments based on event data and Traits.
  • Set Up Pipelines — Route data from your Dataset through Transformations before forwarding to Destinations.
  • Monitor Sync Health — View sync run history and error logs in Sources → BigQuery → Sync History.

Tips and Troubleshooting

Permission Denied on Connection Test

Symptom: The connection test fails with a 403 Access Denied or Permission Denied error.

Cause: The service account is missing one or both required IAM roles, or the roles were applied at the wrong scope (e.g., dataset-level vs. project-level).

Fix: Confirm that roles/bigquery.dataViewer is applied at dataset level and roles/bigquery.jobUser at project level. Run the following to verify:

bash
# Check project-level bindings
gcloud projects get-iam-policy YOUR_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:sourcesible-bq-source" \
  --format="table(bindings.role)"

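# Check dataset-level bindings
# (if your bq version rejects a dataset identifier here,
#  bq show --format=prettyjson YOUR_PROJECT_ID:YOUR_DATASET_ID
#  prints the dataset's access entries instead)
bq get-iam-policy YOUR_PROJECT_ID:YOUR_DATASET_ID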

Uploaded JSON Key Is Rejected

Symptom: Sourcesible displays an "Invalid credentials" error immediately after you upload the service account JSON key.

Cause: The JSON key file is corrupted, belongs to a different GCP project, or the service account has been disabled or deleted in GCP.

Fix: Verify that the service account exists and is enabled, then generate a new key:

bash
# Verify the service account
gcloud iam service-accounts describe \
  sourcesible-bq-source@YOUR_PROJECT_ID.iam.gserviceaccount.com \
  --project=YOUR_PROJECT_ID

# Generate a new key
gcloud iam service-accounts keys create new-key.json \
  --iam-account=sourcesible-bq-source@YOUR_PROJECT_ID.iam.gserviceaccount.com

Upload the newly generated new-key.json file in the Service Account Key field in Sourcesible.
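
Before uploading, you can rule out the "different GCP project" cause locally. The sketch below reads the type and project_id fields from the key file. The helper names key_field and check_key are hypothetical, and the parsing assumes the one-field-per-line layout of keys downloaded from GCP. For any other layout, a JSON parser such as jq is a safer choice.

```bash
# Hypothetical helper: print the value of a top-level string field from a
# service account key file. Arguments: key file path, field name.
key_field() {
  sed -n 's/^[[:space:]]*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Succeeds only if the file looks like a service account key that belongs
# to the expected project. Arguments: key file path, expected project ID.
check_key() {
  [ "$(key_field "$1" type)" = "service_account" ] &&
    [ "$(key_field "$1" project_id)" = "$2" ]
}

# Example:
# check_key new-key.json YOUR_PROJECT_ID || echo "key does not match the project"
```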
