Add Dataset from BigQuery

Overview

The BigQuery Dataset allows you to expose a specific table from your connected BigQuery project as a structured Dataset in Sourcesible. Once configured, the Dataset can be used across Audiences, Computed Fields, and Dataset Models. Before creating a Dataset, ensure you have a BigQuery Source already connected under Data Sources. Note that BigQuery table metadata can take up to 30 seconds to load depending on the size of your project.

Creating a Dataset from BigQuery

Step 1 — Open the Dataset Page

  1. In the left navigation, click Datasets.
  2. Click Add Dataset in the top-right corner.

Step 2 — Choose a Data Source

  1. On the Choose Data Source screen, select the radio button next to BigQuery.
  2. Click Next.

Step 3 — Choose Method

Select how you want to define your dataset:

Method | Description | Best For
BigQuery Table | Browse and select a table directly from your connected BigQuery project. Requires a user with read and write permissions. | Quick setup when you want the full table with no transformation
SQL | Write a custom query using the online SQL editor. | Custom datasets, filtered views, joins, or aggregations

Use BigQuery Table for raw ingestion. Use SQL when you need to pre-filter or transform data before it enters Sourcesible.

Click Next after selecting your method.

Step 4 — Choose a BigQuery Table

The Choose a BigQuery Table screen appears under Choose Data (Step 2 of 3). Sourcesible fetches the list of datasets and tables from your connected BigQuery project.

The table list may take up to 30 seconds to load depending on your BigQuery project size. A loading spinner is displayed while Sourcesible retrieves the metadata. Do not navigate away during this time.

  1. Wait for the table list to finish loading.
  2. Use the Search field to filter by table name if needed. The search filters across all datasets in your project simultaneously.
  3. Tables are grouped under their parent BigQuery dataset name (e.g., oak_datatest_education, oak_edu, oak_edu_upgrade). Select the radio button next to the table you want to use.
  4. Click Confirm.
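
The search behavior described in the steps above can be pictured with a small sketch. It is only an illustration, not Sourcesible's implementation: the function name `search_tables`, the case-insensitive substring matching, and every table name other than `students` are assumptions made for the example.

```python
from typing import Dict, List


def search_tables(catalog: Dict[str, List[str]], term: str) -> Dict[str, List[str]]:
    """Return tables whose name contains `term`, still grouped by parent dataset."""
    needle = term.strip().lower()  # assumed: matching ignores case and outer spaces
    matches: Dict[str, List[str]] = {}
    for dataset, tables in catalog.items():
        hits = [table for table in tables if needle in table.lower()]
        if hits:  # omit datasets that have no matching table
            matches[dataset] = hits
    return matches


# Hypothetical project contents; only the dataset names and `students` come from this guide.
catalog = {
    "oak_datatest_education": ["students", "courses"],
    "oak_edu": ["students", "teachers"],
    "oak_edu_upgrade": ["students", "enrollments"],
}

student_matches = search_tables(catalog, "stud")
teacher_matches = search_tables(catalog, "teach")
```

Because the filter runs across every dataset at once, a common name such as `students` comes back once per parent dataset, which is why the group header matters when you select a table.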

Step 5 — Set Up Dataset

The wizard advances to Set up Dataset (Step 3 of 3).

Dataset Identity

  1. In the Dataset Name field, enter a name to identify this dataset in Sourcesible (maximum 50 characters). You can update this at any time under Settings.
  2. Optionally, enter a Dataset Description (maximum 50 characters).

Source Reference (Read-only)

The following fields are auto-populated from your table selection and cannot be edited here:

  • Dataset Name from BigQuery Dataset — displays the BigQuery dataset name (e.g., oak_edu_upgrade)
  • Table Name from BigQuery Table — displays the selected table name (e.g., students)

Data Settings

The Data Settings section lists all fields detected from your selected table along with their BigQuery data types (e.g., STRING, DATE). For each field, configure the following:

Column | Description
Show In Filter | Makes this field available as a filter criterion in Audience and segmentation tools. Click the header checkbox to toggle all fields at once.
PII | Marks this field as Personally Identifiable Information. PII fields are displayed as masked in the data preview.
Exclude from Personalization | Prevents this field from being used in personalization or activation contexts.

  1. Configure the Show In Filter, PII, and Exclude from Personalization checkboxes for each field as required.
  2. Click Preview to validate that Sourcesible can read records from the table. A preview of up to 10 rows will appear below Data Settings.
  3. Click Save.

The note at the bottom of the preview reads: "For data privacy, your selected PII fields will be displayed as masked."
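
One way to reason about the three Data Settings checkboxes is as independent flags stored per field. The sketch below is a hypothetical model, not Sourcesible's internal schema: the `FieldSettings` class and the field names `student_id` and `enrolled_on` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class FieldSettings:
    """Hypothetical per-field record mirroring the three Data Settings checkboxes."""
    name: str
    bq_type: str  # BigQuery data type shown next to the field, e.g. STRING or DATE
    show_in_filter: bool = False
    pii: bool = False
    exclude_from_personalization: bool = False


# Example configuration; field names are illustrative only.
fields = [
    FieldSettings("student_id", "STRING", show_in_filter=True),
    FieldSettings("email", "STRING", pii=True, exclude_from_personalization=True),
    FieldSettings("enrolled_on", "DATE", show_in_filter=True),
]

# Each flag is evaluated on its own; setting one does not imply another.
filterable = [f.name for f in fields if f.show_in_filter]
masked_in_preview = [f.name for f in fields if f.pii]
personalizable = [f.name for f in fields if not f.exclude_from_personalization]
```

In this model a field can be marked PII and still appear as a filter, so it is worth reviewing each column of checkboxes separately before saving.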

Test Your Connection (Preview)

When you click Preview, Sourcesible queries the first 10 records from your selected BigQuery table and renders them inline. The following are validated during preview:

  • The BigQuery source credentials are still valid
  • The selected dataset and table exist and are accessible by the Sourcesible service account
  • The field schema matches the columns listed in Data Settings
  • PII-flagged fields are masked in the preview output

If the preview returns no records or an error, do not click Save. Verify that the BigQuery service account has the appropriate IAM roles on the selected table before retrying.
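
The preview behavior described above (at most 10 rows, with PII-flagged fields masked) can be sketched as follows. This is a rough illustration under stated assumptions: the function name `build_preview` and the mask string `****` are invented for the example, and the actual mask format Sourcesible uses is not documented here.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List, Set

PREVIEW_LIMIT = 10  # this guide states the preview shows up to 10 rows
MASK = "****"       # assumed placeholder; the real mask format may differ


def build_preview(
    rows: Iterable[Dict[str, Any]],
    pii_fields: Set[str],
    limit: int = PREVIEW_LIMIT,
) -> List[Dict[str, Any]]:
    """Take the first `limit` rows and replace values of PII-flagged fields with MASK."""
    preview = []
    for row in islice(rows, limit):
        masked = {key: (MASK if key in pii_fields else value) for key, value in row.items()}
        preview.append(masked)
    return preview


# Twelve hypothetical records; only the first ten should reach the preview.
sample_rows = [{"name": f"student_{i}", "email": f"s{i}@example.com"} for i in range(12)]
preview = build_preview(sample_rows, pii_fields={"email"})
```

An empty `preview` list in this sketch corresponds to the "no records" case above, which is the signal to check permissions and row count before saving.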

Next Steps

Once your BigQuery Dataset is saved, you can use it across Audiences, Computed Fields, and Dataset Models.

Tips and Troubleshooting

Table List Takes a Long Time to Load or Does Not Appear

Symptom: After reaching the Choose a BigQuery Table step, the table list displays a loading spinner for more than 60 seconds or never appears.

Cause: BigQuery metadata enumeration can be slow for large projects with many datasets and tables. Additionally, if the Sourcesible service account lacks the bigquery.datasets.get permission, the list will not load at all.

Fix: First, confirm the service account has the required roles:

roles/bigquery.dataViewer

roles/bigquery.jobUser

If permissions are correct but loading is still slow, use the Search field to filter by table name immediately after the list begins loading — this reduces the number of results Sourcesible needs to render.
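
If you keep a list of the roles granted to the service account, a quick comparison against the two roles named above shows what is missing. The helper below is a minimal sketch with a hypothetical name, `missing_roles`. It compares role names only, so it will not recognize a broader role that happens to include the same permissions.

```python
from typing import Iterable, List

# The two roles this guide lists as required for the service account.
REQUIRED_ROLES = {"roles/bigquery.dataViewer", "roles/bigquery.jobUser"}


def missing_roles(granted: Iterable[str]) -> List[str]:
    """Return the required roles that are absent from `granted`, sorted by name."""
    return sorted(REQUIRED_ROLES - set(granted))


# Example: the account can run jobs but cannot read data or metadata.
gaps = missing_roles(["roles/bigquery.jobUser"])
```

An empty result means both roles are present by name, in which case slow loading is more likely explained by project size than by permissions.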

Preview Returns an Error or Empty Result

Symptom: Clicking Preview shows an error or a blank table despite the table appearing selectable.

Cause: The Sourcesible service account may lack table-level read permissions, or the table contains zero rows.

Fix: Confirm row count and access in BigQuery:

SELECT COUNT(*) FROM `oak_edu_upgrade.students`;

If rows exist but preview fails, ensure the service account has bigquery.tables.getData permission on the specific table, or grant the broader roles/bigquery.dataViewer role at the dataset level:

bq add-iam-policy-binding \
  --member=serviceAccount:[email protected] \
  --role=roles/bigquery.dataViewer \
  your-project:oak_edu_upgrade

Duplicate Table Names Across Multiple BigQuery Datasets

Symptom: The Search returns multiple entries with the same table name (e.g., students appearing under oak_datatest_education, oak_edu, and oak_edu_upgrade) and it is unclear which to select.

Cause: BigQuery allows tables with identical names in different datasets. Sourcesible lists all matching tables across datasets when you search.

Fix: Check the dataset group header above each table entry to identify the correct parent dataset. If you are unsure, confirm the intended dataset and table in the BigQuery console before returning to Sourcesible to make your selection.
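
To see ahead of time which table names are ambiguous, you can group a project's tables by name and keep the names that occur in more than one dataset. The sketch below assumes you already have the dataset-to-tables listing as a plain dictionary. The function name `find_duplicate_table_names` and the table names other than `students` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_table_names(catalog: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each table name found in more than one dataset to the datasets containing it."""
    datasets_by_table: Dict[str, List[str]] = defaultdict(list)
    for dataset, tables in catalog.items():
        for table in tables:
            datasets_by_table[table].append(dataset)
    # Keep only names that appear under two or more parent datasets.
    return {table: owners for table, owners in datasets_by_table.items() if len(owners) > 1}


# Hypothetical listing; dataset names and `students` follow the example in this guide.
project_tables = {
    "oak_datatest_education": ["students", "courses"],
    "oak_edu": ["students", "teachers"],
    "oak_edu_upgrade": ["students"],
}

duplicates = find_duplicate_table_names(project_tables)
```

Any name returned here needs its parent dataset checked in the group header before you select it.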

PII Fields Showing Unmasked in Preview

Symptom: Fields containing personal information appear as plain text in the preview.

Cause: The PII checkbox was not checked for those fields before clicking Preview.

Fix: Scroll up to Data Settings, check the PII column for the relevant fields (e.g., email, phone), then click Preview again to refresh the masked output.