Connect BigQuery

BigQuery is a first-class source. Authentication uses a service account JSON key. IAM follows the least-privilege principle: roles/bigquery.dataViewer on the dataset plus roles/bigquery.jobUser on the project.

1. Create the service account

export PROJECT=my-gcp-project
export DATASET=my_bq_dataset

gcloud iam service-accounts create oa-reader \
  --display-name "OneAnalytics reader" --project="$PROJECT"

SA="oa-reader@${PROJECT}.iam.gserviceaccount.com"

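# Dataset-scoped read (the legacy READER dataset role corresponds to roles/bigquery.dataViewer)
bq --project_id="$PROJECT" show --format=prettyjson "$DATASET" > /tmp/acl.json
# … edit /tmp/acl.json: append {"role": "READER", "userByEmail": "<value of $SA>"} to the "access" array.
# Write the literal email address; $SA is not expanded inside the JSON file …
bq update --source=/tmp/acl.json "$PROJECT:$DATASET"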

# Project-level job runner
gcloud projects add-iam-policy-binding "$PROJECT" \
  --member="serviceAccount:$SA" --role="roles/bigquery.jobUser"

# JSON key (treat it as a secret; delete the local copy once it is uploaded)
gcloud iam service-accounts keys create /tmp/oa-reader.json \
  --iam-account="$SA"

Upload /tmp/oa-reader.json in the OneAnalytics connection dialog. The key is stored encrypted at rest.
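If hand-editing /tmp/acl.json is inconvenient, the dataset-scoped grant can probably be expressed with BigQuery's SQL GRANT statement instead. The following is a minimal sketch, not part of the OneAnalytics setup itself: the helper name grant_dataset_viewer is hypothetical, and it assumes the account running bq is allowed to change access on the dataset.

```shell
# Hypothetical helper: grant dataset-level read access with a SQL GRANT
# statement instead of editing the dataset ACL JSON by hand.
#   $1 = project ID, $2 = dataset name, $3 = service account email
grant_dataset_viewer() {
  local project="$1" dataset="$2" sa="$3"
  bq --project_id="$project" query --use_legacy_sql=false \
    "GRANT \`roles/bigquery.dataViewer\` ON SCHEMA \`${project}.${dataset}\` TO \"serviceAccount:${sa}\""
}
```

With the variables from the commands above, the call would be grant_dataset_viewer "$PROJECT" "$DATASET" "$SA". It replaces only the bq show / bq update pair; the project-level jobUser binding and the key creation are still needed.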

2. Add the connection

Sources → Add → BigQuery:

Project ID: my-gcp-project
Dataset (optional): my_bq_dataset (scopes the browser to this dataset)
Service account key: upload oa-reader.json
Location: US, EU, asia-south1, … (must match your dataset's location)

Click Test. We list the tables the service account has read access to.

3. Pick a mode

Direct: the typical choice. With on-demand pricing, BigQuery charges by bytes scanned, so our compiler aggressively pushes filters and projections down into the generated SQL. The sql_preview shows exactly what will be scanned.
Import: for very small tables or to cap BigQuery cost. Data is streamed via the Storage Read API.

Cost controls

Per-query maximum bytes billed: set in Dataset Settings. Queries over the limit fail fast.
Partitioned-table filter: we refuse to issue a scan that omits the partition column on a partitioned table unless you tick the override. This prevents accidental full-table scans, which are billed for every partition.
Slot reservation: if you use flex slots, configure the reservation ID in Dataset Settings; we send it with every job.
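To get a feel for how a bytes-billed cap behaves before setting one in Dataset Settings, the same limits can be tried from the bq CLI. This is a sketch under assumptions: the helper names estimate_scan and run_capped are hypothetical, and it illustrates BigQuery's own flags rather than how OneAnalytics applies the setting internally.

```shell
# Hypothetical helpers for experimenting with BigQuery cost limits from the CLI.

# Ask BigQuery for a scan estimate without running (or billing) the query.
#   $1 = project ID, $2 = SQL text
estimate_scan() {
  local project="$1" sql="$2"
  bq --project_id="$project" query --use_legacy_sql=false --dry_run "$sql"
}

# Run a query with a hard cap: BigQuery fails the job instead of running it
# if it would bill more than the given number of bytes.
#   $1 = project ID, $2 = cap in bytes, $3 = SQL text
run_capped() {
  local project="$1" max_bytes="$2" sql="$3"
  bq --project_id="$project" query --use_legacy_sql=false \
    --maximum_bytes_billed="$max_bytes" "$sql"
}
```

Running estimate_scan on the SQL shown in sql_preview, then run_capped with a cap slightly below that estimate, should reproduce the fail-fast behaviour described above.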

Labels

Every job we submit carries the labels app=oneanalytics, workspace_id=<uuid>, dataset_id=<uuid>. Filter or group on them through the labels column of INFORMATION_SCHEMA.JOBS_BY_PROJECT for cost attribution.
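A cost-attribution query over those labels might look like the sketch below. The helper name oa_bytes_by_dataset, the seven-day window, and the choice to group by dataset_id are illustrative assumptions; the region argument is expected in lowercase (for example us or eu) and should match the location of your jobs.

```shell
# Hypothetical helper: bytes billed over the last 7 days by jobs labelled
# app=oneanalytics, grouped by the dataset_id label.
#   $1 = project ID, $2 = lowercase region for the INFORMATION_SCHEMA qualifier
oa_bytes_by_dataset() {
  local project="$1" region="$2"
  bq --project_id="$project" query --use_legacy_sql=false "
    SELECT
      (SELECT value FROM UNNEST(labels) WHERE key = 'dataset_id') AS dataset_id,
      COUNT(*) AS jobs,
      SUM(total_bytes_billed) AS bytes_billed
    FROM \`region-${region}\`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
    WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
      AND EXISTS (
        SELECT 1 FROM UNNEST(labels) WHERE key = 'app' AND value = 'oneanalytics')
    GROUP BY dataset_id
    ORDER BY bytes_billed DESC"
}
```

The account that runs this needs permission to list jobs in the project, which the read-only service account created above may not have.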

Nested / repeated fields

STRUCT and ARRAY types are fully supported. In the semantic model, reference nested fields with dotted paths (order.shipping.city), and UNNEST for arrays happens transparently when you add them as dimensions.
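To make the dotted-path and UNNEST behaviour concrete, the sketch below runs the kind of SQL that reading a nested field alongside an array element involves. It is an illustration only, not the compiler's actual output: the helper name preview_nested, the shipping STRUCT, and the items ARRAY with a sku field are all hypothetical.

```shell
# Hypothetical helper: read a STRUCT field by dotted path and flatten an
# ARRAY column with UNNEST, one output row per array element.
#   $1 = project ID, $2 = fully qualified table (project.dataset.table)
# Assumes the table has a STRUCT column `shipping` with a `city` field and
# an ARRAY<STRUCT> column `items` whose elements have a `sku` field.
preview_nested() {
  local project="$1" table="$2"
  bq --project_id="$project" query --use_legacy_sql=false "
    SELECT o.shipping.city AS city, item.sku AS sku
    FROM \`${table}\` AS o
    CROSS JOIN UNNEST(o.items) AS item
    LIMIT 10"
}
```

Because UNNEST yields one row per array element, row counts grow when an array is added as a dimension; the LIMIT here only keeps the preview small.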