Go to Studio

Google BigQuery Reader

The Google BigQuery Reader node executes SQL queries against Google BigQuery and returns the results as structured data for downstream processing.

Google BigQuery Reader node executing SQL workflows against datasets projects

What does the Google BigQuery Reader node do?

The Google BigQuery Reader node connects to your Google Cloud project and runs a SQL query against BigQuery, returning the rows as a JSON string. It lets you pull large analytics datasets, warehouse tables, or query results into your workflow without manual exports or intermediate files.

Common use cases:

  • Pulling daily campaign performance from a marketing data warehouse to summarize with an LLM
  • Extracting product catalog rows for downstream enrichment, classification, or content generation
  • Running an aggregation query (e.g. revenue by channel) and feeding the results into a report node
  • Joining BigQuery tables on the fly to build a dataset that another node consumes row by row

Quick setup

Connect a BigQuery integration

Open the node settings and pick a BigQuery Integration from the dropdown. If none is listed, go to Settings > Integrations and connect a Google account that has BigQuery access.

Select the Google Cloud project

Once the integration is connected, choose the Google Cloud Project that owns your dataset. The dropdown is populated automatically from the projects visible to the integration.

Write the SQL query

In the SQL Query field, enter a valid BigQuery Standard SQL SELECT statement. You can reference fully qualified tables with backticks, for example `bigquery-public-data.usa_names.usa_1910_2013`.

Connect the output

Connect the output port to the next node. The query results are emitted as a JSON string, ready to feed into a Loop, JSON Path Extractor, LLM, or any other downstream node.

Configuration parameters

BigQuery Reader settings SQL editor integrations dataset location parameterized

Required fields

integration_id integration required

BigQuery Integration — Select the Google account connection to use. The integration must have BigQuery read access on the target project.

google_cloud_project string required

Google Cloud Project — The Google Cloud project that hosts your BigQuery dataset. Pick it from the dropdown populated by the selected integration. Switching the integration resets this field.

sql_query string required

SQL Query — The BigQuery Standard SQL query to execute. Use backticks around fully qualified table names and standard BigQuery functions. Template variables like {{start_date}} are resolved at runtime from upstream node outputs.

SELECT name, state, number
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = 'TX'
LIMIT 100
Tip

You can inject template variables anywhere in the query (e.g. WHERE date >= '{{start_date}}'). They are resolved from upstream inputs before the query is sent to BigQuery.

What does the node output?

The node outputs the query results as a single JSON string containing the array of rows. Each row is an object whose keys are the column names returned by the query.

[
  { "name": "Maria", "state": "TX", "number": 12345 },
  { "name": "James", "state": "TX", "number": 11200 }
]
columnsoutput string

A JSON string containing the array of rows returned by the query. Column order and names match the SELECT clause.

Usage examples

Example 1: Daily revenue report

You want to summarize last week’s revenue per campaign and feed the result into an LLM for a written report.

Configuration:

  • BigQuery Integration: Marketing GCP
  • Google Cloud Project: acme-marketing-prod
  • SQL Query:
    SELECT date, campaign, SUM(revenue) AS total
    FROM `acme-marketing-prod.ads.campaign_revenue`
    WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
    GROUP BY date, campaign
    ORDER BY date

Connect the output to an LLM node with a prompt like Summarize the weekly revenue trends from the JSON below.

Example 2: Loop over rows for per-row processing

You need to enrich each product in a catalog with a generated description.

Configuration:

  • SQL Query:
    SELECT sku, title, category
    FROM `acme-prod.catalog.products`
    WHERE active = TRUE

Pipe the output into a Loop node, then into an LLM node that drafts a description per row.

Example 3: Dynamic date range with template variables

Reuse the same node across workflows by templating the date filter.

Configuration:

  • SQL Query:
    SELECT user_id, event, ts
    FROM `acme-prod.analytics.events`
    WHERE DATE(ts) BETWEEN '{{start_date}}' AND '{{end_date}}'

The start_date and end_date values come from upstream Date or Text Input nodes.

Common issues

The node fails with BigQuery integration is not configured

Cause: No BigQuery integration is selected on the node.

Solution: Open the node settings and pick a BigQuery integration in the dropdown. If none is listed, add one in Settings > Integrations.

Google Cloud project ID is missing

Cause: The Google Cloud Project field is empty, often because the integration was just changed and the previous project was reset.

Solution: Pick a project from the dropdown after the integration finishes loading. The list is refreshed automatically when you switch integrations.

SQL query is empty or rejected by BigQuery

Cause: The SQL Query field is empty, contains only whitespace, or the syntax is not valid BigQuery Standard SQL.

Solution: Test the query first in the BigQuery console. Make sure to use backticks around fully qualified table names (project.dataset.table) and Standard SQL functions.

Permission denied or table not found

Cause: The integration account lacks read access to the dataset, or the table reference is wrong.

Solution: Verify the service or user account has at least BigQuery Data Viewer and BigQuery Job User roles on the target project. Double-check the dataset and table names.

Queries time out or return too much data

Cause: The query scans large tables without filters or returns millions of rows that overload downstream nodes.

Solution: Add a WHERE clause on a partitioned column, use LIMIT, or pre-aggregate with GROUP BY. Avoid SELECT * on large tables.

Best practices

Tip

Filter on partitioned columns. Most BigQuery analytics tables are partitioned by date. Adding WHERE date >= ... cuts the bytes scanned and the cost dramatically.

Tip

Select only the columns you need. Listing columns explicitly is faster, cheaper, and more robust to schema changes than SELECT *.

Warning

Avoid unbounded queries. Running a query that scans terabytes can hit project quotas, slow your workflow, and incur unexpected costs. Always preview the bytes processed in the BigQuery console first.

Warning

Sanitize template variables. Values injected via {{var}} are not parameterized at the BigQuery layer. Make sure upstream values come from trusted sources before using them in WHERE clauses.

How does it fit into a workflow?

The Google BigQuery Reader is typically used at the start of a workflow to pull a dataset that subsequent nodes process row by row or in aggregate.

graph LR
    BQ[Google BigQuery Reader] --> Loop[Loop node]
    Loop --> LLM[LLM node processes each row]
    LLM --> Out[Write results downstream]