
How to Use BigQuery Data Canvas

Arkzero Research | Mar 29, 2026 | 9 min read

Last updated Mar 29, 2026

BigQuery Data Canvas is a visual, node-based workspace inside Google Cloud Console that lets analysts query and chart BigQuery data without writing SQL from scratch. Users chain search, query, and visualization nodes on a drag-and-drop canvas, with Gemini AI generating SQL from natural language prompts. Two IAM roles and an enabled Gemini API are all that is required to get started. This guide covers setup, building an analysis, and the practical limits of the canvas environment.

The canvas sits inside the BigQuery console and requires no additional installation. This tutorial is written for teams that already have data in BigQuery and walks from setup through a first analysis.

What You Need Before Starting

Data Canvas is part of BigQuery Studio, available in all Google Cloud projects at no extra charge. Queries run through the canvas are billed at standard BigQuery on-demand rates. Before opening the canvas, confirm three things are in place.

Gemini in BigQuery enabled: Navigate to BigQuery in the Google Cloud Console and confirm Gemini is active for your project. If the Gemini icon does not appear in the toolbar, go to APIs and Services, search for "Vertex AI API," and enable it. Gemini in BigQuery depends on the Vertex AI API.

IAM role: BigQuery Studio User (roles/bigquery.studioUser): Grants access to BigQuery Studio, the environment where Data Canvas runs.

IAM role: Gemini for Google Cloud User (roles/cloudaicompanion.user): Required for natural language and AI-assisted features within the canvas. Without this role, the AI assistant does not appear.

Assign both roles to each analyst's account under IAM and Admin in the Google Cloud Console. Google's official documentation identifies these two roles as the minimum required for full canvas functionality, including natural language query generation.
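For teams that prefer the command line to the console, the same setup can be scripted. The sketch below only builds gcloud command strings and does not run them. The project ID and analyst email are placeholders, and aiplatform.googleapis.com is the service name for the Vertex AI API mentioned above.

```python
import shlex

# The two roles this guide names as the minimum for Data Canvas.
REQUIRED_ROLES = [
    "roles/bigquery.studioUser",
    "roles/cloudaicompanion.user",
]


def setup_commands(project_id: str, analyst_email: str) -> list[str]:
    """Return gcloud commands that enable the API and grant both roles."""
    commands = [
        shlex.join([
            "gcloud", "services", "enable", "aiplatform.googleapis.com",
            f"--project={project_id}",
        ])
    ]
    for role in REQUIRED_ROLES:
        commands.append(shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=user:{analyst_email}",
            f"--role={role}",
        ]))
    return commands


# Placeholder values for illustration only.
for command in setup_commands("my-project", "analyst@example.com"):
    print(command)
```

Printing the commands first, rather than executing them, leaves room for an administrator to review each binding before it is applied.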

Opening Data Canvas in BigQuery

Log in to the Google Cloud Console and navigate to BigQuery using the left navigation or the search bar. In the BigQuery Studio interface, locate the "SQL query" button in the top-left area of the editor pane. Click the dropdown arrow next to this button and select "Data Canvas" from the menu.

A blank canvas workspace opens in the same browser window. The interface shows a grid background with a toolbar at the top containing buttons for each node type. Unlike the standard SQL editor, there is no code pane. The entire workspace is a spatial canvas where analyses are built visually.

Understanding the Node-Based Interface

Every step in a Data Canvas analysis is represented by a node. Google's official documentation identifies five node types that cover the complete analytics workflow:

  • Search nodes: Discover datasets using natural language. Type a description and Data Canvas surfaces matching tables from Dataplex Universal Catalog.
  • Table nodes: Display raw table data with a row preview. Use these to inspect columns and sample values before writing queries.
  • SQL nodes: Execute SQL against connected tables. Write SQL directly or use Gemini to generate it from a natural language prompt.
  • Visualization nodes: Create charts from query results. Supports bar, line, pie, scatter, and heat map formats.
  • Destination nodes: Write query results to a new BigQuery table for downstream use.

Nodes connect by dragging lines between outputs and inputs. The canvas supports branching: a single source table can feed into two independent SQL nodes running different analyses in parallel.
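One way to picture the branching behavior is as a small directed graph. The sketch below is a mental model only, not how Data Canvas stores a canvas internally; the Node class and the node names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One canvas step; `kind` is one of the node types listed above."""
    name: str
    kind: str
    downstream: list["Node"] = field(default_factory=list)

    def connect(self, other: "Node") -> "Node":
        """Draw a line from this node's output to `other`'s input."""
        self.downstream.append(other)
        return other


def walk(node: Node, depth: int = 0) -> list[str]:
    """Return an indented outline of everything reachable from `node`."""
    lines = ["  " * depth + f"{node.kind}: {node.name}"]
    for child in node.downstream:
        lines.extend(walk(child, depth + 1))
    return lines


# One source table feeding two independent SQL branches.
trips = Node("taxi_trips", "table")
fares = trips.connect(Node("avg fare by payment type", "sql"))
fares.connect(Node("bar chart", "visualization"))
areas = trips.connect(Node("top pickup areas", "sql"))
areas.connect(Node("ranked_areas", "destination"))

outline = walk(trips)
print("\n".join(outline))
```

Each branch reads from the same table but has its own downstream nodes, which is why changing one analysis leaves the other untouched.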

Discovering and Adding Data

Click the "Search" node button in the toolbar to start a new analysis. A text field appears where you describe the data you want. Data Canvas searches Dataplex Universal Catalog, which indexes all BigQuery datasets in your project plus any public datasets you have access to.

For example, type "Google public e-commerce data" to find the thelook_ecommerce dataset, or "Chicago taxi rides" to find the taxi_trips table in the public chicago_taxi_trips dataset. Click the table name in the search results to add it to the canvas as a Table node.

If you already know the exact table you want, click "Table" in the toolbar and type the full reference directly: project.dataset.table_name. This skips the search step and is faster for analysts who know their data warehouse layout.
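The three-part reference can be checked before it is pasted into the Table node. The helper below is a hypothetical convenience, not part of any Google library, and it assumes a plain project ID; legacy domain-scoped project IDs that contain dots or colons would need extra handling.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str


def parse_table_ref(ref: str) -> TableRef:
    """Split 'project.dataset.table' into its three parts."""
    # Tolerate surrounding whitespace and the backticks used in BigQuery SQL.
    parts = ref.strip().strip("`").split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected project.dataset.table, got {ref!r}")
    return TableRef(*parts)


ref = parse_table_ref("bigquery-public-data.chicago_taxi_trips.taxi_trips")
print(ref.dataset, ref.table)
```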

Querying Data with Natural Language

Select an existing Table node and click the "+" icon to add a connected SQL node. A dialog appears with two options: "Write SQL" and "Ask Gemini." Choose "Ask Gemini" and type your analysis request in plain English.

For the Chicago taxi trips table, an effective prompt would be: "Show the average fare and trip duration by payment type for trips taken in 2024." Gemini typically produces a GROUP BY query with the relevant column references, casts, and date filters.
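To show the shape of query such a prompt leads to, the sketch below runs an equivalent aggregation against a tiny in-memory SQLite table. This is a local stand-in rather than Gemini's actual output: the rows are invented, the column names (trip_start_timestamp, payment_type, fare, trip_seconds) are assumed to match the public table, and SQLite's strftime takes the place of BigQuery's EXTRACT(YEAR FROM ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxi_trips ("
    "trip_start_timestamp TEXT, payment_type TEXT, fare REAL, trip_seconds INTEGER)"
)
# Invented sample rows; the last one falls outside 2024 and must be filtered out.
conn.executemany(
    "INSERT INTO taxi_trips VALUES (?, ?, ?, ?)",
    [
        ("2024-01-05 08:00:00", "Cash", 10.0, 600),
        ("2024-03-12 17:30:00", "Cash", 20.0, 1200),
        ("2024-06-01 09:15:00", "Credit Card", 30.0, 1500),
        ("2023-12-31 23:50:00", "Credit Card", 99.0, 3000),
    ],
)

QUERY = """
SELECT payment_type,
       AVG(fare)         AS avg_fare,
       AVG(trip_seconds) AS avg_trip_seconds
FROM taxi_trips
WHERE strftime('%Y', trip_start_timestamp) = '2024'
GROUP BY payment_type
ORDER BY payment_type
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The three things to look for in generated SQL are all visible here: the grouping column, the aggregate expressions, and the date filter.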

According to Google Cloud documentation, prompt quality improves when you reference actual column names from the table. Rather than "show trips by type," write "show count of trips grouped by payment_type and company." Gemini uses the column names as anchors to generate more accurate SQL, reducing the need to edit its output afterward.

Gemini also handles more complex requests. A prompt like "rank the top 10 pickup community areas by total fare for each quarter of 2024" generates a query that uses a window function, such as RANK() partitioned by quarter, which would take a non-SQL user significant time to write manually.
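A ranking-per-quarter query of this kind can be sketched locally as follows. The example uses SQLite as a stand-in for BigQuery, invented rows, and a top 2 instead of a top 10 to keep the data small; the column names are assumptions, and the quarter arithmetic replaces BigQuery's EXTRACT(QUARTER FROM ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxi_trips ("
    "trip_start_timestamp TEXT, pickup_community_area INTEGER, fare REAL)"
)
conn.executemany(
    "INSERT INTO taxi_trips VALUES (?, ?, ?)",
    [
        ("2024-01-10", 8, 50.0),
        ("2024-02-11", 8, 25.0),
        ("2024-02-20", 32, 60.0),
        ("2024-03-01", 76, 10.0),
        ("2024-04-02", 32, 40.0),
        ("2024-05-15", 76, 45.0),
        ("2024-06-20", 8, 5.0),
    ],
)

QUERY = """
WITH totals AS (
    -- Total fare per (quarter, area); months 1-3 map to quarter 1, and so on.
    SELECT (CAST(strftime('%m', trip_start_timestamp) AS INTEGER) + 2) / 3 AS quarter,
           pickup_community_area,
           SUM(fare) AS total_fare
    FROM taxi_trips
    WHERE strftime('%Y', trip_start_timestamp) = '2024'
    GROUP BY quarter, pickup_community_area
),
ranked AS (
    -- The window function restarts the ranking inside each quarter.
    SELECT quarter, pickup_community_area, total_fare,
           RANK() OVER (PARTITION BY quarter ORDER BY total_fare DESC) AS fare_rank
    FROM totals
)
SELECT quarter, pickup_community_area, total_fare, fare_rank
FROM ranked
WHERE fare_rank <= :top_n
ORDER BY quarter, fare_rank
"""

rows = conn.execute(QUERY, {"top_n": 2}).fetchall()
for row in rows:
    print(row)
```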

After Gemini generates the SQL, review it in the node editor before running. Check that column names match the table schema, that date ranges are correct, and that the aggregation logic reflects your intent. An estimated bytes processed figure appears before execution, giving a cost preview before any charges are incurred.
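The bytes estimate converts to an approximate dollar figure with simple arithmetic. The sketch below assumes on-demand pricing and uses 6.25 USD per TiB as a default; that rate is an assumption that varies by region and over time, and the calculation ignores any free monthly allowance or minimum charge, so check the current pricing page before relying on it.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def estimate_on_demand_cost(bytes_processed: int, usd_per_tib: float = 6.25) -> float:
    """Approximate on-demand cost in USD for a single query.

    `usd_per_tib` is an assumed list price, not a value read from BigQuery.
    """
    return bytes_processed / TIB * usd_per_tib


# A query whose estimate shows 50 GiB processed.
scan_bytes = 50 * 1024 ** 3
print(f"${estimate_on_demand_cost(scan_bytes):.4f}")
```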

Working with Joins

Data Canvas handles multi-table analysis through connected table references. Add a second table to the canvas, then type a natural language prompt in a SQL node that references both tables. Gemini detects the cross-table reference and generates a JOIN clause automatically.

For example: "Join orders with customers on customer_id and show total revenue by customer segment." Gemini writes the full JOIN with the aggregation and the correct key columns.

Alternatively, you can write the join SQL manually in the SQL node editor. Drag connection lines from both Table nodes into a single SQL node and write a standard JOIN statement. Both approaches produce the same result.
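Whichever route is taken, the SQL node ends up holding a query along these lines. The sketch runs it against two small in-memory SQLite tables as a stand-in for BigQuery; the segment and revenue column names and all of the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, revenue REAL);
INSERT INTO customers VALUES (1, 'Enterprise'), (2, 'SMB'), (3, 'SMB');
INSERT INTO orders VALUES (10, 1, 500.0), (11, 2, 120.0), (12, 3, 80.0), (13, 1, 250.0);
""")

JOIN_QUERY = """
SELECT c.segment,
       SUM(o.revenue) AS total_revenue
FROM orders AS o
JOIN customers AS c
  ON o.customer_id = c.customer_id
GROUP BY c.segment
ORDER BY total_revenue DESC
"""

rows = conn.execute(JOIN_QUERY).fetchall()
print(rows)
```

When reviewing a generated join, the ON clause is the line to check first, since a wrong key column silently inflates or drops rows.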

Multi-table joins work well in Data Canvas for two or three tables. For analyses involving four or more joins, or queries with complex subquery logic, writing SQL directly in the SQL node is more reliable than relying on Gemini generation.

Building Visualizations

Once a SQL node has run successfully, click the chart icon in the output panel or add a Visualization node from the toolbar connected to the SQL node.

A sidebar opens with chart configuration options. Select a chart type from the dropdown, then assign columns to the X axis and Y axis. For a bar chart showing average fare by payment type, assign payment_type to the X axis and avg_fare to the Y axis. For a line chart tracking monthly revenue, assign the date column to X and the revenue total to Y.

Add a "Color by" grouping to segment a single chart by a categorical column. Charts update automatically when you rerun the SQL node above them.
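Conceptually, a "Color by" grouping splits one result set into one series per category. The sketch below illustrates that idea in plain Python with invented rows; it is not the canvas's rendering code, and split_series is a hypothetical helper.

```python
from collections import defaultdict

# Rows as a SQL node might return them: (month, payment_type, revenue).
rows = [
    ("2024-01", "Cash", 100.0),
    ("2024-01", "Credit Card", 300.0),
    ("2024-02", "Cash", 120.0),
    ("2024-02", "Credit Card", 280.0),
]


def split_series(rows, x_index=0, color_index=1, y_index=2):
    """Group rows into one list of (x, y) points per distinct color value."""
    series = defaultdict(list)
    for row in rows:
        series[row[color_index]].append((row[x_index], row[y_index]))
    return dict(series)


series = split_series(rows)
for name, points in series.items():
    print(name, points)
```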

You can add multiple Visualization nodes to the same SQL node, producing different chart types from the same query result without duplicating the underlying query.

Saving Results and Exporting SQL

To write results to a permanent table, add a Destination node downstream from a SQL node. Enter the destination table reference (project.dataset.new_table_name) and click "Run." Data Canvas executes the query and writes the output to BigQuery. This is the standard path for sharing clean data with other team members or scheduling a regular data refresh.
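Outside the canvas, the same outcome can be reached with a CREATE OR REPLACE TABLE ... AS statement. The helper below is a sketch of that equivalent SQL, not a description of what the Destination node runs internally; the function name and both table references are placeholders.

```python
def as_create_table(select_sql: str, destination: str) -> str:
    """Wrap a SELECT in a CREATE OR REPLACE TABLE ... AS statement."""
    # Drop any trailing semicolon so the wrapped statement ends with exactly one.
    body = select_sql.strip().rstrip(";")
    return f"CREATE OR REPLACE TABLE `{destination}` AS\n{body};"


statement = as_create_table(
    "SELECT payment_type, AVG(fare) AS avg_fare "
    "FROM `my-project.taxi.trips` GROUP BY payment_type;",
    "my-project.analytics.avg_fare_by_payment_type",
)
print(statement)
```

Note that CREATE OR REPLACE overwrites an existing table of the same name, so point it at a new table unless a full refresh is intended.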

To get the SQL generated by the canvas for use in production pipelines, click "View SQL" on any SQL node. The full query text appears in an overlay and can be copied to the clipboard. From there, paste it into BigQuery's Scheduled Queries feature or a dbt project to operationalize the analysis.

Name and save the canvas using the file save button in the top bar. Saved canvases appear in the BigQuery Studio file browser and can be shared with other project members who have BigQuery Studio User access on the same Google Cloud project.

Limitations Worth Knowing

Data Canvas does not support scheduled queries directly. The canvas is an exploration environment, not a production pipeline. To automate any canvas-generated query, export the SQL and schedule it through BigQuery's native Scheduled Queries feature or a workflow orchestration tool like Dataform or Cloud Composer.
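As a rough illustration of the export-and-schedule path, the sketch below assembles a bq command-line call for a scheduled query. It only builds the command string; the flags reflect the bq query scheduling options as commonly documented and should be verified against the current bq reference, and the table name, display name, and SQL are placeholders.

```python
import shlex


def scheduled_query_command(sql: str, destination_table: str,
                            display_name: str,
                            schedule: str = "every 24 hours") -> str:
    """Build (but do not run) a bq command that schedules `sql`."""
    return shlex.join([
        "bq", "query",
        "--use_legacy_sql=false",
        f"--destination_table={destination_table}",
        f"--display_name={display_name}",
        f"--schedule={schedule}",
        "--replace=true",  # overwrite the destination table on each run
        sql,
    ])


command = scheduled_query_command(
    "SELECT 1",
    "analytics.daily_snapshot",
    "Daily snapshot",
)
print(command)
```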

Gemini's natural language accuracy is highest on single-table aggregations and basic joins. Complex queries involving nested subqueries, multiple CTEs, or window functions over partitioned data often require manual SQL editing. For these cases, writing SQL directly in the SQL node is faster than correcting Gemini output.

Data Canvas is also not a dashboard builder. Charts produced in the canvas are exploratory and cannot be published as shareable reports from within the canvas. To share visualizations with stakeholders, connect the underlying BigQuery table to Looker Studio or another BI tool.

For teams that want the full workflow from data upload to chart without any cloud configuration or IAM setup, VSLZ handles that from a single file upload with no infrastructure required.

The canvas is best understood as a lower-friction entry point for exploratory data analysis within the Google Cloud ecosystem. For organizations that already store data in BigQuery, it removes the SQL barrier for analysts who know what they want to find but are not fluent in writing the queries to find it.

FAQ

What is BigQuery Data Canvas?

BigQuery Data Canvas is a visual, node-based analytics workspace inside Google Cloud Console. It lets analysts build data queries and visualizations by chaining nodes together on a drag-and-drop canvas, using Gemini AI to generate SQL from natural language prompts. It is part of BigQuery Studio and requires no additional installation or pricing beyond standard BigQuery usage costs.

Does BigQuery Data Canvas require SQL knowledge?

No. Data Canvas includes a Gemini AI assistant that generates SQL from plain-language prompts. You can type a request like "show total sales by region for Q4 2024" and Gemini produces the query. However, reviewing and occasionally editing the generated SQL is recommended, as Gemini accuracy drops on complex multi-table or subquery-heavy requests. Knowing SQL helps you catch errors before running costly queries.

Is BigQuery Data Canvas free to use?

The canvas interface itself is included in BigQuery Studio at no extra charge. However, every query you run through the canvas is billed at standard BigQuery on-demand rates, which charge per tebibyte (TiB) of data processed. Data Canvas shows an estimated bytes processed figure before each query runs, giving you a cost preview. Gemini in BigQuery is included in the Google Cloud free tier up to certain usage limits, after which it bills separately.

How do I enable Gemini in BigQuery Data Canvas?

Enabling Gemini in Data Canvas requires two steps. First, enable the Vertex AI API for your Google Cloud project by going to APIs and Services, searching for "Vertex AI API," and clicking Enable. Second, assign the Gemini for Google Cloud User IAM role (roles/cloudaicompanion.user) to your account in the IAM and Admin section of Google Cloud Console. Without both steps, the Ask Gemini option does not appear in SQL nodes.

What is the difference between BigQuery Data Canvas and Looker Studio?

BigQuery Data Canvas is an exploratory analysis tool built for writing and testing queries. It uses a node-based canvas where analysts can discover data, generate SQL, and build charts for internal use during analysis. Looker Studio is a dashboard and reporting tool designed for creating shareable, publishable reports and dashboards for stakeholders. The two tools are complementary: use Data Canvas to explore and develop queries, then connect the resulting BigQuery tables to Looker Studio to build the final report.
