How to Set Up Databricks AI/BI Genie
Last updated Apr 25, 2026

What You Are Setting Up
Databricks AI/BI Genie is a chat interface that lets business users ask plain-English questions about data in a Databricks workspace and get SQL-backed answers in seconds. A basic Genie space takes under an hour to stand up if your data is already registered in Unity Catalog; a well-tuned one takes a few hours. This guide walks through the full configuration: creating the space, reviewing query suggestions, building a knowledge store, adding example queries, and sharing the space with your team.
Why Teams Are Adopting It
Genie became generally available on June 12, 2025. During its preview period, more than 4,000 customers adopted the tool. On Azure Databricks, Genie monthly active users grew over 300% year over year. The core appeal is simple: instead of filing a ticket and waiting for a data analyst to write a query, an ops manager types "show me this week's revenue by region" and gets a chart in seconds.
The architecture is straightforward. A Genie space connects to one or more Unity Catalog tables, uses a serverless or pro SQL warehouse for compute, and answers user questions by generating and running SQL against those tables. Responses can be flagged as Trusted when they come from pre-approved SQL patterns, which gives teams an auditable layer on top of the AI-generated output.
Prerequisites
Before creating a Genie space, confirm you have three things in place.
First, Unity Catalog. All tables you plan to include must be registered in Unity Catalog. Genie does not work with legacy Hive Metastore tables.
Second, a SQL warehouse. You need a pro or serverless SQL warehouse. Classic warehouses are not supported. You also need at least CAN USE permission on that warehouse.
Third, entitlements and data access. You need the Databricks SQL workspace entitlement and SELECT privileges on the tables you want Genie to query.
If you are on a shared workspace, confirm with your admin that partner-powered AI features are enabled at the account level. Without that toggle on, the Genie spaces icon appears in the sidebar but no spaces load.
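The data-access prerequisite can be sketched as a few Unity Catalog grants. This is a minimal illustration, not a required setup: the group name genie_space_users is hypothetical, and the catalog, schema, and table names are borrowed from the sales.crm.opportunity example used later in this guide. Warehouse permissions and the Databricks SQL entitlement are managed outside SQL and are not shown.

```sql
-- Assumption: `genie_space_users` is a hypothetical group; replace it with your own.
-- SELECT on a table is only usable if the principal can also reach
-- the parent catalog and schema.
GRANT USE CATALOG ON CATALOG sales TO `genie_space_users`;
GRANT USE SCHEMA ON SCHEMA sales.crm TO `genie_space_users`;
GRANT SELECT ON TABLE sales.crm.opportunity TO `genie_space_users`;

-- Confirm what the group can do on the table.
SHOW GRANTS `genie_space_users` ON TABLE sales.crm.opportunity;
```

Granting to a group rather than to individual users keeps the access list easy to audit as the space's audience grows.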
Step 1: Create the Genie Space
Open your Databricks workspace and click Genie spaces in the left sidebar. Click New in the upper-right corner. A dialog opens asking you to select data sources. Choose the Unity Catalog tables or views you want Genie to answer questions about. You can add up to 30 tables per space. Click Create.
At this point you have a functional but unconfigured space. Genie can already attempt to answer questions, but response accuracy will be inconsistent without further setup. Treat this step as laying the foundation.
One practical note on scoping: create separate Genie spaces for different business domains rather than loading every available table into one space. A sales Genie space and a finance Genie space each perform better than a single space with 25 unrelated tables. Focused spaces give the model cleaner context and reduce hallucination on ambiguous column names.
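One way to plan that split is to see how your tables are already distributed across schemas. The query below is a sketch that assumes your workspace exposes the Unity Catalog information schema and that sales is the catalog you are scoping; it returns only objects you have permission to see.

```sql
-- Count tables and views per schema to spot natural domain boundaries
-- and schemas that would overflow a single space.
-- Assumption: `sales` is the catalog of interest; swap in your own.
SELECT
  table_schema,
  COUNT(*) AS table_count
FROM system.information_schema.tables
WHERE table_catalog = 'sales'
  AND table_schema <> 'information_schema'
GROUP BY table_schema
ORDER BY table_count DESC;
```

A schema with far more tables than the per-space limit is a signal to pick the handful that answer the domain's core questions rather than adding everything.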
Step 2: Review Query Suggestions
After adding tables, Genie scans your workspace query history for SQL queries that reference those tables. If relevant queries are found, a notification appears in the Data tab of the Instructions panel. Click Review to open the suggested query dialog.
Each suggestion shows a proposed question title and the full SQL query. Review them one by one. Edit the title to match how a real business user would phrase the question. Specific phrasing matters because Genie uses the title text to match incoming prompts to stored examples.
Accept queries that represent real, recurring patterns. Reject queries that are one-off or unrepresentative. Accepted queries go into the SQL Queries context for the space and form the beginning of Genie's example library.
If no suggestions appear, either the tables are new and have no query history, or permissions prevent Genie from surfacing the workspace's existing queries. In that case, move on and add example queries manually in Step 4.
Step 3: Build the Knowledge Store
The knowledge store is the single biggest lever for improving response quality. It does not require writing SQL. It is a structured way of telling Genie what your data means in plain English.
Open Configure and click Data. Click any table name to see its columns. From here you can add four types of semantic context.
Column descriptions: Write plain-English descriptions for columns with ambiguous or technical names. If you have a column called fc_cat, add a description saying it stands for "forecast category" and can hold values "Pipeline," "Commit," or "Best Case." This eliminates a class of wrong answers where Genie guesses at column intent.
Synonyms: Map business terms to column names. If your team calls it ARR but the column is named annual_recurring_revenue, add a synonym. Any question about ARR then resolves to the right column without the user knowing its actual name.
Join relationships: Define how tables connect. Select a table and click Add relationship. Enter the join condition, for example accounts.id = opportunity.accountid. Genie uses these definitions to construct accurate JOIN clauses without being prompted each time. For multi-column joins or joins using SQL expressions, click Use SQL expression to write the full condition.
SQL expressions: Define reusable measures and filters that encode business logic. A common example is a current open pipeline measure:
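    -- Open Pipeline: total amount of opportunities that are not closed
    -- and are forecast as Pipeline
    SELECT SUM(amount)
    FROM sales.crm.opportunity
    WHERE stagename NOT ILIKE '%closed%'
      AND forecastcategory = 'Pipeline'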
Save this as a named measure called Open Pipeline. Users can now ask "what is our open pipeline this quarter?" and get a consistent, repeatable answer every time. Each space supports up to 200 knowledge store snippets across descriptions, joins, and expressions.
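To see how a join relationship and a measure work together, here is a sketch of the kind of SQL a space could produce for a question like "open pipeline by account" once both are defined. It is an illustration, not Genie's literal output. The accounts table location (sales.crm.accounts) and its name column are assumptions; the join condition and filters come from the examples above.

```sql
-- Open pipeline per account, combining the join relationship
-- (accounts.id = opportunity.accountid) with the Open Pipeline filters.
-- Assumptions: accounts lives in sales.crm and has a `name` column.
SELECT
  a.name        AS account_name,
  SUM(o.amount) AS open_pipeline
FROM sales.crm.opportunity AS o
JOIN sales.crm.accounts AS a
  ON a.id = o.accountid
WHERE o.stagename NOT ILIKE '%closed%'
  AND o.forecastcategory = 'Pipeline'
GROUP BY a.name
ORDER BY open_pipeline DESC;
```

Running a query like this by hand is a quick way to check that the join condition and the measure return sensible numbers before users start relying on them.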
Step 4: Add Text Instructions and Example Queries
Click Configure and then Instructions. Open the Text tab and write guidelines in the General instructions field. The most useful patterns are date definitions, term definitions, and filter rules.
Date definition example: "When users ask for 'this month,' use the current calendar month based on CURRENT_DATE."
Term definition example: "A closed deal means stagename = 'Closed Won'. A lost deal means stagename = 'Closed Lost'."
Filter rule example: "Always filter out accounts where account_type = 'Internal' unless the user explicitly asks to include them."
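Taken together, the three example instructions above imply SQL along these lines for a question such as "how many deals did we close this month?". This is a sketch under assumptions: the closedate column and the placement of account_type on a sales.crm.accounts table are hypothetical, and the exact SQL Genie generates will vary.

```sql
-- Date definition: "this month" = current calendar month from CURRENT_DATE.
-- Term definition: a closed deal has stagename = 'Closed Won'.
-- Filter rule: exclude accounts with account_type = 'Internal'.
-- Assumptions: `closedate` and `sales.crm.accounts.account_type` are hypothetical.
SELECT COUNT(*) AS closed_deals_this_month
FROM sales.crm.opportunity AS o
JOIN sales.crm.accounts AS a
  ON a.id = o.accountid
WHERE o.stagename = 'Closed Won'
  AND date_trunc('MONTH', o.closedate) = date_trunc('MONTH', current_date())
  -- IS DISTINCT FROM keeps rows whose account_type is NULL,
  -- which a plain <> comparison would silently drop.
  AND a.account_type IS DISTINCT FROM 'Internal';
```

Writing out the SQL you expect an instruction to produce is a useful test of whether the instruction is specific enough: if you cannot translate it into a WHERE clause, Genie probably cannot either.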
Keep instructions specific and free of contradictions. Vague guidance like "be helpful and accurate" adds noise without improving response quality. Genie's output is nondeterministic, so conflicting instructions increase the chance of inconsistent answers.
In the SQL Queries tab, add example queries for the most common questions your team will ask. Write each title as the exact question a user would type. The more closely the title matches real user phrasing, the more reliably Genie returns a Trusted response. Focus on queries that encode logic unique to your organization: specific joins, business-defined filters, and aggregations that reflect how your team actually defines its metrics. Each space supports up to 100 total instructions across example queries, SQL functions, and the general instructions block.
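As an illustration, an example query entry pairs a user-phrased title with SQL that encodes your organization's definitions. The title, the closedate column, and the accounts table location and columns below are hypothetical; the stage and account-type values reuse the definitions from the text instructions above.

```sql
-- Hypothetical title: "What are our top 10 accounts by closed won revenue this quarter?"
-- Assumptions: `closedate`, `sales.crm.accounts`, `name`, and `account_type`
-- are illustrative names, not confirmed schema.
SELECT
  a.name        AS account_name,
  SUM(o.amount) AS closed_won_revenue
FROM sales.crm.opportunity AS o
JOIN sales.crm.accounts AS a
  ON a.id = o.accountid
WHERE o.stagename = 'Closed Won'
  AND date_trunc('QUARTER', o.closedate) = date_trunc('QUARTER', current_date())
  AND a.account_type IS DISTINCT FROM 'Internal'
GROUP BY a.name
ORDER BY closed_won_revenue DESC
LIMIT 10;
```

The value of an entry like this lies less in the SQL syntax than in the choices it pins down: which stage counts as won, which date column defines the quarter, and which accounts are excluded.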
Step 5: Share With Your Team
Click the share icon in the top-right corner to open the access control dialog. Add users or groups and assign one of three permission levels: CAN VIEW for users who will ask questions and read answers, CAN EDIT for analysts who will tune the space configuration, and CAN MANAGE for full control including sharing.
For most business users, CAN VIEW is the right level. Assign CAN EDIT to the analysts who plan to continue refining the space.
After sharing, send a short briefing explaining what data is in the space and what kinds of questions it is designed to answer. Genie performs best on questions within the domain of its configured tables. A brief orientation reduces frustration from users asking off-scope questions and getting low-confidence or incorrect answers.
Monitoring and Improvement
Open Configure and click Monitor to see conversation history and flag problematic responses. When a response is wrong, use it as a starting point to add a corrective example query or refine an instruction. The spaces that work best after 30 days are the ones with 20 to 30 carefully written example queries covering the domain's core questions, not the ones with the most tables or the most instructions.
Schedule a weekly review for the first month. Check which questions triggered fallback answers or hedged responses. Add example queries for recurring patterns you see in the conversation logs. After the first month, a biweekly review is usually sufficient to keep accuracy high as questions evolve.
If your team works primarily from uploaded spreadsheets and CSV files rather than a managed data lakehouse, VSLZ handles ad-hoc data questions directly from a file upload without requiring Unity Catalog or a SQL warehouse to be configured first.
Summary
A well-configured Genie space takes roughly two and a half to three and a half hours to set up properly: around 30 minutes to create the space and add tables, one hour to build the knowledge store, and one to two hours to write 15 to 20 good example queries. The return is a self-service analytics tool your non-technical stakeholders can use without opening a ticket. The monitoring step is what keeps it reliable over time; the spaces that degrade are the ones that never get reviewed after launch.
FAQ
What are the prerequisites for using Databricks AI/BI Genie?
You need three things: data registered in Unity Catalog (Hive Metastore tables are not supported), a pro or serverless SQL warehouse with at least CAN USE permission, and the Databricks SQL workspace entitlement along with SELECT privileges on the tables you want to include. Your workspace admin also needs to enable partner-powered AI features at the account level.
How many tables can I add to a single Genie space?
Each Genie space supports up to 30 tables or views. For best results, keep spaces focused on a single business domain rather than loading all available tables into one space. A focused space with fewer, well-described tables performs better than a large space with many unrelated tables.
What is a knowledge store in Databricks Genie?
A knowledge store is a collection of semantic definitions that helps Genie understand your data. It includes column descriptions, synonyms that map business terms to column names, join relationship definitions between tables, and reusable SQL expressions for common measures and filters. Each space supports up to 200 knowledge store snippets. Building a knowledge store is the single most effective way to improve Genie response accuracy.
How do I improve Genie response accuracy after launch?
Review conversation history in the Monitor tab weekly during the first month. Identify questions that returned low-confidence or wrong answers. For each recurring pattern, add a corrective example SQL query with a title that matches the natural phrasing of that question. Also refine your general instructions to clarify date definitions, business term definitions, and any required default filters. Spaces that receive regular updates maintain higher accuracy over time.
Can business users access a Genie space without knowing SQL?
Yes. Business users with CAN VIEW permission can ask questions in plain English through the Genie chat interface without writing any SQL. Genie generates and runs the SQL behind the scenes and returns results as tables or visualizations. Users can see the SQL behind a response if they have CAN EDIT access, which is useful for verifying answers or for analysts who want to learn from the generated queries.


