Guides

How to Set Up Snowflake Cortex Analyst

Arkzero ResearchApr 24, 20267 min read

Last updated Apr 24, 2026

Snowflake Cortex Analyst lets analysts and business users ask plain-English questions about their warehouse data and get SQL-generated answers instantly. Setup requires a Snowflake account, a semantic model YAML file that maps your tables into business terms, and either a Streamlit interface or the Cortex REST API. This guide walks through each step from environment prep to running your first natural language query.

What Snowflake Cortex Analyst Does

Snowflake Cortex Analyst translates plain-English questions into SQL and runs them against your Snowflake tables. Ask "What were the top five sales regions last quarter?" and Cortex Analyst generates the query, executes it, and returns the answer in a chat interface or via an API response.

The mechanism behind it is a semantic model: a YAML file you create once that maps your technical table and column names to business-friendly terms. When a user submits a question, Cortex Analyst reads that model to decide which tables apply, which columns represent revenue versus region versus date, and what aggregations make sense. The model is what separates a correct answer from a confusing SQL error.

Cortex Analyst reached general availability in early 2026 and is included in all Snowflake editions at no additional charge beyond the compute credits consumed.

Prerequisites

Before starting, confirm you have access to the following:

A Snowflake account (any cloud region) with a role that can create databases, schemas, tables, and stages. A free trial account with ACCOUNTADMIN covers all required permissions.
At least one structured table already loaded into Snowflake with clean column types. Date columns stored as strings will break Cortex Analyst's time intelligence features.
Python 3.8 or above installed locally, along with the Snowflake Python connector: pip install snowflake-connector-python

Start with a single table. Trying to include your entire warehouse in the first semantic model is a common mistake that makes debugging difficult.

Step 1: Identify Your Target Table

A sales transaction table is a strong starting point because it has clear dimensions (date, product, region, salesperson) and obvious measures (revenue, quantity, margin). For this guide, assume a table called SALES.PUBLIC.ORDERS with these columns:

order_id        VARCHAR
order_date      DATE
region          VARCHAR
product_name    VARCHAR
salesperson     VARCHAR
revenue         FLOAT
quantity        INTEGER
margin_pct      FLOAT

Run a quick sanity check before building the semantic model:

SELECT * FROM SALES.PUBLIC.ORDERS LIMIT 10;
SELECT typeof(order_date) FROM SALES.PUBLIC.ORDERS LIMIT 1;

Confirm that order_date returns DATE or TIMESTAMP_NTZ, not TEXT. If it returns a string type, cast it in a view and point the semantic model at the view instead of the base table.

Step 2: Create the Semantic Model YAML

This step is where most guides lose readers. The semantic model YAML is not boilerplate — it is the configuration that determines whether Cortex Analyst gives useful answers or nonsensical ones.

Create a file called orders_model.yaml:

name: orders_semantic_model
description: Sales order data for the company, refreshed daily at midnight UTC.
tables:
  - name: orders
    description: One row per completed sales transaction.
    base_table:
      database: SALES
      schema: PUBLIC
      table: ORDERS
    time_dimensions:
      - name: order_date
        description: The date the order was placed.
        expr: order_date
        data_type: date
        unique: false
    dimensions:
      - name: region
        description: The geographic region of the sale. Also called territory or area.
        expr: region
        data_type: varchar
      - name: product_name
        description: The name of the product sold.
        expr: product_name
        data_type: varchar
      - name: salesperson
        description: The name of the salesperson who closed the deal. Also called rep or account executive.
        expr: salesperson
        data_type: varchar
    measures:
      - name: total_revenue
        description: Total revenue in US dollars.
        expr: SUM(revenue)
        data_type: float
        default_aggregation: sum
      - name: total_quantity
        description: Total units sold.
        expr: SUM(quantity)
        data_type: integer
        default_aggregation: sum
      - name: avg_margin_pct
        description: Average gross margin percentage across orders.
        expr: AVG(margin_pct)
        data_type: float
        default_aggregation: avg

Three things to get right in this file. First, the description fields are read by the LLM, not by a parser — write them in the language your team actually uses. If your team calls margin "gross margin," say gross margin. If salespeople are referred to as "reps," add that as a synonym in the description. Second, classify each column correctly: dimensions are categorical or qualitative fields you filter or group by, time dimensions are date or timestamp columns used for period questions like "last month" or "year to date," and measures are numeric columns you aggregate. Third, match the database and schema in base_table exactly to where the table lives in Snowflake — a mismatch here produces a cryptic error that does not name the cause.

Step 3: Upload the Semantic Model to a Stage

Cortex Analyst reads the YAML from a Snowflake internal stage, not from your local filesystem. Create the stage and upload the file.

CREATE STAGE IF NOT EXISTS SALES.PUBLIC.cortex_models
  ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');

Then from your terminal using the Snowflake CLI:

snowsql -a <your_account> -u <your_username> \
  -q "PUT file:///path/to/orders_model.yaml @SALES.PUBLIC.cortex_models AUTO_COMPRESS=FALSE"

Verify the upload before proceeding:

LIST @SALES.PUBLIC.cortex_models;

You should see orders_model.yaml with a non-zero file size. If the file size shows as zero, the upload silently failed — retry with a full absolute path.

Step 4: Query Using the Python Client

With the semantic model staged, call Cortex Analyst from Python to verify the setup before building any interface.

import snowflake.connector
import json

conn = snowflake.connector.connect(
    account="<your_account>",
    user="<your_username>",
    password="<your_password>",
    database="SALES",
    schema="PUBLIC",
    warehouse="COMPUTE_WH"
)

cursor = conn.cursor()
question = "What were the top three regions by total revenue last month?"

response = cursor.execute(
    """
    SELECT SNOWFLAKE.CORTEX.ANALYST(
        '@SALES.PUBLIC.cortex_models/orders_model.yaml',
        ?
    )
    """,
    (question,)
).fetchone()

result = json.loads(response[0])
print(json.dumps(result, indent=2))

A successful response returns the SQL Cortex Analyst generated and the result set. If you get a model-not-found error, check that the stage path in the query matches the exact stage name and YAML filename from Step 3. If you get a permission error, verify the role used for the connection has USAGE on the database and schema and READ on the stage.

Step 5: Add a Streamlit Interface for Business Users

For team members who should not be writing Python, a Streamlit app provides a chat-like interface without exposing any code.

Install Streamlit with pip install streamlit, then create app.py:

import streamlit as st
import snowflake.connector
import json

st.title("Sales Data Assistant")
question = st.text_input("Ask a question about your sales data:")

if question:
    conn = snowflake.connector.connect(
        account=st.secrets["snowflake"]["account"],
        user=st.secrets["snowflake"]["user"],
        password=st.secrets["snowflake"]["password"],
        database="SALES",
        schema="PUBLIC",
        warehouse="COMPUTE_WH"
    )
    cursor = conn.cursor()
    response = cursor.execute(
        """
        SELECT SNOWFLAKE.CORTEX.ANALYST(
            '@SALES.PUBLIC.cortex_models/orders_model.yaml',
            ?
        )
        """,
        (question,)
    ).fetchone()
    result = json.loads(response[0])
    answer = result.get("message", {}).get("content", "No answer returned.")
    st.write(answer)

Store Snowflake credentials in .streamlit/secrets.toml under a [snowflake] block. Run the app with streamlit run app.py and share the local URL with your team.

For teams without a Snowflake environment who want the same natural language querying experience, VSLZ.ai lets you upload a file directly and ask plain-English questions without any YAML configuration or warehouse setup.

Where Cortex Analyst Works Well and Where It Struggles

Cortex Analyst is accurate for aggregation questions: totals, averages, top-N rankings, and period comparisons. In testing across 500 sample queries on well-described semantic models, Snowflake reports accuracy rates above 85% for these question types.

It struggles when questions require joining multiple tables not described in the same semantic model, when business logic is implicit in the phrasing ("show me at-risk accounts" where at-risk means a specific churn score threshold your team defined), or when the semantic model descriptions are vague. If a question returns a wrong result, the fix is almost always in the YAML descriptions, not the underlying data.

Snowflake logs all Cortex Analyst queries in the QUERY_HISTORY view. Reviewing that log after the first week of real usage shows which questions fail most often and points directly at which descriptions in the semantic model need more specificity.

Next Steps

Once the basic setup works on one table, extend the semantic model to cover additional tables and add a relationships section so Cortex Analyst can join them. Snowflake's semantic model specification also supports verified queries: saved question-and-SQL pairs that Cortex Analyst will prefer when a new question closely matches a verified one. These verified queries improve accuracy significantly for the 10 to 20 questions your team asks most often.

FAQ

What is Snowflake Cortex Analyst and how does it work?

Snowflake Cortex Analyst is a feature within Snowflake Cortex AI that translates plain-English questions into SQL queries and runs them against your Snowflake data. It uses a semantic model YAML file you configure to understand your table structure in business terms, then applies a large language model to convert user questions into accurate SQL. It is available in all Snowflake editions at no additional license cost beyond standard compute credits.

What is a semantic model in Snowflake Cortex Analyst?

A semantic model is a YAML configuration file that maps your Snowflake table and column names into business-friendly terms. It classifies each column as a dimension (categorical field for grouping or filtering), time dimension (date or timestamp for period-based questions), or measure (numeric field to aggregate). The descriptions you write in this file are read by the LLM to decide which columns to use when answering a question, making accurate descriptions more important than the technical configuration.

Do I need to know SQL or Python to use Snowflake Cortex Analyst?

End users do not need SQL or Python knowledge to ask questions once Cortex Analyst is set up. However, an initial technical setup is required: creating a Snowflake stage, writing the semantic model YAML, and deploying the Python client or Streamlit app. A data engineer or analyst typically handles the one-time setup, after which business users can query data through a chat interface without writing any code.

How accurate is Snowflake Cortex Analyst for business questions?

Snowflake reports accuracy rates above 85% for aggregation questions (totals, averages, top-N rankings, period comparisons) on well-configured semantic models. Accuracy drops for questions that require joining tables not covered in the same semantic model, or for questions with implicit business logic embedded in the phrasing. The primary lever for improving accuracy is adding more specific descriptions and synonyms to the semantic model YAML, not modifying the underlying data.

How much does Snowflake Cortex Analyst cost?

Snowflake Cortex Analyst is included in all Snowflake editions with no additional license fee. Usage is billed through standard Snowflake compute credits consumed when executing the generated SQL queries. There is no per-query fee for the natural language translation itself. For exact credit consumption rates for your warehouse size, check your Snowflake account's usage dashboard.