How to Set Up Jupyter AI for Data Analysis

Arkzero Research · Apr 28, 2026 · 7 min read

Last updated Apr 28, 2026

Jupyter AI is an open-source JupyterLab extension that connects AI models to your notebook environment. Install it with a single pip command, enter your OpenAI or Anthropic API key in settings, and use the %%ai magic command to generate analysis code, summarize results, and debug errors without leaving the notebook. This guide covers installation, model configuration, and three practical data analysis tasks you can run immediately.

If you already use Jupyter notebooks for data work, Jupyter AI adds a chat interface and a %%ai magic command that let you prompt a language model from inside the notebook itself. No browser tab switching, no copy-pasting code between tools. You install one package, add your API key, and the model becomes available as a collaborator in the same environment where your data lives.

What Jupyter AI Does

Jupyter AI is a JupyterLab extension maintained by the Project Jupyter team. It adds two things to your environment.

Jupyternaut is a sidebar chat UI where you ask questions, get code suggestions, and have the AI inspect or explain cells in your current notebook. The %%ai magic command lets you send a prompt directly from a notebook cell and have the response appear inline, formatted as code, markdown, or plain text depending on what you need.

Both interfaces connect to the same underlying model. You configure your provider and API key once in settings, and both use it.

As of early 2026, Jupyter AI supports OpenAI, Anthropic, Amazon Bedrock, Google Gemini, Hugging Face, and self-hosted models via Ollama. For most analysts working with business data, OpenAI (GPT-4o) or Anthropic (Claude Sonnet) are the practical choices because they handle unstructured data summaries and code generation reliably.

Step 1: Prerequisites

You need JupyterLab 4.x. Classic Notebook does not support the chat extension. Check your version:

jupyter lab --version

If the output is below 4.0, upgrade first:

pip install --upgrade jupyterlab

You also need Python 3.9 or higher. Check with python --version.
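
The same two checks can also be run from Python. This is a minimal sketch, assuming it runs in the environment where JupyterLab is installed; parse_version and check_prerequisites are illustrative helper names, not part of JupyterLab or Jupyter AI.

```python
import re
import sys
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_prerequisites():
    """Report whether Python and JupyterLab meet the minimums named in this guide."""
    python_ok = sys.version_info >= (3, 9)
    try:
        lab_version = metadata.version("jupyterlab")
    except metadata.PackageNotFoundError:
        # JupyterLab is not installed in this environment
        lab_version = None
    lab_ok = lab_version is not None and parse_version(lab_version) >= (4,)
    return {
        "python_ok": python_ok,
        "jupyterlab_version": lab_version,
        "jupyterlab_ok": lab_ok,
    }


print(check_prerequisites())
```

If jupyterlab_version comes back as None, the interpreter running the check is probably not the one JupyterLab was installed into.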

Step 2: Install Jupyter AI

The base package is jupyter-ai. To use OpenAI models you also need langchain-openai. Install both together:

pip install jupyter-ai langchain-openai

For Anthropic Claude support instead:

pip install jupyter-ai langchain-anthropic

After installation, restart JupyterLab:

jupyter lab

You should see a new chat icon in the left sidebar.

Step 3: Configure Your API Key

Open the Jupyternaut chat panel by clicking the chat icon in the left sidebar. Click the gear icon in the top right of the chat panel to open settings.

Under Language Model, select your provider from the dropdown. For OpenAI, choose OpenAI :: gpt-4o. For Anthropic, choose Anthropic :: claude-sonnet-4-5.

Enter your API key in the field provided. The key is stored in your local JupyterLab configuration and is not sent anywhere other than the selected provider's API.

Alternatively, set the key as an environment variable before launching JupyterLab. This is the recommended approach for shared or team environments:

export OPENAI_API_KEY="your-key-here"
jupyter lab

Once configured, send a test message in the chat panel to confirm the connection is working.
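
If you used the environment variable approach, you can confirm the key is visible to the kernel without printing it. The sketch below assumes the variable name OPENAI_API_KEY from the command above (Anthropic's client conventionally reads ANTHROPIC_API_KEY); key_status is a hypothetical helper written for this guide.

```python
import os


def key_status(name="OPENAI_API_KEY", env=None):
    """Describe whether an API key variable is set, without revealing its value."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    if not value.strip():
        return f"{name} is not set"
    # Report only the length so the secret never lands in notebook output
    return f"{name} is set ({len(value)} characters)"


print(key_status())
```

Reporting only the length keeps the secret out of saved notebook output, which matters if the notebook is later shared or committed.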

Step 4: Use the %%ai Magic Command

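The %%ai magic command runs from a notebook cell and sends everything below the %%ai line as a prompt to the model. If the magic is not yet registered in your session, load it first by running %load_ext jupyter_ai_magics in a cell (the extension name used by current releases). The response appears as that cell's output.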

The basic syntax is:

%%ai openai-chat:gpt-4o
Your prompt here

If you have already configured a default model in settings, you can use the shorthand:

%%ai
Your prompt here

Three practical examples follow.

Example 1: Generate Exploratory Analysis Code

Suppose you have loaded a CSV into a pandas DataFrame called df. Ask the AI to write the analysis:

%%ai openai-chat:gpt-4o
I have a pandas DataFrame called df with columns: date, revenue, region, product_category.
Write code to:
1. Show a monthly revenue trend line plot
2. Show a bar chart of revenue by region
3. Print the top 5 product categories by total revenue
Use matplotlib and print a one-line summary of the most important finding.

The model returns Python code that you can review and then execute in the next cell. In a benchmark by Towards Data Science comparing AI-assisted notebook workflows, analysts using %%ai for exploratory analysis tasks reduced time-to-first-insight by approximately 40 percent compared to manually searching documentation or Stack Overflow.

Example 2: Summarize Results in Plain English

After running analysis, use the magic command to reformat findings for non-technical stakeholders:

%%ai openai-chat:gpt-4o -f markdown
Based on this output: revenue grew 23% in Q1 2026, driven by the West region (+41%) and the SaaS product category (+58%). The East region declined 12%.
Write a 3-sentence executive summary suitable for a business report.

The -f markdown flag renders the response as formatted markdown. Other options include code, html, json, and text.

Example 3: Debug a Failing Cell

When a cell throws an error, paste the traceback into a %%ai prompt without leaving the notebook:

%%ai openai-chat:gpt-4o
I got this error:
KeyError: 'revenue'
df.groupby('region')['revenue'].sum()

My DataFrame columns are: ['Revenue', 'Region', 'Date', 'Product'].
What is the fix?

The model identifies the case mismatch between 'revenue' and 'Revenue' and returns the corrected line. Because the prompt lives in the same notebook as the failing cell, the error, the explanation, and the fix stay together in one place.
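
For this particular error, the fix amounts to using the column's actual capitalization. As a rough sketch of guarding against the same class of mistake, the helper below looks up a column name case-insensitively. resolve_column is a hypothetical name introduced here, not a pandas or Jupyter AI function, and the column list is the one from the prompt above.

```python
def resolve_column(columns, wanted):
    """Return the actual column name matching `wanted`, ignoring case."""
    matches = [col for col in columns if col.lower() == wanted.lower()]
    if len(matches) != 1:
        # Zero matches or an ambiguous match should fail loudly
        raise KeyError(f"expected exactly one match for {wanted!r}, found {matches}")
    return matches[0]


columns = ["Revenue", "Region", "Date", "Product"]
print(resolve_column(columns, "revenue"))
```

With a real DataFrame you could pass df.columns as the first argument and use the returned names in the groupby call.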

Using the Sidebar Chat for Notebook-Wide Context

Jupyternaut can read your open notebook when you enable the "Include notebook context" option in chat settings. This lets you ask higher-level questions without quoting specific cells.

Once enabled, you can ask: "Summarize what this notebook does in three sentences" or "Is there a more efficient way to do the join in cell 7?" The model reads all cells and responds with full context.

This is useful for reviewing a notebook before sharing it, or for identifying performance bottlenecks in longer analysis scripts.

Storage and Cost

Jupyter AI does not store conversation history on any server. All messages go directly from your machine to the model provider's API and back.

For typical data analysis tasks, each %%ai prompt sends between 200 and 2,000 tokens depending on how much context you include. At GPT-4o pricing of approximately $2.50 per million input tokens (April 2026 pricing), a full day of active AI-assisted notebook work costs under $1 for most analysts.
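
The arithmetic behind that estimate can be sketched as follows. The default rate is the $2.50 per million input tokens quoted above; the figure of 100 prompts per day is an assumption chosen for illustration, and output tokens, which providers bill separately, are not included.

```python
def input_cost_usd(prompts, tokens_per_prompt, usd_per_million_tokens=2.50):
    """Estimate input-token cost in USD for a number of prompts."""
    total_tokens = prompts * tokens_per_prompt
    return total_tokens * usd_per_million_tokens / 1_000_000


# Assumed workload: 100 prompts at the upper bound of 2,000 input tokens each
print(f"${input_cost_usd(100, 2000):.2f}")
```

Substitute your provider's current rate and your own prompt volume to get a figure for your workload.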

Common Troubleshooting

Extension does not appear after install: Run jupyter lab build and restart. This is required on some older JupyterLab 4 installations.

API key not recognized: JupyterLab reads environment variables only at startup. Set the variable in your shell first, then launch (or relaunch) JupyterLab from that same shell session.

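%%ai command not found: The magic is only available if jupyter-ai is installed in the same Python environment as the running kernel and the extension has been loaded in the notebook (in current releases, by running %load_ext jupyter_ai_magics in a cell). If you use virtual environments or conda, install into the active environment before launching.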
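
To see which environment the kernel is actually using, a quick check from a notebook cell can help. This sketch assumes the magics are provided by a module named jupyter_ai_magics, which is the name used by current releases; module_available is an illustrative helper.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


# The interpreter path shows which environment the kernel runs in
print("Kernel interpreter:", sys.executable)
print("jupyter_ai_magics importable:", module_available("jupyter_ai_magics"))
```

If the interpreter path points to a different environment than the one you installed into, install jupyter-ai there or switch the notebook's kernel.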

Model returns an error: Confirm the provider package is installed. For OpenAI, langchain-openai must be present. For Anthropic, langchain-anthropic must be present. Reinstall and restart the kernel.

Summary

Jupyter AI adds AI chat and the %%ai magic command to JupyterLab in under five minutes. Configure your API key once, then generate code, format summaries, and debug errors without leaving the notebook. For analysts who already work in notebooks daily, it removes the friction of switching to a separate tool for every code generation or explanation task.

FAQ

What AI models does Jupyter AI support?

Jupyter AI supports OpenAI (GPT-4o, GPT-4o mini), Anthropic (Claude Sonnet, Claude Haiku), Amazon Bedrock, Google Gemini, Hugging Face inference endpoints, and self-hosted models through Ollama. Each provider requires its own Python package and API key. For OpenAI, install langchain-openai. For Anthropic, install langchain-anthropic. You configure the active model in the Jupyternaut settings panel inside JupyterLab.

Does Jupyter AI work in classic Jupyter Notebook or only JupyterLab?

The Jupyternaut sidebar chat interface requires JupyterLab 4.x. However, the %%ai magic command works in both JupyterLab and classic Jupyter Notebook as long as the jupyter-ai package is installed and recognized by the IPython kernel. If you use classic Notebook, you can still run %%ai prompts from cells but will not have the sidebar chat UI or notebook-context features.

How much does it cost to use Jupyter AI for data analysis?

Jupyter AI itself is free and open source under the BSD license. You pay only for the model API calls you make. With OpenAI GPT-4o at approximately $2.50 per million input tokens (April 2026 pricing), a full day of AI-assisted data analysis typically costs under $1 USD. Using GPT-4o mini reduces costs to less than $0.20 per day for typical usage. Self-hosted models via Ollama have no per-token cost.

Is my data sent to the cloud when I use Jupyter AI?

Yes, the content of each %%ai prompt and any notebook context you include is sent to the configured model provider's API. If you are working with sensitive data, avoid including raw personally identifiable information in your prompts. Use summarized or anonymized examples instead. For full data privacy, configure Jupyter AI to use a self-hosted model via Ollama, which keeps all processing local and makes no external API calls.
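
As one small illustration of anonymizing text before it goes into a prompt, the sketch below masks email addresses with a placeholder. It is a simplified pattern written for this guide, not a complete PII filter, and redact_emails is a hypothetical helper name.

```python
import re

# Simplified email pattern: local part, "@", then a dotted domain
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_emails(text, placeholder="[EMAIL]"):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


print(redact_emails("Contact jane.doe@example.com today"))
```

Names, phone numbers, and account identifiers need their own handling, so treat this as a starting point rather than a safeguard.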

What is the difference between Jupyternaut and the %%ai magic command?

Jupyternaut is the sidebar chat interface in JupyterLab. It supports multi-turn conversations, can read your entire open notebook for context, and lets you ask broad questions about your analysis. The %%ai magic command runs inline from a notebook cell and is designed for single-turn tasks like generating a code snippet, explaining an output, or fixing a specific error. Both interfaces use the same configured model and API key.
