How to Set Up ClickHouse Cloud for Analytics

Arkzero Research · Apr 26, 2026 · 7 min read

Last updated Apr 26, 2026

ClickHouse Cloud is a managed deployment of the open-source ClickHouse columnar database, built for fast analytical queries on large datasets. You can create a free account, upload a CSV file, and run sub-second SQL aggregations without managing any infrastructure. This guide covers account setup, table creation, data loading from CSV and remote sources, running aggregation queries, and connecting a BI tool to your ClickHouse service.

ClickHouse Cloud lets you run a fully managed ClickHouse instance in a browser, with no local installation required. You sign up, provision a service in about 30 seconds, and start writing SQL against your data. For teams that need fast aggregations on large event, log, or transaction datasets, it removes the infrastructure work that previously made ClickHouse accessible only to engineering teams.

What Makes ClickHouse Different From a Standard Database

Most relational databases store data row by row. When a query aggregates one column across millions of rows, a row store without a suitable index has to read entire rows, including every column the query never uses, to reach the values it needs. On large datasets, this makes analytical queries slow.

ClickHouse stores data column by column. A query summing revenue only reads the revenue column. A query counting events by country reads two columns. The database skips everything else on disk. This is why ClickHouse can often run a simple aggregation over a billion rows in under two seconds on a single node, where a row-oriented database such as Postgres or MySQL on the same hardware may take several minutes.

Columnar storage also compresses well. Open-source ClickHouse applies LZ4 compression by default (ClickHouse Cloud defaults to ZSTD), and 10x to 20x compression ratios are common on real-world datasets, though the exact ratio depends on the data. At those ratios, one terabyte of raw event data fits in roughly 50 to 100GB on disk, which reduces storage costs and speeds up disk reads further.
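
One way to check the ratio on your own data is to compare compressed and uncompressed bytes in the system.parts table. This is a minimal sketch that assumes the service already holds at least one populated MergeTree table. The result depends entirely on your data.

```sql
-- Compare on-disk (compressed) size with raw (uncompressed) size per table.
-- Only active parts are counted; parts already merged away are ignored.
SELECT
    database,
    table,
    formatReadableSize(sum(data_compressed_bytes))   AS compressed,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
FROM system.parts
WHERE active AND database != 'system'
GROUP BY database, table
ORDER BY sum(data_compressed_bytes) DESC;
```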

Companies at significant scale use ClickHouse specifically for this profile. Cloudflare processes trillions of DNS and HTTP events per day through ClickHouse. Spotify uses it for music streaming analytics. eBay uses it to power buyer and seller reporting dashboards.

Creating a ClickHouse Cloud Account

Go to clickhouse.cloud and click Start for Free. You will need an email address and a password. No credit card is required to start. The free tier includes 1TB of data transfer and 10GB of storage per month, which is enough to run real experiments before committing to a paid plan.

After signing in, ClickHouse prompts you to create a service. A service is a single managed ClickHouse cluster. Select a cloud provider and region closest to your users. AWS us-east-1 is a reasonable default for North American teams. For the tier, choose Development. This scales down automatically when idle and keeps costs near zero for low-traffic workloads under 100GB.

Click Create Service. Provisioning completes in about 30 seconds and you are taken to the service overview.

Navigating the SQL Console

Open SQL Console from the left sidebar. This is a browser-based query editor connected directly to your instance. No local client installation is needed.

The console has three panels: a database tree on the left listing databases and tables, a query editor in the center, and a results pane below. You can open multiple query tabs, and results export to CSV with one click.

Confirm the connection is working with this query:

SELECT version();

This returns the current ClickHouse server version. If it responds, your service is live.

Creating Your First Table

ClickHouse tables require an engine. For analytical workloads, use MergeTree. It handles high-volume inserts efficiently and is optimized for range scans and aggregations.

Here is a table schema for website event data:

CREATE DATABASE IF NOT EXISTS analytics;

CREATE TABLE analytics.events
(
    event_id     UUID DEFAULT generateUUIDv4(),
    user_id      String,
    event_name   String,
    page         String,
    occurred_at  DateTime,
    country      String,
    revenue      Float64
)
ENGINE = MergeTree()
ORDER BY (occurred_at, user_id);

The ORDER BY clause defines the sorting key, which also serves as the primary key unless you declare one separately. ClickHouse physically sorts data on disk by these columns and keeps a sparse index over them, so range queries on a timestamp become significantly faster when the timestamp is the first column in the key. For time-series analytics, this is standard practice.
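
To see the effect of the key on a range query, one option is to run the query and then ask ClickHouse how the primary key narrowed the scan. This is a sketch against the analytics.events table above. The date range is an arbitrary example.

```sql
-- A range filter on the leading key column lets ClickHouse skip
-- blocks of rows (granules) that fall outside the requested interval.
SELECT count()
FROM analytics.events
WHERE occurred_at >= toDateTime('2026-04-01 00:00:00')
  AND occurred_at <  toDateTime('2026-05-01 00:00:00');

-- EXPLAIN with indexes = 1 reports how many parts and granules
-- the primary key selected for the same filter.
EXPLAIN indexes = 1
SELECT count()
FROM analytics.events
WHERE occurred_at >= toDateTime('2026-04-01 00:00:00')
  AND occurred_at <  toDateTime('2026-05-01 00:00:00');
```

If the selected granule count is close to the total, the filter is probably not using the leading key column.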

Loading Data from a CSV File

ClickHouse Cloud supports CSV uploads directly from the SQL Console. Click the database icon in the left sidebar, then select Import Data. Drag and drop a CSV file up to 1GB. ClickHouse will auto-detect column types and generate a CREATE TABLE statement.

Review the inferred schema before confirming. ClickHouse sometimes interprets numeric identifiers as integers rather than strings, which causes problems when those IDs are concatenated or compared as text rather than arithmetic.

For larger files, the URL table function streams data directly from remote storage:

INSERT INTO analytics.events
SELECT * FROM url('https://your-bucket.s3.amazonaws.com/events.csv', CSVWithNames);

ClickHouse supports reading from S3, GCS, and Azure Blob Storage using the same pattern, with dedicated s3() and gcs() functions that accept credentials directly in the query.
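
The type-inference caveat from the CSV upload applies to remote files too. The sketch below previews the inferred schema, then loads the file with an explicit structure, first through url() and then through s3() for a private bucket. It assumes the CSV header contains exactly the six columns listed. The bucket URL and key pair are placeholders.

```sql
-- 1. Preview the column names and types ClickHouse infers from the file.
DESCRIBE url('https://your-bucket.s3.amazonaws.com/events.csv', 'CSVWithNames');

-- 2. Override inference with an explicit structure so identifiers stay strings.
--    event_id is omitted and filled by its DEFAULT expression.
INSERT INTO analytics.events (user_id, event_name, page, occurred_at, country, revenue)
SELECT user_id, event_name, page, occurred_at, country, revenue
FROM url(
    'https://your-bucket.s3.amazonaws.com/events.csv',
    'CSVWithNames',
    'user_id String, event_name String, page String, occurred_at DateTime, country String, revenue Float64'
);

-- 3. The same load from a private bucket, passing credentials to s3().
INSERT INTO analytics.events (user_id, event_name, page, occurred_at, country, revenue)
SELECT user_id, event_name, page, occurred_at, country, revenue
FROM s3(
    'https://your-bucket.s3.amazonaws.com/events.csv',
    'YOUR_ACCESS_KEY_ID',      -- placeholder
    'YOUR_SECRET_ACCESS_KEY',  -- placeholder
    'CSVWithNames',
    'user_id String, event_name String, page String, occurred_at DateTime, country String, revenue Float64'
);
```

Naming the target columns in the INSERT keeps the load working even if the file's column order changes.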

Running Analytical Queries

With data loaded, a standard aggregation looks like this:

SELECT
    event_name,
    count()            AS total_events,
    round(avg(revenue), 2) AS avg_revenue
FROM analytics.events
WHERE occurred_at >= now() - INTERVAL 30 DAY
GROUP BY event_name
ORDER BY total_events DESC
LIMIT 20;

On a table with 50 million rows, this query typically completes in under one second on a Development tier service.

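ClickHouse supports window functions for rolling calculations. This query computes a seven-day rolling average of daily revenue. The window counts rows rather than calendar days, so it spans exactly seven days only when every day has at least one event: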

SELECT
    toDate(occurred_at)  AS day,
    sum(revenue)         AS daily_revenue,
    avg(sum(revenue)) OVER (
        ORDER BY day
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS rolling_7d_avg
FROM analytics.events
GROUP BY day
ORDER BY day;

Both queries run interactively without pre-aggregation or caching configuration.

Connecting a BI Tool

ClickHouse Cloud exposes an HTTPS endpoint on port 8443 and a secure native TCP endpoint on port 9440. Most BI tools connect to ClickHouse through a JDBC or HTTP driver.

In Metabase, go to Admin, then Databases, then Add a Database. Select ClickHouse as the database type. Install the official ClickHouse Metabase driver from the Metabase admin panel if it is not already listed. Enter the host, port 8443 for HTTPS, database name, and the credentials shown in your ClickHouse Cloud service settings. Metabase will read the table schema and make your tables available for questions and dashboards within a few minutes.

Grafana, Apache Superset, and Tableau all follow the same pattern with their respective ClickHouse connectors. If you want to skip schema design and SQL entirely, VSLZ lets you upload a data file and ask questions in plain English, handling the analysis without any query configuration.

Controlling Costs with Materialized Views

On the free tier, usage allowances reset monthly. On paid plans, ClickHouse charges for compute used while the service runs queries and for compressed bytes stored. Idle Development services pause automatically after a configurable period, stopping compute charges between uses.
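
Because compute is billed on query work, it can help to see which queries read the most data. This is a rough sketch using the system.query_log table. It only reflects the replica that answers the query, and the one-day window and 120-character preview are arbitrary choices.

```sql
-- List the ten finished queries from the last day that read the most data.
SELECT
    query_duration_ms,
    read_rows,
    formatReadableSize(read_bytes) AS data_read,
    substring(query, 1, 120)       AS query_start
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time >= now() - INTERVAL 1 DAY
ORDER BY read_bytes DESC
LIMIT 10;
```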

Materialized views reduce repeated query costs significantly. A materialized view pre-aggregates data on insert, so dashboards read from a small summary table instead of scanning billions of raw rows:

CREATE MATERIALIZED VIEW analytics.daily_revenue_mv
ENGINE = SummingMergeTree()
ORDER BY day
AS SELECT
    toDate(occurred_at) AS day,
    sum(revenue)        AS total_revenue
FROM analytics.events
GROUP BY day;

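Once this view exists, a dashboard query reading daily totals scans a table with roughly one row per day rather than millions of raw events. The view only processes rows inserted after it is created, so existing data has to be backfilled separately. Compute cost drops to near zero for that query pattern.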
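
A dashboard query against the view might look like the sketch below. SummingMergeTree combines rows in background merges, so a day can briefly appear as several partial rows. Wrapping the column in sum() with GROUP BY returns the correct total either way. The 30-day window and the revenue_total alias are illustrative.

```sql
-- Daily totals for the last 30 days, read from the pre-aggregated view.
-- sum() collapses any partial rows that have not been merged yet.
-- The alias differs from the column name to avoid alias/column ambiguity.
SELECT
    day,
    sum(total_revenue) AS revenue_total
FROM analytics.daily_revenue_mv
WHERE day >= today() - 30
GROUP BY day
ORDER BY day;
```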

When ClickHouse Is Not the Right Tool

ClickHouse is purpose-built for analytical reads. It is not suited for transactional workloads that involve frequent row-level updates, high-frequency point lookups by primary key, or complex relational writes. Postgres or MySQL handle those cases better.

The right architecture for most teams keeps a transactional database for application writes and replicates data to ClickHouse for the analytical layer. ClickHouse can read directly from Postgres using the postgresql() table function, or a replication tool like Airbyte can sync tables on a schedule.
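
As a sketch of the postgresql() approach, the statement below copies the last day of rows from Postgres into analytics.events. The host, database, table, user, and password are all placeholders. It assumes the Postgres table has columns with these names and compatible types, and that the ClickHouse service can reach the Postgres host over the network.

```sql
-- Pull recent rows from a hypothetical Postgres table into ClickHouse.
INSERT INTO analytics.events (user_id, event_name, page, occurred_at, country, revenue)
SELECT user_id, event_name, page, occurred_at, country, revenue
FROM postgresql(
    'your-postgres-host:5432',  -- host:port (placeholder)
    'app_db',                   -- Postgres database (placeholder)
    'events',                   -- Postgres table (placeholder)
    'readonly_user',            -- Postgres user (placeholder)
    'YOUR_PASSWORD'             -- placeholder
)
WHERE occurred_at >= now() - INTERVAL 1 DAY;
```

Running the same statement twice inserts the rows twice, so a scheduled job needs its own way to track what has already been copied.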

Summary

ClickHouse Cloud is a practical starting point for teams with large analytical datasets who want sub-second query times without managing infrastructure. The free tier covers real workloads. The SQL Console removes the need for local tooling. The columnar engine makes aggregations fast enough for interactive dashboards at billion-row scale.

The setup path is: create an account, provision a Development service, define a MergeTree table, load data via CSV upload or URL function, and connect a BI tool. Most teams have a working dashboard within a day of starting.

FAQ

Is ClickHouse Cloud free to use?

Yes. ClickHouse Cloud offers a free tier that includes 1TB of data transfer and 10GB of storage per month. No credit card is required to sign up. Development tier services pause automatically when idle, which keeps compute costs near zero during periods of low activity. Paid plans charge on compute consumed during queries and on compressed storage bytes.

What is the difference between ClickHouse and PostgreSQL?

PostgreSQL is a row-oriented database optimized for transactional workloads: inserts, updates, point lookups, and complex relational queries on moderate data volumes. ClickHouse is a column-oriented database optimized for analytical workloads: aggregations, range scans, and GROUP BY queries on billions of rows. ClickHouse is significantly faster for analytics but is not designed for high-frequency row-level updates. Most production setups use both: Postgres for application writes and ClickHouse for the analytical reporting layer.

Can ClickHouse query CSV files directly?

Yes. ClickHouse Cloud supports CSV upload via the SQL Console for files up to 1GB. For larger files, the url(), s3(), and gcs() table functions stream data directly from remote storage into a query or INSERT operation without loading the full file into memory. With the CSVWithNames format, ClickHouse reads column names from the header row and infers column types from the data.

How fast is ClickHouse for analytical queries?

On a single ClickHouse Cloud Development node, aggregation queries on datasets of 50 to 100 million rows typically complete in under one second. ClickHouse achieves this through columnar storage, compression (10x to 20x ratios are common on real-world data), vectorized query execution, and multi-threaded processing. The public ClickBench results at benchmark.clickhouse.com compare ClickHouse with row-oriented databases on the same analytical queries over a dataset of roughly 100 million rows.

Can I connect Metabase or Grafana to ClickHouse Cloud?

Yes. ClickHouse Cloud exposes an HTTPS endpoint on port 8443 and supports connections from Metabase, Grafana, Apache Superset, Tableau, and most other BI tools via official ClickHouse drivers. In Metabase, install the ClickHouse driver from the admin panel, then add a new database using the service host, port 8443, and the credentials shown in your ClickHouse Cloud service settings. The process takes less than five minutes and makes all your tables available for dashboard building.
