# How-To Guides

# Creating a Database

ClickHouse is a high-performance columnar database designed for real-time analytical processing. It’s known for its blazing speed, horizontal scalability, and efficient use of disk I/O. Proper setup is essential for taking advantage of ClickHouse’s full capabilities, including fault tolerance, secure access, and high query performance. This guide walks through various ways to run and connect to ClickHouse: using the ClickHouse CLI (`clickhouse-client`), Docker containers, and command-line tools for scripting and automation. Best practices are highlighted throughout to ensure robust deployments.

### **Creating a Database Using clickhouse-client**

The ClickHouse command-line interface (`clickhouse-client`) is a built-in tool used to connect to and manage ClickHouse servers. It supports both local and remote connections and allows for SQL-based interaction with the database engine.

#### **Connect to ClickHouse:**

If you’re running ClickHouse locally (via package manager or Docker), you can start the CLI with:

```
clickhouse-client
```

For remote connections, specify the hostname, port (default 9000), and user credentials:

```
clickhouse-client -h <host> --port <port> -u <username> --password
```

Once connected, you can run SQL queries directly from the shell.

### **Running ClickHouse Using Docker**

Docker provides a fast, reproducible way to run ClickHouse in isolated environments. This is ideal for local development or self-contained production setups.

##### **Access Elestio Terminal**

If you’re using Elestio for ClickHouse hosting, log into the Elestio dashboard. Go to your ClickHouse service, then navigate to **Tools > Terminal** to open a pre-authenticated shell session.

[![image.png](https://docs.elest.io/uploads/images/gallery/2025-06/scaled-1680-/Cntimage.png)](https://docs.elest.io/uploads/images/gallery/2025-06/Cntimage.png)

Now change the directory:

```
cd /opt/app/
```

##### **Access the ClickHouse Container Shell**

Elestio-managed services run on Docker Compose. Use this to enter the ClickHouse container:

```
docker-compose exec clickhouse bash
```

##### **Access ClickHouse CLI from Inside the Container**

Once inside the container, the `clickhouse-client` tool is available. Run it like this (add `--password` if needed):

```
clickhouse-client -u <user> --port <port> --password
```

You are now connected to the running ClickHouse instance inside the container.

##### **Test Connectivity**

Try creating a database and querying data to verify functionality:

```sql
CREATE DATABASE test_db;
CREATE TABLE test_db.test_table (id UInt32, message String) ENGINE = MergeTree() ORDER BY id;
INSERT INTO test_db.test_table VALUES (1, 'Hello ClickHouse');
SELECT * FROM test_db.test_table;
```

**Expected Output:**

```
1	Hello ClickHouse
```

This confirms read/write operations and query functionality.

### **Connecting Using clickhouse-client in Scripts**

`clickhouse-client` can be used non-interactively for scripting, automation, and cron-based jobs.

For example, to insert data from a shell script:

```
echo "INSERT INTO test_db.test_table VALUES (2, 'Automated')" | \
clickhouse-client -h <host> -u <user> --password
```

This is useful for automated ETL jobs, health checks, or backup pipelines.

## **Best Practices for Setting Up ClickHouse**

##### **Use Clear Naming for Databases and Tables**

Adopt consistent naming conventions for databases, tables, and columns. Use lowercase, underscore-separated names like:

```
user_events_2024
product_sales_agg
```

This improves clarity in multi-schema environments and helps with automation and maintenance scripts.

##### **Choose the Right Engine and Indexing Strategy**

ClickHouse supports various table engines like `MergeTree`, `ReplacingMergeTree`, and `SummingMergeTree`. Pick the engine that best matches your use case and define `ORDER BY` keys carefully to optimize performance.

Example:

```sql
CREATE TABLE logs (
  timestamp DateTime,
  service String,
  message String
) ENGINE = MergeTree()
ORDER BY (timestamp, service);
```

Inappropriate engine selection can lead to poor query performance or high disk usage.
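
For example, `ReplacingMergeTree` suits cases where rows are re-ingested and only the latest version should survive merges. A minimal sketch (the table and column names are illustrative):

```sql
-- Hypothetical deduplicating table: rows sharing the same ORDER BY key
-- are collapsed during merges to the row with the highest `version`.
CREATE TABLE user_profiles (
  user_id UInt64,
  email String,
  version UInt64
) ENGINE = ReplacingMergeTree(version)
ORDER BY user_id;
```

Note that deduplication happens at merge time, not on insert, so queries that need exact results should use `FINAL` or aggregate explicitly.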

##### **Enable Authentication and Secure Access**

Always configure user-level authentication and restrict access in production. Add users and passwords in `users.xml` or via SQL:

```sql
CREATE USER secure_user IDENTIFIED WITH sha256_password BY 'strong_password';
GRANT ALL ON *.* TO secure_user;
```
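
Granting `ALL ON *.*` is convenient for bootstrapping but broader than most applications need. A narrower, read-only grant might look like this (the `analytics` database name is an illustrative assumption):

```sql
-- Hypothetical least-privilege user: read-only access to one database
CREATE USER app_reader IDENTIFIED WITH sha256_password BY 'strong_password';
GRANT SELECT ON analytics.* TO app_reader;
```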

Use TLS for encrypted connections by enabling SSL in the `config.xml` file:

```xml
<tcp_port_secure>9440</tcp_port_secure>
<openSSL>
  <server>
    <certificateFile>/etc/clickhouse-server/certs/server.crt</certificateFile>
    <privateKeyFile>/etc/clickhouse-server/certs/server.key</privateKeyFile>
  </server>
</openSSL>
```

##### **Configure Data Persistence and Storage Paths**

ClickHouse stores data on disk by default, but ensure proper mounting, storage separation, and backup routines.

In `config.xml`:

```xml
<path>/var/lib/clickhouse/</path>
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
<user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
```

Use RAID, SSDs, or networked volumes depending on your availability and performance needs.
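
When storage is split across devices, ClickHouse can tier data between them through a storage policy. A sketch of `storage_configuration` in `config.xml` (disk names and paths are illustrative assumptions):

```xml
<storage_configuration>
    <disks>
        <fast_ssd>
            <path>/mnt/fast_ssd/clickhouse/</path>
        </fast_ssd>
        <cold_hdd>
            <path>/mnt/cold_hdd/clickhouse/</path>
        </cold_hdd>
    </disks>
    <policies>
        <hot_cold>
            <volumes>
                <hot>
                    <disk>fast_ssd</disk>
                </hot>
                <cold>
                    <disk>cold_hdd</disk>
                </cold>
            </volumes>
        </hot_cold>
    </policies>
</storage_configuration>
```

A table can then opt in with `SETTINGS storage_policy = 'hot_cold'` in its `CREATE TABLE` statement.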

##### **Monitor and Tune Performance**

Use built-in introspection tools like:

```sql
SELECT * FROM system.metrics;
SELECT * FROM system.query_log ORDER BY event_time DESC LIMIT 10;
SELECT * FROM system.parts;
```

For real-time observability, integrate with Grafana, Prometheus, or use [ClickHouse Keeper metrics](https://clickhouse.com/docs/en/operations/monitoring/metrics/).

Also review:

- `system.mutations` for long-running mutation jobs
- `system.errors` for crash/debug info
- `system.replication_queue` for sync issues in replicated tables
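
A quick check of those tables might look like this (the limit is illustrative):

```sql
-- Mutations still in flight
SELECT database, table, command, create_time
FROM system.mutations
WHERE NOT is_done;

-- Most frequent server-side errors since startup
SELECT name, value, last_error_message
FROM system.errors
ORDER BY value DESC
LIMIT 5;
```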

## **Common Issues and Their Solutions**

| **Issue** | **Cause** | **Solution** |
| --- | --- | --- |
| Authentication failure | Wrong password or no user set | Double-check credentials; use the `--password` flag |
| Cannot connect to localhost | Service not running or incorrect port | Ensure ClickHouse is running and check the port |
| SSL/TLS handshake failed | Incorrect certificate paths or permissions | Verify file locations in `config.xml` and restart the service |
| Queries are slow | Poor `ORDER BY` design or unoptimized table engine | Reevaluate schema design and use indexes effectively |
| Data lost after restart | Misconfigured data path or ephemeral container | Ensure proper disk volume mounts and storage persistence |

# Upgrading to Major Version

Upgrading a database service on Elestio can be done without creating a new instance or performing a full manual migration. Elestio provides a built-in option to change the database version directly from the dashboard. This is useful for cases where the upgrade does not involve breaking changes or when minimal manual involvement is preferred. The version upgrade process is handled by Elestio internally, including restarting the database service if required. This method reduces the number of steps involved and provides a way to keep services up to date with minimal configuration changes.

### **Log In and Locate Your Service**

To begin the upgrade process, log in to your Elestio dashboard and navigate to the specific database service you want to upgrade. It is important to verify that the correct instance is selected, especially in environments where multiple databases are used for different purposes such as staging, testing, or production. The dashboard interface provides detailed information for each service, including version details, usage metrics, and current configuration. Ensure that you have access rights to perform upgrades on the selected service. Identifying the right instance helps avoid accidental changes to unrelated environments.

### **Back Up Your Data**

Before starting the upgrade, create a backup of your database. A backup stores the current state of your data, schema, indexes, and configuration, which can be restored if something goes wrong during the upgrade. In Elestio, this can be done through the **Backups** tab by selecting **Back up now** under Manual local backups and **Download** the backup file. Scheduled backups may also be used, but it is recommended to create a manual one just before the upgrade. Keeping a recent backup allows quick recovery in case of errors or rollback needs. This is especially important in production environments where data consistency is critical.

[![Screenshot 2025-06-11 at 11.10.38 AM.jpg](https://docs.elest.io/uploads/images/gallery/2025-06/scaled-1680-/screenshot-2025-06-11-at-11-10-38-am.jpg)](https://docs.elest.io/uploads/images/gallery/2025-06/screenshot-2025-06-11-at-11-10-38-am.jpg)

### **Select the New Version**

Once your backup is secure, proceed to the **Overview** and then **Software > Update config** tab within your database service page.

[![Screenshot 2025-06-11 at 11.12.02 AM.jpg](https://docs.elest.io/uploads/images/gallery/2025-06/scaled-1680-/screenshot-2025-06-11-at-11-12-02-am.jpg)](https://docs.elest.io/uploads/images/gallery/2025-06/screenshot-2025-06-11-at-11-12-02-am.jpg)

Here, you'll find an option labeled **ENV**. In the **ENV** menu, set the `SOFTWARE_VERSION` variable to the desired database version. After confirming the version, Elestio will begin the upgrade process automatically. During this time, the platform takes care of the version change and restarts the database if needed. No manual commands are required, and the system handles most of the operational aspects in the background.

[![Screenshot 2025-06-11 at 11.12.50 AM.jpg](https://docs.elest.io/uploads/images/gallery/2025-06/scaled-1680-/screenshot-2025-06-11-at-11-12-50-am.jpg)](https://docs.elest.io/uploads/images/gallery/2025-06/screenshot-2025-06-11-at-11-12-50-am.jpg)

### **Monitor the Upgrade Process**

The upgrade process may include a short downtime while the database restarts. Once it is completed, it is important to verify that the upgrade was successful and the service is operating as expected. Start by checking the logs available in the Elestio dashboard for any warnings or errors during the process. Then, review performance metrics to ensure the database is running normally and responding to queries. Finally, test the connection from your client applications to confirm that they can interact with the upgraded database without issues.

# Installing and Updating an Extension

ClickHouse supports **custom extensions** via user-defined functions (UDFs), external dictionaries, and shared libraries that extend its core capabilities with custom logic, formats, or integrations. These behave similarly to modules or plugins in other systems and must be configured at server startup. Common examples include integration with geospatial libraries, custom UDFs, or external dictionary sources like MySQL or HTTP.

In Elestio-hosted ClickHouse instances or any Docker Compose-based setup, extensions can be added by mounting external libraries or configuration files and referencing them in `config.xml` or `users.xml`. This guide walks through how to install, load, and manage ClickHouse extensions using Docker Compose along with best practices and common troubleshooting steps.

## **Installing and Enabling ClickHouse Extensions**

ClickHouse extensions are typically compiled as shared object (`.so`) files or defined as configuration files for dictionaries or formats. These files must be mounted into the container and referenced explicitly in the server’s configuration files.

#### **Example: Load Custom Shared Library UDF**

Suppose you have a compiled UDF called `libexample_udf.so`. To include it in a Docker Compose setup:

##### **Update docker-compose.yml**

Mount the shared library into the container:

```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    volumes:
      - ./modules/libexample_udf.so:/usr/lib/clickhouse/user_defined/libexample_udf.so
      - ./configs/config.xml:/etc/clickhouse-server/config.xml
    ports:
      - "8123:8123"
      - "9000:9000"
```

- `./modules/libexample_udf.so`: local path to the shared library on the host.
- `/usr/lib/clickhouse/user_defined/`: default directory for user libraries inside the container.

Make sure the file exists before running Docker Compose.

##### **Configure config.xml to Load the UDF**

In your custom `config.xml`:

```xml
<user_defined>
    <function>
        <name>example_udf</name>
        <type>udf</type>
        <library>libexample_udf.so</library>
    </function>
</user_defined>
```

> The library path must match the volume mount location.

##### **Restart the ClickHouse Service**

After updating the Compose and configuration files, restart the service:

```
docker-compose down
docker-compose up -d
```

This will reload ClickHouse with the specified UDF.

##### **Verify the Extension is Loaded**

Connect using the ClickHouse CLI or HTTP interface and run:

```sql
SELECT example_udf('test input');
```

If successful, the function will return expected results from the loaded library. You can also confirm the server loaded your shared library by inspecting logs:

```
docker-compose logs clickhouse
```

Look for lines that indicate the library was found and loaded.

## **Managing External Dictionaries**

ClickHouse supports loading external data sources (like MySQL, HTTP APIs, or files) as dictionaries.

##### **Mount Dictionary Configuration**

In `docker-compose.yml`:

```yaml
volumes:
  - ./configs/dictionaries/:/etc/clickhouse-server/dictionaries/
```

##### **Reference in config.xml**

```xml
<dictionaries_config>/etc/clickhouse-server/dictionaries/*.xml</dictionaries_config>
```

Example dictionary file (`mysql_dictionary.xml`):

```xml
<dictionary>
  <name>mysql_dict</name>
  <source>
    <mysql>
      <host>mysql-host</host>
      <user>root</user>
      <password>password</password>
      <db>test</db>
      <table>cities</table>
    </mysql>
  </source>
  <layout><flat /></layout>
  <structure>
    <id>id</id>
    <attribute>
      <name>name</name>
      <type>String</type>
    </attribute>
  </structure>
</dictionary>
```

Use the dictionary in queries:

```sql
SELECT dictGetString('mysql_dict', 'name', toUInt64(42));
```
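
To confirm the dictionary was picked up and loaded successfully, you can inspect the server’s dictionary registry:

```sql
SHOW DICTIONARIES;

-- status should be LOADED; last_exception is non-empty on failure
SELECT name, status, last_exception
FROM system.dictionaries;
```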

## **Updating or Removing Extensions**

ClickHouse doesn’t support unloading UDFs or dictionaries at runtime. To modify or remove an extension:

**1. Stop the container:**

```
docker-compose down
```

**2. Edit config files:**

- Replace or remove the `<function>` entry in `config.xml` or the dictionary config.
- Replace or remove the `.so` file if applicable.

**3. Restart the container:**

```
docker-compose up -d
```

> Always test changes in staging before deploying to production.

## **Troubleshooting Common Extension Issues**

| **Issue** | **Cause** | **Resolution** |
| --- | --- | --- |
| ClickHouse fails to start | Invalid config or missing `.so` file | Run `docker-compose logs clickhouse` and fix missing files or XML syntax |
| UDF not recognized | Wrong library path or missing permissions | Ensure the volume mount is correct and the file is executable inside the container |
| Dictionary not available | Config file not found or misconfigured XML | Double-check `dictionaries_config` and validate with `SHOW DICTIONARIES` |
| Segmentation fault | Invalid shared library or ABI mismatch | Recompile the UDF for the correct platform and verify against the installed ClickHouse version |
| Query fails silently | Dictionary or UDF not fully loaded | Recheck server logs for errors during startup |

## **Security Considerations**

ClickHouse extensions, especially shared libraries, run with the same privileges as the ClickHouse server. Be cautious:

- Only load trusted `.so` files from verified sources.
- Ensure the `clickhouse` user has restricted permissions inside the container.
- Never expose dictionary or UDF paths to writable directories from external systems.

Avoid using custom UDFs or dictionaries from unknown sources in production environments without a thorough code review.

# Creating Manual Backups

Regular backups are essential when running a ClickHouse deployment, especially if you’re using it for persistent analytics or time-series data. While Elestio handles automated backups by default, you may want to create manual backups before configuration changes, retain a local archive, or test backup automation. This guide walks through multiple methods for creating ClickHouse backups on Elestio, including dashboard snapshots, command-line approaches, and Docker Compose-based setups. It also explains backup storage, retention, and automation using scheduled jobs.

## **Manual Service Backups on Elestio**

If you’re using Elestio’s managed ClickHouse service, the simplest way to perform a full backup is directly through the Elestio dashboard. This creates a snapshot of your current ClickHouse dataset and stores it in Elestio’s infrastructure. These snapshots can be restored later from the same interface, which is helpful when making critical changes or testing recovery workflows.

To trigger a manual ClickHouse backup on Elestio:

- Log in to the Elestio dashboard.
- Navigate to your ClickHouse service or cluster.
- Click the **Backups** tab in the service menu.
- Choose **Back up now** to generate a manual snapshot.

[![Screenshot 2025-06-11 at 11.10.38 AM.jpg](https://docs.elest.io/uploads/images/gallery/2025-06/scaled-1680-/xU6screenshot-2025-06-11-at-11-10-38-am.jpg)](https://docs.elest.io/uploads/images/gallery/2025-06/xU6screenshot-2025-06-11-at-11-10-38-am.jpg)

This method is recommended for quick, reliable backups without needing to use the command line.

## **Manual Backups Using Docker Compose**

If your ClickHouse instance is deployed via Docker Compose (as is common on Elestio-hosted environments), you can manually back up ClickHouse by either copying its internal storage files or using the native `BACKUP` SQL command (available in recent ClickHouse releases). These approaches allow you to maintain control over backup logic and frequency.

#### **Access Elestio Terminal**

Go to your deployed ClickHouse service in the Elestio dashboard, navigate to **Tools > Terminal**, and log in using the credentials provided.

#### **Locate the ClickHouse Container Directory**

Navigate to your app directory:

```
cd /opt/app/
```

This is the working directory of your Docker Compose project, which contains the `docker-compose.yml` file.

#### **Trigger a Backup (Using SQL)**

If you’re using ClickHouse with backup support enabled, you can execute:

```
docker-compose exec clickhouse clickhouse-client --query="BACKUP DATABASE default TO Disk('/backups/backup_$(date +%F)')"
```

This creates a full backup of the `default` database inside the container at `/backups`.
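
Note that the `Disk(...)` target has to be a backup destination the server is allowed to write to. In recent ClickHouse releases this is declared in the server configuration; a hedged sketch (the `backups` disk name and paths are assumptions):

```xml
<clickhouse>
    <storage_configuration>
        <disks>
            <backups>
                <type>local</type>
                <path>/backups/</path>
            </backups>
        </disks>
    </storage_configuration>
    <backups>
        <allowed_disk>backups</allowed_disk>
        <allowed_path>/backups/</allowed_path>
    </backups>
</clickhouse>
```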

#### **Copy Backup Files from the Container**

Use `docker cp` to move the backup directory to your host system:

```
docker cp $(docker-compose ps -q clickhouse):/backups/backup_$(date +%F) ./clickhouse_backup_$(date +%F)
```

This gives you a restorable backup snapshot for storage or future recovery.

## **Backup Storage & Retention Best Practices**

After creating backups, it’s important to store them securely and manage retention properly. ClickHouse backups can grow large depending on the volume and compression of your data.

#### **Guidelines to Follow:**

- Use clear naming: `clickhouse_backup_2025_06_09`
- Store off-site or on cloud storage (e.g. AWS S3, Backblaze, encrypted storage)
- Retain: 7 daily backups, 4 weekly backups, and 3–6 monthly backups
- Automate old file cleanup with cron jobs or retention scripts
- Optionally compress backups with `tar`, `gzip`, or `xz` to reduce space
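
The cleanup guideline above can be sketched as a small script (a minimal sketch; `BACKUP_DIR` and the 7-day cutoff are assumptions to adapt):

```shell
#!/bin/sh
# Hypothetical retention sketch: delete compressed backups older than 7 days.
# BACKUP_DIR is an assumed location; point it at your backup mount.
BACKUP_DIR="${BACKUP_DIR:-/backups}"
if [ -d "$BACKUP_DIR" ]; then
  # -mtime +7 matches files last modified more than 7 days ago
  find "$BACKUP_DIR" -name 'clickhouse_backup_*.tar.gz' -mtime +7 -print -delete
fi
```

Run it from the same cron schedule as the backup job so retention stays in lockstep with creation.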

## **Automating ClickHouse Backups (cron)**

Manual backup commands can be scheduled using tools like `cron` on Linux-based systems. This allows you to regularly back up your database without needing to run commands manually. Automating the process also reduces the risk of forgetting backups and ensures more consistent retention.

#### **Example: Daily Backup at 3 AM**

Edit the crontab:

```
crontab -e
```

Add a job like:

```
0 3 * * * docker-compose -f /opt/app/docker-compose.yml exec -T clickhouse \
clickhouse-client --query="BACKUP DATABASE default TO Disk('/backups/backup_$(date +\%F)')" && \
docker cp $(docker-compose -f /opt/app/docker-compose.yml ps -q clickhouse):/backups/backup_$(date +\%F) /backups/clickhouse_backup_$(date +\%F)
```

Make sure `/backups/` exists and is writable by the cron user.

You can also compress the file or upload to cloud storage in the same script:

```
tar -czf /backups/clickhouse_backup_$(date +\%F).tar.gz /backups/clickhouse_backup_$(date +\%F)
rclone copy /backups/clickhouse_backup_$(date +\%F).tar.gz remote:clickhouse-backups
```
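
Before shipping an archive off-site, it is worth confirming it is actually readable; a minimal sketch (the `ARCHIVE` path is an assumption):

```shell
#!/bin/sh
# Sketch: verify a backup archive is intact before uploading it.
# ARCHIVE is an assumed path; point it at your newest tarball.
ARCHIVE="${ARCHIVE:-/backups/clickhouse_backup_latest.tar.gz}"
if [ -f "$ARCHIVE" ] && tar -tzf "$ARCHIVE" >/dev/null 2>&1; then
  echo "archive ok: $ARCHIVE"
else
  echo "archive missing or unreadable: $ARCHIVE" >&2
fi
```

A corrupt archive caught here costs seconds; one discovered during a restore costs the backup.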

## **Backup Format and Restore Notes**

| **Format** | **Description** | **Restore Method** |
| --- | --- | --- |
| `/backups/backup_<date>` | SQL-based backup using the `BACKUP` command | Use `RESTORE DATABASE` from the same Disk location |
| `.tar.gz` or `.tar` archive | Filesystem snapshot of `/var/lib/clickhouse` | Stop ClickHouse, extract the data back into the directory, then restart |

#### **To restore from a backup:**

- **Stop ClickHouse:**

```
docker-compose down
```

- **Restore via SQL:**

```
docker-compose exec clickhouse clickhouse-client --query="RESTORE DATABASE default FROM Disk('/backups/backup_2025-06-09')"
```

- **Or restore from a file-based archive:**

```
tar -xzf clickhouse_backup_2025-06-09.tar.gz -C /opt/app/data/clickhouse/
docker-compose up -d
```

# Restoring a Backup

Restoring ClickHouse backups is essential for disaster recovery, staging environment duplication, or rolling back to a known state. Elestio supports backup restoration both through its web dashboard and manually through Docker Compose and command-line methods. This guide explains how to restore ClickHouse backups from SQL-based snapshots or file-based archives, covering both full and partial restore scenarios, and includes solutions for common restoration issues.

## **Restoring from a Backup via Terminal**

This method applies when you have a backup created using ClickHouse’s native `BACKUP` command or a direct copy of the data directory. To restore the backup, you must stop the running ClickHouse container, replace the data files, and restart the container to load the restored dataset.

#### **Stop the ClickHouse Container**

Shut down the ClickHouse container cleanly to avoid issues with open file handles or inconsistent state:

```
docker-compose down
```

#### **Replace the Backup Files**

If your backup was created using the native ClickHouse `BACKUP` command and saved to `/backups/backup_2025_06_09`, copy it into the appropriate path within the container or bind mount.

Example:

```
cp -r ./clickhouse_backup_2025_06_09 /opt/app/backups/backup_2025_06_09
```

Make sure this path corresponds to the volumes specified in your `docker-compose.yml`. For example:

```yaml
volumes:
  - ./backups:/backups
  - ./data:/var/lib/clickhouse
```

If you’re restoring from a tarball archive, extract it into the correct volume mount:

```
tar -xzf clickhouse_backup_2025_06_09.tar.gz -C /opt/app/data/
```

#### **Restart ClickHouse**

Start the ClickHouse container again:

```
docker-compose up -d
```

ClickHouse will load the data either from the standard data directory or, if using the backup snapshot method, you can explicitly restore the database using SQL (next section).

## **Restoring via Docker Compose Terminal**

If you’re using backups made with the SQL `BACKUP` command, ClickHouse also provides a built-in method to restore via the `RESTORE` command.

#### **Copy the Backup Directory into the Container**

```
docker cp ./clickhouse_backup_2025_06_09 $(docker-compose ps -q clickhouse):/backups/backup_2025_06_09
```

#### **Restore with ClickHouse SQL**

Enter the container terminal:

```
docker-compose exec clickhouse bash
```

Then run the restore command:

```
clickhouse-client --query="RESTORE DATABASE default FROM Disk('/backups/backup_2025_06_09')"
```

This will restore the `default` database and its contents from the previously created backup directory.

## **Partial Restores in ClickHouse**

ClickHouse supports more granular restore operations using SQL syntax. You can restore individual tables or databases if the backup was created using the native `BACKUP` command.

#### **Restore a Single Table**

```
clickhouse-client --query="RESTORE TABLE default.events FROM Disk('/backups/backup_2025_06_09')"
```

This restores just the `events` table from the `default` database without affecting other tables.

#### **Restore Specific Schemas or Data**

You can also export and import CSV or TSV snapshots for partial data management:

```
clickhouse-client --query="SELECT * FROM default.events FORMAT CSV" > events.csv
clickhouse-client --query="INSERT INTO default.events FORMAT CSV" < events.csv
```

## **Common Errors & How to Fix Them**

Restoring ClickHouse data can occasionally fail due to permission issues, path mismatches, unsupported formats, or version conflicts. Here are some frequent issues and their solutions.

#### **1. ClickHouse Fails to Start After Restore**

**Error:**

```
DB::Exception: Corrupted data part ...
```

**Cause:** The backup directory is incomplete or corrupted, or the file was not extracted properly.

**Resolution:**

- Re-verify that the backup files were copied completely.
- Use `tar -tzf` to inspect archive contents before extracting.
- Make sure you’re restoring on the same ClickHouse version that created the backup.

#### **2. RESTORE Command Fails with Permission Denied**

**Error:**

```
DB::Exception: Cannot read from backup: Permission denied
```

**Cause:** The container cannot access the `/backups/` directory due to permissions.

**Resolution:**

- Ensure the backup directory is readable by the ClickHouse process.
- Use `chmod -R 755 /opt/app/backups/` to adjust permissions if needed.

#### **3. Data Not Restored**

**Cause:** The `RESTORE` command did not include the correct database/table name, or no data existed in the backup path.

**Resolution:**

- Use `clickhouse-client --query="SHOW DATABASES"` to confirm whether the restore actually happened.
- Run `ls /backups/backup_2025_06_09/` inside the container to verify the backup contents.

#### **4. Permission Denied When Copying Files**

**Error:**

```
cp: cannot create regular file ‘/opt/app/data/’: Permission denied
```

**Resolution:**

Ensure your terminal session or script has write access to the target directory. Use `sudo` if needed:

```
sudo cp -r ./clickhouse_backup_2025_06_09 /opt/app/data/
```

# Identifying Slow Queries

Slow queries can degrade ClickHouse performance, especially under high load or with inefficient schema design. Whether you’re using ClickHouse on Elestio via the dashboard, accessing it inside a Docker Compose container, or running CLI queries, ClickHouse offers built-in tools to detect, diagnose, and optimize performance bottlenecks. This guide explains how to capture slow queries using system tables, measure query latency, and improve performance through tuning and query optimization.

### **Inspecting Slow Queries from the Terminal**

ClickHouse logs query profiling information by default, which you can access via system tables. This allows you to identify long-running or resource-intensive queries directly from SQL.

#### **Connect to ClickHouse via Terminal**

Use the ClickHouse client to connect to your instance:

```
clickhouse-client -h <host> --port <port> --user <username> --password <password>
```

Replace <span class="s1">&lt;host&gt;</span>, <span class="s1">&lt;port&gt;</span>, <span class="s1">&lt;username&gt;</span>, and <span class="s1">&lt;password&gt;</span> with your credentials from the Elestio dashboard.

[![image.png](https://docs.elest.io/uploads/images/gallery/2025-06/scaled-1680-/V4Ximage.png)](https://docs.elest.io/uploads/images/gallery/2025-06/V4Ximage.png)

#### **View Recent Slow Queries**

ClickHouse logs query performance stats in the <span class="s1">system.query\_log</span> table. To view the 10 most recent queries that took longer than 1 second:

```sql
SELECT
  query_start_time,
  query_duration_ms,
  query
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_duration_ms > 1000
ORDER BY query_start_time DESC
LIMIT 10;
```

You can adjust the <span class="s1">query\_duration\_ms</span> threshold to capture slower or more critical queries.

### **Analyzing Inside Docker Compose**

If your ClickHouse instance is running inside Docker Compose, you can inspect query logs and system performance from inside the container.

#### **Access the ClickHouse Container**

Open a shell session inside the running container:

```
docker-compose exec clickhouse bash
```

Then run the ClickHouse client:

```
clickhouse-client --user root
```

If a password is required, append <span class="s1">--password &lt;yourpassword&gt;</span> to the command.

#### **Query the system.query\_log Inside the Container**

Run the same slow query inspection SQL as above to analyze performance issues:

```sql
SELECT query_start_time, query_duration_ms, query
FROM system.query_log
WHERE type = 'QueryFinish' AND query_duration_ms > 1000
ORDER BY query_start_time DESC
LIMIT 10;
```

### **Using the System Metrics &amp; Events Tables**

ClickHouse includes system tables that expose performance-related metrics in real time.

#### **Check Overall Query Performance**

You can use the <span class="s2">system.metrics</span> table to view metrics like query execution time, memory usage, and background operations:

```sql
SELECT *
FROM system.metrics
WHERE value != 0
ORDER BY value DESC;
```

For cumulative statistics like total queries processed, check the <span class="s1">system.events</span> table:

```sql
SELECT *
FROM system.events
WHERE value > 0
ORDER BY value DESC;
```

### **Understanding and Resolving Common Bottlenecks**

Slow performance in ClickHouse is often caused by suboptimal queries, missing primary key filtering, disk I/O contention, or high memory usage.

#### **Common Causes of Slow Queries:**

- <span class="s1">**Large table scans:**</span> Caused by missing filtering conditions or lack of primary key usage.
- <span class="s1">**JOINs on unindexed keys:**</span> Inefficient joins can result in full-table scans.
- <span class="s1">**High cardinality aggregations:**</span> Especially costly without optimization (e.g., using <span class="s2">uniqExact()</span>).
- <span class="s1">**High insert latency:**</span> Triggered by too frequent small batch writes.
- <span class="s1">**Disk bottlenecks:**</span> Heavy merges or large result sets can overload I/O.

#### **Best Practices to Avoid Slow Queries:**

- <span class="s1">**Use appropriate filtering:**</span> Always filter with indexed columns (usually primary keys).
- <span class="s1">**Avoid SELECT \*:**</span> Specify only the needed columns.
- <span class="s1">**Use sampling when possible:**</span> ClickHouse supports the SAMPLE clause on MergeTree tables that define a sampling key.
- <span class="s1">**Use LIMIT:**</span> Avoid returning large result sets when debugging.
- <span class="s1">**Optimize JOINs:**</span> Keep the smaller table on the right-hand side of the join, and use <span class="s2">ANY INNER JOIN</span> when a single match per key is enough.
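
To verify that a filter actually uses the primary key, ClickHouse’s `EXPLAIN` can show which indexes a query touches. A quick sketch (table and column names are illustrative):

```sql
-- Shows which index (e.g., the primary key) prunes granules for this query
EXPLAIN indexes = 1
SELECT count()
FROM logs
WHERE date >= '2024-01-01';
```

If the output shows no index being used, the filter column is likely not part of the table’s ORDER BY key.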

### **Optimizing with Configuration Changes**

ClickHouse performance can be tuned via its configuration files (<span class="s2">config.xml</span> and <span class="s2">users.xml</span>) or environment variables. For Docker Compose setups, these can be overridden via <span class="s2">docker-compose.override.yml</span>.

#### **Adjust Query and Memory Settings Dynamically**

Some performance-related settings can be changed per session or globally:

```sql
SET max_memory_usage = 2000000000;
SET max_threads = 4;
SET log_queries = 1;
```

To make permanent changes, modify your <span class="s1">config.xml</span> or <span class="s1">users.xml</span> inside the container volume mount.
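
For example, the session settings above could be made permanent for a user profile in `users.xml` with a fragment like this (a sketch; adapt the profile name and values to your setup):

```
<profiles>
    <default>
        <max_memory_usage>2000000000</max_memory_usage>
        <max_threads>4</max_threads>
        <log_queries>1</log_queries>
    </default>
</profiles>
```

Restart the server (or container) after editing the file so the changes take effect.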

# Detect and terminate long-running queries

ClickHouse is a high-performance, column-oriented OLAP database, but poorly optimized or long-running queries can still impact performance, especially in resource-constrained environments like Elestio. Because ClickHouse executes large queries across multiple threads and can consume high memory and disk I/O, monitoring and controlling slow or blocking operations is essential.

This guide explains how to <span class="s3">**detect**</span>, <span class="s3">**analyze**</span>, and <span class="s3">**terminate**</span> long-running queries using terminal tools, Docker Compose setups, and ClickHouse’s internal system tables. It also outlines <span class="s3">**prevention strategies**</span> to help maintain system health.

## **Monitoring Long-Running Queries**

ClickHouse exposes query execution data through system tables like <span class="s4">system.processes</span> and <span class="s4">system.query\_log</span>. These allow you to monitor currently executing and historical queries for duration, memory usage, and user activity.

##### **Check Active Queries via Terminal**

To list currently running queries and their duration:

```sql
SELECT
  query_id,
  user,
  elapsed,
  memory_usage,
  query
FROM system.processes
ORDER BY elapsed DESC;
```

- <span class="s1">elapsed</span> is the query runtime in seconds.
- <span class="s1">memory\_usage</span> is in bytes.
- This lets you pinpoint queries that are taking too long or consuming excessive memory.

##### **Monitor Query Load in Real Time**

ClickHouse doesn’t have a built-in live monitoring command, but you can simulate real-time monitoring by repeatedly querying <span class="s2">system.processes</span>:

```
watch -n 2 'clickhouse-client --query="SELECT elapsed, query FROM system.processes ORDER BY elapsed DESC LIMIT 5"'
```

This updates every 2 seconds and shows the top 5 longest-running queries.

### **Terminating Problematic Queries Safely**

If you identify a query that is consuming too many resources or blocking critical workloads, you can terminate it by its <span class="s2">query\_id</span>.

##### **Kill a Query by ID**

```sql
KILL QUERY WHERE query_id = '<id>';
```

- The <span class="s1">&lt;id&gt;</span> can be found in the <span class="s1">system.processes</span> table.
- This forces termination of the query while leaving the user session intact.

To forcibly kill all long-running queries (e.g., &gt;60 seconds):

```sql
KILL QUERY WHERE elapsed > 60 SYNC;
```

> Use <span class="s1">SYNC</span> to wait for the termination to complete before proceeding.

### **Managing Inside Docker Compose**

If ClickHouse is running inside Docker Compose on Elestio, follow these steps:

##### **Access the ClickHouse Container**

```
docker-compose exec clickhouse bash
```

Then run:

```
clickhouse-client --user default
```

If authentication is enabled, add <span class="s1">--password &lt;your\_password&gt;</span>.

You can now run queries like:

```sql
SELECT query_id, elapsed, query FROM system.processes;
```

Or terminate:

```sql
KILL QUERY WHERE query_id = '<id>';
```

## **Analyzing Query History**

ClickHouse logs completed queries (including failures) in the <span class="s2">system.query\_log</span> table.

##### **View Historical Long-Running Queries**

```sql
SELECT
  query_start_time,
  query_duration_ms,
  user,
  query
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_duration_ms > 1000
ORDER BY query_start_time DESC
LIMIT 10;
```

This helps identify patterns or repeat offenders.
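
To spot repeat offenders rather than single runs, the log can be grouped by `normalized_query_hash`, which ClickHouse records for each query (the threshold and limit below are illustrative):

```sql
SELECT
  normalized_query_hash,
  count() AS runs,
  round(avg(query_duration_ms)) AS avg_ms,
  any(query) AS sample_query
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_duration_ms > 1000
GROUP BY normalized_query_hash
ORDER BY runs DESC
LIMIT 10;
```

Queries that appear here repeatedly are usually the best candidates for index or schema changes.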

### **Understanding Query Latency with Profiling Tools**

ClickHouse provides advanced metrics via <span class="s2">system.metrics</span>, <span class="s2">system.events</span>, and <span class="s2">system.asynchronous\_metrics</span>.

##### **Generate a Performance Snapshot**

```sql
SELECT * FROM system.metrics WHERE value != 0 ORDER BY value DESC;
```

- Use to analyze memory pressure, merge operations, disk reads/writes, and thread usage.

To examine detailed breakdowns of CPU usage or IO latency:

```sql
SELECT * FROM system.events WHERE value > 0 ORDER BY value DESC;
```

### **Best Practices to Prevent Long-Running Queries**

Preventing long-running queries is vital for maintaining ClickHouse performance, especially under high concurrency or on shared infrastructure.

- **Avoid Full Table Scans:** Use filters on primary key or indexed columns. Avoid queries without WHERE clauses on large tables.

```sql
SELECT count() FROM logs WHERE date >= '2024-01-01';
```

- **Limit Result Set Sizes:** Avoid returning millions of rows to clients. Use <span class="s2">LIMIT</span> and paginated access.

```sql
SELECT * FROM logs ORDER BY timestamp DESC LIMIT 100;
```

- **Optimize Joins and Aggregations:** Use <span class="s1">ANY INNER JOIN</span> for faster lookups. Avoid joining two huge datasets unless one is pre-aggregated or dimensionally small.
- **Avoid High Cardinality Aggregates:** Functions like <span class="s2">uniqExact()</span> are CPU-intensive. Prefer approximate variants (<span class="s2">uniq()</span>) when precision isn’t critical.
- **Set Query Timeouts and Memory Limits:** Limit resource usage per query:

```sql
SET max_execution_time = 30;
SET max_memory_usage = 1000000000;
```

- **Use Partitions and Projections:** Partition large datasets by time (e.g., <span class="s1">toYYYYMM(date)</span>) to reduce scanned rows. Use projections for fast pre-aggregated access.

# Preventing Full Disk Issues

Running out of disk space in a ClickHouse environment can cause query failures, part merge errors, and even full service downtime. ClickHouse is highly dependent on disk for storing columnar data, part files, metadata, temporary sort buffers, and backups. On platforms like Elestio, the infrastructure is managed, but users are still responsible for monitoring storage, managing data retention, and optimizing resource usage. This guide explains how to monitor and clean up disk usage, configure safe retention policies, and implement long-term strategies to prevent full-disk scenarios when running ClickHouse under Docker Compose.

### **Monitoring Disk Usage**

#### **Inspect the host system storage**

Run this on the host machine to check which mount point is filling up:

```
df -h
```

This shows usage across all mounted volumes. Look for the mount used by your ClickHouse volume, usually mapped to something like <span class="s1">/var/lib/docker/volumes/clickhouse\_data/\_data</span>.
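
For a simple alert, a small shell check can be scheduled with cron. A minimal sketch, assuming the volume path shown above (adjust the path and threshold to your setup):

```shell
# Warn when a mount point crosses a usage threshold.
check_disk() {
  path=$1
  threshold=$2
  # Column 5 of `df -P` is the usage percentage, e.g. "43%"
  usage=$(df -P "$path" 2>/dev/null | awk 'NR==2 { gsub(/%/, "", $5); print $5 }')
  usage=${usage:-0}   # treat a missing path as 0% rather than failing
  if [ "$usage" -ge "$threshold" ]; then
    echo "WARN: $path at ${usage}% (threshold ${threshold}%)"
  else
    echo "OK: $path at ${usage}%"
  fi
}

check_disk /var/lib/docker/volumes/clickhouse_data/_data 80
```

Run it from cron (for example every few minutes) and forward WARN lines to your alerting channel.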

#### **Check disk usage from inside the container**

Enter the ClickHouse container shell:

```
docker-compose exec clickhouse bash
```

Inside, check total ClickHouse disk usage:

```
du -sh /var/lib/clickhouse
```

To inspect usage of specific folders like <span class="s1">data/</span>, <span class="s1">tmp/</span>, or <span class="s1">store/</span>:

```
ls -lh /var/lib/clickhouse
```

## **Configuring Alerts and Cleaning Up Storage**

#### **Inspect Docker’s storage usage**

On the host, check space used by containers, images, volumes:

```
docker system df
```

#### **Identify and remove unused Docker volumes**

List all Docker volumes:

```
docker volume ls
```

Remove unused volumes (only if you’re sure they’re not needed):

```
docker volume rm <volume-name>
```

> <span class="s1">**Warning:**</span> Never delete your active ClickHouse data volume unless you’ve backed it up.

#### **Drop data manually using SQL**

To free space by removing outdated partitions or tables:

```sql
ALTER TABLE logs DROP PARTITION '2024-01';
TRUNCATE TABLE temp_events;
```

#### **Clean up local backups**

If you’re storing backups under <span class="s2">/var/lib/clickhouse/backup</span>, list and delete old ones:

```
ls -lh /var/lib/clickhouse/backup
rm -rf /var/lib/clickhouse/backup/backup-<timestamp>
```

Ensure important backups are offloaded before removing.

### **Managing Temporary Files**

#### **Monitor temporary file usage**

Check the temp directory inside the container:

```
du -sh /var/lib/clickhouse/tmp
```

Old files may remain if queries or merges crashed. Clean up when the system is idle.
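
A hedged sketch for that cleanup, deleting leftover temp files older than a day (`/var/lib/clickhouse/tmp` is the default location; run only while the server is idle so files still in use are not removed):

```shell
# Delete temp files older than one day from a ClickHouse tmp directory.
clean_tmp() {
  dir=$1
  [ -d "$dir" ] || return 0            # nothing to do if the directory is absent
  find "$dir" -type f -mtime +1 -delete
}

clean_tmp /var/lib/clickhouse/tmp
```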

#### **Redirect temporary paths to persistent storage**

Modify the <span class="s2">tmp\_path</span> in <span class="s2">config.xml</span> to use a volume-backed directory:

```
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
```

Restart the container after applying changes.

### **Best Practices for Disk Space Management**

- <span class="s1">**Avoid storing binary blobs:**</span> Do not store large files like PDFs or images in ClickHouse. Use external object storage and only store references.
- <span class="s1">**Use TTL to expire old data:**</span> Automatically delete old data based on timestamps:

```sql
ALTER TABLE logs MODIFY TTL created_at + INTERVAL 90 DAY;
```

- <span class="s1">**Drop old partitions regularly:**</span> If partitioned by month/day, remove outdated partitions:

```sql
ALTER TABLE logs DROP PARTITION '2023-12';
```

- <span class="s1">**Enable efficient compression:**</span> Use a ZSTD column codec for better compression and lower disk usage:

```sql
CREATE TABLE logs (ts DateTime, message String CODEC(ZSTD(3)))
ENGINE = MergeTree ORDER BY ts;
```

- <span class="s1">**Split large inserts into smaller batches:**</span> Avoid memory and disk spikes during large ingest operations.
- <span class="s1">**Optimize background merge load:**</span> Tune merge concurrency and thresholds using:

```
<background_pool_size>8</background_pool_size>
```

- <span class="s1">**Limit disk spill during queries:**</span> Prevent massive temp usage during large operations:

```
<max_bytes_before_external_sort>500000000</max_bytes_before_external_sort>
```

- <span class="s1">**Rotate Docker logs:**</span> Prevent logs from filling up your disk using log rotation:

```yaml
logging:
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "3"
```

- <span class="s1">**Monitor disk usage from ClickHouse itself:**</span> Track table-level disk usage using system tables:

```sql
SELECT table, sum(bytes_on_disk) AS size
FROM system.parts
WHERE active
GROUP BY table
ORDER BY size DESC;
```

- <span class="s1">**Offload backups to remote storage:**</span> Backups inside containers should be copied off-host. Use Elestio’s backup tool or mount a backup volume:

```
volumes:
  - /mnt/backups:/backups
```

# Checking Database Size and Related Issues

As your ClickHouse data grows, especially with large analytical workloads or high-ingestion pipelines, it’s important to track how storage is being used. Unchecked growth can lead to full disks, failed inserts, increased merge times, and slower queries. While Elestio handles the infrastructure, ClickHouse storage optimization and cleanup remain your responsibility. This guide explains how to inspect disk usage, analyze table size, detect inefficiencies, and manage ClickHouse storage effectively under a Docker Compose setup.

### **Checking Table Size and Disk Usage**

ClickHouse stores data in columnar parts on disk, organized by partitions and merges. You can inspect disk consumption using SQL queries and Docker commands.

#### **Check total disk space used by ClickHouse**

From the host machine:

```
docker system df
```

Identify the Docker volume associated with ClickHouse, then check disk usage:

```
docker volume ls
sudo du -sh /var/lib/docker/volumes/<clickhouse_volume_name>/_data
```

#### **Inspect space used per table**

Connect to ClickHouse from the container:

```
docker-compose exec clickhouse clickhouse-client
```

Run:

```sql
SELECT
  database,
  table,
  formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC;
```

This shows total size used by each active table on disk.
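
To see how well each table compresses, `system.parts` also exposes compressed and uncompressed byte counts (a sketch; run as-is against any database):

```sql
SELECT
  table,
  formatReadableSize(sum(data_compressed_bytes)) AS compressed,
  formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
  round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
FROM system.parts
WHERE active
GROUP BY table
ORDER BY sum(data_compressed_bytes) DESC;
```

Tables with a low ratio are candidates for a stronger codec such as ZSTD.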

#### **View storage location inside container**

ClickHouse typically writes data under <span class="s2">/var/lib/clickhouse</span>:

```
docker-compose exec clickhouse ls -lh /var/lib/clickhouse/store
```

This contains all table parts and metadata. Review sizes here, but only delete files manually if you are certain they are orphaned and not referenced by any table.

### **Detecting Bloat and Inefficiencies**

ClickHouse can accumulate unnecessary disk usage due to unoptimized merges, redundant partitions, or abandoned tables.

#### **Check for unmerged parts**

A high number of unmerged parts can slow down queries and increase disk usage:

```sql
SELECT
  database,
  table,
  count() AS part_count
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY part_count DESC;
```

Tables with many small parts may need a manual merge trigger.
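
To trigger a merge for just one partition rather than the whole table, `OPTIMIZE` accepts a partition clause (the table name and partition value are illustrative; the value depends on your `PARTITION BY` expression, e.g. `202405` for `toYYYYMM(date)`):

```sql
OPTIMIZE TABLE logs PARTITION 202405;
```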

#### **Detect inactive or outdated parts**

Look for inactive parts still occupying disk:

```sql
SELECT
  name,
  active,
  remove_time
FROM system.parts
WHERE active = 0
LIMIT 50;
```

Inactive parts are normally removed automatically after a short delay; a persistent backlog of old inactive parts can indicate stuck merges or replication issues worth investigating.

#### **Analyze storage by partition**

To pinpoint heavy partitions:

```sql
SELECT
  table,
  partition_id,
  formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active
GROUP BY table, partition_id
ORDER BY sum(bytes_on_disk) DESC;
```

Large partitions can indicate hot data or poor partitioning strategy.

### **Optimizing and Reclaiming ClickHouse Storage**

ClickHouse provides several tools to optimize disk usage and clear unnecessary files.

#### **Drop old partitions manually**

For time-series or event tables, drop outdated partitions:

```sql
ALTER TABLE logs DROP PARTITION '2023-12';
```

Use partition pruning to maintain data freshness.

#### **Optimize tables to force merges**

To reduce part count and improve compression:

```sql
OPTIMIZE TABLE logs FINAL;
```

Use FINAL sparingly; it can be resource-intensive.

#### **Clean up old tables or unused databases**

Drop stale or abandoned tables:

```sql
DROP TABLE old_analytics;
```

Drop entire databases if needed:

```sql
DROP DATABASE dev_test;
```

Always ensure no production data is affected.

### **Managing and Optimizing Files on Disk**

ClickHouse stores metadata, parts, WAL logs, and temp files under <span class="s2">/var/lib/clickhouse</span>. You should monitor this path inside the container and from the host.

#### **Monitor disk from inside container**

```
docker-compose exec clickhouse du -sh /var/lib/clickhouse
```

To drill down:

```
docker-compose exec clickhouse du -sh /var/lib/clickhouse/*
```

Identify unexpectedly large directories like <span class="s1">/store</span>, <span class="s1">/tmp</span>, or <span class="s1">/data</span>.

#### **Purge temporary files and logs**

ClickHouse writes to <span class="s3">/var/lib/clickhouse/tmp</span> and <span class="s3">/var/log/clickhouse-server/</span>:

```
docker-compose exec clickhouse du -sh /var/lib/clickhouse/tmp
docker-compose exec clickhouse du -sh /var/log/clickhouse-server/
```

Clear these if the disk is nearing full, and rotate or truncate logs if necessary.
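
A hedged sketch of truncating a log in place (the path is the default ClickHouse log location; adjust for your setup). Emptying the file rather than deleting it keeps the server’s open file handle valid:

```shell
# Keep a dated copy, then empty the live log file in place.
rotate_log() {
  log=$1
  [ -f "$log" ] || return 0   # nothing to do if the log is absent
  cp "$log" "$log.$(date +%F)"
  truncate -s 0 "$log"
}

rotate_log /var/log/clickhouse-server/clickhouse-server.log
```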

#### **Clean WALs and outdated mutations**

If mutations or insert queues are stuck:

```sql
SELECT * FROM system.mutations WHERE is_done = 0;
```

Investigate and resolve the root cause. Consider restarting ClickHouse after clearing safe logs.
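
If a mutation is confirmed stuck and safe to abandon, it can be cancelled with `KILL MUTATION` (database and table names are illustrative):

```sql
KILL MUTATION WHERE database = 'default' AND table = 'logs';
```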

### **Best Practices for ClickHouse Storage Management**

- Use partitioning: Partition large tables by time (e.g., daily, monthly) to enable faster drops and better merge control.
- Archive old data: Move cold data to object storage (S3, etc.) or external databases for long-term storage.
- Avoid oversized inserts: Insert in smaller chunks to avoid bloating parts and reduce memory pressure during merges.
- Rotate logs: If ClickHouse logs to file, configure log rotation:

```yaml
logging:
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "3"
```

- Use ZSTD compression: Prefer ZSTD over LZ4 for better compression ratio at the cost of slightly higher CPU.
- Monitor merges and disk pressure: Use system.metrics and system.events to track merge performance, part counts, and disk usage trends.
- Backup externally: Don’t store backups on the same disk. Use Elestio backup options to archive to remote or cloud storage.