How-To Guides

Creating a Database

ClickHouse is a high-performance columnar database designed for real-time analytical processing. It is known for its speed, horizontal scalability, and efficient use of disk I/O. Proper setup is essential for taking advantage of ClickHouse’s full capabilities, including fault tolerance, secure access, and high query performance. This guide walks through several ways to run and connect to ClickHouse: the ClickHouse CLI (clickhouse-client), Docker containers, and command-line tools for scripting and automation. Best practices are highlighted throughout.

Creating using clickhouse-client

The ClickHouse command-line interface (clickhouse-client) is a built-in tool used to connect to and manage ClickHouse servers. It supports both local and remote connections and allows SQL-based interaction with the database engine.

Connect to ClickHouse: If you’re running ClickHouse locally (via package manager or Docker), you can start the CLI with:

    clickhouse-client

For remote connections, specify the hostname, port (default 9000), and user credentials:

    clickhouse-client -h <host> --port <port> -u <user> --password <password>

Once connected, you can run SQL queries directly from the shell.

Running ClickHouse Using Docker

Docker provides a fast, reproducible way to run ClickHouse in isolated environments. This is ideal for local development or self-contained production setups.

Access Elestio Terminal

If you’re using Elestio for ClickHouse hosting, log into the Elestio dashboard. Go to your ClickHouse service, then navigate to Tools > Terminal to open a pre-authenticated shell session. Now change the directory:

    cd /opt/app/

Access the ClickHouse Container Shell

Elestio-managed services run on Docker Compose. Use this to enter the ClickHouse container:

    docker-compose exec clickhouse bash

Access ClickHouse CLI from Inside the Container

Once inside the container, the clickhouse-client tool is available.
Run it like this (the --password option is only needed if the user has a password):

    clickhouse-client -u <user> --port <port> --password <password>

You are now connected to the running ClickHouse instance inside the container.

Test Connectivity

Try creating a database and querying data to verify functionality:

    CREATE DATABASE test_db;
    CREATE TABLE test_db.test_table (id UInt32, message String) ENGINE = MergeTree() ORDER BY id;
    INSERT INTO test_db.test_table VALUES (1, 'Hello ClickHouse');
    SELECT * FROM test_db.test_table;

Expected Output:

    1    Hello ClickHouse

This confirms read/write operations and query functionality.

Connecting Using clickhouse-client in Scripts

clickhouse-client can be used non-interactively for scripting, automation, and cron-based jobs. For example, to insert data from a shell script:

    echo "INSERT INTO test_db.test_table VALUES (2, 'Automated')" | \
      clickhouse-client -h <host> -u <user> --password <password>

This is useful for automated ETL jobs, health checks, or backup pipelines.

Best Practices for Setting Up ClickHouse

Use Clear Naming for Databases and Tables

Adopt consistent naming conventions for databases, tables, and columns. Use lowercase, underscore-separated names like:

    user_events_2024
    product_sales_agg

This improves clarity in multi-schema environments and helps with automation and maintenance scripts.

Choose the Right Engine and Indexing Strategy

ClickHouse supports various table engines such as MergeTree, ReplacingMergeTree, and SummingMergeTree. Pick the engine that best matches your use case and define ORDER BY keys carefully, because the sorting key determines how much data a query has to read. Example:

    CREATE TABLE logs (
        timestamp DateTime,
        service String,
        message String
    ) ENGINE = MergeTree()
    ORDER BY (timestamp, service);

Inappropriate engine selection can lead to poor query performance or high disk usage.

Enable Authentication and Secure Access

Always configure user-level authentication and restrict access in production.
Add users and passwords in users.xml or via SQL. Prefer a hashed authentication type such as sha256_password over plaintext_password, and grant only the privileges each user actually needs (GRANT ALL is shown here for brevity):

    CREATE USER secure_user IDENTIFIED WITH sha256_password BY 'strong_password';
    GRANT ALL ON *.* TO secure_user;

Use TLS for encrypted connections by enabling SSL in the config.xml file:

    <tcp_port_secure>9440</tcp_port_secure>
    <openSSL>
        <server>
            <certificateFile>/etc/clickhouse-server/certs/server.crt</certificateFile>
            <privateKeyFile>/etc/clickhouse-server/certs/server.key</privateKeyFile>
        </server>
    </openSSL>

Configure Data Persistence and Storage Paths

ClickHouse stores data on disk by default, but you still need proper mounting, storage separation, and backup routines. In config.xml:

    <path>/var/lib/clickhouse/</path>
    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>

Use RAID, SSDs, or networked volumes depending on your availability and performance needs.

Monitor and Tune Performance

Use built-in introspection tools like:

    SELECT * FROM system.metrics;
    SELECT * FROM system.query_log ORDER BY event_time DESC LIMIT 10;
    SELECT * FROM system.parts;

For real-time observability, integrate with Grafana or Prometheus, or use ClickHouse Keeper metrics. Also review:

- system.mutations for long-running mutation jobs
- system.errors for crash/debug info
- system.replication_queue for sync issues in replicated tables

Common Issues and Their Solutions

| Issue | Cause | Solution |
|---|---|---|
| Authentication failure | Wrong password or no user set | Double-check credentials; use --password flag |
| Cannot connect to localhost | Service not running or incorrect port | Ensure ClickHouse is running and check the port |
| SSL/TLS handshake failed | Incorrect certificate paths or permissions | Verify file locations in config.xml and restart service |
| Queries are slow | Poor ORDER BY design or unoptimized table engine | Reevaluate schema design and use indexes effectively |
| Data lost after restart | Misconfigured data path or ephemeral container | Ensure proper disk volume mounts and storage persistence |

Upgrading to Major Version

Upgrading a database service on Elestio can be done without creating a new instance or performing a full manual migration. Elestio provides a built-in option to change the database version directly from the dashboard.
This is useful when the upgrade does not involve breaking changes or when minimal manual involvement is preferred. Elestio handles the version upgrade internally, including restarting the database service if required. This reduces the number of steps involved and keeps services up to date with minimal configuration changes.

Log In and Locate Your Service

To begin the upgrade process, log in to your Elestio dashboard and navigate to the database service you want to upgrade. Verify that the correct instance is selected, especially in environments where multiple databases are used for different purposes such as staging, testing, or production. The dashboard shows detailed information for each service, including version details, usage metrics, and current configuration. Ensure that you have access rights to perform upgrades on the selected service. Identifying the right instance helps avoid accidental changes to unrelated environments.

Back Up Your Data

Before starting the upgrade, create a backup of your database. A backup stores the current state of your data, schema, indexes, and configuration, which can be restored if something goes wrong during the upgrade. In Elestio, open the Backups tab, select Back up now under Manual local backups, and then download the backup file. Scheduled backups may also be used, but it is recommended to create a manual one just before the upgrade. Keeping a recent backup allows quick recovery in case of errors or rollback needs. This is especially important in production environments where data consistency is critical.

Select the New Version

Once your backup is secure, open the Overview tab of your database service page and go to Software > Update config. Here, you'll find an option labeled ENV. In the ENV editor, set the SOFTWARE_VERSION variable to the desired database version.
After confirming the version, Elestio will begin the upgrade process automatically. During this time, the platform takes care of the version change and restarts the database if needed. No manual commands are required, and the system handles most of the operational aspects in the background. Monitor the Upgrade Process The upgrade process may include a short downtime while the database restarts. Once it is completed, it is important to verify that the upgrade was successful and the service is operating as expected. Start by checking the logs available in the Elestio dashboard for any warnings or errors during the process. Then, review performance metrics to ensure the database is running normally and responding to queries. Finally, test the connection from your client applications to confirm that they can interact with the upgraded database without issues. Installing and Updating an Extension ClickHouse supports custom extensions via [User Defined Functions (UDFs)], external dictionaries, and shared libraries that extend its core capabilities with custom logic, formats, or integrations. These behave similarly to modules or plugins in other systems and must be configured at server startup. Common examples include integration with geospatial libraries, custom UDFs, or external dictionary sources like MySQL or HTTP. In Elestio-hosted ClickHouse instances or any Docker Compose-based setup, extensions can be added by mounting external libraries or configuration files and referencing them in config.xml or users.xml . This guide walks through how to install, load, and manage ClickHouse extensions using Docker Compose along with best practices and common troubleshooting steps. Installing and Enabling ClickHouse Extensions ClickHouse extensions are typically compiled as shared objects ( .so ) files or defined as configuration files for dictionaries or formats. These files must be mounted into the container and referenced explicitly in the server’s configuration files. 
Example: Load Custom Shared Library UDF

Suppose you have a compiled UDF called libexample_udf.so. To include it in a Docker Compose setup:

Update docker-compose.yml

Mount the shared library into the container:

    services:
      clickhouse:
        image: clickhouse/clickhouse-server:latest
        volumes:
          - ./modules/libexample_udf.so:/usr/lib/clickhouse/user_defined/libexample_udf.so
          - ./configs/config.xml:/etc/clickhouse-server/config.xml
        ports:
          - "8123:8123"
          - "9000:9000"

- ./modules/libexample_udf.so : local path to the shared library on the host.
- /usr/lib/clickhouse/user_defined/ : default directory for user libraries inside the container.

Make sure the file exists before running Docker Compose.

Configure config.xml to Load the UDF

In your custom config.xml, add an entry that names the function (example_udf), its type (udf), and the library file (libexample_udf.so). The library path must match the volume mount location.

Restart the ClickHouse Service

After updating the Compose and configuration files, restart the service:

    docker-compose down
    docker-compose up -d

This will reload ClickHouse with the specified UDF.

Verify the Extension is Loaded

Connect using the ClickHouse CLI or HTTP interface and run:

    SELECT example_udf('test input');

If successful, the function returns the expected result from the loaded library. You can also confirm that the server loaded your shared library by inspecting the logs:

    docker-compose logs clickhouse

Look for lines that indicate the library was found and loaded.

Managing External Dictionaries

ClickHouse supports loading external data sources (like MySQL, HTTP APIs, or files) as dictionaries.

Mount Dictionary Configuration

In docker-compose.yml:

    volumes:
      - ./configs/dictionaries/:/etc/clickhouse-server/dictionaries/

Reference in config.xml

    <dictionaries_config>/etc/clickhouse-server/dictionaries/*.xml</dictionaries_config>

Example dictionary file (mysql_dictionary.xml). Only the elements shown here are part of the example; a working definition also needs layout and lifetime elements suited to your data:

    <dictionary>
        <name>mysql_dict</name>
        <source>
            <mysql>
                <host>mysql-host</host>
                <user>root</user>
                <password>password</password>
                <db>test</db>
                <table>cities</table>
            </mysql>
        </source>
        <structure>
            <id>
                <name>id</name>
            </id>
            <attribute>
                <name>name</name>
                <type>String</type>
            </attribute>
        </structure>
    </dictionary>

Use the dictionary in queries:

    SELECT dictGetString('mysql_dict', 'name', toUInt64(42));

Updating or Removing Extensions

ClickHouse doesn’t support unloading UDFs or dictionaries at runtime. To modify or remove an extension:

1. Stop the container:

    docker-compose down

2. Edit config files: replace or remove the entry in config.xml or the dictionary config. Replace or remove the .so file if applicable.

3. Restart the container:

    docker-compose up -d

Always test changes in staging before deploying to production.

Troubleshooting Common Extension Issues

| Issue | Cause | Resolution |
|---|---|---|
| ClickHouse fails to start | Invalid config or missing .so file | Run docker-compose logs clickhouse and fix missing files or XML syntax |
| UDF not recognized | Wrong library path or missing permissions | Ensure volume mount is correct and file is executable inside container |
| Dictionary not available | Config file not found or misconfigured XML | Double-check dictionaries_config and validate with SHOW DICTIONARIES |
| Segmentation fault | Invalid shared library or ABI mismatch | Recompile UDF for correct platform, verify against installed ClickHouse version |
| Query fails silently | Dictionary or UDF not fully loaded | Recheck server logs for errors during startup |

Security Considerations

ClickHouse extensions, especially shared libraries, run with the same privileges as the ClickHouse server. Be cautious:

- Only load trusted .so files from verified sources.
- Ensure the clickhouse user has restricted permissions inside the container.
- Never place dictionary or UDF files in directories that external systems can write to.
- Avoid using custom UDFs or dictionaries from unknown sources in production environments without a thorough code review.

Creating Manual Backups

Regular backups are essential when running a ClickHouse deployment, especially if you’re using it for persistent analytics or time-series data.
While Elestio handles automated backups by default, you may want to create manual backups before configuration changes, retain a local archive, or test backup automation. This guide walks through multiple methods for creating ClickHouse backups on Elestio, including dashboard snapshots, command-line approaches, and Docker Compose-based setups. It also explains backup storage, retention, and automation using scheduled jobs. Manual Service Backups on Elestio If you’re using Elestio’s managed ClickHouse service, the simplest way to perform a full backup is directly through the Elestio dashboard. This creates a snapshot of your current ClickHouse dataset and stores it in Elestio’s infrastructure. These snapshots can be restored later from the same interface, which is helpful when making critical changes or testing recovery workflows. To trigger a manual ClickHouse backup on Elestio: Log in to the Elestio dashboard. Navigate to your ClickHouse service or cluster. Click the Backups tab in the service menu. Choose Back up now to generate a manual snapshot. This method is recommended for quick, reliable backups without needing to use the command line. Manual Backups Using Docker Compose If your ClickHouse instance is deployed via Docker Compose (as is common on Elestio-hosted environments), you can manually back up ClickHouse by either copying its internal storage files or using the native BACKUP SQL command (available in ClickHouse v21.12+). These approaches allow you to maintain control over backup logic and frequency. Access Elestio Terminal Go to your deployed ClickHouse service in the Elestio dashboard, navigate to Tools > Terminal , and log in using the credentials provided. Locate the ClickHouse Container Directory Navigate to your app directory: cd /opt/app/ This is the working directory of your Docker Compose project, which contains the docker-compose.yml file. 
Trigger a Backup (Using SQL)

If you’re using ClickHouse with backup support enabled, you can execute:

    docker-compose exec clickhouse clickhouse-client --query="BACKUP DATABASE default TO Disk('/backups/backup_$(date +%F)')"

This creates a full backup of the default database inside the container at /backups. The destination has to be permitted by the backups section of the server configuration, so adjust the destination expression if your server is set up differently.

Copy Backup Files from the Container

Use docker cp to move the backup directory to your host system:

    docker cp $(docker-compose ps -q clickhouse):/backups/backup_$(date +%F) ./clickhouse_backup_$(date +%F)

This gives you a restorable backup snapshot for storage or future recovery.

Backup Storage & Retention Best Practices

After creating backups, it’s important to store them securely and manage retention properly. ClickHouse backups can grow large depending on the volume and compression of your data.

Guidelines to Follow:

- Use clear naming: clickhouse_backup_2025_06_09
- Store off-site or on cloud storage (e.g. AWS S3, Backblaze, encrypted storage)
- Retain: 7 daily backups, 4 weekly backups, and 3–6 monthly backups
- Automate old file cleanup with cron jobs or retention scripts
- Optionally compress backups with tar, gzip, or xz to reduce space

Automating ClickHouse Backups (cron)

Manual backup commands can be scheduled using tools like cron on Linux-based systems. This lets you back up your database regularly without running commands by hand. Automating the process also reduces the risk of forgetting backups and ensures more consistent retention.

Example: Daily Backup at 3 AM

Edit the crontab:

    crontab -e

Add a job like the following. A crontab entry must stay on a single line, every % must be escaped as \%, and exec needs -T because cron does not provide a TTY:

    0 3 * * * docker-compose -f /opt/app/docker-compose.yml exec -T clickhouse clickhouse-client --query="BACKUP DATABASE default TO Disk('/backups/backup_$(date +\%F)')" && docker cp $(docker-compose -f /opt/app/docker-compose.yml ps -q clickhouse):/backups/backup_$(date +\%F) /backups/clickhouse_backup_$(date +\%F)

Make sure /backups/ exists and is writable by the cron user.
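Long crontab entries are hard to read and to debug, so the same backup steps are often moved into a small wrapper script that cron calls. The following is a minimal sketch under that assumption: the function names and the COMPOSE_FILE and HOST_BACKUP_DIR variables are hypothetical, while the compose path, the clickhouse service name, and the BACKUP statement are taken from this guide.

```shell
#!/bin/sh
# Cron-friendly backup wrapper (sketch). Function and variable names are
# illustrative; paths and the service name follow this guide.

COMPOSE_FILE="${COMPOSE_FILE:-/opt/app/docker-compose.yml}"
HOST_BACKUP_DIR="${HOST_BACKUP_DIR:-/backups}"

# Build the backup name for a given date (YYYY-MM-DD).
backup_name() {
    printf 'backup_%s' "$1"
}

# Build the BACKUP statement used in this guide for a given backup name.
backup_query() {
    printf "BACKUP DATABASE default TO Disk('/backups/%s')" "$1"
}

# Run the backup inside the container, then copy it to the host.
run_backup() {
    today=$(date +%F)
    name=$(backup_name "$today")
    # -T disables pseudo-TTY allocation, which cron does not provide.
    docker-compose -f "$COMPOSE_FILE" exec -T clickhouse \
        clickhouse-client --query="$(backup_query "$name")" || return 1
    container_id=$(docker-compose -f "$COMPOSE_FILE" ps -q clickhouse) || return 1
    docker cp "$container_id:/backups/$name" "$HOST_BACKUP_DIR/clickhouse_$name"
}

# Only touch Docker when explicitly asked, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
    run_backup
fi
```

Saved under a path of your choice and made executable, the script could be scheduled with a crontab entry that calls it with the --run argument, which also avoids the \% escaping needed inside crontab lines.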
You can also compress the file or upload to cloud storage in the same script (the \% escapes are only required inside a crontab entry):

    tar -czf /backups/clickhouse_backup_$(date +\%F).tar.gz /backups/clickhouse_backup_$(date +\%F)
    rclone copy /backups/clickhouse_backup_$(date +\%F).tar.gz remote:clickhouse-backups

Backup Format and Restore Notes

| Format | Description | Restore Method |
|---|---|---|
| /backups/backup_<date> | SQL-based backup using BACKUP command | Use RESTORE DATABASE from the same Disk location |
| .tar.gz or .tar archive | Filesystem snapshot of /var/lib/clickhouse | Stop ClickHouse, extract data back into the directory, then restart |

To restore from a backup, choose the method that matches the backup format.

Restore via SQL (the server must be running for this):

    docker-compose exec clickhouse clickhouse-client --query="RESTORE DATABASE default FROM Disk('/backups/backup_2025-06-09')"

Or restore from a file-based archive by stopping ClickHouse, extracting the data, and starting it again:

    docker-compose down
    tar -xzf clickhouse_backup_2025-06-09.tar.gz -C /opt/app/data/clickhouse/
    docker-compose up -d

Restoring a Backup

Restoring ClickHouse backups is essential for disaster recovery, staging environment duplication, or rolling back to a known state. Elestio supports backup restoration both through its web dashboard and manually through Docker Compose and command-line methods. This guide explains how to restore ClickHouse backups from SQL-based snapshots or file-based archives, covering both full and partial restore scenarios, and includes solutions for common restoration issues.

Restoring from a Backup via Terminal

This method applies when you have a backup created using ClickHouse’s native BACKUP command or a direct copy of the data directory. To restore the backup, you must stop the running ClickHouse container, replace the data files, and restart the container to load the restored dataset.
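For file-based archives, the stop, replace, restart sequence can be wrapped in a script that checks the archive before anything is shut down or overwritten. This is a sketch only: the function names and the COMPOSE_FILE and DATA_DIR variables are hypothetical, and the default paths are the ones used in this guide.

```shell
#!/bin/sh
# File-based restore helper (sketch). Names are illustrative.

COMPOSE_FILE="${COMPOSE_FILE:-/opt/app/docker-compose.yml}"
DATA_DIR="${DATA_DIR:-/opt/app/data/}"

# verify_and_extract ARCHIVE TARGET_DIR
# Lists the archive first so a truncated or corrupt file is rejected
# before any data is written, then extracts it into TARGET_DIR.
verify_and_extract() {
    archive=$1
    target=$2
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    tar -tzf "$archive" >/dev/null || { echo "unreadable archive: $archive" >&2; return 1; }
    mkdir -p "$target" && tar -xzf "$archive" -C "$target"
}

# restore_from_archive ARCHIVE
# Stops the stack, restores the data directory, and starts the stack again.
restore_from_archive() {
    [ -f "$1" ] || { echo "missing archive: $1" >&2; return 1; }
    docker-compose -f "$COMPOSE_FILE" down || return 1
    verify_and_extract "$1" "$DATA_DIR" || return 1
    docker-compose -f "$COMPOSE_FILE" up -d
}
```

After sourcing the file, a call such as restore_from_archive clickhouse_backup_2025-06-09.tar.gz would run the whole sequence; verify_and_extract can also be used on its own to test an archive against a scratch directory.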
Stop the ClickHouse Container Shut down the ClickHouse container cleanly to avoid issues with open file handles or inconsistent state: docker-compose down Replace the Backup Files If your backup was created using the native ClickHouse BACKUP command and saved to /backups/backup_2025_06_09 , copy it into the appropriate path within the container or bind mount. Example: cp -r ./clickhouse_backup_2025_06_09 /opt/app/backups/backup_2025_06_09 Make sure this path corresponds to the volumes specified in your docker-compose.yml . For example: volumes: - ./backups:/backups - ./data:/var/lib/clickhouse If you’re restoring from a tarball archive, extract it into the correct volume mount: tar -xzf clickhouse_backup_2025_06_09.tar.gz -C /opt/app/data/ Restart ClickHouse Start the ClickHouse container again: docker-compose up -d ClickHouse will load the data either from the standard data directory or, if using the backup snapshot method, you can explicitly restore the database using SQL (next section). Restoring via Docker Compose Terminal If you’re using backups made with the SQL BACKUP command, ClickHouse also provides a built-in method to restore via the RESTORE command. Copy the Backup Directory into the Container docker cp ./clickhouse_backup_2025_06_09 $(docker-compose ps -q clickhouse):/backups/backup_2025_06_09 Restore with ClickHouse SQL Enter the container terminal: docker-compose exec clickhouse bash Then run the restore command: clickhouse-client --query="RESTORE DATABASE default FROM Disk('/backups/backup_2025_06_09')" This will restore the default database and its contents from the previously created backup directory. Partial Restores in ClickHouse ClickHouse supports more granular restore operations using SQL syntax. You can restore individual tables or databases if the backup was created using the native BACKUP command. 
Restore a Single Table clickhouse-client --query="RESTORE TABLE default.events FROM Disk('/backups/backup_2025_06_09')" This restores just the events table from the default database without affecting other tables. Restore Specific Schemas or Data You can also export and import CSV or TSV snapshots for partial data management: clickhouse-client --query="SELECT * FROM default.events FORMAT CSV" > events.csv clickhouse-client --query="INSERT INTO default.events FORMAT CSV" < events.csv Common Errors & How to Fix Them Restoring ClickHouse data can occasionally fail due to permission issues, path mismatches, unsupported formats, or version conflicts. Here are some frequent issues and their solutions. 1. ClickHouse Fails to Start After Restore Error: DB::Exception: Corrupted data part ... Cause:  The backup directory is incomplete or corrupted, or the file was not extracted properly. Resolution: Re-verify that the backup files were copied completely. Use tar -tzf to inspect archive contents before extracting. Make sure you’re restoring on the same ClickHouse version that created the backup. 2. RESTORE Command Fails with Permission Denied Error: DB::Exception: Cannot read from backup: Permission denied Cause: The container cannot access the /backups/ directory due to permissions. Resolution: Ensure the backup directory is readable by the ClickHouse process. Use chmod -R 755 /opt/app/backups/ to adjust permissions if needed. 3. Data Not Restored Cause: The RESTORE command did not include the correct database/table name or no data existed in the backup path. Resolution: Use clickhouse-client --query="SHOW DATABASES" to confirm no restore happened. Run ls /backups/backup_2025_06_09/ inside the container to verify backup contents. 4. Permission Denied When Copying Files Error: cp: cannot create regular file ‘/opt/app/data/’: Permission denied Resolution: Ensure your terminal session or script has write access to the target directory. 
Use sudo if needed:

    sudo cp -r ./clickhouse_backup_2025_06_09 /opt/app/data/

Identifying Slow Queries

Slow queries can impact ClickHouse performance, especially under high load or with inefficient queries or schema design. Whether you’re using ClickHouse on Elestio via the dashboard, accessing it inside a Docker Compose container, or running CLI queries, ClickHouse offers built-in tools to detect, diagnose, and optimize performance bottlenecks. This guide explains how to capture slow queries using system tables, measure query latency, and improve performance through tuning and query optimization.

Inspecting Slow Queries from the Terminal

ClickHouse logs query profiling information by default, which you can access via system tables. This allows you to identify long-running or resource-intensive queries directly from SQL.

Connect to ClickHouse via Terminal

Use the ClickHouse client to connect to your instance:

    clickhouse-client -h <host> --port <port> --user <user> --password <password>

Replace <host>, <port>, <user>, and <password> with your credentials from the Elestio dashboard.

View Recent Slow Queries

ClickHouse logs query performance stats in the system.query_log table. To view the 10 most recent queries that took longer than 1 second:

    SELECT query_start_time, query_duration_ms, query
    FROM system.query_log
    WHERE type = 'QueryFinish' AND query_duration_ms > 1000
    ORDER BY query_start_time DESC
    LIMIT 10;

You can raise or lower the query_duration_ms threshold to focus on slower or more critical queries.

Analyzing Inside Docker Compose

If your ClickHouse instance is running inside Docker Compose, you can inspect query logs and system performance from inside the container.

Access the ClickHouse Container

Open a shell session inside the running container:

    docker-compose exec clickhouse bash

Then run the ClickHouse client:

    clickhouse-client --user root

If a password is required, append --password to the command.
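Because the same inspection query is reused with different thresholds, it can help to generate it from a small helper instead of editing the SQL by hand. The sketch below assumes a hypothetical function name, slow_query_sql; the statement it prints is the one shown in this guide, with the threshold and row limit filled in.

```shell
#!/bin/sh
# slow_query_sql THRESHOLD_MS [LIMIT]
# Prints the slow-query inspection statement with a configurable
# threshold (milliseconds) and row limit (default 10).
slow_query_sql() {
    threshold=$1
    limit=${2:-10}
    # Accept only plain non-negative integers so nothing else ends up in the SQL.
    for value in "$threshold" "$limit"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "threshold and limit must be integers" >&2
                return 1
                ;;
        esac
    done
    printf "SELECT query_start_time, query_duration_ms, query FROM system.query_log WHERE type = 'QueryFinish' AND query_duration_ms > %s ORDER BY query_start_time DESC LIMIT %s" "$threshold" "$limit"
}
```

Inside the container, the generated statement can be passed straight to the client, for example clickhouse-client --query="$(slow_query_sql 5000)" to list queries slower than five seconds.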
Query the  system.query_log Inside the Container Run the same slow query inspection SQL as above to analyze performance issues: SELECT query_start_time, query_duration_ms, query FROM system.query_log WHERE type = 'QueryFinish' AND query_duration_ms > 1000 ORDER BY query_start_time DESC LIMIT 10; Using the  System Metrics & Events Tables ClickHouse includes system tables that expose performance-related metrics in real time. Check Overall Query Performance You can use the system.metrics table to view metrics like query execution time, memory usage, and background operations: SELECT * FROM system.metrics WHERE value != 0 ORDER BY value DESC; For cumulative statistics like total queries processed, check the system.events table: SELECT * FROM system.events WHERE value > 0 ORDER BY value DESC; Understanding and Resolving Common Bottlenecks Slow performance in ClickHouse is often caused by suboptimal queries, improper indexing (i.e., no primary key usage), disk I/O, or high memory usage. Common Causes of Slow Queries: Large table scans: Caused by missing filtering conditions or lack of primary key usage. JOINs on unindexed keys: Inefficient joins can result in full-table scans. High cardinality aggregations: Especially costly without optimization (e.g., using uniqExact() ). High insert latency: Triggered by too frequent small batch writes. Disk bottlenecks: Heavy merges or large result sets can overload I/O. Best Practices to Avoid Slow Queries: Use appropriate filtering: Always filter with indexed columns (usually primary keys). Avoid SELECT *: Specify only the needed columns. Use sampling when possible: ClickHouse supports SAMPLE clause on MergeTree tables. Use LIMIT: Avoid returning large result sets when debugging. Optimize JOINs: Prefer ANY INNER JOIN or JOIN ... USING for performance. Optimizing with Configuration Changes ClickHouse performance can be tuned via its configuration files ( config.xml and users.xml ) or environment variables. 
For Docker Compose setups, these can be overridden via docker-compose.override.yml . Adjust Query and Memory Settings Dynamically Some performance-related settings can be changed per session or globally: SET max_memory_usage = 2000000000; SET max_threads = 4; SET log_queries = 1; To make permanent changes, modify your config.xml or users.xml inside the container volume mount. Detect and terminate long-running queries ClickHouse is a high-performance, column-oriented OLAP database, but poorly optimized or long-running queries can still impact performance especially in resource-constrained environments like Elestio. Because ClickHouse executes large queries across multiple threads and can consume high memory and disk I/O, monitoring and controlling slow or blocking operations is essential. This guide explains how to detect , analyze , and terminate long-running queries using terminal tools, Docker Compose setups, and ClickHouse’s internal system tables. It also outlines prevention strategies to help maintain system health. Monitoring Long-Running Queries ClickHouse exposes query execution data through system tables like system.processes and system.query_log . These allow you to monitor currently executing and historical queries for duration, memory usage, and user activity. Check Active Queries via Terminal To list currently running queries and their duration: SELECT query_id, user, elapsed, memory_usage, query FROM system.processes ORDER BY elapsed DESC; elapsed is the query runtime in seconds. memory_usage is in bytes. This lets you pinpoint queries that are taking too long or consuming excessive memory. Monitor Query Load in Real Time ClickHouse doesn’t have a MONITOR-like command, but you can simulate real-time monitoring by repeatedly querying system.processes : watch -n 2 'clickhouse-client --query="SELECT elapsed, query FROM system.processes ORDER BY elapsed DESC LIMIT 5"' This updates every 2 seconds and shows the top 5 longest-running queries. 
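The polling approach above can also feed a filter that reports only the queries that have crossed a time limit, which is easier to use in scripts than watching the output. This is a sketch: long_running is a hypothetical helper that expects tab-separated rows of elapsed, query_id, and query on standard input.

```shell
#!/bin/sh
# long_running THRESHOLD_SECONDS
# Reads tab-separated "elapsed<TAB>query_id<TAB>query" rows on stdin and
# prints the query_id of every row whose elapsed time exceeds the threshold.
long_running() {
    awk -F '\t' -v limit="$1" '$1 + 0 > limit + 0 { print $2 }'
}
```

A possible invocation is clickhouse-client --query="SELECT elapsed, query_id, query FROM system.processes FORMAT TSV" | long_running 60, which prints the IDs of queries that have been running for more than 60 seconds.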
Terminating Problematic Queries Safely

If you identify a query that is consuming too many resources or blocking critical workloads, you can terminate it by its query_id.

Kill a Query by ID

    KILL QUERY WHERE query_id = '<query_id>';

The <query_id> can be found in the system.processes table. This forces termination of the query while leaving the user session intact.

To forcibly kill all long-running queries (e.g., >60 seconds):

    KILL QUERY WHERE elapsed > 60 SYNC;

Use SYNC to wait for the termination to complete before proceeding.

Managing Inside Docker Compose

If ClickHouse is running inside Docker Compose on Elestio, follow these steps:

Access the ClickHouse Container

    docker-compose exec clickhouse bash

Then run:

    clickhouse-client --user default

If authentication is enabled, add --password. You can now run queries like:

    SELECT query_id, elapsed, query FROM system.processes;

Or terminate:

    KILL QUERY WHERE query_id = '<query_id>';

Analyzing Query History

ClickHouse logs completed queries (including failures) in the system.query_log table.

View Historical Long-Running Queries

    SELECT query_start_time, query_duration_ms, user, query
    FROM system.query_log
    WHERE type = 'QueryFinish' AND query_duration_ms > 1000
    ORDER BY query_start_time DESC
    LIMIT 10;

This helps identify patterns or repeat offenders.

Understanding Query Latency with Profiling Tools

ClickHouse provides advanced metrics via system.metrics, system.events, and system.asynchronous_metrics.

Generate a Performance Snapshot

    SELECT * FROM system.metrics WHERE value != 0 ORDER BY value DESC;

Use this to analyze memory pressure, merge operations, disk reads/writes, and thread usage. To examine detailed breakdowns of CPU usage or IO latency:

    SELECT * FROM system.events WHERE value > 0 ORDER BY value DESC;

Best Practices to Prevent Long-Running Queries

Preventing long-running queries is vital for maintaining ClickHouse performance, especially under high concurrency or on shared infrastructure.
Avoid Full Table Scans:  Use filters on primary key or indexed columns. Avoid queries without WHERE clauses on large tables. SELECT count() FROM logs WHERE date >= '2024-01-01'; Limit Result Set Sizes: Avoid returning millions of rows to clients. Use LIMIT and paginated access. SELECT * FROM logs ORDER BY timestamp DESC LIMIT 100; Optimize Joins and Aggregations: Use ANY INNER JOIN for faster lookups. Avoid joining two huge datasets unless one is pre-aggregated or dimensionally small. Avoid High Cardinality Aggregates:  Functions like uniqExact() are CPU-intensive. Prefer approximate variants ( uniq() ) when precision isn’t critical. Set Query Timeouts and Memory Limits:  Limit resource usage per query: SET max_execution_time = 30; SET max_memory_usage = 1000000000; Use Partitions and Projections:  Partition large datasets by time (e.g., toYYYYMM(date) ) to reduce scanned rows. Use projections for fast pre-aggregated access. Preventing Full Disk Issues Running out of disk space in a ClickHouse environment can cause query failures, part merge errors, and even full service downtime. ClickHouse is highly dependent on disk for storing columnar data, part files, metadata, temporary sort buffers, and backups. On platforms like Elestio, infrastructure is managed, but users are still responsible for monitoring storage, managing data retention, and optimizing resource usage. This guide explains how to monitor and clean up disk usage, configure safe retention policies, and implement long-term strategies to prevent full disk scenarios in ClickHouse when running under Docker Compose Monitoring Disk Usage Inspect the host system storage Run this on the host machine to check which mount point is filling up: df -h This shows usage across all mounted volumes. Look for the mount used by your ClickHouse volume—usually mapped to something like /var/lib/docker/volumes/clickhouse_data/_data . 
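The df check above can be turned into a simple threshold alert that a scheduled job runs against the mount holding the ClickHouse volume. The sketch below uses hypothetical helper names (parse_df_percent, disk_alert) and relies on the portable df -P output format; it assumes the filesystem name contains no spaces.

```shell
#!/bin/sh
# parse_df_percent
# Reads `df -P` output on stdin and prints the used percentage
# (without the % sign) of the first filesystem row.
parse_df_percent() {
    awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# disk_alert PATH LIMIT
# Returns 1 and prints a warning when the filesystem holding PATH is at
# or above LIMIT percent full; returns 2 if usage could not be read.
disk_alert() {
    used=$(df -P "$1" | parse_df_percent)
    [ -n "$used" ] || return 2
    if [ "$used" -ge "$2" ]; then
        echo "WARNING: $1 is ${used}% full (limit $2%)" >&2
        return 1
    fi
    return 0
}
```

For example, disk_alert /var/lib/docker 85 || echo "free up space" would react once the volume reaches 85 percent; the path and the limit are placeholders to adapt to your own mount point.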
Check disk usage from inside the container

Enter the ClickHouse container shell:

    docker-compose exec clickhouse bash

Inside, check total ClickHouse disk usage:

    du -sh /var/lib/clickhouse

To inspect usage of specific folders like data/, tmp/, or store/ (ls -lh does not report directory sizes, so use du):

    du -sh /var/lib/clickhouse/*

Configuring Alerts and Cleaning Up Storage

Inspect Docker’s storage usage

On the host, check space used by containers, images, and volumes:

    docker system df

Identify and remove unused Docker volumes

List all Docker volumes:

    docker volume ls

Remove unused volumes (only if you’re sure they’re not needed):

    docker volume rm <volume_name>

Warning: Never delete your active ClickHouse data volume unless you’ve backed it up.

Drop data manually using SQL

To free space by removing outdated partitions or tables:

    ALTER TABLE logs DROP PARTITION '2024-01';
    TRUNCATE TABLE temp_events;

The partition value must match the table’s partition key; for a table partitioned by toYYYYMM(date), for example, the January 2024 partition is 202401.

Clean up local backups

If you’re storing backups under /var/lib/clickhouse/backup, list and delete old ones:

    ls -lh /var/lib/clickhouse/backup
    rm -rf /var/lib/clickhouse/backup/backup-<name>

Ensure important backups are offloaded before removing them.

Managing Temporary Files

Monitor temporary file usage

Check the temp directory inside the container:

    du -sh /var/lib/clickhouse/tmp

Old files may remain if queries or merges crashed. Clean them up when the system is idle.

Redirect temporary paths to persistent storage

Modify the tmp_path in config.xml to use a volume-backed directory:

    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>

Restart the container after applying changes.

Best Practices for Disk Space Management

Avoid storing binary blobs: Do not store large files like PDFs or images in ClickHouse. Use external object storage and only store references.
Use TTL to expire old data: Automatically delete old data based on timestamps:

    ALTER TABLE logs MODIFY TTL created_at + INTERVAL 90 DAY;

Drop old partitions regularly: If partitioned by month/day, remove outdated partitions:

    ALTER TABLE logs DROP PARTITION '2023-12';

Enable efficient compression: Use ZSTD for better compression and lower disk usage. In ClickHouse, compression codecs are declared per column:

    CREATE TABLE logs (
        timestamp DateTime CODEC(ZSTD),
        service String CODEC(ZSTD),
        message String CODEC(ZSTD)
    ) ENGINE = MergeTree() ORDER BY (timestamp, service);

Split large inserts into smaller batches: Avoid memory and disk spikes during large ingest operations.

Optimize background merge load: Tune merge concurrency and thresholds in the server configuration (the example uses a value of 8).

Limit disk spill during queries: Prevent massive temp usage during large operations by capping the relevant per-query limit (the example uses a limit of 500000000).

Rotate Docker logs: Prevent logs from filling up your disk by configuring log rotation in the Compose file:

    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

Monitor disk usage from ClickHouse itself: Track table-level disk usage using system tables. Filtering on active parts excludes parts that have already been merged away:

    SELECT table, sum(bytes_on_disk) AS size FROM system.parts WHERE active GROUP BY table ORDER BY size DESC;

Offload backups to remote storage: Backups inside containers should be copied off-host. Use Elestio’s backup tool or mount a backup volume:

    volumes:
      - /mnt/backups:/backups

Checking Database Size and Related Issues

As your ClickHouse data grows, especially with large analytical workloads or high-ingestion pipelines, it’s important to track how storage is being used. Unchecked growth can lead to full disks, failed inserts, increased merge times, and slower queries. While Elestio handles the infrastructure, ClickHouse storage optimization and cleanup remain your responsibility. This guide explains how to inspect disk usage, analyze table size, detect inefficiencies, and manage ClickHouse storage effectively under a Docker Compose setup.

Checking Table Size and Disk Usage

ClickHouse stores data in columnar parts on disk, organized by partitions and merges. You can inspect disk consumption using SQL queries and Docker commands.
Check total disk space used by ClickHouse

From the host machine:

    docker system df

Identify the Docker volume associated with ClickHouse, then check its disk usage:

    docker volume ls
    sudo du -sh /var/lib/docker/volumes/<volume_name>/_data

Inspect space used per table

Connect to ClickHouse from the container:

    docker-compose exec clickhouse clickhouse-client

Run:

    SELECT database, table, formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
    FROM system.parts
    WHERE active
    GROUP BY database, table
    ORDER BY sum(bytes_on_disk) DESC;

This shows the total size on disk of the active parts of each table.

View storage location inside container

ClickHouse typically writes data under /var/lib/clickhouse:

    docker-compose exec clickhouse ls -lh /var/lib/clickhouse/store

This contains all table parts and metadata. Review sizes, and delete orphaned data only after confirming that no existing table still references it.

Detecting Bloat and Inefficiencies

ClickHouse can accumulate unnecessary disk usage due to unoptimized merges, redundant partitions, or abandoned tables.

Check for unmerged parts

A high number of unmerged parts can slow down queries and increase disk usage:

    SELECT database, table, count() AS part_count
    FROM system.parts
    WHERE active
    GROUP BY database, table
    ORDER BY part_count DESC;

Tables with many small parts may need a manual merge trigger.

Detect inactive or outdated parts

Look for inactive parts still occupying disk:

    SELECT name, active, remove_time FROM system.parts WHERE active = 0 LIMIT 50;

ClickHouse normally removes inactive parts on its own after a short delay. Parts that linger are safe to delete only if they’re old and not part of ongoing operations.

Analyze storage by partition

To pinpoint heavy partitions:

    SELECT table, partition_id, formatReadableSize(sum(bytes_on_disk)) AS size
    FROM system.parts
    WHERE active
    GROUP BY table, partition_id
    ORDER BY sum(bytes_on_disk) DESC;

Large partitions can indicate hot data or a poor partitioning strategy.

Optimizing and Reclaiming ClickHouse Storage

ClickHouse provides several tools to optimize disk usage and clear unnecessary files.
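When many partitions are involved, it can be less error-prone to generate the cleanup statements from the partition list than to type them by hand. The Python sketch below is purely illustrative: it assumes the table is partitioned by a toYYYYMM(...) expression, so that partition IDs look like 202312, and the helper names cutoff_yyyymm and drop_statements are hypothetical, not part of ClickHouse. It only builds SQL text using the ALTER TABLE ... DROP PARTITION ID form; nothing is executed.

```python
import re
from datetime import date


def cutoff_yyyymm(today: date, keep_months: int) -> int:
    """Return the oldest YYYYMM value to keep: `keep_months` before the current month."""
    months = today.year * 12 + (today.month - 1) - keep_months
    year, month_index = divmod(months, 12)
    return year * 100 + month_index + 1


def drop_statements(table: str, partition_ids: list[str], today: date, keep_months: int) -> list[str]:
    """Build DROP PARTITION statements for partitions older than the cutoff.

    Assumes partition IDs are six-digit YYYYMM strings; anything else is skipped.
    """
    # Accept only plain `table` or `database.table` identifiers.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?", table):
        raise ValueError(f"unexpected table name: {table!r}")
    cutoff = cutoff_yyyymm(today, keep_months)
    statements = []
    for pid in sorted(set(partition_ids)):
        if not re.fullmatch(r"\d{6}", pid):
            continue  # not a YYYYMM partition ID, leave it alone
        if int(pid) < cutoff:
            statements.append(f"ALTER TABLE {table} DROP PARTITION ID '{pid}';")
    return statements


if __name__ == "__main__":
    # Sample partition IDs are made up; in practice they would come from system.parts.
    for sql in drop_statements("logs", ["202310", "202312", "202402"], date(2024, 3, 15), 3):
        print(sql)
```

Reviewing the generated statements before running them keeps a human in the loop, which matters because dropping a partition cannot be undone without a backup.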
Drop old partitions manually

For time-series or event tables, drop outdated partitions:

    ALTER TABLE logs DROP PARTITION '2023-12';

Drop partitions on a regular schedule so that only data within the retention window stays on disk.

Optimize tables to force merges

To reduce part count and improve compression:

    OPTIMIZE TABLE logs FINAL;

Use FINAL sparingly; it rewrites the data and can be resource-intensive.

Clean up old tables or unused databases

Drop stale or abandoned tables:

    DROP TABLE old_analytics;

Drop entire databases if needed:

    DROP DATABASE dev_test;

Always ensure no production data is affected.

Managing and Optimizing Files on Disk

ClickHouse stores metadata, parts, write-ahead logs, and temp files under /var/lib/clickhouse. You should monitor this path inside the container and from the host.

Monitor disk from inside container

    docker-compose exec clickhouse du -sh /var/lib/clickhouse

To drill down (the wildcard has to be expanded by a shell inside the container, not on the host):

    docker-compose exec clickhouse sh -c 'du -sh /var/lib/clickhouse/*'

Identify unexpectedly large directories like /store, /tmp, or /data.

Purge temporary files and logs

ClickHouse writes to /var/lib/clickhouse/tmp and /var/log/clickhouse-server/:

    docker-compose exec clickhouse du -sh /var/lib/clickhouse/tmp
    docker-compose exec clickhouse du -sh /var/log/clickhouse-server/

Clear these directories if the disk is nearly full. Rotate or truncate logs if necessary.

Clean WALs and outdated mutations

If mutations or insert queues are stuck:

    SELECT * FROM system.mutations WHERE is_done = 0;

Check the latest_fail_reason column to find out why a mutation is not completing, then resolve the root cause. Consider restarting ClickHouse after clearing logs that are safe to remove.

Best Practices for ClickHouse Storage Management

Use partitioning: Partition large tables by time (e.g., daily, monthly) to enable faster drops and better merge control.

Archive old data: Move cold data to object storage (S3, etc.) or external databases for long-term storage.

Avoid oversized inserts: Insert in smaller chunks to avoid bloating parts and reduce memory pressure during merges.
Rotate logs: For container output captured by Docker, configure log rotation in the Compose file. Log files that ClickHouse writes itself are rotated through the logger settings in the server configuration.

    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

Use ZSTD compression: Prefer ZSTD over LZ4 for a better compression ratio at the cost of slightly higher CPU usage.

Monitor merges and disk pressure: Use system.metrics and system.events to track merge performance, part counts, and disk usage trends.

Backup externally: Don’t store backups on the same disk. Use Elestio backup options to archive to remote or cloud storage.
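To turn the monitoring advice above into something that can run unattended, the rows returned by a per-table query on system.parts can be post-processed in a small script. The Python sketch below is an illustration under stated assumptions: the rows are (database, table, bytes_on_disk) tuples with raw byte counts, the helper names readable_size and tables_over_share are hypothetical, and the 50% share threshold is arbitrary. The size formatting uses binary units in the spirit of formatReadableSize, but it is not guaranteed to match ClickHouse's output character for character.

```python
def readable_size(num_bytes: int) -> str:
    """Format a byte count with binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024, or at the last unit.
        if size < 1024 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} {units[-1]}"  # not reached; keeps type checkers satisfied


def tables_over_share(rows, min_share: float = 0.5):
    """Return (name, readable size, share) for tables holding at least `min_share` of all bytes.

    `rows` is an iterable of (database, table, bytes_on_disk) tuples.
    """
    rows = list(rows)
    total = sum(size for _, _, size in rows)
    if total == 0:
        return []
    largest_first = sorted(rows, key=lambda row: row[2], reverse=True)
    return [
        (f"{database}.{table}", readable_size(size), size / total)
        for database, table, size in largest_first
        if size / total >= min_share
    ]


if __name__ == "__main__":
    # Made-up sample rows standing in for query results.
    sample = [("default", "logs", 3 * 1024**3), ("default", "events", 1024**3)]
    for name, size, share in tables_over_share(sample):
        print(f"{name}: {size} ({share:.0%} of tracked data)")
```

A table that dominates the disk is usually the first candidate for a TTL, tighter partitioning, or archiving to external storage.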