Get Docker from https://www.docker.com/community-edition.
Run `docker-compose up -d` to start the services in the background.
This will start the following services, in order:

- 3 ZooKeeper services.
- 6 ClickHouse services.
- 1 Nginx service.
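To check that everything came up, the standard Compose status command lists the services and their current state:

```bash
# list the compose services and check they are all running
docker-compose ps
```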
Run `docker-compose logs -f -t` to tail the logs.
Run `docker-compose down` to stop and remove the services.
Run `./ch_client.sh` to use the native client and connect to a ClickHouse service.
You can use plain old `curl` or an HTTP client in any programming language to access the ClickHouse services at http://localhost:8123.
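For example, a quick smoke test over the HTTP interface (a sketch, assuming the stack is up and listening on the default port 8123) could look like this:

```bash
# send a query to the ClickHouse HTTP interface; the query goes in the POST body
echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @-
```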
You can connect to a specific ClickHouse service by using any of the following paths:

- `clickhouse_shard_1_replica_1`
- `clickhouse_shard_1_replica_2`
- `clickhouse_shard_2_replica_1`
- `clickhouse_shard_2_replica_2`
- `clickhouse_shard_3_replica_1`
- `clickhouse_shard_3_replica_2`

For example: http://localhost:8123/clickhouse_shard_1_replica_1
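As a sketch, querying one replica directly through its path (assuming the Nginx proxy strips the path prefix before forwarding, as the URL above suggests) might look like:

```bash
# target a single replica through its dedicated path instead of the round-robin endpoint;
# hostName() reveals which ClickHouse node actually answered
echo 'SELECT hostName()' | curl 'http://localhost:8123/clickhouse_shard_1_replica_1' --data-binary @-
```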
The easiest way to get started and play with ClickHouse in the web browser is to use http://ui.tabix.io/#/login with the following settings:

- name: any name
- host:port: http://localhost:8123
- user: not required
- password: not required
This connects to the Nginx service. Nginx distributes the requests to the ClickHouse services in a round-robin fashion.
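You can also observe the round-robin behaviour from the command line (a small sketch under the defaults above; not required for Tabix) by repeating the same query and watching the answering host change:

```bash
# each request should be served by a different ClickHouse backend
for i in $(seq 1 6); do
  echo 'SELECT hostName()' | curl -s 'http://localhost:8123/' --data-binary @-
done
```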
To connect to a specific ClickHouse service from Tabix, use any of the paths listed above, e.g.:

- name: any name
- host:port: http://localhost:8123/clickhouse_shard_1_replica_1
- user: not required
- password: not required
All services are connected to Docker's internal network `chcompose_main`:

- `zoo1`
- `zoo2`
- `zoo3`
- `clickhouse_shard_1_replica_1`
- `clickhouse_shard_1_replica_2`
- `clickhouse_shard_2_replica_1`
- `clickhouse_shard_2_replica_2`
- `clickhouse_shard_3_replica_1`
- `clickhouse_shard_3_replica_2`
- `nginx_proxy`
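To see which containers are attached to that network, something like the following should work (a sketch; the Go template simply prints the container names):

```bash
# list the containers attached to the compose network
docker network inspect chcompose_main --format '{{range .Containers}}{{.Name}}{{"\n"}}{{end}}'
```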
ZooKeeper is required to enable replication. ClickHouse takes care of data consistency on all replicas and runs the restore procedure after a failure automatically. This comes pre-configured, so there is nothing to do here.
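If you want to verify the wiring, one option (a sketch, assuming the cluster is named `analytics`, as used in the Distributed table below) is to query the `system.clusters` table on any node:

```bash
# show the shards and replicas that make up the pre-configured cluster
echo "SELECT cluster, shard_num, replica_num, host_name FROM system.clusters WHERE cluster = 'analytics'" \
  | curl -s 'http://localhost:8123/' --data-binary @-
```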
- You must run the following command on all 6 ClickHouse services, using the native client option or over HTTP. Note that you cannot create tables from Tabix, as it can only be used for querying. This will create the `local_table` on all 6 services.
```sql
CREATE TABLE IF NOT EXISTS events
(
    id String,
    data String,
    created_at DateTime,
    created_at_date Date
) ENGINE = MergeTree(created_at_date, (id, created_at_date), 8192);
```
- After you've created the `local_table` on all 6 ClickHouse services, you need to create the `distributed_table` on all 6 services as:
```sql
CREATE TABLE IF NOT EXISTS events_all AS events
ENGINE = Distributed(analytics, default, events, rand());
```
- Note that I'm using `rand()` as the `sharding_key` here, which distributes the rows randomly across the shards. You can also pass a `sharding_key` that fits your business / application logic.
- Now you can insert some data using the native client. If you insert into the `local_table`, the data will be inserted into just a single shard and replicated to the other replica on that shard.
- If you insert into the `distributed_table`, the data will be split across the servers and inserted into the local tables on each shard of your cluster automatically. This is more convenient, but gives you less control over data distribution.
- Note that if you're going to insert data into the `distributed_table`, you only need to do it on a single service. Pick any one of them and connect via the native client. The data is distributed automatically; there is no need to insert it on each service. (An end-to-end sketch of these steps over the HTTP interface follows this list.)
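Putting the steps above together, here is one possible end-to-end sketch over the HTTP interface. It assumes the Nginx paths listed earlier (with the proxy stripping the path prefix before forwarding) and uses a couple of made-up rows purely for illustration; adapt it if you prefer the native client via `./ch_client.sh`.

```bash
#!/usr/bin/env bash
set -euo pipefail

BASE='http://localhost:8123'
REPLICAS='clickhouse_shard_1_replica_1 clickhouse_shard_1_replica_2
          clickhouse_shard_2_replica_1 clickhouse_shard_2_replica_2
          clickhouse_shard_3_replica_1 clickhouse_shard_3_replica_2'

# helper: POST a query to a given endpoint
run_query() {
  local url="$1" query="$2"
  echo "$query" | curl -sS "$url" --data-binary @-
}

LOCAL_TABLE="CREATE TABLE IF NOT EXISTS events
(
    id String,
    data String,
    created_at DateTime,
    created_at_date Date
) ENGINE = MergeTree(created_at_date, (id, created_at_date), 8192)"

DIST_TABLE="CREATE TABLE IF NOT EXISTS events_all AS events
ENGINE = Distributed(analytics, default, events, rand())"

# 1. create the local and distributed tables on every service
for replica in $REPLICAS; do
  run_query "$BASE/$replica" "$LOCAL_TABLE"
  run_query "$BASE/$replica" "$DIST_TABLE"
done

# 2. insert through the distributed table on a single service;
#    ClickHouse splits the rows across the shards automatically
run_query "$BASE" "INSERT INTO events_all VALUES
  ('1', 'hello', now(), today()),
  ('2', 'world', now(), today())"

# 3. check how the rows ended up distributed across the shards
run_query "$BASE" "SELECT hostName() AS host, count() FROM events_all GROUP BY host"
```

If everything worked, the final query should show rows spread across the shard hosts (the counts may be uneven with only a couple of illustrative rows).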