Skip to main content

Java Language Client Options for ClickHouse

There are three options for connecting to ClickHouse using Java:

Java Client

Provides the most flexible and performant way to integrate your app with ClickHouse.

Environment requirements

Compatibility with ClickHouse

Client versionClickHouse
0.6.022.8+

Installation

<dependency>
<groupId>com.clickhouse</groupId>
<!-- or clickhouse-grpc-client if you prefer gRPC -->
<artifactId>clickhouse-http-client</artifactId>
<version>0.6.0</version>
</dependency>

Since version 0.5.0, the driver uses a new client http library that needs to be added as a dependency.

<dependency>
<groupId>org.apache.httpcomponents.client5</groupId>
<artifactId>httpclient5</artifactId>
<version>5.2.3</version>
</dependency>

Supported data types

FormatSupportComment
AggregatedFunction⚠️ does not support SELECT * FROM table ...
Array(*)
Bool
Date*
DateTime*
Decimal*SET output_format_decimal_trailing_zeros=1 in 21.9+ for consistency
Enum*can be treated as both string and integer
Geo TypesPoint, Ring, Polygon, and MultiPolygon
Int*, UInt*UInt64 is mapped to long
IPv*
Map(*)
Nested(*)
Object('JSON')
SimpleAggregateFunction
*String
Tuple(*)
UUID

Driver API

Connect to ClickHouse

URL Syntax: protocol://host[:port][/database][?param[=value][&param[=value]][#tag[,tag]], for example:

  • http://localhost:8443?ssl=true&sslmode=NONE
  • http://(https://[email protected]:443
  • tcp://localhost?!auto_discovery#experimental),(grpc://localhost#experimental)?failover=3#test
ClickHouseNodes servers = ClickHouseNodes.of(
"jdbc:ch:http://server1.domain,server2.domain,server3.domain/my_db"
+ "?load_balancing_policy=random&health_check_interval=5000&failover=2");

Query

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
ClickHouseResponse response = client.read(servers)
.format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
.query("select * from numbers(:limit)")
.params(1000)
.executeAndWait()) {
ClickHouseResponseSummary summary = response.getSummary();
long totalRows = summary.getTotalRowsToRead();
}

Streaming Query

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
ClickHouseResponse response = client.read(servers)
.format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
.query("select * from numbers(:limit)")
.params(1000)
.executeAndWait()) {
for (ClickHouseRecord r : response.records()) {
int num = r.getValue(0).asInteger();
// type conversion
String str = r.getValue(0).asString();
LocalDate date = r.getValue(0).asDate();
}
}

Insert

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
ClickHouseResponse response = client.read(servers).write()
.format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
.query("insert into my_table select c2, c3 from input('c1 UInt8, c2 String, c3 Int32')")
.data(myInputStream) // load data into a table and wait untilit's completed
.executeAndWait()) {
ClickHouseResponseSummary summary = response.getSummary();
summary.getWrittenRows();
}

Multiple queries

Execute multiple queries in a worker thread one after another within same session:

CompletableFuture<List<ClickHouseResponseSummary>> future = ClickHouseClient.send(servers.apply(servers.getNodeSelector()),
"create database if not exists my_base",
"use my_base",
"create table if not exists test_table(s String) engine=Memory",
"insert into test_table values('1')('2')('3')",
"select * from test_table limit 1",
"truncate table test_table",
"drop table if exists test_table");
List<ClickHouseResponseSummary> results = future.get();

Named Parameters

You can pass parameters by name rather than relying solely on their position in the parameter list. This capability is available using params function.

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
ClickHouseResponse response = client.read(servers)
.format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
.query("select * from my_table where name=:name limit :limit")
.params("Ben", 1000)
.executeAndWait()) {
//...
}
}
Parameters

All params signatures involving String type (String, String[], Map<String, String>) assume the keys being passed are valid ClickHouse SQL strings. For instance:

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
ClickHouseResponse response = client.read(servers)
.format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
.query("select * from my_table where name=:name")
.params(Map.of("name","'Ben'"))
.executeAndWait()) {
//...
}
}

If you prefer not to parse String objects to ClickHouse SQL manually, you can use the helper function ClickHouseValues.convertToSqlExpression located at com.clickhouse.data:

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
ClickHouseResponse response = client.read(servers)
.format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
.query("select * from my_table where name=:name")
.params(Map.of("name", ClickHouseValues.convertToSqlExpression("Ben's")))
.executeAndWait()) {
//...
}
}

In the example above, ClickHouseValues.convertToSqlExpression will escape the inner single quote, and surround the variable with a valid single quotes.

Other types, such as Integer, UUID, Array and Enum will be converted automatically inside params.

JDBC Driver

clickhouse-jdbc implements the standard JDBC interface. Being built on top of clickhouse-client, it provides additional features like custom type mapping, transaction support, and standard synchronous UPDATE and DELETE statements, etc., so that it can be easily used with legacy applications and tools.

clickhouse-jdbc API is synchronous, and generally, it has more overheads(e.g., SQL parsing and type mapping/conversion, etc.). Consider clickhouse-client when performance is critical or if you prefer a more direct way to access ClickHouse.

Environment requirements

Compatibility with ClickHouse

Client versionClickHouse
0.6.022.8+

Installation

<dependency>
<groupId>com.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
<version>0.6.0</version>
<!-- use uber jar with all dependencies included, change classifier to http for smaller jar -->
<classifier>all</classifier>
</dependency>

Since version 0.5.0, we are using Apache HTTP Client that's packed the Client. Since there is not a shared version of the package, you need to add a logger as a dependency.

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>2.0.9</version>
</dependency>

Configuration

Driver Class: com.clickhouse.jdbc.ClickHouseDriver

URL Syntax: jdbc:(ch|clickhouse)[:<protocol>]://endpoint1[,endpoint2,...][/<database>][?param1=value1&param2=value2][#tag1,tag2,...], for example:

  • jdbc:ch://localhost is same as jdbc:clickhouse:http://localhost:8123
  • jdbc:ch:https://localhost is same as jdbc:clickhouse:http://localhost:8443?ssl=true&sslmode=STRICT
  • jdbc:ch:grpc://localhost is same as jdbc:clickhouse:grpc://localhost:9100

Connection Properties:

PropertyDefaultDescription
continueBatchOnErrorfalseWhether to continue batch processing when error occurred
createDatabaseIfNotExistfalseWhether to create database if it does not exist
custom_http_headerscomma separated custom http headers, for example: User-Agent=client1,X-Gateway-Id=123
custom_http_paramscomma separated custom http query parameters, for example: extremes=0,max_result_rows=100
nullAsDefault00 - treat null value as is and throw exception when inserting null into non-nullable column; 1 - treat null value as is and disable null-check for inserting; 2 - replace null to default value of corresponding data type for both query and insert
jdbcCompliancetrueWhether to support standard synchronous UPDATE/DELETE and fake transaction
typeMappingsCustomize mapping between ClickHouse data type and Java class, which will affect result of both getColumnType() and getObject(Class<?>). For example: UInt128=java.lang.String,UInt256=java.lang.String
wrapperObjectfalseWhether getObject() should return java.sql.Array / java.sql.Struct for Array / Tuple.

Note: please refer to JDBC specific configuration for more.

Supported data types

FormatSupportComment
AggregatedFunction⚠️ does not support SELECT * FROM table ...
Array(*)
Bool
Date*
DateTime*
Decimal*SET output_format_decimal_trailing_zeros=1 in 21.9+ for consistency
Enum*can be treated as both string and integer
Geo TypesPoint, Ring, Polygon, and MultiPolygon
Int*, UInt*UInt64 is mapped to long
IPv*
Map(*)
Nested(*)
Object('JSON')
SimpleAggregateFunction
*String
Tuple(*)
UUID

Driver API

Connect to ClickHouse

String url = "jdbc:ch://my-server/system"; // use http protocol and port 8123 by default

Properties properties = new Properties();

ClickHouseDataSource dataSource = new ClickHouseDataSource(url, properties);
try (Connection conn = dataSource.getConnection("default", "password");
Statement stmt = conn.createStatement()) {
}

Query


try (Connection conn = dataSource.getConnection(...);
Statement stmt = conn.createStatement()) {
ResultSet rs = stmt.executeQuery("select * from numbers(50000)");
while(rs.next()) {
// ...
}
}

Insert

note
  • Use PreparedStatement instead of Statement
  • Use input function whenever possible
With input table function

An option with great performance characteristics:

try (PreparedStatement ps = conn.prepareStatement(
"insert into mytable select col1, col2 from input('col1 String, col2 DateTime64(3), col3 Int32')")) {
// The column definition will be parsed so the driver knows there are 3 parameters: col1, col2 and col3
ps.setString(1, "test"); // col1
ps.setObject(2, LocalDateTime.now()); // col2, setTimestamp is slow and not recommended
ps.setInt(3, 123); // col3
ps.addBatch(); // parameters will be write into buffered stream immediately in binary format
...
ps.executeBatch(); // stream everything on-hand into ClickHouse
}
Insert

It's easier to use but slower performance compare to input function:

try (PreparedStatement ps = conn.prepareStatement("insert into mytable(* except (description))")) {
ps.setString(1, "test"); // id
ps.setObject(2, LocalDateTime.now()); // timestamp
ps.addBatch(); // parameters will be write into buffered stream immediately in binary format
...
ps.executeBatch(); // stream everything on-hand into ClickHouse
}
Insert with placeholders

An option that is not recommended as it requires a very large SQL query:

try (PreparedStatement ps = conn.prepareStatement("insert into mytable values(trim(?),?,?)")) {
ps.setString(1, "test"); // id
ps.setObject(2, LocalDateTime.now()); // timestamp
ps.setString(3, null); // description
ps.addBatch(); // append parameters to the query
...
ps.executeBatch(); // issue the composed query: insert into mytable values(...)(...)...(...)
}

Advanced API

Connect to ClickHouse with SSL

To establish a secure JDBC connection to ClickHouse using SSL, you need to configure your JDBC properties to include SSL parameters. This typically involves specifying SSL properties such as sslmode and sslrootcert in your JDBC URL or Properties object.

SSL Properties

NameDefault ValueOptional ValuesDescription
sslfalsetrue, falseWhether to enable SSL/TLS for the connection
sslmodestrictstrict, noneWhether to verify SSL/TLS certificate
sslrootcertPath to SSL/TLS root certificates
sslcertPath to SSL/TLS certificate
sslkeyRSA key in PKCS#8 format
key_store_typeJKS, PKCS12Specifies the type or format of the keystore/truststore file
trust_storePath to the truststore file
key_store_passwordPassword needed to access the keystore file specified in the keystore config

These properties ensure that your Java application communicates with the ClickHouse server over an encrypted connection, enhancing data security during transmission.

  String url = "jdbc:ch://your-server:8443/system";

Properties properties = new Properties();
properties.setProperty("ssl", "true");
properties.setProperty("sslmode", "strict"); // NONE to trust all servers; STRICT for trusted only
properties.setProperty("sslrootcert", "/mine.crt");
try (Connection con = DriverManager
.getConnection(url, properties)) {

try (PreparedStatement stmt = con.prepareStatement(

// place your code here

}
}

Handling DateTime and time zones

Please to use java.time.LocalDateTime or java.time.OffsetDateTime instead of java.sql.Timestamp, and java.time.LocalDate instead of java.sql.Date.

try (PreparedStatement ps = conn.prepareStatement("select date_time from mytable where date_time > ?")) {
ps.setObject(2, LocalDateTime.now());
ResultSet rs = ps.executeQuery();
while(rs.next()) {
LocalDateTime dateTime = (LocalDateTime) rs.getObject(1);
}
...
}

Handling AggregateFunction

note

As of now, only groupBitmap is supported.

// batch insert using input function
try (ClickHouseConnection conn = newConnection(props);
Statement s = conn.createStatement();
PreparedStatement stmt = conn.prepareStatement(
"insert into test_batch_input select id, name, value from input('id Int32, name Nullable(String), desc Nullable(String), value AggregateFunction(groupBitmap, UInt32)')")) {
s.execute("drop table if exists test_batch_input;"
+ "create table test_batch_input(id Int32, name Nullable(String), value AggregateFunction(groupBitmap, UInt32))engine=Memory");
Object[][] objs = new Object[][] {
new Object[] { 1, "a", "aaaaa", ClickHouseBitmap.wrap(1, 2, 3, 4, 5) },
new Object[] { 2, "b", null, ClickHouseBitmap.wrap(6, 7, 8, 9, 10) },
new Object[] { 3, null, "33333", ClickHouseBitmap.wrap(11, 12, 13) }
};
for (Object[] v : objs) {
stmt.setInt(1, (int) v[0]);
stmt.setString(2, (String) v[1]);
stmt.setString(3, (String) v[2]);
stmt.setObject(4, v[3]);
stmt.addBatch();
}
int[] results = stmt.executeBatch();
...
}

// use bitmap as query parameter
try (PreparedStatement stmt = conn.prepareStatement(
"SELECT bitmapContains(my_bitmap, toUInt32(1)) as v1, bitmapContains(my_bitmap, toUInt32(2)) as v2 from {tt 'ext_table'}")) {
stmt.setObject(1, ClickHouseExternalTable.builder().name("ext_table")
.columns("my_bitmap AggregateFunction(groupBitmap,UInt32)").format(ClickHouseFormat.RowBinary)
.content(new ByteArrayInputStream(ClickHouseBitmap.wrap(1, 3, 5).toBytes()))
.asTempTable()
.build());
ResultSet rs = stmt.executeQuery();
Assert.assertTrue(rs.next());
Assert.assertEquals(rs.getInt(1), 1);
Assert.assertEquals(rs.getInt(2), 0);
Assert.assertFalse(rs.next());
}

Configuring HTTP library

The ClickHouse JDBC connector supports three HTTP libraries: HttpClient, HttpURLConnection, and Apache HttpClient.

note

HttpClient is only supported in JDK 11 or above.

The JDBC driver uses HttpClient by default. You can change the HTTP library used by the ClickHouse JDBC connector by setting the following property:

properties.setProperty("http_connection_provider", "APACHE_HTTP_CLIENT");

Here is a full list of the corresponding values:

Property ValueHTTP Library
HTTP_CLIENTHTTPClient
HTTP_URL_CONNECTIONHttpURLConnection
APACHE_HTTP_CLIENTApache HttpClient

Resolving JDBC Timeout on Large Inserts

When performing large inserts in ClickHouse with long execution times, you may encounter JDBC timeout errors like:

Caused by: java.sql.SQLException: Read timed out, server myHostname [uri=https://hostname.aws.clickhouse.cloud:8443]

These errors can disrupt the data insertion process and affect system stability. To address this issue you need to adjust a few timeout settings in the client's OS.

Mac OS

On Mac OS, the following settings can be adjusted to resolve the issue:

  • net.inet.tcp.keepidle: 60000
  • net.inet.tcp.keepintvl: 45000
  • net.inet.tcp.keepinit: 45000
  • net.inet.tcp.keepcnt: 8
  • net.inet.tcp.always_keepalive: 1
Linux

On Linux, the equivalent settings alone may not resolve the issue. Additional steps are required due to the differences in how Linux handles socket keep-alive settings. Follow these steps:

  1. Adjust the following Linux kernel parameters in /etc/sysctl.conf or a related configuration file:

    • net.inet.tcp.keepidle: 60000
    • net.inet.tcp.keepintvl: 45000
    • net.inet.tcp.keepinit: 45000
    • net.inet.tcp.keepcnt: 8
    • net.inet.tcp.always_keepalive: 1
    • net.ipv4.tcp_keepalive_intvl: 75
    • net.ipv4.tcp_keepalive_probes: 9
    • net.ipv4.tcp_keepalive_time: 60 (You may consider lowering this value from the default 300 seconds)
  2. After modifying the kernel parameters, apply the changes by running the following command:

    sudo sysctl -p

After Setting those settings, you need to ensure that your client enables the Keep Alive option on the socket:

properties.setProperty("socket_keepalive", "true");
note

Currently, you must use Apache HTTP Client library when setting the socket keep-alive, as the other two HTTP client libraries supported by clickhouse-java do not allow setting socket options. For a detailed guide, see Configuring HTTP library.

Alternatively, you can add equivalent parameters to the JDBC URL.

The default socket and connection timeout for the JDBC driver is 30 seconds. The timeout can be increased to support large data insert operations. Use the options method on ClickHouseClient together with the SOCKET_TIMEOUT and CONNECTION_TIMEOUT options as defined by ClickHouseClientOption:

final int MS_12H = 12 * 60 * 60 * 1000; // 12 h in ms
final String sql = "insert into table_a (c1, c2, c3) select c1, c2, c3 from table_b;";

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP)) {
client.read(servers).write()
.option(ClickHouseClientOption.SOCKET_TIMEOUT, MS_12H)
.option(ClickHouseClientOption.CONNECTION_TIMEOUT, MS_12H)
.query(sql)
.executeAndWait();
}

Configuring node discovery, load balancing, and failover

Node discovery

Java client provides the ability to discover ClickHouse nodes automatically. Auto-discovery is disabled by default. To manually enable it, set auto_discovery to true:

properties.setProperty("auto_discovery", "true");

Or in the connection URL:

jdbc:ch://my-server/system?auto_discovery=true

If auto-discovery is enabled, there is no need to specify all ClickHouse nodes in the connection URL. Nodes specified in the URL will be treated as seeds, and the Java client will automatically discover more nodes from system tables and/or clickhouse-keeper or zookeeper.

The following options are responsible for auto-discovery configuration:

PropertyDefaultDescription
auto_discoveryfalseWhether the client should discover more nodes from system tables and/or clickhouse-keeper/zookeeper.
node_discovery_interval0Node discovery interval in milliseconds, zero or negative value means one-time discovery.
node_discovery_limit100Maximum number of nodes that can be discovered at a time; zero or negative value means no limit.
Load balancing

The Java client chooses a ClickHouse node to send requests to, according to the load-balancing policy. In general, the load-balancing policy is responsible for the following things:

  1. Get a node from a managed node list.
  2. Managing node's status.
  3. Optionally schedule a background process for node discovery (if auto-discovery is enabled) and run a health check.

Here is a list of options to configure load balancing:

PropertyDefaultDescription
load_balancing_policy""The load-balancing policy can be one of:
  • firstAlive - request is sent to the first healthy node from the managed node list
  • random - request is sent to a random node from the managed node list
  • roundRobin - request is sent to each node from the managed node list, in turn.
  • full qualified class name implementing ClickHouseLoadBalancingPolicy - custom load balancing policy
  • If it is not specified the request is sent to the first node from the managed node list
    load_balancing_tags""Load balancing tags for filtering out nodes. Requests are sent only to nodes that have the specified tags
    health_check_interval0Health check interval in milliseconds, zero or negative value means one-time.
    health_check_methodClickHouseHealthCheckMethod.SELECT_ONEHealth check method. Can be one of:
  • ClickHouseHealthCheckMethod.SELECT_ONE - check with select 1 query
  • ClickHouseHealthCheckMethod.PING - protocol-specific check, which is generally faster
  • node_check_interval0Node check interval in milliseconds, negative number is treated as zero. The node status is checked if the specified amount of time has passed since the last check.
    The difference between health_check_interval and node_check_interval is that the health_check_interval option schedules the background job, which checks the status for the list of nodes (all or faulty), but node_check_interval specifies the amount of time has passed since the last check for the particular node
    check_all_nodesfalseWhether to perform a health check against all nodes or just faulty ones.
    Failover and retry

    Java client provides configuration options to set up failover and retry behavior for failed queries:

    PropertyDefaultDescription
    failover0Maximum number of times a failover can happen for a request. Zero or a negative value means no failover. Failover sends the failed request to a different node (according to the load-balancing policy) in order to recover from failover.
    retry0Maximum number of times retry can happen for a request. Zero or a negative value means no retry. Retry sends a request to the same node and only if the ClickHouse server returns the NETWORK_ERROR error code
    repeat_on_session_locktrueWhether to repeat execution when the session is locked until timed out(according to session_timeout or connect_timeout). The failed request is repeated if the ClickHouse server returns the SESSION_IS_LOCKED error code

    Adding custom http headers

    Java client support HTTP/S transport layer in case we want to add custom HTTP headers to the request. We should use the custom_http_headers property, and the headers need to be , separated. The header key/value should be divided using =

    Java Client support

    options.put("custom_http_headers", "X-ClickHouse-Quota=test, X-ClickHouse-Test=test");

    JDBC Driver

    properties.setProperty("custom_http_headers", "X-ClickHouse-Quota=test, X-ClickHouse-Test=test");

    R2DBC driver

    R2DBC wrapper of async Java client for ClickHouse.

    Environment requirements

    Compatibility with ClickHouse

    Client versionClickHouse
    0.6.022.8+

    Installation

    <dependency>
    <groupId>com.clickhouse</groupId>
    <!-- change to clickhouse-r2dbc_0.9.1 for SPI 0.9.1.RELEASE -->
    <artifactId>clickhouse-r2dbc</artifactId>
    <version>0.6.0</version>
    <!-- use uber jar with all dependencies included, change classifier to http or grpc for smaller jar -->
    <classifier>all</classifier>
    <exclusions>
    <exclusion>
    <groupId>*</groupId>
    <artifactId>*</artifactId>
    </exclusion>
    </exclusions>
    </dependency>

    Supported data types

    FormatSupportComment
    AggregatedFunction⚠️ does not support SELECT * FROM table ...
    Array(*)
    Bool
    Date*
    DateTime*
    Decimal*SET output_format_decimal_trailing_zeros=1 in 21.9+ for consistency
    Enum*can be treated as both string and integer
    Geo TypesPoint, Ring, Polygon, and MultiPolygon
    Int*, UInt*UInt64 is mapped to long
    IPv*
    Map(*)
    Nested(*)
    Object('JSON')
    SimpleAggregateFunction
    *String
    Tuple(*)
    UUID

    Driver API

    Connect to ClickHouse

    ConnectionFactory connectionFactory = ConnectionFactories
    .get("r2dbc:clickhouse:http://{username}:{password}@{host}:{port}/{database}");

    Mono.from(connectionFactory.create())
    .flatMapMany(connection -> connection

    Query

    connection
    .createStatement("select domain, path, toDate(cdate) as d, count(1) as count from clickdb.clicks where domain = :domain group by domain, path, d")
    .bind("domain", domain)
    .execute())
    .flatMap(result -> result
    .map((row, rowMetadata) -> String.format("%s%s[%s]:%d", row.get("domain", String.class),
    row.get("path", String.class),
    row.get("d", LocalDate.class),
    row.get("count", Long.class)))
    )
    .doOnNext(System.out::println)
    .subscribe();

    Insert

    connection
    .createStatement("insert into clickdb.clicks values (:domain, :path, :cdate, :count)")
    .bind("domain", click.getDomain())
    .bind("path", click.getPath())
    .bind("cdate", LocalDateTime.now())
    .bind("count", 1)
    .execute();