In the previous chapter, ClickHouse Keeper Tests, we ensured that our servers can coordinate and agree on a leader. We verified the internal nervous system of the cluster.
But a healthy brain and nervous system are useless if the entity cannot speak to the outside world. ClickHouse is unique because it is "multilingual." It speaks its own native language, but it also speaks HTTP, MySQL, PostgreSQL, gRPC, and Prometheus.
This brings us to Protocol Tests.
Imagine you run a restaurant (ClickHouse).
If your kitchen cooks a perfect burger, but the waiter doesn't understand the order, the customer leaves unhappy.
The Challenge: We need to ensure that when a user connects with a standard MySQL tool (like mysql command line) or a Prometheus scraper, ClickHouse understands the request perfectly and responds in the correct format.
Central Use Case: We want to write a test that:
SELECT "Hello".This is the piece of code inside ClickHouse that sits at the door. It listens on a specific port (e.g., 9004). It translates "Foreign" requests into "Native" ClickHouse internal commands.
This is the default way clickhouse-client talks to the server (usually port 9000). It is highly optimized for speed.
These are layers that mimic other databases.
These tests are a subset of Integration Tests. We use the same machinery (ClickHouseCluster), but we change how we talk to the server.
First, we must tell ClickHouse to open its ears to the specific protocol. We do this via an XML config file.
<!-- configs/mysql_protocol.xml -->
<clickhouse>
<!-- Tell ClickHouse to listen for MySQL on port 9004 -->
<mysql_port>9004</mysql_port>
</clickhouse>
Explanation: By default, this port might be closed. We explicitly open port 9004 for our test.
We start the cluster with this configuration.
import pytest
from helpers.cluster import ClickHouseCluster
import pymysql # Standard Python library for MySQL
cluster = ClickHouseCluster(__file__)
node = cluster.add_instance('node', main_configs=['configs/mysql_protocol.xml'])
@pytest.fixture(scope="module", autouse=True)
def started_cluster():
cluster.start()
yield
cluster.shutdown()
Explanation: We import pymysql. This is crucial. We are NOT using a ClickHouse client. We are using a tool designed for a completely different database to prove compatibility.
Now we write the test. We connect to the ClickHouse container using the MySQL driver.
def test_mysql_protocol_select():
# Connect to ClickHouse as if it were MySQL
conn = pymysql.connect(
user='default',
password='',
host='127.0.0.1',
port=node.port, # This is mapped to 9004
)
# Create a cursor to execute commands
cursor = conn.cursor()
cursor.execute("SELECT 1")
Explanation: The pymysql.connect function sends a specific "Handshake" packet. ClickHouse must recognize this handshake and reply correctly, or the connection will fail immediately.
We check the result.
# Fetch the result
result = cursor.fetchone()
# Verify the data
assert result[0] == 1
conn.close()
Explanation: If result is 1, it means ClickHouse successfully:
We use the same logic for other protocols.
We use the standard Python requests library.
import requests
def test_http_protocol():
# Send a POST request to port 8123 (Standard HTTP)
url = f"http://{node.ip_address}:8123/"
response = requests.post(url, data="SELECT 1")
assert response.status_code == 200
assert response.text.strip() == "1"
ClickHouse exposes metrics (like CPU usage) for Prometheus to scrape.
def test_prometheus_endpoint():
# Prometheus scrapes /metrics
url = f"http://{node.ip_address}:9363/metrics"
response = requests.get(url)
# Check for a specific metric key
assert "ClickHouseProfileEvents_Query" in response.text
How does ClickHouse pretend to be MySQL? It uses a Handler Pattern.
MySQLHandler looks at the first byte.Here is the flow:
The code for these handlers resides in src/Server/.
For example, MySQLHandler.cpp acts as the translator.
// Simplified Logic inside src/Server/MySQLHandler.cpp
void MySQLHandler::readPacket()
{
// Read the packet type from the wire
PacketType type = readType();
if (type == COM_QUERY)
{
// Extract the SQL string
String query = readString();
// Pass it to the internal engine
executeQuery(query);
}
}
Explanation: This C++ code reads raw bytes from the network. It switches based on PacketType. If it receives a COM_QUERY (standard MySQL command), it extracts the text and runs it.
The hardest part is mapping data types.
VARCHAR, DECIMAL.String, Decimal128.
The protocol tests heavily verify that these conversions happen without losing precision. We often write tests that insert DateTime via MySQL and read it back to ensure the timezone didn't shift.
Protocol Tests ensure the ecosystem around ClickHouse works.
In this chapter, we learned about Protocol Tests.
pymysql or requests) to simulate real-world usage.Now that we know we can talk to clients, what about talking to other systems? ClickHouse often needs to pull data from Kafka or read files from S3. These are complex external dependencies.
In the next chapter, we will explore External Integrations Tests.
Next Chapter: External Integrations Tests
Generated by Code IQ