In the previous chapter, Stateless Queries, we learned how to check if ClickHouse can answer simple questions like 1 + 1. We ran these tests against a single, isolated Docker Server Image.
But ClickHouse is rarely used alone. It is designed to work in clusters, replicate data between servers, and talk to external systems like Kafka or S3.
How do we test what happens if a server crashes? Or if the network fails? A simple SQL query cannot simulate a crashing server.
This brings us to the Integration Test Job Script.
Imagine you are testing a new car.
The Challenge: To run an "Integration Test" for a database, you need a complex environment. You might need:
Setting this up manually for every test would take hours. We need a script that can create this entire "virtual world," run the test, and then destroy it, all in a few minutes.
Central Use Case: We want a script that can:
test_replication.py).
To solve this, we use a Python script named integration_test_job.py. It uses the following tools:
Pytest is a standard framework for testing in Python. While our Stateless queries were just SQL files, Integration tests are actual Python programs. This allows us to write logic like: "Start Server A, insert data, kill Server A, check if data is on Server B."
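To make the "start, insert, kill, check" idea concrete, here is a toy sketch in the shape Pytest expects (a function whose name starts with test_ and that uses plain assert). The FakeReplica class is a hypothetical in-memory stand-in invented for this illustration; it is not one of the real ClickHouse test helpers, which start actual Docker containers.

```python
# Toy stand-in for a replica. This class is hypothetical and exists only to
# show the shape of a failover test; it is not a ClickHouse helper.
class FakeReplica:
    def __init__(self, name):
        self.name = name
        self.rows = []
        self.alive = True
        self.peers = []

    def insert(self, row):
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.rows.append(row)
        # "Replicate" the row to every peer that is still running.
        for peer in self.peers:
            if peer.alive:
                peer.rows.append(row)

    def kill(self):
        self.alive = False


def test_data_survives_replica_failure():
    server_a = FakeReplica("server_a")
    server_b = FakeReplica("server_b")
    server_a.peers.append(server_b)

    server_a.insert({"id": 1})  # insert data through Server A
    server_a.kill()             # simulate a crash of Server A

    # The row must still be readable from Server B.
    assert server_b.rows == [{"id": 1}]
```

A plain SQL file cannot express the "kill" step in the middle; a Python test can, because it controls the servers as well as the queries.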
The script integration_test_job.py is the manager. It doesn't write the tests; it organizes them. It prepares the Docker environment and tells Pytest which files to execute.
We have hundreds of integration tests. Running them one by one would take too long (10+ hours). Batching means we split the tests into groups (e.g., 5 groups).
This allows us to run everything in parallel.
The CI system (Praktika) invokes this script. It tells the script which "slice" of tests to run.
The script accepts arguments to know which batch of tests it is responsible for.
# integration_test_job.py (Simplified)
import argparse

def parse_args():
    parser = argparse.ArgumentParser()
    # Which "slice" of tests to run? (e.g., 1/5)
    parser.add_argument("--shard", type=int, default=1)
    parser.add_argument("--shards", type=int, default=1)
    # Where is the ClickHouse binary?
    parser.add_argument("--binary", required=True)
    return parser.parse_args()
Explanation:
--shard 1 and --shards 5 means: "I am worker #1 out of 5 total workers." The script uses this math to pick only 20% of the tests to run.
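The "math" can be as simple as a strided slice. The sketch below assumes a round-robin split, where worker k takes every N-th test starting at position k; the function name select_shard is made up for this example, and the real script may divide the list differently.

```python
def select_shard(tests, shard, shards):
    """Return the tests owned by worker `shard` (1-based) out of `shards` workers."""
    if not 1 <= shard <= shards:
        raise ValueError(f"shard must be between 1 and {shards}, got {shard}")
    # Worker 1 takes items 0, N, 2N, ...; worker 2 takes items 1, N+1, ...
    return tests[shard - 1::shards]


tests = [f"test_{i:02d}.py" for i in range(10)]
print(select_shard(tests, shard=1, shards=5))  # ['test_00.py', 'test_05.py']
```

With 10 tests and 5 workers, each worker gets 2 tests (20%), and every test belongs to exactly one worker.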
The script scans the directory tests/integration/ to find all available test files.
import os

def get_all_tests(test_dir):
    # Find all files starting with "test_" and ending with ".py"
    all_files = os.listdir(test_dir)
    tests = [f for f in all_files if f.startswith("test_") and f.endswith(".py")]
    # Sort them to ensure every runner sees the same list
    return sorted(tests)
Explanation: We look for files like test_replicated_merge_tree.py. Sorting is crucial: if Runner 1 and Runner 2 have different lists, they might skip a test or run it twice.
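A small experiment shows what goes wrong without sorting. os.listdir returns names in arbitrary order, so two machines can see the same files in different sequences. The sketch below assumes a round-robin split; the pick helper and the file names are hypothetical.

```python
def pick(tests, shard, shards):
    # Hypothetical round-robin split: worker `shard` (1-based) takes every
    # `shards`-th test, starting at its own offset.
    return tests[shard - 1::shards]


# The same four files, listed in a different order on each runner.
runner_1_listing = ["test_a.py", "test_b.py", "test_c.py", "test_d.py"]
runner_2_listing = ["test_b.py", "test_a.py", "test_d.py", "test_c.py"]

# Without sorting, both runners end up choosing test_a.py and test_c.py.
unsorted_run = pick(runner_1_listing, 1, 2) + pick(runner_2_listing, 2, 2)

# With sorting, the two slices cover every test exactly once.
sorted_run = pick(sorted(runner_1_listing), 1, 2) + pick(sorted(runner_2_listing), 2, 2)

print(sorted(unsorted_run))  # ['test_a.py', 'test_a.py', 'test_c.py', 'test_c.py']
print(sorted(sorted_run))    # ['test_a.py', 'test_b.py', 'test_c.py', 'test_d.py']
```

In the unsorted case test_b.py and test_d.py never run, yet every runner reports success, which is the kind of silent gap sorting prevents.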
When this script runs, it acts as a bridge between the Test Code and the Docker Environment.
pytest.
Here is the flow:
The core of the script is simply building a command line string to launch pytest.
import subprocess

def run_test(test_name):
    # Construct the command
    cmd = [
        "pytest",
        "-v",  # Verbose output
        f"tests/integration/{test_name}",
    ]
    print(f"Running {test_name}...")
    # Execute the command and wait for it to finish
    subprocess.check_call(cmd)
Explanation:
We use subprocess.check_call to run Pytest as if we had typed it in the terminal.
How does the test know which version of ClickHouse to use? The Job Script passes this information via environment variables.
import os

def set_environment(clickhouse_binary_path):
    # Tell the integration tests where the binary is
    os.environ['CLICKHOUSE_BINARY'] = clickhouse_binary_path
    # We can also pass the Docker Image tag here
    os.environ['CLICKHOUSE_IMAGE'] = "clickhouse/server:latest"
Explanation: When the Python test runs, it reads os.environ['CLICKHOUSE_BINARY']. This ensures that the test uses the exact binary we just built in Chapter 4, not some old version installed on the machine.
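This works because a child process inherits the environment of its parent. The sketch below demonstrates the hand-off with a tiny Python child process standing in for a real test run; the function name run_with_binary and the path are made up for illustration.

```python
import os
import subprocess
import sys


def run_with_binary(binary_path):
    # Parent side: publish the path, as the job script does.
    os.environ["CLICKHOUSE_BINARY"] = binary_path

    # Child side: a stand-in for a test process that reads the variable.
    child_code = "import os; print(os.environ['CLICKHOUSE_BINARY'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return result.stdout.strip()


# The path below is a hypothetical example, not a real build location.
print(run_with_binary("/tmp/build/clickhouse"))
```

Because no env argument is given to subprocess.run, the child sees everything in os.environ at launch time, including the variable set just before.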
The Integration Test Job Script is the "Site Manager" for our construction work.
In this chapter, we learned about the Integration Test Job Script.
Now that we have the "Manager" script ready to run our tests, we need to look at the tests themselves. What does an actual Integration Test look like? How do we write Python code to make a database crash on purpose?
In the next chapter, we will write our first real Integration Test.
Next Chapter: Integration Tests
Generated by Code IQ