TiDB to ClickHouse
BladePipe supports data replication from TiDB to ClickHouse. View supported migration, sync, verification, and connector capabilities.
| Function | Description |
|---|---|
| Schema Migration | If the target schema does not exist, BladePipe automatically generates and executes CREATE statements based on the source metadata and the mapping rule. |
| Full Data Migration | Migrates data by scanning the tables sequentially and writing the rows to the target database in batches. |
| Incremental Data Sync | Syncs common DML statements such as INSERT, UPDATE, and DELETE. |
| Data Verification | Verifies all existing data. Scheduled DataTasks are supported. |
| Subscription Modification | Add, delete, or modify the subscribed tables, with support for historical data migration. For more information, see Modify Subscription. |
| Position Resetting | Reset the position by timestamp to re-consume incremental data from a past period, provided that TiKV has not yet garbage-collected it. |
| Table Name Mapping | The following mapping rules are supported: keeping the name the same as in the source, converting it to lowercase, converting it to uppercase, and truncating the "_digit" suffix. |
| DDL Sync | |
| Metadata Retrieval | Retrieve the target metadata with filtering conditions or target primary keys set from the source table. |
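The four table name mapping rules listed above can be illustrated with a short sketch. The rule identifiers (`same`, `lower`, `upper`, `trim_digit_suffix`) and the function name are hypothetical, chosen only for this example, and the sketch assumes that a "_digit" suffix means an underscore followed by one or more digits at the end of the name. It does not show how BladePipe implements the mapping.

```python
import re


def map_table_name(name: str, rule: str) -> str:
    """Apply one illustrative mapping rule to a source table name."""
    if rule == "same":
        return name
    if rule == "lower":
        return name.lower()
    if rule == "upper":
        return name.upper()
    if rule == "trim_digit_suffix":
        # Assumption: the suffix is "_" plus digits at the very end of the name.
        return re.sub(r"_\d+$", "", name)
    raise ValueError(f"unknown mapping rule: {rule}")
```

Under these assumptions, `map_table_name("Orders_001", "trim_digit_suffix")` returns `"Orders"`, while a name without such a suffix is returned unchanged.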
Advanced Functions
| Function | Description |
|---|---|
| Write in Append Mode | INSERT and UPDATE statements are written in batches in append mode. DELETE statements are executed individually through ALTER statements. |
| Scheduled Table Optimization | Tables are optimized at a regular interval set by the parameter autoOptimizeThresholdSec. |
| Scheduled Full Data Migration | For more information, see Create Scheduled Full Data DataJob. |
| Custom Code | For more information, see Custom Code Processing, Debug Custom Code and Logging in Custom Code. |
| Data Filtering Conditions | Data can be filtered with WHERE conditions written in SQL-92. For more information, see Data Filtering. |
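To make the append-mode behavior more concrete, the sketch below turns a list of change events into statements: consecutive INSERT and UPDATE events become one batched INSERT, and each DELETE becomes its own `ALTER TABLE ... DELETE WHERE` statement, which is ClickHouse's mutation syntax. The event shape (`op`, `row`), the function names, and the simplified literal quoting are assumptions made for illustration. BladePipe's actual write path may differ.

```python
from typing import Any


def sql_literal(value: Any) -> str:
    """Render a number or string as a simple SQL literal (illustration only)."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"


def plan_statements(table: str, key: str, events: list[dict]) -> list[str]:
    """Batch INSERT/UPDATE rows into one INSERT; emit each DELETE separately."""
    statements: list[str] = []
    batch: list[dict] = []

    def flush() -> None:
        # Write all pending rows as a single multi-row INSERT.
        if not batch:
            return
        columns = list(batch[0])
        rows = ", ".join(
            "(" + ", ".join(sql_literal(row[c]) for c in columns) + ")"
            for row in batch
        )
        statements.append(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES {rows}"
        )
        batch.clear()

    for event in events:
        if event["op"] in ("INSERT", "UPDATE"):
            # In append mode an UPDATE is written as a new row version.
            batch.append(event["row"])
        elif event["op"] == "DELETE":
            flush()  # preserve ordering: pending rows go before the delete
            condition = f"{key} = {sql_literal(event['row'][key])}"
            statements.append(f"ALTER TABLE {table} DELETE WHERE {condition}")
    flush()
    return statements
```

Because every DELETE in this sketch ends the current batch and issues a separate mutation, a stream with many deletes produces many small statements, which is consistent with the DELETE rate limit listed under Limits.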
Limits
| Limit | Description |
|---|---|
| Special Operation | A high rate of DELETE operations (more than 50 records per second) significantly degrades data synchronization performance. |
| Target Table Engine | Only the following table engines and corresponding source table types are supported: |
Prerequisites (Source: TiDB)
| Prerequisite | Description |
|---|---|
| Permissions for Account | |
| Connection to PD Nodes | Make sure that BladePipe Workers can communicate with the PD nodes. |
| TiKV GC Frequency | Set the GC cycle to 24 hours or more in TiDB Server. |
| TiKV Historical Data Caching | Adjust the size based on task needs. |
Parameters (Source: TiDB)
| Parameter | Description |
|---|---|
| printDetailLog | Prints the incremental data received. Use it to check whether the source database is producing incremental data. |
| pdHost | PD node address for DataJob requests. Format: [PD_IP]:[PD_PORT]. Separate multiple PD nodes with commas (,). |
| cdcGrpcTimeout | Timeout, in ms, for the gRPC channel between the PD nodes and the DataJob. |
| cdcStubTimeout | Timeout, in ms, for each stub in the gRPC channel. A stub that times out is resubscribed automatically. |
| fastFailKeywords | A comma-separated list of keywords. When an exception message contains any of these keywords, the task skips reconnection attempts and restarts directly. For example, with DEADLINE_EXCEEDED, the task restarts directly instead of reconnecting when a gRPC timeout exception occurs. |
Tips: To modify the general parameters, see General Parameters and Functions.
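The formats of pdHost and fastFailKeywords can be illustrated with a small sketch. Both function names are hypothetical. The keyword check assumes a case-sensitive substring match, which the description above does not specify, and the port 2379 in the usage note is simply the default PD client port used as sample data.

```python
def parse_pd_hosts(pd_host: str) -> list[tuple[str, int]]:
    """Split a pdHost value of the form [PD_IP]:[PD_PORT],... into endpoints."""
    endpoints: list[tuple[str, int]] = []
    for item in pd_host.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected [PD_IP]:[PD_PORT], got {item!r}")
        endpoints.append((host, int(port)))
    return endpoints


def should_fast_fail(message: str, fast_fail_keywords: str) -> bool:
    """Return True if the exception message contains any configured keyword."""
    keywords = [k.strip() for k in fast_fail_keywords.split(",") if k.strip()]
    # Assumption: plain, case-sensitive substring matching.
    return any(keyword in message for keyword in keywords)
```

For example, `parse_pd_hosts("10.0.0.1:2379,10.0.0.2:2379")` yields two `(host, port)` pairs, and `should_fast_fail("DEADLINE_EXCEEDED: deadline exceeded", "DEADLINE_EXCEEDED")` is true, which in the terms of the table means restarting directly rather than reconnecting.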
Prerequisites (Target: ClickHouse)
| Prerequisite | Description |
|---|---|
| Permissions for Account | SELECT, INSERT and common DDL permissions. |
| Port Preparation | Allow the migration and sync node (Worker) to connect to the ClickHouse port (e.g., 8123). |
Parameters (Target: ClickHouse)
| Parameter | Description |
|---|---|
| multiReplica | Whether the cluster has multiple replicas. |
| clusterName | Cluster name. When multiReplica is true, the ON CLUSTER clusterName clause is automatically added to DDL/DML. |
| ckTableEngine | The following table engines are currently supported: |
| autoOptimizeThresholdSec | Interval, in seconds, of scheduled table optimization (optimize table final). A value less than or equal to 0 disables the feature. |
| enableTimeRangeClamping | Whether to enable time range clamping, which constrains date and time values to the valid ClickHouse JDBC range. Values outside this range are clamped to the minimum or maximum value. Disabled by default (false). Ranges after clamping (UTC): |
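As a rough illustration of how multiReplica, clusterName, and autoOptimizeThresholdSec could interact, the sketch below builds an `OPTIMIZE TABLE ... FINAL` statement and decides whether an optimization run is due. The function names and the timing check are hypothetical. Only the statement syntax comes from ClickHouse, and BladePipe's scheduler is not documented here.

```python
def build_optimize_statement(table: str, multi_replica: bool, cluster_name: str = "") -> str:
    """Build OPTIMIZE TABLE ... FINAL, adding ON CLUSTER when multiReplica is true."""
    parts = ["OPTIMIZE TABLE", table]
    if multi_replica:
        if not cluster_name:
            raise ValueError("clusterName is required when multiReplica is true")
        parts += ["ON CLUSTER", cluster_name]
    parts.append("FINAL")
    return " ".join(parts)


def optimize_due(last_run_sec: float, now_sec: float, threshold_sec: int) -> bool:
    """Return True when the configured interval has elapsed since the last run."""
    if threshold_sec <= 0:
        return False  # a value <= 0 disables scheduled optimization
    return now_sec - last_run_sec >= threshold_sec
```

With `multi_replica=True` and a cluster named `my_cluster` (a placeholder), the first function returns `OPTIMIZE TABLE db.orders ON CLUSTER my_cluster FINAL` for the table `db.orders`.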
Tips: To modify the general parameters, see General Parameters and Functions.
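The clamping that enableTimeRangeClamping describes can be sketched as follows. The bounds are placeholders: they are the range the ClickHouse documentation gives for the DateTime type, and they are not necessarily the ranges BladePipe applies, which are not reproduced in this section. The function and constant names are hypothetical.

```python
from datetime import datetime, timezone

# Placeholder bounds (UTC): the documented range of ClickHouse's DateTime type.
EXAMPLE_MIN = datetime(1970, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
EXAMPLE_MAX = datetime(2106, 2, 7, 6, 28, 15, tzinfo=timezone.utc)


def clamp_datetime(value: datetime, lower: datetime, upper: datetime) -> datetime:
    """Constrain value to [lower, upper]; out-of-range values snap to a bound."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return max(lower, min(value, upper))
```

A source value such as 1900-01-01 00:00:00 UTC would be written as `EXAMPLE_MIN` under these placeholder bounds, while an in-range value passes through unchanged.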