SAP HANA to Doris
BladePipe supports data replication from SAP HANA to Doris. View supported migration, sync, verification, and connector capabilities.
| Function | Description |
|---|---|
| Schema Migration | If the target schema does not exist, BladePipe automatically generates and executes CREATE statements based on the source metadata and the mapping rule. |
| Full Data Migration | Migrate data by sequentially scanning data in tables and writing it in batches to the target database. |
| Incremental Data Sync | Common DML statements such as INSERT, UPDATE, and DELETE are synced. |
| Data Verification and Correction | Verify all existing data. Optionally, you can correct the inconsistent data based on verification results. Scheduled DataTasks are supported. |
| Subscription Modification | Add, delete, or modify the subscribed tables with support for historical data migration. For more information, see Modify Subscription. |
| Position Resetting | Reset positions by data ID or timestamp to re-consume CDC data from a past period. |
| Table Name Mapping | Supported mapping rules: keeping the same name as in the Source, converting the name to lowercase, converting the name to uppercase, and truncating the "_digit" suffix from the name. |
| Metadata Retrieval | Retrieve the target metadata with filtering conditions or target primary keys set from the source table. |
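
The four table name mapping rules can be illustrated with a small sketch. The function name, the rule labels, and the reading of "_digit" as a single trailing underscore followed by digits are assumptions made for illustration; they are not BladePipe identifiers or its actual implementation.

```python
import re


def map_table_name(source_name: str, rule: str) -> str:
    """Apply one of four illustrative mapping rules to a source table name.

    The rule labels ("same", "lower", "upper", "strip_digit_suffix") are
    hypothetical names chosen for this sketch.
    """
    if rule == "same":
        return source_name
    if rule == "lower":
        return source_name.lower()
    if rule == "upper":
        return source_name.upper()
    if rule == "strip_digit_suffix":
        # Assumed reading of the "_digit" rule: drop one trailing "_<digits>" group.
        return re.sub(r"_\d+$", "", source_name)
    raise ValueError(f"unknown rule: {rule}")


if __name__ == "__main__":
    for rule in ("same", "lower", "upper", "strip_digit_suffix"):
        print(rule, "->", map_table_name("Orders_01", rule))
```

Under this reading, a name without a trailing "_digit" group is returned unchanged by the truncation rule.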
Advanced Functions
| Function | Description |
|---|---|
| Trigger-based Incremental Data Sync | The DataJob automatically creates triggers on tables. These triggers capture INSERT, UPDATE, and DELETE events and write them to the CDC tables. |
| Removal of Target Data before Full Data Migration | Remove the existing data in the Target before running the Full Data Migration. Applicable to DataJob reruns and scheduled Full Data Migrations. |
| Recreating Target Table | Recreate target tables before running the Full Data Migration. Applicable to DataJob reruns and scheduled Full Data Migrations. |
| Stream Load | Use Stream Load to write data to Doris/SelectDB BE. By default, batch write is adopted, with dynamic adjustment of data flush interval and batch size. |
| Handling of Zero Value for Time | Allow setting zero value for time to different data types to prevent errors when writing to the Target. |
| Custom Table Properties | Include settings for properties such as bucket count and replica count. |
| Setting Data Partitions | When creating a DataJob, specify partition definitions at the table level (static or dynamic). These partition definitions are added automatically during schema migration. |
| Scheduled Full Data Migration | For more information, see Create Scheduled Full Data DataJob. |
| Custom Code | For more information, see Custom Code Processing, Debug Custom Code, and Logging in Custom Code. |
| Data Filtering Conditions | Support data filtering using WHERE conditions, with SQL-92 as the SQL language. For more information, see Data Filtering. |
Limits
| Limit | Description |
|---|---|
| DDL Change Handling | BladePipe captures data changes in a source SAP HANA instance through triggers. DDL synchronization is not supported. If there are DDL changes, follow the steps in Change Schema in a Source SAP HANA Instance. |
| HANA Data Types in Incremental Sync | In the incremental data sync phase with a source HANA instance, triggers cannot capture changes for the TEXT, BIN_TEXT, ST_POINT, and ST_GEOMETRY data types. |
| Target Table Type | Only the Unique Key model (Unique) is supported. |
| Source Table Type | Migration and sync of tables without primary keys are not supported. |
| Data Type | Binary data types such as BINARY and BLOB are not supported. |
| Incremental Data Write Conflict Resolution Rule | With the Stream Load method, rows are matched by primary key and the full row is replaced. |
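
The conflict resolution rule in the last row can be pictured as a keyed overwrite: when an incoming row has the same primary key as an existing row, the whole row is replaced. The sketch below models only that semantics with an in-memory dictionary. It does not call Doris or BladePipe, and the function and column names are hypothetical.

```python
def apply_rows(target: dict, rows: list, key: str) -> dict:
    """Write rows into an in-memory 'table' keyed by primary key.

    A later row with the same key replaces the earlier row in full,
    which mirrors full-row replacement by primary key.
    """
    for row in rows:
        target[row[key]] = dict(row)  # copy so later edits to `row` do not leak in
    return target


if __name__ == "__main__":
    table = {}
    apply_rows(
        table,
        [
            {"id": 1, "name": "a", "qty": 5},
            {"id": 2, "name": "b", "qty": 1},
            {"id": 1, "name": "a2", "qty": 7},  # same key as the first row
        ],
        key="id",
    )
    print(table)
```

After the run, the table holds two rows, and the row with `id` 1 carries the values from the last write.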
Parameters
| Parameter | Description |
|---|---|
| sysTriggerDataSchema | The schema name where the trigger writes incremental data. |
| sysTriggerDataTable | The table name where the trigger writes incremental data. |
| incrPagingCount | The total amount of data queried each time by the trigger during incremental data synchronization. |
| incrIdleSleepSecond | The interval between queries for the trigger during idle period of incremental data synchronization (in seconds). |
| incrScanIntervalMs | The interval between data queries for the trigger during incremental data synchronization (in milliseconds). |
| autoCheckTriggerAndReInstall | Check the trigger status and reinstall it when the DataJob starts. |
| triggerDataCleanEnabled | Enable scheduled cleanup of trigger incremental data. |
| triggerDataCleanIntervalMin | The cleanup interval for trigger incremental data (in minutes). |
| triggerDataRetentionMin | The retention time for trigger incremental data (in minutes). |
| dbHeartbeatEnable | Configure whether to enable heartbeat for the source database. |
| needTriggerDataJsonEscape | Whether to escape characters (\) in the trigger incremental JSON. |
| triggerDataJsonQuotation | Custom quotation marks for trigger incremental JSON. |
| triggerParamBathSize | Set the number of columns involved per variable in the trigger template. |
| fullBeforeImageEnabled | Enable the trigger to record the complete data before all column changes. |
Tips: To modify the general parameters, see General Parameters and Functions.
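
The table does not spell out how `incrIdleSleepSecond` and `incrScanIntervalMs` interact. One plausible reading, sketched below, is that the longer idle interval applies after a query returns no rows and the shorter scan interval applies otherwise. The function name and the sample values are hypothetical, and the selection logic is an assumption rather than documented BladePipe behavior.

```python
def next_sleep_seconds(rows_fetched: int,
                       incr_idle_sleep_second: int,
                       incr_scan_interval_ms: int) -> float:
    """Pick the pause before the next query of the trigger data table.

    Assumption: an empty result means the source is idle, so the idle
    interval (seconds) is used; otherwise the scan interval (milliseconds).
    """
    if rows_fetched == 0:
        return float(incr_idle_sleep_second)
    return incr_scan_interval_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical settings: 3 s idle sleep, 200 ms scan interval.
    print(next_sleep_seconds(0, 3, 200))   # idle case
    print(next_sleep_seconds(50, 3, 200))  # busy case
```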
Prerequisites
| Prerequisite | Description |
|---|---|
| Permissions for Account | SELECT and DDL permissions (optional) |
| Port Preparation | Allow the migration and sync node (Worker) to connect to the Doris/SelectDB FE QueryPort and FE/BE HttpPort. |
Parameters
| Parameter | Description |
|---|---|
| host | MySQL port, corresponding to Doris/SelectDB FE QueryPort. |
| httpHost | Host for Doris Stream Load, corresponding to Doris/SelectDB FE/BE HttpPort. |
| totalDataInMemMb | Maximum data size allowed in memory when writing in batches. If the data size exceeds the memory limit, or the wait time exceeds asyncFlushIntervalSec, data is flushed to the write queue. |
| asyncFlushIntervalSec | Interval to wait for flushing when writing in batches. If the wait time exceeds asyncFlushIntervalSec, or the data size exceeds totalDataInMemMb, data is flushed to the write queue. |
| flushBatchMb | Maximum batch size per table. If the batch size exceeds this limit, data is flushed to the write queue. |
| realFlushPauseSec | Wait time to flush data to Doris/SelectDB using Stream Load. 0 means no wait is needed. |
| soTimeoutSec | TCP socket timeout (so_timeout) during QueryPort operations. |
| enableTimeZoneProcess | Enable time zone conversion for time fields. |
| timezone | Time zone in the Target, e.g., +08:00, Asia/Shanghai, or America/New_York. |
| maxInSizePerQuery | Maximum number of IN clause values per query during secondary verification. Queries exceeding this limit are automatically split. |
Tips: To modify the general parameters, see General Parameters and Functions.
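
The flush conditions described for `totalDataInMemMb`, `asyncFlushIntervalSec`, and `flushBatchMb` can be summarized as a simple predicate. The sketch below only restates the table's wording. The function name, the argument names, and the sample values are hypothetical, and the real writer may weigh these conditions differently.

```python
def should_flush(total_buffered_mb: float,
                 table_batch_mb: float,
                 waited_sec: float,
                 total_data_in_mem_mb: float,
                 async_flush_interval_sec: float,
                 flush_batch_mb: float) -> bool:
    """Return True when buffered data should move to the write queue.

    Three triggers, taken from the parameter descriptions:
      1. total in-memory data exceeds totalDataInMemMb
      2. wait time exceeds asyncFlushIntervalSec
      3. one table's batch exceeds flushBatchMb
    """
    memory_exceeded = total_buffered_mb > total_data_in_mem_mb
    waited_too_long = waited_sec > async_flush_interval_sec
    table_batch_full = table_batch_mb > flush_batch_mb
    return memory_exceeded or waited_too_long or table_batch_full


if __name__ == "__main__":
    # Hypothetical limits: 512 MB in memory, 10 s interval, 64 MB per table.
    limits = dict(total_data_in_mem_mb=512, async_flush_interval_sec=10, flush_batch_mb=64)
    print(should_flush(100, 10, 2, **limits))   # nothing exceeded
    print(should_flush(100, 10, 11, **limits))  # waited too long
    print(should_flush(100, 80, 2, **limits))   # one table's batch is full
```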