部署DM Ansible集群过程:https://github.com/pingcap/tidb-tools/blob/docs/docs/dm/zh_CN/maintenance/dm-ansible.md
DM架构以及原理:https://pingcap.com/blog-cn/tidb-ecosystem-tools-3/
DM用户文档:https://github.com/pingcap/tidb-tools/tree/docs/docs/dm/zh_CN
ps: 可用于mysql-mysql同步
DM 主要分为三个组件部分:
1. DM-master:管理整个 DM 集群,维护集群的拓扑信息,监控各个 DM-worker 实例的运行状态;进行数据同步任务的拆解与分发,监控数据同步任务的执行状态;在进行合库合表的增量数据同步时,协调各 DM-worker 上 DDL 的执行或跳过;提供数据同步任务管理的统一入口。
2. DM-worker:与上游 MySQL 实例一一对应,执行具体的全量、增量数据同步任务;将上游 MySQL 的 binlog 拉取到本地并持久化保存;根据定义的数据同步任务,将上游 MySQL 数据全量导出成 SQL 文件后导入到下游 TiDB,或解析本地持久化的 binlog 后增量同步到下游 TiDB;编排 DM-master 拆解后的数据同步子任务,监控子任务的运行状态。
3. DM-ctl:命令行交互工具,通过连接到 DM-master 后,执行 DM 集群的管理与数据同步任务的管理(task.yaml)
其中DM-master、DM-worker、DM-ctl 都存在自己得配置文件分别为DM-master.toml、DM-worker.toml、DM-ctl.toml,dmctl.toml集群部署完成,该三个就自动部署无需更改,dmctl.toml配置文件主要用于dmctl 访问master-addr进入交互命令行,进行任务task.yaml管理,task.yaml 任务文件需要按照自己得需求手动配置,再开启dm-worker任务
DM 工作流程
1. 根据task.yaml 确认任务模式 task-mode: all # full/incremental/all ,默认all (full + incremental),当前使用full模式
2、dmctl开启task,worker接受任务开始工作
3. full模式下,worker进行mydumper 全量导出,loader导入,并且将relay log持久化本地目录中,loader完成再进行relay log通过syncer 增量同步
当前dm ansibgle inventory.ini配置文件
[tidb@ip-172-16-30-86 dm-ansible-latest]$ cat inventory.ini
## DM modules
[dm_master_servers]
dm_master ansible_host=172.16.30.86
[dm_worker_servers]
dm-worker1 ansible_host=172.16.30.88 source_id="mysql-replica-01" server_id=101 mysql_host=172.16.30.88 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308
dm-worker2 ansible_host=172.16.30.89 source_id="mysql-replica-02" server_id=102 mysql_host=172.16.30.89 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308
## Monitoring modules
[prometheus_servers]
prometheus ansible_host=172.16.30.86 prometheus_port=9909
[grafana_servers]
grafana ansible_host=172.16.30.86 grafana_port=3099
[alertmanager_servers]
alertmanager ansible_host=172.16.30.86 alertmanager_port=8097
## Global variables
[all:vars]
cluster_name = dm-cluster-test
ansible_user = tidb
dm_version = latest
deploy_dir = /data/tidb/deploy_dm
grafana_admin_user = "admin"
grafana_admin_password = "admin"
[tidb@ip-172-16-30-86 dm-ansible-latest]$ cat inventory.ini
## DM modules
[dm_master_servers]
dm_master ansible_host=172.16.30.86
[dm_worker_servers]
dm-worker1 ansible_host=172.16.30.88 source_id="mysql-replica-01" server_id=101 mysql_host=172.16.30.88 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308
dm-worker2 ansible_host=172.16.30.89 source_id="mysql-replica-02" server_id=102 mysql_host=172.16.30.89 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308
## Monitoring modules
[prometheus_servers]
prometheus ansible_host=172.16.30.86 prometheus_port=9909
[grafana_servers]
grafana ansible_host=172.16.30.86 grafana_port=3099
[alertmanager_servers]
alertmanager ansible_host=172.16.30.86 alertmanager_port=8097
## Global variables
[all:vars]
cluster_name = dm-cluster-test
ansible_user = tidb
dm_version = latest
deploy_dir = /data/tidb/deploy_dm
grafana_admin_user = "admin"
grafana_admin_password = "admin"
dm相关操作:https://github.com/pingcap/tidb-tools/blob/docs/docs/dm/zh_CN/get-started.md
另外注意:task.yaml 需要使用相对路径,使用绝对路径无响应
COPY 并且编辑 {ansible deploy}/conf/task.yaml.example,生成如下任务配置文件 task.yaml 到dmctl 工具目录下
当前task.yaml配置如下:
两个mysql单实例同步到下游TiDB集群,实例分别为mysql-replica-01、mysql-replica-02(对应inventory.ini文件中得[dm_worker_servers]内得source_id),并且连个实例我并未使用filter、字段映射以及黑白名单(被注释),只使用了router 重命名库以及表
[tidb@ip-172-16-30-86 dmctl]$ cat task.yaml
---
name: mysql # global unique
task-mode: all # full/incremental/all
is-sharding: true # whether multi dm-worker do one sharding job
meta-schema: "dm_meta" # meta schema in downstreaming database to store meta informaton of dm
remove-meta: false # remove meta from downstreaming database, now we delete checkpoint and online ddl information
enable-heartbeat: false # whether to enable heartbeat for calculating lag between master and syncer
# timezone: "Asia/Shanghai" # target database timezone, all timestamp event in binlog will translate to format time based on this timezone, default use local timezone
target-database:
host: "172.16.30.86"
port: 5000
user: "root"
password: ""
mysql-instances: # one or more source database, config more source database for sharding merge
-
source-id: "mysql-replica-01" # unique in all instances, used as id when save checkpoints, configs, etc.
# binlog pos used to as start pos for syncer, for different task-mode, this maybe used or not
# `full` / `all`:
# never be used
# `incremental`:
# if `remove-meta` is true, this will be used
# else if checkpoints already exists in `meta-schema`, this will not be used
# otherwise, this will be used
meta:
binlog-name: mysql-bin.000001
binlog-pos: 4
route-rules: ["user-route-rules-schema-01", "user-route-rules-01"]
#filter-rules: ["user-filter-01"]
#column-mapping-rules: ["instance-01"]
#black-white-list: "instance-01"
# `mydumper-config-name` and `mydumper` should only set one
mydumper-config-name: "global" # ref `mydumpers` config
# mydumper:
# mydumper-path: "./mydumper"
# threads: 16
# `loader-config-name` and `loader` should only set one
loader-config-name: "global" # ref `loaders` config
#loader: # local loader rule
# pool-size: 32
# `syncer-config-name` and `syncer` should only set one
syncer-config-name: "global" # ref `syncers` config
#syncer:
# worker-count: 32
-
source-id: "mysql-replica-02"
meta:
binlog-name: mysql-bin.000001
binlog-pos: 320
route-rules: ["user-route-rules-schema-02", "user-route-rules-02"]
#filter-rules: ["user-filter-02"]
#column-mapping-rules: ["instance-02"]
#black-white-list: "instance-02"
mydumper:
mydumper-path: "./bin/mydumper"
threads: 4
chunk-filesize: 64
skip-tz-utc: true
extra-args: "-B singlemysql --no-locks"
loader: # local loader rule
pool-size: 16
dir: "./dumped_data" # must be unique between tasks for the same instance
syncer:
worker-count: 16
batch: 100
max-retry: 100
# other common configs shared by all instances
routes: # schema/table route mapping
user-route-rules-schema-01:
schema-pattern: "single*"
target-schema: "dmmysql"
user-route-rules-01:
schema-pattern: "single*"
table-pattern: "single"
target-schema: "dmmysql"
target-table: "dmsingle"
user-route-rules-schema-02:
schema-pattern: "single*"
target-schema: "dmtidb"
user-route-rules-02:
schema-pattern: "single*"
table-pattern: "single"
target-schema: "dmtidb"
target-table: "dmsingle"
filters: # filter rules, mysql instance can ref rules in it
user-filter-01:
schema-pattern: "test_*"
table-pattern: "t_*"
events: ["truncate table", "drop table"] # ignore truncate/drop table ddl
action: Ignore
user-filter-02:
schema-pattern: "test_*"
events: ["All DML"] # only do all DML events
action: Do
black-white-list:
instance-01:
do-dbs: ["~^single.*"]
ignore-dbs: ["mysql"]
do-tables:
- db-name: "~^single.*"
tbl-name: "?"
instance-02:
do-dbs: ["~^single.*"]
ignore-dbs: ["mysql"]
do-tables:
- db-name: "~^single.*"
tbl-name: "?"
column-mappings: # column mapping rules, mysql instance can ref rules in it
instance-01:
schema-pattern: "test_*"
table-pattern: "t_*"
expression: "partition id" # handle sharding partition id
source-column: "id"
target-column: "id"
arguments: ["1", "test_", "t_"]
instance-02:
schema-pattern: "test_*"
table-pattern: "t_*"
expression: "partition id" # handle sharding partition id
source-column: "id"
target-column: "id"
arguments: ["2", "test_", "t_"]
mydumpers: # mydumper process unit specific configs, mysql instance can ref one config in it
global:
mydumper-path: "./bin/mydumper"
threads: 4
chunk-filesize: 64
skip-tz-utc: true
extra-args: "-B singlemysql --no-locks"
loaders: # loader process unit specific configs, mysql instance can ref one config in it
global:
pool-size: 16
dir: "./dumped_data"
syncers: # syncer process unit specific configs, mysql instance can ref one config in it
global:
worker-count: 16
batch: 100
max-retry: 100
开启worker任务,可以查看dm-worker.log查看日志变化,可以清楚看到Mydumper、loader、syncer过程
[tidb@ip-172-16-30-86 dmctl]$ ./dmctl -config dmctl.toml
Welcome to dmctl
Release Version: v1.0.0-alpha-14-g05c70da
Git Commit Hash: 05c70da45d2439d1d54498f210b868a9ca26a303
Git Branch: master
UTC Build Time: 2019-02-22 08:29:27
Go Version: go version go1.11.2 linux/amd64
» start-task ./task.yaml
{
"result": true,
"msg": "",
"workers": [
{
"result": true,
"worker": "172.16.30.88:8262",
"msg": ""
},
{
"result": true,
"worker": "172.16.30.89:8262",
"msg": ""
}
]
}
»
验证TiDB是否正常
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| INFORMATION_SCHEMA |
| PERFORMANCE_SCHEMA |
| bike |
| bikeshare |
| dm_meta |
| dmmysql |
| dmtidb |
| marvin |
| mysql |
| test |
| tidb |
+--------------------+
11 rows in set (0.00 sec)
mysql> use dmmysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql>
mysql> show tables;
+-------------------+
| Tables_in_dmmysql |
+-------------------+
| dmsingle |
| mysql |
+-------------------+
2 rows in set (0.00 sec)
mysql> select * from dmsingle;
+----+--------+
| id | name |
+----+--------+
| 1 | MARVIN |
| 2 | MARVIN |
| 3 | TiKV |
+----+--------+
3 rows in set (0.00 sec)
mysql> use dmtidb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+------------------+
| Tables_in_dmtidb |
+------------------+
| dmsingle |
| marvin |
+------------------+
2 rows in set (0.00 sec)
mysql> select * from dmsingle;
+----+------+
| id | name |
+----+------+
| 1 | dd |
| 2 | dd |
+----+------+
2 rows in set (0.00 sec)
版权声明:本文为博主原创文章,未经博主允许不得转载。



