签到成功

知道了

CNDBA社区CNDBA社区

TiDB DM数据迁移多Source使用

2019-02-26 10:41 6062 0 原创 TiDB
作者: Marvinn

部署DM Ansible集群过程:https://github.com/pingcap/tidb-tools/blob/docs/docs/dm/zh_CN/maintenance/dm-ansible.md
DM架构以及原理:https://pingcap.com/blog-cn/tidb-ecosystem-tools-3/
DM用户文档:https://github.com/pingcap/tidb-tools/tree/docs/docs/dm/zh_CN
ps: 可用于mysql-mysql同步
DM 主要分为三个组件部分:

http://www.cndba.cn/Marvinn/article/3280
http://www.cndba.cn/Marvinn/article/3280

1. DM-master:管理整个 DM 集群,维护集群的拓扑信息,监控各个 DM-worker 实例的运行状态;进行数据同步任务的拆解与分发,监控数据同步任务的执行状态;在进行合库合表的增量数据同步时,协调各 DM-worker 上 DDL 的执行或跳过;提供数据同步任务管理的统一入口。

2. DM-worker:与上游 MySQL 实例一一对应,执行具体的全量、增量数据同步任务;将上游 MySQL 的 binlog 拉取到本地并持久化保存;根据定义的数据同步任务,将上游 MySQL 数据全量导出成 SQL 文件后导入到下游 TiDB,或解析本地持久化的 binlog 后增量同步到下游 TiDB;编排 DM-master 拆解后的数据同步子任务,监控子任务的运行状态。

3. DM-ctl:命令行交互工具,通过连接到 DM-master 后,执行 DM 集群的管理与数据同步任务的管理(task.yaml)

其中DM-master、DM-worker、DM-ctl 都存在自己得配置文件分别为DM-master.toml、DM-worker.toml、DM-ctl.toml,dmctl.toml集群部署完成,该三个就自动部署无需更改,dmctl.toml配置文件主要用于dmctl 访问master-addr进入交互命令行,进行任务task.yaml管理,task.yaml 任务文件需要按照自己得需求手动配置,再开启dm-worker任务

DM 工作流程

1. 根据task.yaml 确认任务模式 task-mode: all  # full/incremental/all ,默认all (full + incremental),当前使用full模式
2、dmctl开启task,worker接受任务开始工作
3. full模式下,worker进行mydumper 全量导出,loader导入,并且将relay log持久化本地目录中,loader完成再进行relay log通过syncer 增量同步

当前dm ansibgle inventory.ini配置文件

[tidb@ip-172-16-30-86 dm-ansible-latest]$ cat inventory.ini 
## DM modules
[dm_master_servers]
dm_master ansible_host=172.16.30.86

[dm_worker_servers]
dm-worker1 ansible_host=172.16.30.88 source_id="mysql-replica-01" server_id=101 mysql_host=172.16.30.88 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308
dm-worker2 ansible_host=172.16.30.89 source_id="mysql-replica-02" server_id=102 mysql_host=172.16.30.89 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308

## Monitoring modules
[prometheus_servers]
prometheus ansible_host=172.16.30.86 prometheus_port=9909

[grafana_servers]
grafana ansible_host=172.16.30.86 grafana_port=3099

[alertmanager_servers]
alertmanager ansible_host=172.16.30.86 alertmanager_port=8097

## Global variables
[all:vars]
cluster_name = dm-cluster-test

ansible_user = tidb

dm_version = latest

deploy_dir = /data/tidb/deploy_dm

grafana_admin_user = "admin"
grafana_admin_password = "admin"
[tidb@ip-172-16-30-86 dm-ansible-latest]$ cat inventory.ini 
## DM modules
[dm_master_servers]
dm_master ansible_host=172.16.30.86

[dm_worker_servers]
dm-worker1 ansible_host=172.16.30.88 source_id="mysql-replica-01" server_id=101 mysql_host=172.16.30.88 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308
dm-worker2 ansible_host=172.16.30.89 source_id="mysql-replica-02" server_id=102 mysql_host=172.16.30.89 mysql_user=root mysql_password=Wi43k7ZjH5zdR96BwzsC+UApPqmrxh0= mysql_port=3308

## Monitoring modules
[prometheus_servers]
prometheus ansible_host=172.16.30.86 prometheus_port=9909

[grafana_servers]
grafana ansible_host=172.16.30.86 grafana_port=3099

[alertmanager_servers]
alertmanager ansible_host=172.16.30.86 alertmanager_port=8097

## Global variables
[all:vars]
cluster_name = dm-cluster-test

ansible_user = tidb

dm_version = latest

deploy_dir = /data/tidb/deploy_dm

grafana_admin_user = "admin"
grafana_admin_password = "admin"

dm相关操作:https://github.com/pingcap/tidb-tools/blob/docs/docs/dm/zh_CN/get-started.mdhttp://www.cndba.cn/Marvinn/article/3280http://www.cndba.cn/Marvinn/article/3280http://www.cndba.cn/Marvinn/article/3280

http://www.cndba.cn/Marvinn/article/3280

另外注意:task.yaml 需要使用相对路径,使用绝对路径无响应
COPY 并且编辑 {ansible deploy}/conf/task.yaml.example,生成如下任务配置文件 task.yaml 到dmctl 工具目录下

当前task.yaml配置如下:
两个mysql单实例同步到下游TiDB集群,实例分别为mysql-replica-01、mysql-replica-02(对应inventory.ini文件中得[dm_worker_servers]内得source_id),并且连个实例我并未使用filter、字段映射以及黑白名单(被注释),只使用了router 重命名库以及表http://www.cndba.cn/Marvinn/article/3280

[tidb@ip-172-16-30-86 dmctl]$ cat task.yaml 
---
name: mysql # global unique
task-mode: all  # full/incremental/all
is-sharding: true  # whether multi dm-worker do one sharding job
meta-schema: "dm_meta"  # meta schema in downstreaming database to store meta informaton of dm
remove-meta: false  # remove meta from downstreaming database, now we delete checkpoint and online ddl information
enable-heartbeat: false  # whether to enable heartbeat for calculating lag between master and syncer
# timezone: "Asia/Shanghai" # target database timezone, all timestamp event in binlog will translate to format time based on this timezone, default use local timezone

target-database:
  host: "172.16.30.86"
  port: 5000
  user: "root"
  password: ""

mysql-instances:             # one or more source database, config more source database for sharding merge
  -
    source-id: "mysql-replica-01" # unique in all instances, used as id when save checkpoints, configs, etc.

    # binlog pos used to as start pos for syncer, for different task-mode, this maybe used or not
    # `full` / `all`:
    #    never be used
    # `incremental`:
    #    if `remove-meta` is true, this will be used
    #    else if checkpoints already exists in `meta-schema`, this will not be used
    #    otherwise, this will be used
    meta:
      binlog-name: mysql-bin.000001
      binlog-pos: 4
    route-rules: ["user-route-rules-schema-01", "user-route-rules-01"]
    #filter-rules: ["user-filter-01"]
    #column-mapping-rules: ["instance-01"]
    #black-white-list:  "instance-01"

    # `mydumper-config-name` and `mydumper` should only set one
    mydumper-config-name: "global"   # ref `mydumpers` config
#    mydumper:
#      mydumper-path: "./mydumper"
#      threads: 16

    # `loader-config-name` and `loader` should only set one
    loader-config-name: "global"    # ref `loaders` config
    #loader:                  # local loader rule
    #  pool-size: 32
    # `syncer-config-name` and `syncer` should only set one

    syncer-config-name: "global"    # ref `syncers` config
    #syncer:
    #  worker-count: 32

  -
    source-id: "mysql-replica-02"
    meta:
      binlog-name: mysql-bin.000001
      binlog-pos: 320
    route-rules: ["user-route-rules-schema-02", "user-route-rules-02"]
    #filter-rules: ["user-filter-02"]
    #column-mapping-rules: ["instance-02"]
    #black-white-list: "instance-02"

    mydumper:
      mydumper-path: "./bin/mydumper"
      threads: 4
      chunk-filesize: 64
      skip-tz-utc: true
      extra-args: "-B singlemysql --no-locks"

    loader:                  # local loader rule
      pool-size: 16
      dir: "./dumped_data"   # must be unique between tasks for the same instance

    syncer:
      worker-count: 16
      batch: 100
      max-retry: 100

# other common configs shared by all instances

routes:                      # schema/table route mapping
  user-route-rules-schema-01:
    schema-pattern: "single*"
    target-schema: "dmmysql"
  user-route-rules-01:
    schema-pattern: "single*"
    table-pattern: "single"
    target-schema: "dmmysql"
    target-table: "dmsingle"
  user-route-rules-schema-02:
    schema-pattern: "single*"
    target-schema: "dmtidb"
  user-route-rules-02:
    schema-pattern: "single*"
    table-pattern: "single"
    target-schema: "dmtidb"
    target-table: "dmsingle"


filters:                     # filter rules, mysql instance can ref rules in it
  user-filter-01:
    schema-pattern: "test_*"
    table-pattern: "t_*"
    events: ["truncate table", "drop table"]  # ignore truncate/drop table ddl
    action: Ignore
  user-filter-02:
    schema-pattern: "test_*"
    events: ["All DML"]             # only do all DML events
    action: Do

black-white-list:
  instance-01:
    do-dbs: ["~^single.*"]
    ignore-dbs: ["mysql"]
    do-tables:
    - db-name: "~^single.*"
      tbl-name: "?"

  instance-02:
    do-dbs: ["~^single.*"]
    ignore-dbs: ["mysql"]
    do-tables:
    - db-name: "~^single.*"
      tbl-name: "?"

column-mappings:             # column mapping rules, mysql instance can ref rules in it
  instance-01:
    schema-pattern: "test_*"
    table-pattern: "t_*"
    expression: "partition id"       # handle sharding partition id
    source-column: "id"
    target-column: "id"
    arguments: ["1", "test_", "t_"]
  instance-02:
    schema-pattern: "test_*"
    table-pattern: "t_*"
    expression: "partition id"       # handle sharding partition id
    source-column: "id"
    target-column: "id"
    arguments: ["2", "test_", "t_"]

mydumpers:                   # mydumper process unit specific configs, mysql instance can ref one config in it
  global:
    mydumper-path: "./bin/mydumper"
    threads: 4
    chunk-filesize: 64
    skip-tz-utc: true
    extra-args: "-B singlemysql --no-locks"

loaders:                     # loader process unit specific configs, mysql instance can ref one config in it
  global:
    pool-size: 16
    dir: "./dumped_data"

syncers:                     # syncer process unit specific configs, mysql instance can ref one config in it
  global:
    worker-count: 16
    batch: 100
    max-retry: 100

开启worker任务,可以查看dm-worker.log查看日志变化,可以清楚看到Mydumper、loader、syncer过程http://www.cndba.cn/Marvinn/article/3280http://www.cndba.cn/Marvinn/article/3280

[tidb@ip-172-16-30-86 dmctl]$ ./dmctl -config dmctl.toml 
Welcome to dmctl
Release Version: v1.0.0-alpha-14-g05c70da
Git Commit Hash: 05c70da45d2439d1d54498f210b868a9ca26a303
Git Branch: master
UTC Build Time: 2019-02-22 08:29:27
Go Version: go version go1.11.2 linux/amd64
» start-task ./task.yaml
{
    "result": true,
    "msg": "",
    "workers": [
        {
            "result": true,
            "worker": "172.16.30.88:8262",
            "msg": ""
        },
        {
            "result": true,
            "worker": "172.16.30.89:8262",
            "msg": ""
        }
    ]
}
»

验证TiDB是否正常

http://www.cndba.cn/Marvinn/article/3280

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| INFORMATION_SCHEMA |
| PERFORMANCE_SCHEMA |
| bike               |
| bikeshare          |
| dm_meta            |
| dmmysql            |
| dmtidb             |
| marvin             |
| mysql              |
| test               |
| tidb               |
+--------------------+
11 rows in set (0.00 sec)

mysql> use dmmysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> 
mysql> show tables;
+-------------------+
| Tables_in_dmmysql |
+-------------------+
| dmsingle          |
| mysql             |
+-------------------+
2 rows in set (0.00 sec)

mysql> select * from dmsingle;
+----+--------+
| id | name   |
+----+--------+
|  1 | MARVIN |
|  2 | MARVIN |
|  3 | TiKV   |
+----+--------+
3 rows in set (0.00 sec)
mysql> use dmtidb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+------------------+
| Tables_in_dmtidb |
+------------------+
| dmsingle         |
| marvin           |
+------------------+
2 rows in set (0.00 sec)

mysql> select * from dmsingle;
+----+------+
| id | name |
+----+------+
|  1 | dd   |
|  2 | dd   |
+----+------+
2 rows in set (0.00 sec)

版权声明:本文为博主原创文章,未经博主允许不得转载。

用户评论
* 以下用户言论只代表其个人观点,不代表CNDBA社区的观点或立场
Marvinn

Marvinn

关注

路漫漫其修远兮、吾将上下而求索

  • 99
    原创
  • 0
    翻译
  • 2
    转载
  • 36
    评论
  • 访问:458376次
  • 积分:449
  • 等级:中级会员
  • 排名:第12名
精华文章
    最新问题
    查看更多+
    热门文章
      热门用户
      推荐用户
        Copyright © 2016 All Rights Reserved. Powered by CNDBA · 皖ICP备2022006297号-1·

        QQ交流群

        注册联系QQ