1. Deploying the pb downstream
For this test, you only need to change the downstream in the existing drainer.toml from mysql to pb and restart drainer. The pump configuration stays unchanged: pump only captures the TiDB binlog and sorts it, so the defaults are fine. This is also why pump can be deployed with ansible, while drainer is deployed here from the binary.
Reference for deploying TiDB Binlog with ansible: https://github.com/pingcap/docs-cn/blob/e5fede0709c543e289bc29726404183bef956ad1/v1.0/tools/tidb-binlog.md
A drainer.toml template can be found in the conf/ directory of the extracted tidb-ansible installation directory. The drainer.toml used for pb incremental backup is as follows:
[tidb@ip-172-16-30-86 bin]$ cat drainer.toml
# drainer Configuration.
# Address drainer listens on (default "127.0.0.1:8249")
addr = "172.16.30.86:8249"
# Interval, in seconds, at which drainer queries PD for online pumps (default 10)
detect-interval = 10
# Path where drainer stores its data (default "data.drainer")
data-dir = "/data/tidb/deploy_tidb/data.drainer"
# Addresses of the PD cluster nodes (default "http://127.0.0.1:2379")
pd-urls = "http://172.16.30.86:2479,http://172.16.30.87:2479,http://172.16.30.88:2479"
# Log file path
log-file = "drainer.log"
# Syncer Configuration.
[syncer]
## Databases to ignore (default "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test");
## rename DDL on tables in ignored schemas is not supported
ignore-schemas = "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,bike,bikeshare,test,tidb"
# Number of SQL statements per transaction written to the downstream (default 1)
txn-batch = 1
# Number of concurrent workers syncing to the downstream; higher values give better throughput (default 1)
worker-count = 1
# Whether to disable splitting the SQLs of a single binlog; if true, each binlog is
# restored as one transaction and applied in order (set this to false when the downstream is mysql)
disable-dispatch = true
# Downstream service type for drainer (default "mysql")
# Valid values: "mysql", "pb"
db-type = "pb"
# replicate-do-db takes priority over replicate-do-table when they share a db name.
# Regular expressions are supported;
# start the value with '~' to declare a regular expression
replicate-do-db = ["marvin"]
#[[syncer.replicate-do-table]]
#db-name ="marvin"
#tbl-name = "log"
#[[syncer.replicate-do-table]]
#db-name ="test"
#tbl-name = "~^a.*"
# Downstream database server parameters when db-type is set to "mysql"
#[syncer.to]
#host = "172.16.30.89"
#user = "root"
#password = "123456"
#port = 3308
# Directory for the binlog files when db-type is set to "pb"
[syncer.to]
dir = "data.drainer"
pump/drainer configuration
The pump service deployed with ansible already has metrics configured in its startup parameters.
When drainer starts, the --metrics-addr and --metrics-interval parameters can be set: metrics-addr is the address of the Push Gateway, and metrics-interval is the push frequency in seconds (default 15).
Restart the drainer process:
$ nohup ./bin/drainer -config drainer.toml &
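If drainer should also push its metrics to the Prometheus Push Gateway, the two flags described above can be appended to the start command. A minimal sketch, assuming the Push Gateway listens on 172.16.30.86:9091 (that address is an assumption for this environment):
$ nohup ./bin/drainer -config drainer.toml \
    --metrics-addr=172.16.30.86:9091 --metrics-interval=15 &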
2. Generating incremental data to test pb
mysql> user marvin;
ERROR 1105 (HY000): line 1 column 4 near " marvin" (total length 11)
mysql> use marvin;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+------------------+
| Tables_in_marvin |
+------------------+
| t1 |
| t2 |
| tidb |
+------------------+
3 rows in set (0.00 sec)
mysql> select count(*) from t2;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.01 sec)
mysql> desc t2;
+-------+-------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+------+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| name | varchar(10) | YES | | NULL | |
+-------+-------------+------+------+---------+-------+
2 rows in set (0.00 sec)
mysql> insert into t2 values(1,'FF');
Query OK, 1 row affected (0.01 sec)
3. Parsing the incremental binlog with pbReader
Download the pb binlog parsing tool:
$ wget http://download.pingcap.org/pb_reader-latest-linux-amd64.tar.gz
$ tar -zxvf pb_reader-latest-linux-amd64.tar.gz
Check how many binlog files have been produced:
[tidb@ip-172-16-30-86 bin]$ ls -l ../../data.drainer/
total 4
-rw------- 1 tidb tidb 90 Mar 4 16:38 binlog-0000000000000000-20190304160822
Parse the binlog:
[tidb@ip-172-16-30-86 bin]$ ls
pbReader
[tidb@ip-172-16-30-86 bin]$ ./pbReader -binlog-file ../../data.drainer/binlog-0000000000000000-20190304160822
2019/03/04 16:58:22 file.go:142: [error] bad file name .lock
2019/03/04 16:58:22 file.go:130: [warning] ignored file .lock in wal
binlog type: DML; commit ts: 406765888237993986
schema: marvin; table: t2; type: Insert
id(int): 1
name(varchar): FF
You can see that this is exactly the row (1,'FF') inserted in the test above.
4. Recovering with Reparo
Download:
$ wget https://download.pingcap.org/reparo-latest-linux-amd64.tar.gz
$ tar -xvf reparo-latest-linux-amd64.tar.gz
Usage reference: https://www.pingcap.com/docs-cn/tools/reparo/
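The reparo.toml used below is not shown in this test, so here is a minimal sketch. The option names follow the Reparo docs linked above; the downstream address and credentials are taken from the commented [syncer.to] block in drainer.toml and from the verification step in section 5:
$ cat reparo.toml
# directory containing the pb binlog files written by drainer
data-dir = "data.drainer"
log-level = "debug"
# "mysql" applies the binlog to a downstream database, "print" only prints it
dest-type = "mysql"
txn-batch = 20
worker-count = 16
# optional recovery window: start-datetime / stop-datetime, or start-tso / stop-tso
[dest-db]
host = "172.16.30.89"
port = 3308
user = "root"
password = "123456"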
Start the recovery:
$ bin/reparo -config reparo.toml > reparo.log
Because the log level is debug, the exact SQL statements can be seen in the log; rows are applied in REPLACE mode.
reparo.log:
2019/03/04 17:41:20 db.go:232: [debug] [query][sql]SHOW COLUMNS FROM `marvin`.`t2`
2019/03/04 17:41:20 db.go:232: [debug] [query][sql]SHOW INDEX FROM `marvin`.`t2`
2019/03/04 17:41:20 mysql.go:173: [debug] id(int): 1
2019/03/04 17:41:20 mysql.go:173: [debug] name(varchar): FF
2019/03/04 17:41:20 mysql.go:52: [debug] insert sql REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?);, args [1 FF], keys [1]
2019/03/04 17:41:20 reparo.go:288: [debug] [reparo] add job takes 0.000026 seconds, is_ddl false, job &{binlogTp:1 sql:REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?); args:[1 FF] key:1}
2019/03/04 17:41:20 reparo.go:157: [debug] offset 90 ts 406765888237993986, datetime 2019-03-04 16:38:41 +0800 CST
2019/03/04 17:41:20 reparo.go:128: [info] read file data.drainer/binlog-0000000000000000-20190304160822 end
2019/03/04 17:41:20 mysql.go:46: [debug] [reparo] execute sql takes 0.001743 seconds, is_ddl false, length of sqls 1
2019/03/04 17:41:20 reparo.go:162: [info] [reparo] recovery is done, takes 0.104700 seconds
5. Verifying the recovered data
[tidb@ip-172-16-30-89 bin]$ mysql -uroot -p -h172.16.30.89 -P3308
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 423
Server version: 5.7.25-log MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use marvin;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from t2;
+----+------+
| id | name |
+----+------+
| 1 | FF |
+----+------+
1 row in set (0.00 sec)
6. Inserting on the source again and recovering again
Insert on the source side:
mysql> insert into t2 values(2,'FF');
Query OK, 1 row affected (0.00 sec)
A while after this insert, parsing binlog-0000000000000000-20190304160822 again shows the newly added record, i.e. both rows are in the same binlog file (see the pbReader check after the queries below), but the downstream has not been updated, so reparo is a one-shot recovery tool:
mysql> select * from t2;
+----+------+
| id | name |
+----+------+
| 1 | FF |
+----+------+
1 row in set (0.00 sec)
mysql> select * from t2;
+----+------+
| id | name |
+----+------+
| 1 | FF |
+----+------+
1 row in set (0.00 sec)
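Whether both rows really ended up in the same binlog file can be confirmed by re-running the pbReader command from step 3 (output not reproduced here; the second entry should appear with its own commit ts):
$ ./pbReader -binlog-file ../../data.drainer/binlog-0000000000000000-20190304160822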
Note: reparo is a one-shot recovery tool, i.e. according to the config file it restores all matching data in the directory in a single pass and then exits. After new rows are inserted on the source, the recovery has to be started again, and the rows recovered earlier are re-applied with REPLACE (the first row (1,'FF') had already been inserted at 17:41:20), as shown below:
2019/03/04 17:47:30 reparo.go:120: [info] seek to file data.drainer/binlog-0000000000000000-20190304160822 offset 0 got 0
2019/03/04 17:47:30 db.go:232: [debug] [query][sql]SHOW COLUMNS FROM `marvin`.`t2`
2019/03/04 17:47:30 db.go:232: [debug] [query][sql]SHOW INDEX FROM `marvin`.`t2`
2019/03/04 17:47:30 mysql.go:173: [debug] id(int): 1
2019/03/04 17:47:30 mysql.go:173: [debug] name(varchar): FF
2019/03/04 17:47:30 mysql.go:52: [debug] insert sql REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?);, args [1 FF], keys [1]
2019/03/04 17:47:30 reparo.go:288: [debug] [reparo] add job takes 0.000061 seconds, is_ddl false, job &{binlogTp:1 sql:REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?); args:[1 FF] key:1}
2019/03/04 17:47:30 reparo.go:157: [debug] offset 90 ts 406765888237993986, datetime 2019-03-04 16:38:41 +0800 CST
2019/03/04 17:47:30 mysql.go:173: [debug] id(int): 2
2019/03/04 17:47:30 mysql.go:173: [debug] name(varchar): FF
2019/03/04 17:47:30 mysql.go:52: [debug] insert sql REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?);, args [2 FF], keys [2]
2019/03/04 17:47:30 reparo.go:288: [debug] [reparo] add job takes 0.000004 seconds, is_ddl false, job &{binlogTp:1 sql:REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?); args:[2 FF] key:2}
2019/03/04 17:47:30 reparo.go:157: [debug] offset 180 ts 406766943669059586, datetime 2019-03-04 17:45:47 +0800 CST
2019/03/04 17:47:30 reparo.go:128: [info] read file data.drainer/binlog-0000000000000000-20190304160822 end
2019/03/04 17:47:30 mysql.go:46: [debug] [reparo] execute sql takes 0.002020 seconds, is_ddl false, length of sqls 1
2019/03/04 17:47:30 mysql.go:46: [debug] [reparo] execute sql takes 0.002212 seconds, is_ddl false, length of sqls 1
2019/03/04 17:47:30 reparo.go:162: [info] [reparo] recovery is done, takes 0.106354 seconds
Verify the downstream MySQL again:
mysql> select * from t2;
+----+------+
| id | name |
+----+------+
| 1 | FF |
| 2 | FF |
+----+------+
2 rows in set (0.00 sec)
Data insertion is complete.
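If a second run should not re-apply rows that were already recovered, the recovery window options in reparo.toml can be used to start after the first run's last commit. A minimal sketch (the datetime value is only illustrative):
# in reparo.toml: only replay binlog entries committed after this point
start-datetime = "2019-03-04 17:00:00"
# a TSO-based range via start-tso / stop-tso works as well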