TiDB binlog 增量备份pd、 pbReadder解析以及Reparo恢复 -- 中国DBA社区

1. 部署pb

测试过程只需要在原有drainer.toml配置文件下游为 Mysql 修改为 pb，重新启动, pump 配置文件不变，它只是负责捕获 TiDB binlog日志并将它排序.默认即可，这也是为什么 ansible可以手动部署pump，而drainer 需要二进制部署原因
ansible 部署 TiDB Binlog参考链接：https://github.com/pingcap/docs-cn/blob/e5fede0709c543e289bc29726404183bef956ad1/v1.0/tools/tidb-binlog.md
关于drainer.toml配置文件模板存在于tidb-ansilbe 安装解压目录下得配置文件目录 conf/ 中，对应pb 增量备份drainer.toml配置文件如下:

[tidb@ip-172-16-30-86 bin]$ cat drainer.toml 
# drainer Configuration.

# drainer 提供服务的地址(默认 "127.0.0.1:8249")
addr = "172.16.30.86:8249"

# 向 pd 查询在线 pump 的时间间隔 (默认 10，单位 秒)
detect-interval = 10

# drainer 数据存储位置路径 (默认 "data.drainer")
data-dir = "/data/tidb/deploy_tidb/data.drainer"

# pd 集群节点的地址 (默认 "http://127.0.0.1:2379")
pd-urls = "http://172.16.30.86:2479,http://172.16.30.87:2479,http://172.16.30.88:2479"

# log 文件路径
log-file = "drainer.log"

# Syncer Configuration.
[syncer]

## db 过滤列表 (默认 "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test"),
## 不支持对 ignore schemas 的 table 进行 rename DDL 操作
ignore-schemas = "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,bike,bikeshare,test,tidb"

# 输出到下游数据库一个事务的 sql 数量 (default 1)
txn-batch = 1

# 同步下游的并发数，该值设置越高同步的吞吐性能越好 (default 1)
worker-count = 1

# 是否禁用拆分单个 binlog 的 sqls 的功能，如果设置为 true，则按照每个 binlog
# 顺序依次还原成单个事务进行同步( 下游服务类型为 mysql, 该项设置为 False )
disable-dispatch = true 

# drainer 下游服务类型 (默认为 mysql)
# 参数有效值为 "mysql", "pb"
db-type = "pb"

# replicate-do-db priority over replicate-do-table if have same db name
# and we support regex expression ,
# 以 '~' 开始声明使用正则表达式
replicate-do-db = ["marvin"]

#[[syncer.replicate-do-table]]
#db-name ="marvin"
#tbl-name = "log"

#[[syncer.replicate-do-table]]
#db-name ="test"
#tbl-name = "~^a.*"

# db-type 设置为 mysql 时，下游数据库服务器参数
#[syncer.to]
#host = "172.16.30.89"
#user = "root"
#password = "123456"
#port = 3308

# db-type 设置为 pb 时,存放 binlog 文件的目录
[syncer.to]
dir = "data.drainer"
[tidb@ip-172-16-30-86 bin]$ cat drainer.toml 
# drainer Configuration.

# drainer 提供服务的地址(默认 "127.0.0.1:8249")
addr = "172.16.30.86:8249"

# 向 pd 查询在线 pump 的时间间隔 (默认 10，单位 秒)
detect-interval = 10

# drainer 数据存储位置路径 (默认 "data.drainer")
data-dir = "/data/tidb/deploy_tidb/data.drainer"

# pd 集群节点的地址 (默认 "http://127.0.0.1:2379")
pd-urls = "http://172.16.30.86:2479,http://172.16.30.87:2479,http://172.16.30.88:2479"

# log 文件路径
log-file = "drainer.log"

# Syncer Configuration.
[syncer]

## db 过滤列表 (默认 "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test"),
## 不支持对 ignore schemas 的 table 进行 rename DDL 操作
ignore-schemas = "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,bike,bikeshare,test,tidb"

# 输出到下游数据库一个事务的 sql 数量 (default 1)
txn-batch = 1

# 同步下游的并发数，该值设置越高同步的吞吐性能越好 (default 1)
worker-count = 1

# 是否禁用拆分单个 binlog 的 sqls 的功能，如果设置为 true，则按照每个 binlog
# 顺序依次还原成单个事务进行同步( 下游服务类型为 mysql, 该项设置为 False )
disable-dispatch = true 

# drainer 下游服务类型 (默认为 mysql)
# 参数有效值为 "mysql", "pb"
db-type = "pb"

# replicate-do-db priority over replicate-do-table if have same db name
# and we support regex expression ,
# 以 '~' 开始声明使用正则表达式
replicate-do-db = ["marvin"]

#[[syncer.replicate-do-table]]
#db-name ="marvin"
#tbl-name = "log"

#[[syncer.replicate-do-table]]
#db-name ="test"
#tbl-name = "~^a.*"

# db-type 设置为 mysql 时，下游数据库服务器参数
#[syncer.to]
#host = "172.16.30.89"
#user = "root"
#password = "123456"
#port = 3308

# db-type 设置为 pb 时,存放 binlog 文件的目录
[syncer.to]
dir = "data.drainer"

pump/drainer 配置
使用 ansible 部署的 pump 服务，已经在启动参数设置 metrics 。
drainer 启动时可以设置 —metrics-addr 和 —metrics-interval 两个参数，其中 metrics-addr 设为 Push Gateway 的地址，metrics-interval 为 push 的频率，单位为秒，默认值为15

重新启动drainer进程

$ nohup ./bin/drainer -config drainer.toml &

2. 产生增量，测试pb

mysql> user marvin;
ERROR 1105 (HY000): line 1 column 4 near " marvin" (total length 11)
mysql> use marvin;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+------------------+
| Tables_in_marvin |
+------------------+
| t1               |
| t2               |
| tidb             |
+------------------+
3 rows in set (0.00 sec)
mysql> select count(*) from t2;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.01 sec)

mysql> desc t2;
+-------+-------------+------+------+---------+-------+
| Field | Type        | Null | Key  | Default | Extra |
+-------+-------------+------+------+---------+-------+
| id    | int(11)     | NO   | PRI  | NULL    |       |
| name  | varchar(10) | YES  |      | NULL    |       |
+-------+-------------+------+------+---------+-------+
2 rows in set (0.00 sec)

mysql> insert into t2 values(1,'FF');
Query OK, 1 row affected (0.01 sec)

3. pbReadder解析增量binlog

解析pb binlog工具


$ wget http://download.pingcap.org/pb_reader-latest-linux-amd64.tar.gz
$ tar -zxvf pb_reader-latest-linux-amd64.tar.gz 


查看产生多少个binlog
[tidb@ip-172-16-30-86 bin]$ ls -l  ../../data.drainer/
total 4
-rw------- 1 tidb tidb 90 Mar  4 16:38 binlog-0000000000000000-20190304160822


解析binlog
[tidb@ip-172-16-30-86 bin]$ ls
pbReader
[tidb@ip-172-16-30-86 bin]$ ./pbReader -binlog-file ../../data.drainer/binlog-0000000000000000-20190304160822 
2019/03/04 16:58:22 file.go:142: [error] bad file name .lock
2019/03/04 16:58:22 file.go:130: [warning] ignored file .lock in wal
binlog type: DML; commit ts: 406765888237993986
schema: marvin; table: t2; type: Insert
id(int): 1 
name(varchar): FF 


可以看到当前正是之前测试插入的数据（1,'FF'）

4. Reparo恢复

下载链接
$ wget https://download.pingcap.org/reparo-latest-linux-amd64.tar.gz
$ tar -xvf reparo-latest-linux-amd64.tar.gz


使用参考文档:https://www.pingcap.com/docs-cn/tools/reparo/


启动恢复
$ bin/reparo -config reparo.toml > reparo.log


因为日志模式debug模式，可以查看到具体的SQL信息，采用的replace模式插入
reparo.log


2019/03/04 17:41:20 db.go:232: [debug] [query][sql]SHOW COLUMNS FROM `marvin`.`t2`
2019/03/04 17:41:20 db.go:232: [debug] [query][sql]SHOW INDEX FROM `marvin`.`t2`
2019/03/04 17:41:20 mysql.go:173: [debug] id(int): 1 

2019/03/04 17:41:20 mysql.go:173: [debug] name(varchar): FF 

2019/03/04 17:41:20 mysql.go:52: [debug] insert sql REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?);, args [1 FF], keys [1]
2019/03/04 17:41:20 reparo.go:288: [debug] [reparo] add job takes 0.000026 seconds, is_ddl false, job &{binlogTp:1 sql:REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?); args:[1 FF] key:1}
2019/03/04 17:41:20 reparo.go:157: [debug] offset 90 ts 406765888237993986, datetime 2019-03-04 16:38:41 +0800 CST
2019/03/04 17:41:20 reparo.go:128: [info] read file data.drainer/binlog-0000000000000000-20190304160822 end
2019/03/04 17:41:20 mysql.go:46: [debug] [reparo] execute sql takes 0.001743 seconds, is_ddl false, length of sqls 1
2019/03/04 17:41:20 reparo.go:162: [info] [reparo] recovery is done, takes 0.104700 seconds

5.验证数据的恢复

[tidb@ip-172-16-30-89 bin]$ mysql -uroot -p -h172.16.30.89 -P3308
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or /g.
Your MySQL connection id is 423
Server version: 5.7.25-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '/h' for help. Type '/c' to clear the current input statement.

mysql> use marvin;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from t2;
+----+------+
| id | name |
+----+------+
|  1 | FF   |
+----+------+
1 row in set (0.00 sec)

6. 源端再次插入，再次恢复

源端插入：
mysql> insert into t2 values(2,'FF');
Query OK, 1 row affected (0.00 sec)

插入一段时间后，查看解析binlog-0000000000000000-20190304160822 存在新添加的记录，即两条记录在同一个binlog,但是目标端并没有进行恢复，所以 reparo 工具属于一次性恢复
mysql> select * from t2;
+----+------+
| id | name |
+----+------+
|  1 | FF   |
+----+------+
1 row in set (0.00 sec)

mysql> select * from t2;
+----+------+
| id | name |
+----+------+
|  1 | FF   |
+----+------+
1 row in set (0.00 sec)



注意: reparo 工具属于一次性恢复，既按配置文件一次性恢复所在目录下所有符合条件的数据，然后结束进程，再次新插入，需要再次开启恢复，而且原恢复的会被replace再插入进行了重恢复(之前记录 17:41:20 插入第一条 1，ff 数据记录)，如下：
2019/03/04 17:47:30 reparo.go:120: [info] seek to file data.drainer/binlog-0000000000000000-20190304160822 offset 0 got 0
2019/03/04 17:47:30 db.go:232: [debug] [query][sql]SHOW COLUMNS FROM `marvin`.`t2`
2019/03/04 17:47:30 db.go:232: [debug] [query][sql]SHOW INDEX FROM `marvin`.`t2`
2019/03/04 17:47:30 mysql.go:173: [debug] id(int): 1 

2019/03/04 17:47:30 mysql.go:173: [debug] name(varchar): FF 

2019/03/04 17:47:30 mysql.go:52: [debug] insert sql REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?);, args [1 FF], keys [1]
2019/03/04 17:47:30 reparo.go:288: [debug] [reparo] add job takes 0.000061 seconds, is_ddl false, job &{binlogTp:1 sql:REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?); args:[1 FF] key:1}
2019/03/04 17:47:30 reparo.go:157: [debug] offset 90 ts 406765888237993986, datetime 2019-03-04 16:38:41 +0800 CST
2019/03/04 17:47:30 mysql.go:173: [debug] id(int): 2 

2019/03/04 17:47:30 mysql.go:173: [debug] name(varchar): FF 

2019/03/04 17:47:30 mysql.go:52: [debug] insert sql REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?);, args [2 FF], keys [2]
2019/03/04 17:47:30 reparo.go:288: [debug] [reparo] add job takes 0.000004 seconds, is_ddl false, job &{binlogTp:1 sql:REPLACE INTO `marvin`.`t2` (id,name) VALUES (?,?); args:[2 FF] key:2}
2019/03/04 17:47:30 reparo.go:157: [debug] offset 180 ts 406766943669059586, datetime 2019-03-04 17:45:47 +0800 CST
2019/03/04 17:47:30 reparo.go:128: [info] read file data.drainer/binlog-0000000000000000-20190304160822 end
2019/03/04 17:47:30 mysql.go:46: [debug] [reparo] execute sql takes 0.002020 seconds, is_ddl false, length of sqls 1
2019/03/04 17:47:30 mysql.go:46: [debug] [reparo] execute sql takes 0.002212 seconds, is_ddl false, length of sqls 1
2019/03/04 17:47:30 reparo.go:162: [info] [reparo] recovery is done, takes 0.106354 seconds


再次验证目标端Mysql端:
mysql> select * from t2;
+----+------+
| id | name |
+----+------+
|  1 | FF   |
|  2 | FF   |
+----+------+
2 rows in set (0.00 sec)

数据插入完成.

签到成功

CNDBA社区

TiDB binlog 增量备份pd、 pbReadder解析以及Reparo恢复

1. 部署pb

2. 产生增量，测试pb

3. pbReadder解析增量binlog

4. Reparo恢复

5.验证数据的恢复

6. 源端再次插入，再次恢复

Marvinn

QQ交流群

注册联系QQ