1 WAL日志说明
预写式日志WAL(Write Ahead Log,也称为Xlog)是实现事务日志的标准方法,对数据文件(表和索引的载体)持久化修改之前必须先持久化相应的日志。如果要修改数据文件,必须是在这些修改操作已经记录到日志文件之后才能进行修改,即在描述这些变化的日志记录刷新到永久存储器之后。在系统崩溃时,可以使用WAL日志对openGauss进行恢复操作。
openGauss的wal日志存储在pg_xlog目录下。
[dave@www.cndba.cn pg_xlog]$ pwd
/data/openGauss/install/data/dn/pg_xlog
[dave@www.cndba.cn pg_xlog]$ ll
total 278528
-rw------- 1 omm dbgrp 16777216 Feb 19 16:38 000000010000000000000001
-rw------- 1 omm dbgrp 16777216 Feb 21 05:43 000000010000000000000002
-rw------- 1 omm dbgrp 16777216 Feb 22 18:03 000000010000000000000003
-rw------- 1 omm dbgrp 16777216 Feb 23 23:08 000000010000000000000004
-rw------- 1 omm dbgrp 16777216 Feb 25 04:13 000000010000000000000005
-rw------- 1 omm dbgrp 16777216 Feb 26 16:34 000000010000000000000006
-rw------- 1 omm dbgrp 16777216 Feb 27 21:40 000000010000000000000007
-rw------- 1 omm dbgrp 16777216 Mar 1 02:44 000000010000000000000008
-rw------- 1 omm dbgrp 16777216 Mar 2 14:48 000000010000000000000009
-rw------- 1 omm dbgrp 16777216 Mar 3 20:11 00000001000000000000000A
-rw------- 1 omm dbgrp 16777216 Mar 5 01:16 00000001000000000000000B
-rw------- 1 omm dbgrp 16777216 Mar 6 12:39 00000001000000000000000C
-rw------- 1 omm dbgrp 16777216 Mar 7 18:42 00000001000000000000000D
-rw------- 1 omm dbgrp 16777216 Mar 8 23:47 00000001000000000000000E
-rw------- 1 omm dbgrp 16777216 Mar 10 11:43 00000001000000000000000F
-rw------- 1 omm dbgrp 16777216 Mar 11 23:12 000000010000000000000010
-rw------- 1 omm dbgrp 16777216 Mar 12 22:54 000000010000000000000011
drwx------ 2 omm dbgrp 10 Feb 19 16:37 archive_status
[dave@www.cndba.cn pg_xlog]$
日志文件以段文件的形式存储的,每个段为16MB,并分割成若干页,每页8KB。
对WAL日志的命名规则:000000010000000000000001
wal文件的名称由24个十六进制组成,分为三个部分,每个部分由8个十六进制字符组成。
第一部分表示时间线:00000001表示timeline;
第二部分表示日志文件标号:00000000表示logid;
第三部分表示日志文件的段标号:00000001表示logseg.
这些数字一般情况下是顺序增长使用的,但也存在循环使用的情况。WAL日志的内容取决于记录事务的类型,在系统崩溃时可以利用WAL日志进行恢复。默认配置下,openGauss每次启动时会先读取WAL日志进行恢复。
为了进一步的减少WAL log file的I/O操作,PostgreSQL中引入了WAL log buffer,对产生的WAL log record日志进行缓存,合并I/O操作。
WAL log buffer可以理解为一个环形的共享缓存,在每次写入新的日志记录时:
- 当WAL log buffer中有足够的空间,顺序写入到缓存区中
- 当WAL log buffer写到尾部且空间不足时,从头部刷出信息后重复利用,并将头部向后移动
WAL log buffer中的数据最终必须写入WAL log file中,这会在以下时机发生:
- WAL log buffer满
- WALWriter进程周期性工作
- 事务提交
- 创建Checkpoint
WAL log buffer写入WAL log file之后,可能在操作系统的page cache中缓存,并没有真正写入到文件中,所以还需要有flush机制,openGauss支持同步和异步两种方式:
- synchronous_commit(默认值为ON)为ON,则为同步方式,写入WAL log file之后会flush,然后才返回
- synchronous_commit为OFF,则为异步方式,写入WAL log file之后立即返回
2 归档配置
WAL 日志并不是一直保存,否则会占用大量的磁盘空间。 wal_keep_segments参数控制wal日志保留的个数,默认保留最近16个。 因此生产环境为了数据库恢复,需要开启归档模式,定时将WAL 日志复制到归档目录。
2.1 查看当前状态
[dave@www.cndba.cn ~]$ gsql -d web -p 15400
Password:
gsql ((openGauss 3.0.3 build 46134f73) compiled at 2023-01-10 22:42:07 commit 0 last mr )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
web=# show data_directory;
data_directory
---------------------------------
/data/openGauss/install/data/dn
(1 row)
web=# show archive_mode;
archive_mode
--------------
off
(1 row)
web=#
2.2 启用归档模式
修改opengauss配置文件中的如下3个参数:
web=# alter system set archive_mode='on';
ALTER SYSTEM SET
web=# alter system set wal_level='hot_standby';
NOTICE: please restart the database for the POSTMASTER level parameter to take effect.
ALTER SYSTEM SET
web=# alter system set archive_command='cp %p /data/openGauss/install/data/dn/archivelog/%f';
ALTER SYSTEM SET
重启openGauss,让参数修改生效。
[dave@www.cndba.cn dn]$ gs_om -t restart
Stopping cluster.
=========================================
Successfully stopped cluster.
=========================================
End stop cluster.
Starting cluster.
=========================================
[SUCCESS] hf-172-31-101-18
2023-03-13 14:04:48.471 640ebd00.1 [unknown] 140517756856000 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2023-03-13 14:04:48.471 640ebd00.1 [unknown] 140517756856000 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2023-03-13 14:04:48.472 640ebd00.1 [unknown] 140517756856000 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: No explicit IP is configured for listen_addresses GUC.
Password:
=========================================
Successfully started.
[dave@www.cndba.cn dn]$
web=# show archive_mode;
archive_mode
--------------
on
(1 row)
web=# show archive_command;
archive_command
-----------------------------------------------------
cp %p /data/openGauss/install/data/dn/archivelog/%f
(1 row)
web=#
开启归档后在归档路径下的archive_status目录里,会有一个类似xxx.ready和xxx.done的文件。
[dave@www.cndba.cn archive_status]$ pwd
/data/openGauss/install/data/dn/pg_xlog/archive_status
[dave@www.cndba.cn archive_status]$ ll
total 0
-rw------- 1 omm dbgrp 0 Mar 13 14:05 000000010000000000000001.ready
-rw------- 1 omm dbgrp 0 Mar 13 14:14 000000010000000000000002.ready
-rw------- 1 omm dbgrp 0 Mar 13 14:04 000000010000000000000012.ready
-rw------- 1 omm dbgrp 0 Mar 13 14:13 000000010000000000000013.ready
[dave@www.cndba.cn archive_status]$
.ready表示WAL已经写满,可以调用归档命令了。
.done表示已归档完成。
查看归档文件:
[dave@www.cndba.cn archivelog]$ pwd
/data/openGauss/install/data/dn/archivelog
[dave@www.cndba.cn archivelog]$ ll
total 196608
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000001
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000002
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000003
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000004
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000005
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000006
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000012
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000013
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000014
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000015
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000016
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000017
[dave@www.cndba.cn archivelog]$
3 WAL 日志删除
我们先查看参数:wal_keep_segments
web=# select name, setting,source from pg_settings where name like '%wal%';
name | setting | source
---------------------------------+-------------+--------------------
enable_wal_shipping_compression | off | default
max_wal_senders | 16 | configuration file
wal_block_size | 8192 | default
wal_buffers | 2048 | default
wal_file_init_num | 10 | default
wal_flush_delay | 1 | default
wal_flush_timeout | 2 | default
wal_keep_segments | 16 | configuration file
wal_level | hot_standby | configuration file
wal_log_hints | on | default
wal_receiver_buffer_size | 65536 | default
wal_receiver_connect_retries | 1 | default
wal_receiver_connect_timeout | 2 | default
wal_receiver_status_interval | 5 | default
wal_receiver_timeout | 6000 | default
wal_segment_size | 2048 | default
wal_sender_timeout | 6000 | default
wal_sync_method | fdatasync | default
wal_writer_delay | 200 | default
walsender_max_send_size | 8192 | configuration file
walwriter_cpu_bind | -1 | default
walwriter_sleep_threshold | 500 | default
(22 rows)
web=#
web=# select name, setting,source from pg_settings where name ='wal_keep_segments';
name | setting | source
-------------------+---------+--------------------
wal_keep_segments | 16 | configuration file
(1 row)
默认配置下,WAL 会保留16个wal 日志,在开启归档后,在wal转归档完成后,也会删除wal日志。
web=# show archive_command;
archive_command
-----------------------------------------------------
cp %p /data/openGauss/install/data/dn/archivelog/%f
(1 row)
archive_command命令会将WAL 复制到到归档目录,然后被清理掉。在openGauss的日志里会记录到如下信息:
LOG: attempting to remove WAL segments older than log file 000000010000000000000006
但如果启动了归档模式(archive_mode=on),但没有配置archive_command,或者改命令配置的不正确,那么那么WAL文件会一直堆积,因为archive_command命令不会运行,不会触发归档命令,所以一直都不会写.done,从而导致pg_wal文件堆积,占用大量得磁盘空间。
此时可以使用pg_archivecleanup命令来清理。
查看检查点以前的WAL文件
[dave@www.cndba.cn pg_xlog]$ pg_controldata /data/openGauss/install/data/dn
pg_control version number: 923
Catalog version number: 201611171
Database system identifier: 4095258363992716874
Database cluster state: in production
pg_control last modified: Mon 13 Mar 2023 03:17:08 PM CST
Latest checkpoint location: 0/18002700
Prior checkpoint location: 0/180025E0
Latest checkpoint's REDO location: 0/18002680
Latest checkpoint's TimeLineID: 1
Latest checkpoint's full_page_writes: off
Latest checkpoint's NextXID: 69064
Latest checkpoint's NextOID: 16474
Latest checkpoint's NextMultiXactId: 2
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 12663
Latest checkpoint's oldestXID's DB: 15481
Latest checkpoint's oldestActiveXID: 69064
Latest checkpoint's remove lsn: 0/8
Time of latest checkpoint: Mon 13 Mar 2023 03:17:08 PM CST
Minimum recovery ending location: 0/0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current max_connections setting: 5050
Current max_prepared_xacts setting: 800
Current max_locks_per_xact setting: 256
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Database system TimeLine: 9
[dave@www.cndba.cn pg_xlog]$
这里表示000000010000000000000018 之前的文件都可以删除,然后执行 pg_archivecleanup命令。
删除归档使用的是pg_archivecleanup工具,但是在该版本的openGauss目录中没有搜到该工具,所以从pg里复制了一个过来,注意复制的是pg 10版本,不过不影响使用:
[root@hf-172-31-101-18 ~]usr/pgsql-10/bin/pg_archivecleanup /data/openGauss/install/app/bin/
[root@hf-172-31-101-18 bin]# chown omm:dbgrp pg_archivecleanup
[dave@www.cndba.cn ~]$ pg_archivecleanup -d /data/openGauss/install/data/dn/pg_xlog 000000010000000000000018
pg_archivecleanup: keeping WAL file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000018" and later
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000007"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000008"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000009"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/00000001000000000000000A"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/00000001000000000000000B"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/00000001000000000000000C"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/00000001000000000000000D"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/00000001000000000000000E"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/00000001000000000000000F"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000010"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000011"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000012"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000013"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000014"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000015"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000016"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/pg_xlog/000000010000000000000017"
[dave@www.cndba.cn ~]$
4 归档文件删除
4.1 通过脚本配置归档定期删除
archive_command : 可以传入一个shell命令或者一个复杂的shell脚本。
%p : 表示将要归档的wal文件包含完整路径的信息的文件名(就是需要归档的临时文件)
%f : 代表不包含路径信息的wal文件的文件名
%% : 表示%
当启用归档后,会生成大量得归档文件,如果没有及时删除,会导致磁盘空间撑满。 所以我们可以在archive_command目录中,传入脚本,在归档得同时,删除之前得脚本:
创建shell脚本:
[dave@www.cndba.cn dn]$ cat arch.sh
#!/bin/sh
test ! -f /data/openGauss/install/data/dn/archivelog/$1 && cp --preserve=timestamps $2 /data/openGauss/install/data/dn/archivelog/$1 ; find /data/openGauss/install/data/dn/archivelog/ -type f -mtime +7 -exec rm -f {} /;
[dave@www.cndba.cn dn]$ chmod a+x arch.sh
[dave@www.cndba.cn dn]$
修改archive_command参数:
web=# alter system set archive_command = '/data/openGauss/install/data/dn/arch.sh %f %p';
ALTER SYSTEM SET
切换归档:
web=# select pg_switch_xlog();
pg_switch_xlog
----------------
0/180139E0
(1 row)
web=#
此时就直接生成了:
[dave@www.cndba.cn archivelog]$ pwd
/data/openGauss/install/data/dn/archivelog
[dave@www.cndba.cn archivelog]$ ll
total 16384
-rw------- 1 omm dbgrp 16777216 Mar 13 19:20 000000010000000000000018
[dave@www.cndba.cn archivelog]$
4.2 手工删除归档文件
web=# select * from pg_ls_waldir() order by modification asc;
name | size | modification
--------------------------+----------+------------------------
00000001000000000000001F | 16777216 | 2023-02-19 16:38:09+08
000000010000000000000020 | 16777216 | 2023-02-21 05:43:26+08
000000010000000000000021 | 16777216 | 2023-02-22 18:03:57+08
000000010000000000000022 | 16777216 | 2023-02-23 23:08:48+08
000000010000000000000023 | 16777216 | 2023-02-25 04:13:32+08
000000010000000000000024 | 16777216 | 2023-02-26 16:34:54+08
00000001000000000000001E | 16777216 | 2023-03-13 14:13:32+08
000000010000000000000019 | 16777216 | 2023-03-13 14:13:32+08
00000001000000000000001A | 16777216 | 2023-03-13 14:13:32+08
00000001000000000000001B | 16777216 | 2023-03-13 14:13:32+08
00000001000000000000001C | 16777216 | 2023-03-13 14:13:32+08
00000001000000000000001D | 16777216 | 2023-03-13 14:13:32+08
000000010000000000000018 | 16777216 | 2023-03-13 18:49:17+08
(13 rows)
[root@db1 pg_wal]# pwd
/var/lib/pgsql/14/data/pg_wal
[root@db1 pg_wal]# ll
total 81920
-rw-------. 1 postgres postgres 16777216 Aug 13 01:24 000000010000000000000008
-rw-------. 1 postgres postgres 16777216 Aug 13 01:01 000000010000000000000009
-rw-------. 1 postgres postgres 16777216 Aug 13 01:01 00000001000000000000000A
-rw-------. 1 postgres postgres 16777216 Aug 13 01:19 00000001000000000000000B
-rw-------. 1 postgres postgres 16777216 Aug 13 01:21 00000001000000000000000C
drwx------. 2 postgres postgres 6 Aug 13 01:24 archive_status
[root@db1 pg_wal]#
查看检查点以前的WAL文件
[dave@www.cndba.cn bin]$ ps -ef|grep openGauss
omm 17111 15284 0 18:50 pts/7 00:00:00 grep --color=auto openGauss
omm 36144 1 6 14:04 ? 00:19:15 /data/openGauss/install/app/bin/gaussdb -D /data/openGauss/install/data/dn
[dave@www.cndba.cn bin]$ pg_controldata /data/openGauss/install/data/dn
pg_control version number: 923
Catalog version number: 201611171
Database system identifier: 4095258363992716874
Database cluster state: in production
pg_control last modified: Mon 13 Mar 2023 06:50:18 PM CST
Latest checkpoint location: 0/18011748
Prior checkpoint location: 0/18011628
Latest checkpoint's REDO location: 0/180116C8
Latest checkpoint's TimeLineID: 1
Latest checkpoint's full_page_writes: off
Latest checkpoint's NextXID: 69064
Latest checkpoint's NextOID: 16474
Latest checkpoint's NextMultiXactId: 2
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 12663
Latest checkpoint's oldestXID's DB: 15481
Latest checkpoint's oldestActiveXID: 69064
Latest checkpoint's remove lsn: 0/8
……
这里表示000000010000000000000018 之前得文件都可以删除,在系统层面删除这些归档文件即可,可以手工执行rm 也可以通过pg_archivecleanup命令进行:
[dave@www.cndba.cn ~]$ cd /data/openGauss/install/data/dn/archivelog/
[dave@www.cndba.cn archivelog]$ ll
total 196608
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000001
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000002
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000003
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000004
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000005
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000006
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000012
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000013
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000014
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000015
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000016
-rw------- 1 omm dbgrp 16777216 Mar 13 15:01 000000010000000000000017
[dave@www.cndba.cn archivelog]$ pg_archivecleanup -d /data/openGauss/install/data/dn/archivelog/ 000000010000000000000018
pg_archivecleanup: keeping WAL file "/data/openGauss/install/data/dn/archivelog//000000010000000000000018" and later
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000001"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000002"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000003"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000004"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000005"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000006"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000012"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000013"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000014"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000015"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000016"
pg_archivecleanup: removing file "/data/openGauss/install/data/dn/archivelog//000000010000000000000017"
[dave@www.cndba.cn archivelog]$ ll
total 0
[dave@www.cndba.cn archivelog]$
版权声明:本文为博主原创文章,未经博主允许不得转载。