1 Symptom
The following alert appeared in OCP monitoring:
alarm_template_id=0:ob_cluster=myoceanbase-1:host=10.105.208.23:server_type=observer:error_code=4264:keyword= OBServer program log
Alert details: [OBServer program log] cluster: myoceanbase, host: 10.105.208.23, log type: observer, log file: /home/admin/myoceanbase/oceanbase/log/observer.log, log level: error, keyword=, error code=4264, TraceId=Y0-0000000000000000-0-0, log details=[2023-06-26 09:52:19.865894] ERROR try_recycle_blocks (palf_env_impl.cpp:692) [70492][T1001_PalfGC][T1001][Y0-0000000000000000-0-0] [lt=16][errcode=-4264] Log out of disk space(msg="log disk space is almost full", ret=-4264, total_size(MB)=716, used_size(MB)=607, used_percent(%)=84, warn_size(MB)=573, warn_percent(%)=80, limit_size(MB)=680, limit_percent(%)=95, maximum_used_size(MB)=607, maximum_log_stream=1, oldest_log_stream=1, oldest_scn={val:1687726789591430060}).
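The message shows that the log disk of tenant T1001 is almost full: 607 MB of 716 MB (84%) is in use, which is above the 80% warning threshold of 573 MB but still below the 95% limit of 680 MB.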
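As a quick sanity check on the numbers in the alert, the sketch below parses the size fields and recomputes the usage percentage. The field list is copied from the log line above; the names parse_fields and percent_of are made up for this illustration and are not part of OceanBase or OCP.

```python
import re

# Field list copied from the alert above (values in MB or percent).
LOG_FIELDS = (
    'total_size(MB)=716, used_size(MB)=607, used_percent(%)=84, '
    'warn_size(MB)=573, warn_percent(%)=80, '
    'limit_size(MB)=680, limit_percent(%)=95'
)


def parse_fields(text):
    """Return {name: int} for every `name(unit)=number` pair in text."""
    pairs = re.findall(r'(\w+)\((?:MB|%)\)=(\d+)', text)
    return {name: int(value) for name, value in pairs}


def percent_of(part, total):
    """Integer percentage, rounded down."""
    return part * 100 // total


fields = parse_fields(LOG_FIELDS)
used_pct = percent_of(fields['used_size'], fields['total_size'])
over_warn = fields['used_size'] > fields['warn_size']
over_limit = fields['used_size'] > fields['limit_size']
print(used_pct, over_warn, over_limit)  # 84 True False
```

The recomputed 84% matches used_percent in the log, and the two flags confirm that the tenant has crossed the warning threshold but not yet the hard limit.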
The tenant list shows that T1001 is the meta tenant of the ocp tenant (META$1002, where 1002 is the tenant_id of ocp):
obclient [oceanbase]> SELECT tenant_id,TENANT_NAME ,TENANT_TYPE FROM DBA_OB_TENANTS;
+-----------+-------------+-------------+
| tenant_id | TENANT_NAME | TENANT_TYPE |
+-----------+-------------+-------------+
|         1 | sys         | SYS         |
|      1001 | META$1002   | META        |
|      1002 | ocp         | USER        |
|      1011 | META$1012   | META        |
|      1012 | tpcc_mysql  | USER        |
+-----------+-------------+-------------+
5 rows in set (0.004 sec)
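2 Analysis
Note that the limit being hit is not the cluster-level log_disk_size parameter but the LOG_DISK_SIZE of the tenant's resource unit. The cluster-level parameter is 971G on every node, far above the 716 MB reported in the alert: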
obclient [(none)]> SHOW PARAMETERS LIKE '%LOG_DISK_SIZE%';
+-------+----------+---------------+----------+---------------+-----------+-------+----------------------------------------------------------------+------------+---------+---------+-------------------+
| zone | svr_type | svr_ip | svr_port | name | data_type | value | info | section | scope | source | edit_level |
+-------+----------+---------------+----------+---------------+-----------+-------+----------------------------------------------------------------+------------+---------+---------+-------------------+
| zone3 | observer | 10.105.208.23 | 2882 | log_disk_size | NULL | 971G | the size of disk space used by the log files. Range: [0, +∞) | LOGSERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| zone1 | observer | 10.105.208.21 | 2882 | log_disk_size | NULL | 971G | the size of disk space used by the log files. Range: [0, +∞) | LOGSERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| zone2 | observer | 10.105.208.22 | 2882 | log_disk_size | NULL | 971G | the size of disk space used by the log files. Range: [0, +∞) | LOGSERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
+-------+----------+---------------+----------+---------------+-----------+-------+----------------------------------------------------------------+------------+---------+---------+-------------------+
3 rows in set (0.015 sec)
Check the resource unit of each tenant, including its LOG_DISK_SIZE (in GB):
select a.tenant_id,b.tenant_name,a.UNIT_CONFIG_ID,c.name,a.MAX_CPU,
a.MIN_CPU,CAST(a.MEMORY_SIZE/1024/1024/1024 as DECIMAL(15,2)) MEMORY_SIZE,
CAST(a.LOG_DISK_SIZE/1024/1024/1024 as DECIMAL(15,2)) LOG_DISK_SIZE,
a.MAX_IOPS,a.MIN_IOPS,a.IOPS_WEIGHT from dba_ob_units a,
DBA_OB_TENANTS b, DBA_OB_UNIT_CONFIGS c
where a.tenant_id=b.tenant_id
and a.unit_config_id = c.unit_config_id;
obclient [oceanbase]> select a.tenant_id,b.tenant_name,a.UNIT_CONFIG_ID,c.name,a.MAX_CPU,
-> a.MIN_CPU,CAST(a.MEMORY_SIZE/1024/1024/1024 as DECIMAL(15,2)) MEMORY_SIZE,
-> CAST(a.LOG_DISK_SIZE/1024/1024/1024 as DECIMAL(15,2)) LOG_DISK_SIZE,
-> a.MAX_IOPS,a.MIN_IOPS,a.IOPS_WEIGHT from dba_ob_units a,
-> DBA_OB_TENANTS b, DBA_OB_UNIT_CONFIGS c
-> where a.tenant_id=b.tenant_id
-> and a.unit_config_id = c.unit_config_id;
+-----------+-------------+----------------+-----------------+---------+---------+-------------+---------------+----------+----------+-------------+
| tenant_id | tenant_name | UNIT_CONFIG_ID | name            | MAX_CPU | MIN_CPU | MEMORY_SIZE | LOG_DISK_SIZE | MAX_IOPS | MIN_IOPS | IOPS_WEIGHT |
+-----------+-------------+----------------+-----------------+---------+---------+-------------+---------------+----------+----------+-------------+
|         1 | sys         |              1 | sys_unit_config |       1 |       1 |       16.00 |         16.00 |    10000 |    10000 |           1 |
|         1 | sys         |              1 | sys_unit_config |       1 |       1 |       16.00 |         16.00 |    10000 |    10000 |           1 |
|         1 | sys         |              1 | sys_unit_config |       1 |       1 |       16.00 |         16.00 |    10000 |    10000 |           1 |
|      1002 | ocp         |           1001 | ocp_unit        |       1 |       1 |        4.00 |          7.00 |    10000 |    10000 |           1 |
|      1002 | ocp         |           1001 | ocp_unit        |       1 |       1 |        4.00 |          7.00 |    10000 |    10000 |           1 |
|      1002 | ocp         |           1001 | ocp_unit        |       1 |       1 |        4.00 |          7.00 |    10000 |    10000 |           1 |
|      1012 | tpcc_mysql  |           1008 | RU50C100G300G   |      50 |      50 |      100.00 |        300.00 |   500000 |   500000 |          50 |
|      1012 | tpcc_mysql  |           1008 | RU50C100G300G   |      50 |      50 |      100.00 |        300.00 |   500000 |   500000 |          50 |
|      1012 | tpcc_mysql  |           1008 | RU50C100G300G   |      50 |      50 |      100.00 |        300.00 |   500000 |   500000 |          50 |
+-----------+-------------+----------------+-----------------+---------+---------+-------------+---------------+----------+----------+-------------+
9 rows in set (0.009 sec)
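The unit table lists only the user tenant ocp, while the alert is for its meta tenant T1001, so it helps to check how the 716 MB in the alert relates to the 7 GB of ocp_unit. The sketch below assumes that the meta tenant is given 10% of the unit's LOG_DISK_SIZE. That share is my assumption and is not stated anywhere in the output above; the 80% and 95% thresholds are taken from the alert. The function name meta_log_disk_mb is made up for this illustration.

```python
GIB_IN_MB = 1024

# Assumption (not stated in the post): the meta tenant gets 10% of the
# unit's LOG_DISK_SIZE. The percentages 80 and 95 come from the alert.
META_SHARE_PCT = 10
WARN_PCT = 80
LIMIT_PCT = 95


def meta_log_disk_mb(unit_log_disk_gb):
    """Log disk (MB, as a float) assumed to go to the meta tenant."""
    return unit_log_disk_gb * GIB_IN_MB * META_SHARE_PCT / 100


total = meta_log_disk_mb(7)        # ocp_unit has LOG_DISK_SIZE = 7G
warn = total * WARN_PCT / 100
limit = total * LIMIT_PCT / 100

# Round down to whole MB for comparison with the alert values.
print(int(total), int(warn), int(limit))   # 716 573 680
```

Under this assumption the three values come out as 716, 573 and 680 MB, the same as total_size, warn_size and limit_size in the alert. That agreement suggests the meta tenant's log disk is derived from ocp_unit, which is why the fix targets that unit rather than a separate setting for T1001.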
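3 Solution
Connect to the sys tenant and increase the LOG_DISK_SIZE of ocp_unit, the resource unit of the ocp tenant, then re-run the query to verify the new value: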
obclient [oceanbase]> ALTER RESOURCE UNIT ocp_unit LOG_DISK_SIZE '12G';
Query OK, 0 rows affected (0.002 sec)
obclient [oceanbase]> select a.tenant_id,b.tenant_name,a.UNIT_CONFIG_ID,c.name,a.MAX_CPU,
-> a.MIN_CPU,CAST(a.MEMORY_SIZE/1024/1024/1024 as DECIMAL(15,2)) MEMORY_SIZE,
-> CAST(a.LOG_DISK_SIZE/1024/1024/1024 as DECIMAL(15,2)) LOG_DISK_SIZE,
-> a.MAX_IOPS,a.MIN_IOPS,a.IOPS_WEIGHT from dba_ob_units a,
-> DBA_OB_TENANTS b, DBA_OB_UNIT_CONFIGS c
-> where a.tenant_id=b.tenant_id
-> and a.unit_config_id = c.unit_config_id;
+-----------+-------------+----------------+-----------------+---------+---------+-------------+---------------+----------+----------+-------------+
| tenant_id | tenant_name | UNIT_CONFIG_ID | name            | MAX_CPU | MIN_CPU | MEMORY_SIZE | LOG_DISK_SIZE | MAX_IOPS | MIN_IOPS | IOPS_WEIGHT |
+-----------+-------------+----------------+-----------------+---------+---------+-------------+---------------+----------+----------+-------------+
|         1 | sys         |              1 | sys_unit_config |       1 |       1 |       16.00 |         16.00 |    10000 |    10000 |           1 |
|         1 | sys         |              1 | sys_unit_config |       1 |       1 |       16.00 |         16.00 |    10000 |    10000 |           1 |
|         1 | sys         |              1 | sys_unit_config |       1 |       1 |       16.00 |         16.00 |    10000 |    10000 |           1 |
|      1002 | ocp         |           1001 | ocp_unit        |       1 |       1 |        4.00 |         12.00 |    10000 |    10000 |           1 |
|      1002 | ocp         |           1001 | ocp_unit        |       1 |       1 |        4.00 |         12.00 |    10000 |    10000 |           1 |
|      1002 | ocp         |           1001 | ocp_unit        |       1 |       1 |        4.00 |         12.00 |    10000 |    10000 |           1 |
|      1012 | tpcc_mysql  |           1008 | RU50C100G300G   |      50 |      50 |      100.00 |        300.00 |   500000 |   500000 |          50 |
|      1012 | tpcc_mysql  |           1008 | RU50C100G300G   |      50 |      50 |      100.00 |        300.00 |   500000 |   500000 |          50 |
|      1012 | tpcc_mysql  |           1008 | RU50C100G300G   |      50 |      50 |      100.00 |        300.00 |   500000 |   500000 |          50 |
+-----------+-------------+----------------+-----------------+---------+---------+-------------+---------------+----------+----------+-------------+
9 rows in set (0.001 sec)
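To estimate how much headroom the change gives the meta tenant, the following sketch projects its log disk usage for a unit LOG_DISK_SIZE of 7G and 12G. It assumes that the meta tenant receives 10% of the unit's log disk (my assumption, not confirmed by the output above) and that the 607 MB in use at the time of the alert stays unchanged. The function name projected_usage is hypothetical.

```python
USED_MB = 607          # used_size(MB) from the alert
META_SHARE_PCT = 10    # assumed meta-tenant share of the unit's log disk
WARN_PCT = 80          # warn_percent(%) from the alert


def projected_usage(unit_log_disk_gb, used_mb=USED_MB):
    """Return (total_mb, warn_mb, used_pct), each rounded down."""
    total_mb = unit_log_disk_gb * 1024 * META_SHARE_PCT / 100
    warn_mb = total_mb * WARN_PCT / 100
    used_pct = used_mb * 100 / total_mb
    return int(total_mb), int(warn_mb), int(used_pct)


before = projected_usage(7)
after = projected_usage(12)
print(before)   # (716, 573, 84)
print(after)    # (1228, 983, 49)
```

If the assumption holds, usage drops from about 84% to about 49% of the meta tenant's log disk, well below the 80% warning threshold. This is only an estimate; whether the alert actually clears should be confirmed in OCP monitoring.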