HEALTH_WARN clock skew detected means the clocks on the cluster nodes are out of sync.
1. Check the cluster status
[root@ceph-osd1 ~]# ceph status
    cluster 21ed0f42-69d2-450c-babf-b1a44c1b82e4
     health HEALTH_ERR
            clock skew detected on mon.ceph-osd2, mon.ceph-osd3   --here is the problem: the clocks are not in sync
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            too few PGs per OSD (7 < min 30)
            Monitor clock skew detected
     monmap e2: 3 mons at {ceph-osd1=192.168.1.141:6789/0,ceph-osd2=192.168.1.142:6789/0,ceph-osd3=192.168.1.143:6789/0}
            election epoch 12, quorum 0,1,2 ceph-osd1,ceph-osd2,ceph-osd3
     osdmap e60: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v238: 64 pgs, 1 pools, 0 bytes data, 0 objects
            300 MB used, 359 GB / 359 GB avail
                  64 creating
# Check the detailed health output
[root@ceph-osd1 ~]# ceph health detail
....
too few PGs per OSD (7 < min 30)
mon.ceph-osd2 addr 192.168.1.142:6789/0 clock skew 2.56682s > max 0.05s (latency 0.0020987s)
mon.ceph-osd3 addr 192.168.1.143:6789/0 clock skew 2.56706s > max 0.05s (latency 0.00193141s)
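To confirm the skew at the OS level, you can compare the system clocks of all nodes from one machine. A minimal sketch, assuming the hostnames above resolve and passwordless ssh is set up:

for h in ceph-osd1 ceph-osd2 ceph-osd3; do
    echo -n "$h: "; ssh $h date +%s.%N    # print each node's epoch time; differences > 0.05 s will trip the mon check
done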
1.2 Check the values currently configured
[root@ceph-osd1 ~]# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep clock
    "mon_clock_drift_allowed": "0.05",     ---a mon is flagged once its clock drifts more than 0.05 s
    "mon_clock_drift_warn_backoff": "5",   ---warn after 5 drift detections
    "clock_offset": "0",                   --default clock offset for mon nodes
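If you only need to silence the warning while NTP is being fixed, the drift threshold can usually be raised at runtime with injectargs. This is just a sketch of a temporary workaround (it masks the problem, is not persisted, and assumes your Ceph version accepts this option at runtime):

# raise the allowed mon clock drift to 0.5 s on all mons (temporary, lost on mon restart)
ceph tell mon.* injectargs '--mon_clock_drift_allowed 0.5'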
2. Solution
A quick one-shot fix is the following (not recommended as a permanent solution):
2.1 Stop the ntpd service on all nodes, if it is running
# systemctl stop ntpd
2.2 Sync against a public time server
# ntpdate time.nist.gov
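To apply this one-shot sync cluster-wide, something like the following can be run from one node. A minimal sketch, assuming passwordless ssh to the node names used above:

for h in ceph-osd1 ceph-osd2 ceph-osd3; do
    ssh $h "systemctl stop ntpd; ntpdate time.nist.gov; hwclock -w"   # stop ntpd, step the clock, write it to the hardware clock
done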
2.3 Configure the NTP service
Here I make the ceph-admin node the NTP server and the other three nodes ceph-1/2/3 NTP clients, which fixes the time-sync problem at the root. (I have not tried a multi-server setup yet.)
On the ceph-admin node:
Edit /etc/ntp.conf, comment out the four default server lines, and add three lines as follows:
vim /etc/ntp.conf
### comment out the following lines:
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
### add the following lines:
server 127.127.1.0 minpoll 4
fudge 127.127.1.0 stratum 0
restrict 192.168.56.0 mask 255.255.0.0 nomodify notrap   # adjust this line to match your clients' IP range
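Note that the restrict line above uses 192.168.56.0; for the 192.168.1.x addresses used by this cluster the corresponding line would presumably be:

restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap   # allow the 192.168.1.x clients to query but not modify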
Edit /etc/ntp/step-tickers as follows:
# List of NTP servers used by the ntpdate service.
# 0.centos.pool.ntp.org
192.168.1.131
Restart the ntp service and check that the server side is running correctly; it is healthy when the last line of ntpq -p output starts with *:
[root@ceph-admin ~]# systemctl enable ntpd
Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.
[root@ceph-admin ~]# systemctl restart ntpd
[root@ceph-admin ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        .LOCL.           0 l    -   16    1    0.000    0.000   0.000
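On CentOS 7, firewalld may block the clients from reaching the server on UDP 123. If firewalld is running on ceph-admin, opening the predefined ntp service should be enough (a sketch, assuming firewalld is in use):

firewall-cmd --permanent --add-service=ntp   # allow inbound NTP (UDP 123)
firewall-cmd --reload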
That completes the NTP server configuration; next, configure the clients.
On the ceph-1/ceph-2/ceph-3 nodes:
Edit /etc/ntp.conf, comment out the four server lines, and add one server line pointing at ceph-admin:
vim /etc/ntp.conf
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 192.168.1.131
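Instead of editing the file by hand on each client, the same change can be scripted from the admin node. A minimal sketch, assuming passwordless ssh and that the default CentOS pool lines are still present:

for h in ceph-1 ceph-2 ceph-3; do
    ssh $h "sed -i 's/^server [0-3].centos.pool.ntp.org iburst/#&/' /etc/ntp.conf && \
            echo 'server 192.168.1.131' >> /etc/ntp.conf"   # comment out the pool servers, point at ceph-admin
done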
Restart the ntp service and check that each client connects to the server correctly; again the sign of success is that the last line of ntpq -p starts with *:
[root@ceph-1 ~]# systemctl enable ntpd
Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.
[root@ceph-1 ~]# systemctl restart ntpd
[root@ceph-1 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ceph-admin      .LOCL.           1 u    1   64    1    0.329    0.023   0.000
This does not take long; in practice every client reaches the * state within 5 minutes at most. The output below shows a client that has not yet connected correctly:
[root@ceph-1 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ceph-admin      .INIT.          16 u    -   64    0    0.000    0.000   0.000
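If a client stays at .INIT. with reach 0, the usual cause is that it cannot reach the server on UDP 123. One way to check from the client (a sketch; ntpdate -q only queries and does not change the clock):

ntpdate -q 192.168.1.131   # should print an offset line; "no server suitable" points to a network/firewall problem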
3. Restart the mon daemons
[root@ceph-osd2 ~]# systemctl restart ceph-mon@ceph-osd3
--the part after @ is the mon's hostname
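To restart every mon in one go from a node with passwordless ssh to the others (a sketch, assuming the systemd unit names follow the ceph-mon@<hostname> pattern shown above):

for h in ceph-osd1 ceph-osd2 ceph-osd3; do
    ssh $h "systemctl restart ceph-mon@$h"   # restart the mon running on each host
done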
3.1 Check the cluster status again; the clock skew warning is gone
[root@ceph-osd1 ~]# ceph -w
    cluster 21ed0f42-69d2-450c-babf-b1a44c1b82e4
     health HEALTH_ERR
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            too few PGs per OSD (7 < min 30)
     monmap e2: 3 mons at {ceph-osd1=192.168.1.141:6789/0,ceph-osd2=192.168.1.142:6789/0,ceph-osd3=192.168.1.143:6789/0}
            election epoch 16, quorum 0,1,2 ceph-osd1,ceph-osd2,ceph-osd3
     osdmap e60: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v238: 64 pgs, 1 pools, 0 bytes data, 0 objects
            300 MB used, 359 GB / 359 GB avail
                  64 creating
2016-11-08 21:12:42.759296 mon.0 [INF] osdmap e60: 9 osds: 9 up, 9 in
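The remaining HEALTH_ERR items are about PG counts, not clocks. A quick way to confirm that no clock-skew entries remain:

ceph health detail | grep -i "clock skew"   # no output means the skew warning has cleared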