1 OSD overview
Ceph OSDs: a Ceph OSD daemon (ceph-osd) stores data and handles replication, recovery, backfilling, and rebalancing; it also provides some monitoring information to the Ceph Monitors by checking the heartbeats of other OSD daemons. When a Ceph storage cluster is configured with 2 replicas, at least 2 OSD daemons are required for the cluster to reach the active+clean state (Ceph defaults to 3 replicas, but you can adjust the replica count).
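For example, the replica count is a per-pool setting. Assuming a pool named rbd (the pool name here is only an example), it could be checked and changed roughly like this:

ceph osd pool get rbd size      # show the current replica count of the pool
ceph osd pool set rbd size 2    # change the pool to 2 replicas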
Sooner or later you will need to expand the cluster, and Ceph allows OSDs to be added at runtime. In Ceph, an OSD is generally one ceph-osd daemon running on top of one disk; if a host has multiple disks, you can start one ceph-osd daemon per disk.
2 Adding an OSD
I am adding the OSD on an existing OSD host here, so installing ceph and the rest of the preparatory work is already done.
If you are starting from a brand-new server, refer to the ceph installation notes to set up the environment first.
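For reference, the rough preparation for a brand-new node, run from the admin node, might look like the following sketch; ceph-osd4 is a hypothetical hostname, and the full procedure is in the ceph installation notes:

ceph-deploy install ceph-osd4    # install the ceph packages on the new node
ceph-deploy admin ceph-osd4      # push ceph.conf and the admin keyring to it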
2.1 Carve out a new 40M volume
[root@ceph-osd3 ~]# lvs
  LV      VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  cephosd centos -wi-a----- 40.00m
  root    centos -wi-ao---- 47.46g
  swap    centos -wi-ao----  2.00g
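The lvs output above suggests the volume was created with something along these lines on ceph-osd3 (a sketch, assuming the centos volume group shown above):

lvcreate -L 40M -n cephosd centos    # create a 40M LV named cephosd in VG centos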
2.2 Add the OSD
The ceph-deploy osd create command can be broken down into two steps:
ceph-deploy osd prepare followed by ceph-deploy osd activate.
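A sketch of the equivalent two-step form, with {node} and {disk} as placeholders (activate is given the data partition that prepare created):

ceph-deploy osd prepare {node}:{disk}
ceph-deploy osd activate {node}:{data-partition}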
[root@ceph-mon1 ceph-cluster]# ceph-deploy osd create ceph-osd3:/dev/centos/cephosd
[ceph-osd3][WARNIN] create_partition: refusing to create journal on /dev/centos/cephosd
[ceph-osd3][WARNIN] create_partition: journal size (5120M) is bigger than device (40M)
······
[ceph-osd3][WARNIN] ceph_disk.main.Error: Error: /dev/centos/cephosd device size (40M) is not big enough for journal
[ceph-osd3][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /dev/centos/cephosd
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
The command fails, and the log explains why very clearly: the journal is 5120M by default and cannot fit on a 40M device, so the OSD device must be large enough to hold the journal.
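The 5120M figure comes from the default value of the osd journal size setting. In principle it could be lowered in ceph.conf before preparing the OSD (a sketch only; using a properly sized disk, as below, is the better fix):

[osd]
osd journal size = 1024    # journal size in MB; 5120 is the default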
2.3 Use a new 10G disk instead
[root@ceph-osd3 ~]# fdisk -l /dev/sde
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sde: 10.7 GB, 10737418240 bytes, 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
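If the disk had been used before, it may need to be wiped first; this is only a suggestion and can be skipped on a fresh disk:

ceph-deploy disk zap ceph-osd3:/dev/sde    # destroys the partition table and all data on /dev/sde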
2.4 Add the OSD again
[root@ceph-mon1 ceph-cluster]# ceph-deploy osd create ceph-osd3:/dev/sde
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.36): /usr/bin/ceph-deploy osd create ceph-osd3:/dev/sde
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username                      : None
[ceph_deploy.cli][INFO ]  disk                          : [('ceph-osd3', '/dev/sde', None)]
[ceph_deploy.cli][INFO ]  dmcrypt                       : False
[ceph_deploy.cli][INFO ]  verbose                       : False
·····
[ceph-osd3][INFO ] Running command: systemctl enable ceph.target
[ceph-osd3][INFO ] checking OSD status...
[ceph-osd3][DEBUG ] find the location of an executable
[ceph-osd3][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph-osd3 is now ready for osd use.
OK, the OSD was added successfully.
Check the cluster status: there are now 10 OSDs, and the available storage capacity has increased as well.
[root@ceph-osd1 ~]# ceph -s
    cluster 3fa8936a-118a-49aa-b31c-c6c728cb3b71
     health HEALTH_ERR
            9 pgs are stuck inactive for more than 300 seconds
            24 pgs peering
            9 pgs stuck inactive
            too few PGs per OSD (19 < min 30)
     monmap e7: 1 mons at {ceph-mon1=192.168.1.131:6789/0}
            election epoch 13, quorum 0 ceph-mon1
     osdmap e68: 10 osds: 10 up, 10 in
            flags sortbitwise
      pgmap v190: 64 pgs, 1 pools, 0 bytes data, 0 objects
            311 MB used, 45669 MB / 45980 MB avail
                  40 active+clean
                  21 peering
                   3 remapped+peering
Check the partition layout on this OSD disk:
[root@ceph-osd3 ~]# fdisk -l /dev/sde
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sde: 10.7 GB, 10737418240 bytes, 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt

#         Start          End    Size  Type            Name
 1     10487808     20971486      5G  unknown         ceph data
 2         2048     10487807      5G  unknown         ceph journal
You can see that the 10G disk has been split into a ceph data partition and a ceph journal partition; the journal records I/O operations before they are committed to the data partition (this is the layout that ceph-disk creates by default in the jewel release).
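To confirm which journal an OSD is using, the journal is a symlink inside the OSD's data directory; the path below assumes the default data path and that the new OSD received ID 9:

ls -l /var/lib/ceph/osd/ceph-9/journal    # should point at the ceph journal partition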
3 Removing an OSD
Since ceph-deploy does not provide a one-step command for removing an OSD, you have to remove it manually on the OSD node.
To shrink the cluster or replace hardware, you can remove OSDs at runtime. In Ceph, an OSD is usually one ceph-osd daemon on one host, running on top of one disk; if a host has multiple data disks, you have to remove the corresponding ceph-osd daemons one at a time. Before starting, check the cluster capacity to make sure it is not close to its limit, so that removing the OSD does not push the cluster past the near full ratio.
Note: do not let the cluster reach its full ratio when removing an OSD; removing an OSD can cause the cluster to reach or exceed its full ratio.
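A quick way to check capacity before removing an OSD is something like:

ceph df        # cluster-wide and per-pool usage
ceph osd df    # per-OSD usage and weight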
3.1 Take the OSD out of the cluster
3.1.1 Identify the newly added OSD
[root@ceph-osd3 ceph]# ceph osd tree
ID WEIGHT  TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.04898 root default
-2 0.01469     host ceph-osd1
 0 0.00490         osd.0           up  1.00000          1.00000
 1 0.00490         osd.1           up  1.00000          1.00000
 2 0.00490         osd.2           up  1.00000          1.00000
-3 0.01469     host ceph-osd2
 3 0.00490         osd.3           up  1.00000          1.00000
 4 0.00490         osd.4           up  1.00000          1.00000
 5 0.00490         osd.5           up  1.00000          1.00000
-4 0.01959     host ceph-osd3
 6 0.00490         osd.6           up  1.00000          1.00000
 7 0.00490         osd.7           up  1.00000          1.00000
 8 0.00490         osd.8           up  1.00000          1.00000
 9 0.00490         osd.9           up  1.00000          1.00000
OSD IDs are assigned in increasing order, so the OSD we just added has ID 9.
3.1.2 Mark the OSD out of the cluster
[root@ceph-osd3 ceph]# ceph osd out 9
marked out osd.9.
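If you change your mind at this point, the OSD can be put back into the cluster; this simply reverses the step above:

ceph osd in 9    # mark osd.9 back in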
3.1.3 Watch the data migration
Once the OSD has been marked out of the cluster, Ceph begins rebalancing and migrating placement groups off the OSD that is being removed. You can watch this process with the ceph tool:
ceph -w
You will see the placement group states change from active+clean to active, some degraded objects, and finally return to active+clean once the migration completes. (Press Ctrl-C to exit.)
3.2 Stop the OSD daemon
An OSD that has been marked out is still up and out; stop the OSD daemon before deleting it:
[root@ceph-osd3 ceph]# systemctl stop ceph-osd@9
3.2.1 Check the cluster status
[root@ceph-osd3 ceph]# ceph -s
    cluster 3fa8936a-118a-49aa-b31c-c6c728cb3b71
     health HEALTH_WARN
            too few PGs per OSD (21 < min 30)
     monmap e7: 1 mons at {ceph-mon1=192.168.1.131:6789/0}
            election epoch 13, quorum 0 ceph-mon1
     osdmap e73: 10 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v386: 64 pgs, 1 pools, 0 bytes data, 0 objects
            312 MB used, 45668 MB / 45980 MB avail
                  64 active+clean
3.2.2 Check the OSD daemon status
You can see that osd.9 is now down:
[root@ceph-osd3 ceph]# ceph osd tree
ID WEIGHT  TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.04898 root default
-2 0.01469     host ceph-osd1
 0 0.00490         osd.0           up  1.00000          1.00000
 1 0.00490         osd.1           up  1.00000          1.00000
 2 0.00490         osd.2           up  1.00000          1.00000
-3 0.01469     host ceph-osd2
 3 0.00490         osd.3           up  1.00000          1.00000
 4 0.00490         osd.4           up  1.00000          1.00000
 5 0.00490         osd.5           up  1.00000          1.00000
-4 0.01959     host ceph-osd3
 6 0.00490         osd.6           up  1.00000          1.00000
 7 0.00490         osd.7           up  1.00000          1.00000
 8 0.00490         osd.8           up  1.00000          1.00000
 9 0.00490         osd.9         down        0          1.00000
3.3 Remove the OSD from the CRUSH map
[root@ceph-osd3 ceph]# ceph osd crush remove osd.9
removed item id 9 name 'osd.9' from crush map
3.4 Delete the OSD authentication key
[root@ceph-osd3 ceph]# ceph auth del osd.9
updated
3.5 Remove the OSD
[root@ceph-osd3 ~]# ceph osd rm 9
removed osd.9
3.6 Update the configuration file
Note: edit the configuration file on the admin node, then push it to all the other nodes.
vi /etc/ceph/ceph.conf
Delete the entry for the removed OSD (osd.9 in this case), if one exists, which looks like:
[osd.9]
host = {hostname}
3.7 Finally, push the configuration file to every node
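One way to push it is with ceph-deploy from the admin node; the hostnames below are taken from the osd tree output above, so adjust them to your environment:

[root@ceph-mon1 ceph-cluster]# ceph-deploy --overwrite-conf config push ceph-osd1 ceph-osd2 ceph-osd3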
3.8 Verify the result
osd.9 is gone:
[root@ceph-osd3 ~]# ceph osd tree
ID WEIGHT  TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.04408 root default
-2 0.01469     host ceph-osd1
 0 0.00490         osd.0           up  1.00000          1.00000
 1 0.00490         osd.1           up  1.00000          1.00000
 2 0.00490         osd.2           up  1.00000          1.00000
-3 0.01469     host ceph-osd2
 3 0.00490         osd.3           up  1.00000          1.00000
 4 0.00490         osd.4           up  1.00000          1.00000
 5 0.00490         osd.5           up  1.00000          1.00000
-4 0.01469     host ceph-osd3
 6 0.00490         osd.6           up  1.00000          1.00000
 7 0.00490         osd.7           up  1.00000          1.00000
 8 0.00490         osd.8           up  1.00000          1.00000
3.9 Cluster status
The cluster status is back to normal; press Ctrl-C to exit:
[root@ceph-osd3 ~]# ceph -w
    cluster 3fa8936a-118a-49aa-b31c-c6c728cb3b71
     health HEALTH_WARN
            too few PGs per OSD (21 < min 30)
     monmap e7: 1 mons at {ceph-mon1=192.168.1.131:6789/0}
            election epoch 13, quorum 0 ceph-mon1
     osdmap e77: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v410: 64 pgs, 1 pools, 0 bytes data, 0 objects
            313 MB used, 45667 MB / 45980 MB avail
                  64 active+clean

2016-10-26 08:50:52.538900 mon.0 [INF] pgmap v410: 64 pgs: 64 active+clean; 0 bytes data, 313 MB used, 45667 MB / 45980 MB avail
A closing thought: always go back to the genuine official documentation; the Chinese translation of the official docs is not always reliable.
Official documentation (English): http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
Chinese documentation: http://docs.ceph.org.cn/rados/operations/add-or-rm-osds#id12