Deploy Rook and Create a Ceph Cluster
Prepare the packages
```shell
wget https://github.com/rook/rook/archive/v1.4.4.tar.gz
tar -zxvf v1.4.4.tar.gz
cd rook-1.4.4/cluster/examples/kubernetes/ceph
```
Modify operator.yaml
Rook also needs to run pods on the master hosts. Masters carry the NoSchedule taint by default, so tolerations must be added in operator.yaml:
```yaml
[root@d-paas-k8s-master-0 ceph-image]# vi operator.yaml
CSI_PLUGIN_TOLERATIONS: |
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
  - effect: NoExecute
    key: node-role.kubernetes.io/etcd
    operator: Exists
```
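For reference, these tolerations mirror the taint that kubeadm places on master nodes by default. An illustrative excerpt of what such a node spec looks like (field values here are the standard kubeadm defaults, not taken from this cluster):

```yaml
# Illustrative: the default master taint the tolerations above match.
# Check your own nodes with: kubectl describe node <master> | grep Taints
spec:
  taints:
    - effect: NoSchedule
      key: node-role.kubernetes.io/master
```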
Load the ceph-csi images
Deploying the Ceph cluster also deploys ceph-csi, and all of the CSI Docker images are hosted on quay.io. In my deployment I found that our network could not reach quay.io even with a Docker registry mirror configured, so the image pulls failed and the CSI pods never started.
I therefore packaged the images ceph-csi needs into tgz files in advance and placed them on the jump host under /home/api/ceph-image.
Before deploying Rook, scp the following images to every host in the cluster:
```shell
[api@kfxqtyglpt ceph-image]$ pwd
/home/api/ceph-image
[api@kfxqtyglpt ceph-image]$ ll
total 1230580
-rwxr-xr-x. 1 api api 1049964544 Nov 11 13:48 cephcsi.tgz
-rwxr-xr-x. 1 api api   47385088 Nov 11 13:49 csi-attacher.tgz
-rwxr-xr-x. 1 api api   18313728 Nov 11 13:49 csi-node-driver-register.tgz
-rwxr-xr-x. 1 api api   49535488 Nov 11 13:49 csi-provisioner.tgz
-rwxr-xr-x. 1 api api   47319040 Nov 11 13:49 csi-resizer.tgz
-rwxr-xr-x. 1 api api   47581696 Nov 11 13:49 csi-snapshotter.tgz
```
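The copy step itself can be scripted. A minimal sketch, assuming a hypothetical host list (replace `HOSTS` with your cluster's node addresses; the `echo` makes it a dry run):

```shell
# Sketch: push the pre-packaged CSI images to every host before deploying Rook.
# HOSTS is a hypothetical list -- substitute your cluster's node names/IPs.
HOSTS="d-paas-k8s-0-node-0 d-paas-k8s-0-node-1 d-paas-k8s-0-node-2"
SRC_DIR=/home/api/ceph-image
for h in $HOSTS; do
  # echo prints the command instead of running it; drop it to actually copy
  echo scp "$SRC_DIR"/*.tgz "root@$h:/root/"
done
```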
Then run the following on every host:
```shell
[root@d-paas-k8s-master-0 ~]# for i in `ls *.tgz`; do docker load -i $i; done
[root@d-paas-k8s-master-0 ~]# yum install lvm2.x86_64 -y
```
Deploy the Rook operator
```shell
[root@d-paas-k8s-master-0 ~]# cd /root/rook-1.4.4/cluster/examples/kubernetes/ceph
[root@d-paas-k8s-master-0 ceph]# kubectl create -f common.yaml
[root@d-paas-k8s-master-0 ceph]# kubectl create -f operator.yaml
# wait until all pods under rook-ceph are Running
[root@d-paas-k8s-master-0 ceph]# kubectl get pods -n rook-ceph
```
Deploy the Ceph cluster
```shell
[root@d-paas-k8s-master-0 ceph]# kubectl create -f cluster.yaml

# wait until all pods under rook-ceph are Running or Completed
[root@d-paas-k8s-master-0 ceph]# kubectl get pods -n rook-ceph
NAME                                                            READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-4j7tt                                          3/3     Running     0          19h
csi-cephfsplugin-l99bm                                          3/3     Running     0          19h
csi-cephfsplugin-n5xvw                                          3/3     Running     0          19h
csi-cephfsplugin-provisioner-598854d87f-n5ltz                   6/6     Running     0          19h
csi-cephfsplugin-provisioner-598854d87f-vnjbq                   6/6     Running     0          19h
csi-cephfsplugin-psmfq                                          3/3     Running     0          19h
csi-rbdplugin-2zt76                                             3/3     Running     0          19h
csi-rbdplugin-9jdwx                                             3/3     Running     0          19h
csi-rbdplugin-ddzpk                                             3/3     Running     0          19h
csi-rbdplugin-provisioner-dbc67ffdc-8xsbd                       6/6     Running     0          19h
csi-rbdplugin-provisioner-dbc67ffdc-jspd9                       6/6     Running     0          19h
csi-rbdplugin-wmqvm                                             3/3     Running     0          19h
rook-ceph-crashcollector-d-paas-k8s-0-node-0-7b696f9f8d-zqb49   1/1     Running     0          19h
rook-ceph-crashcollector-d-paas-k8s-0-node-1-645b49b659-gcrjz   1/1     Running     0          19h
rook-ceph-crashcollector-d-paas-k8s-0-node-2-dbb5978b6-6pwhv    1/1     Running     0          19h
rook-ceph-mgr-a-5977cf7cd7-dlmnj                                1/1     Running     0          19h
rook-ceph-mon-a-6cfc9f64cc-k8vdp                                1/1     Running     0          19h
rook-ceph-mon-b-574d74f4c9-lgl76                                1/1     Running     0          19h
rook-ceph-mon-c-fd6fcb588-rtzfz                                 1/1     Running     0          19h
rook-ceph-operator-667756ddb6-rjr9v                             1/1     Running     0          19h
rook-ceph-osd-0-95dd775b6-757w5                                 1/1     Running     0          19h
rook-ceph-osd-1-69c45949b5-8fphv                                1/1     Running     0          19h
rook-ceph-osd-2-847cb97d55-n87d4                                1/1     Running     0          19h
rook-ceph-osd-3-78b76c9475-2qsgw                                1/1     Running     0          19h
rook-ceph-osd-4-55c4cb85d8-zqvlq                                1/1     Running     0          19h
rook-ceph-osd-5-576db964d8-bwml7                                1/1     Running     0          19h
rook-ceph-osd-prepare-d-paas-k8s-0-node-0-ndr7g                 0/1     Completed   0          21m
rook-ceph-osd-prepare-d-paas-k8s-0-node-1-nfmjm                 0/1     Completed   0          21m
rook-ceph-osd-prepare-d-paas-k8s-0-node-2-xt9hf                 0/1     Completed   0          21m
rook-ceph-tools-7cc7fd5755-c44p9                                1/1     Running     0          19h
rook-discover-ktjcr                                             1/1     Running     0          19h
rook-discover-mhcv7                                             1/1     Running     0          19h
rook-discover-xppd6                                             1/1     Running     0          19h

# deploy the toolbox to check ceph status
[root@d-paas-k8s-master-0 ceph]# kubectl create -f toolbox.yaml

# enter the toolbox pod
[root@d-paas-k8s-master-0 ceph]# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash

# check cluster health, OSD count and OSD state
[root@rook-ceph-tools-7cc7fd5755-c44p9 /]# ceph status
  cluster:
    id:     b56328d7-2256-4faa-a6f0-f8a684e1ab70
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 19h)
    mgr: a(active, since 24m)
    osd: 6 osds: 6 up (since 19h), 6 in (since 19h)
    # expect one OSD per raw disk: here 3 nodes with 2 raw disks each = 6 disks, hence 6 OSDs

  data:
    pools:   2 pools, 33 pgs
    objects: 1.66k objects, 4.8 GiB
    usage:   20 GiB used, 6.5 TiB / 6.5 TiB avail
    pgs:     33 active+clean

  io:
    client:   1.3 MiB/s wr, 0 op/s rd, 83 op/s wr

# leave the rook-ceph-tools container
[root@rook-ceph-tools-7cc7fd5755-c44p9 /]# exit
```
Create the storage class
```shell
[root@d-paas-k8s-master-0 rbd]# kubectl apply -f /root/rook-1.4.4/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
# set it as the default storageclass
[root@d-paas-k8s-master-0 rbd]# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
The storage class used in this example is rbd, i.e. block storage. Ceph also provides a file-storage storage class: /root/rook-1.4.4/cluster/examples/kubernetes/ceph/csi/cephfs/storageclass.yaml
rbd performs better than cephfs, but it does not support multiple mounts: each PVC can be mounted by only one pod.
cephfs performs well on large files and poorly on small ones, but it does support multiple mounts, so choose cephfs as the storage class when several pods need to share storage.
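In practice this difference surfaces as the PVC access mode: an rbd PVC is requested ReadWriteOnce, while a shared cephfs volume would be requested ReadWriteMany. A hedged sketch of a shared-volume claim (the storageClassName is an assumption taken from the example manifests; verify it against your csi/cephfs/storageclass.yaml):

```yaml
# Illustrative PVC for a cephfs volume shared by multiple pods.
# storageClassName is assumed; check the name in csi/cephfs/storageclass.yaml.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany        # cephfs supports simultaneous mounts from many pods
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
```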
Create a PVC to verify CSI
```shell
[root@d-paas-k8s-master-0 rbd]# kubectl apply -f /root/rook-1.4.4/cluster/examples/kubernetes/ceph/csi/rbd/pvc.yaml
persistentvolumeclaim/rbd-pvc created
# check that a PVC named rbd-pvc exists in the default namespace with status Bound
[root@d-paas-k8s-master-0 rbd]# kubectl get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
rbd-pvc   Bound    pvc-5f1b5ce4-9d5b-40fb-b08d-5d37de576ea2   1Gi        RWO            rook-ceph-block   4s
# verification done; delete rbd-pvc
[root@d-paas-k8s-master-0 rbd]# kubectl delete pvc rbd-pvc
```
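To actually consume such a PVC, a workload references it by claim name. A minimal sketch (pod name and image are illustrative, not part of the rook examples):

```yaml
# Illustrative pod mounting the rbd-pvc claim created above
apiVersion: v1
kind: Pod
metadata:
  name: rbd-demo
spec:
  containers:
    - name: app
      image: busybox            # placeholder image
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data      # the rbd volume appears here
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: rbd-pvc
```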
Pitfalls encountered
After deploying Rook and the Ceph cluster, I created a test PVC and found it stuck in Pending; `kubectl describe pvc` showed it was waiting for CSI to create the PV. Checking the pods under rook-ceph, no CSI pods (csi-xxxx) had been created at all. The CSI pods are created by the operator when the Ceph cluster is created, so check the rook-operator log:
```shell
[root@d-paas-k8s-master-0 ~]# kubectl logs rook-ceph-operator-667756ddb6-rjr9v -nrook-ceph
# E | ceph-csi: invalid csi version. failed to run CmdReporter rook-ceph-csi-detect-version successfully. failed to delete existing results ConfigMap rook-ceph-csi-detect-version. failed to delete ConfigMap rook-ceph-csi-detect-version. etcdserver: request timed out failed to complete ceph CSI version job
```
The cause: before deploying CSI, the operator first starts a rook-ceph-csi-detect-version job that interacts with etcd (exactly what it exchanges is still unclear to me), and it only creates CSI after this job completes. The job uses one of the quay.io images mentioned above, so the image could not be pulled, the job could not finish, and it eventually timed out; the operator therefore never created CSI.
The fix is the procedure described at the start of this section: load all the images CSI needs onto every machine in advance.
Since rook-ceph-csi-detect-version talks to etcd, a problem on either side blocks CSI creation. If etcd becomes unstable during this window, for example a leader re-election that makes it temporarily unavailable, CSI creation is affected as well. See https://github.com/rook/rook/issues/6291
If a node has no raw disk attached, the following pods fail, which blocks the subsequent storage class creation:
```shell
NAME                                                   READY   STATUS             RESTARTS   AGE
csi-cephfsplugin-6865x                                 3/3     Running            0          6m53s
csi-cephfsplugin-7hdg7                                 3/3     Running            0          6m53s
csi-cephfsplugin-pcmnh                                 3/3     Running            0          6m53s
csi-cephfsplugin-provisioner-598854d87f-5h79c          6/6     Running            0          6m53s
csi-cephfsplugin-provisioner-598854d87f-9r6z2          6/6     Running            0          6m53s
csi-rbdplugin-9fkkp                                    3/3     Running            0          6m54s
csi-rbdplugin-fjc86                                    3/3     Running            0          6m54s
csi-rbdplugin-provisioner-dbc67ffdc-qsvs5              6/6     Running            0          6m54s
csi-rbdplugin-provisioner-dbc67ffdc-zrthq              6/6     Running            0          6m54s
csi-rbdplugin-x4gl4                                    3/3     Running            0          6m54s
rook-ceph-crashcollector-t-docker02-659696b779-sjjtf   1/1     Running            0          5m55s
rook-ceph-crashcollector-t-docker03-5856b9458-4rrm2    1/1     Running            0          5m5s
rook-ceph-crashcollector-t-docker04-ff475547f-6mrvr    1/1     Running            0          4m27s
rook-ceph-mgr-a-74d7d89b9-2bz72                        1/1     Running            0          4m27s
rook-ceph-mon-a-9d56b548-zgfqv                         1/1     Running            0          5m55s
rook-ceph-mon-b-d5f999ffb-64bs5                        1/1     Running            0          5m49s
rook-ceph-mon-c-56856c4cff-knb8g                       1/1     Running            0          5m5s
rook-ceph-operator-667756ddb6-nhbkk                    1/1     Running            0          12m
rook-ceph-osd-prepare-t-docker02-5rpfp                 0/1     CrashLoopBackOff   5          4m26s
rook-ceph-osd-prepare-t-docker03-kd5jt                 0/1     CrashLoopBackOff   5          4m26s
rook-ceph-osd-prepare-t-docker04-qg9kx                 0/1     CrashLoopBackOff   5          4m25s
rook-discover-kwwgq                                    1/1     Running            0          12m
rook-discover-m4fb2                                    1/1     Running            0          12m
rook-discover-rnpc5                                    1/1     Running            0          12m
```
If OSD creation fails, consider reinstalling Rook or re-attaching the raw disks. For cleanup, see: https://rook.io/docs/rook/v1.4/ceph-teardown.html