Key | Value |
---|---|
Summary | In this tutorial, you will learn how to deploy a 3 node Charmed Kubernetes cluster that uses Ceph storage. We will use Juju and MAAS to deploy our cluster. |
Categories | cloud, containers, server |
Difficulty | 5 |
Author | Syed Mohammad Adnan Karim syed.karim@canonical.com |
Overview
Duration: 1:00
In this tutorial, you will learn how to deploy a 3 node Charmed Kubernetes cluster that uses Ceph storage. We will use Juju and MAAS to deploy our cluster.
What is Kubernetes?
Kubernetes clusters host containerised applications in a reliable and scalable way. Designed with DevOps in mind, Kubernetes makes maintenance tasks such as upgrades and security patching simple.
What is Ceph?
Ceph is a software-defined storage solution designed to address the object, block, and file storage needs of data centres adopting open source as the new norm for high-growth block storage, object stores, and data lakes. Ceph provides enterprise-grade, scalable storage while keeping CAPEX and OPEX in line with underlying bulk commodity disk prices.
What you’ll learn
- How to deploy Charmed Kubernetes and Ceph with Juju
- How to create Ceph pools to be used with Kubernetes with Juju
- How to create PersistentVolumeClaims that use Ceph StorageClasses
What you’ll need
- 3 nodes, each with at least 2 disks and 1 network interface
- Access to a MAAS environment set up with the 3 nodes in the ‘Ready’ state
- A Juju controller set up to use the above MAAS cloud (see the quick check after this list)
- The kubectl client installed
- The bundle.yaml saved to a file
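If you want to confirm that Juju can see the MAAS cloud and your controller before deploying (a quick sanity check, assuming both have already been registered), you can run:
$ juju clouds
$ juju controllers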
Edit bundle.yaml to contain the correct OSD devices
Duration: 2:00
Before deploying our bundle.yaml, we must ensure that our Ceph charm is configured to use the correct OSD devices.
ceph-osd:
  charm: cs:ceph-osd
  num_units: 3
  options:
    osd-devices: /dev/sdb /dev/sdc
    source: distro
  bindings:
    "": oam-space
  to:
  - 1001
  - 1002
  - 1003
Notice that the osd-devices configuration above matches the ‘Available disks and partitions’ section shown for the node in MAAS.
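If your nodes use different device names, the same setting can also be adjusted after deployment with juju config (a hedged example, using the ceph-osd application name and devices from the bundle above):
$ juju config ceph-osd osd-devices='/dev/sdb /dev/sdc'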
Deploy the bundle.yaml
Duration: 10:00
Deploy the bundle with:
$ juju deploy ./bundle.yaml
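The deployment takes some time; you can follow its progress with the standard watch utility:
$ watch -c 'juju status --color'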
A successful deployment should look similar to the following juju status output:
$ juju status
Model Controller Cloud/Region Version SLA Timestamp
k8s orangebox100-default OrangeBox100/default 2.8.6 unsupported 18:22:29-08:00
App Version Status Scale Charm Store Rev OS Notes
ceph-mon 15.2.7 active 3 ceph-mon jujucharms 51 ubuntu
ceph-osd 15.2.7 active 3 ceph-osd jujucharms 306 ubuntu
containerd 1.3.3 active 5 containerd jujucharms 97 ubuntu
easyrsa 3.0.1 active 1 easyrsa jujucharms 339 ubuntu
etcd 3.4.5 active 3 etcd jujucharms 544 ubuntu
flannel 0.11.0 active 5 flannel jujucharms 513 ubuntu
kubeapi-load-balancer 1.18.0 active 1 kubeapi-load-balancer jujucharms 753 ubuntu exposed
kubernetes-control-plane 1.19.6 active 2 kubernetes-control-plane jujucharms 912 ubuntu
kubernetes-worker 1.19.6 active 3 kubernetes-worker jujucharms 713 ubuntu exposed
Unit Workload Agent Machine Public address Ports Message
ceph-mon/0 active idle 0/lxd/0 172.27.100.168 Unit is ready and clustered
ceph-mon/1* active idle 1/lxd/0 172.27.100.165 Unit is ready and clustered
ceph-mon/2 active idle 2/lxd/0 172.27.100.172 Unit is ready and clustered
ceph-osd/0* active idle 0 172.27.100.107 Unit is ready (2 OSD)
ceph-osd/1 active idle 1 172.27.100.110 Unit is ready (2 OSD)
ceph-osd/2 active idle 2 172.27.100.105 Unit is ready (2 OSD)
easyrsa/0* active idle 0/lxd/1 172.27.100.174 Certificate Authority connected.
etcd/0 active idle 0/lxd/2 172.27.100.170 2379/tcp Healthy with 3 known peers
etcd/1* active idle 1/lxd/1 172.27.100.166 2379/tcp Healthy with 3 known peers
etcd/2 active idle 2/lxd/1 172.27.100.173 2379/tcp Healthy with 3 known peers
kubeapi-load-balancer/0* active idle 0/lxd/3 172.27.100.169 443/tcp Loadbalancer ready.
kubernetes-control-plane/0* active idle 1/lxd/2 172.27.100.167 6443/tcp Kubernetes control-plane running.
containerd/4 active idle 172.27.100.167 Container runtime available
flannel/4 active idle 172.27.100.167 Flannel subnet 10.1.83.1/24
kubernetes-control-plane/1 active idle 2/lxd/2 172.27.100.171 6443/tcp Kubernetes control-plane running.
containerd/3 active idle 172.27.100.171 Container runtime available
flannel/3 active idle 172.27.100.171 Flannel subnet 10.1.35.1/24
kubernetes-worker/0* active idle 0 172.27.100.107 80/tcp,443/tcp Kubernetes worker running.
containerd/1 active idle 172.27.100.107 Container runtime available
flannel/1 active idle 172.27.100.107 Flannel subnet 10.1.86.1/24
kubernetes-worker/1 active idle 1 172.27.100.110 80/tcp,443/tcp Kubernetes worker running.
containerd/0* active idle 172.27.100.110 Container runtime available
flannel/0* active idle 172.27.100.110 Flannel subnet 10.1.27.1/24
kubernetes-worker/2 active idle 2 172.27.100.105 80/tcp,443/tcp Kubernetes worker running.
containerd/2 active idle 172.27.100.105 Container runtime available
flannel/2 active idle 172.27.100.105 Flannel subnet 10.1.88.1/24
Machine State DNS Inst id Series AZ Message
0 started 172.27.100.107 node05ob100 focal default Deployed
0/lxd/0 started 172.27.100.168 juju-1be73e-0-lxd-0 focal default Container started
0/lxd/1 started 172.27.100.174 juju-1be73e-0-lxd-1 focal default Container started
0/lxd/2 started 172.27.100.170 juju-1be73e-0-lxd-2 focal default Container started
0/lxd/3 started 172.27.100.169 juju-1be73e-0-lxd-3 focal default Container started
1 started 172.27.100.110 node07ob100 focal default Deployed
1/lxd/0 started 172.27.100.165 juju-1be73e-1-lxd-0 focal default Container started
1/lxd/1 started 172.27.100.166 juju-1be73e-1-lxd-1 focal default Container started
1/lxd/2 started 172.27.100.167 juju-1be73e-1-lxd-2 focal default Container started
2 started 172.27.100.105 node06ob100 focal default Deployed
2/lxd/0 started 172.27.100.172 juju-1be73e-2-lxd-0 focal default Container started
2/lxd/1 started 172.27.100.173 juju-1be73e-2-lxd-1 focal default Container started
2/lxd/2 started 172.27.100.171 juju-1be73e-2-lxd-2 focal default Container started
The deployment should reach the above state in about 10 minutes (depending on hardware).
Congratulations, we now have a Kubernetes cluster up and running!
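As an optional check that the storage cluster itself is healthy, you can query Ceph from one of the monitor units and look for HEALTH_OK in the output:
$ juju ssh ceph-mon/0 sudo ceph status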
Verify that Ceph StorageClasses were created
Duration: 2:00
Copy the kubeconfig file from a kubernetes-control-plane node to ~/.kube/config (run the following from your home directory):
$ mkdir -p .kube
$ juju scp kubernetes-control-plane/0:~/config .kube/
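A quick check that the kubectl client can now reach the cluster is:
$ kubectl get nodes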
List the Kubernetes StorageClasses
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
ext4-pool rbd.csi.ceph.com Delete Immediate true 5d12h
xfs-pool (default) rbd.csi.ceph.com Delete Immediate true 5d12h
Great, our StorageClasses were set up as expected! Now we need to create Ceph pools to match our StorageClasses so that we can use them with our Kubernetes workloads.
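If you want to see which Ceph pool and other parameters each class uses (class names can vary between charm revisions, so go by the names in your own kubectl get sc output), you can describe them:
$ kubectl describe sc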
Create Ceph pools
Duration: 5:00
List Ceph pools
$ juju run-action --wait ceph-mon/leader list-pools
unit-ceph-mon-1:
  UnitId: ceph-mon/1
  id: "2"
  results:
    message: |
      1 device_health_metrics
  status: completed
  timing:
    completed: 2020-12-29 19:08:31 +0000 UTC
    enqueued: 2020-12-29 19:08:30 +0000 UTC
    started: 2020-12-29 19:08:30 +0000 UTC
Create the Ceph xfs-pool
$ juju run-action --wait ceph-mon/leader create-pool name=xfs-pool
unit-ceph-mon-1:
  UnitId: ceph-mon/1
  id: "5"
  results:
    Stderr: |
      pool 'xfs-pool' created
      set pool 2 size to 3
      set pool 2 target_size_ratio to 0.1
      enabled application 'unknown' on pool 'xfs-pool'
  status: completed
  timing:
    completed: 2020-12-29 19:42:26 +0000 UTC
    enqueued: 2020-12-29 19:42:19 +0000 UTC
    started: 2020-12-29 19:42:19 +0000 UTC
List the Ceph pools again to verify that the new pool is created:
$ juju run-action --wait ceph-mon/leader list-pools
unit-ceph-mon-1:
  UnitId: ceph-mon/1
  id: "9"
  results:
    message: |
      1 device_health_metrics
      2 xfs-pool
  status: completed
  timing:
    completed: 2020-12-29 19:50:14 +0000 UTC
    enqueued: 2020-12-29 19:50:13 +0000 UTC
    started: 2020-12-29 19:50:13 +0000 UTC
Congratulations, we have created our new Ceph pool and are now ready to use it with Kubernetes!
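To inspect the new pool's replication settings directly from Ceph, an optional check from a monitor unit is:
$ juju ssh ceph-mon/0 sudo ceph osd pool ls detail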
Verify Ceph-backed PersistentVolumeClaim functionality
Duration: 5:00
Create a PersistentVolumeClaim
Use the following claim.json file to create a PersistentVolumeClaim:
$ cat claim.json
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "myvol"
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "4Gi"
      }
    },
    "storageClassName": "ceph-xfs"
  }
}
$ kubectl apply -f claim.json
Check the status of the PersistentVolumeClaim
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
myvol Bound pvc-8987ad38-3888-4dd8-94c1-39868792c37e 4Gi RWO ceph-xfs 35m
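The Bound status means a PersistentVolume was dynamically provisioned on the Ceph pool to satisfy the claim; you can list it with:
$ kubectl get pv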
Create a ReplicationController that uses the Ceph-backed PVC
Use the following pod.yaml file to create a ReplicationController:
$ cat pod.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: server
spec:
  replicas: 1
  selector:
    role: server
  template:
    metadata:
      labels:
        role: server
    spec:
      containers:
      - name: server
        image: nginx
        volumeMounts:
        - mountPath: /var/lib/www/html
          name: myvol
      volumes:
      - name: myvol
        persistentVolumeClaim:
          claimName: myvol
$ kubectl apply -f pod.yaml
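It can take a minute for the nginx image to pull and for the RBD volume to attach, so the pod may not be Running immediately; you can watch it come up with:
$ kubectl get pods --watch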
Check the status of the ReplicationController pod
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
csi-rbdplugin-hsnjl 3/3 Running 0 6d9h
csi-rbdplugin-md8zd 3/3 Running 0 6d9h
csi-rbdplugin-nhc6t 3/3 Running 0 6d9h
csi-rbdplugin-provisioner-549c6b54c6-2ts2x 6/6 Running 0 6d9h
csi-rbdplugin-provisioner-549c6b54c6-8f7v9 6/6 Running 0 6d9h
csi-rbdplugin-provisioner-549c6b54c6-l59nr 6/6 Running 1 6d9h
server-48g2s 1/1 Running 0 39m
$ kubectl describe pod server-48g2s
Name: server-48g2s
Namespace: default
Priority: 0
Node: node06ob100/172.27.100.105
...
Volumes:
  myvol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  myvol
    ReadOnly:   false
  default-token-bptwd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bptwd
    Optional:    false
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 41m default-scheduler Successfully assigned default/server-48g2s to node06ob100
Normal SuccessfulAttachVolume 41m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-8987ad38-3888-4dd8-94c1-39868792c37e"
Normal Pulling 41m kubelet Pulling image "nginx"
Normal Pulled 41m kubelet Successfully pulled image "nginx" in 7.936656052s
Normal Created 41m kubelet Created container server
Normal Started 41m kubelet Started container server
Log in to the container and check that the volume is mounted
$ kubectl exec -it server-48g2s -- bash
root@server-48g2s:/#
root@server-48g2s:/# df -h
Filesystem Size Used Avail Use% Mounted on
...
/dev/rbd0 4.0G 33M 4.0G 1% /var/lib/www/html
root@server-48g2s:/# exit
Now our pod has an RBD mount!
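As an extra check that the mount is writable, you can write and read back a small file from outside the pod (substitute the pod name from your own kubectl get pods output):
$ kubectl exec server-48g2s -- sh -c 'echo hello > /var/lib/www/html/test.txt && cat /var/lib/www/html/test.txt'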
Clean up the ReplicationController
$ kubectl delete replicationcontrollers/server
replicationcontroller "server" deleted
$ kubectl get replicationcontrollers
No resources found.
Note that the server ReplicationController must be deleted before the myvol PVC can be deleted.
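If you also want to remove the test volume, the claim can then be deleted (with the Delete reclaim policy shown earlier, this also removes the underlying RBD image):
$ kubectl delete pvc myvol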
Wrap-up
Congratulations, you now have a highly available, multi-node Kubernetes cluster with Ceph-backed storage to orchestrate your containers.
ⓘ To test your understanding of this tutorial, complete the following steps by yourself:
- Create the ext4-pool
- Create a PersistentVolumeClaim that is backed by the ext4-pool
- Create a ReplicationController that uses the ext4-pool backed PVC