Rook with Ceph doesn't provision my Persistent Volume Claims!

A Ceph installation inside Kubernetes can be provisioned using Rook. While doing an internship at Adaltas, I was in charge of participating in the setup of a Kubernetes (k8s) cluster. To avoid breaking anything on our production cluster, we decided to experiment with the installation of a k8s cluster on 3 virtual machines (one master node n1, two worker nodes n2 and n3) using Vagrant with VirtualBox as the backend and Ubuntu 18.10 as the OS.

During the installation of the test cluster, we encountered a problem with Rook using Ceph that prevented it from provisioning any Persistent Volume Claims (PVC). This article will detail how to make a basic installation of Rook with Ceph on virtual machines, the problem we experienced and how to solve it. But first…

…a quick reminder about the role of PVCs!

When a pod needs to store various data (logs or metrics for example) in a persistent fashion, it has to describe the kind of storage it needs (size, performance, …) in a PVC. The cluster will then provision a Persistent Volume (PV) if one matches the requirements of the PVC. The PV can be provisioned either statically, if an administrator manually created a matching PV, or dynamically. Manually creating PVs can be time-consuming if pods require a lot of them, which is why it is interesting for the cluster to be able to provision them dynamically. For the cluster to dynamically provision a PV, the PVC must indicate the Storage Class it wants to use. If such a Storage Class is available on the cluster, a PV will be dynamically provisioned for the pod.
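As an illustration, here is a minimal sketch of what such a PVC could look like once the Rook Storage Class used later in this article is in place (the claim name and the requested size are arbitrary examples):

# A claim for 8Gi of block storage, dynamically provisioned
# by the Storage Class named "rook-ceph-block"
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data            # example name, adapt to your application
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: rook-ceph-block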

If you want to learn more about PVCs, PVs and Storage Classes, the official Kubernetes documentation covers them in detail.

Installing Rook on your k8s cluster

Since the installation of a k8s cluster is out of the scope of this article, I will assume you already have a working cluster up and running. If that is not the case, documentation on how to quickly bootstrap a k8s cluster is easy to find online.

The process of installing Rook isn’t hard: it’s just a matter of applying a few manifests. First step, clone the Rook git repo:

git clone https://github.com/rook/rook

Then switch to the latest release tag (which is v1.0.1 at the time of this writing) using:

git checkout v1.0.1

The files of interest (listed below) are located inside the folder cluster/examples/kubernetes/ceph.

  1. common.yaml
  2. operator.yaml
  3. cluster.yaml
  4. storageclass.yaml

Apply each of them in the order listed above using:

kubectl apply -f <file.yaml>

One last step is to set the storageclass resource, defined inside the storageclass.yaml file we just applied, as the default Storage Class of our cluster. This is achieved with the command:

kubectl patch storageclass rook-ceph-block \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
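After the patch, the metadata of the Storage Class carries the default-class annotation. Here is a sketch of the relevant fields only (the provisioner and parameters come from storageclass.yaml and are omitted):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
  annotations:
    # Marks this Storage Class as the cluster default
    storageclass.kubernetes.io/is-default-class: "true"

You can verify the result with kubectl get storageclass, which displays “(default)” next to the default Storage Class.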

The problem

The Rook cluster will take some time to deploy while it pulls the Rook images and deploys the pods. After a few minutes the output of kubectl get pods -n rook-ceph should look like this:

NAME                                           READY   STATUS      RESTARTS   AGE
rook-ceph-agent-8zv7p                          1/1     Running     0          4m8s
rook-ceph-agent-ghwgl                          1/1     Running     0          4m8s
rook-ceph-mgr-a-6d8cf6d5d7-txnrj               1/1     Running     0          102s
rook-ceph-mon-a-588475cbdb-htt4h               1/1     Running     0          2m55s
rook-ceph-mon-b-5b7cdc894f-q6wwr               1/1     Running     0          2m47s
rook-ceph-mon-c-846fc479cb-96sjq               1/1     Running     0          119s
rook-ceph-operator-765ff54667-q5qk4            1/1     Running     0          4m43s
rook-ceph-osd-prepare-n2.k8s.test-d4p9w        0/2     Completed   0          80s
rook-ceph-osd-prepare-n3.k8s.test-lrkbc        0/2     Completed   0          80s
rook-discover-hxxtl                            1/1     Running     0          4m8s
rook-discover-mmdl5                            1/1     Running     0          4m8s

As we can see here, there are two pods called rook-ceph-osd-prepare... whose status is “Completed”. We expected some Object Storage Device (OSD) pods to appear once the rook-ceph-osd-prepare... pods completed, but that is not the case here. Since the OSD pods are not appearing, any PVC we create won’t be provisioned by Rook and will stay pending. We can see an example of this happening when trying to deploy a Gitlab instance with Helm. Here is the result of kubectl get pvc -n gitlab:

NAME                        STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS      AGE
gitlab-minio                Pending                                      rook-ceph-block   6m7s
gitlab-postgresql           Pending                                      rook-ceph-block   6m7s
gitlab-prometheus-server    Pending                                      rook-ceph-block   6m7s
gitlab-redis                Pending                                      rook-ceph-block   6m7s
repo-data-gitlab-gitaly-0   Pending                                      rook-ceph-block   6m6s

We can see that none of the PVCs are being provisioned even though they are assigned the correct Storage Class.

The solution

After some research, we found that for Rook to work, it needs a dedicated storage device it can use to store the PVs. In our case, the fix was to add a new virtual disk to each VM through the Vagrantfile.

To create and attach a new virtual disk to a VirtualBox VM, we can use the VBoxManage command or, more conveniently, define it directly in the Vagrantfile, as in this extract:

#[...]
  config.vm.define :n2 do |node|
    node.vm.box = box
    node.vm.hostname = "n2"
    node.vm.network :private_network, ip: "10.10.10.53"
    node.vm.provider "virtualbox" do |d|
      d.customize ["modifyvm", :id, "--memory", 4096]
      d.customize ["modifyvm", :id, "--cpus", 2]
      d.customize ["modifyvm", :id, "--ioapic", "on"]

      # Creating a virtual disk called "disk_osd-n2" with a size of 125GB
      d.customize ["createhd", "--filename", "disk_osd-n2", "--size", 125 * 1024]

      # Attaching the newly created virtual disk to our node
      d.customize ["storageattach", :id, "--storagectl", "SCSI", "--port", 3, "--device", 0, "--type", "hdd", "--medium", "disk_osd-n2.vdi"]
    end
  end
#[...]

The field following "--storagectl" in the last line needs to match the exact name of one of your VM’s storage controllers. Those names can be obtained from the command below, where the VM name comes from VBoxManage list vms:

VBoxManage showvminfo <vm-name> | grep "Storage Controller"

Select the name of a Storage Controller with free ports from the output, and replace SCSI in the above config with this name.

If we run the whole installation process again, we can see that the OSD pods are appearing:

NAME                                           READY   STATUS      RESTARTS   AGE
rook-ceph-agent-gs4sn                          1/1     Running     0          3m55s
rook-ceph-agent-hwrrf                          1/1     Running     0          3m55s
rook-ceph-mgr-a-dbdffd588-v2x2b                1/1     Running     0          75s
rook-ceph-mon-a-f5d5d4654-nmk6j                1/1     Running     0          2m28s
rook-ceph-mon-b-6c98476587-jq2s5               1/1     Running     0          104s
rook-ceph-mon-c-6f9f7f5bd6-8r8qw               1/1     Running     0          91s
rook-ceph-operator-765ff54667-vqj4p            1/1     Running     0          4m29s
rook-ceph-osd-0-5cf569ddf5-rw827               1/1     Running     0          28s   <== Here!
rook-ceph-osd-1-7577f777f9-vjxml               1/1     Running     0          22s   <== Also here
rook-ceph-osd-prepare-n2.k8s.test-bdw2g        0/2     Completed   0          51s
rook-ceph-osd-prepare-n3.k8s.test-26d86        0/2     Completed   0          51s
rook-discover-mblm6                            1/1     Running     0          3m55s
rook-discover-wsk2z                            1/1     Running     0          3m55s

And if we check the PVCs created by the Helm installation of Gitlab:

NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
gitlab-minio                Bound    pvc-5f32d9f5-7d76-11e9-b3fe-02897c39bcfa   10Gi       RWO            rook-ceph-block   19s
gitlab-postgresql           Bound    pvc-5f342615-7d76-11e9-b3fe-02897c39bcfa   8Gi        RWO            rook-ceph-block   19s
gitlab-prometheus-server    Bound    pvc-5f34feb5-7d76-11e9-b3fe-02897c39bcfa   8Gi        RWO            rook-ceph-block   19s
gitlab-redis                Bound    pvc-5f3a0d3d-7d76-11e9-b3fe-02897c39bcfa   5Gi        RWO            rook-ceph-block   19s
repo-data-gitlab-gitaly-0   Bound    pvc-5fe63ad2-7d76-11e9-b3fe-02897c39bcfa   50Gi       RWO            rook-ceph-block   17s

The PVCs are finally provisioned!

A step further, customizing cluster.yaml

You may have noticed that we didn’t give Rook any information on how to find an appropriate device to use for storage; it just autonomously detected the one we attached to the VM and used it. For obvious reasons, this certainly isn’t a desired behavior: in a real-life context we could have numerous attached devices serving purposes other than simply providing storage to Rook. It is of course possible to customize the way Rook finds and uses storage. This is defined inside the cluster.yaml manifest, under the storage section. Below is the default configuration:

storage:
  useAllNodes: true
  useAllDevices: true # <==
  deviceFilter:
  location:
  config:

The useAllDevices field is set to true. From the official documentation of Rook, it indicates “whether all devices found on nodes in the cluster should be automatically consumed by OSDs”. The solution is to tell Rook where to look instead of letting it automatically select any available device. If we set useAllDevices to false, we can use the following fields (a short sketch follows the list):

  1. deviceFilter to set a regex filter; for example ^sd[a-d] to find a device that starts with “sd” followed by a, b, c or d,
  2. devices to define a list of individual devices that will be used,
  3. directories to set a list of directories which will be used as the cluster storage.
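As an example, here is a minimal sketch of the storage section using the regex filter mentioned above; the filter value is only an illustration and must be adapted to the device names of your nodes:

storage:
  useAllNodes: true
  useAllDevices: false
  # Only consume devices whose name starts with "sd" followed by a, b, c or d
  deviceFilter: "^sd[a-d]"
  location:
  config: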

It is also possible to define per-node configurations by setting useAllNodes to false, but this is out of the scope of this article. If you want to learn more about storage configuration for Rook, please take a look at the documentation.

The end

Thank you for reading this article. I hope it shed some light on the issue if you were facing the same problem!
