We are seeing what I think is the same problem. Our Ceph cluster is new, and many of the OSDs have never been assigned PGs. Whenever OpenShift tries to drain a Ceph node in order to reboot it, for example when applying an update, the drain never terminates. Every 5s the log reports: ...
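For anyone else debugging a drain that loops like this, here is a minimal way to see what is holding it up, assuming standard kubectl access to the cluster (the node name below is a placeholder):

```bash
# List the pods still running on the node being drained
# (replace worker-1 with the actual node name)
kubectl get pods --all-namespaces --field-selector spec.nodeName=worker-1 -o wide

# List PodDisruptionBudgets cluster-wide; a PDB whose ALLOWED DISRUPTIONS
# column reads 0 is the usual reason an eviction retries forever
kubectl get pdb --all-namespaces
```

In our case the suspicion is a Ceph-related PDB that never permits a disruption because the new OSDs hold no PGs, but I have not confirmed that yet.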
Kured was draining a node for a reboot. The pods were drained, but one strictly-local volume stayed attached even though the pod (cnpg) was stopped. The moment I detached it via the UI (without force), the instance manager stopped and the node was able to finish draining.
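If you'd rather spot the stuck attachment from the CLI before detaching it in the Longhorn UI, a rough sketch (the namespace is Longhorn's default, and the volume name is a placeholder):

```bash
# CSI-level attachments; a volume still listed here for the node
# being drained is what keeps the drain from completing
kubectl get volumeattachments

# Longhorn's own view of the same volume, via its CRD
# (pvc-0123abcd is a placeholder volume name)
kubectl -n longhorn-system get volumes.longhorn.io pvc-0123abcd -o yaml
```

Deleting the VolumeAttachment object directly would force a detach at the CSI layer, but detaching through the UI without force, as above, is the gentler route.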