Warning FailedAttachVolume Multi-Attach error for volume
September 10, 2020
Volume is already exclusively attached to one node and can't be attached to another
Following up on my previous post, this is meant to be a succinct reference for others encountering the 'Multi-Attach' error. Kubernetes does not allow multiple nodes to mount (certain) volumes concurrently, for example volumes backed by a ReadWriteOnce claim (see the sketch after the references below). A Kubernetes bug also exists where a PV is not forcefully detached from a node after the 6-minute timeout, causing multi-attach headaches. References:
- https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/221
- https://github.com/kubernetes/kubernetes/issues/65392
- https://cormachogan.com/2019/06/18/kubernetes-storage-on-vsphere-101-failure-scenarios/
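To illustrate the exclusivity, here is a minimal sketch of a PVC whose ReadWriteOnce access mode means the backing volume can only be attached to one node at a time. The claim name, size, and storage class are hypothetical placeholders, not taken from my cluster:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-rwo-claim        # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce              # attachable to a single node only
  resources:
    requests:
      storage: 5Gi
  storageClassName: example-sc   # hypothetical storage class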
How it begins
Hmm.. Pods are not restarting, what did I do this time?
kubectl describe pod <podname>
Events:
  Type     Reason                 Age                 From                     Message
  ----     ------                 ----                ----                     -------
  Normal   Scheduled              43m                 default-scheduler        Successfully assigned nginx-x2hs2 to kubew05
  Warning  FailedAttachVolume     43m                 attachdetach-controller  Multi-Attach error for volume "pvc-0a5eb91b-3720-11e8-8d2b-000c29f8a512" Volume is already exclusively attached to one node and can't be attached to another
  Normal   SuccessfulMountVolume  43m                 kubelet, kubew05         MountVolume.SetUp succeeded for volume "default-token-scwwd"
  Warning  FailedMount            51s (x19 over 41m)  kubelet, kubew05         Unable to mount volumes for pod "nginx-x2hs2_project(c0e45e49-3721-11e8-8d2b-000c29f8a512)": timeout expired waiting for volumes to attach/mount for pod "project"/"nginx-x2hs2".
Primary issue found:
Warning FailedAttachVolume Multi-Attach error for volume "pvc-{guid}" Volume is already exclusively attached to one node and can't be attached to another
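Before cleaning anything up, it can help to confirm which node the volume is (or was) attached to. One quick way, assuming the old pod object still exists, is to list pods with their nodes:

kubectl get pods -o wide

The NODE column shows where the previous pod copy was scheduled, which is usually the node Kubernetes still believes holds the attachment.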
How does one recover from this situation? There are a few ways to do this.
Start by identifying the volume attachments in question:
kubectl get volumeattachments
NAME                                                                   ATTACHER                 PV                                         NODE                    ATTACHED   AGE
csi-9f7704015b456f146ce8c6c3bd80a5ec6cc55f4f5bfb90c61c250d0b050a283c   openebs-csi.openebs.io   pvc-b39248ab-5a99-439b-ad6f-780aae30626c   csi-node2.mayalabs.io   true       66m
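If you have many volume attachments, a more targeted listing can make the stuck one easier to spot. This is just a sketch using custom-columns; the field names come from the storage.k8s.io/v1 VolumeAttachment API, but the exact column layout is my own choice:

kubectl get volumeattachments \
  -o custom-columns=NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached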
Kubernetes thinks the PVC is attached to a non-existent node, and since we have passed the 6-minute timeout the volumes will not detach automatically. The bugs linked at the beginning of this post go into more detail as to why they may not detach from a node automatically (potential data corruption/loss is the main concern).
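A quick sanity check is to compare the NODE column above against the nodes the cluster actually knows about:

kubectl get nodes

If the node holding the attachment is gone (or has been NotReady well past the timeout), the attachment is effectively orphaned.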
Remove finalizers:
kubectl edit volumeattachment csi-xxxxxxxxx
Locate the finalizer and comment it out with the # symbol (warning: if you edit from Windows, take care with line endings and tabs):
"finalizers": [ #"external-attacher/csi-vsphere-vmware-com" ],
Remove all volumeattachment finalizers (warning: you may not want to remove the finalizers from all of your volumeattachments; it was necessary in my situation):
kubectl get volumeattachments | tail -n+2 | awk '{print $1}' | xargs -I{} kubectl patch volumeattachments {} --type='merge' -p '{"metadata":{"finalizers": null}}'
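Before running the bulk patch, it may be worth saving the current objects so the finalizers can be restored or inspected later; a minimal sketch (the file name is arbitrary):

kubectl get volumeattachments -o yaml > volumeattachments-backup.yaml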
Your volume attachments should clear from the unavailable node, and your pods should begin rescheduling onto available resources.
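To watch the recovery happen, something like the following works (the -w flag streams updates as pods get rescheduled):

kubectl get pods -o wide -w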
Once the pods are ready, you can re-patch the finalizer onto the volumeattachments:
$ kubectl get volumeattachments | tail -n+2 | awk '{print $1}' | xargs -I{} kubectl patch volumeattachments {} --type='merge' -p '{"metadata":{"finalizers": ["external-attacher/csi-vsphere-vmware-com"]}}'
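To confirm the finalizers are back in place, a jsonpath listing like this (the formatting is just one way to do it) shows each attachment alongside its finalizers:

kubectl get volumeattachments -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'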