Why do I see the error Cannot complete cluster master upgrade because there is a migration in progress?
Virtual Private Cloud, Classic infrastructure
You see the following error message during a cluster master upgrade.
Cannot complete cluster master upgrade because there is a migration in progress
You are upgrading the cluster master, but some resources are still being migrated from a previous update.
For example, if this is a master update from IBM Cloud Kubernetes Service version 1.31 to 1.32, the Tigera Operator has not yet completed its migration from the previous IBM Cloud Kubernetes Service version 1.30 to 1.31 update.
To resolve the issue, first wait longer. Larger clusters take longer to complete the migration. After the master is successfully updated, the migration takes approximately 100 seconds per node to complete. The Tigera Operator must perform several actions, which can involve starting and stopping pods across nodes.
However, the migration might be stuck. Check whether the calico-typha and calico-node pods were removed from the kube-system namespace and created in the calico-system namespace.
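As a rough guide to how long to wait, the per-node figure above can be turned into an estimate. The following is a minimal sketch, not part of any IBM tooling; the function name `estimate_migration_wait` is hypothetical, and the 100 seconds per node is the approximate value stated above, not a guarantee.

```shell
# Rough wait estimate: about 100 seconds per worker node (approximate figure from the text).
# estimate_migration_wait is a hypothetical helper name used only for this sketch.
estimate_migration_wait() {
  nodes=$1
  seconds=$((nodes * 100))
  # Round up to whole minutes.
  minutes=$(( (seconds + 59) / 60 ))
  printf '%s nodes: about %s seconds (~%s minutes)\n' "$nodes" "$seconds" "$minutes"
}

# Usage against a live cluster (assumes kubectl is already configured for it):
#   estimate_migration_wait "$(kubectl get nodes --no-headers | wc -l | tr -d ' ')"
estimate_migration_wait 30
```

If the estimated time has long passed and the error persists, the migration is more likely to be stuck than slow.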
If those resources have not been moved, there might be an issue with one or more worker nodes.
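One way to see where the Calico pods currently run is to count them per namespace. The sketch below assumes that the pod names start with `calico-node-` and `calico-typha-`, which is the usual naming but is not confirmed by this page; the function name `count_calico_pods` is hypothetical.

```shell
# Summarize where calico-node and calico-typha pods currently run.
# Reads `kubectl get pods -A --no-headers` output on stdin
# (column 1 = namespace, column 2 = pod name).
# count_calico_pods is a hypothetical helper name; the name prefixes are an assumption.
count_calico_pods() {
  awk '
    $2 ~ /^calico-(node|typha)-/ { count[$1]++ }
    END {
      printf "kube-system=%d calico-system=%d\n", count["kube-system"], count["calico-system"]
    }'
}

# Usage against a live cluster (assumes kubectl is already configured for it):
#   kubectl get pods -A --no-headers | count_calico_pods
```

A non-zero count for kube-system suggests that the pods have not all moved yet.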
To troubleshoot the migration:
- Check the status of the worker nodes.
  kubectl get nodes
  If you see a worker node that is not in the Ready state, such as in the NotReady or SchedulingDisabled state, the migration might be stuck.
  NotReady example:
  NAME            STATUS     ROLES    AGE    VERSION
  10.177.112.32   NotReady   <none>   2d2h   v1.30.0+IKS
  10.177.112.50   Ready      <none>   2d2h   v1.30.0+IKS
  10.177.112.52   Ready      <none>   2d2h   v1.30.0+IKS
  SchedulingDisabled example:
  NAME            STATUS                     ROLES    AGE   VERSION
  10.177.112.32   Ready,SchedulingDisabled   <none>   95m   v1.30.0+IKS
  10.177.112.50   Ready                      <none>   95m   v1.30.0+IKS
  10.177.112.52   Ready                      <none>   95m   v1.30.0+IKS
Result:
When the worker nodes are healthy, the calico-typha and calico-node pods can resume scaling down in the kube-system namespace and scaling up in the calico-system namespace.
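On larger clusters, it can be easier to filter the node list than to read it by eye. The following sketch prints only the nodes whose status is anything other than plain Ready, which covers both the NotReady and SchedulingDisabled cases; the function name `list_unhealthy_nodes` is hypothetical.

```shell
# Print the name and status of every node whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin
# (column 1 = node name, column 2 = status).
# list_unhealthy_nodes is a hypothetical helper name used only for this sketch.
list_unhealthy_nodes() {
  awk '$2 != "Ready" { print $1, $2 }'
}

# Usage against a live cluster (assumes kubectl is already configured for it):
#   kubectl get nodes --no-headers | list_unhealthy_nodes
```

Empty output means that every worker node reports Ready.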
To confirm the migration is complete:
- Verify that the calico-typha deployment no longer exists in the kube-system namespace.
  kubectl get deployment calico-typha -n kube-system
  Result:
  Error from server (NotFound): deployments.apps "calico-typha" not found
- Verify there aren't any nodes with the projectcalico.org/operator-node-migration label.
  kubectl get nodes -l projectcalico.org/operator-node-migration
  Result:
  No resources found
- If the migration is still stuck, replace or remove the problematic nodes. For more information, see Debugging worker nodes.
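The two verification commands above can be combined into a single check. This is a minimal sketch rather than an official tool: the function name `migration_complete` and the `KUBECTL` override variable are hypothetical, and the first check treats any failure of the `kubectl get deployment` command as "not found", so connectivity to the cluster should be confirmed separately.

```shell
# KUBECTL can be overridden (for example, with a wrapper); defaults to kubectl.
KUBECTL=${KUBECTL:-kubectl}

# migration_complete is a hypothetical helper name used only for this sketch.
# It returns 0 when both checks from the list above pass, 1 otherwise.
migration_complete() {
  # Check 1: the calico-typha deployment must be gone from kube-system.
  # Caveat: any kubectl failure here is treated as "deployment not found".
  if "$KUBECTL" get deployment calico-typha -n kube-system >/dev/null 2>&1; then
    echo "calico-typha deployment still exists in kube-system"
    return 1
  fi
  # Check 2: no node may still carry the migration label.
  labeled=$("$KUBECTL" get nodes -l projectcalico.org/operator-node-migration --no-headers 2>/dev/null)
  if [ -n "$labeled" ]; then
    echo "nodes still labeled for migration:"
    echo "$labeled"
    return 1
  fi
  echo "migration complete"
}

# Usage against a live cluster (assumes kubectl is already configured for it):
#   migration_complete && echo "safe to retry the master update"
```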
When you have confirmed that the migration is complete, proceed with the master update to the target IBM Cloud Kubernetes Service version (1.32 in the example above).