Rolling Back RKE2
You can roll back the RKE2 Kubernetes version after an upgrade by combining an RKE2 binary downgrade with a datastore restore. Rollback works for every datastore type: single-node SQLite, an external datastore, or embedded etcd. To roll back to a previous Kubernetes minor version, you must have a datastore snapshot that was taken while the cluster was running that minor version.
If you cannot restore the database, you cannot roll back to a previous minor version.
Important Considerations
- Backups: Before upgrading, ensure you have a valid database or etcd snapshot from your cluster running the older version of RKE2. Without a backup, a rollback is impossible.
- Potential Data Loss: The rke2-killall.sh script forcefully terminates RKE2 processes and may result in data loss if applications are not properly shut down.
- Version Specifics: Always verify RKE2 and component versions before and after the rollback.
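For embedded etcd, the backup requirement above can be met with an on-demand snapshot taken before the upgrade. The sketch below assumes the default snapshot directory (/var/lib/rancher/rke2/server/db/snapshots); the snapshot name pre-upgrade and the helper latest_snapshot are illustrative names, not part of RKE2.

```shell
#!/bin/sh
# Sketch: confirm that a snapshot file exists before upgrading.
# SNAPSHOT_DIR assumes the default RKE2 data directory; adjust it if your
# cluster uses a custom data-dir or snapshot directory.
SNAPSHOT_DIR="${SNAPSHOT_DIR:-/var/lib/rancher/rke2/server/db/snapshots}"

# latest_snapshot [DIR]
# Print the most recently modified file in DIR, or return 1 if DIR is empty.
# (Hypothetical helper; relies on snapshot file names containing no newlines.)
latest_snapshot() {
    dir="${1:-$SNAPSHOT_DIR}"
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s/%s\n' "$dir" "$newest"
}

# Usage on a server node, as root, before starting the upgrade:
#   rke2 etcd-snapshot save --name pre-upgrade
#   latest_snapshot || echo "no snapshot found, do not upgrade yet" >&2
```

Copying the reported file off the node is worth considering, because the snapshot directory sits under the same db directory that later steps delete on the other server nodes.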
Rolling Back an RKE2 Cluster
- SQLite
- Embedded etcd
- External Database
To roll back an RKE2 cluster when using a SQLite database, replace the .db file with the copy of the .db file you made while backing up your database.
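The file swap can be sketched as a small helper. This is a minimal illustration, not an official procedure: the function name restore_db_copy and both paths are placeholders, and stopping RKE2 before touching the file is an assumption the text above does not state.

```shell
#!/bin/sh
# restore_db_copy BACKUP LIVE
# Replace the live .db file with the backup copy, keeping the current file
# next to it as LIVE.pre-rollback so the swap can be undone.
# (Hypothetical helper; pass the paths that apply to your installation.)
restore_db_copy() {
    backup=$1
    live=$2
    if [ ! -f "$backup" ]; then
        echo "backup not found: $backup" >&2
        return 1
    fi
    if [ -f "$live" ]; then
        cp -p "$live" "$live.pre-rollback" || return 1
    fi
    cp -p "$backup" "$live"
}

# Usage (assumes RKE2 is stopped first; both paths are placeholders):
#   rke2-killall.sh
#   restore_db_copy <PATH-TO-BACKUP-DB> <PATH-TO-LIVE-DB>
```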
To roll back an RKE2 cluster when using an embedded etcd (default), follow these steps:
- If the cluster is running and the Kubernetes API is available, gracefully stop workloads by draining all nodes:
  kubectl drain --ignore-daemonsets --delete-emptydir-data <NODE-ONE-NAME> <NODE-TWO-NAME> <NODE-THREE-NAME> ...
- On each node, stop the RKE2 service and all running pod processes:
  rke2-killall.sh
- On each node, roll back the RKE2 binary to the previous version.
  - Clusters with Internet Access:
    - Server nodes:
      curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=vX.Y.Z+rke2r1 sh -
    - Agent nodes:
      curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=vX.Y.Z+rke2r1 INSTALL_RKE2_TYPE=agent sh -
  - Air-gapped Clusters:
    - Download the artifacts and run the install script locally.
- On the first server node, or on the node without a server: entry in its RKE2 config file, initiate the cluster restore. Refer to the Snapshot Restore Steps for more information:
  rke2 server --cluster-reset --cluster-reset-restore-path=<PATH-TO-SNAPSHOT>
  Caution: This overwrites all data in the etcd datastore. Verify the snapshot's integrity before restoring. Be aware that large snapshots can take a long time to restore.
- Start the RKE2 service on the first server node:
  systemctl start rke2-server
- On the other server nodes, remove the RKE2 database directory:
  rm -rf /var/lib/rancher/rke2/server/db
- Start the RKE2 service on the other server nodes:
  systemctl start rke2-server
- Start the RKE2 service on all agent nodes:
  systemctl start rke2-agent
- Verify the RKE2 service status with systemctl status rke2-server or systemctl status rke2-agent.
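Because the embedded etcd procedure differs by node role, it can help to print the per-node command list and review it before running anything. The following is a sketch only: rollback_plan and the role labels first-server, other-server, and agent are names invented for this example, and it covers clusters with internet access. It prints commands and does not execute them.

```shell
#!/bin/sh
# rollback_plan ROLE VERSION [SNAPSHOT]
# Print, in order, the commands from the embedded etcd procedure that apply
# to a node of the given role. Nothing is executed.
# ROLE: first-server | other-server | agent (labels invented for this sketch)
rollback_plan() {
    role=$1
    version=$2
    snapshot=${3:-}
    install="curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=$version"

    case $role in
        first-server)
            if [ -z "$snapshot" ]; then
                echo "first-server requires a snapshot path" >&2
                return 1
            fi
            printf '%s\n' "rke2-killall.sh"
            printf '%s\n' "$install sh -"
            printf '%s\n' "rke2 server --cluster-reset --cluster-reset-restore-path=$snapshot"
            printf '%s\n' "systemctl start rke2-server"
            ;;
        other-server)
            printf '%s\n' "rke2-killall.sh"
            printf '%s\n' "$install sh -"
            printf '%s\n' "rm -rf /var/lib/rancher/rke2/server/db"
            printf '%s\n' "systemctl start rke2-server"
            ;;
        agent)
            printf '%s\n' "rke2-killall.sh"
            printf '%s\n' "$install INSTALL_RKE2_TYPE=agent sh -"
            printf '%s\n' "systemctl start rke2-agent"
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
}

# Usage:
#   rollback_plan first-server <VERSION> <PATH-TO-SNAPSHOT>
#   rollback_plan other-server <VERSION>
#   rollback_plan agent <VERSION>
```

The plan is per node; the ordering across nodes still follows the steps above: restore and start the first server node, then the other server nodes, then the agents.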
To roll back an RKE2 cluster when using an external database (e.g., PostgreSQL, MySQL), follow these steps:
- If the cluster is running and the Kubernetes API is available, gracefully stop workloads by draining all nodes:
  kubectl drain --ignore-daemonsets --delete-emptydir-data <NODE-ONE-NAME> <NODE-TWO-NAME> <NODE-THREE-NAME> ...
  Note: This process may disrupt running applications.
- On each node, stop the RKE2 service and all running pod processes:
  rke2-killall.sh
- Restore a database snapshot taken before upgrading RKE2 and verify the integrity of the database. For example, if you're using PostgreSQL, run the following command:
  pg_restore -U <DB-USER> -d <DB-NAME> <BACKUP-FILE>
- On each node, roll back the RKE2 binary to the previous version.
  - Clusters with Internet Access:
    - Server nodes:
      curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=vX.Y.Z+rke2r1 sh -
    - Agent nodes:
      curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=vX.Y.Z+rke2r1 INSTALL_RKE2_TYPE=agent sh -
  - Air-gapped Clusters:
    - Download the artifacts and run the install script locally.
- Start the RKE2 service on each node:
  systemctl start rke2-server   # on agent nodes: systemctl start rke2-agent
- Verify the RKE2 service status with systemctl status rke2-server or systemctl status rke2-agent.
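The drain step in the procedure above lists every node by hand. On larger clusters, one possible approach is to read the node list from the API and drain each node in turn. The helper name drain_all_nodes is invented for this sketch, which assumes kubectl is already configured for the cluster.

```shell
#!/bin/sh
# drain_all_nodes
# Drain every node reported by the API, one at a time, with the same flags
# used in the procedure above. A failed drain is reported and the loop
# continues with the next node.
drain_all_nodes() {
    # `kubectl get nodes -o name` prints one "node/<name>" entry per line.
    kubectl get nodes -o name | while IFS= read -r node; do
        name=${node#node/}   # strip the "node/" prefix
        kubectl drain --ignore-daemonsets --delete-emptydir-data "$name" </dev/null \
            || echo "drain failed for $name" >&2
    done
}

# Usage:
#   drain_all_nodes
```

Draining nodes one at a time is slower than a single multi-node command, but it makes it clear which node a failure belongs to.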
Verification
After the rollback, verify the following:
- RKE2 version: rke2 --version
- Kubernetes cluster health: kubectl get nodes
- Application functionality.
- Check the RKE2 logs for errors.
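The checks above can be partly scripted. The sketch below filters kubectl get nodes output for nodes that are not ready; not_ready_nodes is a hypothetical helper name, and the journalctl line assumes RKE2 runs as a systemd service that logs to the journal.

```shell
#!/bin/sh
# not_ready_nodes
# Read `kubectl get nodes --no-headers` output on stdin and print the name of
# every node whose STATUS column does not begin with "Ready".
# A drained node reports "Ready,SchedulingDisabled" and is treated as ready.
not_ready_nodes() {
    awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage:
#   rke2 --version
#   kubectl get nodes --no-headers | not_ready_nodes
#   journalctl -u rke2-server --since "1 hour ago" | grep -i error
```

Nodes that were drained before the rollback remain cordoned and show SchedulingDisabled until you run kubectl uncordon <NODE-NAME>, so application checks may fail until that is done.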