Proxmox upgrade from 5.x to 6.x (Debian stretch to buster)

Proxmox Virtual Environment (PVE) is an open source server virtualization management solution based on QEMU/KVM and LXC. You can manage virtual machines, containers, highly available clusters, storage and networks with an integrated, easy-to-use web interface or via CLI.

Preconditions

  • Upgrade to the latest version of Proxmox VE 5.4.
  • Reliable access to all configured storage.
  • The root account should have a password set (one that you remember). As part of the upgrade process the sudo package will be uninstalled, and you will not be able to log in again as root if the account has no password set.
  • A healthy cluster.
  • Valid and tested backup of all VMs and CTs (in case something goes wrong).
  • Correct configuration of the repository.
  • At least 1GB free disk space at root mount point.
  • Ceph: upgrade the Ceph cluster to Nautilus after you have upgraded Proxmox VE; follow the guide Ceph Luminous to Nautilus.
  • Check known upgrade issues
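
Two of the preconditions above (free disk space and the root password) lend themselves to a quick scripted check. The following is only a minimal sketch, not an official tool: the helper names root_free_kib, has_min_free and password_is_set are made up for illustration, the threshold is taken as 1 GiB, and it assumes that passwd -S reports a usable password as P in the second field, as it does on Debian.

```shell
# Free space on the root mount point in KiB (POSIX df output, 4th column).
root_free_kib() {
    df -Pk / | awk 'NR == 2 { print $4 }'
}

# Succeeds if the given KiB value is at least 1 GiB (1048576 KiB).
has_min_free() {
    [ "$1" -ge 1048576 ]
}

# Succeeds if a "passwd -S" status line marks the password as usable (P).
# Assumption: the status letter is the second whitespace-separated field.
password_is_set() {
    [ "$(printf '%s\n' "$1" | awk '{ print $2 }')" = "P" ]
}

# Example usage on a node, as root:
#   has_min_free "$(root_free_kib)" || echo "less than 1 GiB free on /"
#   password_is_set "$(passwd -S root)" || echo "root has no usable password"
```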

Testing the upgrade
An upgrade test can easily be performed using a standalone server first. Install the Proxmox VE 5.4 ISO on some test hardware; then upgrade this installation to the latest minor version of Proxmox VE 5.4 (see Package repositories). To replicate the production setup as closely as possible, copy or create all relevant configurations to the test machine. Then start the upgrade. It is also possible to install Proxmox VE 5.4 in a VM and test the upgrade in this environment.

Actions step-by-step
The following actions need to be done on the command line of each Proxmox VE node in your cluster (via console or ssh; preferably via console to avoid interrupted ssh connections). Remember: make sure that a valid backup of all VMs and CTs has been created before proceeding.

Continuously use the pve5to6 checklist script
A small checklist program named pve5to6 is included in the latest Proxmox VE 5.4 packages. The program will provide hints and warnings about potential issues before, during and after the upgrade process. One can call it by executing:

pve5to6

Output:

= CHECKING VERSION INFORMATION FOR PVE PACKAGES =

Checking for package updates..
PASS: all packages uptodate

Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 5.4-2

Checking running kernel version..
PASS: expected running kernel '4.15.18-20-pve'.

= CHECKING CLUSTER HEALTH/SETTINGS =

SKIP: standalone node.

= CHECKING HYPER-CONVERGED CEPH STATUS =

SKIP: no hyper-converged ceph setup detected!

= CHECKING CONFIGURED STORAGES =

PASS: storage 'local' enabled and active.
PASS: storage 'local-lvm' enabled and active.

= MISCELLANEOUS CHECKS =

INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for running guests..
PASS: no running guest detected.
INFO: Checking if the local node's hostname 'pve115' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '192.168.0.115' configured and active on single interface.
INFO: Check node certificate's RSA key size
PASS: Certificate 'pve-root-ca.pem' passed Debian Busters security level for TLS connections (4096 >= 2048)
PASS: Certificate 'pve-ssl.pem' passed Debian Busters security level for TLS connections (2048 >= 2048)
INFO: Checking KVM nesting support, which breaks live migration for VMs using it..
PASS: KVM nested parameter not set.

= SUMMARY =

TOTAL:    15
PASSED:   13
SKIPPED:  2
WARNINGS: 0
FAILURES: 0

This script only checks and reports. By default it makes no changes to the system, so none of the reported issues are fixed automatically. Keep in mind that Proxmox VE can be heavily customized, so the script may not recognize every possible problem of a particular setup!

It is recommended to re-run the script after each attempt to fix an issue. This ensures that the actions taken actually fixed the respective warning.
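
When re-running the script repeatedly, it can help to save the report and look only at the summary counters. The sketch below assumes the summary lines appear in the plain "LABEL: number" form shown in the output above (without terminal color codes); the function names summary_count and report_is_clean are hypothetical helpers, not part of pve5to6.

```shell
# Print the number that follows "LABEL:" in a saved pve5to6 report.
summary_count() {
    awk -v label="$1:" '$1 == label { print $2 }' "$2"
}

# Succeeds only if the report shows zero warnings and zero failures.
# A report without those lines is treated as not clean.
report_is_clean() {
    [ "$(summary_count WARNINGS "$1")" = "0" ] &&
        [ "$(summary_count FAILURES "$1")" = "0" ]
}

# Example usage on a node:
#   pve5to6 > /tmp/pve5to6.log
#   report_is_clean /tmp/pve5to6.log && echo "no warnings or failures"
```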

Cluster: always upgrade to Corosync 3 first
With Corosync 3 the on-the-wire format has changed. It is now incompatible with Corosync 2.x because the underlying multicast UDP stack was replaced with kronosnet. Configuration files generated by Proxmox VE 5.2 or newer are already compatible with the new Corosync 3.x (at least enough to process the upgrade without any issues).

Important Note: before the upgrade, stop all HA management services first—no matter which way you choose for upgrading to Corosync 3. Stopping all HA services ensures that no cluster nodes get fenced during the upgrade. This also means that there will not be any HA functionality available for the short duration of the Corosync upgrade.

First, make sure that all warnings reported by the checklist script that are not related to Corosync are fixed or determined to be benign (false positives). Next, stop the local resource manager “pve-ha-lrm” on each node. Only after it has been stopped on every node, stop the cluster resource manager “pve-ha-crm” on each node as well. You can use the GUI (Node -> Services) or the CLI. To stop the local resource manager, run the following command on each node:

systemctl stop pve-ha-lrm

Only after this has been done on all nodes, stop the cluster resource manager by running the following on each node:

systemctl stop pve-ha-crm
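
The ordering matters here: pve-ha-lrm has to be stopped on every node before pve-ha-crm is stopped anywhere. As an illustration, the sketch below only prints the commands in that order and executes nothing. The function name ha_stop_plan, the node names and root SSH access between nodes are assumptions for the example.

```shell
# Print the stop commands in the required order: pve-ha-lrm on every node
# first, then pve-ha-crm on every node. Nothing is executed.
ha_stop_plan() {
    for service in pve-ha-lrm pve-ha-crm; do
        for node in "$@"; do
            printf 'ssh root@%s systemctl stop %s\n' "$node" "$service"
        done
    done
}

# Example (hypothetical node names); review the output, then run the
# printed commands one at a time:
#   ha_stop_plan pve1 pve2 pve3
```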

Then add the Proxmox Corosync 3 Stretch repository:

echo "deb http://download.proxmox.com/debian/corosync-3/ stretch main" > /etc/apt/sources.list.d/corosync3.list

and run

apt update

Then make sure again that only corosync, kronosnet and their libraries will be updated or newly installed:

apt list --upgradeable

Output:

Listing... Done
corosync/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcmap4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcorosync-common4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcpg4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libqb0/stable 1.0.5-1~bpo9+2 amd64 [upgradable from: 1.0.3-1~bpo9]
libquorum5/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libvotequorum8/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
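
Instead of checking the list by eye, the package names can be compared against an allow-list. This is a rough sketch under assumptions: the function name unexpected_upgrades is hypothetical, and the allow-list contains the packages from the output above plus libknet1 and libnozzle1 for kronosnet, so adjust it to what your nodes actually report.

```shell
# Read saved "apt list --upgradeable" output and print every package name
# that is not on the Corosync/kronosnet allow-list. Prints nothing if all
# upgradable packages are expected.
unexpected_upgrades() {
    # Lines without a slash (such as the "Listing..." header) are skipped;
    # the package name is the part before the first slash.
    awk -F/ 'NF > 1 { print $1 }' "$1" |
        grep -Ev '^(corosync|libcmap4|libcorosync-common4|libcpg4|libqb0|libquorum5|libvotequorum8|libknet1|libnozzle1)$'
}

# Example usage on a node:
#   apt list --upgradeable > /tmp/upgradable.txt
#   unexpected_upgrades /tmp/upgradable.txt
```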

There are two ways to proceed with the Corosync upgrade:

  • Upgrade nodes one by one. Initially, the newly upgraded node(s) will not be quorate on their own. Once at least half of the nodes plus one have been upgraded, the upgraded partition will become quorate and the not-yet-upgraded partition will lose quorum. Once all nodes have been upgraded, they should form a healthy, quorate cluster again.
  • Upgrade all nodes simultaneously, e.g. using parallel ssh/screen/tmux.
    Note: changes to any VM/CT or the cluster in general are not allowed for the duration of the upgrade!
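
For the one-by-one approach, the point at which quorum moves to the upgraded partition is simple arithmetic. The sketch below assumes the default of one vote per node; the function names are made up for illustration.

```shell
# Votes needed for quorum in a cluster of N nodes (one vote per node).
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds if UPGRADED nodes out of TOTAL are enough to be quorate.
upgraded_partition_quorate() {
    [ "$1" -ge "$(quorum_votes "$2")" ]
}

# Example: in a 5-node cluster the upgraded partition becomes quorate
# once 3 nodes run Corosync 3.
#   quorum_votes 5
```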

Pre-download the upgrade to corosync-3 on all nodes, e.g., with:

apt dist-upgrade --download-only

Then run the actual upgrade on all nodes:

apt dist-upgrade

At any point in this procedure, the local view of the cluster quorum on a node can be verified with:

pvecm status
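
If you want to test the quorum state in a script, the status output can be filtered. This sketch assumes that pvecm status prints a line of the form "Quorate: Yes" or "Quorate: No"; the function name is_quorate is hypothetical.

```shell
# Read "pvecm status" output on stdin; succeed if the node reports quorum.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Example usage on a node:
#   pvecm status | is_quorate && echo "this node is quorate"
```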

Once the update to Corosync 3.x is done on all nodes, restart the local resource manager and cluster resource manager on all nodes:

systemctl start pve-ha-lrm
systemctl start pve-ha-crm

Move important Virtual Machines and Containers
If any VMs and CTs need to keep running for the duration of the upgrade, migrate them away from the node that is currently being upgraded. A migration of a VM or CT from an older version of Proxmox VE to a newer version will always work. A migration from a newer Proxmox VE version to an older version may work, but is in general not supported. Keep this in mind when planning your cluster upgrade.

Update the configured APT repositories
First, make sure that the system is running using the latest Proxmox VE 5.4 packages:

apt update
apt dist-upgrade

Update all Debian repository entries to Buster.

sed -i 's/stretch/buster/g' /etc/apt/sources.list
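
Since sed -i changes the file in place, it may be worth previewing the substitution first. The following sketch (the function name preview_buster_switch is an assumption) prints a diff of what the replacement would change and leaves the file untouched.

```shell
# Show which lines the stretch -> buster substitution would change,
# without modifying the file. Lines starting with ">" are the new version.
preview_buster_switch() {
    # diff exits non-zero when there are differences, which is expected here.
    sed 's/stretch/buster/g' "$1" | diff "$1" - || true
}

# Example usage on a node:
#   preview_buster_switch /etc/apt/sources.list
```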

Disable all Proxmox VE 5.x repositories. This includes the pve-enterprise repository, the pve-no-subscription repository and the pvetest repository.

To do so, add the # symbol at the start of the relevant lines to comment out these repositories in the /etc/apt/sources.list.d/pve-enterprise.list and /etc/apt/sources.list files. See Package Repositories.
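
Commenting out the lines can also be done with sed rather than an editor. This is a sketch only: it assumes GNU sed (as shipped with Debian), that the Proxmox VE repository lines contain "proxmox.com/debian/pve", and the function name comment_out_pve_repos is made up. A copy of the original file is kept with a .bak suffix.

```shell
# Prefix every active "deb" line that points at a Proxmox VE repository
# with "# ". Other lines (Debian, Ceph, Corosync 3) are left untouched.
comment_out_pve_repos() {
    sed -E -i.bak '\|^[[:space:]]*deb[[:space:]].*proxmox\.com/debian/pve[[:space:]]| s|^|# |' "$1"
}

# Example usage on a node:
#   comment_out_pve_repos /etc/apt/sources.list
```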

Add the Proxmox VE 6 Package Repository

echo "deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise" > /etc/apt/sources.list.d/pve-enterprise.list

For the no-subscription repository see Package Repositories. It can be something like:

sed -i -e 's/stretch/buster/g' /etc/apt/sources.list.d/pve-install-repo.list
sed -i -e 's/stretch/buster/g' /etc/apt/sources.list.d/pve-enterprise.list

(Ceph only) Replace the ceph.com repositories with the proxmox.com Ceph repositories:

echo "deb http://download.proxmox.com/debian/ceph-luminous buster main" > /etc/apt/sources.list.d/ceph.list

If there is a backports line, remove it – currently, the upgrade has not been tested with packages from the backports repository installed.
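
To find out whether a backports line exists at all, the APT source files can be searched. A minimal sketch, assuming active entries start with "deb" or "deb-src"; the function name find_backports is hypothetical.

```shell
# Print file name, line number and content of every active APT source
# line that mentions backports. Exits non-zero if none is found.
find_backports() {
    grep -Hn -E '^[[:space:]]*deb.*backports' "$@"
}

# Example usage on a node:
#   find_backports /etc/apt/sources.list /etc/apt/sources.list.d/*.list
```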

Update the repositories data:

apt update

Upgrade the system to Debian Buster and Proxmox VE 6.0
This action will take some time depending on the system performance – up to 60 min or more. On high-performance servers with SSD storage, the dist-upgrade can be finished in 5 minutes.

Start with this step to get the initial set of upgraded packages:

apt dist-upgrade

During the steps above, you may be asked whether new package versions should replace existing configuration files. These prompts are not relevant to the Proxmox VE upgrade itself, so you can choose whichever option you prefer.

Reboot the system in order to use the new PVE kernel:

reboot

After the Proxmox VE upgrade
For Clusters
Remove the extra Corosync 3 repository that was used to upgrade Corosync on PVE 5 / Stretch, if not done already. If you followed the steps here, you can simply execute the following command to do so:

rm /etc/apt/sources.list.d/corosync3.list

For Hyperconverged Ceph
Now you should upgrade the Ceph cluster to the Nautilus release, following the article Ceph Luminous to Nautilus.

Sources:

https://pve.proxmox.com/wiki/Upgrade_from_5.x_to_6.0

 
