Proxmox Host SSH keys

From RoseWiki
Latest revision as of 21:21, 28 May 2026

Introduction

Proxmox nodes in a cluster communicate with each other over the Secure Shell (SSH) protocol, which is configured to require key-based authentication. Each node keeps the public halves of these keys in /root/.ssh/id_rsa.pub (the root user key) and in /etc/ssh/ssh_host_rsa_key.pub (the host key). When you remove or replace a Proxmox node, some manual cleanup is required: remove the old key from /etc/pve/priv/authorized_keys and, if a new machine is joining in its place, add the new one.

Removing keys from a removed machine

Delete the old SSH host keys on the host you are troubleshooting or have removed:

 rm /etc/ssh/ssh_host_*

Next, make sure the old key is no longer trusted by the other machines. Open /etc/pve/priv/known_hosts and remove the entry that carries the hostname of the machine you are treating. Also remove that machine's key from /etc/pve/priv/authorized_keys.
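
The cleanup above can also be done with a small filter instead of a text editor. The following is a minimal sketch, not a Proxmox tool: the function names are hypothetical, it assumes unhashed known_hosts entries, and it assumes the key comment in authorized_keys has the form root@<hostname>. Both functions only print a filtered copy and never touch the input file.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of Proxmox or OpenSSH.

# drop_known_host <hostname> <known_hosts-file>
# Print every line whose host field (a comma-separated list of names)
# does not contain <hostname>. Hashed entries are not matched.
drop_known_host() {
    awk -v host="$1" '
        {
            n = split($1, names, ",")
            keep = 1
            for (i = 1; i <= n; i++)
                if (names[i] == host) keep = 0
            if (keep) print
        }' "$2"
}

# drop_authorized_key <hostname> <authorized_keys-file>
# Print every line whose last field is not root@<hostname>.
# Assumption: the key comment has the form root@<hostname>.
drop_authorized_key() {
    awk -v comment="root@$1" '$NF != comment' "$2"
}
```

For example, drop_known_host pve1 /etc/pve/priv/known_hosts > /tmp/known_hosts.new writes the filtered result to a temporary file, which you can review before copying it back over the original.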

Adding the new key

Then recreate the host keys with OpenSSH:

 dpkg-reconfigure openssh-server

After doing this, copy the new key from /etc/ssh/ssh_host_rsa_key.pub on the treated machine to /etc/pve/priv/authorized_keys. You will most likely have to do this by hand, copying from the broken machine to a machine that is still in the cluster, because with the keys removed corosync may not copy the files correctly. Then make sure the fingerprint has been updated. On the treated machine, run:

 pvecm updatecerts -f
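
To check whether the new key actually arrived in a cluster file, you can compare the key material directly. This is a minimal sketch under stated assumptions: key_blob_present is a hypothetical helper name, and it assumes the public key file has the usual single-line "type base64-key comment" layout.

```shell
#!/bin/sh
# Sketch only: key_blob_present is a hypothetical helper, not a Proxmox command.

# key_blob_present <pubkey-file> <target-file>
# Exit 0 if the base64 key from <pubkey-file> appears as a field on any
# line of <target-file>, 1 if it does not, 2 if no key could be read.
key_blob_present() {
    # Field 2 of a public key file is the base64-encoded key.
    blob=$(awk 'NR == 1 { print $2 }' "$1")
    [ -n "$blob" ] || return 2
    awk -v blob="$blob" '
        { for (i = 1; i <= NF; i++) if ($i == blob) found = 1 }
        END { exit !found }' "$2"
}
```

For example, key_blob_present /etc/ssh/ssh_host_rsa_key.pub /etc/pve/priv/known_hosts && echo present reports whether the current host key is listed. If it is not, the manual correction in the next section may be needed.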

Manual correction if the above fails[1]

If this fails (which it might), copy the public key of the troublesome node from /etc/ssh/ssh_host_rsa_key.pub to /etc/pve/nodes/<node>/ssh_known_hosts and prefix it with that machine's hostname. Assuming a hostname of pve1, the line should read:

 pve1 ssh-rsa <key>
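
The line above can be built from the public key file rather than typed by hand. This is a minimal sketch: known_hosts_line is a hypothetical helper name, and it assumes the key file has the usual "type base64-key comment" layout. The trailing comment is dropped.

```shell
#!/bin/sh
# Sketch only: known_hosts_line is a hypothetical helper, not a Proxmox command.

# known_hosts_line <hostname> <pubkey-file>
# Print "<hostname> <key-type> <base64-key>" built from the first line
# of <pubkey-file>. Prints nothing if that line has fewer than two fields.
known_hosts_line() {
    awk -v host="$1" 'NR == 1 && NF >= 2 { print host, $1, $2 }' "$2"
}
```

For example, known_hosts_line pve1 /etc/ssh/ssh_host_rsa_key.pub prints the line for a node named pve1. Check the output before writing it to /etc/pve/nodes/<node>/ssh_known_hosts.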

Then restart the SSH daemon:

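 systemctl restart sshd

References

1. https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-3#post-660500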