How to troubleshoot broken ssh if you do GPU passthrough?

posted 24 days ago by berylenara@sh.itjust.works in !linux@lemmy.ml (24 comments)

This hasn’t happened to me yet but I was just thinking about it. Let’s say you have a server with an iGPU, and you use GPU passthrough to let VMs use the iGPU. And then one day the host’s ssh server breaks, maybe you did something stupid or there was a bad update. Are you fucked? How could you possibly recover, with no display and no SSH? The only thing I can think of is setting up serial access for emergencies like this, but I rarely hear about serial access nowadays so I wonder if there’s some other solution here.

qjkxbmwvz@startrek.website

3 points

23 days ago

For very simple tasks you can usually blindly log in and run commands. I’ve done this with very simple tasks, e.g., rebooting or bringing up a network interface. It’s maybe not the smartest, but basically, just type root, the root password, and dhclient eth0 or whatever magic you need. No display required, unless you make a typo…

In your specific case, you could have a shell script that stops VMs and disables passthrough, so you just log in and invoke that script. Bonus points if you create a dedicated user with that script set as their shell (or just put in the appropriate dot rc file).
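A minimal sketch of such a script, under assumptions the thread does not confirm: the VMs are managed by libvirt (`virsh`), the iGPU sits at PCI address `0000:00:02.0`, and the script name `gpu-rescue` is made up. It is a dry run by default so it can be rehearsed while SSH still works; `--apply` makes it act for real.

```shell
#!/usr/bin/env bash
# gpu-rescue (hypothetical name): stop VMs and hand the GPU back to the host.
# Assumptions, not from the thread: VMs are managed by libvirt (virsh) and the
# iGPU sits at PCI address 0000:00:02.0 (check yours with `lspci -D`).
# Dry run by default so it can be rehearsed over SSH; pass --apply as root.

GPU_ADDR="${GPU_ADDR:-0000:00:02.0}"
APPLY=0

run() {
    # Execute the command with --apply, otherwise just print it.
    if [ "$APPLY" -eq 1 ]; then "$@"; else printf 'would run: %s\n' "$*"; fi
}

write_sysfs() {
    # Usage: write_sysfs VALUE PATH
    if [ "$APPLY" -eq 1 ]; then
        printf '%s\n' "$1" > "$2"
    else
        printf 'would write "%s" to %s\n' "$1" "$2"
    fi
}

stop_vms() {
    local vm
    # `virsh list --name` prints the names of running domains, one per line.
    for vm in $(virsh list --name 2>/dev/null); do
        run virsh destroy "$vm"   # hard power-off; unsaved guest data may be lost
    done
}

release_gpu() {
    local dev="/sys/bus/pci/devices/$GPU_ADDR"
    write_sysfs "$GPU_ADDR" "$dev/driver/unbind"        # detach the current driver (vfio-pci)
    write_sysfs "" "$dev/driver_override"               # clear any forced driver
    write_sysfs "$GPU_ADDR" /sys/bus/pci/drivers_probe  # ask the kernel to rebind
}

if [ "${1:-}" = "--apply" ]; then APPLY=1; fi
stop_vms
release_gpu
```

Typed blind, that would be `gpu-rescue --apply` as root. Whether the console actually comes back depends on the host driver picking the device up again, and if vfio-pci claims the GPU by ID on the kernel command line it may simply grab it again, so treat this as a starting point rather than a guaranteed fix.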

berylenara@sh.itjust.worksOP

2 points

23 days ago

I’ll admit I’ve done this too 😅 Not ideal but a good idea nonetheless

Max-P@lemmy.max-p.me

8 points

23 days ago

I just have a boot entry that doesn’t do the passthrough, doesn’t bind to vfio-pci and doesn’t start the VMs on boot so I can inspect and troubleshoot.

berylenara@sh.itjust.worksOP

3 points

23 days ago

That sounds brilliant. Have any resources to learn how to do something like this? I’ve never created custom boot entries before

Max-P@lemmy.max-p.me

6 points

23 days ago

I use systemd-boot so it was pretty easy, and it should be similar in GRUB:

title My boot entry that starts the VM
linux /vmlinuz-linux-zen
initrd /amd-ucode.img
initrd /initramfs-linux-zen.img
options quiet splash root=ZSystem/linux/archlinux rw pcie_aspm=off iommu=on systemd.unit=qemu-vms.target

What you want is that part: systemd.unit=qemu-vms.target which tells systemd which target to boot to. I launch my VMs with scripts so I have the qemu-vms.target and it depends on the VMs I want to autostart. A target is a set of services to run for a desired system state, the default usually being graphical or multi-user, but really it can be anything, and use whatever set of services you want: start network, don’t start network, mount drives, don’t mount drives, entirely up to you.

https://man.archlinux.org/man/systemd.target.5.en

You can also see if there’s a predefined rescue target that fits your need and just goes to a local console: https://man.archlinux.org/man/systemd.special.7.en
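For the fallback side of this, here is a rough sketch that derives a second entry from an existing one by dropping the options that trigger the VMs. It assumes passthrough is switched on only through kernel options like `systemd.unit=` or `vfio-pci.ids=`; if the binding happens in modprobe.d or the initramfs, that needs separate handling. The function name `make_fallback_entry` is made up for illustration.

```shell
#!/usr/bin/env bash
# Derive a passthrough-free fallback entry from an existing systemd-boot entry.
# Assumption: the VM setup is triggered only by kernel options such as
# systemd.unit=... or vfio-pci.ids=...; extend the pattern for your setup.

make_fallback_entry() {
    # Reads a loader entry on stdin, writes the fallback entry to stdout.
    sed -E \
        -e 's/^title .*/title Fallback (no passthrough, no VM autostart)/' \
        -e '/^options /s/[[:space:]]+(systemd\.unit|vfio-pci\.ids)=[^[:space:]]+//g' \
        -e '/^options /s/$/ systemd.unit=multi-user.target/'
}

# Example: feed it the entry shown above.
make_fallback_entry <<'EOF'
title My boot entry that starts the VM
linux /vmlinuz-linux-zen
initrd /amd-ucode.img
initrd /initramfs-linux-zen.img
options quiet splash root=ZSystem/linux/archlinux rw pcie_aspm=off iommu=on systemd.unit=qemu-vms.target
EOF
```

In practice the input would be an existing file under the loader entries directory and the output a new file next to it (for example `fallback.conf`, a hypothetical name), so the fallback shows up in the boot menu.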

berylenara@sh.itjust.worksOP

2 points

22 days ago

This looks simple enough, I’ll have a crack at it this weekend. Thank you

NeoNachtwaechter@lemmy.world

1 point

24 days ago

Proxmox on the host. It uses a webserver for admin stuff.

No other things that run on the host --> no other things that break on the host.

berylenara@sh.itjust.worksOP

1 point

23 days ago

If you want to lock down the web server and ssh behind a VPN, that’s where you can fuck up and lock yourself out though.

mvirts@lemmy.world

1 point

24 days ago

Live boot, plug in a display?

Maybe I’m missing something here, but won’t booting from live media run a normal environment?

If you don’t have a live boot option you can also pull the disk and fix it on another machine, or put a different boot disk in the system entirely.

You can probably also disable hardware virtualization extensions in the bios to break the VM so it doesn’t steal the graphics card.

berylenara@sh.itjust.worksOP

3 points

23 days ago

A rescue ISO doesn’t work if you have an encrypted disk. I thought everybody encrypted their disks nowadays.

> If you don’t have a live boot option you can also pull the disk and fix it on another machine, or put a different boot disk in the system entirely.

This is an interesting idea though, as long as the other machine has a different GPU then the system shouldn’t hijack it on startup.

> You can probably also disable hardware virtualization extensions in the bios to break the VM so it doesn’t steal the graphics card.

AFAIK GPU passthrough is usually configured to detach the GPU from the host automatically on startup. So even if all VMs were broken, the GPU would still be detached. However as another commenter pointed out, it’s possible to detach it manually which might be safer against accidental lockouts.
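To see which side currently owns the GPU, the driver symlink in sysfs can be read. A small sketch, assuming the iGPU is at `0000:00:02.0` (check with `lspci -D`); the function name `gpu_driver` is made up.

```shell
#!/usr/bin/env bash
# Print the kernel driver currently bound to a PCI device, or "none".
# The second argument only exists so the sysfs root can be swapped for testing.

gpu_driver() {
    # Usage: gpu_driver PCI_ADDR [SYSFS_ROOT]
    local addr="$1" root="${2:-/sys}"
    local link="$root/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"   # the link target ends in the driver name
    else
        echo none
    fi
}

gpu_driver "${GPU_ADDR:-0000:00:02.0}"
```

If this prints `vfio-pci` right after boot, the GPU was detached from the host at startup regardless of whether any VM is running.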

Max-P@lemmy.max-p.me

2 points

23 days ago

How’s the disk encrypted? I’ve never heard of anyone setting up an encrypted drive such that you can’t manually mount it with the password. It’s possible but you’d have to go out of your way to do that and only encrypt the drive with a TPM-managed key. It’s kind of a bad idea because if you lock yourself out your data’s gone.

berylenara@sh.itjust.worksOP

1 point

23 days ago

I was confused about how secure boot and disk encryption work, ignore me 😅

mvirts@lemmy.world

1 point

23 days ago

😅 Nah, for me encryption is a bigger risk than theft.

That said, you should be able to decrypt your disks with the right key even on a live boot. Even if the secrets are in the tpm you should be able to use whatever your normal system uses to decrypt the disks.

If you don’t enter a password to boot, the keys are available. If you do, the password can decrypt the keys afaik.

Again, I don’t do this, but that’s what I’ve picked up here and there, so take it with a grain of salt; I may be wrong.

berylenara@sh.itjust.worksOP

2 points

23 days ago

Actually that might work. I thought that secure boot and disk encryption would prevent mounting the disk to a different system, but now I can’t think of any reason why it would. Good idea

horse_battery_staple@lemmy.world

7 points

24 days ago

Boot to live disk.

Edit vmconfig to not start at boot.

Mount vmdisk to live disk

Fix ssh
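The steps above could look roughly like this from a live USB. This is a sketch under assumptions: the host root filesystem is LUKS-encrypted on the partition you pass in, and the names `rescueroot` and `/mnt/rescue` are arbitrary. How to disable VM autostart depends on the VM manager, so that step is left out.

```shell
#!/usr/bin/env bash
# Run as root from a live environment. Assumes a LUKS-encrypted root partition;
# the mapper name and mount point below are illustrative choices.

unlock_and_check() {
    local part="$1" name="rescueroot" mnt="/mnt/rescue"
    cryptsetup open "$part" "$name" || return 1   # prompts for the passphrase
    mkdir -p "$mnt"
    mount "/dev/mapper/$name" "$mnt" || return 1
    # Validate the installed sshd configuration and host keys without starting it.
    chroot "$mnt" /usr/sbin/sshd -t
}

if [ -n "${1:-}" ]; then
    unlock_and_check "$1"
else
    echo "usage: $0 /dev/PARTITION" >&2
fi
```

If `sshd -t` reports an error, the config under `/mnt/rescue/etc/ssh/` can be edited directly from the live system before rebooting.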

berylenara@sh.itjust.worksOP

1 point

23 days ago

As mentioned in another reply, this doesn’t work if you have an encrypted disk. The price of security, I suppose.

Edit: never mind, I thought that secure boot and disk encryption would prevent you from mounting the disk on another system, but that appears to be wrong.
