this post was submitted on 07 Nov 2024

36 points (97.4% liked)

Help with ZFS Array (lemmy.ca)

submitted 2 days ago* (last edited 2 days ago) by Lem453@lemmy.ca to c/selfhosted@lemmy.world


I have a ZFS pool that I made on Proxmox. I noticed an error today. I think the issue is that the drives got renamed at some point and now it's confused. I have 5 NVMe drives in total. 4 are supposed to be in the ZFS array (the CT1000s), and the 5th, a Samsung drive, is the system/Proxmox install drive and not part of ZFS. It looks like the numbering changed: nvme1n1p1, the name of the drive that used to be in the array, now points to the Samsung drive, and the drive that is supposed to be in the array is now called nvme0n1.

root@pve:~# zpool status
  pool: zfspool1
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 00:07:38 with 0 errors on Sun Oct 13 00:31:39 2024
config:

        NAME                     STATE     READ WRITE CKSUM
        zfspool1                 DEGRADED     0     0     0
          raidz1-0               DEGRADED     0     0     0
            7987823070380178441  UNAVAIL      0     0     0  was /dev/nvme1n1p1
            nvme2n1p1            ONLINE       0     0     0
            nvme3n1p1            ONLINE       0     0     0
            nvme4n1p1            ONLINE       0     0     0

errors: No known data errors

Looking at the devices:

 nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme4n1          /dev/ng4n1            193xx6A         CT1000P1SSD8                             1           1.00  TB /   1.00  TB    512   B +  0 B   P3CR013
/dev/nvme3n1          /dev/ng3n1            1938xxFF         CT1000P1SSD8                             1           1.00  TB /   1.00  TB    512   B +  0 B   P3CR013
/dev/nvme2n1          /dev/ng2n1            192xx10         CT1000P1SSD8                             1           1.00  TB /   1.00  TB    512   B +  0 B   P3CR010
/dev/nvme1n1          /dev/ng1n1            S5xx3L      Samsung SSD 970 EVO Plus 1TB             1         289.03  GB /   1.00  TB    512   B +  0 B   2B2QEXM7
/dev/nvme0n1          /dev/ng0n1            19xxD6         CT1000P1SSD8                             1           1.00  TB /   1.00  TB    512   B +  0 B   P3CR013

Trying to use the zpool replace command gives this error:

root@pve:~# zpool replace zfspool1 7987823070380178441 nvme0n1p1
invalid vdev specification
use '-f' to override the following errors:
/dev/nvme0n1p1 is part of active pool 'zfspool1'

where it thinks nvme0n1 is still part of the array even though the zpool status command shows that it's not.

Can anyone shed some light on what is going on here? I don't want to mess with it too much since it does work right now and I'd rather not start again from scratch (backups).

I ran smartctl -a on each drive (e.g. smartctl -a /dev/nvme0n1) and there don't appear to be any SMART errors, so all the drives seem to be working well.

Any idea on how I can fix the array?


[–] possiblylinux127@lemmy.zip 2 points 21 hours ago

ZFS is aware of the physical disks so it won't randomly start using a different disk.

The disk is no longer working. There is a hardware fault somewhere.

[–] hendrik@palaver.p3x.de 16 points 2 days ago* (last edited 2 days ago) (4 children)

I don't know anything about ZFS, but in the future you might want to address them by /dev/disk/by-uuid/... or by-id and not by /dev/nvme...

[–] possiblylinux127@lemmy.zip 1 points 21 hours ago (1 children)

I don't believe this is the case

[–] hendrik@palaver.p3x.de 1 points 16 hours ago (1 children)

Care to explain?

[–] possiblylinux127@lemmy.zip 2 points 10 hours ago* (last edited 10 hours ago) (1 children)

I believe ZFS is smart enough to automatically find the disk on the system as it looks at all the other information like the disk id. It shouldn't just lose a drive.

zpool just shows the original path of the disk when it was added. Behind the scenes ZFS knows your drives. There is a chance I am totally wrong about this.

What is the output of lsblk? Any missing drives?

[–] hendrik@palaver.p3x.de 1 points 9 hours ago

Fair enough. Judging by OP's later comments, the pool is online again.

[–] Lem453@lemmy.ca 2 points 1 day ago

Thanks! I got it set up by IDs now. I originally set it up via the Proxmox GUI and it defaulted to the NVMe device names.

[–] Shdwdrgn@mander.xyz 7 points 1 day ago

That is definitely true of ZFS as well. In fact I have never seen a guide which suggests anything other than using the names found under /dev/disk/by-id/ or /dev/disk/by-uuid/, and that is to prevent this very problem. If the proper convention is used then you can plug the drives in through any available interface, in any order, and ZFS will easily re-assemble the pool at boot.

So now this raises the question... is Proxmox using some insane configuration to create drive clusters using the names they happen to boot up with???
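To see which stable names belong to a given drive, something like the following might help. It is only a rough sketch: `list_ids_for` is a made-up helper name, and it assumes `readlink -f` is available (it is on typical Linux systems). It walks the symlinks in /dev/disk/by-id and prints the ones that resolve to the device you pass in.

```shell
#!/bin/sh
# list_ids_for DEVICE [DIR]
# Hypothetical helper: print every symlink in DIR (default /dev/disk/by-id)
# that resolves to the same device node as DEVICE.
list_ids_for() {
    target="$(readlink -f "$1")" || return 1
    dir="${2:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Example (not run here): list_ids_for /dev/nvme0n1p1
```

The second argument exists only so the function can be pointed at another directory, for example /dev/disk/by-uuid.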

[–] Lem453@lemmy.ca 3 points 2 days ago (2 children)

Is there a way to change this on an existing zpool?

[–] qupada@fedia.io 8 points 1 day ago (1 children)

Generally, you just need to export the pool with zpool export zfspool1, then import again with zpool import -d /dev/disk/by-id zfspool1.

I believe it should stick after that.

Whether that will apply in its current degraded state I couldn't say.
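As a rough sketch, the two commands above could be wrapped like this. The function name `reimport_by_id` and the `ZPOOL` override variable are made up for illustration and are not part of ZFS; setting `ZPOOL="echo zpool"` just prints the commands so you can check them before running anything for real. I'm assuming the export will fail if something (a VM disk, for example) still has the pool in use.

```shell
#!/bin/sh
# Sketch only: re-import a pool so its vdevs are recorded by /dev/disk/by-id paths.
# ZPOOL is a hypothetical override hook; ZPOOL="echo zpool" gives a dry run.
reimport_by_id() {
    pool="$1"
    zpool_cmd="${ZPOOL:-zpool}"
    # Export first; stop here if the export does not succeed.
    $zpool_cmd export "$pool" || return 1
    # Re-import, searching only the by-id symlinks for member devices.
    $zpool_cmd import -d /dev/disk/by-id "$pool" || return 1
    # Show the result so the new device names can be checked.
    $zpool_cmd status "$pool"
}

# Example (not run here): reimport_by_id zfspool1
```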

[–] Lem453@lemmy.ca 3 points 1 day ago* (last edited 1 day ago) (1 children)

Thanks, this worked. I made the ZFS array in the Proxmox GUI and it used the nvmeX names by default. Interestingly, when I ran zpool export, nothing seemed to happen. I then tried zpool import and it said there were no pools available to import, but when I ran zpool status it showed the array up and working, with all 4 drives showing healthy, and it was now using device IDs. Odd, but it seems to be working correctly now.

root@pve:~# zpool status
  pool: zfspool1
 state: ONLINE
  scan: resilvered 8.15G in 00:00:21 with 0 errors on Thu Nov  7 12:51:45 2024
config:

        NAME                                                                                 STATE     READ WRITE CKSUM
        zfspool1                                                                             ONLINE       0     0     0
          raidz1-0                                                                           ONLINE       0     0     0
            nvme-eui.000000000000000100a07519e22028d6-part1                                  ONLINE       0     0     0
            nvme-nvme.c0a9-313932384532313335343130-435431303030503153534438-00000001-part1  ONLINE       0     0     0
            nvme-eui.000000000000000100a07519e21fffff-part1                                  ONLINE       0     0     0
            nvme-eui.000000000000000100a07519e21e4b6a-part1                                  ONLINE       0     0     0

errors: No known data errors

[–] hendrik@palaver.p3x.de 1 points 16 hours ago

Strange. Okay, hope that spares you from similar troubles in the future.

[–] Shdwdrgn@mander.xyz 5 points 1 day ago* (last edited 1 day ago) (1 children)

OP -- if your array is in good condition (and it looks like it is) you have an option to replace drives one by one, but this will take some time (probably over a period of days). The idea is to remove a disk from the pool by its old name, then re-add the disk under the corrected name, wait for the pool to rebuild, then do the process again with the next drive. Double-check, but I think this is the proper procedure...

zpool offline poolname /dev/nvme1n1p1

zpool replace poolname /dev/nvme1n1p1 /dev/disk/by-id/drivename

Check zpool status to confirm when the drive is done rebuilding under the new name, then move on to the next drive. This is the process I use when replacing a failed drive in a pool, and since that one drive is technically in a failed state right now, this same process should work for you to transfer over to the safe names. Keep in mind that this will probably put a lot of strain on your drives since the contents have to be rebuilt (although there is a small possibility zfs may recognize the drive contents and just start working immediately?), so be prepared in case a drive does actually fail during the process.
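For one drive, the steps above could be sketched roughly as follows. `replace_with_by_id` and the `ZPOOL` override are made-up names for illustration; with `ZPOOL="echo zpool"` the function only prints what it would run. Note that OP's earlier attempt at zpool replace was refused because the target was still seen as part of the active pool, so this may not go through as written in that situation.

```shell
#!/bin/sh
# Sketch only: swap one vdev from its old kernel name to a by-id path.
# ZPOOL is a hypothetical override hook; ZPOOL="echo zpool" gives a dry run.
replace_with_by_id() {
    pool="$1"
    old="$2"   # name the pool currently uses, e.g. /dev/nvme1n1p1
    new="$3"   # stable path under /dev/disk/by-id/
    zpool_cmd="${ZPOOL:-zpool}"
    # Take the vdev offline under its old name.
    $zpool_cmd offline "$pool" "$old" || return 1
    # Bring the same disk back under its stable name.
    $zpool_cmd replace "$pool" "$old" "$new" || return 1
    # Check resilver progress before touching the next drive.
    $zpool_cmd status "$pool"
}

# Example (not run here):
# replace_with_by_id poolname /dev/nvme1n1p1 /dev/disk/by-id/drivename
```

Only move on to the next drive once zpool status shows the resilver has finished.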

[–] Lem453@lemmy.ca 1 points 1 day ago (1 children)

Thanks for this! Luckily the above suggestion to export and import worked right away so this was not needed.

[–] Shdwdrgn@mander.xyz 2 points 23 hours ago

Yeah I figured there would be multiple answers for you. Just keep in mind that you DO want to get it fixed at some point to use the disk id instead of the local device name. That will allow you to change hardware or move the whole array to another computer.
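If you want a quick check for whether any vdevs are still listed by a local device name, a small filter over the zpool status output might do. This is just a sketch: `kernel_named_vdevs` is a made-up name, and the pattern only covers nvmeXnY[pZ] and sdX[N] style names, which is an assumption about what your system uses.

```shell
#!/bin/sh
# Sketch only: read `zpool status` text on stdin and print vdevs that are
# listed by a kernel device name (nvme0n1p1, sda1, ...) rather than a stable ID.
kernel_named_vdevs() {
    awk '
        $1 ~ /^(nvme[0-9]+n[0-9]+(p[0-9]+)?|sd[a-z]+[0-9]*)$/ &&
        $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ {
            print $1
        }
    '
}

# Example (not run here): zpool status zfspool1 | kernel_named_vdevs
```

No output means every vdev in the listing is already using something other than those kernel-style names.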

[–] cyberpunk007@lemmy.ca 2 points 1 day ago

Weird. I suspect your disk is dead, but in case it's not, I'd mark that disk for replacement, then "replace" it at the software level and allow resilver to see what happens.

No idea how to do this in Proxmox or what the commands are, but I know how to do it in TrueNAS with the GUI.

[–] just_another_person@lemmy.world 0 points 2 days ago* (last edited 2 days ago) (1 children)

When you say "the drives were renamed", do you mean you renamed them while the array was online? That sounds like what this means.

In that case, you can find out which drive is the problem, clear it, and repair the array. Should be pretty quick.

[–] Lem453@lemmy.ca 1 points 2 days ago (1 children)

I didn't rename them. I suspect it happened during a reboot or maybe a bios update that I may have done last month.

How do I clear or repair it?

[–] just_another_person@lemmy.world 0 points 1 day ago (2 children)

The device names and aliases in /dev don't just simply change between reboots. Something else happened here.

What are the paths or IDs of the drives that are in there now under /dev/nvme*?

[–] Lem453@lemmy.ca 1 points 1 day ago (1 children)

I may have done a BIOS update around the time it went down. I don't remember for sure, but I haven't added or physically changed the hardware in any way. It's working now with the above suggestions, so thanks!

[–] just_another_person@lemmy.world 1 points 1 day ago

It really shouldn't have. It doesn't make sense that all your other drives were still addressed except for this one.

[–] Shdwdrgn@mander.xyz 2 points 1 day ago

Are you sure about that? Ever hear about these supposedly predictable network interface names in recent Linux versions? Yeah, those can change too. I was trying to set up a new firewall with two internal NICs plus a 4-port card, and they kept moving around. I finally figured out that if I cold-booted, the NICs would come up in one order, and if I warm-booted, they would come up in a completely different order (the ports on the card would reverse the order in which they were detected). This was completely the fault of systemd, because when I installed an older Linux and used udev to map the ports, it worked exactly as predicted. These days I trust nothing.