Hi, it appears that one of my drives has gone bad.
From what I can see, it is K8K6ZT5N (/dev/sdh).
Some information:
zpool status -x Mypool
pool: Mypool
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
scan: scrub repaired 0B in 09:28:48 with 0 errors on Sun Feb 9 09:52:50 2025
config:
    NAME                                      STATE     READ WRITE CKSUM
    Mypool                                    DEGRADED     0     0     0
      raidz2-0                                DEGRADED     0     0     0
        c6febe17-7fd8-43ec-9bdb-1891a17eeac6  ONLINE       0     0     0
        e3fdfc6d-41ad-489d-be3f-15e16c9220d9  ONLINE       0     0     0
        9142784188148639209                   UNAVAIL      0     0     0  was /dev/disk/by-partuuid/0a447919-6db7-4182-8aee-15ee39a90ce4
        59364659-836d-46c3-a12a-948fc26f2fc3  ONLINE       0     0     0
errors: No known data errors
K8K7P8PN OK
K8K7NZTN OK
K8K7PD9N OK
K8K7T51N OK, but possibly not used
K8K6ZT5N failure, bad drive
disk sdg Mypool 5.5T HUS726060AL4210 K8K7T51N 0x5000cca271b73f40 12447916164639100982
disk sdh 5.5T HUS726060AL4210 K8K6ZT5N 0x5000cca271b5d120
sudo smartctl -a /dev/sdh
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.11.0-19-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HGST
Product: HUS726060AL4210
Revision: AAG0
Compliance: SPC-4
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Logical block size: 4096 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca271b5d120
Serial number: K8K6ZT5N
Device type: disk
Transport protocol: SAS (SPL-4)
Local Time is: Sun Mar 16 10:47:15 2025 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 29 C
Drive Trip Temperature: 85 C
Accumulated power on time, hours:minutes 25652:52
Manufactured in week 14 of year 2019
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 24
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 6849
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 23457345250000896
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0      288         0       288    1319011     380526.948           0
write:         0       11         0        11    1086528      40832.172           0
verify:        0        0         0         0     335732          0.000           0
Non-medium error count: 0
Pending defect count:0 Pending Defects
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   24685                 - [-   -    -]
Long (extended) Self-test duration: 53828 seconds [15.0 hours]
sudo smartctl -a /dev/sdg
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.11.0-19-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HGST
Product: HUS726060AL4210
Revision: AAG0
Compliance: SPC-4
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Logical block size: 4096 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca271b73f40
Serial number: K8K7T51N
Device type: disk
Transport protocol: SAS (SPL-4)
Local Time is: Sun Mar 16 10:48:12 2025 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 29 C
Drive Trip Temperature: 85 C
Accumulated power on time, hours:minutes 25438:59
Manufactured in week 14 of year 2019
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 25
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 1272
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 1544130903670784
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0      76018      25582.011           0
write:         0        0         0         0      28745      12518.477           0
verify:        0        0         0         0      35386          0.000           0
Non-medium error count: 0
Pending defect count:0 Pending Defects
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -     491                 - [-   -    -]
# 2  Background short  Completed                   -       0                 - [-   -    -]
Long (extended) Self-test duration: 58876 seconds [16.4 hours]
I think it is possible to replace the bad (unavailable) drive with /dev/sdg, but I'm not sure how to do that.
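
For reference, the general form documented in the zpool-replace man page is `zpool replace <pool> <old-device> <new-device>`. The sketch below is only a dry run: it prints the command that would apply to the values above and does not execute anything. It assumes, without confirmation, that the UNAVAIL member can be addressed by the GUID shown in `zpool status` and that /dev/sdg is really free. The listing above shows sdg carrying a Mypool label, so that second assumption needs checking first. The `build_replace_cmd` helper is a made-up name for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints a zpool replace command instead of running it.
# Assumptions (not confirmed): the failed member is addressed by the GUID
# from `zpool status`, and the replacement disk is not already in use.
POOL="Mypool"
OLD_GUID="9142784188148639209"   # UNAVAIL entry from zpool status
NEW_DEV="/dev/sdg"               # candidate replacement; verify before use

# Hypothetical helper: $1 = pool, $2 = old device (name or GUID), $3 = new device
build_replace_cmd() {
    printf 'zpool replace %s %s %s\n' "$1" "$2" "$3"
}

# Print the command for review; nothing is changed on the pool.
build_replace_cmd "$POOL" "$OLD_GUID" "$NEW_DEV"
```

If the printed command looks right, it would be run with sudo, and `zpool status Mypool` should then show the resilver progress.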