Sure you guys go ahead and have fun in my absence…
I’m happy to hear things are better.
This is what it says now that the format has completed.
I also ran a test.
sudo smartctl -a /dev/sdf
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.11.0-14-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HGST
Product: HUS726060AL4210
Revision: AAG0
Compliance: SPC-4
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Logical block size: 4096 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca271b5d120
Serial number: K8K6ZT5N
Device type: disk
Transport protocol: SAS (SPL-4)
Local Time is: Tue Feb 4 03:20:20 2025 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 29 C
Drive Trip Temperature: 85 C
Accumulated power on time, hours:minutes 24685:25
Manufactured in week 14 of year 2019
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 24
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 6808
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 23457345250000896
Error counter log:
Errors Corrected by Total Correction Gigabytes Total
ECC rereads/ errors algorithm processed uncorrected
fast | delayed rewrites corrected invocations [10^9 bytes] errors
read: 0 288 0 288 1319011 380487.898 0
write: 0 11 0 11 1086528 40832.172 0
verify: 0 0 0 0 329704 0.000 0
Non-medium error count: 0
Pending defect count:0 Pending Defects
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 24685 - [- - -]
Long (extended) Self-test duration: 53828 seconds [15.0 hours]
I would venture a guess that it is still usable: the grown defect list did not increase, and the past errors remained the same…
The grown defect list is what I usually keep an eye on… but there is some validity in watching the total errors corrected. Are you using ECC RDIMMs or just regular UDIMMs?
Hang on, I’m cranking up Deepblue (my backup server; I just sent the WOL magic packet) to pull one of my smartctl reports so you can compare.
Now, mine has quite a few corrected errors, mostly because I was running tests with the drive to check ZFS, which was on purpose.
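For anyone who wants to track those counters over time instead of eyeballing them, here is a rough Python sketch that pulls them out of `smartctl -a` text for a SAS drive. The function names, the `SAMPLE` text (copied from the report above), and the "closer look" rule are my own assumptions for illustration; treat it as a rule-of-thumb helper, not a real health check.

```python
import re

# A few lines copied from the smartctl -a report above, used as sample input.
SAMPLE = """\
Elements in grown defect list: 0
read: 0 288 0 288 1319011 380487.898 0
write: 0 11 0 11 1086528 40832.172 0
verify: 0 0 0 0 329704 0.000 0
Non-medium error count: 0
"""

def summarize_smart(text):
    """Extract the grown defect count and error counter totals from smartctl text."""
    summary = {
        "grown_defects": None,
        "non_medium_errors": None,
        "corrected": {},
        "uncorrected": {},
    }
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Elements in grown defect list:\s*(\d+)", line)
        if m:
            summary["grown_defects"] = int(m.group(1))
            continue
        m = re.match(r"Non-medium error count:\s*(\d+)", line)
        if m:
            summary["non_medium_errors"] = int(m.group(1))
            continue
        m = re.match(r"(read|write|verify):\s+(.*)", line)
        if m:
            fields = m.group(2).split()
            # Expect 7 columns; the 4th is "Total errors corrected",
            # the last is "Total uncorrected errors".
            if len(fields) == 7:
                summary["corrected"][m.group(1)] = int(fields[3])
                summary["uncorrected"][m.group(1)] = int(fields[6])
    return summary

def worth_a_closer_look(summary):
    """Assumed rule of thumb: flag grown defects or any uncorrected errors."""
    grown = summary["grown_defects"] or 0
    return grown > 0 or any(v > 0 for v in summary["uncorrected"].values())

if __name__ == "__main__":
    result = summarize_smart(SAMPLE)
    print(result)
    print("closer look needed:", worth_a_closer_look(result))
```

Saving the summary after each test run and comparing it with the previous one would show whether the grown defect list or the corrected totals are creeping up.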
root@deepblue:/home/mike# smartctl -a /dev/sdd
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.0-52-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST4000NM0023
Revision: GE13
Compliance: SPC-4
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Logical block size: 512 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c500855fc8c7
Serial number: Z1ZAVMM6
Device type: disk
Transport protocol: SAS (SPL-4)
Local Time is: Tue Feb 4 03:33:12 2025 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 31 C
Drive Trip Temperature: 60 C
Accumulated power on time, hours:minutes 65619:58
Manufactured in week 17 of year 2016
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 3395
Specified load-unload count over device lifetime: 300000
Accumulated load-unload cycles: 6111
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 2035324647
Blocks received from initiator = 781105023
Blocks read from cache and sent to initiator = 2994318995
Number of read and write commands whose size <= segment size = 3530437749
Number of read and write commands whose size > segment size = 5820
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 65619.97
number of minutes until next internal SMART test = 52
Error counter log:
Errors Corrected by Total Correction Gigabytes Total
ECC rereads/ errors algorithm processed uncorrected
fast | delayed rewrites corrected invocations [10^9 bytes] errors
read: 829513102 576 0 829513678 576 2440693.559 0
write: 0 0 0 0 0 239785.272 0
verify: 3435179573 0 0 3435179573 0 3463.364 0
Non-medium error count: 1097213
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Aborted (device reset ?) 64 65535 - [- - -]
# 2 Default Completed 64 35021 - [- - -]
# 3 Default Completed 64 15874 - [- - -]
# 4 Default Completed 64 15874 - [- - -]
# 5 Reserved(7) Completed 64 4 - [- - -]
Long (extended) Self-test duration: 32700 seconds [9.1 hours]
As you can see, mine is a bit older and has quite a few hours on it, but the grown defect list is 0.
Most of my corrections were handled by the ECC RAM, but not all. Like I said, I’ve used this drive for testing and interim duty, such as now; I will replace these later with a fresher drive set.
This one is from my NFS with HP HGST drives:
root@Beastie:/home/mike# smartctl -a /dev/sdd
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.0-52-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HP
Product: MB4000JEQNL
Revision: HPD7
Compliance: SPC-4
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca2430eb1c4
Serial number: NHG82J7N
Device type: disk
Transport protocol: SAS (SPL-4)
Local Time is: Mon Feb 3 21:47:48 2025 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 41 C
Drive Trip Temperature: 60 C
Accumulated power on time, hours:minutes 40920:46
Manufactured in week 45 of year 2015
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 3120
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 4814
Elements in grown defect list: 0
Error counter log:
Errors Corrected by Total Correction Gigabytes Total
ECC rereads/ errors algorithm processed uncorrected
fast | delayed rewrites corrected invocations [10^9 bytes] errors
read: 0 1 0 0 0 148253.041 0
write: 0 8 0 8 0 102729.910 0
verify: 0 0 0 0 0 0.000 0
Non-medium error count: 712251
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 7 - [- - -]
# 2 Background short Completed - 3 - [- - -]
# 3 Background short Completed - 2 - [- - -]
Long (extended) Self-test duration: 41040 seconds [11.4 hours]
So yes, I would set your drive aside as a cold spare… in the event of a failure, it gives you time to obtain a fresh drive. Honestly, you’re running raidz2, and you’re probably able to get to the server within a week of a faulted drive, so there’s really no actual push for a hot spare, in my honest opinion.
But if you want to add it back to the vdev as a hot spare, you could.
Actually, I saw you were quite the busy bee when we were doing that.
You were doing what… three requests for help at the same time?
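If you do go the hot spare route, a small sketch of the commands involved might look like this. Note that ZFS attaches spares at the pool level rather than to a single vdev. The pool name `tank` and the device path are placeholders I made up, and the helper functions are hypothetical; they only build the `zpool` command line so you can read it over before running anything.

```python
def spare_add_command(pool, device):
    """Build the argv that adds `device` to `pool` as a hot spare."""
    return ["zpool", "add", pool, "spare", device]

def spare_remove_command(pool, device):
    """Build the argv that removes an unused hot spare from `pool`."""
    return ["zpool", "remove", pool, device]

if __name__ == "__main__":
    # Placeholder pool name and device path, for illustration only.
    pool = "tank"
    device = "/dev/disk/by-id/EXAMPLE-DISK"
    print(" ".join(spare_add_command(pool, device)))
    print(" ".join(spare_remove_command(pool, device)))
```

Printing the commands instead of executing them keeps this safe to try; run the printed line by hand once the pool and device names are right.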
@Supermag
I like those R720s, good systems. Before repurposing Deepblue into a backup server, I was looking at that family as well as the T series.
So yeah, that answers the ECC RAM question.