It’s worry-time on the server:
# tail -20 /var/log/messages
Feb 25 10:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 10:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 10:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 11:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 11:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 11:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 11:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 12:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 12:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 12:39:31 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 12:39:31 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 13:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 13:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 13:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 13:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 14:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 14:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 14:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 14:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
... and so it goes on. So, I’ll check it out by dumping the drive’s SMART information (this only prints what the drive has already recorded; it doesn’t start a self-test):
# smartctl -a -d ata /dev/sdc
smartctl version 5.36 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: Hitachi HDP725040GLA360
Serial Number: GEB430RE15UEVF
Firmware Version: GMDOA52A
User Capacity: 400,088,457,216 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Not recognized. Minor revision code: 0x29
Local Time is: Wed Feb 25 14:55:30 2009 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (7840) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 130) minutes.
[snip]
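The status above says the previous self-test either passed or was never run, so the next step would be to start one. Here is a minimal sketch of the commands I'd expect to need, assuming the same device and `-d ata` type as above. The `selftest_cmds` helper is a made-up name, and it only prints the commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper: print the smartctl commands for an extended self-test
# on the given device. Nothing touches the disk until the printed commands
# are run by hand as root.
selftest_cmds() {
    dev="$1"
    # -t long starts the extended self-test; the drive runs it in the background
    echo "smartctl -d ata -t long $dev"
    # -l selftest prints the self-test log, to be read once the test has finished
    echo "smartctl -d ata -l selftest $dev"
    # -A prints the vendor attribute table, which is where the pending and
    # offline-uncorrectable sector counts show up if the drive reports them
    echo "smartctl -d ata -A $dev"
}

selftest_cmds /dev/sdc
```

Going by the polling time in the output above, the extended test on this drive should take around 130 minutes before the self-test log is worth reading.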
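I’m not sure what to make of a disk that has smartd logging unreadable and uncorrectable sectors every half hour, yet still reports an overall health of “PASSED” to smartctl. (Both are userspace tools reading the same drive; the overall assessment only fails once an attribute crosses its vendor threshold, so bad sectors can pile up before it trips.)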
One thing’s for certain – it’s being replaced!
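For what it's worth, the repeating smartd lines at the top can be boiled down to one line per message, which makes it easier to see whether the counts are growing. This is a rough sketch using awk, assuming the log format shown above. The `summarize_smartd` name is my own invention, and the sample input is four lines copied from the log:

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print, for each
# device/message pair logged by smartd, the first and latest count seen.
summarize_smartd() {
    awk '/smartd\[[0-9]+\]: Device:/ {
        # Strip the timestamp, host and "Device: " prefix
        sub(/^.*Device: /, "")
        # What remains looks like "/dev/sdc, 9 Currently unreadable (pending) sectors"
        split($0, parts, ", ")
        dev = parts[1]
        count = parts[2] + 0              # numeric prefix of the message
        msg = parts[2]
        sub(/^[0-9]+ /, "", msg)          # message text without the count
        key = dev " " msg
        if (!(key in first)) { first[key] = count; order[++n] = key }
        last[key] = count
    }
    END {
        for (i = 1; i <= n; i++)
            printf "%s: first %d, latest %d\n", order[i], first[order[i]], last[order[i]]
    }'
}

summarize_smartd <<'EOF'
Feb 25 10:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
Feb 25 10:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 14:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
Feb 25 14:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
EOF
```

On the server itself the input would come from the real log instead, e.g. `summarize_smartd < /var/log/messages`.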