We’re all used to doing a disk check in Windows XP. It’s easy. Just double-click on “My Computer”, then select the drive you want to run the check on. Right-click, Properties, Tools tab, then select “Check Now…” in the Error-checking section. In almost every instance you’ll be told that the check will be done upon the next reboot. Easy.
So how does one go about it on Linux? Well… as you may have guessed, it’s not quite so straightforward. Linux, by default, does actually have an intelligent disk-checking system already in place. By all accounts, you generally needn’t worry. But if you have a reason to believe your disk may be slowly dying, and nothing is reporting in the SMART status of your drive, perhaps it’s worth checking the file system instead.
That’s where File System Check comes in (duh!). Like all Linux tools, it’s painfully abbreviated to simply “fsck”. Terse, to say the least. Now the warning:
DO NOT. I REPEAT, DO NOT EVER EVER EVER RUN THIS COMMAND WHILE YOUR DRIVE IS MOUNTED (I.E. IN USE). I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA THAT YOU MAY CAUSE BY FOLLOWING THESE INSTRUCTIONS.
To unmount your root (/) volume, follow these easy steps:
- Boot from a Live CD. Your root volume will not be mounted by default.
- Open a terminal and type:
# dmesg | grep sda
If you see output relating to your “SCSI” device, it identifies the hard disk that, in all likelihood, contains your root partition. For example, amongst other output, I see this:
sd 2:0:0:0: [sda] Assuming drive cache: write through
sda: sda1 sda2
sd 2:0:0:0: [sda] Attached SCSI disk
- In the example above, we see that the Linux kernel registers the SCSI disk at address 2:0:0:0 as the first drive (“sda”) in the system. We can also see it has only 2 partitions, sda1 and sda2. If this is the only physical drive in the machine, we should strongly suspect that it uses one partition as /boot (formatted with ext4) and the other as an LVM physical volume holding the logical volumes for root (/) and swap. Furthermore, it’s a foregone conclusion that the smaller partition will be /boot and the larger one will contain our swap and / volumes, so let’s proceed with accessing them.
- So, how do we access a “Logical Volume” within an equally mystical “Volume Group”? Luckily, Linux LVM comes with a plethora of useful tools to make the job easy.
# /sbin/vgscan
Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
Great. We have identified the volume group. But before we can identify the logical volumes it contains, we need to activate it.
# /sbin/vgchange -a y
2 logical volume(s) in volume group "VolGroup00" now active
Here, the -a flag indicates that we want to change the “active” status of the volume group, and the y means “yes”. Now we can list the logical volumes:
# /sbin/lvdisplay
--- Logical volume ---
LV Name /dev/VolGroup00/LogVol00
VG Name VolGroup00
LV UUID DG2WxJ-sKa5-20mg-NtjW-CsPW-t99V-Egqlja
LV Write Access read/write
LV Status available
# open 0
LV Size 7.25 GB
Current LE 232
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:2
--- Logical volume ---
LV Name /dev/VolGroup00/LogVol01
VG Name VolGroup00
LV UUID HqKozT-16PQ-HUaT-Yyc7-lMCO-007m-Xcc2c8
LV Write Access read/write
LV Status available
# open 1
LV Size 512.00 MB
Current LE 16
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:3
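If the full listing is more than you want to read, here is a minimal sketch that boils it down to one line per volume. The function name summarise_lvs is my own invention, and it assumes the older output layout shown above, where the "LV Name" line carries the full device path (newer LVM releases print the path on a separate "LV Path" line, so adjust to taste).

```shell
# summarise_lvs (hypothetical helper): read lvdisplay-style text on stdin
# and print "device size unit" for each logical volume.
# Assumes the "LV Name" line holds the full device path, as in the output above.
summarise_lvs() {
    awk '
        $1 == "LV" && $2 == "Name" { name = $3 }            # remember the current volume
        $1 == "LV" && $2 == "Size" { print name, $3, $4 }   # print it once the size appears
    '
}

# Usage, as root from the Live CD:
#   /sbin/lvdisplay | summarise_lvs
```

This only reads text, so it is safe to run at any point; it does not touch the volumes themselves.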
We can now see two logical volumes contained within the volume group. The first, although small by today’s standards, is a lot larger than the second. We can also see that each logical volume has a device node (/dev/VolGroup00/LogVol01, for example).
As we want to perform the disk check without the partition being mounted, we do not need to issue any mount command here. However, if you want to double-check that this is the partition to check, mount it and have a quick look around. The following step is only offered to help in this case; skip it if you are ready to perform the disk check.
For me, the first logical volume (the 7.25 GB one) would be the one to test.
# mkdir /tmp/lv0
# mount -t ext4 /dev/VolGroup00/LogVol00 /tmp/lv0
# cd /tmp/lv0
# ls
bin boot dev etc home lib lib64 lost+found media mnt opt proc root sbin selinux srv sys tmp usr var
Ok, that looks like the root partition, so let’s get out of it and unmount it before running the file system check on it.
# cd /
# umount /tmp/lv0
- An alternative to the above steps, if you have already booted into your main system, is to investigate /etc/fstab to see which is your / volume. All you do is open a terminal and issue:
# cat /etc/fstab
On my CentOS 5 system, I see this:
/dev/VolGroup00/LogVol00  /         ext4    defaults        1 1
LABEL=/boot1              /boot     ext4    defaults        1 2
tmpfs                     /dev/shm  tmpfs   defaults        0 0
devpts                    /dev/pts  devpts  gid=5,mode=620  0 0
sysfs                     /sys      sysfs   defaults        0 0
proc                      /proc     proc    defaults        0 0
LABEL=SWAP-sdb1           swap      swap    defaults        0 0
So, /dev/VolGroup00/LogVol00 is my root volume.
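If you would rather not eyeball the file, the lookup can be sketched as a one-liner wrapped in a function. The name root_device is hypothetical, and this assumes the usual whitespace-separated fstab layout with the mount point in the second column. On newer systems the answer may well be a UUID=... or LABEL=... entry rather than a device path.

```shell
# root_device [FSTAB] (hypothetical helper): print the first field of the
# fstab entry whose mount point is "/". Defaults to /etc/fstab.
root_device() {
    # Skip comment lines, match the mount point column exactly, stop at the first hit.
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "${1:-/etc/fstab}"
}

# Usage, on the running system:
#   root_device
```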
So, now that that’s out of the way, what next? Well, assuming you now know which is your root partition, the most sensible thing to do would be to boot from a Live CD of some distribution (Ubuntu, Fedora, etc) if you haven’t done so, and then perform the disk check from that.
Once in the LiveCD desktop, we’ll need to fire up a Terminal window.
If you know your filesystem type, e.g. if it’s Ext4, which is the default on the most common distributions, you can run a modified version of the fsck command specifically for that file system. Here’s what I run for a thorough disk check:
# fsck.ext4 -c -D -f -p -v /dev/VolGroup00/LogVol00
Alternatively, if your partition structure is slightly older and only contains physical partitions (not Logical Volumes), it may just be a case of finding the partition directly, by checking /etc/fstab on the system when it is running. In that case, your command may look more like this (when / is unmounted!!):
# fsck.ext4 -c -D -f -p -v /dev/sda2
Here’s what the flags do:
-c – forces a read-only bad block scan. Although bad blocks are remapped dynamically by the file system, if the file system or its journal is corrupt, this may not work correctly.
-D – performs a directory check and optimisation. Doesn’t hurt, and can speed up directory listings of a large number of files.
-f – forces the check itself to actually run. As mentioned previously, the file system maintains itself quite well, and if you don’t force the check, fsck may look at the last check interval and decide a check is not required.
-p – automatically repairs any problems that can be fixed safely, without asking questions. Note the lower case: upper-case -P means something else entirely. This is usually a safe flag, but if your file system is potentially very corrupt, this may not be advisable. In this situation, contact an expert, or restore your back-up…
-v – verbose output. See what’s going on.
/dev/VolGroup00/LogVol00 or /dev/sda2 – this is the partition I want to perform the disk check on.
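Given the warning earlier about mounted drives, a small guard around the command may be worth having. The following is only a sketch under some assumptions: is_mounted and safe_fsck are names I made up, the check relies on /proc/mounts being available (it is on any Live CD I have used), and it uses lower-case -p, which the e2fsck man page documents as the automatic repair option.

```shell
# is_mounted DEVICE [MOUNTS_FILE] (hypothetical helper): succeed if DEVICE is
# listed as a mount source. Symlinks are resolved where possible, because LVM
# volumes usually show up in /proc/mounts under /dev/mapper/... names.
is_mounted() {
    target=$(readlink -f -- "$1" 2>/dev/null) || target=$1
    while read -r src rest; do
        resolved=$(readlink -f -- "$src" 2>/dev/null) || resolved=$src
        if [ "$src" = "$1" ] || [ "$resolved" = "$target" ]; then
            return 0
        fi
    done < "${2:-/proc/mounts}"
    return 1
}

# safe_fsck DEVICE [MOUNTS_FILE] (hypothetical wrapper): refuse to check a
# mounted device, otherwise run the thorough check with the flags listed above.
safe_fsck() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "Refusing to check $1: it is mounted" >&2
        return 1
    fi
    fsck.ext4 -c -D -f -p -v "$1"
}

# Usage, as root from the Live CD:
#   safe_fsck /dev/VolGroup00/LogVol00
```

The guard only catches ordinary mounts; it is a convenience, not a substitute for checking by hand that nothing is using the volume.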
This little guide doesn’t explain how to perform a check on an encrypted logical volume… That one’s coming.
Updated from post originally put here: http://onecool1.wordpress.com/2008/09/19/how-to-do-a-disk-check-in-linux/
The first option should be lower case, ‘-c’, as ‘-C’ will error out with:
Invalid non-numeric argument to -C
Regards, Rob (AU)
Rob – many thanks for your comment. Yes, looks like I got it wrong. I’ll edit and re-post accordingly.
Your comment also spurred me to check out the man page. The -C (uppercase option) is interesting, as it can print a progress bar to the screen while the check is happening.
You can specify -C 0 (that’s a zero) to show this. The absence of the numeric argument caused the error you mentioned.
Best wishes.
I’m fairly computer literate, but have no formal training in Linux. I need to explore more awesome guides like this one to learn more about Linux and how it works.
On my Linux Mint 14 Live CD I had to use -p instead of -P:
-p Automatic repair (no questions)
-P process_inode_size