Linux: Corrupted files?

Checking a corrupt Linux file system

For your Linux installations, you are probably using a native Linux file system, either ext2 or ext3. These file systems are very robust, nearly maintenance-free (they don't need defragmentation the way Windows FAT32 does) and able to survive most unfavorable conditions. However, it is equally true that when a Linux file system does get corrupted, identifying and eliminating the errors becomes a major headache, and you often end up with lost data. A file system is prone to errors due to one or more of these:

  • Bad, ageing disk
  • Defective / broken IDE / SCSI cable
  • Errors in Memory Modules
  • Bugs in Programs
  • Power Interruptions
  • A removable storage device (like a floppy disk, RAM drive etc.) is removed before the kernel has finished with it.

You never know which of these has silently and suddenly struck your file system and corrupted it. Still, when a file system error occurs for any of the reasons above, you can minimize the damage by making timely and effective use of the tools available for Linux.

Fsck

Fsck is a front end for the various file system checkers for Linux, e.g. e2fsck(8). It checks a file system for errors and repairs a damaged or corrupted one, and it runs as part of the boot process in almost all Linux installations. When you halt or shut down your machine, the file system is cleanly unmounted and the kernel writes a special signature on it indicating that files and data are intact. When the file system is mounted again, this signature is removed, and a fresh signature is written only when it is properly unmounted again. During the boot process, before the file system is mounted, this signature is checked. If it is found, the file system is assumed to have been unmounted properly last time and to be intact. Unless something else calls for a check (such as a periodic forced check), fsck then skips the file system and it is mounted as clean. When the signature is not found on a device holding a supported file system, fsck checks that file system for errors and can optionally repair what it finds. Some errors cannot be repaired by fsck, and in that case data loss may occur.
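
On ext2 / ext3 the signature described above is visible as the "Filesystem state" line of the superblock listing printed by dumpe2fs -h. The small helper below is only a sketch for pulling that value out; the function name fs_state and the device name /dev/sdXN are placeholders, not part of any tool.

```shell
# Print the state recorded in the superblock (for example "clean" or
# "not clean"). Reads the text produced by `dumpe2fs -h DEVICE` on
# standard input.
fs_state() {
    awk -F: '/^Filesystem state:/ {
        sub(/^[ \t]+/, "", $2)   # drop the padding after the colon
        print $2
        exit
    }'
}

# Typical use, as root (the device name is only a placeholder):
#   dumpe2fs -h /dev/sdXN 2>/dev/null | fs_state
```

Reading the state this way changes nothing on the disk, so it is a safe first look before deciding whether a full check is needed.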

Fsck also forces a full check, even on a file system marked clean, when too much time has passed (say, six months) or the file system has been mounted too many times (say, 20 mounts) since the last check. These limits can be set through tune2fs, another utility that lets you configure how fsck is used. Further, if you use a journaling file system (ext3), the journal is replayed after an unclean shutdown and the file system is not marked corrupted, so a periodic forced check is what catches any errors that remain. In general, during the boot process, if no other arguments are given, fsck checks the file systems listed in /etc/fstab in the order given by the pass number in the sixth field of each entry.
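
To see which file systems the boot-time check will visit, and roughly in what order, you can sort the /etc/fstab entries by the pass number in their sixth field. The function below is a sketch; fsck_order is a made-up name, and it only approximates what fsck does (entries sharing a pass number are simply listed in file order here).

```shell
# List the entries of an fstab file in the order of their pass number
# (sixth field). Entries with a missing or zero pass number are skipped,
# since fsck does not check them at boot.
# Usage: fsck_order [FSTAB_FILE]   (defaults to /etc/fstab)
fsck_order() {
    awk '!/^[ \t]*#/ && NF >= 6 && $6 > 0 { print $6, $1, $2 }' "${1:-/etc/fstab}" |
        sort -n -s -k1,1 |
        cut -d" " -f2-
}
```

Each output line shows the device (or UUID / label specifier) followed by its mount point; the root file system normally carries pass number 1 and so comes first.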

File system checks should not be performed on mounted file systems. You must unmount them first; running fsck on a mounted file system may cause serious damage. Also, the boot-time check sometimes fails, and then you need to check the file system manually.
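
Before running a check by hand, it is worth confirming that the device really is unmounted. The sketch below looks the device up in the kernel's mount table; is_mounted is a hypothetical helper name, and it assumes the device is listed in /proc/mounts under the same name you pass in (a file system mounted through a different alias would be missed).

```shell
# Succeed (exit status 0) if DEVICE appears as a mounted source in the
# mount table. The table defaults to /proc/mounts; a second argument lets
# you point at another file with the same layout.
# Usage: is_mounted DEVICE [MOUNTS_FILE]
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Typical use (the device name is only a placeholder):
#   if is_mounted /dev/sdXN; then echo "unmount it first"; fi
```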

Steps to check a corrupt file system:

When the automatic file system check at boot time fails, you need to check the file system manually. Become root with su, unmount the file system if it is mounted, and run the command:

# fsck path_name_of_device_containing_corrupt_file_system

For checking with advanced options, you can use e2fsck instead; see the e2fsck man page for details. The path name given to fsck can be a device name, for example /dev/hda1; a mount point such as /usr; or a UUID. The advanced options are usually not necessary for a simple check that finds and corrects ordinary file system errors such as inode problems. A manual check usually removes the errors from the Linux file system, and you will be able to use your files again. Sometimes you may not be able to boot into Linux even to run the check; in that case you can boot the machine from rescue disks. A better way is to boot from one of the bootable Live CDs distributed with LFY and run fsck from there.
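
One cautious way to go about a manual check on ext2 / ext3 is to look first and repair second. The sketch below only prints the commands so that you can review them before running anything; check_plan is a made-up helper name, and the sequence is a suggestion, not the only valid order.

```shell
# Print, without running anything, a cautious command sequence for
# checking the ext2 / ext3 file system on DEVICE. Run the printed
# commands yourself as root.
# Usage: check_plan DEVICE
check_plan() (
    dev=$1
    printf 'umount %s\n' "$dev"         # never check a mounted file system
    printf 'e2fsck -f -n %s\n' "$dev"   # forced read-only pass: report problems, change nothing
    printf 'e2fsck -f -p %s\n' "$dev"   # forced pass that fixes only what is safe to fix automatically
)
```

If the -p pass stops and asks for a manual check, run e2fsck again without -n or -p and answer its questions one at a time.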

Tune2fs

This tool allows you to adjust various tunable parameters on a Linux ext2 / ext3 file system. With tune2fs, you can define the way fsck checks your file system at boot time: it lets you set the maximum time interval between two file system checks (with the -i option) or the maximum mount count between them (with the -c option). It is best to plan mount-count-dependent or time-dependent full checks even for a healthy file system, since there is always a remote possibility of file system errors. You might have seen the following message at boot time when you restart your Linux machine without a proper shutdown:

Your system appears to have shut down uncleanly

Press Y within 5 seconds to force file system integrity check.

Through tune2fs you can set these behaviors and manage file system checks accordingly.
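
To see how close a file system is to its next forced check, you can compare the "Mount count" and "Maximum mount count" lines printed by tune2fs -l. The helper below is a rough sketch of that comparison; check_due is a hypothetical name, it ignores the time-interval limit, and the values in the comment are examples only.

```shell
# Read `tune2fs -l DEVICE` output on standard input and report whether
# the mount-count limit has been reached. A maximum of 0 or -1 means the
# mount-count check is disabled.
check_due() {
    awk -F: '
        /^Mount count:/         { count = $2 + 0 }
        /^Maximum mount count:/ { max = $2 + 0 }
        END {
            if (max > 0 && count >= max) print "check due"
            else print "no check due"
        }'
}

# Setting the limits, as root (values and device name are examples only):
#   tune2fs -c 20 -i 6m /dev/sdXN    # check every 20 mounts or every 6 months
#   tune2fs -l /dev/sdXN | check_due
```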

Dumpe2fs

At times, you need some information about your file system. The dumpe2fs utility can give you a good deal of information about a Linux ext2 / ext3 file system. For example, if you want the status of bad blocks on a device containing a Linux file system, run dumpe2fs with the -b option. It prints the blocks that are already marked as bad in the file system; it does not scan the disk for new ones. Dumpe2fs has various other options, and you can get the information you need about your file system for a quick diagnosis. For more information on dumpe2fs, see its man page.
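
For a quick diagnosis, a count of the bad blocks is often enough. The sketch below counts the block numbers listed by dumpe2fs -b; count_bad_blocks is a made-up name, and it assumes the listing has one decimal block number per line.

```shell
# Count the block numbers printed by `dumpe2fs -b DEVICE` (read from
# standard input). Assumes one decimal block number per line; any other
# line is ignored.
count_bad_blocks() {
    grep -c '^[0-9][0-9]*$' || true   # grep exits non-zero when the count is 0
}

# Typical use, as root (the device name is only a placeholder):
#   dumpe2fs -b /dev/sdXN 2>/dev/null | count_bad_blocks
```

A count that keeps growing between runs is a good hint that the disk itself is failing and should be replaced.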
