When Linux Crashes...!

Rescuing Linux

You may be happy that your Linux system has not crashed since you installed it, and you may hope that it never will. But Linux does crash, just like any other operating system, even if it does so far less often than most. When a more crash-prone OS such as Windows goes down, resetting or rebooting the machine usually brings it back, and if the problem is more serious, reinstalling the OS often removes it. Things are not always so simple with Linux. When something goes seriously wrong, the Ctrl+Alt+Del key combination, a reset, a reboot, or even a reinstall may not help. The best way to recover from a disaster is to avoid the situations that invite it; where that is not possible, be fully prepared in advance with the tools you will need for a rescue.

How and why does Linux crash?

There are many reasons why even a robust Linux system may crash one fine day. A corrupted file system can cause a crash, and files can be corrupted by something as simple as a power failure or bad blocks on the disk. Linux does not yet support all available hardware, so installing a device that is new to the Linux world invites crashes. Multimedia support in Linux is also incomplete, and running resource-intensive multimedia programs can bring the system down. In Linux everything is treated as a file, so working in GUI mode as the root user invites havoc: one unintended mouse click in the wrong place may be enough to crash the system. If you update your kernel and forget to rerun LILO (on Intel-compatible PCs) so that it records the new kernel, your system will fail to boot the next time you restart it. If you install another operating system that rewrites the MBR and takes no account of other operating systems, as Windows 9x does, the new MBR overwrites LILO and you can no longer boot into Linux. At other times the system boots perfectly but, for some bizarre reason, refuses to let you log in. And finally, a newly installed program may carry an undetected bug that is triggered by your particular system configuration and crashes the system.

What to do to avoid system crashes?

There is no guarantee that taking every documented precaution will make your system immune to crashes. But the frequency and severity of crashes can be reduced, so that they do minimal damage to the system and cause less inconvenience to the user. Here are some tips that may help you run your system for a long time without a crash and limit the damage if one happens anyway.

• Avoid working as root, especially during an X Window System session. Apart from system administration work and essential settings changes that require root authentication, do not use Linux as root. If you must work as root, a command-line session is recommended over X. The reasoning is that when you type a command, you probably know what you are doing and what the command implies, whereas in X a single wrong click on an executable file can land you in trouble. When you start the GNOME desktop environment as root, it warns you that you may damage your system. Create an ordinary user account for yourself, log in with that user ID, and use the su command when you need to do something as root.

• The wise way to handle a system crash is to deal with it before it happens. Always have a good backup plan: back up regularly, and especially before adding hardware, installing software, or changing system settings.

• Keep emergency boot disks handy, made with the kernel that is currently installed. Boot disks with an old kernel will not help you, so whenever you upgrade the kernel, create fresh boot disks without fail. (See box for how to create boot disks.)

• Remove unnecessary programs from startup if they may cause problems. For example, kudzu, the automatic hardware detection tool, may freeze your system at startup when it finds a new device that is not compatible with Linux.

• In Linux, e2fsck (for the second extended file system, ext2) is the equivalent of ScanDisk in Windows, and if you use Windows you know the value of ScanDisk. Make sure e2fsck runs at startup if it does not do so by default, and never disable it.

• Hard drives are cheap these days, but that does not mean you should fill a full 10 gigs with unnecessary and unwanted programs. The more programs you install, the more likely it is that one of them will cause a crash, so every new installation raises the risk a little.

• An abrupt power failure, or switching off a running Linux system without a proper shutdown or halt command, may spell disaster. When Linux shuts down properly, it updates the inodes, the structures that represent files on disk. An abrupt power failure leaves no time to update this information, so Linux can lose track of files and may then crash. A proper shutdown and an uninterruptible power supply (UPS) are essential for a crash-resistant Linux system.
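
Several of the tips above (working as a normal user, keeping the disk from filling up, and backing up before any change) can be checked or automated with a short script. The following Python sketch is only an illustration under stated assumptions and is not part of the original tips: the function names running_as_root, used_fraction, and backup_directory are hypothetical, and the demonstration backs up a throwaway directory rather than real system files.

```python
import os
import shutil
import tarfile
import tempfile
import time


def running_as_root():
    """Return True when the effective user ID is 0 (root)."""
    return os.geteuid() == 0


def used_fraction(path="/"):
    """Return the fraction (0.0 to 1.0) of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    # Some pseudo file systems report a total size of zero.
    return usage.used / usage.total if usage.total else 0.0


def backup_directory(source_dir, dest_dir):
    """Write a gzip-compressed tar archive of `source_dir` into `dest_dir`.

    Returns the path of the archive that was created.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = os.path.basename(os.path.abspath(source_dir))
    archive_path = os.path.join(dest_dir, f"{name}-{stamp}.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores paths relative to the directory name, not the full path.
        archive.add(source_dir, arcname=name)
    return archive_path


if __name__ == "__main__":
    if running_as_root():
        print("Warning: running as root; prefer a normal user and su.")
    print(f"Root file system in use: {used_fraction('/'):.0%}")
    # Demonstration on a throwaway directory (an assumption for illustration).
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "settings")
        os.mkdir(src)
        with open(os.path.join(src, "example.conf"), "w") as handle:
            handle.write("key=value\n")
        print("Backup written to", backup_directory(src, work))
```

In practice you might point backup_directory at a configuration directory such as /etc and write the archive to a different disk or removable medium, so that the backup survives a failure of the system disk.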

Disaster recovery

Despite all your efforts and precautions, one fine day your Linux system crashes badly. What do you do to get back on your feet? If the system has crashed and is not responding, try the Ctrl+Alt+Del key combination, the tried and trusted tool of the Windows world. Pressing Ctrl, Alt, and Del together issues a command that shuts the system down properly and then reboots it. If that fails, reset the system with the reset button. If resetting does not work either, something is seriously wrong, and you will need careful detection and investigation, a lot of work, and an equal amount of patience.

When your Linux system refuses to boot, try to work out which of your recent actions made it unbootable. Read the error messages shown at startup; they often contain a clue about the problem. For example, if you installed LILO to load Linux, it can display any of 70 different error messages at boot time to help you understand what is going wrong. These messages can indicate disk or media problems, BIOS errors, transient disk read problems, and so on. If there is no error message at the LILO prompt but Linux still refuses to boot normally, try booting into rescue mode. Type rescue (without quotes) at the LILO prompt. You will soon be given a bash # prompt, where more than 40 useful commands are available to help you solve your problem (see box for the list of commands available in rescue mode). If your system boots but does not let you log in, try booting into single-user mode: at the LILO boot prompt, type the label of your Linux image followed by single, for example linux single (without quotes). You can then repair your system in single-user mode.

Finally, when every attempt to boot from the hard disk fails, fetch your boot and rescue floppies and try to boot from them. First make sure that your BIOS is set to boot from the floppy drive first, then insert the boot floppy. At the boot: prompt, type rescue to load the kernel from the floppy. Follow the instructions and, when asked, eject the boot floppy and insert the rescue floppy. A bash # prompt will then be available to help you repair your system. You can also boot into rescue mode from a CD-ROM if it contains autoboot files with the kernel of the rescue image. For that, change the BIOS settings so that the system boots first from the CD-ROM drive.

If you still have trouble running Linux and cannot find a way out, it is time to call your Linux support team. But in any case, don't panic. With a little determination, a passion for tracking down errors, and a bit of perseverance, you can overcome your problems with Linux, the addictive, hot OS.
