Part of the problem that I’m concerned about is that the vast majority of Ubuntu users are less experienced than, say, Debian users. That’s not a slight against Ubuntu users, but merely a statement of fact; Ubuntu has done a lot of good work to allow less experienced users to install and use Linux. That is a good thing; a very good thing. But it also means that sometimes protecting users against themselves is in fact a good thing. There is a reason why there are safety interlocks on lawn mowers; giving more control to inexperienced users is not always a good thing.

So if the filesystem is corrupted such that if the system is booted, the “mission critical” application would silently give the wrong answers, or perhaps trade the wrong stocks, or give 1000 times the amount of X-rays necessary to the human body, would you really be doing the user a favor by giving them the ability to skip an fsck because they are impatient? For life and mission critical systems, usually the designers want to give less control to the users (who often are not sophisticated computer users), not more control.

If the system has to be kept running in order to keep some mission critical system going, then the right answer is to have backup systems and a high availability system (such as Linux-HA) which enables the backup when the primary system is not available. Skipping necessary filesystem checks just because “it might take too long” and allowing potential silent failures is Just A Bad Idea.

Then too, if you really want to avoid long delays due to periodic fsck’s, the right answer is to use devicemapper, and have a cron script fired during the off-hours (say 1am on Sunday nights, when no one is using the system), which takes a read-only snapshot of the filesystem and then runs e2fsck against the snapshot once a week or once a month. If any discrepancies are detected when checking the read-only snapshot, then the script should either send e-mail to the system administrator requesting scheduled maintenance ASAP to fix the problem, or, if there is an HA system running, signal the HA system that it is about to take the system down, then shut down the applications and force a reboot and fsck of the corrupted filesystem. If no errors are detected in the read-only snapshot, then the snapshot can be released and “tune2fs -C 0 -T now /dev/sdXX” can be used on the original filesystem to indicate that it has been successfully checked. So there are clean ways of avoiding the slow boot-time checks while actually increasing the system reliability, besides letting a potentially clueless user skip a necessary system function out of impatience.
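The snapshot-and-check procedure described above could be sketched roughly as follows. This is only an illustration, not a tested maintenance script: it assumes the filesystem lives on an LVM logical volume (so that lvcreate can snapshot it), and the volume names, the snapshot size and the function names are all made up for the example. The tune2fs invocation is the one quoted above; what to do on failure (mail the administrator, or signal the HA system) is left to the caller.

```python
import subprocess


def snapshot_check_commands(vg, lv, snap_size="1G"):
    """Build the commands for one check cycle (names and size are assumptions)."""
    snap = f"{lv}-fsck-snap"  # hypothetical snapshot name
    return {
        # Take a snapshot of the live volume.
        "create": ["lvcreate", "--snapshot", "--size", snap_size,
                   "--name", snap, f"{vg}/{lv}"],
        # -f forces a full check, -n opens the snapshot read-only and answers "no".
        "check": ["e2fsck", "-f", "-n", f"/dev/{vg}/{snap}"],
        # Release the snapshot once the check is done.
        "release": ["lvremove", "-f", f"{vg}/{snap}"],
        # Reset the mount count and last-checked time on the ORIGINAL filesystem.
        "mark_clean": ["tune2fs", "-C", "0", "-T", "now", f"/dev/{vg}/{lv}"],
    }


def check_via_snapshot(vg, lv, run=subprocess.call, snap_size="1G"):
    """Run one cycle; `run` takes a command list and returns its exit status."""
    cmds = snapshot_check_commands(vg, lv, snap_size)
    if run(cmds["create"]) != 0:
        return "snapshot-failed"
    try:
        status = run(cmds["check"])
    finally:
        # Always release the snapshot, even if the check itself blew up.
        run(cmds["release"])
    if status == 0:
        run(cmds["mark_clean"])
        return "clean"
    # Any non-zero e2fsck status: leave the counters alone and let the
    # caller notify the administrator or the HA system.
    return "needs-maintenance"
```

A cron job would call something like check_via_snapshot("vg0", "root") during the off-hours and send mail whenever the result is not "clean". Passing a different `run` function makes it possible to dry-run the logic without root privileges.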

OK, latest update to this: the problem is still in gutsy. I haven’t yet run fsck on the images you suggested making, but will do so.

In the meantime, my laptop was a brick for 3 days during a technical conference, when I _really_ would have appreciated having it be functional, so I’m going to argue your points. Yes, there’s a bug in fsck which caused it, and sure, when you fix the bug, I won’t need the skip option…until the next time there’s a bug like this, when another user on another computer with some weird hardware or bios configuration hits a similar snag.

> Part of the problem that I’m concerned about is that the vast majority of Ubuntu users are less experienced than, say, Debian users.

So? As my friend points out, Windows users are a good deal less experienced than Ubuntu users, and yet *they* can skip scandisk. Heresy, I know, to compare scandisk with fsck, and yet there, users have a choice. I think to deny users a choice is anti-freedom, and autocratic. Your thinking on this is analogous to Microsoft’s forcing of system updates on its users; “we know best; this is for your own good; shut up and swallow it, buckwad.”

I am not suggesting that the option to skip fsck be so obvious as to make it easy for “noobs” to cancel it every time. In fact, you could even display horrible warnings when the user does skip it. But, if the user wants to completely wreck their computer by skipping maintenance steps, then let them. You are not their parent, nanny, dictator, or any other authority.

> So if the filesystem is corrupted such that if the system is booted, the “mission critical” application would silently give the wrong answers, or perhaps trade the wrong stocks, or give 1000 times the amount of X-rays necessary to the human body, would you really be doing the user a favor by giving them the ability to skip an fsck because they are impatient?

But we’re not talking about computers that are monitoring nuclear power plants, or running vital infrastructure; we’re talking about average everyday Joe who wants to check his email, show off some presentations at work, write documents, etc. Just as you wouldn’t require average everyday Joe to fill out a 30 point checklist every time he boots the system to make sure everything is in order, you shouldn’t force “mission critical” level maintenance checks on him either.

> Then too, if you really want to avoid long delays due to periodic fsck’s, the right answer is to use devicemapper, and have a cron script fired during the off-hours (say 1am on Sunday nights, when no one is using the system)

This is not the right answer on a laptop, or an average user’s desktop. In these cases, the user powers down their computer on a regular basis.

There are two ways I could see doing the workaround (which is completely separate from the issue of this bug), which would make power users happy, and keep your noobs in line:
1) Add a boot option that skips fsck. Perhaps “safe mode” on Ubuntu would include this, perhaps not.
2) Add a thread that listens for a key sequence (Ctrl-C?); when it detects the sequence, display a nasty message, and only abort the scan if the user confirms that they’re willing to die for want of their system being properly fsck’ed.
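To make the idea concrete, here is a rough sketch that combines the boot option from 1) with the confirmation step from 2). It is purely illustrative: the option name "skipfsck" is invented for the example and is not an existing Ubuntu boot parameter, and the function names and warning text are placeholders.

```python
def skip_requested(cmdline, option="skipfsck"):
    """True if the (hypothetical) option appears as a whole word on the command line."""
    return option in cmdline.split()


def should_skip_fsck(cmdline, confirm):
    """Skip only if the boot option was given AND the user confirms the warning.

    `confirm` is any callable that shows the message and returns True or False.
    """
    if not skip_requested(cmdline):
        return False
    warning = ("WARNING: skipping the filesystem check can leave corruption "
               "undetected. Really skip it?")
    return bool(confirm(warning))
```

On a running Linux system the command line could be read from /proc/cmdline, and `confirm` could be something as simple as a prompt that only accepts a typed "yes". The point of requiring both is that a noob never sees the option by accident, while a power user can still get past the check from pre-boot.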

Again, I understand the importance of fsck, and I understand that it should be run regularly. However, I also think that there are legitimate reasons for users wanting to skip it on occasion, and that you should provide for these, *from pre-boot*, rather than forcing the user to hack the fstab or use tune2fs *post boot*.

Thanks
