SysV init must die.

I'm joking. You may happily continue relying on init if you are comfortable with it.

The point is, _I_ was never comfortable with how it works.

Init was not smooth sailing for me when I started using Linux, and I started questioning the need to have it in its current form.

To make a long story short, util-linux's init simply stopped restarting gettys one day. After frustrating debugging, complicated by the fact that I can't run a "debugging version of init" in a sandbox, I found out that init calculated elapsed time incorrectly (with sign extension rather than zero extension), and on that day the bug actually started to trigger. Init wrongly concluded that the gettys were respawning too fast, leaving me with a system with no way to log in.

After I came up with a fix, I had a hard time convincing the maintainer that the bug was real and the fix was needed.

After a few more buglets in util-linux I started looking for a less complex, easier-to-work-with way of doing init's job.

I found tools which do init's job, and in my opinion they do it much better than init. I have been using them for several years now. I mention them whenever people report problems getting their init and/or shutdown to work properly.

me:

I always wondered why signaling init was chosen as the way to initiate a reboot. After all, we do not mount devices by signaling init. We do not bring up network interfaces by signaling init. I mean, we do not do <some admin actions> by 'kill -<somesig> 1', so why do we do this particular admin action (reboot) in this bizarre way?

We can kill all processes, remount RO and reboot without signaling init.

unix guy:

We signal init so that init can run the shutdown script.

If you kill all processes from a shell script, for example, and you are on a system where init is set to respawn daemons, every time you kill one init will start it up again, and you'll be involved in an endless game of whack-a-mole. (Or you'll shut down with daemons starting up, doing who knows what to the filesystem state...)

me:

It is unfortunate that init has no standard way of being told "please stop respawning children". I think it is worth adding, but I won't push it, because I have a more radical change in mind.

I think that an init which respawns daemons is a bad idea per se. It is trying to perform two unrelated things:

It is _wrong_. Unix philosophy is to do one thing, and do it well. _One_ thing.

You can have a separate "daemon spawner" process and thus remove this functionality from init. Init's code will get much simpler:

In fact, init can then be implemented as a shell script which just starts initscripts and wait(2)s for any children in an endless loop. Since it's trivial to edit a shell script and change the name(s) of the initscript(s), you no longer need /etc/inittab. No configuration files. And shutdown/reboot can also be implemented with a shell script.
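A minimal sketch of such a shell-script init follows. The function name mini_init and the /etc/rc default are assumptions made for illustration; the text above does not name its initscript. Whether a given shell also reaps orphans re-parented to PID 1 is something to verify for your shell before relying on it.

```sh
#!/bin/sh
# mini_init RC: start one initscript, then wait for children forever.
# Hypothetical sketch: the name mini_init and the RC path are assumptions.
mini_init() {
    rc=${1:-/etc/rc}
    "$rc" &             # start the initscript in the background
    while :; do
        wait            # returns once all known children have exited
        sleep 1         # avoid spinning when there is nothing to wait for
    done
}

# As PID 1 this would be invoked as:
#   mini_init /etc/rc
```

To change what gets started at boot, you edit the script itself; there is no separate configuration file to parse.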

unix guy:

So init should _not_ respawn mingetty on /dev/tty* when you exit?

The rest of the daemon spawning behavior emerged out of the need to respawn "login" sessions back when teletypes used rolls of paper...

me:

A separate "daemon respawner" tool can do it. While we are at it, we can actually make the task of controlling this easier. For example, I can stop getty from respawning on a particular tty with a single command:

# sv d /var/service/getty6

This informs "daemon respawner" that I don't want getty on tty6 to respawn anymore. How would you do that with init?

unix guy:

What will happen if "daemon spawner" process gets, for example, whacked by the OOM killer?

me:

If it dies or gets killed, the system will not stop. Just the respawning stops, giving you time to log in, find out what's going on, restart it, debug it, fix the bug etc.

What will you do if _init_ fails to restart a daemon, or worse, dies? On Linux, if init dies, you get instant kernel panic and the system is dead. How will you debug such a problem? (Not that it's impossible, it's just more difficult, because init is so special).

unix guy:

Doesn't this new "daemon spawner" thing have to be notified that the system is shutting down now? (If nothing else, by killing it first so it isn't fighting you when you kill everything else?)

me:

In fact, no, it does not need to have any "magic" way of being notified. It just needs to react to SIGTERM in a logical way - that is, clean up and exit (in this specific case "cleanup" is a no-op).

You can send TERM specifically to this process if you want that for whatever reason. The most typical case when TERM is sent, though, is when you shut down the machine. The shutdown script just STOPs, TERMs, and CONTs everybody. It does not need any special knowledge about the "daemon spawner". I like it, because it is a generic, elegant solution, without special cases.
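The STOP/TERM/CONT sequence can be sketched as a small helper. The name stop_term_cont is hypothetical, and the comments on why the order matters are my reading of the sequence, not something spelled out above. A shutdown script would pass -1 as the target (every process the caller is allowed to signal); the helper also accepts ordinary PIDs, which makes it easy to try out safely.

```sh
#!/bin/sh
# stop_term_cont TARGET...: freeze the targets, queue TERM, then thaw them.
# Hypothetical helper; a shutdown script would call: stop_term_cont -1
stop_term_cont() {
    kill -s STOP -- "$@"    # stopped processes cannot react or respawn anything
    kill -s TERM -- "$@"    # TERM stays pending while a process is stopped
    kill -s CONT -- "$@"    # everybody resumes and finds TERM waiting
}
```

After this step a real shutdown script would still have to deal with processes that ignore TERM, then remount filesystems read-only and reboot.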

unix guy:

You're arguing against something people have spent 30 years making work. They do it that way for a reason.

me:

Age is not a valid technical argument. Sendmail is maybe 30 years old too. People are still using it. It doesn't make sendmail any better.

unix guy:

Go make it work your way and then come back to us when you hit a tricky corner case having to do with process group inheritance or console ownership or some such piece of evil, and we'll tell you how it was worked out in the existing code many years ago...

me:

I have been doing exactly this for several years now, and want to let people know that it actually works rather nicely.

I use two utilities - runsvdir and runsv. "runsvdir dir" cd's into dir and starts "runsv subdir" for each subdirectory of it. If any of these runsv processes exits, it is restarted. If new directories appear later, a new runsv process is started for each.

"runsv subdir" looks for an executable named "run" in the subdir, and will run it unless there is also a file named "down" in that dir. If run exits, it gets started again. Very often, it's a shell script which ends with "exec something". Such a runsv-controlled directory is called a "service".
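To make the respawn behaviour concrete, here is a toy imitation of what runsv does with a service directory. It is only an illustration under simplifying assumptions: toy_runsv and its optional MAX argument are made up, and real runsv also handles control commands, logging, and much more. In particular, with a "down" file real runsv stays around waiting for commands, while this toy simply returns.

```sh
#!/bin/sh
# toy_runsv DIR [MAX]: run DIR/run over and over, unless DIR/down exists.
# Hypothetical toy; MAX (0 = no limit) exists only so the loop can be bounded.
toy_runsv() (
    cd "$1" || exit 1
    max=${2:-0}
    n=0
    [ -e down ] && exit 0    # a "down" file means: do not start the service
    while :; do
        ./run                # blocks until the service exits
        n=$((n + 1))
        if [ "$max" -gt 0 ] && [ "$n" -ge "$max" ]; then
            break
        fi
        sleep 1              # crude throttle so a failing ./run cannot spin
    done
)
```

The function body runs in a subshell, so the cd does not affect the caller.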

There is a provision for running a logger task too, but I won't distract you with it for now.

You will have a process tree which looks like this:

 7550 ?        00:00:00   runsvdir /var/service
 7600 ?        00:00:00     runsv xdm
 7605 ?        00:00:00       svlogd
 7606 ?        00:00:00       gdm
 7644 ?        00:00:00         gdm
 7647 tty7     02:08:48           Xorg
20335 ?        00:00:00           startkde
20471 ?        00:00:00             kwrapper
17052 ?        00:00:00     runsv getty1
19182 tty1     00:00:00       getty

There is a tool, sv, for controlling runsv's. "sv s dir1 dir2" shows you the status of the services in dir1 and dir2. "sv d dir" instructs runsv to stop the service (by sending TERM). "sv u dir" starts it again. "sv o dir" starts it once (that is, it won't be restarted if it dies). "sv t dir" sends TERM to the service (which typically is a way to restart it). "sv k dir" sends KILL. Etc.

A few real-world examples of what you can do with runsvdir/runsv.

admin: I have a running system and I want to stop everything and go to singleuser mode for filesystem repair. IOW: I want to go to runlevel 1. Can your runsv crap do that?

me:

# cd /var/service
# sv d *
# sv u getty*
Everything will stop, then only getty's will restart. You can login and do filesystem repairs without other services running.

admin: I have a system with text-based getty logins. I want to have graphical login (say, gdm). IOW: I am at "runlevel 3" and want to go to "runlevel 5".

me: create /var/service/xdm and create a "run" shell script there:

#!/bin/sh
exec gdm -nodaemon

That's all. runsvdir picks it up, starts runsv, and you have your gdm login screen. When you terminate your X session, it restarts.
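If you create services often, the two steps (make the directory, write an executable "run" script) can be wrapped in a helper. This is only a sketch: make_service is a hypothetical name, not part of runit, and it assumes the command and its arguments contain no spaces or shell metacharacters.

```sh
#!/bin/sh
# make_service DIR COMMAND...: create DIR with a "run" script that execs COMMAND.
# Hypothetical helper; arguments are pasted into the script unquoted.
make_service() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    {
        echo '#!/bin/sh'
        echo "exec $*"
    } > "$dir/run.new" || return 1
    # Rename last, so an executable "run" never appears half-written.
    chmod +x "$dir/run.new" && mv "$dir/run.new" "$dir/run"
}

# Usage, matching the example above:
#   make_service /var/service/xdm gdm -nodaemon
```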

admin: I have gdm, databases, web servers and a whole slew of things running under runsvdir. I need to stop X, but other stuff should still run. IOW: I want to go back to "runlevel 3".

me:

# cd /var/service/xdm
# sv d .
# >down (if you also want to prevent xdm service from starting on next reboot)

me: admin, can you stop gettys on ttys 3,5,7, and stop PostgreSQL, but keep everything else running? What runlevel in sysV init-based system is *that*?

admin: WTF??

me: I can do it easily, like this:

# cd /var/service
# sv d getty[357] postgresql

admin: you can edit /etc/inittab and then run "telinit q".

me: This solution works well only if you hand-edit /etc/inittab. Now imagine that you need to do that from a script (say, your script does a weekly cold backup of PostgreSQL and needs to stop it first). Try to code a script which edits /etc/inittab. Things get ugly pretty damn quickly when you try to do it cleanly.
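For comparison, the same job with sv needs no file editing at all. Below is a sketch of a wrapper for the cold-backup case; with_service_down is a hypothetical name, and the backup command in the usage line is a placeholder. It uses only the "sv d" and "sv u" commands described above.

```sh
#!/bin/sh
# with_service_down SERVICE COMMAND...: stop SERVICE, run COMMAND, start it again.
# Hypothetical helper built on "sv d" / "sv u".
with_service_down() {
    svc=$1
    shift
    sv d "$svc" || return 1
    # NOTE: "sv d" only asks runsv to stop the service. A real script should
    # wait here until the service is actually down before touching its data.
    status=0
    "$@" || status=$?
    sv u "$svc"              # bring the service back even if COMMAND failed
    return "$status"
}

# Usage (my_backup_script is a placeholder):
#   with_service_down /var/service/postgresql my_backup_script
```

The wrapper returns the exit status of the command it ran, so a cron job can still notice a failed backup.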

More things you can do with runsvdir (and which are impossible with init):

Over the years, I've run many different things successfully under runsvdir: getty, gpm, sshd, ntp, DHCP client, DHCP server, tftpd, DNS cache, ftp server, various Web servers, mysql, PostgreSQL, gdm, automounter, klogd, syslogd.

Many of these things you'd never even consider running from init, because, I guess, it's awkward to add daemons to the set of things already controlled by init on a running system, and then start/stop them at will as needed.

With runsvdir, it's easy.

Credits and links:

http://cr.yp.to/daemontools.html - D. J. Bernstein's daemontools package.

http://smarden.org/runit/ - Gerrit Pape's GPLed reimplementation of daemontools with slightly different command syntax. runsvdir and runsv which I was talking about above are a part of runit.