That’s what systemd’s dreaded “a stop job is running” is
The worst part of that is that you can’t quickly login to check what it is (so maybe you can prevent it in the future?), or kill it anyway because it’s likely to be something stupid and unimportant. And if it actually was important, well… it’s gonna be shot in the head in a minute anyway, and there’s nothing you can do to prevent it, so what’s the point of delaying?
so what’s the point of delaying?
In the best case the offending process actually does shut down cleanly before the time is up. Like, some databases like redis keep written data in memory for fast access before actually writing the data to disc. If you were to kill such a process before all the data is written you’d lose it.
So, admins of servers like these might even opt to increase the timeout, depending on their configuration and disc speed.
I know what it theoretically is for, I still think it’s a bad implementation.
- It often doesn’t tell you clearly what it is waiting for.
- It doesn’t allow you to checkout what’s going on with the process that isn’t responding, because logins are already disabled
- It doesn’t allow you to cancel the wait and terminate the process anyway. 9/10 when I get it, it has been because of something stupid like a stale NFS mount or a bug in a unit file.
- If it is actually something important, like your Redis example, it doesn’t allow you to cancel the shutdown, or to give it more time. Who’s to say that your Redis instance will be able to persist its state to disk within 90 seconds, or any arbitrary time?
Finally, I think that well written applications should be resilient to being terminated unexpectedly. If, like in your Redis example, you put data in memory without it being backed by persistent storage, you should expect to lose it. After all, power outages and crashes do happen as well.