A lot has been written about systemd. So much has been written about it that it’s hard to find out how to get started. So I’m going to do my part and piss in the pond to cloud the waters some more. My goal, however, is not to give you my opinion on whether systemd is a good init system or not, but rather to show you how to use it. Because whether it’s great or not so great: if you’re doing devops, then working with systemd is an inevitability.
What is systemd?
In a nutshell, it’s an init system. That means that its main task is to bootstrap the user space and manage all processes.
But what the fuck do people mean with “bootstrapping the user space”?
That’s a good question. User space is a name for all processes that run outside of the Linux kernel. Once the kernel is running, you can start running your own processes. This process of starting all these processes is “bootstrapping the user space.” Or userland. I mean, if it starts with user- and ends with -space, -land, -environment or -area, we’re probably talking about the same thing. I don’t want to turn this into a Linux tutorial. Honestly, I don’t know that much about it anyway.
As for what systemd does, it does a lot. It provides a whole bunch of services and daemons so that you can log in, use your terminal, configure your network interfaces, et cetera. However, what we’re going to be most interested in for now is managing your processes and having them run in the background, also known as daemonizing or Doing that thing with screen but then proper. Yeah, you can confess your sins here. You dirty process-in-screen-runner, because you never bothered to find out how to daemonize your processes with sysvinit.
How does it work?
When you start your Linux system, the first process that starts is the process that manages all other processes. This is called your init system. This is why your init system has process ID #1. All processes you start are indirectly or directly related to your init process. Init processes are also sweethearts in the sense that they take care of any process you accidentally or intentionally sever from its parent process. The init process will probably kill the orphan anyway. Such is life in the world of processes.
When you shut down your machine, your init system is also responsible for turning the lights off in order and locking the door. Well, it’s a shared responsibility between the owner of the machine and the init system but I’ll save that for some other time.
Creating a service
To get started with creating a service, we need a process, and a process needs to have a purpose. Most processes are important, like handling SSH logins, HTTP requests, web-scaling Mongo, et cetera. Other processes are less important; their only purpose is to pass the butter.
For testing purposes I’m going to come up with a simple process with a simple spec: write the time to a log file every 5 seconds.
I absolutely love Python because it’s as close to pseudocode as you can get, so everybody can follow along.
The process looks like this:
And the utility functions look like this:
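The listings themselves are not reproduced here, so what follows is only a rough sketch of a process that meets the spec, not the original code. The names `run`, `write_line`, `install_handlers` and `StopRequested` are made up for illustration, and it assumes Python 3.5 or newer. The log messages mirror the ones that show up in the log file later in this post.

```python
import datetime
import signal
import time


class StopRequested(Exception):
    """Raised from the signal handler to break out of the loop (hypothetical name)."""

    def __init__(self, signame):
        super().__init__(signame)
        self.signame = signame


def _handle_signal(signum, frame):
    # Turn the signal number into a readable name such as "SIGTERM".
    raise StopRequested(signal.Signals(signum).name)


def install_handlers():
    """Make SIGINT and SIGTERM stop the loop instead of killing us outright."""
    signal.signal(signal.SIGINT, _handle_signal)
    signal.signal(signal.SIGTERM, _handle_signal)


def write_line(path, text):
    """Utility: append one line to the log file."""
    with open(path, "a") as handle:
        handle.write(text + "\n")


def run(log_path, interval=5.0, max_ticks=None):
    """Write the current time to log_path every `interval` seconds.

    max_ticks exists only so the loop can be tried out without running forever.
    Returns the number of timestamps written.
    """
    write_line(log_path, "Starting Chronos")
    ticks = 0
    try:
        while max_ticks is None or ticks < max_ticks:
            write_line(log_path, str(datetime.datetime.now()))
            ticks += 1
            time.sleep(interval)
    except StopRequested as stop:
        write_line(log_path, "Stopping Chronos because of " + stop.signame)
    return ticks
```

To run it as the real thing you would call `install_handlers()` followed by `run(...)` with whatever log path you like; to poke at it, something like `run("/tmp/chronos-test.log", interval=0, max_ticks=3)` returns right away.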
systemd controls daemons and services using unit files. These unit files contain information on how systemd can start, stop, reload and do a whole bunch of other stuff with your service. Unit files live under
system contains unit services installed by the system and
user for unit files created by installed packages.
Let’s take a look at a very bare bones sample unit file:
[Unit]
Description=Application

[Service]
ExecStart=/bin/sh -c 'echo "hello world"'

Beats having to write a system control file in bash, doesn’t it? Let’s take you on a journey to see what is happening here.
|Option|What it does|
|---|---|
|Description|Description of what your service does|
|ExecStart|Command that will be executed when you give systemd the command to start the process|
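If the section-and-option layout feels abstract, here is a small sketch that reads the sample unit file (with its quotes balanced) into a dictionary using Python’s `configparser`. This is only an illustration of the file’s shape: systemd has its own parser with its own rules, and the helper name `read_unit` is made up.

```python
import configparser

# The bare-bones sample unit file, as a string.
SAMPLE_UNIT = """\
[Unit]
Description=Application

[Service]
ExecStart=/bin/sh -c 'echo "hello world"'
"""


def read_unit(text):
    """Return {section: {option: value}} for a simple unit file."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep option names like ExecStart case-sensitive
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


unit = read_unit(SAMPLE_UNIT)
for section, options in unit.items():
    for option, value in options.items():
        print(section + "." + option + " = " + value)
```

Two sections, one option each. That really is the whole file.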
Making a unit file for our service.
As this is an introduction to systemd, I will keep it simple. Future posts will cover more complex cases. And don’t forget, overengineering literally kills millions of people every year.
[Unit]
Description=Chronos Time Logging

[Service]
ExecStart=python /opt/chronos/chronos.py
I’ll place my unit file at /etc/systemd/system/chronos.service. It’s important that you give your unit files the .service extension, as systemd uses these to determine how to handle the unit files during boot. We’ll touch on this in later posts.
For now, and with the unit file in place let’s fire up our service!
[root@localhost cronos]# systemctl start chronos
Failed to start chronos.service: Unit chronos.service failed to load: Invalid argument. See system logs and 'systemctl status chronos.service' for details.
Oh. That’s not good. systemd logs all issues with processes to the system log, so you can easily see what went wrong with journalctl -xe:
[root@localhost cronos]# journalctl -xe
Sep 01 08:54:01 localhost.localdomain systemd: [/etc/systemd/system/chronos.service:5] Executable path is not absolute, ignoring: python /opt/chronos/chronos.py
Sep 01 08:54:01 localhost.localdomain systemd: chronos.service lacks both ExecStart= and ExecStop= setting. Refusing.
Oh deary me. I completely forgot that the executable path has to be absolute.
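That rule is easy enough to check yourself before systemd does it for you. Here is a minimal sketch of such a pre-flight check. The function name `exec_start_problem` is made up, and skipping leading characters like `-` or `@` is my simplification of how systemd treats command prefixes, so treat it as an approximation rather than a copy of systemd’s logic.

```python
import posixpath
import shlex


def exec_start_problem(exec_start):
    """Return a complaint if the command's executable is not absolute, else None."""
    words = shlex.split(exec_start)
    if not words:
        return "ExecStart is empty"
    # Assumption: leading special characters such as '-' or '@' are
    # modifiers rather than part of the path, so they are skipped.
    executable = words[0].lstrip("-@:+!")
    if not posixpath.isabs(executable):
        return "Executable path is not absolute: " + executable
    return None


print(exec_start_problem("python /opt/chronos/chronos.py"))
print(exec_start_problem("/usr/bin/python /opt/chronos/chronos.py"))
```

The first call returns a complaint about `python`, the second returns `None`.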
Are you sure you’re the best guy to tell me how to daemonize my processes with systemd?
This is all part of my plan. You’ve now learnt how to troubleshoot processes not starting. For which you are very much welcome.
The reason why ExecStart=python /opt/chronos/chronos.py won’t work is that python is not an absolute path. /usr/bin/python, on the other hand, is. So let’s give it a go with ExecStart=/usr/bin/python /opt/chronos/chronos.py.
And firing up our service again:
[root@localhost cronos]# systemctl start chronos
[root@localhost cronos]#

Hey, that’s a good sign! Our process is now running in the background, right?

[root@localhost chronos]# systemctl status chronos
● chronos.service - Chronos Time Logging
   Loaded: loaded (/etc/systemd/system/chronos.service; static; vendor preset: disabled)
   Active: active (running) since Thu 2016-09-22 00:29:43 EDT; 8min ago
 Main PID: 2290 (python)
   CGroup: /system.slice/chronos.service
           └─2290 /usr/bin/python /opt/chronos/chronos.py
Holy crap! That’s all it took to daemonize my process? Yep. That’s all it took.
Great, now how do I stop it?
I’m glad you asked!
[root@localhost cronos]# systemctl stop chronos
[root@localhost cronos]# systemctl status chronos
● chronos.service - Chronos Time Logging
   Loaded: loaded (/etc/systemd/system/chronos.service; static; vendor preset: disabled)
   Active: inactive (dead)
[root@localhost cronos]# ps aux | grep chronos
root 17300 0.0 0.0 112616 732 pts/1 R+ 09:01 0:00 grep --color=auto chronos
Are you sure the process ran though?
You little sh– yes I’m sure it ran. Let’s take a look at the logs:
[root@localhost chronos]# cat /var/log/chronos-unknown.log
Starting Chronos
2016-09-22 04:29:43.642116
2016-09-22 04:29:48.647540
...
2016-09-22 04:29:53.653167
2016-09-22 04:29:58.659881
Stopping Chronos because of SIGINT
Tadaa! Started and stopped without a hitch.
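If you don’t trust your eyes, you can let Python check the spec for you. This sketch parses the timestamps from the log excerpt above and computes the gap between consecutive ones. The helper name `tick_gaps` is made up, and note that the `...` line hides entries, so the gap across it is only meaningful for this particular excerpt.

```python
from datetime import datetime

# The log excerpt from above, minus the shell prompt.
LOG_TEXT = """\
Starting Chronos
2016-09-22 04:29:43.642116
2016-09-22 04:29:48.647540
...
2016-09-22 04:29:53.653167
2016-09-22 04:29:58.659881
Stopping Chronos because of SIGINT
"""


def tick_gaps(log_text):
    """Return the seconds between consecutive timestamp lines."""
    stamps = []
    for line in log_text.splitlines():
        try:
            stamps.append(datetime.strptime(line.strip(), "%Y-%m-%d %H:%M:%S.%f"))
        except ValueError:
            continue  # not a timestamp line, e.g. "Starting Chronos"
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


gaps = tick_gaps(LOG_TEXT)
print(gaps)
```

Each gap comes out a few milliseconds over 5 seconds, which is what you would expect from a loop that sleeps for 5 seconds and then does a little work.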
Automatically restarting your service
By default systemd will not restart your process. Even if it gets hit with a SIGKILL, systemd don’t care unless you specify the Restart option in your unit file.
[Unit]
Description=Chronos Time Logging

[Service]
Restart=on-failure
ExecStart=/usr/bin/python /opt/chronos/chronos.py

There are a few options available for Restart, but for now on-failure will do the job, as it will bring the process back up if the main process:
- Ended with an unclean exit code, by default anything other than
- Ended with an unclean signal, anything other than
- Timed out
- The watchdog can no longer find the process
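To make those four conditions a bit more concrete, here is a sketch of the decision as a plain function. It reflects my reading of the systemd.service man page, where an exit code of 0 and the signals SIGHUP, SIGINT, SIGTERM and SIGPIPE count as a clean exit by default. The function name and its parameters are made up, and options such as SuccessExitStatus= can change what counts as clean, which this sketch ignores.

```python
# Signals treated as a clean exit by default (assumption based on the
# systemd.service documentation; not configurable in this sketch).
CLEAN_SIGNALS = {"SIGHUP", "SIGINT", "SIGTERM", "SIGPIPE"}


def restarts_on_failure(exit_code=None, signal_name=None,
                        timed_out=False, watchdog_expired=False):
    """Would Restart=on-failure bring the service back for this outcome?"""
    if timed_out or watchdog_expired:
        return True
    if signal_name is not None:
        # Killed by a signal: only the "unclean" ones trigger a restart.
        return signal_name not in CLEAN_SIGNALS
    # Plain exit: anything other than 0 counts as a failure.
    return exit_code not in (None, 0)


print(restarts_on_failure(exit_code=0))            # clean exit
print(restarts_on_failure(exit_code=1))            # unclean exit code
print(restarts_on_failure(signal_name="SIGKILL"))  # unclean signal
print(restarts_on_failure(signal_name="SIGTERM"))  # clean signal
```

The SIGKILL case is the one we are about to try for real.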
With the new Restart option let’s bring our service back:

[root@localhost chronos]# systemctl start chronos
[root@localhost chronos]# systemctl status chronos
● chronos.service - Chronos Time Logging
   Loaded: loaded (/etc/systemd/system/chronos.service; static; vendor preset: disabled)
   Active: active (running) since Thu 2016-09-22 02:32:00 EDT; 5s ago
 Main PID: 2470 (python)
   CGroup: /system.slice/chronos.service
           └─2470 /usr/bin/python /opt/chronos/chronos.py
Kill it with SIGKILL, and immediately get the status of the process:

[root@localhost chronos]# kill -9 2470
[root@localhost chronos]# systemctl status chronos
● chronos.service - Chronos Time Logging
   Loaded: loaded (/etc/systemd/system/chronos.service; static; vendor preset: disabled)
   Active: active (running) since Thu 2016-09-22 02:33:25 EDT; 959ms ago
 Main PID: 2472 (python)
   CGroup: /system.slice/chronos.service
           └─2472 /usr/bin/python /opt/chronos/chronos.py
The fact that it started 959ms ago verifies that our process was killed and restarted. And there you have it! You should now be able to run a simple process as a service with systemd.
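If you’d rather not squint at timestamps, the Main PID tells the same story: 2470 before the kill, 2472 after. Here is a rough sketch that pulls the state and Main PID out of the status text and compares the two. The function names are made up, and scraping human-readable output like this is fragile, so take it as an illustration only.

```python
import re

# Trimmed-down status output from before and after the kill -9.
BEFORE = """\
   Active: active (running) since Thu 2016-09-22 02:32:00 EDT; 5s ago
 Main PID: 2470 (python)
"""
AFTER = """\
   Active: active (running) since Thu 2016-09-22 02:33:25 EDT; 959ms ago
 Main PID: 2472 (python)
"""


def parse_status(status_text):
    """Extract state, substate and main PID from `systemctl status` text."""
    active = re.search(r"Active:\s*(\S+)\s*\(([^)]*)\)", status_text)
    pid = re.search(r"Main PID:\s*(\d+)", status_text)
    return {
        "state": active.group(1) if active else None,
        "substate": active.group(2) if active else None,
        "main_pid": int(pid.group(1)) if pid else None,
    }


def was_restarted(before_text, after_text):
    """True if the service is active again under a different main PID."""
    before = parse_status(before_text)
    after = parse_status(after_text)
    return (after["state"] == "active"
            and before["main_pid"] is not None
            and after["main_pid"] is not None
            and before["main_pid"] != after["main_pid"])


print(parse_status(AFTER))
print(was_restarted(BEFORE, AFTER))
```

For anything more serious than a blog post, `systemctl show -p MainPID chronos` gives you the PID in a form meant for scripts.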