environment.d

The upcoming systemd 233 release contains the following feature: the systemd user manager instance will load environment variables from a set of plain-text configuration files in /etc/environment.d/, ~/.config/environment.d/, and /usr/lib/environment.d/. Each file must have a .conf extension and contain a set of KEY=value pairs. Those pairs become part of the environment variable block that the systemd user manager provides for any service that it starts.

For example, let's create ~/.config/environment.d/90-mypython37.conf:

PATH=$HOME/bin:$PATH
PYTHONPATH=${PYTHONPATH:+$PYTHONPATH:}$HOME/lib/python3.7/site-packages
PYTHONDEBUG=1

This does what you'd guess: it prepends ~/bin to $PATH, appends ~/lib/python3.7/site-packages to $PYTHONPATH, and sets $PYTHONDEBUG to "1".

There are some subtleties here: it wouldn't be OK to create a $PATH variable with an empty component, for example PATH=$HOME/bin:/bin:, because a "zero-length (null) directory name in the value of PATH indicates the current directory" (bash(1)), i.e. it allows commands from untrusted directories to be executed. We rely on systemd setting a default value for us, so it is enough to just extend this variable.
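To make the "empty component" problem concrete, here is a minimal Python sketch that flags a PATH-style value containing a zero-length entry. The function name has_empty_component is made up for this illustration; it is not part of systemd.

```python
def has_empty_component(path_value: str) -> bool:
    """Return True if a colon-separated PATH-style value has a zero-length entry.

    A leading colon, a trailing colon, or two adjacent colons each produce
    an empty string when the value is split on ":". A completely empty
    value is flagged as well, since it splits into a single empty entry.
    """
    return "" in path_value.split(":")


if __name__ == "__main__":
    for candidate in ("/home/user/bin:/bin:", "/home/user/bin:/usr/bin"):
        verdict = "unsafe" if has_empty_component(candidate) else "ok"
        print(f"{candidate!r}: {verdict}")
```

The first candidate is the PATH=$HOME/bin:/bin: case from above, with a trailing colon.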

A similar consideration holds for $PYTHONPATH: we want to preserve the existing setting (if any), and just append our new directory at the end. We must avoid adding an empty component to $PYTHONPATH, in order to avoid loading python modules from untrusted directories (cf. python issue 5753). Since we don't know if the variable will be defined (it usually isn't), conditional expansion is used. The syntax is the same as in bash: ${VAR:+...} expands to "..." if and only if $VAR is set to a non-empty value, and to nothing otherwise. Similarly, ${VAR:-...} expands to the value of $VAR if it is set and non-empty, and to "..." otherwise. Only this subset of bash syntax is supported, but it's enough to cover those needs.
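The expansion rules can be sketched in a few lines of Python. This is only an illustration of the semantics, not the parser that systemd uses: the names expand, _expand_braced and _matching_brace are invented here, and an unset variable is treated the same as an empty one, as the colon forms in bash do.

```python
import re

NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _matching_brace(text, open_idx):
    """Return the index of the '}' closing the '{' at open_idx, or -1."""
    depth = 0
    for i in range(open_idx, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1


def _expand_braced(body, env):
    """Expand the inside of ${...}: NAME, NAME:+alt or NAME:-default."""
    m = NAME.match(body)
    if not m:
        raise ValueError(f"bad variable name in ${{{body}}}")
    value = env.get(m.group(0), "")
    rest = body[m.end():]
    if rest == "":
        return value
    if rest.startswith(":+"):
        # Alternate text is used only when the variable is non-empty.
        return expand(rest[2:], env) if value else ""
    if rest.startswith(":-"):
        # Default text is used only when the variable is unset or empty.
        return value if value else expand(rest[2:], env)
    raise ValueError(f"unsupported expansion in ${{{body}}}")


def expand(text, env):
    """Expand $VAR, ${VAR}, ${VAR:+alt} and ${VAR:-default} in text."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("${", i):
            end = _matching_brace(text, i + 1)
            if end != -1:
                out.append(_expand_braced(text[i + 2:end], env))
                i = end + 1
                continue
        elif text[i] == "$":
            m = NAME.match(text, i + 1)
            if m:
                out.append(env.get(m.group(0), ""))
                i = m.end()
                continue
        # Anything else (including a stray "$") is copied literally.
        out.append(text[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    line = "${PYTHONPATH:+$PYTHONPATH:}$HOME/lib/python3.7/site-packages"
    print(expand(line, {"HOME": "/home/user"}))
    print(expand(line, {"HOME": "/home/user", "PYTHONPATH": "/opt/py"}))
```

With $PYTHONPATH unset, the conditional part disappears entirely, so no leading colon (and hence no empty component) is produced.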

Finally, we set $PYTHONDEBUG unconditionally. This will overwrite any value set previously. Files are loaded in alphanumerical order, and environment.d configuration files should have names with a numerical prefix (similarly to tmpfiles.d and sysctl.d). This makes it easy to know the priority of each setting at a glance.
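The effect of the load order can be sketched as follows. This is a simplified model under several assumptions: it looks at a single directory only, sorts by plain string comparison of file names, skips blank lines and lines starting with "#", and performs no variable expansion. The function name load_conf_dir is invented for the example.

```python
from pathlib import Path


def load_conf_dir(directory):
    """Read KEY=value pairs from *.conf files, later files overriding earlier ones."""
    env = {}
    # Sort by file name so that e.g. 10-base.conf is read before 90-mypython37.conf.
    for conf in sorted(Path(directory).glob("*.conf"), key=lambda p: p.name):
        for line in conf.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # assumption: blank lines and comments are ignored
            key, sep, value = line.partition("=")
            if sep and key:
                env[key] = value  # a later assignment replaces an earlier one
    return env
```

If a hypothetical 10-base.conf set PYTHONDEBUG=0, the 90-mypython37.conf file from the example would still win, because it sorts later.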

OK, so we defined a bunch of variables. Where exactly are they used? Each logged-in user has their own systemd user manager, which is started by the system manager (i.e. PID 1) on behalf of systemd-logind, usually when the user first logs in. This systemd --user instance runs as the user@UID.service and starts various services for the user. Variables defined through environment.d will be set for those services. This does not apply to processes which are started externally, for example login and ssh sessions.

Before discussing "the why", let me jump back a bit, and discuss "the how".

Environment generators

systemd --user does not load environment.d directories directly. In fact, it knows nothing whatsoever about those directories and does not support environment variable expansion with "+" and "-" at all. Instead, systemd calls out to a set of "environment generators" — small programs which are invoked when systemd starts and whose job is to print environment variable assignments on standard output.

Environment generators mirror unit file generators, which have been a part of systemd since almost the beginning. Environment generators live in /usr/lib/systemd/system-environment-generators and /usr/lib/systemd/user-environment-generators, and unit file generators live in /usr/lib/systemd/system-generators and /usr/lib/systemd/user-generators. Unit file generators were added to move some complexity and code which requires optional dependencies from the main systemd code into separate binaries.

The man page describes the differences (environment-generators(7)). The two important ones are that environment generators are run sequentially, so that later generators can use the environment variables defined by earlier generators and can override their output, and that the user environment generators are run by the user manager, so their output can be tailored to the user (even though currently the generators themselves are installed system-wide).
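The sequential behaviour can be modelled in Python like this. It is a sketch of the idea only, not how the manager is implemented: run_generators is a made-up name, the generators are passed in as argument lists instead of being discovered in a directory, and the output parsing is reduced to splitting each line on the first "=".

```python
import os
import subprocess


def run_generators(commands, base_env=None):
    """Run each command in turn and collect the KEY=value lines it prints.

    Every generator sees the variables produced by the ones before it,
    and a later generator can override an earlier assignment.
    """
    env = dict(os.environ if base_env is None else base_env)
    generated = {}
    for argv in commands:
        result = subprocess.run(
            argv, env=env, capture_output=True, text=True, check=True
        )
        for line in result.stdout.splitlines():
            key, sep, value = line.partition("=")
            if sep and key:
                env[key] = value        # visible to the next generator
                generated[key] = value  # what would be handed to services
    return generated
```

Because the accumulated environment is passed to each child, the second generator in a list can read a variable that the first one printed, and can also print a new value for it.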

Currently, only one generator is available: the systemd-environment-d-generator, which adds support for environment.d as described in the first part of this note.

The reasons

The main motivation for this blog post was to provide some background for the choices made in the implementation of environment.d.

First, let's consider the question of why we need to support loading environment variables, and why we need it in systemd.

Traditionally, Linux has not had a uniformly supported mechanism to define the environment. There are at least two widely used mechanisms. The most straightforward approach is the shell profile configuration. Shells usually execute any scripts in /etc/profile.d/ and load /etc/environment. An obvious shortcoming is that this mechanism requires a shell to be used. Before systemd this wasn't much of a problem, because the shell was omnipresent and the invocation of most programs went through a shell at some point. In particular, graphical sessions invoked a bunch of shell scripts in /etc/X11/xinit/xinitrc.d/, so starting one more shell to load the user environment wasn't an issue. With systemd, things are moving towards a shell-free environment, and Wayland sessions could do without a shell at all. In fact this was the case with gnome-wayland, until the need to load the environment forced the gnome developers to undo that (gnome bug 736660).

Another mechanism is pam_env. This PAM module supports reading of system-wide configuration from /etc/environment and (optionally) user-specific configuration from ~/.pam_environment. It is fairly widely supported, but cannot be considered a solution. The syntax is a bit quirky:

VARIABLE [DEFAULT=[value]] [OVERRIDE=[value]]

with @{VAR} syntax for PAM variables, and ${VAR} for shell variables. This is obviously not shell-compatible, and fairly easy to get wrong (for example the man page warns that $HOME may not be set). The real deal breaker is the fact that the loading of the environment happens in privileged mode. Setting arbitrary variables is very tricky to do safely (think not just $PATH, but also $LD_PRELOAD, $LD_LIBRARY_PATH, and various other variables which, if set arbitrarily by the user, might derail the code that is running as root). pam_env(8) suggests that this module should be the last one in the PAM stack, but (even assuming that this is enough to close the hole) it's too easy to get this wrong. Thus, the loading of user configuration is usually disabled (at least in the redhattish and debianesque distros). We need a solution which is more flexible, which does not require a parser running as root, and which allows us to safely define arbitrary variables.

In the new scheme of things, systemd (the system instance) starts user@.service for the user, and the systemd running in user@.service directly forks various processes. Systemd is already in the business of defining the environment for child processes, and the systemd --user instance is pretty much the only place where we can add environment variables. It has to be done by the manager itself, because we want all processes to inherit the same environment. It is possible to upload new environment settings after the manager is already running (e.g. systemctl set-environment), but this would only apply to services started later. Thus if we want to influence all services started by systemd --user, the manager itself should support doing this during startup. Since the user manager runs with the privileges of the user, no privilege boundaries are crossed. (In fact the environment defined by generators is not used by the manager itself, but that's because there doesn't seem to be any need, and not because it would be insecure.)

Second, let's consider the question of whether the extended substitution syntax with ${..:-...} and ${..:+...} and arbitrary generators is needed. I think that the example with $PATH and $PYTHONPATH above shows why some kind of conditional logic is needed. Without it, various variables could not be set in a useful way (also $XDG_DATA_DIRS, $LD_LIBRARY_PATH, etc.). The extended substitution syntax solves our immediate needs, but I think it would be too constraining if it was the only option. Even relatively simple things like iterating over a set of directories and adding them to a list if they exist, or setting some variable only for specific users, or defining a variable if some other variable has a specific value, would be impossible. I felt that we'd face immediate pressure to add more logic, until Turing-completeness was reached. This would be implemented in the systemd manager itself, and we'd end up either defining a custom DSL or executing shell code ourselves. It seems much better to sidestep this problem, and allow arbitrary executables to be used as generators. The manager then only needs to fork off the generators (and forking off processes is something that systemd does very well), and parse very simple pairs of VAR=value assignments. On the other hand, generators can be very flexible and implement arbitrary logic. The ordering of generators and their input and output are clearly defined, and hopefully we'll end up with an ecosystem of generators that do their job without stepping on each other's toes.
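As an illustration of the "add directories to a list if they exist" case, a custom generator could look roughly like the Python script below. The candidate directories and the names CANDIDATES and path_value are hypothetical, chosen only for this sketch. The script prints a single PATH=... assignment on standard output, and prints nothing if there is nothing sensible to set.

```python
#!/usr/bin/env python3
import os

# Hypothetical list of per-user directories to add when they are present.
CANDIDATES = ["~/bin", "~/.local/bin"]


def path_value(candidates, current):
    """Build a PATH value: existing candidate directories first, then current.

    Empty components of the current value are dropped, so the result
    never contains a zero-length entry.
    """
    expanded = (os.path.expanduser(c) for c in candidates)
    existing = [d for d in expanded if os.path.isdir(d)]
    inherited = [p for p in current.split(":") if p]
    return ":".join(existing + inherited)


if __name__ == "__main__":
    value = path_value(CANDIDATES, os.environ.get("PATH", ""))
    if value:
        # One KEY=value assignment per line on standard output.
        print(f"PATH={value}")
```

This kind of logic cannot be expressed with ${VAR:+...} and ${VAR:-...} alone, which is the argument for allowing arbitrary executables as generators.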

Third, note that keeping the environment in systemd --user is quite efficient, because parsing of the environment is done just once. systemd-environment-d-generator is written in C and is quite lean, but a custom generator in bash or Python is possible in one line. If a generator proves generally useful, it can always be rewritten in a compiled language for efficiency.

Support for /etc/environment

/etc/environment is pulled into environment.d by a symlink, so it is read by systemd-environment-d-generator.

Current status and further work

The idea and implementation of environment.d directories came from Ray Strode (a.k.a. halfline). I added the support for generators and repurposed Ray's code for systemd-environment-d-generator. Following Lennart Poettering's suggestion, generators are run in a pipeline, with the output of one forming the input for the next one. If any of that turns out to be a bad idea, blame me ;)

The current code loads environment.d into the user manager instance. This defines the environment of processes started by systemd. It can be easily exported (systemctl --user show-environment), but to load it into other login sessions, support from their side is needed. gdm does that already, and login might in the future.
