`systemd-nspawn --bind=/tmp/.X11-unix` broken in 235 · Issue #7093 · systemd/systemd

@guns

Summary

Pull request #6979 should be reconsidered because it breaks a useful systemd-nspawn convention for little practical benefit.

See: #6979 (comment)

Submission type

  • Bug report

systemd version the issue has been seen with

235

Used distribution

Arch Linux

In case of bug report: Expected behaviour you didn't see

  1. Launch container with systemd-nspawn … --bind=/tmp/.X11-unix

  2. Run an X client program successfully from within the container.

In case of bug report: Unexpected behaviour you saw

  1. Launch container with systemd-nspawn … --bind=/tmp/.X11-unix

  2. X client program exits with cannot open display: :0.

  3. X unix socket at /tmp/.X11-unix/X0 on the host has been deleted by systemd-nspawn.

In case of bug report: Steps to reproduce the problem

  1. Create a container if necessary.

  2. Ensure host machine is running an X server that is listening on a unix socket in /tmp/.X11-unix/

  3. Boot the container with:

systemd-nspawn --boot --directory DIR --bind=/tmp/.X11-unix

  4. Check /tmp/.X11-unix/ on the host and notice that all sockets in the directory have been deleted.
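
The final check can also be scripted by counting the socket files in the directory before and after booting the container. This is a minimal sketch, not part of the original report: the function name count_x11_sockets is hypothetical, and the default path is simply the one named in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): count the Unix socket files
# directly inside a directory. Defaults to the path used in the steps above.
count_x11_sockets() {
    dir="${1:-/tmp/.X11-unix}"
    count=0
    for entry in "$dir"/*; do
        # -S is true only for socket files; an unmatched glob or a
        # missing directory simply leaves the count at 0.
        if [ -S "$entry" ]; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}
```

Running count_x11_sockets on the host before and after the systemd-nspawn invocation should print the same number if the sockets survive the boot.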

@shibumi

I can confirm this bug. But some of the points in the bug report are wrong:

X unix socket at /tmp/.X11-unix/X0 on the host has been deleted by systemd-nspawn.

That is not true. The X0 socket is still there; it just seems to be isolated from the rest of the filesystem. You can check this with lsof, which shows, for example:
Xorg 673 674 chris 43u unix 0xffff97a489ffdc00 0t0 118253 @/tmp/.X11-unix/X0 type=STREAM

Bugreport in Arch Linux Bugtracker: https://bugs.archlinux.org/task/55983

@guns guns commented Oct 14, 2017

That is not true. The X0 socket is still there; it just seems to be isolated from the rest of the filesystem. You can check this with lsof, which shows, for example:

Xorg 673 674 chris 43u unix 0xffff97a489ffdc00 0t0 118253 @/tmp/.X11-unix/X0 type=STREAM

@shibumi

Unless I'm mistaken, the issue is that the filesystem entry in the host is unlinked by systemd-nspawn, preventing future connections to the socket on either the container or the host.

Existing connections to /tmp/.X11-unix/X0 are unaffected by deleting the fs entry, so naturally lsof will show that the file handle is still held by existing processes.
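
The unlink behaviour described here can be illustrated with a small sketch. It is an analogy only: plain sh cannot create a Unix socket, so it uses a regular file and an open file descriptor in place of the socket and an established connection. The function name demo_unlink is hypothetical.

```shell
#!/bin/sh
# Analogy only: a regular file stands in for the socket, and an open
# descriptor stands in for an already-established connection.
demo_unlink() {
    tmp=$(mktemp) || return 1
    printf 'still readable\n' > "$tmp"
    exec 3< "$tmp"        # hold the file open, like an existing client
    rm -f "$tmp"          # remove the directory entry (the unlink)
    if [ -e "$tmp" ]; then
        path_state=present
    else
        path_state=missing
    fi
    content=$(cat <&3)    # the open descriptor still reaches the data
    exec 3<&-             # close the descriptor again
    printf '%s %s\n' "$path_state" "$content"
}

demo_unlink   # prints "missing still readable"
```

The path is gone, so nothing new can open it by name, while the holder of the open descriptor is unaffected. That matches the lsof observation above.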

@guns guns commented Oct 15, 2017

Lennart Poettering #6979 (comment):

Well. That's a general problem of making host things available to the container in a writable way. Quite frankly the right way to fix this is to use --bind-ro= instead of --bind= so that the container payload cannot modify what you pass in. That should fix your issue robustly and safely.

Evidently, the read-only mount flag does not prevent connect() on the socket, so mounting with --bind-ro is OK.

@shibumi shibumi commented Oct 16, 2017

@guns how does this solve anything? I tried that with my container.. now I get this message:
Spawning container on root directory is not supported. Consider using --ephemeral.

And with --ephemeral I get: --ephemeral and --link-journal= may not be combined.

And without --link-journal I get a:
Job for systemd-nspawn@work.service failed because a timeout was exceeded.
Are you sure that issue is fixed?

That is my full override.conf:

ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit --boot --link-journal=try-guest --settings=override --machine=%I --capability=CAP_NET_ADMIN --network-veth --bind=/tmp/.X11-unix:/tmp/.X11-unix --setenv="DISPLAY=:0"

@guns guns commented Oct 16, 2017

@shibumi

Are you sure that issue is fixed?

I am sure that my original issue is fixed, because I can successfully launch X client programs from a container launched with --bind-ro=/tmp/.X11-unix.

I'm not sure why your container is failing to boot, however.

@shibumi

@guns can you please post your override.conf or the parameter that you use for it?

@guns guns commented Oct 16, 2017

Sure. I create shell scripts that exec into systemd-nspawn invocations. Here is a minimal reproducible example:

#!/bin/sh

exec /usr/bin/systemd-nspawn \
    --machine=NAME \
    --boot \
    --setenv=DISPLAY="$DISPLAY" \
    --bind-ro=/tmp/.X11-unix
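
A wrapper like this could also check that the host socket for the current display exists before launching the container. The following is a sketch under stated assumptions: the function name x11_socket_path is hypothetical, it handles only local displays of the form :N or :N.S, and it relies on the conventional X socket naming /tmp/.X11-unix/XN.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): map a local DISPLAY value such
# as ":0" or ":0.0" to the conventional socket path /tmp/.X11-unix/X<N>.
# Returns non-zero for anything that is not a plain local display.
x11_socket_path() {
    display="${1:-${DISPLAY:-}}"
    case "$display" in
        :*) ;;             # local display, e.g. ":0" or ":1.0"
        *) return 1 ;;     # empty, or a host:N style display
    esac
    number="${display#:}"      # drop the leading colon
    number="${number%%.*}"     # drop an optional ".screen" suffix
    case "$number" in
        ''|*[!0-9]*) return 1 ;;   # require a purely numeric display number
    esac
    printf '%s\n' "/tmp/.X11-unix/X$number"
}
```

For example, a script could run `sock=$(x11_socket_path) && [ -S "$sock" ]` and print a warning if that fails, before the exec line.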

@shibumi shibumi commented Oct 17, 2017

Hm, as I thought. Your setup is very simple: you don't register your machine with systemd-machined, nor do you set things like --network-veth. I can only say that the current systemd version breaks my container setup.

Should I open a new issue for this or do we re-open this one? I used the container to run a separate VPN and spawn applications out of this container (like a web browser). This way I could work with two browsers at the same time: one with VPN, the other without.

@guns guns commented Oct 17, 2017

Your setup is very simple: you don't register your machine with systemd-machined, nor do you set things like --network-veth

Like I said, it was a minimal example. I use --link-journal=try-guest --private-network --network-veth with my actual containers, and can control them externally with machinectl.

Should I open a new issue for this or do we re-open this one?

Since it can be demonstrated that --bind-ro=/tmp/.X11-unix does not prevent X clients launching from within the machine, your problem appears to be a separate issue.

@shibumi

OK, no idea what that was. Suddenly it works. Thanks for your last comment, which made me test it again.

@poettering

Spawning container on root directory is not supported. Consider using --ephemeral.

This suggests you didn't specify -M nor -D nor -i and / was your working directory? Or that you specified -D/? Either way, it appears quite unrelated to the original issue at hand.

I'd need to know the precise nspawn command line to help you further, if this is still an issue.

@shibumi

@poettering Everything fine now. No idea what happened there. It's working now.