Vendor: containerd Project
Vendor URL: https://containerd.io/
Versions affected: 1.3.x, 1.2.x, 1.4.x, others likely
Systems Affected: Linux
Author: Jeff Dileo
CVE Identifier: CVE-2020-15257
Advisory URL: https://github.com/containerd/containerd/security/advisories/GHSA-36xw-fx78-c5r4
Risk: High (full root container escape for a common container configuration)
containerd is a container runtime underpinning Docker and common Kubernetes configurations. It handles abstractions related to containerization and provides APIs to manage container lifecycles. containerd-shim is a binary spawned by containerd that serves as the parent of a container and which implements container lifecycle and reconnection logic that it exposes to containerd through the containerd shim API. This API is exposed over an abstract namespace Unix domain socket that is accessible from the root network namespace. Due to this, non-user namespaced containers with host networking can access this API and cause containerd-shim to perform dangerous actions and spin up arbitrarily privileged containers, enabling container escapes and escalation to full root privileges on the host.
WithStart(), newCommand(), serve(), newServer(), UnixSocketRequireSameUser()
An attacker that is able to run or compromise a host network container running as UID 0 can escape the container, escalate privileges, and compromise the host.
containerd is a core container runtime that manages runc-based containers and is used by Docker (from which it was spun out) and Kubernetes, either through Docker or directly through the containerd CRI shim. Generally, containerd exists as a long-running service daemon that exposes gRPC APIs (e.g. those for containers and tasks) for container lifecycle management operations (e.g. container execution and supervision, image handling, etc.). To implement its APIs, containerd does not directly parent the containers that it creates and oversees on behalf of its clients. Instead, containerd spawns containerd-shim processes that manage the lifecycle of each container. containerd-shim stays alive for the duration of the container’s life to manage it and directly invokes the runc binary to spawn and run the container itself.
To serve its own gRPC (actually ttrpc, an embedded gRPC implementation and
wire protocol) APIs (e.g. v1 and v2), containerd-shim listens on an abstract Unix
domain socket. These are Linux-specific Unix domain sockets that use
length-prefixed keys that begin with a null byte and may contain arbitrary
binary sequences. These containerd-shim sockets take different forms across
different containerd versions; however, a common behavior is that they embed a
trailing null byte in the abstract Unix domain socket sun_path key, which
prevents a number of common Unix tools (e.g. socat) from connecting to it.
@/containerd-shim///shim.sock\0
@/containerd-shim/.sock\0
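To make the role of the trailing null byte concrete, the following Python sketch builds an address of the same general shape and checks whether a client can connect to it. The helper names are illustrative only, any socket name passed in is a hypothetical placeholder rather than a real containerd-shim path, and a Linux host is assumed.

```python
import socket


def shim_style_address(name: str) -> bytes:
    """Build an abstract-namespace address with a trailing NUL.

    The leading NUL selects the abstract namespace; the trailing NUL is
    simply one more byte of the key, so a client must send it verbatim.
    """
    return b"\0" + name.encode() + b"\0"


def try_connect(address: bytes) -> bool:
    """Return True if a stream connection to `address` succeeds."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(address)
        return True
    except OSError:
        # For example ECONNREFUSED when nothing listens on that exact key.
        return False
    finally:
        client.close()
```

With a listener bound to shim_style_address(...), try_connect succeeds on the full address and fails once the last byte is removed, which is presumably the form a tool ends up sending if it handles the name as a NUL-terminated string.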
While containerd-shim is more than capable of binding and listening on such a
socket itself when passed the --socket CLI flag, it also supports receiving
an arbitrary socket file descriptor from its parent process. containerd uses
this approach and pre-creates and listen(2)s on the abstract Unix domain socket
before the containerd-shim child process is created so that it may be
initialized with a handle to it. containerd-shim then starts its containerd
shim API ttrpc server on the socket. As abstract Unix domain sockets are
otherwise permissionless, containerd-shim uses standard Unix domain socket
features to validate that incoming connections have the same UID and EUID
(effective UID) as the containerd-shim process itself (typically UID:0 and
EUID:0, root).
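The kind of check described above can be sketched in Python using the Linux SO_PEERCRED socket option, which reports the credentials of the process on the other end of a Unix domain socket. This is a minimal illustration rather than the ttrpc implementation: the function names are hypothetical, and it assumes the Linux layout of struct ucred (pid, uid, gid).

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid;
_UCRED = struct.Struct("iII")


def peer_credentials(conn: socket.socket) -> tuple:
    """Return (pid, uid, gid) of the peer of a connected Unix socket."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, _UCRED.size)
    return _UCRED.unpack(raw)


def require_same_user(conn: socket.socket) -> bool:
    """Accept the peer only if its UID matches this process's effective UID."""
    _pid, uid, _gid = peer_credentials(conn)
    return uid == os.geteuid()
```

A check of this form only compares user IDs. It says nothing about which container or namespace the peer lives in, so a process running as UID 0 without a user namespace satisfies it against a server running as root.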
However, unlike normal Unix domain sockets, which are bound to file paths,
abstract Unix domain sockets are tied to the network namespace of a process.
As a result, containers that use host networking
(e.g. docker run --network host alpine ...) share the host's network namespace
and will be able to access the containerd-shim sockets.
Furthermore, while most containerization platforms run their containers with
a minimal set of Linux capabilities (the constituent privileges of root), they
also do not run the containers in user namespaces, resulting in containers
that run as a root user with reduced privileges. Due to this, such containers run
by default with a host user namespace UID and EUID of 0. This combination
enables such containers to enumerate containerd-shim sockets (e.g. via
netstat -xl or /proc/net/unix) and successfully connect to them.
containerd-shim exposes a number of dangerous APIs that can be used to escape a container and execute privileged commands. Across the two main versions of containerd(-shim) in use, 1.2.x and 1.3.x, the following exploit primitives are exposed to users, among others:
As a result, it is trivial for an attacker to compromise the host if they can reach the containerd shim API.
Abstract namespace Unix domain sockets should not be used to communicate with containerd-shim. Instead, the connection should be performed over unnamed Unix domain sockets created with socketpair(2), or Unix domain sockets bound to a file path, like /run/containerd/containerd.sock and /run/containerd/containerd.sock.ttrpc. If this is not feasible, stricter access control checks would need to be performed to validate incoming shim API clients, and it may be necessary to modify the connection handshake to provide additional authentication data and/or identification. It should be noted that it is insufficient to check that the connecting process is not a child of containerd-shim itself as the process could still connect to the shim API of a different container’s containerd-shim.
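As an illustration of the socketpair(2) recommendation, the Python sketch below creates an unnamed socket pair and hands one end to a child process by file descriptor inheritance. Because the socket has no name in either the filesystem or the abstract namespace, other processes have nothing to look up and connect to. The child program and function names are hypothetical stand-ins and do not reflect how containerd or containerd-shim are structured.

```python
import socket
import subprocess
import sys

# Hypothetical child: adopts the inherited descriptor and sends one message.
CHILD_CODE = (
    "import socket, sys\n"
    "s = socket.socket(fileno=int(sys.argv[1]))\n"
    "s.sendall(b'hello from child')\n"
    "s.close()\n"
)


def spawn_with_private_channel():
    """Start a child that shares an unnamed Unix socket with the parent."""
    parent_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(child_end.fileno())],
        pass_fds=(child_end.fileno(),),  # keep only this descriptor open in the child
    )
    child_end.close()  # the parent keeps just its own end
    return parent_end, proc


def read_all(conn: socket.socket) -> bytes:
    """Read from `conn` until the peer closes its end."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)
```

Calling spawn_with_private_channel() and then read_all() on the returned socket yields the child's message. The only processes able to use the channel are those that inherited or were explicitly passed one of the two descriptors.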
For users running container workloads on vulnerable systems, this issue may be mitigated by disallowing host networking from any containers that are not user namespaced, or by ensuring that such containers are run with a non-zero UID/GID.
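The conditions in this mitigation can be written down as a small predicate, which may help when auditing container configurations. The sketch below is a simplification under stated assumptions: the ContainerConfig type and its field names are hypothetical and do not correspond to any Docker or Kubernetes API, and the logic only encodes the three conditions named in this advisory.

```python
from dataclasses import dataclass


@dataclass
class ContainerConfig:
    """Hypothetical summary of the settings relevant to this issue."""
    host_network: bool      # container shares the host network namespace
    user_namespaced: bool   # container runs inside a user namespace
    uid: int                # UID the container's processes run as


def can_reach_shim_api(config: ContainerConfig) -> bool:
    """True if the container meets all conditions described in this advisory."""
    return config.host_network and not config.user_namespaced and config.uid == 0
```

For example, ContainerConfig(host_network=True, user_namespaced=False, uid=0) is flagged, while changing any one of the three fields clears the flag.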
Users should update to the newest versions of containerd that include patches for this issue. Additionally, as any running containers created prior to updating containerd to a fixed version will remain vulnerable after the update, users will need to ensure that all containers are fully stopped and then restarted after the update is completed.
For users who are uncertain about whether CVE-2020-15257 affects them, the below command can be used to quickly determine if a container created by a vulnerable version of containerd is still running. If any results are returned, a vulnerable containerd-shim process is running.
$ cat /proc/net/unix | grep 'containerd-shim' | grep '@'
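For environments where the check needs to be scripted, the same filter can be expressed in Python. This is a rough equivalent of the command above rather than an official detection tool: the function names are hypothetical, and it assumes the usual /proc/net/unix layout in which the socket name is the eighth whitespace-separated column and abstract names are shown with a leading '@'.

```python
from typing import List


def find_shim_sockets(proc_net_unix: str) -> List[str]:
    """Return abstract socket names that mention containerd-shim."""
    matches = []
    for line in proc_net_unix.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # unnamed socket: no Path column
        path = fields[7]
        if path.startswith("@") and "containerd-shim" in path:
            matches.append(path)
    return matches


def scan_host(path: str = "/proc/net/unix") -> List[str]:
    """Apply find_shim_sockets to the live table on a Linux host."""
    with open(path, "r", errors="replace") as handle:
        return find_shim_sockets(handle.read())
```

As with the shell command, a non-empty result from scan_host() indicates that a containerd-shim process is still listening on an abstract socket.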
6/03/20 - NCC Group emailed the security email of the containerd project
asking for a means of secure communication to disclose vulnerability
information
6/03/20 - NCC Group disclosed vulnerability to the containerd project along
with exploit code targeting containerd 1.2.x and 1.3.x
6/04-05/20 - After some initial conversation over email about possible
remediations, communication migrated to GitHub.
6/05/20 - NCC Group discussed the (in)feasibility of relying on
AppArmor/SELinux to remediate this issue.
6/12/20 - NCC Group requests an update.
6/15/20 - Issue is not accepted as a security vulnerability in containerd.
The containerd project indicates that while a fix will be applied, it
will not be backported to in-use branches. A sample patch is shared
with NCC Group.
6/15-16/20 - Further replies and conversation occurred about the aforementioned
patch's implementation and its incompatibility with prior versions
of containerd. NCC Group provided information on an alternate
approach that could work for all versions.
6/19-24/20 - Further development of a patch occurs by a containerd maintainer
who requests and receives permission to make a public pull
request. The implementation follows NCC Group's original
recommendation and would be compatible across containerd versions.
7/10/20 - NCC Group requests an update and an estimate on when the fix will be
merged and applied to older containerd branches.
7/13/20 - A containerd maintainer replies stating that the upcoming 1.4.0
release will forgo having the fix applied, and that instead, it will
be applied as a fix in 1.4.1 and to at least the 1.3.x branch.
9/04/20 - After a lack of updates, NCC Group states an intention to publish a
technical advisory for this issue, and asks if anyone can confirm if
the fix has been applied/backported as the standing pull request was
commented as having been pushed to the future 1.5.x release. NCC
Group also asks for a timeline on when the issue will be fixed and
states that they can wait up to 30 days (10/05/20) or until a fix is
released to publish the advisory since the issue was not accepted as
a vulnerability.
9/10/20 - A containerd maintainer replies stating that the issue is still not
fixed and that the pull request is not likely to be merged soon. They
ask for reconsideration of the backwards-incompatible fix.
9/10/20 - NCC Group replies with concerns about the approach of the
backwards-incompatible fix, including a timing side channel in the
implementation that would enable guessing the authentication secret,
and a bias in the PRNG used to create it.
10/02/20 - A maintainer replies with a potential fix based on verifying that the
PID of the connecting process is on the host mount namespace.
Immediately after this, a containerd security advisor asks if NCC
Group still plans to publish a technical advisory on 10/05/20 and if
they would be open to having a conversation about the issue.
10/02/20 - NCC Group replies raising a concern over a possible race condition
in the underlying mechanism of the potential fix. NCC Group also states
that they can postpone publishing the advisory, and would be happy
to converse about the issue if it would help to have it fixed. Over
email, meeting availability is exchanged.
10/06/20 - NCC Group, a containerd security advisor and two containerd
maintainers discuss the issue in a call and agree on a plan to
remediate the issue as a vulnerability, with patches applied to
supported branches of containerd.
10/06/20-11/04/20 - The containerd project works on implementing the fixes
across several supported protocol versions and backports the
patches to the 1.4.x and 1.3.x branches.
10/16/20 - CVE-2020-15257 is issued for this vulnerability.
11/10-13/20 - NCC Group reviews and tests the patches, and provides feedback
on the changes; no major issues are identified. Subsequent
discussion resolves questions raised in the feedback.
11/13/20 - A follow-up call occurs to discuss disclosure timelines, patch
releases, and embargo dates.
11/13-30/20 - Patches are provided under embargo to vendors and Linux
distributions.
11/19-25/20 - A containerd security maintainer backports the patches to the
end-of-life containerd 1.2.x for Linux distributions using that
version. After discussion and analysis, a backport based on
similar patches provided by Canonical and Google is selected for
merging into the 1.2.x branch.
11/30/20 - containerd publishes a security advisory for this issue,
CVE-2020-15257.
11/30/20 - NCC Group publishes this security advisory following the containerd
publication.
Michael Crosby, Samuel Karp, and Derek McGowan of the containerd project.