# Admission Control
This document details the philosophy and methods for implementing admission control in a Kubernetes cluster, such as Tanzu Kubernetes Grid (TKG). It covers architectural considerations, tooling choices, and best practices. This document represents how the VMware field team approaches admission control in enterprise Kubernetes environments.
Each section covers architectural recommendations and, at times, configuration for each concern. At a high-level, the key recommendations are:
- Unless mutation or interaction with external systems is required, OPA Gatekeeper is often the best solution.
- Admission controllers sit in the critical path to the API server, so consider this potential bottleneck carefully.
- If writing a controller, weigh the trade-off between purpose-built frameworks and familiar language stacks.
## Architecture
When objects are applied to the Kubernetes API server they go through a series of steps before being committed to etcd (the persistent datastore): the request is authenticated and authorized, passed through the mutating admission controllers, validated against the object schema, passed through the validating admission controllers, and finally persisted.
There are many admission controllers built in to Kubernetes to provide common functionality. These include:

- `ServiceAccount`: responsible for injecting the default service account into pods where necessary.
- `ResourceQuota`: responsible for enforcing resource quotas on workloads.
- `NamespaceLifecycle`: responsible for preventing new objects from being created in a namespace that is in a `Terminating` state.
A set of admission controllers is enabled by default. These defaults include two special admission controllers, `MutatingAdmissionWebhook` and `ValidatingAdmissionWebhook`, that allow cluster administrators to dynamically register additional admission controllers that are called via webhook from the API server (the built-ins are implemented in-tree). This guide focuses only on designing and writing mutating and validating admission webhooks.
Because they are called as webhooks, dynamic admission controllers (both mutating and validating) can run in-cluster or out-of-cluster (lending themselves to work well as serverless functions, for example). One caveat is that the API server calls webhooks over TLS, so webhooks must present certificates trusted by the Kubernetes API server. This is often achieved by deploying cert-manager into the cluster and generating certificates automatically.
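For example, cert-manager's CA injector can populate the `caBundle` of a webhook configuration from an annotation. A minimal sketch (the names are illustrative; the annotation follows cert-manager's documented `cert-manager.io/inject-ca-from` convention):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-webhook
  annotations:
    # cert-manager's cainjector watches this annotation and injects the CA of
    # the referenced Certificate into .webhooks[].clientConfig.caBundle.
    cert-manager.io/inject-ca-from: example-ns/example-webhook-cert
webhooks:
  # ... rules and clientConfig as in the annotated example later in this guide.
```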
## Mutating Admission Controllers
Mutating admission controllers receive `AdmissionReview` requests from the API server and can optionally alter objects before allowing them to continue through the request flow (or reject them). These types of controllers are useful for things like injecting sidecar containers while keeping a clean UX for end users, e.g. Istio.

If a controller chooses to mutate a request, it allows the request by sending back an `AdmissionReview` response object along with a serialized set of `JSONPatch` objects that describe to the API server how the object should be altered. Depending on the implementation chosen (details below), these patches may be auto-generated for you.
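Concretely, a mutating response carries its patch base64-encoded in the `patch` field. A minimal sketch of the wire format (the `uid` must echo the one from the incoming request; the patch content is illustrative):

```json
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<uid copied from the incoming request>",
    "allowed": true,
    "patchType": "JSONPatch",
    "patch": "<base64 of: [{\"op\":\"add\",\"path\":\"/metadata/labels/injected\",\"value\":\"true\"}]>"
  }
}
```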
One downside of mutating controllers is that they reduce visibility for the end user: the objects stored in the cluster are no longer identical to those the user originally created, which can cause confusion if the user is unaware that mutating controllers are in operation on the cluster.
## Validating Admission Controllers
Validating admission controllers also receive `AdmissionReview` requests from the API server but are not able to modify them; they can only admit or reject the original object.
This restriction makes them fairly limited; however, they are a good fit for ensuring that objects applied to the cluster conform to security standards (specific user IDs, no host mounts, etc.) or contain all required metadata (internal team labels, annotations, etc.).
## Configuring Webhook Admission Controllers
Cluster administrators can use the `MutatingWebhookConfiguration` and `ValidatingWebhookConfiguration` kinds to specify the configuration of dynamic webhooks. Below is an annotated example describing all of the relevant sections:
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: "test-mutating-hook"
webhooks:
  - name: "test-mutating-hook"
    # Matching rules: which API groups / versions / resources / operations
    # should be sent to this webhook.
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        # The operations that should trigger a call to the webhook.
        operations: ["CREATE"]
        # Which resources to target.
        resources: ["pods"]
        # Whether namespace-scoped or cluster-scoped resources should be targeted.
        scope: "Namespaced"
    # Describes how the API server should connect to the webhook. In this case
    # it's in-cluster at `test-service.test-ns.svc`.
    clientConfig:
      service:
        namespace: test-ns
        name: test-service
        path: /test-path
        port: 8443
      # A PEM-encoded CA bundle which will be used to validate the webhook's server certificate.
      caBundle: "Ci0tLS0tQk...tLS0K"
    # Declare the admissionReviewVersions that the webhook supports.
    admissionReviewVersions: ["v1", "v1beta1"]
    # Describes whether the webhook has external side effects (calls / dependencies to external systems).
    sideEffects: None
    # How long to wait before triggering the failurePolicy.
    timeoutSeconds: 5
    # Whether this webhook can be re-invoked (this may happen after other webhooks have been called).
    reinvocationPolicy: IfNeeded
    # Whether the webhook should fail 'open' or 'closed'. This has security implications.
    failurePolicy: Fail
```
## Design Considerations
**Failure Modes:** If a webhook is unreachable or sends an unknown response back to the API server, it is treated as failing. Administrators must choose whether to fail 'open' or 'closed' in this situation by setting the `failurePolicy` field to `Ignore` (allow the request) or `Fail` (reject the request).

For security-related (or critical-functionality) webhooks, `Fail` is the safest option. For non-critical hooks, `Ignore` may be safe, potentially in conjunction with a reconciling controller as a backup. Combine these recommendations with those in the performance section below.
**Ordering:** The first thing to note about the request flow above is that mutating webhooks are all called (potentially more than once) before validating webhooks are called. This matters because it ensures that validating webhooks (which may reject a request based on security requirements) always see the final version of a resource before it is admitted.
Mutating webhooks are not guaranteed to be called in a specific order, and may be called multiple times if subsequent hooks modify a request. This behavior can be tuned by setting the `reinvocationPolicy`, but ideally webhooks should be designed to be idempotent so that ordering does not affect their functionality.
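As an illustration, a mutation written in Go can guard against re-invocation by checking for its own previous work before applying it again. A minimal sketch (the sidecar name and image are hypothetical):

```go
import corev1 "k8s.io/api/core/v1"

// ensureSidecar is idempotent: re-invoking it against an already-mutated pod
// is a no-op, so re-invocation and ordering do not change the end result.
func ensureSidecar(pod *corev1.Pod) {
	for _, c := range pod.Spec.Containers {
		if c.Name == "logging-sidecar" {
			return // already injected; nothing to do
		}
	}
	pod.Spec.Containers = append(pod.Spec.Containers, corev1.Container{
		Name:  "logging-sidecar",                  // hypothetical sidecar
		Image: "registry.example.com/sidecar:1.0", // hypothetical image
	})
}
```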
**Performance:** Webhooks are called as part of the critical path of requests flowing to the API server. If a webhook is critical (e.g. security-related) and fails closed (a timeout causes the request to be denied), it should be designed with high availability in mind.

If a webhook is resource-intensive and/or has external dependencies, consider how often the hook will be called and the performance impact of adding its functionality to the critical path. In these situations a controller that reconciles objects once they are in-cluster may be preferable.
**Side Effects:** Some webhooks may be responsible for modifying external resources (e.g. a resource in a cloud provider) based on a request to the Kubernetes API. These webhooks should be aware of and respect the `dryRun` option, skipping external state modification when it is enabled. Webhooks are responsible for declaring that they either have no side effects, or that they respect this option, by setting the `sideEffects` field.
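In a controller-runtime handler, for example, the dry-run flag is available on the incoming request. A sketch of the guard, where `createExternalResource` is a hypothetical helper representing the external call:

```go
import (
	"context"
	"net/http"

	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

func handleWithSideEffects(ctx context.Context, req admission.Request) admission.Response {
	// DryRun is a *bool on the embedded AdmissionRequest, so check both
	// nil-ness and value before touching external state.
	if req.DryRun == nil || !*req.DryRun {
		if err := createExternalResource(ctx, req); err != nil { // hypothetical helper
			return admission.Errored(http.StatusInternalServerError, err)
		}
	}
	return admission.Allowed("")
}
```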
## Implementations
There are three main approaches to implementing admission control in Kubernetes. This section details each approach and calls out the advantages / disadvantages and most appropriate use cases for each.
### Open Policy Agent (OPA) Gatekeeper
Gatekeeper is an open-source tool that uses Open Policy Agent (OPA) to implement a validating admission controller. It allows users to write constraints in the Rego language to specify rules that specific resources must conform to.
Administrators specify `ConstraintTemplate` resources, which are templates with variable placeholders that can be re-used by end users. For example, to specify a `ConstraintTemplate` that allows a user to require certain labels, the following could be applied:
```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
        listKind: K8sRequiredLabelsList
        plural: k8srequiredlabels
        singular: k8srequiredlabels
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("you must provide labels: %v", [missing])
        }
```
The `ConstraintTemplate` contains Rego code that can refer to input parameters defined at a later time. An end user can then apply the following object to make use of the template and enforce the constraint:
```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-gk
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["gatekeeper"]
```
The above selects all `Namespace` objects and ensures they have the `gatekeeper` label.
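With that constraint in place, a `Namespace` like the following (an illustrative example) would be admitted, while one missing the label would be rejected:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    gatekeeper: "true"  # carries the label key required by the constraint above
```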
There are also use cases where it's necessary to make a policy decision based on data that exists outside of the object being applied to the cluster. An example would be ensuring that all ingresses are unique, which requires querying the state of the cluster so that every existing ingress object can be compared against the one being applied.

Another, more complex example is to annotate a namespace with a regex pattern and ensure that any ingress applied in that namespace conforms to the regex. A Gatekeeper sync config must be applied to make existing cluster state available to Gatekeeper; this tells Gatekeeper to sync and cache information about the specified resources:
```yaml
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: "gatekeeper-system"
spec:
  sync:
    syncOnly:
      - group: "extensions"
        version: "v1beta1"
        kind: "Ingress"
      - group: "networking.k8s.io"
        version: "v1beta1"
        kind: "Ingress"
      - group: ""
        version: "v1"
        kind: "Namespace"
```
The `ConstraintTemplate` uses the `data.inventory` document to look up items from the sync cache:
```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: limitnamespaceingress
spec:
  crd:
    spec:
      names:
        kind: LimitNamespaceIngress
        listKind: LimitNamespaceIngressList
        plural: limitnamespaceingresss
        singular: limitnamespaceingress
      validation:
        # Schema for the `parameters` field in the constraint
        openAPIV3Schema:
          properties:
            annotation:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package limitnamespaceingress

        violation[{"msg": msg}] {
          regex := data.inventory.cluster.v1.Namespace[input.review.object.metadata.namespace].metadata.annotations[input.parameters.annotation]
          host := input.review.object.spec.rules[_].host
          not re_match(regex, host)
          msg := sprintf("Only ingresses with host matching %v are allowed in namespace %v", [regex, input.review.object.metadata.namespace])
        }
```
The `annotation` input parameter allows the user to specify which annotation Gatekeeper should pull the regex pattern from. A Rego rule body only matches if every statement in it evaluates to true, so `re_match()` is inverted with `not` to ensure that a positive match produces no violation and the request is allowed.
The `LimitNamespaceIngress` object below specifies that the rule should apply to `Ingress` objects in both `apiGroups`, and designates `allowed-ingress-pattern` as the annotation that should be inspected for the regex pattern (this was the customizable input parameter).
```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: LimitNamespaceIngress
metadata:
  name: limit-namespace-ingress
spec:
  match:
    kinds:
      - apiGroups: ["extensions", "networking.k8s.io"]
        kinds: ["Ingress"]
  parameters:
    annotation: allowed-ingress-pattern
```
Finally, the `Namespace` object itself is applied with the custom annotation and pattern:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    # Note regex special character escaping
    allowed-ingress-pattern: \w\.my-namespace\.com
  name: ingress-test
```
Now that the setup is complete, ingress objects can be applied and the rules evaluated against them:
```yaml
# FAILS because the host doesn't match the pattern above
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: test-1
  namespace: ingress-test
spec:
  rules:
    - host: foo.other-namespace.com
      http:
        paths:
          - backend:
              serviceName: service1
              servicePort: 80
---
# SUCCEEDS as the pattern matches
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: test-2
  namespace: ingress-test
spec:
  rules:
    - host: foo.my-namespace.com
      http:
        paths:
          - backend:
              serviceName: service2
              servicePort: 80
```
The second ingress above succeeds because its `spec.rules[].host` matches the regex pattern specified in the `allowed-ingress-pattern` annotation on the `ingress-test` namespace. The first ingress does not match and results in an error:
```text
Error from server ([denied by limit-namespace-ingress] Only ingresses with host matching \w\.my-namespace\.com are allowed in namespace ingress-test): error when creating "ingress.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [denied by limit-namespace-ingress] Only ingresses with host matching \w\.my-namespace\.com are allowed in namespace ingress-test
```
**Advantages:**

- Extensible `ConstraintTemplate` model allows admins to define common policies and share / re-use them as libraries.
- Doesn't require any custom coding (outside of Rego policies).
- Fairly mature, community-supported project.
- Easy to install.
**Disadvantages:**

- Only supports validating (not mutating) admission control.
- Rego can get unwieldy when writing non-trivial evaluation logic.
- Care needs to be taken with the defaults (e.g. `failurePolicy` is set to `Ignore`).
### OPA with kube-mgmt
Before Gatekeeper was released, there was an alternative approach to using OPA with Kubernetes. This involved deploying a `kube-mgmt` component to the cluster, which would watch for `ConfigMap` objects containing Rego policies and load them into OPA. This is in contrast to Gatekeeper's more Kubernetes-native approach of using CRDs for policies and templates.
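For reference, a policy loaded this way looked roughly like the following sketch, based on the kube-mgmt documentation (the namespace, label, and rule are illustrative and may vary by version):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: required-team-label
  namespace: opa  # the namespace kube-mgmt is configured to watch
  labels:
    openpolicyagent.org/policy: rego  # marks this ConfigMap as a policy to load
data:
  main: |
    package kubernetes.admission

    deny[msg] {
      input.request.kind.kind == "Pod"
      not input.request.object.metadata.labels.team
      msg := "pods must carry a team label"
    }
```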
### Controller-Runtime
The upstream controller-runtime project provides abstractions that ease the creation of mutating and validating admission webhooks.

An example of how to implement these hooks for built-in types (e.g. Pod, Deployment, etc.) is provided by the project to help users get started.

This approach is useful when complex logic or side effects are required in an admission controller. While it does require coding knowledge (Go), it offers great flexibility while still taking advantage of the abstractions offered by controller-runtime.
Webhooks must implement a `Handle` method with the following signature:

```go
func (w *Webhook) Handle(ctx context.Context, req admission.Request) admission.Response
```
The `admission.Request` object is an abstraction over the raw JSON that webhooks receive, providing easy access to the raw applied object, the operation being executed (e.g. `CREATE`), and so on.
Within the handler, create a new instance of the type being intercepted and decode the raw object into it. In the case below a new Pod struct is populated:

```go
// Assumes corev1 "k8s.io/api/core/v1" is imported.
p := &corev1.Pod{}
err := w.decoder.DecodeRaw(req.Object, p)
```
The `p` object can be modified or validated in any way before the response is returned. In the example below an annotation is added to the Pod before marshaling the object back to JSON and sending the patched response back to the API server:

```go
// Guard against a nil annotations map before writing to it.
if p.Annotations == nil {
	p.Annotations = map[string]string{}
}
p.Annotations["new-annotation"] = "new-annotation-value"

marshaledPod, err := json.Marshal(p)
return admission.PatchResponseFromRaw(req.Object.Raw, marshaledPod)
```
Controller-runtime's `PatchResponseFromRaw` function automatically calculates the `JSONPatch` diff between the original raw object and the modified one, and sends the correctly serialized response.
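For the mutation above, the generated patch would look something like the following (assuming the pod had no annotations beforehand; the exact operations depend on the original object):

```json
[
  {
    "op": "add",
    "path": "/metadata/annotations",
    "value": { "new-annotation": "new-annotation-value" }
  }
]
```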
In the case of a simple validating hook, controller-runtime provides the convenience functions `admission.Allowed()` and `admission.Denied()` that can be returned after processing the required logic.
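Putting these pieces together, a minimal validating handler might look like the sketch below (the host-network rule is an arbitrary example policy, and the `Webhook` struct with its injected decoder matches the snippets above):

```go
import (
	"context"
	"net/http"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// Handle rejects pods that request host networking (illustrative policy).
func (w *Webhook) Handle(ctx context.Context, req admission.Request) admission.Response {
	p := &corev1.Pod{}
	if err := w.decoder.DecodeRaw(req.Object, p); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}
	if p.Spec.HostNetwork {
		return admission.Denied("host networking is not permitted")
	}
	return admission.Allowed("")
}
```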
One of the biggest additional advantages of using Kubebuilder is the use of markers to auto-generate the manifests for deployment and RBAC.

```go
// +kubebuilder:webhook:path=/validate-v1-pod,mutating=false,failurePolicy=fail,groups="",resources=pods,verbs=create;update,versions=v1,name=vpod.kb.io
```

The example above generates the following:
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: "vpod.kb.io"
webhooks:
  - name: "vpod.kb.io"
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
        scope: "Namespaced"
    clientConfig:
      service:
        name: vpod
        path: /validate-v1-pod
        port: 8443
      caBundle: "Ci0tLS0tQk...tLS0K"
    failurePolicy: Fail
```
**Advantages:**

- Supports both mutating and validating admission control.
- Use of a high-level programming language allows huge flexibility and extensibility.
- Abstracts underlying request handling and message parsing with convenience interfaces and methods.
- Can have more tightly-scoped RBAC privileges.
**Disadvantages:**
- Requires Go programming knowledge.
- Requires knowledge of some Kubernetes API server behaviors.
- Admission logic is contained in code rather than being readable in CRDs or ConfigMaps.
### Agnostic HTTP Handler
An alternative way to build admission controllers (mutating and validating) is to implement the HTTP webhook endpoint from scratch, in any language. The examples below use Go, but any language capable of TLS-enabled HTTP handling and JSON parsing will work.

This approach provides the most flexibility to integrate with the stacks already in use, but comes at the cost of high-level abstractions (although languages with mature Kubernetes client libraries can alleviate this).

As described above, admission webhooks receive HTTPS requests from the API server and return responses to it. The schema of these messages is well known, so it's possible to receive the request and modify the object (via patches) manually.
The example below is in Go. Whereas `controller-runtime` provides a patch-generation helper, in plain Go the `JSONPatch` must be generated and serialized (to JSON) manually:
```go
// patchOperation models a single JSONPatch operation; hand-rolling this type
// is common when not using a library.
type patchOperation struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

patch := []patchOperation{
	{
		Op:    "add",
		Path:  "/spec/volumes/0",
		Value: volume,
	},
}
patchBytes, err := json.Marshal(patch)
```
Finally, there is no convenience wrapper around the response, so it too has to be created manually, with the relevant patches and status, before being returned:
```go
return &v1beta1.AdmissionResponse{
	Allowed: true,
	Patch:   patchBytes,
	PatchType: func() *v1beta1.PatchType {
		pt := v1beta1.PatchTypeJSONPatch
		return &pt
	}(),
}
```
In addition to the extra work in the logic of the webhook, a larger amount of supporting code is required to handle errors, graceful shutdown, HTTP headers, etc.
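A rough sketch of that supporting scaffolding (handler path, port, and certificate locations are assumptions):

```go
package main

import (
	"log"
	"net/http"
)

// serveMutate would parse the incoming AdmissionReview, build the patch as
// shown above, and write the JSON response. Omitted for brevity in this sketch.
func serveMutate(w http.ResponseWriter, r *http.Request) {}

func main() {
	http.HandleFunc("/mutate", serveMutate) // hypothetical path
	// The API server only calls webhooks over TLS, so serve with the
	// certificate and key mounted into the container (paths are assumptions).
	log.Fatal(http.ListenAndServeTLS(":8443", "/certs/tls.crt", "/certs/tls.key", nil))
}
```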
**Advantages:**
- Maximum flexibility for different languages / stacks.
- Supports both mutating and validating hooks.
**Disadvantages:**
- Almost all functionality must be written from scratch.
- More complex to maintain.