Kubernetes Pod Security Policies and OpenShift Security Context Constraints
The motivation for PSPs is to be able to restrict or permit the creation of pods within a cluster or a given Namespace, based on the security-sensitive settings those pods request.
Whether any creation is allowed is governed by the admission controller settings on the kube-apiserver. This can be seen in the --enable-admission-plugins flag in the kube-apiserver manifest file.
There is also a flag called --use-service-account-credentials in the kube-controller-manager manifest, which dictates the use of separate service accounts for the various controllers.
There are multiple SAs in the kube-system Namespace.
Let's take the use case of allowing a Deployment to use hostNetwork. Deployments create ReplicaSets, and the ReplicaSets in turn create the pods.
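For reference, a minimal sketch of such a Deployment is shown here. The names (hostnet-demo, the demo namespace) and the nginx image are hypothetical and only illustrate where hostNetwork sits in the pod template.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hostnet-demo        # hypothetical name
  namespace: demo           # hypothetical namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hostnet-demo
  template:
    metadata:
      labels:
        app: hostnet-demo
    spec:
      hostNetwork: true     # the setting a PSP has to permit
      containers:
        - name: web
          image: nginx      # illustrative image only
```

The pods are created by the ReplicaSet on behalf of this Deployment, so the policy check applies at the point where those pods are admitted.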
Access is controlled by creating a PSP YAML definition file, which can have multiple attributes (like hostNetwork, hostIPC, etc.) set to true or false depending on the requirement.
We also specify runAsUser, fsGroup, etc., which dictate which user and group IDs the pod may run with.
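A minimal sketch of such a PSP definition could look like the following. It assumes the policy/v1beta1 API (the last API version that served PSPs); the policy name, the port range, and the volume list are hypothetical values chosen for illustration.

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: hostnetwork-psp     # hypothetical name
spec:
  privileged: false
  hostNetwork: true         # permit pods that request the host network
  hostIPC: false
  hostPID: false
  hostPorts:                # illustrative range, adjust to your needs
    - min: 8000
      max: 8080
  runAsUser:
    rule: RunAsAny          # which UIDs the containers may run as
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny          # which group may own the pod's volumes
  volumes:                  # volume types the pod may use
    - configMap
    - secret
    - emptyDir
```

The runAsUser, seLinux, supplementalGroups, and fsGroup strategies are required fields in a PSP spec, which is why they appear even when set to RunAsAny.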
Now, using RBAC, we can control who is allowed to use the PSPs we have created.
RBAC here means creating a Role or a ClusterRole and then binding it with a RoleBinding or a ClusterRoleBinding.
The role defines the rules: the type of resource, the name of the resource, and the allowed verb (like use).
Think of the ClusterRoleBindings (global/cluster-wide) and RoleBindings (namespace-scoped) as the criteria which determine whether a PSP can be resolved for a given request.
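As a sketch of the cluster-wide case, the following ClusterRole grants the use verb on one named PSP, and the ClusterRoleBinding gives it to every authenticated user. The object names and the restrictive-psp policy name are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp-restrictive-user          # hypothetical name
rules:
  - apiGroups: ["policy"]
    resources: ["podsecuritypolicies"]
    resourceNames: ["restrictive-psp"] # hypothetical PSP name
    verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp-restrictive-all-authenticated
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp-restrictive-user
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:authenticated        # applies cluster-wide
```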
So if a specific controller (like the replicaset-controller) wants to create a pod in a given Namespace, and we have assigned a very restrictive PSP (say, through a ClusterRoleBinding), the pod would probably be denied under that policy.
The next step is to check whether the controller can create the pod in that particular namespace. This is where the RoleBindings come into effect: they resolve to the more permissive policy, and the pod is allowed.
One level deeper is the ability to name the specific service accounts which are allowed these actions.
Any particular deployment which uses that specific SA will then be able to create the pod, based on a new RoleBinding for the SA.
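A sketch of the namespace-scoped, service-account case is shown here. The demo namespace, the hostnetwork-sa service account, and the hostnetwork-psp policy name are all hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: psp-hostnetwork-user          # hypothetical name
  namespace: demo                     # hypothetical namespace
rules:
  - apiGroups: ["policy"]
    resources: ["podsecuritypolicies"]
    resourceNames: ["hostnetwork-psp"] # hypothetical permissive PSP
    verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: psp-hostnetwork-sa
  namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: psp-hostnetwork-user
subjects:
  - kind: ServiceAccount
    name: hostnetwork-sa              # hypothetical service account
    namespace: demo
```

A deployment would then set serviceAccountName to this service account in its pod template so that its pods are checked against the permissive policy.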
OpenShift Security Context Constraints:
SCCs dictate what actions a pod can perform and which resources it can access. By default they also prevent the creation of new pods that run as root.
These are the default SCCs (the exact set depends on the OpenShift version):
anyuid
hostaccess
hostmount-anyuid
hostnetwork
node-exporter
nonroot
privileged
restricted
By default containers are assigned the restricted SCC.
So when we try to create a container that runs as root, we get a warning stating that the container image runs as a root user, which might not be permitted by your cluster administrator.
To make the container work, we could have the pod admitted under the "anyuid" SCC instead of "restricted", typically by granting anyuid to the service account the pod runs as.
Or
We could make changes to the default "restricted" SCC in YAML mode by changing

    runAsUser:
      type: MustRunAsRange

to

    runAsUser:
      type: RunAsAny
It is not usually recommended to change the 8 default SCCs. However, creating a new SCC is as easy as exporting the YAML of one of the default SCCs and then modifying it to fit our purpose.
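A sketch of what such a modified copy might look like is shown here: it starts from restricted-style values and relaxes only runAsUser. The SCC name is hypothetical, and the remaining values are illustrative; an actual export from your cluster is the authoritative starting point.

```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: restricted-anyuid-custom   # hypothetical name
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: false
readOnlyRootFilesystem: false
requiredDropCapabilities:          # illustrative values
  - KILL
  - MKNOD
  - SETUID
  - SETGID
runAsUser:
  type: RunAsAny                   # the only relaxation in this sketch
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
users: []                          # grant access through RBAC or by adding users here
groups: []
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - persistentVolumeClaim
  - projected
  - secret
```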
An SCC can also be used with RBAC: the Role definition file lists securitycontextconstraints as the resource, the SCC name as the resource name, and use as the verb.
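A minimal sketch of such a Role follows. The role name and the demo namespace are hypothetical; anyuid is one of the default SCCs listed above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: use-anyuid-scc             # hypothetical name
  namespace: demo                  # hypothetical namespace
rules:
  - apiGroups: ["security.openshift.io"]
    resources: ["securitycontextconstraints"]
    resourceNames: ["anyuid"]      # the SCC name goes here
    verbs: ["use"]
```

Binding this Role to a service account with a RoleBinding lets pods running as that service account be admitted under the named SCC.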
***************KUBERNETES**********************************
Control Aspect | PSP Field Names |
Running of privileged containers | privileged |
Usage of host namespaces | hostPID, hostIPC |
Usage of host networking and ports | hostNetwork, hostPorts |
Usage of volume types | volumes |
Usage of the host filesystem | allowedHostPaths |
Allow specific FlexVolume drivers | allowedFlexVolumes |
Allocating an FSGroup that owns the pod's volumes | fsGroup |
Requiring the use of a read only root file system | readOnlyRootFilesystem |
The user and group IDs of the container | runAsUser, runAsGroup, supplementalGroups |
Restricting escalation to root privileges | allowPrivilegeEscalation, defaultAllowPrivilegeEscalation |
Linux capabilities | defaultAddCapabilities, requiredDropCapabilities, allowedCapabilities |
The SELinux context of the container | seLinux |
The Allowed Proc Mount types for the container | allowedProcMountTypes |
The AppArmor profile used by containers | annotations |
The seccomp profile used by containers | annotations |
The sysctl profile used by containers | forbiddenSysctls, allowedUnsafeSysctls |
***************OPENSHIFT SCCs**********************************
privileged: allows access to all privileged and host features and the ability to run as any user, any group, any fsGroup, and with any SELinux context. WARNING: this is the most relaxed SCC and should be used only for cluster administration. Grant with caution.
restricted: denies access to all host features and requires pods to be run with a UID and SELinux context that are allocated to the namespace. This is the most restrictive SCC and it is used by default for authenticated users.
anyuid: provides all features of the restricted SCC but allows users to run with any UID and any GID.
hostnetwork: allows using host networking and host ports but still requires pods to be run with a UID and SELinux context that are allocated to the namespace.
hostaccess: allows access to all host namespaces but still requires pods to be run with a UID and SELinux context that are allocated to the namespace. WARNING: this SCC allows host access to namespaces, file systems, and PIDs. It should only be used by trusted pods. Grant with caution.
nonroot: provides all features of the restricted SCC but allows users to run with any non-root UID. The user must specify the UID or it must be specified in the manifest of the container runtime.
node-exporter: used for the Prometheus node exporter.
hostmount-anyuid: provides all the features of the restricted SCC but allows host mounts and any UID by a pod. This is primarily used by the persistent volume recycler. WARNING: this SCC allows host file system access as any UID, including UID 0. Grant with caution.
To mirror an SCC in Kubernetes, a new PSP needs to be created with the attribute values below.
SCC Attribute | Description of Attribute | Corresponding PSP Attribute | privileged | restricted | anyuid | hostnetwork | hostaccess | nonroot | node-exporter | hostmount-anyuid |
Allow Host Dir Volume Plugin | determines if the policy allows containers to use the HostDir volume plugin | allowedHostPaths | TRUE | FALSE | FALSE | FALSE | TRUE | FALSE | TRUE | TRUE |
Allow Host IPC | determines if the policy allows host IPC in the containers | hostIPC | TRUE | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE | FALSE |
Allow Host Network | determines if the policy allows the use of HostNetwork in the pod spec | hostNetwork | TRUE | FALSE | FALSE | TRUE | TRUE | FALSE | TRUE | FALSE |
Allow Host PID | determines if the policy allows host PID in the containers | hostPID | TRUE | FALSE | FALSE | FALSE | TRUE | FALSE | TRUE | FALSE |
Allow Host Ports | determines if the policy allows host ports in the containers | hostPorts (HostPortRange) | TRUE | FALSE | FALSE | TRUE | TRUE | FALSE | TRUE | FALSE |
Allow Privilege Escalation | controls the default setting for whether a process can gain more privileges than its parent process | allowPrivilegeEscalation, defaultAllowPrivilegeEscalation | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
Allow Privileged Container | determines if a container can request to be run as privileged | privileged | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |
Allowed Capabilities | a list of capabilities that can be requested to add to the container | allowedCapabilities | * | | | | | | | |
Allowed Unsafe Sysctls | a list of explicitly allowed unsafe sysctls, defaults to none. Each entry is either a plain sysctl name or ends in "*", in which case it is considered a prefix of allowed sysctls. A single * means all unsafe sysctls are allowed. The kubelet has to whitelist all allowed unsafe sysctls explicitly to avoid rejection. | allowedUnsafeSysctls | * | | | | | | | |
Default Add Capabilities | default set of capabilities that will be added to the container unless the pod spec specifically drops the capability | defaultAddCapabilities | | | | | | | | |
Fs Group.Type | the strategy that will dictate what fs group is used by the SecurityContext | fsGroup | RunAsAny | MustRunAs | RunAsAny | MustRunAs | MustRunAs | RunAsAny | RunAsAny | RunAsAny |
Groups | groups that have permission to use this SCC | Like Users, this needs to be created | system:cluster-admins system:nodes system:masters | system:authenticated | system:cluster-admins | | | | | |
Read Only Root Filesystem | when set to TRUE, forces containers to run with a read-only root file system; if a container requests a writable root file system, the SCC will deny running that Pod. If set to FALSE, the container is allowed to run with a read-only root file system but won't be forced to. | readOnlyRootFilesystem | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE |
Required Drop Capabilities | capabilities that will be dropped from the container | requiredDropCapabilities | | KILL MKNOD SETUID SETGID | MKNOD | KILL MKNOD SETUID SETGID | KILL MKNOD SETUID SETGID | KILL MKNOD SETUID SETGID | | MKNOD |
Run As User.Type | the strategy that will dictate what RunAsUser is used in the SecurityContext | runAsUser | RunAsAny | MustRunAsRange | RunAsAny | MustRunAsRange | MustRunAsRange | MustRunAsNonRoot | RunAsAny | RunAsAny |
Se Linux Context.Type | the strategy that will dictate what labels will be set in the SecurityContext | seLinux | RunAsAny | MustRunAs | MustRunAs | MustRunAs | MustRunAs | MustRunAs | RunAsAny | MustRunAs |
Seccomp Profiles | lists the allowed profiles that may be set for the pod or container's seccomp annotations. A nil or empty value means that no profiles may be specified by the pod or container. '*' allows all profiles. When used to generate a value for a pod, the first non-wildcard profile is used as the default. | seccompProfile | * | | | | | | | |
Supplemental Groups.Type | the strategy that will dictate what supplemental groups are used by the SecurityContext | supplementalGroups | RunAsAny | RunAsAny | RunAsAny | MustRunAs | RunAsAny | RunAsAny | RunAsAny | RunAsAny |
Users | users who have permission to use this SCC | A ServiceAccount needs to be created and added | system:admin system:serviceaccount:openshift-infra:build-controller | | | | | | | system:serviceaccount:openshift-infra:pv-recycler-controller |
Volumes | a white list of allowed volume plugins; here FSType corresponds to the type of volume | volumes | * | configMap downwardAPI emptyDir persistentVolumeClaim projected secret | configMap downwardAPI emptyDir persistentVolumeClaim projected secret | configMap downwardAPI emptyDir persistentVolumeClaim projected secret | configMap downwardAPI emptyDir persistentVolumeClaim projected secret | configMap downwardAPI emptyDir persistentVolumeClaim projected secret | * | configMap downwardAPI emptyDir hostPath nfs persistentVolumeClaim projected secret |
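To make the mapping concrete, here is a rough sketch of a PSP that approximates the restricted SCC. It is not an exact equivalent: PSP has no MustRunAsRange rule, so MustRunAs with an explicit range is used instead, and the UID/GID ranges and the SELinux level are hypothetical placeholders (OpenShift allocates these per namespace). The dropped capabilities assume that restricted drops KILL, MKNOD, SETUID, and SETGID.

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-like            # hypothetical name
spec:
  privileged: false
  hostNetwork: false
  hostIPC: false
  hostPID: false
  allowPrivilegeEscalation: true
  readOnlyRootFilesystem: false
  requiredDropCapabilities:
    - KILL
    - MKNOD
    - SETUID
    - SETGID
  runAsUser:
    rule: MustRunAs                # closest PSP rule to MustRunAsRange
    ranges:
      - min: 1000000000            # placeholder range
        max: 1000009999
  seLinux:
    rule: MustRunAs
    seLinuxOptions:
      level: "s0:c1,c0"            # placeholder MCS level
  fsGroup:
    rule: MustRunAs
    ranges:
      - min: 1000000000            # placeholder range
        max: 1000009999
  supplementalGroups:
    rule: RunAsAny
  volumes:
    - configMap
    - downwardAPI
    - emptyDir
    - persistentVolumeClaim
    - projected
    - secret
```

The Users and Groups rows have no PSP field; that part is expressed through RBAC bindings that grant the use verb on this policy.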