Kubernetes Pod Security Policies and OpenShift Security Context Constraints


The motivation for PSP is to be able to restrict or permit the creation and update of Pods within a Cluster or a given Namespace, based on the security-sensitive settings in the Pod spec.

Whether a creation request is allowed is governed by the admission controllers enabled on the kube-apiserver. These are listed in the --enable-admission-plugins flag in the kube-apiserver manifest file, and PodSecurityPolicy must be among them for PSPs to be enforced.
There is also the --use-service-account-credentials flag in the kube-controller-manager manifest, which makes each controller run with its own separate service account (SA).
This is why there are multiple controller SAs (for example replicaset-controller) in the kube-system Namespace.
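
As a rough illustration, the fragments below show where the --enable-admission-plugins and --use-service-account-credentials flags would sit in static Pod manifests. The kubeadm-style file path and the NodeRestriction plugin listed alongside PodSecurityPolicy are assumptions, and all other fields and flags are omitted.

```yaml
# Fragment of the kube-apiserver static Pod manifest
# (path assumed: /etc/kubernetes/manifests/kube-apiserver.yaml).
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # PodSecurityPolicy must be in this list for PSPs to be enforced.
    - --enable-admission-plugins=NodeRestriction,PodSecurityPolicy
    # (remaining flags omitted)
---
# Fragment of the kube-controller-manager static Pod manifest.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    # Each controller authenticates with its own service account.
    - --use-service-account-credentials=true
    # (remaining flags omitted)
```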

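Let's take the use case of allowing a Deployment to use hostNetwork. Deployments create ReplicaSets, and the ReplicaSet controller in turn creates the Pods, so the PSP check happens when the Pod is created, not when the Deployment is.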

Access is controlled by creating a PSP YAML definition file, in which attributes such as hostNetwork and hostIPC are set to true or false depending on the requirement.
We also specify runAsUser, fsGroup, etc., which dictate the user and group IDs the containers may run with.
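
A minimal sketch of such a PSP for the hostNetwork use case might look like the following. The policy name, the port range, the group ID range and the volume list are illustrative assumptions, and policy/v1beta1 is assumed to be the API version serving PodSecurityPolicy on the cluster.

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: hostnetwork-psp        # hypothetical name
spec:
  privileged: false
  hostNetwork: true            # the one host feature this policy opens up
  hostIPC: false
  hostPID: false
  hostPorts:                   # with hostNetwork, container ports are host ports
  - min: 8000                  # assumed range
    max: 8080
  runAsUser:
    rule: MustRunAsNonRoot
  fsGroup:
    rule: MustRunAs
    ranges:
    - min: 1                   # assumed range
      max: 65535
  supplementalGroups:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  volumes:                     # assumed list of allowed volume types
  - configMap
  - secret
  - emptyDir
```

Creating the policy has no effect on its own; it only applies to pods whose creator or service account has been granted the use verb on it.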

Using RBAC, we can then control who is allowed to use the PSPs that have been created.
This means creating a Role or a ClusterRole and binding it to subjects with a RoleBinding or a ClusterRoleBinding.

The role defines rules stating the type of resource (podsecuritypolicies), the name of the resource (the PSP name) and the allowed verb (use).
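
As a sketch, a ClusterRole granting use of one specific policy could be written as below. Both the role name and the PSP name hostnetwork-psp are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: use-hostnetwork-psp          # hypothetical name
rules:
- apiGroups: ["policy"]
  resources: ["podsecuritypolicies"]
  resourceNames: ["hostnetwork-psp"] # hypothetical PSP name
  verbs: ["use"]
```

Leaving out resourceNames would grant use of every PSP in the cluster, which is rarely what is wanted.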

Think of ClusterRoleBindings (cluster-wide) and RoleBindings (namespace-scoped) as the criteria that determine which PSPs a given subject can resolve to.

Suppose a controller (such as the replicaset-controller) wants to create a pod in a given Namespace, and the only PSP it can use through a ClusterRoleBinding is very restrictive. A pod that asks for more than that policy allows would be denied on that basis alone.
The next step is to check whether the controller has been granted anything further in that specific namespace. A RoleBinding there can resolve to a more permissive policy, in which case the pod is allowed.
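
The two levels could be sketched as follows. Every name here is hypothetical: the ClusterRoles use-restricted-psp and use-hostnetwork-psp are assumed to already exist and to grant the use verb on a restrictive and a permissive PSP respectively, and demo is an assumed Namespace.

```yaml
# Cluster-wide: every service account may use the restrictive policy.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: restricted-psp-everywhere
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: use-restricted-psp
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:serviceaccounts
---
# Namespace-scoped: in "demo" only, the ReplicaSet controller may also
# use the permissive policy.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: hostnetwork-psp-for-replicasets
  namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: use-hostnetwork-psp
subjects:
- kind: ServiceAccount
  name: replicaset-controller
  namespace: kube-system
```

Because the second binding is a RoleBinding, the permissive policy is only usable for pods created in the demo Namespace.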

One level deeper is the ability to name the specific service accounts that are allowed to use a policy.
Any deployment that uses that SA will then be able to create the pod, based on a RoleBinding created for the SA.
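
A sketch of the service account variant is shown below. The Namespace demo, the SA name hostnet-sa, the ClusterRole use-hostnetwork-psp and the container image are all hypothetical placeholders.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hostnet-sa
  namespace: demo
---
# Lets pods running as hostnet-sa use the permissive PSP in "demo".
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: hostnetwork-psp-for-hostnet-sa
  namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: use-hostnetwork-psp
subjects:
- kind: ServiceAccount
  name: hostnet-sa
  namespace: demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hostnet-demo
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hostnet-demo
  template:
    metadata:
      labels:
        app: hostnet-demo
    spec:
      serviceAccountName: hostnet-sa   # ties the pods to the binding above
      hostNetwork: true
      containers:
      - name: web
        image: registry.example.com/demo/web:1.0   # placeholder image
        ports:
        - containerPort: 8080
```

Granting the policy to the pod's own SA rather than to a controller keeps the permission scoped to the workloads that actually need it.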

OpenShift Security Context Constraints:
SCCs dictate what actions a pod can perform and which resources it can access. By default they also prevent pods from running as root.
These are the default SCCs available on the OpenShift version in use:
anyuid
hostaccess
hostmount-anyuid
hostnetwork
node-exporter
nonroot
privileged
restricted
By default, pods are admitted under the restricted SCC.
So when we try to create a container that runs as root, we get a warning stating that the container image runs as the root user, which might not be permitted by your cluster administrator.
To make the container work, we could let the pod use the "anyuid" SCC instead of "restricted", for example by granting anyuid to the service account the pod runs as.
Or
We could change the default "restricted" SCC in the YAML view by changing
runAsUser:
  type: MustRunAsRange
to
runAsUser:
  type: RunAsAny
Changing the 8 default SCCs is not usually recommended. Creating a new SCC, however, is as easy as exporting one of the default SCCs to YAML and modifying it to fit our purpose.
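
As a sketch, a custom SCC derived from restricted with only the runAsUser strategy relaxed might look like this. The SCC name is hypothetical, the remaining values are taken from the restricted column of the comparison in this document, and the exact set of fields exported by a given OpenShift version may differ.

```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: restricted-runasany      # hypothetical name
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: false
allowedCapabilities: null
defaultAddCapabilities: null
fsGroup:
  type: MustRunAs
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
- KILL
- MKNOD
- SETUID
- SETGID
runAsUser:
  type: RunAsAny                 # the only change relative to restricted
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
users: []                        # grant access separately
groups: []
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
```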
An SCC can also be used with RBAC: the Role definition lists the SCC name under resourceNames, with securitycontextconstraints as the resource and use as the verb.
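
A hedged sketch of that RBAC wiring follows. The Namespace demo, the SA name app-sa and the object names are hypothetical; anyuid is one of the default SCCs listed above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: use-anyuid-scc
  namespace: demo
rules:
- apiGroups: ["security.openshift.io"]
  resources: ["securitycontextconstraints"]
  resourceNames: ["anyuid"]      # the SCC being granted
  verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-sa-anyuid
  namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: use-anyuid-scc
subjects:
- kind: ServiceAccount
  name: app-sa
  namespace: demo
```

Pods in demo that set serviceAccountName to app-sa can then be admitted under anyuid without the SCC object itself being edited.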
***************KUBERNETES**********************************
Control aspect                                      PSP field names
Running of privileged containers                    privileged
Usage of host namespaces                            hostPID, hostIPC
Usage of host networking and ports                  hostNetwork, hostPorts
Usage of volume types                               volumes
Usage of the host filesystem                        allowedHostPaths
Allow specific FlexVolume drivers                   allowedFlexVolumes
Allocating an FSGroup that owns the pod's volumes   fsGroup
Requiring the use of a read only root file system   readOnlyRootFilesystem
The user and group IDs of the container             runAsUser, runAsGroup, supplementalGroups
Restricting escalation to root privileges           allowPrivilegeEscalation, defaultAllowPrivilegeEscalation
Linux capabilities                                  defaultAddCapabilities, requiredDropCapabilities, allowedCapabilities
The SELinux context of the container                seLinux
The Allowed Proc Mount types for the container      allowedProcMountTypes
The AppArmor profile used by containers             annotations
The seccomp profile used by containers              annotations
The sysctl profile used by containers               forbiddenSysctls, allowedUnsafeSysctls


SCC descriptions:
privileged: allows access to all privileged and host features and the ability to run as any user, any group, any fsGroup, and with any SELinux context. WARNING: this is the most relaxed SCC and should be used only for cluster administration. Grant with caution.
restricted: denies access to all host features and requires pods to be run with a UID and SELinux context that are allocated to the namespace. This is the most restrictive SCC and it is used by default for authenticated users.
anyuid: provides all features of the restricted SCC but allows users to run with any UID and any GID.
hostnetwork: allows using host networking and host ports but still requires pods to be run with a UID and SELinux context that are allocated to the namespace.
hostaccess: allows access to all host namespaces but still requires pods to be run with a UID and SELinux context that are allocated to the namespace. WARNING: this SCC allows host access to namespaces, file systems, and PIDs. It should only be used by trusted pods. Grant with caution.
nonroot: provides all features of the restricted SCC but allows users to run with any non-root UID. The user must specify the UID or it must be specified in the manifest of the container runtime.
node-exporter: this SCC is used for the Prometheus node exporter.
hostmount-anyuid: provides all the features of the restricted SCC but allows host mounts and any UID by a pod. This is primarily used by the persistent volume recycler. WARNING: this SCC allows host file system access as any UID, including UID 0. Grant with caution.

Each entry below gives the SCC attribute, its description, the corresponding PSP attribute, and its value in each SCC. To mirror an SCC, a new PSP needs to be created with the attribute values listed for it.

Allow Host Dir Volume Plugin
  Description: determines if the policy allows containers to use the HostDir volume plugin
  Corresponding PSP attribute: allowedHostPaths
  privileged:        TRUE
  restricted:        FALSE
  anyuid:            FALSE
  hostnetwork:       FALSE
  hostaccess:        TRUE
  nonroot:           FALSE
  node-exporter:     TRUE
  hostmount-anyuid:  TRUE

Allow Host IPC
  Description: determines if the policy allows host IPC in the containers
  Corresponding PSP attribute: hostIPC
  privileged:        TRUE
  restricted:        FALSE
  anyuid:            FALSE
  hostnetwork:       FALSE
  hostaccess:        TRUE
  nonroot:           FALSE
  node-exporter:     FALSE
  hostmount-anyuid:  FALSE

Allow Host Network
  Description: determines if the policy allows the use of HostNetwork in the pod spec
  Corresponding PSP attribute: hostNetwork
  privileged:        TRUE
  restricted:        FALSE
  anyuid:            FALSE
  hostnetwork:       TRUE
  hostaccess:        TRUE
  nonroot:           FALSE
  node-exporter:     TRUE
  hostmount-anyuid:  FALSE

Allow Host PID
  Description: determines if the policy allows host PID in the containers
  Corresponding PSP attribute: hostPID
  privileged:        TRUE
  restricted:        FALSE
  anyuid:            FALSE
  hostnetwork:       TRUE
  hostaccess:        TRUE
  nonroot:           FALSE
  node-exporter:     TRUE
  hostmount-anyuid:  FALSE

Allow Host Ports
  Description: determines if the policy allows host ports in the containers
  Corresponding PSP attribute: hostPorts/HostPortRange
  privileged:        TRUE
  restricted:        FALSE
  anyuid:            FALSE
  hostnetwork:       FALSE
  hostaccess:        TRUE
  nonroot:           FALSE
  node-exporter:     TRUE
  hostmount-anyuid:  FALSE

Allow Privilege Escalation
  Description: controls the default setting for whether a process can gain more privileges than its parent process
  Corresponding PSP attribute: allowPrivilegeEscalation/defaultAllowPrivilegeEscalation
  privileged:        TRUE
  restricted:        TRUE
  anyuid:            TRUE
  hostnetwork:       TRUE
  hostaccess:        TRUE
  nonroot:           TRUE
  node-exporter:     TRUE
  hostmount-anyuid:  TRUE

Allow Privileged Container
  Description: determines if a container can request to be run as privileged
  Corresponding PSP attribute: privileged
  privileged:        TRUE
  restricted:        FALSE
  anyuid:            FALSE
  hostnetwork:       FALSE
  hostaccess:        FALSE
  nonroot:           FALSE
  node-exporter:     TRUE
  hostmount-anyuid:  FALSE

Allowed Capabilities
  Description: a list of capabilities that can be requested to add to the container
  Corresponding PSP attribute: allowedCapabilities
  privileged:        *

Allowed Unsafe Sysctls
  Description: a list of explicitly allowed unsafe sysctls, defaults to none. Each entry is either a plain sysctl name or ends in "*", in which case it is considered a prefix of allowed sysctls. A single * means all unsafe sysctls are allowed. Kubelet has to whitelist all allowed unsafe sysctls explicitly to avoid rejection.
  Corresponding PSP attribute: allowedUnsafeSysctls
  privileged:        *

Default Add Capabilities
  Description: default set of capabilities that will be added to the container unless the pod spec specifically drops the capability
  Corresponding PSP attribute: defaultAddCapabilities

Fs Group.Type
  Description: the strategy that will dictate what fs group is used by the SecurityContext
  Corresponding PSP attribute: fsGroup
  privileged:        RunAsAny
  restricted:        MustRunAs
  anyuid:            RunAsAny
  hostnetwork:       MustRunAs
  hostaccess:        MustRunAs
  nonroot:           RunAsAny
  node-exporter:     RunAsAny
  hostmount-anyuid:  RunAsAny

Groups
  Description: groups that have permission to use this SCC
  Corresponding PSP attribute: no direct PSP attribute. As with Users, these need to be created and added separately.
  privileged:        system:cluster-admins system:nodes system:masters
  restricted:        system:authenticated
  anyuid:            system:cluster-admins

Read Only Root Filesystem
  Description: when set to TRUE, containers are forced to run with a read only root file system, so if a container requests a writable root file system the SCC will deny running that pod. When set to FALSE, a container is allowed to run with a read only root file system but is not forced to.
  Corresponding PSP attribute: readOnlyRootFilesystem
  privileged:        FALSE
  restricted:        FALSE
  anyuid:            FALSE
  hostnetwork:       FALSE
  hostaccess:        FALSE
  nonroot:           FALSE
  node-exporter:     FALSE
  hostmount-anyuid:  FALSE

Required Drop Capabilities
  Description: capabilities that will be dropped from the container
  Corresponding PSP attribute: requiredDropCapabilities
  restricted:        KILL MKNOD SETUID SETGID
  anyuid:            MKNOD
  hostnetwork:       KILL MKNOD SETUID SETGID
  hostaccess:        KILL MKNOD SETUID SETGID
  nonroot:           KILL MKNOD SETUID SETGID
  hostmount-anyuid:  MKNOD

Run As User.Type
  Description: the strategy that will dictate what RunAsUser is used in the SecurityContext
  Corresponding PSP attribute: runAsUser
  privileged:        RunAsAny
  restricted:        MustRunAsRange
  anyuid:            RunAsAny
  hostnetwork:       MustRunAsRange
  hostaccess:        MustRunAsRange
  nonroot:           MustRunAsNonRoot
  node-exporter:     RunAsAny
  hostmount-anyuid:  RunAsAny

Se Linux Context.Type
  Description: the strategy that will dictate what labels will be set in the SecurityContext
  Corresponding PSP attribute: seLinux
  privileged:        RunAsAny
  restricted:        MustRunAs
  anyuid:            MustRunAs
  hostnetwork:       MustRunAs
  hostaccess:        MustRunAs
  nonroot:           MustRunAs
  node-exporter:     RunAsAny
  hostmount-anyuid:  MustRunAs

Seccomp Profiles
  Description: lists the allowed profiles that may be set for the pod or container's seccomp annotations. A nil or empty value means that no profiles may be specified by the pod or container. '*' allows all profiles. When used to generate a value for a pod, the first non-wildcard profile is used as the default.
  Corresponding PSP attribute: seccompProfile
  privileged:        *

Supplemental Groups.Type
  Description: the strategy that will dictate what supplemental groups are used by the SecurityContext
  Corresponding PSP attribute: supplementalGroups
  privileged:        RunAsAny
  restricted:        RunAsAny
  anyuid:            RunAsAny
  hostnetwork:       MustRunAs
  hostaccess:        RunAsAny
  nonroot:           RunAsAny
  node-exporter:     RunAsAny
  hostmount-anyuid:  RunAsAny

Users
  Description: users who have permission to use this SCC
  Corresponding PSP attribute: no direct PSP attribute. It has to be a ServiceAccount, which needs to be created and added.
  privileged:        system:admin system:serviceaccount:openshift-infra:build-controller
  hostmount-anyuid:  system:serviceaccount:openshift-infra:pv-recycler-controller

Volumes
  Description: a whitelist of allowed volume plugins. Here FSType corresponds to the type of volume.
  Corresponding PSP attribute: volumes
  privileged:        *
  restricted:        configMap downwardAPI emptyDir persistentVolumeClaim projected secret
  anyuid:            configMap downwardAPI emptyDir persistentVolumeClaim projected secret
  hostnetwork:       configMap downwardAPI emptyDir persistentVolumeClaim projected secret
  hostaccess:        configMap downwardAPI emptyDir persistentVolumeClaim projected secret
  nonroot:           configMap downwardAPI emptyDir persistentVolumeClaim projected secret
  node-exporter:     *
  hostmount-anyuid:  configMap downwardAPI emptyDir hostPath nfs persistentVolumeClaim projected secret
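
To show how the values above could be carried over, here is a rough sketch of a PSP approximating the restricted SCC. It is an approximation under stated assumptions, not an exact equivalent: the policy name is hypothetical, PSP has no MustRunAsRange rule so MustRunAs with an assumed ID range is used instead, and the fsGroup range and SELinux level are placeholders, since an SCC takes these from the namespace while a PSP needs them spelled out.

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-like          # hypothetical name
spec:
  privileged: false              # Allow Privileged Container: FALSE
  hostIPC: false                 # Allow Host IPC: FALSE
  hostNetwork: false             # Allow Host Network: FALSE
  hostPID: false                 # Allow Host PID: FALSE
  # hostPorts left unset: no host ports are allowed
  allowPrivilegeEscalation: true
  readOnlyRootFilesystem: false
  requiredDropCapabilities:
  - KILL
  - MKNOD
  - SETUID
  - SETGID
  runAsUser:
    rule: MustRunAs              # stands in for MustRunAsRange
    ranges:
    - min: 1000000000            # assumed UID range
      max: 1000009999
  fsGroup:
    rule: MustRunAs
    ranges:
    - min: 1000000000            # assumed GID range
      max: 1000009999
  seLinux:
    rule: MustRunAs
    seLinuxOptions:
      level: "s0:c123,c456"      # assumed SELinux level
  supplementalGroups:
    rule: RunAsAny
  volumes:                       # hostPath omitted, so host dirs stay blocked
  - configMap
  - downwardAPI
  - emptyDir
  - persistentVolumeClaim
  - projected
  - secret
```

The Users and Groups rows have no counterpart in the PSP itself; they would be expressed as RBAC bindings that grant the use verb on this policy.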