
Kubernetes Security Best Practices: A Comprehensive Guide for Production Environments

Zak Kann
Kubernetes, Security, DevOps, Cloud Native

Key takeaways

  • 67% of organizations report Kubernetes security incidents; misconfigurations cause most breaches, which average $4.1M in cost
  • Defense-in-depth requires securing control plane, nodes, network policies, RBAC, and supply chain across multiple layers
  • Implement least-privilege RBAC with a three-tier model (cluster-wide read-only, namespace-scoped developer, namespace admin) and deny-by-default network policies
  • Pod Security Standards and admission controllers prevent 80%+ of common misconfigurations before deployment
  • Runtime security tools (Falco, Tracee) and automated scanning catch container breakouts and supply chain attacks in production

Kubernetes has revolutionized container orchestration, but its distributed architecture and extensive API surface area create unique security challenges. Unlike traditional infrastructure, Kubernetes security requires a defense-in-depth strategy spanning multiple layers: cluster infrastructure, container runtime, application code, and network communication.

This guide provides comprehensive, production-tested security practices drawn from real-world experience securing Kubernetes clusters across diverse environments.

The Kubernetes Security Landscape

The complexity of Kubernetes introduces attack vectors at multiple levels:

  • Control plane vulnerabilities: API server misconfigurations, etcd exposure, compromised controller managers
  • Node-level threats: Container breakouts, kernel exploits, unauthorized access to kubelet APIs
  • Network attacks: Pod-to-pod lateral movement, service mesh compromise, ingress vulnerabilities
  • Supply chain risks: Malicious container images, compromised dependencies, backdoored helm charts
  • Identity and access issues: Overprivileged service accounts, credential theft, RBAC misconfigurations

According to Red Hat's 2024 State of Kubernetes Security report, 67% of respondents experienced security incidents in their Kubernetes environments, with misconfigurations accounting for the majority of breaches. The average cost of a container security breach now exceeds $4.1 million when accounting for downtime, remediation, and regulatory penalties.

Architecture Security Fundamentals

Cluster Hardening Baseline

Before implementing application-level controls, ensure your cluster infrastructure follows security fundamentals:

Control Plane Security:

  • Run control plane components on dedicated nodes isolated from workloads
  • Enable TLS encryption for all control plane communication (etcd, API server, kubelet)
  • Restrict API server access to authorized networks using firewall rules or VPNs
  • Disable anonymous authentication and insecure ports
  • Enable admission controllers: PodSecurity, NodeRestriction, ResourceQuota, LimitRanger

etcd Security:

  • Encrypt etcd data at rest using encryption providers
  • Use separate TLS certificates for etcd peer and client communication
  • Restrict etcd access to control plane nodes only
  • Implement regular encrypted backups with off-cluster storage
  • Enable etcd audit logging

Node Security:

  • Minimize node OS attack surface (use minimal distributions like Bottlerocket, Flatcar, or Talos)
  • Enable kernel security modules (SELinux, AppArmor, seccomp)
  • Disable SSH access to nodes or restrict to bastion hosts
  • Implement node-level logging and monitoring
  • Use immutable infrastructure patterns with node image baking
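
Several of the items above can be spot-checked mechanically. The sketch below greps a kube-apiserver static pod manifest for a few hardening flags; the manifest path and flag spellings are assumptions based on a default kubeadm layout, so adapt them to your distribution.

```shell
#!/bin/sh
# Spot-check hardening flags in a kube-apiserver static pod manifest.
# Path and flag names assume a default kubeadm layout; adjust as needed.
MANIFEST="${MANIFEST:-/etc/kubernetes/manifests/kube-apiserver.yaml}"

check_flag() {
  # $1 = description, $2 = substring that should appear in the manifest
  if grep -q -e "$2" "$MANIFEST" 2>/dev/null; then
    echo "PASS: $1"
  else
    echo "FAIL: $1 (expected $2)"
  fi
}

check_flag "anonymous auth disabled"       '--anonymous-auth=false'
check_flag "PodSecurity admission enabled" 'PodSecurity'
check_flag "NodeRestriction enabled"       'NodeRestriction'
check_flag "etcd client TLS configured"    '--etcd-certfile='
```

Run it on each control plane node, or wire it into a nightly job; a FAIL line is a prompt to investigate, not proof of compromise.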

1. Identity and Access Management

RBAC Implementation Strategy

Role-Based Access Control is foundational but requires careful design to avoid both excessive permissions and operational bottlenecks.

Principle of Least Privilege:

Implement a three-tier RBAC model:

# Tier 1: Read-only cluster-wide access for monitoring
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-viewer
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "nodes"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets", "statefulsets"]
  verbs: ["get", "list", "watch"]
---
# Tier 2: Namespace-scoped developer access
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: production
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log", "services", "configmaps"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "update", "patch"]
- apiGroups: [""]
  resources: ["pods/exec", "pods/portforward"]
  verbs: ["create"]
  # Restrict to debugging sessions
- apiGroups: [""]
  resources: ["secrets"]
  verbs: [] # Grants nothing; RBAC is additive-only and cannot deny,
            # so this rule merely documents that secret access is withheld
---
# Tier 3: Platform team full admin scoped to specific namespaces
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: namespace-admin
  namespace: production
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
---
# Bind with groups from identity provider
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developers
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: developer
subjects:
- kind: Group
  name: engineering-team
  apiGroup: rbac.authorization.k8s.io

Service Account Security:

Service accounts are often overlooked attack vectors. Apply these practices:

# Create specific service accounts per application
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-processor
  namespace: production
automountServiceAccountToken: false # Disable automatic mounting
---
# Mount only when explicitly needed
apiVersion: v1
kind: Pod
metadata:
  name: payment-processor
  namespace: production
spec:
  serviceAccountName: payment-processor
  automountServiceAccountToken: true
  containers:
  - name: app
    image: payment-processor:v1.2.3
    # App uses K8s API to update ConfigMap
---
# Restrict service account permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: payment-processor-role
  namespace: production
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["payment-config"] # Restrict to specific resource
  verbs: ["get", "update"]

RBAC Auditing:

Regularly audit RBAC configurations to identify privilege creep:

# List all ClusterRoleBindings with cluster-admin
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name=="cluster-admin") | {name: .metadata.name, subjects: .subjects}'
 
# Find service accounts with cluster-admin
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name=="cluster-admin" and .subjects[]?.kind=="ServiceAccount")'
 
# Audit who can exec into pods
kubectl get roles,clusterroles -A -o json | \
  jq '.items[] | select(.rules[]?.resources[]? == "pods/exec")'

Identity Provider Integration

Integrate with enterprise identity providers to leverage existing access controls:

# Example: OIDC configuration for API server
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --oidc-issuer-url=https://accounts.google.com
    - --oidc-client-id=kubernetes
    - --oidc-username-claim=email
    - --oidc-groups-claim=groups
    - --oidc-username-prefix=oidc:
    - --oidc-groups-prefix=oidc:

2. Network Security and Segmentation

Defense-in-Depth Network Policies

Network policies provide microsegmentation at the pod level. Implement a default-deny strategy:

# Step 1: Deny all traffic in namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Step 2: Allow specific ingress to frontend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
      tier: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
      podSelector:
        matchLabels:
          app: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080
---
# Step 3: Allow frontend to backend communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 8080
  # Allow DNS resolution
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
---
# Step 4: Restrict backend to database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432
  # Allow DNS
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
  # Allow HTTPS to external APIs only
  # (a namespaceSelector matches in-cluster pods, not external IPs;
  # use an ipBlock that excludes internal ranges instead)
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 10.0.0.0/8
        - 172.16.0.0/12
        - 192.168.0.0/16
    ports:
    - protocol: TCP
      port: 443

Network Policy Testing:

Validate network policies before deploying to production:

# Use network policy editor visualization
kubectl get networkpolicies -A -o yaml > policies.yaml
 
# Test connectivity with ephemeral containers
kubectl debug -it pod/frontend-xyz --image=nicolaka/netshoot -- /bin/bash
# Inside container: test connections
curl backend-service:8080  # Should work
curl database-service:5432  # Should fail

Service Mesh Security

For advanced network security, implement a service mesh (Istio, Linkerd):

# Istio PeerAuthentication: Require mTLS
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
---
# AuthorizationPolicy: Allow only specific service-to-service calls
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: backend-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/production/sa/frontend"]
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/*"]

3. Workload Security and Isolation

Pod Security Standards

Kubernetes replaced Pod Security Policies with Pod Security Standards, which define three security profiles of increasing strictness (privileged, baseline, and restricted):

# Enforce at namespace level
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest

Restricted Profile Requirements:

Implement security contexts meeting the restricted profile:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      # Run as non-root user
      securityContext:
        runAsNonRoot: true
        runAsUser: 10000
        runAsGroup: 10000
        fsGroup: 10000
        seccompProfile:
          type: RuntimeDefault
        supplementalGroups: [10000]
      containers:
      - name: app
        image: secure-app:v1.2.3
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 10000
          capabilities:
            drop:
            - ALL
        # Use volume mounts for writable paths
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
        resources:
          limits:
            memory: "512Mi"
            cpu: "500m"
          requests:
            memory: "256Mi"
            cpu: "250m"
      volumes:
      - name: tmp
        emptyDir: {}
      - name: cache
        emptyDir: {}

Runtime Security with seccomp and AppArmor

Restrict syscalls and capabilities at the kernel level:

# Create seccomp profile
apiVersion: v1
kind: ConfigMap
metadata:
  name: seccomp-profile
  namespace: production
data:
  profile.json: |
    {
      "defaultAction": "SCMP_ACT_ERRNO",
      "architectures": ["SCMP_ARCH_X86_64"],
      "syscalls": [
        {
          "names": [
            "accept4", "access", "arch_prctl", "bind", "brk",
            "close", "connect", "dup", "dup2", "epoll_create1",
            "epoll_ctl", "epoll_wait", "exit_group", "fcntl",
            "fstat", "futex", "getcwd", "getpid", "getpeername",
            "getsockname", "getsockopt", "listen", "mmap",
            "mprotect", "munmap", "open", "openat", "poll",
            "read", "readv", "recvfrom", "recvmsg", "rt_sigaction",
            "rt_sigprocmask", "sendmsg", "sendto", "setsockopt",
            "shutdown", "socket", "stat", "write", "writev"
          ],
          "action": "SCMP_ACT_ALLOW"
        }
      ]
    }
---
# Apply to pod
apiVersion: v1
kind: Pod
metadata:
  name: restricted-app
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      # Path is relative to the kubelet seccomp directory
      # (/var/lib/kubelet/seccomp). The profile file must be installed on
      # each node, e.g. by a DaemonSet; the ConfigMap above is not loaded
      # automatically by the kubelet.
      localhostProfile: operator/production/profile.json
  containers:
  - name: app
    image: app:latest

4. Supply Chain Security

Image Security Pipeline

Implement comprehensive image security from build to runtime:

# 1. Build images from minimal base
FROM gcr.io/distroless/static-debian11:nonroot
COPY --chown=nonroot:nonroot app /app
USER nonroot:nonroot
ENTRYPOINT ["/app"]

# 2. Scan during CI/CD pipeline
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:latest
 
# 3. Sign images with cosign
cosign sign --key cosign.key myregistry.io/myapp:latest
 
# 4. Generate SBOM
syft myapp:latest -o spdx-json > sbom.json

Image Policy Enforcement:

Use admission controllers to enforce image policies:

# Require signed images (using Sigstore policy-controller)
apiVersion: policy.sigstore.dev/v1beta1
kind: ClusterImagePolicy
metadata:
  name: require-signature
spec:
  images:
  - glob: "myregistry.io/**"
  authorities:
  - keyless:
      url: https://fulcio.sigstore.dev
      identities:
      - issuer: https://accounts.google.com
        subject: build-system@mycompany.com
---
# Restrict registries with Kyverno
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-registries
spec:
  validationFailureAction: enforce
  rules:
  - name: require-approved-registry
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Images must come from approved registries"
      pattern:
        spec:
          containers:
          - image: "myregistry.io/* | gcr.io/distroless/*"

Dependency Management

Track and secure application dependencies:

# Generate Software Bill of Materials
syft dir:. -o spdx-json > app-sbom.json
 
# Scan dependencies for vulnerabilities
grype sbom:app-sbom.json
 
# Monitor for new vulnerabilities
grype sbom:app-sbom.json --add-cpes-if-none --by-cve

5. Secrets Management

External Secrets Architecture

Never store sensitive data in Kubernetes Secrets directly. Use external secret managers:

# Install External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: production
spec:
  provider:
    vault:
      server: "https://vault.company.com"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "production-role"
          serviceAccountRef:
            name: external-secrets
---
# Define external secret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
  - secretKey: password
    remoteRef:
      key: database/production
      property: password
  - secretKey: username
    remoteRef:
      key: database/production
      property: username

Encryption at Rest:

Enable encryption for Kubernetes Secrets stored in etcd:

# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
    - secrets
    providers:
    - aescbc:
        keys:
        - name: key1
          secret: <BASE64_ENCODED_32_BYTE_KEY>
    - identity: {}
# Configure API server
kube-apiserver --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
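
The <BASE64_ENCODED_32_BYTE_KEY> placeholder must be a randomly generated 32-byte key, base64-encoded. One way to produce it:

```shell
# Generate a random 32-byte AES key, base64-encoded, for the
# EncryptionConfiguration above
head -c 32 /dev/urandom | base64
```

Treat the key material with the same care as a root credential, and keep the identity: {} provider last in the list so existing plaintext secrets stay readable until you re-encrypt them (for example with kubectl get secrets --all-namespaces -o json | kubectl replace -f -).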

Secret Rotation:

Implement automated secret rotation:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: auto-rotate-secret
spec:
  refreshInterval: 24h # Check for updates daily
  target:
    name: app-credentials
    template:
      type: Opaque
      metadata:
        annotations:
          reloader.stakater.com/match: "true" # Trigger pod restart on change

6. Observability and Detection

Comprehensive Audit Logging

Configure detailed audit logging to track all cluster activity:

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log secret access
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]
# Log requests from anonymous (unauthenticated) users
- level: Metadata
  omitStages:
  - RequestReceived
  users: ["system:anonymous"]
# Log exec and portforward
- level: Request
  resources:
  - group: ""
    resources: ["pods/exec", "pods/attach", "pods/portforward"]
# Log privilege escalations
- level: Request
  verbs: ["create", "update", "patch"]
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["clusterrolebindings", "rolebindings"]
# Log all requests at metadata level
- level: Metadata
  omitStages:
  - RequestReceived

Runtime Threat Detection

Deploy runtime security monitoring with Falco:

# Falco custom rules
- rule: Unauthorized Process in Container
  desc: Detect execution of processes not in allowlist
  condition: >
    spawned_process and
    container and
    container.image.repository = "myapp" and
    not proc.name in (app, sh)
  output: >
    Unexpected process spawned (user=%user.name command=%proc.cmdline
    container=%container.name image=%container.image.repository)
  priority: WARNING
 
- rule: Sensitive File Access
  desc: Detect access to sensitive files
  condition: >
    open_read and
    container and
    fd.name in (/etc/shadow, /etc/sudoers, /root/.ssh/id_rsa)
  output: >
    Sensitive file accessed (user=%user.name file=%fd.name
    container=%container.name command=%proc.cmdline)
  priority: CRITICAL
 
- rule: Reverse Shell Detected
  desc: Detect reverse shell connection attempts
  condition: >
    spawned_process and
    container and
    proc.name in (nc, ncat, netcat, socat) and
    proc.args contains "-e"
  output: >
    Reverse shell detected (user=%user.name command=%proc.cmdline
    container=%container.name)
  priority: CRITICAL

Metrics and Alerting

Monitor security-relevant metrics:

# Prometheus alerts for security events
groups:
- name: kubernetes_security
  interval: 30s
  rules:
  - alert: UnauthorizedAPIAccess
    expr: |
      sum(rate(apiserver_audit_event_total{
        verb!~"get|list|watch",
        user=~"system:anonymous|system:unauthenticated"
      }[5m])) > 0
    annotations:
      summary: "Unauthorized API access detected"
 
  - alert: PrivilegedPodCreated
    expr: |
      sum(kube_pod_container_status_running{
        container_security_context_privileged="true"
      }) > 0
    annotations:
      summary: "Privileged container is running"
 
  - alert: HighNumberOfFailedLogins
    expr: |
      sum(rate(apiserver_audit_event_total{
        verb="create",
        objectRef_resource="selfsubjectaccessreviews",
        responseStatus_code="403"
      }[5m])) > 5
    annotations:
      summary: "High number of failed authentication attempts"

7. Compliance and Policy Enforcement

Policy-as-Code with Kyverno

Enforce security policies across the cluster:

# Require resource limits
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: enforce
  rules:
  - name: check-resource-limits
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "CPU and memory limits are required"
      pattern:
        spec:
          containers:
          - resources:
              limits:
                memory: "?*"
                cpu: "?*"
---
# Disallow privileged containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged
spec:
  validationFailureAction: enforce
  rules:
  - name: check-privileged
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Privileged containers are not allowed"
      pattern:
        spec:
          containers:
          - securityContext:
              privileged: false
---
# Require non-root containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root
spec:
  validationFailureAction: enforce
  rules:
  - name: check-runAsNonRoot
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Containers must run as non-root user"
      pattern:
        spec:
          securityContext:
            runAsNonRoot: true
          containers:
          - securityContext:
              runAsNonRoot: true
---
# Enforce image pull policy
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-imagepullpolicy
spec:
  validationFailureAction: enforce
  rules:
  - name: require-imagepullpolicy-always
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "imagePullPolicy must be Always"
      pattern:
        spec:
          containers:
          - imagePullPolicy: Always

Compliance Scanning

Automate compliance checks:

# CIS Kubernetes Benchmark scanning with kube-bench
kube-bench run --targets master,node,policies
 
# NSA/CISA Kubernetes Hardening Guide compliance
kubectl-kubesec scan deployment/myapp
 
# Pull node resource usage as supporting evidence for reports
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | \
  jq '.items[] | {name: .metadata.name, cpu: .usage.cpu, memory: .usage.memory}'

8. Incident Response and Recovery

Security Incident Playbook

Establish procedures for security incidents:

Detection Phase:

  1. Monitor security alerts from Falco, audit logs, and SIEM
  2. Validate alert legitimacy and severity
  3. Activate incident response team
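
For the monitoring step, it helps to pre-filter the alert stream by severity so on-call engineers triage only actionable events. A minimal sketch, assuming Falco runs with json_output: true; the event file path in the example is hypothetical:

```shell
#!/bin/sh
# Filter Falco JSON-lines output down to high-severity events for triage.
# Assumes json_output: true in falco.yaml; priorities use Falco's naming.
triage_falco() {
  # $1 = path to a Falco JSON-lines event log
  grep -E '"priority": *"(Emergency|Alert|Critical)"' "$1"
}

# Example: triage_falco /var/log/falco/events.jsonl
```

In practice you would ship the full stream to a SIEM and apply this kind of severity routing there; the sketch is useful for quick triage on a node.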

Containment Phase:

# Isolate compromised pod with network policy
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-compromised-pod
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: compromised-app
  policyTypes:
  - Ingress
  - Egress
  # Deny all traffic
EOF
 
# Prevent pod from being rescheduled
kubectl cordon node-name
kubectl drain node-name --ignore-daemonsets --delete-emptydir-data

Investigation Phase:

# Capture pod state before deletion
kubectl get pod compromised-pod -o yaml > compromised-pod.yaml
kubectl describe pod compromised-pod > compromised-pod-describe.txt
kubectl logs compromised-pod --all-containers=true > compromised-pod-logs.txt
 
# Extract audit logs for timeline
kubectl logs -n kube-system kube-apiserver-xxx | \
  grep "compromised-pod" > audit-trail.log
 
# Analyze runtime events
kubectl logs -n falco falco-xxx | grep "compromised-pod"

Eradication Phase:

# Delete compromised workload
kubectl delete pod compromised-pod
 
# Rotate potentially compromised credentials
kubectl delete secret compromised-secret
# Recreate from vault/external source
 
# Patch vulnerability if identified
kubectl set image deployment/myapp app=myapp:patched-version

Backup and Disaster Recovery

Implement backup strategies for cluster state:

# Backup etcd
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
 
# Backup cluster resources
kubectl get all --all-namespaces -o yaml > cluster-backup.yaml
 
# Use Velero for comprehensive backups
velero backup create full-backup --include-namespaces '*'

Security Assessment Checklist

Use this comprehensive checklist for security reviews:

Cluster Infrastructure

  • Control plane nodes isolated and hardened
  • etcd encrypted at rest and in transit
  • API server authentication and authorization configured
  • Admission controllers enabled (PodSecurity, NodeRestriction, ResourceQuota)
  • Anonymous auth disabled
  • Kubelet authentication enabled with certificate rotation
  • Audit logging configured and forwarded to SIEM

Identity and Access

  • RBAC enabled with least-privilege roles
  • No direct cluster-admin bindings to users
  • Service accounts scoped per application
  • Default service account tokens not auto-mounted
  • Integration with enterprise identity provider (OIDC/LDAP)
  • Regular RBAC audit conducted
  • Unused service accounts removed

Network Security

  • Default-deny network policies implemented
  • Namespace network isolation configured
  • Ingress traffic restricted to approved sources
  • Egress traffic limited to necessary destinations
  • Service mesh with mTLS deployed (if applicable)
  • Network policy testing performed

Workload Security

  • Pod Security Standards enforced (baseline or restricted)
  • Containers run as non-root users
  • Read-only root filesystems enabled
  • Privilege escalation disabled
  • Unnecessary capabilities dropped
  • Resource limits defined for all containers
  • seccomp profiles applied
  • AppArmor/SELinux policies configured

Supply Chain Security

  • Container images scanned for vulnerabilities
  • Images signed and signature verification enforced
  • Approved container registries whitelisted
  • SBOM generated for all applications
  • Base images regularly updated
  • Minimal base images used (distroless, Alpine)
  • Dependency scanning integrated in CI/CD

Secrets Management

  • External secrets manager integrated
  • Kubernetes Secrets encrypted at rest
  • Secrets not embedded in container images
  • Secret rotation automated
  • Access to secrets logged and monitored
  • Secrets scoped per application

Observability and Detection

  • Comprehensive audit logging enabled
  • Runtime security monitoring deployed (Falco)
  • Security metrics collected and alerted
  • Log aggregation and SIEM integration
  • Anomaly detection configured
  • Security dashboards created

Compliance and Governance

  • Policy enforcement tool deployed (Kyverno, OPA)
  • CIS Kubernetes Benchmark compliance verified
  • Compliance scanning automated
  • Security policies documented and enforced
  • Regular compliance audits scheduled

Incident Response

  • Security incident playbook documented
  • Incident response team identified and trained
  • Backup and recovery procedures tested
  • Post-incident review process established
  • Communication plan defined

Common Security Anti-Patterns

Avoid these frequent mistakes:

1. Permissive RBAC

Anti-pattern:

subjects:
- kind: Group
  name: developers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin  # TOO PERMISSIVE

Correct approach:

subjects:
- kind: Group
  name: developers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: developer  # Namespace-scoped, limited permissions

2. Running as Root

Anti-pattern:

FROM ubuntu
RUN apt-get update
CMD ["/app/server"]  # Runs as root (UID 0)

Correct approach:

FROM ubuntu
RUN useradd -u 10000 -m appuser
USER appuser
CMD ["/app/server"]

3. Exposing Sensitive Ports

Anti-pattern:

ports:
- containerPort: 2379  # etcd exposed
- containerPort: 10250  # kubelet API exposed

Correct approach: Use ClusterIP services and restrict with network policies.

4. Storing Secrets in ConfigMaps

Anti-pattern:

apiVersion: v1
kind: ConfigMap
data:
  db_password: "plaintext_password"  # NEVER DO THIS

Correct approach: Use external secrets manager with encryption.

5. Overly Broad Network Policies

Anti-pattern:

spec:
  podSelector: {}
  ingress:
  - from: []  # Allow from anywhere

Correct approach: Explicitly define source selectors and ports.

Advanced Security Patterns

Zero Trust Architecture

Implement zero trust principles in Kubernetes:

# Deny by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: zero-trust-default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Explicit service-to-service authorization
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: frontend-to-backend
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/production/sa/frontend"]
    to:
    - operation:
        methods: ["POST"]
        paths: ["/api/v1/orders"]
    when:
    - key: request.headers[Authorization]
      values: ["Bearer *"]

Workload Identity and SPIFFE

Implement cryptographic workload identity:

# Create the SPIRE namespace (server and agent manifests follow in the quickstart)
kubectl apply -f https://raw.githubusercontent.com/spiffe/spire-tutorials/main/k8s/quickstart/spire-namespace.yaml
 
# Configure workload registration
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/ns/production/sa/backend \
  -parentID spiffe://example.org/ns/spire/sa/spire-agent \
  -selector k8s:ns:production \
  -selector k8s:sa:backend

Multi-Tenancy Isolation

Implement strong tenant isolation:

# Namespace per tenant (for stronger isolation, consider virtual clusters via vcluster)
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
---
# Resource quotas per tenant
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    requests.cpu: "100"
    requests.memory: 200Gi
    persistentvolumeclaims: "10"
    services.loadbalancers: "2"
---
# Limit ranges to prevent resource monopolization
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-a-limits
  namespace: tenant-a
spec:
  limits:
  - max:
      cpu: "4"
      memory: 8Gi
    min:
      cpu: 100m
      memory: 128Mi
    type: Container

Continuous Security Improvement

Security is not a one-time implementation but an ongoing process:

Automated Security Scanning

  • Daily: Vulnerability scanning of running containers
  • Weekly: Configuration drift detection and remediation
  • Monthly: Penetration testing of exposed services
  • Quarterly: Third-party security assessments

Security Training

  • Conduct regular secure coding workshops for developers
  • Implement security champions program
  • Share security incident post-mortems
  • Stay updated on CVEs affecting Kubernetes ecosystem

Tooling Ecosystem

  • Static Analysis: Checkov, kube-score, kubesec
  • Runtime Security: Falco, Tracee, Tetragon
  • Policy Enforcement: Kyverno, OPA Gatekeeper
  • Vulnerability Scanning: Trivy, Grype, Snyk
  • Compliance: kube-bench, Prowler
  • Secrets Management: External Secrets, Sealed Secrets

Conclusion

Securing Kubernetes requires a comprehensive, layered approach spanning infrastructure, identity, network, workload, and supply chain security. By implementing the practices outlined in this guide, you establish a robust security posture that protects against both common misconfigurations and advanced threats.

Key takeaways:

  • Start with fundamentals: Harden cluster infrastructure before adding complexity
  • Adopt defense-in-depth: Layer multiple security controls
  • Enforce least privilege: Limit access and capabilities to minimum necessary
  • Automate security: Integrate scanning, policy enforcement, and monitoring into CI/CD
  • Plan for incidents: Establish detection, response, and recovery procedures
  • Continuously improve: Security is an ongoing process, not a destination

The security landscape evolves rapidly. Stay informed through security mailing lists, attend KubeCon security talks, and actively participate in the Kubernetes security community.

Need Expert Kubernetes Security Assistance?

Our cloud security team specializes in Kubernetes hardening, compliance, and incident response. We provide:

  • Security Assessments: Comprehensive penetration testing and vulnerability analysis
  • Architecture Review: Expert evaluation of cluster design and security controls
  • Implementation Support: Hands-on assistance deploying security tooling and policies
  • Compliance Consulting: SOC 2, ISO 27001, PCI DSS, HIPAA compliance for Kubernetes
  • Incident Response: 24/7 emergency security incident handling
  • Training Programs: Customized security training for development and operations teams

Contact us for a complimentary Kubernetes security consultation.

