Secure Persistent Volumes with Kubernetes

When it comes to orchestrating compute infrastructure, Kubernetes is powerful. The ability to declaratively define scalable and resilient services is game-changing. All of this is straightforward for stateless services, development apps and demos. But what about stateful enterprise applications? When you introduce storage into the mix, there are a number of important security details that you should pay attention to before moving into production.

Persistent storage solutions have been around for several software generations. Initially, there were native in-tree volume plugins that were built into Kubernetes. This meant that adding full support for a third-party storage backend required coordination and integration with Kubernetes, with all updates locked to Kubernetes release cycles. Then came FlexVolumes, which addressed these limitations by moving the storage intelligence out of Kubernetes, but complicated deployment and dependency management. At last, we’ve arrived at a scalable model: CSI plugins. The Container Storage Interface enables storage plugins to be developed out-of-tree, containerized, and deployed via standard Kubernetes primitives.

Here’s our list of the top 8 security recommendations for using persistent volumes with Kubernetes and how to configure the associated protections using the Blockbridge CSI driver.

1. Use Namespaces for Isolation

Kubernetes namespaces are an important isolation concept. Namespaces are a way to divide cluster resources between multiple users. By deploying your CSI driver pods per-namespace, you can create isolated storage domains that segment authentication and/or data in a multi-tenant environment. This provides an excellent way to isolate development from production and separate storage resources across different business groups.

When using Blockbridge, we recommend associating a Blockbridge tenant account with a Kubernetes namespace. This one-to-one mapping aligns storage authentication, auditing and naming to namespace boundaries. It also enables fine-grained placement policies and resource utilization controls that are scoped to the namespace.
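
As a minimal sketch of the Kubernetes side of that mapping, the manifest below creates a production namespace to hold the CSI driver pods and their secret. The name production matches the examples later in this article; the label is purely illustrative and is not something Blockbridge is known to require.

```yaml
# Namespace that will hold the CSI driver pods and the blockbridge secret.
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Hypothetical label, shown only to illustrate tagging the tenant.
    storage-tenant: production
```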

2. Leverage Role-Based Access Control

Creating a secure system is about providing defense in depth. In the event that a service is compromised, segmented access controls help to contain the damage. Kubernetes’ role-based access control (RBAC) allows you to scope and control the access rights of the CSI driver.

In the event that a CSI driver pod is compromised, the damage is constrained by the roles assigned to the pod. Access is further constrained by the namespace: resources in other namespaces, including secrets and services, are inaccessible to the attacker.

NOTE: As of Kubernetes 1.12, it is not possible to restrict access to a subset of secrets. Roles that grant access to secrets must, by definition, grant access to all secrets in the namespace.

As an example, the following resource defines a bb-external-attacher-runner ClusterRole with an associated set of rules. Each rule specifies a resource type and a list of permitted operations:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: bb-external-attacher-runner
  namespace: production
rules:
  # Secrets hold the storage API credentials used by the driver.
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  # VolumeAttachment objects live in the storage.k8s.io API group.
  - apiGroups: ["storage.k8s.io"]
    resources: ["volumeattachments"]
    verbs: ["get", "list", "watch", "update"]

Using kubectl, we can verify that the role ACL is being properly enforced:

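$ kubectl auth can-i list secrets --namespace production --as \
    system:serviceaccount:production:<service-account>
yes
$ kubectl auth can-i list secrets --namespace kube-system --as \
    system:serviceaccount:production:<service-account>
no - no RBAC policy matched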
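
A ClusterRole grants nothing until it is bound to a subject. The following is a sketch, not the driver's shipped manifest: it binds the role to a service account with a RoleBinding, which limits the grant to the production namespace and is consistent with the denial for kube-system above. The binding name and the service account name bb-csi-attacher are hypothetical placeholders.

```yaml
# Sketch: bind the ClusterRole inside a single namespace.
# A RoleBinding (unlike a ClusterRoleBinding) applies the role's rules
# only to resources in its own namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bb-external-attacher-binding   # hypothetical name
  namespace: production
subjects:
  - kind: ServiceAccount
    name: bb-csi-attacher              # hypothetical service account
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: bb-external-attacher-runner
```

Note that a RoleBinding cannot grant access to cluster-scoped resources such as persistentvolumes and nodes; rules for those would need a separate ClusterRoleBinding, ideally against a role that does not include secrets.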

3. Authenticate API Endpoints

Always secure the control communication between your storage plugins and storage infrastructure. Use TLS for transport encryption, and be sure to verify the server’s certificate to prevent man-in-the-middle attacks.

When using the Blockbridge CSI driver, include your CA certificate in the blockbridge secret. If you’re using a well-known CA with Blockbridge, such as Let’s Encrypt, this step is unnecessary.

apiVersion: v1
kind: Secret
metadata:
  name: blockbridge
  namespace: production
stringData:
  api-url: ""
  access-token: "1/AKFm669V3BuxKmIH2...jLR5c5ZAo2brr1XWWg"
  ca-cert: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----

4. Use Transport Encryption

Why trust the network when you don’t have to? Ensure that all control and data traffic uses secure transports. For applications that span clouds or datacenters, transport encryption is not optional.

When using the Blockbridge CSI driver, define your Storage Class with the transportEncryption parameter set to tls. This setting will ensure that your data traffic is protected in flight. Note that control traffic is always protected and that disabling transport security on control flows is not possible.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  namespace: production
  name: gp-tls
provisioner: com.blockbridge.csi.eps
parameters:
  serviceType: gp
  transportEncryption: tls
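
To show how an application would pick up this setting, here is a minimal sketch of a PersistentVolumeClaim that requests a volume from the gp-tls class. The claim name and the 5Gi size are illustrative assumptions.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data            # hypothetical claim name
  namespace: production
spec:
  storageClassName: gp-tls  # the StorageClass with transportEncryption: tls
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi          # illustrative size
```

Any pod that mounts this claim gets a volume provisioned through the class, so the encryption setting travels with the StorageClass rather than with each application.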

5. Use Encryption at Rest

Most attacks are not terribly sophisticated. What happens if someone walks off with a physical drive? Encrypting data at rest keeps it safe from that kind of theft. Encrypting above the device level also protects you against many forms of software-based attack.

For Blockbridge, encryption at rest is always-on. For performance-critical applications, we recommend XTS. For applications that require advanced security protections, we recommend GCM for additional guarantees on authenticity and integrity.

6. Validate the Integrity of your Data

Most full-disk encryption implementations use unauthenticated modes (such as XTS). Unauthenticated encryption provides confidentiality without proof of integrity or authenticity. As such, changes to data stored on disk, whether by device error, software error, or malicious tampering, cannot be detected.

Authenticated encryption (such as GCM) provides a cryptographic guarantee that your data is internally consistent (i.e., has integrity) as well as an assurance that the writer had knowledge of the encryption key (i.e., authenticity).

If you have applications where correctness is absolutely critical, authenticated encryption cannot be beat. For additional protection, Blockbridge extends the guarantees of GCM with block-level metadata that addresses security concerns specific to multi-tenant environments.

7. Leave no trace. Secure erase!

When a persistent volume is deleted, what happens to the data? In most storage implementations, the system releases the references to underlying blocks and makes them available for reuse. When a block is subsequently reallocated, it is rewritten, zeroed out, or both. Until then, your data is effectively recoverable, even if you are using drive-level encryption! This is a significant problem for the security of your data, and can be a difficult problem to solve.

The solution to this data security issue is to manage encryption on a per-volume basis, integrating volume deletion with key disposal. When it comes time to delete a volume, the associated volume key should be destroyed, providing a cryptographic guarantee that the data cannot be accessed.

This is the default behavior when using Blockbridge: no additional configuration is required. We automatically allocate strong encryption keys per-volume and destroy the key when the volume is removed.

8. Plan for Denial of Service

Implement Denial of Service (DoS) protections against misbehaving and malicious applications. In a shared storage environment, it is not uncommon for one application to negatively impact others due to unfair resource sharing. A Quality of Service (QoS) and IOPS throttling strategy helps you enforce service-level guarantees and protect against DoS.

Advantages of a well-defined storage performance strategy:

  1. Predictable performance ensures applications run the same in development as they do in production.
  2. Service-level guarantees for production applications.
  3. Established performance characteristics make it easy to spot applications that are storage limited.

With Blockbridge, use a performance template (configured via the control plane) to define a QoS and throttling specification. Create a Kubernetes storage class that references the template and associate it with the CSI driver. This association allows volumes created via the storage class to inherit the performance and placement characteristics described by the template.

Note that you can create multiple performance templates and multiple storage classes as needed. A provisioned IOPS (piops) template and associated StorageClass are shown below:

[Figure: piops service template]

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  namespace: production
  name: piops
provisioner: com.blockbridge.csi.eps
parameters:
  serviceType: piops
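
Throttling addresses IOPS contention; capacity exhaustion is a related concern. As a hedged sketch using the standard Kubernetes ResourceQuota object (this is not a Blockbridge feature), the following caps how much storage and how many claims the production namespace can request from the piops class. The quota name and the limits are illustrative assumptions.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: piops-storage-quota   # hypothetical name
  namespace: production
spec:
  hard:
    # Total capacity requested across all claims that use the piops class.
    piops.storageclass.storage.k8s.io/requests.storage: 500Gi
    # Number of claims allowed against the piops class.
    piops.storageclass.storage.k8s.io/persistentvolumeclaims: "10"
```

With a quota like this in place, a claim that would push the namespace past either limit is rejected at creation time instead of consuming shared capacity.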