Azure Databricks and Key Vault key versioning

Hemanth Kumar 0 Reputation points
2025-01-28T07:07:43.85+00:00

We are building a data encryption solution and are using the PySpark aes_encrypt and aes_decrypt functions to encrypt the PII columns in Databricks tables on Unity Catalog. We have created views on top of these tables that decrypt the data with the same encryption key. This data is accessible only to a limited set of users based on ACLs. The encryption key is stored in Azure Key Vault, and we access it through an Azure Key Vault-backed Databricks secret scope. The client has a policy to rotate the secret after one year. If we rotate the secret and update the value, we will no longer be able to read the historical data that was encrypted with the older secret. We are using external tables because we have to join different tables on the PII columns. We cannot use managed tables, row-level masking, or column-level masking. What is the recommended approach to secret rotation and management in this situation?

Azure Key Vault
An Azure service that is used to manage and protect cryptographic keys and other secrets used by cloud apps and services.

1 answer

  1. Chiugo Okpala 75 Reputation points MVP
    2025-02-02T21:49:02.7166667+00:00

    @Hemanth Kumar, key rotation is critical for security, but it is challenging when historical data must stay readable. Here are some recommended approaches and best practices for secret rotation and management:

    1. Key Versioning

    Azure Key Vault supports key versioning, which allows you to keep multiple versions of a key. When you rotate the key, you can create a new version of the key while retaining the old version. This way, you can still decrypt historical data with the old key version.
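
    To illustrate how retained versions keep old data readable, here is a minimal Python sketch, not a definitive implementation. It assumes each ciphertext is stored together with the label of the key version that encrypted it. VersionedKeyring, toy_cipher, and the v1/v2 labels are hypothetical names, and toy_cipher is an insecure stand-in for AES so the example runs on the standard library alone.

```python
import hashlib
from typing import Dict, Tuple


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Insecure stand-in for AES (assumption: used only so the sketch runs).

    XORs the data with a SHA-256-derived keystream, so applying it twice
    with the same key returns the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class VersionedKeyring:
    """Hypothetical helper: maps version labels to keys, tracks the current one."""

    def __init__(self, keys: Dict[str, bytes], current_version: str) -> None:
        if current_version not in keys:
            raise KeyError(f"unknown key version: {current_version}")
        self._keys = dict(keys)
        self._current = current_version

    def rotate(self, new_version: str, new_key: bytes) -> None:
        """Add a new key and make it current; older versions stay available."""
        if new_version in self._keys:
            raise ValueError(f"key version already exists: {new_version}")
        self._keys[new_version] = new_key
        self._current = new_version

    def encrypt(self, plaintext: bytes) -> Tuple[str, bytes]:
        """Encrypt with the current key and return (key_version, ciphertext)."""
        return self._current, toy_cipher(self._keys[self._current], plaintext)

    def decrypt(self, key_version: str, ciphertext: bytes) -> bytes:
        """Decrypt with whichever key version the record was written with."""
        return toy_cipher(self._keys[key_version], ciphertext)


keyring = VersionedKeyring({"v1": b"k" * 32}, current_version="v1")
old_row = keyring.encrypt(b"alice@example.com")  # written before rotation

keyring.rotate("v2", b"n" * 32)
new_row = keyring.encrypt(b"bob@example.com")    # written after rotation

# Both rows remain readable because each carries its own key version.
old_plain = keyring.decrypt(*old_row)
new_plain = keyring.decrypt(*new_row)
```

    In your tables this would likely mean a key-version column next to each encrypted PII column, so the view can select the matching key. That layout is an assumption; Key Vault keeps the versions but does not track which version encrypted which row.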

    2. Dual-Key Encryption

    Use a dual-key approach, often called envelope encryption, in which two keys play different roles: a data encryption key (DEK) encrypts the data, and a key encryption key (KEK) encrypts the DEK. When you rotate the KEK, you only re-encrypt the DEK with the new KEK. The DEK itself does not change, so historical data stays decryptable and does not have to be re-encrypted.
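
    The KEK rotation step can be sketched as follows. This is a minimal illustration under stated assumptions, not a production design: toy_cipher is an insecure stand-in for AES so the code runs on the standard library alone, and rewrap_dek and the kek_v1/kek_v2 names are hypothetical. With a KEK held in Azure Key Vault, the wrap and unwrap steps would correspond to the vault's wrapKey and unwrapKey key operations.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Insecure stand-in for AES (assumption: used only so the sketch runs).

    XORs the data with a SHA-256-derived keystream, so applying it twice
    with the same key returns the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def rewrap_dek(wrapped_dek: bytes, old_kek: bytes, new_kek: bytes) -> bytes:
    """Unwrap the DEK with the old KEK, then wrap it again with the new KEK."""
    dek = toy_cipher(old_kek, wrapped_dek)
    return toy_cipher(new_kek, dek)


# Initial state: the data is encrypted with the DEK, the DEK is wrapped by KEK v1.
dek = secrets.token_bytes(32)
kek_v1 = secrets.token_bytes(32)
ciphertext = toy_cipher(dek, b"123-45-6789")
wrapped_dek = toy_cipher(kek_v1, dek)

# Rotation: only the small wrapped DEK is rewritten; the data is not touched.
kek_v2 = secrets.token_bytes(32)
wrapped_dek_v2 = rewrap_dek(wrapped_dek, kek_v1, kek_v2)

# Reading historical data after rotation: unwrap with the new KEK, then decrypt.
recovered_dek = toy_cipher(kek_v2, wrapped_dek_v2)
plaintext = toy_cipher(recovered_dek, ciphertext)
```

    The point of the design is that rotation cost is independent of data volume: the encrypted columns are never rewritten, only the stored wrapped DEK changes.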

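    3. Automated Key Rotation

    Leverage Azure Key Vault's automated key rotation feature. A rotation policy lets you set an expiry time for the key and rotate it automatically, either a set time after creation or a set time before expiry. You can also configure notifications that alert you when the key is nearing expiration. Note that this policy applies to Key Vault key objects; a value stored as a Key Vault secret, as in your setup, is not rotated by it.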

    4. Centralized Key Management

    Use a centralized key management system (KMS) to manage your keys. This system should be secure and allow easy management of keys across the organization. It should also support key versioning and dual-key encryption.

    5. Regular Audits and Monitoring

    Conduct regular audits and monitor key usage to ensure that keys are being used correctly and securely. This can help you identify any potential issues and address them promptly.

    6. Disaster Recovery Strategy

    Create a disaster recovery strategy that includes backing up your keys and ensuring that they can be restored in case of a disaster. This can help you recover your data if something goes wrong.

    7. Documentation and Policies

    Document your key management policies and enforce them consistently. This includes documenting key usage, rotation schedules, and any other relevant information.

    By implementing these practices, you can ensure that your data encryption solution remains secure while still allowing access to historical data. Does this help address your concerns?

    See:

    https://learn.microsoft.com/en-us/azure/key-vault/general/versions

    https://www.encryptionconsulting.com/10-enterprise-encryption-key-management-best-practices/

    https://learn.microsoft.com/en-us/azure/key-vault/keys/how-to-configure-key-rotation

    https://www.liquidweb.com/blog/encryption-key-management-best-practices/
