As part of my efforts in reducing my dependency on Big Tech, I have been researching how to self-host my password manager. One solution that looks very promising is Vaultwarden, an open source clone of the Bitwarden cloud server. An interesting aspect of this server is that it stores all the secrets in a standard SQLite database, so in addition to having the self-hosted password server I could keep a backup copy of the database on my machine and query it directly. But of course, the secrets are encrypted in this database, so they are useless unless I learn how to decrypt them, similar to how the Bitwarden clients do it.
Speaking of the Bitwarden clients, while I was writing this article it came out that the official Bitwarden CLI client was compromised in a supply chain attack. This is a tool that I personally use and have on all my computers, so this feels like a wake-up call to me. Luckily I did not install the compromised version myself, but I think there is an argument to be made for rolling your own secret management client instead of relying on the one all the hackers are after!
In this article I'll share how the encryption of secrets works in Bitwarden and its Vaultwarden clone. I'll also include working Python code, in case you want to tinker with this and, like me, are interested in building your own tooling to keep your secrets safe.
The 10,000 foot view
Okay, let's get to it, first at a high enough level to keep things simple.
Bitwarden, Vaultwarden and pretty much any half-decent password manager store all your secrets encrypted on the server. When I say "secrets" I do not mean just the passwords, but also the usernames, URLs, notes, attachments and everything else that you store with each secret. Bitwarden even encrypts the name you give to each secret. Only the client knows how to encrypt or decrypt, and it always encrypts data before sending it to the server. The server only knows how to store and retrieve blobs of encrypted data from a database.
To encrypt and decrypt secrets, the client uses a master key that is associated with your account. The master key is a random sequence of bytes that is generated in the client at the time you create an account. Like the secrets, this master key is itself encrypted before it is sent to the server for storage. The encryption algorithm used to encrypt the master key is similar to that of the secrets, but the encryption key in this case is generated from the passphrase chosen by the user to protect the account.
So you see, the account passphrase is not directly used to encrypt your secrets, as many people think; it only encrypts the master key. To be able to decrypt your secrets, the Bitwarden client first uses your passphrase to decrypt the master key. Then it uses the master key to decrypt your actual secrets. When the client leaves your vault unlocked, it just means that it is keeping a copy of the decrypted master key in memory (or maybe the whole decrypted vault), so that it can return secrets to you without you having to type your passphrase again. To lock your vault, all the client needs to do is discard the master key.
The master key
I will now share the details of how things work, which as far as I know are not formally documented anywhere. I had to dig through the Bitwarden and Vaultwarden source code to figure out a lot of these details.
These three sections are all binary sequences written in base64 encoding, which ensures all characters are printable. Below you can see some more Python code to decode the secret into its parts:
from base64 import b64decode
# ...
# The version prefix is separated from the payload by the first period
version, payload = encrypted_secret.split('.', 1)
if version != '2':
    raise ValueError('Unsupported encryption version')
# The payload holds three base64 fields separated by '|'
fields = payload.split('|')
if len(fields) != 3:
    raise ValueError('Invalid encrypted data')
iv = b64decode(fields[0])
ciphertext = b64decode(fields[1])
mac = b64decode(fields[2])
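If you want to reuse this parsing logic, it can be wrapped in a small function. What follows is only a minimal sketch: the function name `parse_encrypted_string` is my own invention, and the sample string is built from made-up bytes rather than real vault data. The 16 and 32 byte lengths are just plausible sizes for illustration.

```python
from base64 import b64decode, b64encode


def parse_encrypted_string(encrypted_secret):
    """Split a version 2 encrypted string into (iv, ciphertext, mac) bytes."""
    # partition() always returns three items, so a missing period is easy to detect
    version, sep, payload = encrypted_secret.partition('.')
    if not sep or version != '2':
        raise ValueError('Unsupported encryption version')
    fields = payload.split('|')
    if len(fields) != 3:
        raise ValueError('Invalid encrypted data')
    # validate=True rejects characters outside the base64 alphabet
    # (binascii.Error is a subclass of ValueError)
    iv, ciphertext, mac = (b64decode(f, validate=True) for f in fields)
    return iv, ciphertext, mac


# Made-up values for illustration, not real vault data
sample = '2.' + '|'.join(
    b64encode(part).decode()
    for part in (b'\x00' * 16, b'ciphertext bytes', b'\x01' * 32)
)

iv, ciphertext, mac = parse_encrypted_string(sample)
print(len(iv), ciphertext, len(mac))
```

Using `partition()` and `validate=True` is a design choice on my part, so that every malformed input ends up as a `ValueError` instead of being silently accepted.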
Before attempting to decrypt the secret, we need to ensure that the encrypted string isn't corrupted or altered in any way. For this we can independently calculate the MAC signature and then compare it against the mac value included in the secret. If the two are different, then we'll know that the encrypted string has been corrupted or tampered with, so in that case this secret should be discarded as invalid.
For the signature, Bitwarden computes a standard HMAC hash of the concatenation of iv and ciphertext, using the SHA-256 hash function. The secret key used to calculate this hash is the mac_key part of the master key. If you are not familiar with cryptographic functions then just know that this is a standard cryptographic calculation, so common that it is available in the Python standard library. Below you can see Python code that calculates the value of mac and then confirms that the calculated value is identical to the mac value that comes with the secret:
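If you would like to try this check without touching a real vault, here is a minimal self-contained sketch. The helper names `compute_mac` and `is_authentic` and all the byte values are made up for illustration; only the calculation itself (HMAC with SHA-256 over iv followed by ciphertext, keyed with mac_key) comes from the description above.

```python
import hmac
from hashlib import sha256


def compute_mac(mac_key, iv, ciphertext):
    """Return the HMAC-SHA256 of iv + ciphertext, keyed with mac_key."""
    return hmac.new(mac_key, iv + ciphertext, sha256).digest()


def is_authentic(mac_key, iv, ciphertext, mac):
    """Check that mac matches the value calculated from iv and ciphertext."""
    # compare_digest() takes the same time wherever the first mismatch is,
    # so the comparison does not leak how many leading bytes matched
    return hmac.compare_digest(compute_mac(mac_key, iv, ciphertext), mac)


# Made-up values for illustration, not real keys or vault data
mac_key = bytes(range(32))
iv = b'\x00' * 16
ciphertext = b'not really encrypted'
mac = compute_mac(mac_key, iv, ciphertext)

print(is_authentic(mac_key, iv, ciphertext, mac))         # True
print(is_authentic(mac_key, iv, ciphertext + b'!', mac))  # False
```

The second call shows what happens when a single byte is appended to the ciphertext: the calculated value no longer matches, so the secret would be rejected.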