Using Git-crypt to Protect Sensitive Data
The advent of the EU General Data Protection Regulation (GDPR) highlighted the need to protect sensitive information from leakage.
This is even more important in the context of git repositories, whether public or private, since anyone holding a working copy of the repository can access the full history of commits, including the ones made by mistake (for instance with `git commit -a`) that included sensitive files.
That’s where git-crypt comes in.
It is an open source, command line utility that empowers developers to protect specific files within a git repository.
git-crypt enables transparent encryption and decryption of files in a git repository. Files which you choose to protect are encrypted when committed, and decrypted when checked out. git-crypt lets you freely share a repository containing a mix of public and private content. git-crypt gracefully degrades, so developers without the secret key can still clone and commit to a repository with encrypted files. This lets you store your secret material (such as keys or passwords) in the same repository as your code, without requiring you to lock down your entire repository.
The biggest advantage of git-crypt is that private data and public data can live in the same location.
Note: there are alternative tools and approaches you can use to protect or encrypt data within a Git repository; they are listed at the end of this post.
Table of Contents
- Pre-requisites
- Installation
- Initial Repository Setup and Configuration
- git-crypt Usage
- git-crypt alternatives
Pre-requisites
To use git-crypt, you need a working Git and GPG environment:
- For Git: see Tutorial: IT/Dev[op]s Army Knives Tools for the researcher
- To use GPG within Git, you have to tell Git which GPG signing key ID to use.
- For GPG (GNU Privacy Guard): see this tutorial.
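The Git side of this configuration can be sketched as follows. The helper name `configure_git_signing` and the key ID in the usage line are placeholders chosen for illustration; `user.signingkey` and `commit.gpgsign` are standard Git settings.

```shell
# Hypothetical helper: tell Git which GPG key to use for signing.
# $1 is your own key ID, as listed by
#   gpg --list-secret-keys --keyid-format long
configure_git_signing() {
    key_id="$1"
    git config --global user.signingkey "$key_id"
    # Optional: sign every commit by default
    git config --global commit.gpgsign true
}
```

It would be called as `configure_git_signing 0x0123456789ABCDEF`, where the ID is a made-up example.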
GPG Key Management
General recommendations / Best practices
- Create a 4096-bit RSA key, with the SHA-512 hashing algorithm.
- Use the concept of GPG subkeys:
  - your primary key is only meant for certification / authentication purposes (in particular not for signing or encrypting).
- The expiration date should be less than two years away.
  - You can always extend the key expiration as long as you still have access to the key, even after it has expired.
This applies to your personal GPG keyring on your laptop. You may be reluctant to transfer or share your primary key pair over a remote [computing] system, such as an HPC facility. To handle your GPG keys on such platforms (for instance the UL HPC clusters), you have two alternatives:
- create a new key pair specific to each cluster, which you sign with your primary key;
- create a subkey that you export to the remote facility.
Installation
On your local machine:
If you’re running Mac OS X, and assuming Homebrew is installed, run `brew install git-crypt`.
If you’re running Linux, install the git-crypt package of your distribution when one is available, or build git-crypt from source.
Note: git-crypt is installed on the UL HPC platform
Initial Repository Setup and Configuration
Configure a repository to use git-crypt by running `git-crypt init` at its root.
This generates a symmetric key for encrypting your files (stored in `.git/git-crypt/keys/default`).
Then there are a couple of actions to perform, detailed below:
- claim ownership of the git-crypt vault
- bootstrap a `.gitattributes` file at the root of the repository defining the encryption policy for the files of the repository
- enable a custom git pre-commit hook (see doc) to avoid accidentally adding unencrypted files (see issue #45)
- share this key with allowed collaborators through a committed version of its encrypted form, using their respective GPG keys (see `git-crypt add-gpg-user`)
Note that you of course need to have imported the corresponding GPG key ID into your keyring.
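Initializing the vault and claiming ownership boil down to two git-crypt commands. A minimal sketch, assuming the key is already in your keyring; the helper name `gitcrypt_bootstrap` is hypothetical and its argument stands for your own GPG key ID:

```shell
# Hypothetical helper: initialise git-crypt in the current repository
# and grant a GPG user access to the vault.
gitcrypt_bootstrap() {
    gpg_id="$1"
    # Generate the symmetric key (once per repository)
    git-crypt init || return 1
    # Encrypt the symmetric key for this GPG user; git-crypt commits the
    # result under .git-crypt/ unless --no-commit is given
    git-crypt add-gpg-user "$gpg_id"
}
```

It would be called from the root of the repository as `gitcrypt_bootstrap <GPG-ID>`.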
`.gitattributes` setup
Create and commit at the root of your repository a new file named `.gitattributes` that defines which files git-crypt encrypts.
Note: you can find this template file on GitHub, and fetch it from there to automate the process.
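As an illustration only, a minimal policy could look like the sketch below. The patterns (`*.key`, `secretdir/**`) are example choices, not the content of the template; the `filter=git-crypt diff=git-crypt` attributes and the rule excluding `.gitattributes` itself follow the git-crypt documentation.

```shell
# Write an example encryption policy at the root of the repository
# (this overwrites any existing .gitattributes file)
cat > .gitattributes <<'EOF'
# Files matching these patterns are encrypted by git-crypt
*.key        filter=git-crypt diff=git-crypt
secretdir/** filter=git-crypt diff=git-crypt
# Never encrypt the policy file itself
.gitattributes !filter !diff
EOF
```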
Git pre-commit hook
You also need to set up a pre-commit hook to avoid accidentally adding unencrypted files with git-crypt (see issue #45). You can find it as a gist.
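The idea behind such a hook can be sketched as follows. This is not the gist itself: the function name is hypothetical, and the sketch assumes that `git-crypt status <files>` exits with a non-zero status when a file marked for encryption in `.gitattributes` is staged in cleartext.

```shell
# Hypothetical pre-commit check: refuse the commit when a file that
# should be encrypted has been staged while the vault was locked.
check_staged_files_encrypted() {
    # Nothing to check if the repository is not a git-crypt vault
    [ -d .git-crypt ] || return 0
    # Staged files that were added, copied, modified or renamed
    # (simple sketch: paths containing spaces are not handled)
    staged="$(git diff --cached --name-only --diff-filter=ACMR)"
    [ -n "$staged" ] || return 0
    if ! git-crypt status $staged >/dev/null 2>&1; then
        echo "Unlock the repository BEFORE staging files to be encrypted" >&2
        return 1
    fi
}
```

Inside the hook script, the function would be called as `check_staged_files_encrypted || exit 1`.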
The recommended way to automate the installation leaves the pre-commit hook script in a dedicated directory, `config/hooks/`.
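One possible way to do so, sketched under the assumption that the hook is exposed to Git through a symbolic link; the helper name and the file name `config/hooks/pre-commit` are hypothetical:

```shell
# Hypothetical helper: keep the hook under version control in config/hooks/
# and link it from .git/hooks/. Run it from the root of the repository.
install_precommit_hook() {
    hook_src="$1"                 # path to the downloaded hook script
    mkdir -p config/hooks .git/hooks
    cp "$hook_src" config/hooks/pre-commit
    chmod +x config/hooks/pre-commit
    # The link target is relative to .git/hooks/
    ln -sf ../../config/hooks/pre-commit .git/hooks/pre-commit
}
```

Each collaborator has to run this once after cloning, since the content of `.git/hooks/` is not versioned.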
(optional) Multiple key support
In addition to the implicit `default` key, git-crypt supports alternative keys which can be used to encrypt specific files and can be shared with specific GPG users.
This is useful if you want to grant different collaborators access to different sets of files.
To generate an alternative key named `<KEYNAME>` and/or share it with a GPG user, pass the `-k <KEYNAME>` option to `git-crypt { init | add-gpg-user }`.
To encrypt a file with an alternative key, use the `git-crypt-<KEYNAME>` filter in `.gitattributes`.
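The two steps above (named key, then dedicated filter) can be sketched together as follows. The helper name and the arguments in the usage line are hypothetical; the `-k` option and the `git-crypt-<KEYNAME>` filter are the ones described above.

```shell
# Hypothetical helper: create a named key, share it with one GPG user
# and encrypt the files matching a pattern with it.
gitcrypt_named_key() {
    keyname="$1"; gpg_id="$2"; pattern="$3"
    git-crypt init -k "$keyname" || return 1
    git-crypt add-gpg-user -k "$keyname" "$gpg_id" || return 1
    # Route the pattern through the filter dedicated to this key
    printf '%s filter=git-crypt-%s diff=git-crypt-%s\n' \
        "$pattern" "$keyname" "$keyname" >> .gitattributes
}
```

For example, `gitcrypt_named_key ops <GPG-ID> 'ops/*.key'` would reserve the files matching `ops/*.key` for the holders of the `ops` key.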
git-crypt Usage
Unlock/lock the git-crypt vault
You can unlock the vault, i.e. decrypt the encryption key using your personal GPG ID, by running `git-crypt unlock`.
You can lock the vault again by running `git-crypt lock`.
/!\ IMPORTANT
Thanks to the Git pre-commit hook configured above, you avoid committing sensitive files (as filtered within the `.gitattributes` file) in cleartext while the git-crypt vault is locked.
Adding a sensitive data file to the repository
1. First unlock the vault (if not yet done) with `git-crypt unlock`.
2. Then specify the files/wildcard patterns to encrypt by completing the `.gitattributes` file at the root of the repository with rules of the form `<pattern> filter=git-crypt diff=git-crypt`.
3. Commit the changes to the `.gitattributes` file.
4. Add and commit your file.
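As an illustration, the sketch below adds two example rules in a scratch repository and asks Git which filter applies to a given path. The patterns and file names are arbitrary examples; `git check-attr` is a standard Git command and needs neither git-crypt nor a key.

```shell
# Work in a throwaway repository so that nothing real is touched
cd "$(mktemp -d)" && git init -q .

# Example rules: encrypt every *.key file and everything under secretdir/
cat >> .gitattributes <<'EOF'
*.key        filter=git-crypt diff=git-crypt
secretdir/** filter=git-crypt diff=git-crypt
EOF

# Ask Git which filter would be applied to some paths
git check-attr filter -- api.key README.md
```

The first path is reported with the `git-crypt` filter and the second one as `unspecified`, which is a quick way to validate a pattern before committing anything.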
For instance at step 4, assuming you plan to add a `*.key` file (thus expected to be encrypted as per the above `.gitattributes` policy), add and commit it as usual with `git add` and `git commit`.
Note that thanks to the pre-commit hook, the commit fails if you forgot to unlock the repository.
Once the file is committed, you can check that its content is indeed encrypted.
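One way to run this check, sketched below under the assumption that git-crypt prefixes every encrypted blob with the 10-byte marker `\0GITCRYPT\0`, is to look at the first bytes of the object stored by Git rather than at the (decrypted) working copy. The helper name is hypothetical.

```shell
# Hypothetical helper: succeed if the given Git object (e.g. HEAD:secret.key)
# is stored in git-crypt's encrypted form.
is_gitcrypt_blob() {
    # Read the first 10 bytes of the stored blob and drop the NUL bytes
    header="$(git cat-file blob "$1" | head -c 10 | tr -d '\000')"
    [ "$header" = "GITCRYPT" ]
}
```

For a committed file named `secret.key` (an example name), the check would read `is_gitcrypt_blob HEAD:secret.key && echo "stored encrypted"`.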
Adding a new collaborator to the vault
To grant a collaborator access to the encrypted files stored in the repository, you first need to collect their GPG ID. You have several options at this level:
- query and import the GPG ID from the official GPG servers and carefully check it (assuming you have not yet imported it into your keyring)
- as distributing GPG keys can be cumbersome, rely on the keybase.io service to collect a certified GPG ID from a username (see tutorial)
Now you can share the repository with this GPG ID using `git-crypt add-gpg-user <GPG-ID>`.
By default, `git-crypt add-gpg-user` fails if there is no assurance that the key belongs to the named user.
If you trust the key you imported (but did not record this trust in your keyring by actually signing the key), you can use the `--trusted` option to force the operation to succeed: `git-crypt add-gpg-user --trusted <GPG-ID>`.
You can add as many collaborators as you wish.
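The sequence above can be wrapped in a small helper. A minimal sketch, assuming the collaborator's public key has already been imported; the function name is hypothetical:

```shell
# Hypothetical helper: grant a collaborator access to the vault.
gitcrypt_grant() {
    gpg_id="$1"
    # The public key must already be in your keyring
    if ! gpg --list-keys "$gpg_id" >/dev/null 2>&1; then
        echo "GPG key $gpg_id not found: import and verify it first" >&2
        return 1
    fi
    # Show the fingerprint, to be compared with the one given by the collaborator
    gpg --fingerprint "$gpg_id"
    # Adds (and commits) the symmetric key encrypted for this user
    git-crypt add-gpg-user "$gpg_id"
}
```

After `gitcrypt_grant <GPG-ID>`, a `git push` publishes the commit created by git-crypt so that the collaborator can unlock the repository.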
Removing a collaborator from the vault
(Update Sept 19, 2019)
That’s a tricky part: a bug on this subject has been open since 2015.
What is certain is that removing the file `.git-crypt/keys/default/0/<GPG-Key-to-remove-fingerprint>.gpg` from the repository is not sufficient. You have to re-initialize git-crypt for the repository with a new key and re-add all keys except the one to remove.
Note: You still need to change all your secrets to fully protect yourself. Removing a user will prevent them from reading future changes but they will still have a copy of the data up to the point of their removal.
There is no consensus on the appropriate way to handle it, but you can find a convenient script that does the job in this gist.
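As a building block for such a re-initialization, the fingerprints to re-add can be listed from the key files mentioned above. A minimal sketch, to be run at the root of the repository; the helper name is hypothetical and this is not the script from the gist:

```shell
# Hypothetical helper: print the fingerprints of all collaborators of the
# default key, except the one passed as argument.
remaining_collaborators() {
    removed="$1"
    for keyfile in .git-crypt/keys/default/0/*.gpg; do
        [ -e "$keyfile" ] || continue          # no key file at all
        fingerprint="$(basename "$keyfile" .gpg)"
        [ "$fingerprint" = "$removed" ] || echo "$fingerprint"
    done
}
```

Each printed fingerprint can then be passed to `git-crypt add-gpg-user` once the repository has been re-initialized with a new key.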
IMPORTANT: Before merging the changes made in the `git-crypt-remove` branch into your main branch and pushing these commits, make sure that your collaborators have locked the repository (`git-crypt lock`) before pulling, or they will end up in a conflicting state that is quite annoying to recover from.
git-crypt alternatives
Password Management with pass
Another nice git-based approach that teams nicely with GPG relies on pass, the standard unix password manager. Passwords are stored inside GPG-encrypted files inside a simple directory tree, meant to become a password repository.
`pass` is then a utility to insert, display or copy to the clipboard the passwords stored in this git repository.
It is not mandatory to use it, but it eases password management.
Assuming you have set the environment variables PASSWORD_STORE_{DIR,SIGNING_KEY}, the `pass` CLI usage can be summarized as follows: `pass insert` adds an entry, `pass show` displays it and `pass -c` copies it to the clipboard.
Notes: the Git commit is done automatically by the `pass` utility.
If you need to add comments in addition to the password, use the `-m` option to insert extra lines.
A dedicated tutorial page will be made available for this tool.
Now if you are allergic to GnuPG and/or by extension git-crypt, here are a few other alternatives you can use to protect your sensitive data in a repository.
EncFS / GocryptFS / eCryptFS / Cryptomator / securefs / CryFS
All these open-source file encryption solutions are available for Linux (and thus Mac OS). In contrast to disk-encryption software that operates on whole disks (TrueCrypt, dm-crypt etc.), file encryption operates on individual files that can be backed up or synchronised easily, especially within a Git repository.
- Comparison matrix
- gocryptfs, aspiring successor of EncFS written in Go
- EncFS, mature with known security issues
- eCryptFS, integrated into the Linux kernel
- Cryptomator, strong cross-platform support through Java and WebDAV
- securefs, a cross-platform project implemented in C++.
- CryFS, result of a master thesis at the KIT University that uses chunked storage to obfuscate file sizes.
Assuming your working copy is stored in `/path/to/repo`, your workflow (described below for EncFS, but it can be adapted to all the other tools) operates on encrypted vaults and would be as follows:
- you ignore the mounting directory (ex: `vault/*`) in the root `.gitignore` of the repository
  - this ensures neither you nor a collaborator will commit any unencrypted version of a file by mistake
- you commit only the EncFS / GocryptFS / eCryptFS / Cryptomator / securefs / CryFS raw directory (ex: `.crypt/`) in your repository
  - thus only the encrypted form of your files is committed
- you create the EncFS / GocryptFS / eCryptFS / Cryptomator / securefs / CryFS encrypted vault
- you prepare macros/scripts/Makefile/Rakefile tasks to lock/unlock the vault on demand
The table below gives a few examples of these operations (for EncFS, adapt accordingly).
Tool | OS | Opening/Unlocking the vault | Closing/locking the vault |
---|---|---|---|
EncFS | Linux | `encfs -o nonempty --idle=60 $rawdir $mountdir` | `fusermount -u $mountdir` |
EncFS | Mac OS | `encfs --idle=60 $rawdir $mountdir` | `umount $mountdir` |
GocryptFS | | `gocryptfs $rawdir $mountdir` | as above |
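The lock/unlock tasks mentioned above can be sketched as two shell functions built on the commands of the table. The function names and the default directories (`.crypt`, `vault`) are assumptions taken from the examples of this section; absolute paths are passed to `encfs`.

```shell
# Hypothetical helpers for an EncFS vault; run them from the repository root
rawdir=".crypt"      # encrypted directory, committed to the repository
mountdir="vault"     # cleartext mount point, ignored through .gitignore

vault_unlock() {
    mkdir -p "$rawdir" "$mountdir"
    case "$(uname -s)" in
        Linux) encfs -o nonempty --idle=60 "$PWD/$rawdir" "$PWD/$mountdir" ;;
        *)     encfs --idle=60 "$PWD/$rawdir" "$PWD/$mountdir" ;;
    esac
}

vault_lock() {
    case "$(uname -s)" in
        Linux) fusermount -u "$mountdir" ;;
        *)     umount "$mountdir" ;;
    esac
}
```

With `--idle=60`, the vault is also unmounted automatically after 60 minutes without activity, so `vault_lock` mainly matters when you want to close it right away.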
Note: In a Puppet control repository relying on hiera, you can use the hiera-eyaml format.
File Encryption using SSH [RSA] Key Pairs
- Man pages: `openssl rsa`, `openssl rsautl` and `openssl enc`
- Tutorial: Encryption with RSA Key Pairs
- Tutorial: How to encrypt a big file using OpenSSL and someone’s public key
- OpenSSL Command-Line HOWTO, in particular the section ‘How do I simply encrypt a file?’
If you encrypt/decrypt files or messages on more than a one-off occasion, you should really use GnuPG, as it is a much better suited tool for this kind of operation. But if you already have someone’s public SSH key, it can be convenient to use it, and it is safe.
The notes below assume you have a (potentially big) file you want to send encrypted to a collaborator, typically on a remote server where your SSH public key is allowed (i.e. your `id_rsa.pub` key is added to the remote `~/.ssh/authorized_keys` file).
We also assume that you own a copy of the SSH public key of your collaborator (denoted by `id_dst_rsa.pub` in the sequel).
Note: as a reminder, you can generate a strong RSA key pair (4096 bits) using
ssh-keygen -t rsa -b 4096 -a 100 [-f <name>]
This will produce the key files `<name>` and `<name>.pub`, where `<name>` is `~/.ssh/id_rsa` by default.
/!\ IMPORTANT
The instructions below are NOT compliant with the new OpenSSH format, which is used for storing encrypted (or unencrypted) RSA, ECDSA and Ed25519 keys (among others) when you use the `-o` option of `ssh-keygen`. You can recognize these keys by the fact that the private SSH key `~/.ssh/id_rsa` starts with `-----BEGIN OPENSSH PRIVATE KEY-----`.
Encrypt a file using a public SSH key
(if needed) SSH RSA public key conversion to PEM PKCS8
OpenSSL encryption/decryption operations performed using the RSA algorithm rely on keys following the PEM format [1] (ideally in the PKCS#8 format).
It is possible to convert OpenSSH public keys (private ones are already compliant) to the PEM PKCS8 format (a more secure format).
For that, you can use either the `ssh-keygen` or the `openssl` command, the first one being recommended.
Note that you don’t actually need to save the PKCS#8 version of the public key file: the conversion can also be done on demand when encrypting.
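The `ssh-keygen` variant of the conversion can be sketched as follows. The demonstration runs in a scratch directory with a throwaway key pair; in practice `id_dst_rsa.pub` is the key sent by your collaborator, and the output file name is an arbitrary choice.

```shell
# Scratch directory and throwaway key pair, for demonstration only
workdir="$(mktemp -d)"
ssh-keygen -q -t rsa -b 2048 -N "" -f "$workdir/id_dst_rsa"

# Export the OpenSSH public key to the PEM PKCS8 format
ssh-keygen -e -m PKCS8 -f "$workdir/id_dst_rsa.pub" > "$workdir/id_dst_rsa.pub.pkcs8"

# The converted key starts with a PEM header
head -n 1 "$workdir/id_dst_rsa.pub.pkcs8"
```

The last command prints `-----BEGIN PUBLIC KEY-----`, the header `openssl` expects for a public key passed with `-pubin`.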
Generate a 256 bit (32 byte) random symmetric key
There is a limit to the maximum length of a message, i.e. the size of a file, that can be encrypted using asymmetric RSA public key encryption keys (which is what SSH keys are). For this reason, you should rather use a 256 bit key for symmetric AES encryption, and then encrypt/decrypt that symmetric AES key with the asymmetric RSA keys. This is how encrypted connections usually work, by the way.
Generate the unique symmetric key `key.bin` of 32 bytes (i.e. 256 bit).
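A possible way to generate it, assuming the key is later handed to `openssl enc` through `-pass file:./key.bin`. The base64 encoding is a choice made in this sketch: `key.bin` then holds the 32 random bytes in encoded form, on a single text line.

```shell
# 32 random bytes, base64-encoded so that the key fits on one text line
# (openssl only reads the first line of a file given through -pass file:)
openssl rand -base64 32 > key.bin
# Keep the key private while it exists
chmod 600 key.bin
```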
You should only use this key once. If you send something else to the recipient at another time, you should regenerate another key.
Encrypt the (potentially big) file with the symmetric key
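The file is encrypted with AES-256 in CBC mode, the passphrase being read from `key.bin`: `openssl enc -aes-256-cbc -salt -in <file> -out <file>.enc -pass file:./key.bin`.
Note: for your tests, you can quickly generate random files of 1 GiB size.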
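Such test files can for instance be produced with `dd`. A minimal sketch; the helper name is hypothetical and the size is given in MiB so that a small file can be used for a quick check:

```shell
# Hypothetical helper: create a file of <size> MiB filled with random data
make_random_file() {
    out="$1"; size_mib="$2"
    dd if=/dev/urandom of="$out" bs=1048576 count="$size_mib" 2>/dev/null
}

# Small file for a quick test
make_random_file bigfile_1MiB.dat 1
```

A 1 GiB file is obtained with `make_random_file bigfile_1GiB.dat 1024`.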
Indicative encryption times for the above random files are given in the table below, using
openssl enc -aes-256-cbc -salt -in bigfile_<N>GiB.dat -out bigfile_<N>GiB.dat.enc -pass file:./key.bin
File | Size | Encryption time |
---|---|---|
bigfile_1GiB.dat | 1 GiB | 0m5.395s |
bigfile_10GiB.dat | 10 GiB | 2m50.214s |
Encrypt the symmetric key with `openssl rsautl -encrypt`, using your collaborator’s public SSH key in PKCS8 format.
Delete the unencrypted symmetric key (`rm key.bin`), as you don’t need it any more and should not reuse it.
Now you can transfer the `*.enc` files, i.e. send the (potentially big) encrypted file `<file>.enc` and the encrypted symmetric key (i.e. `key.bin.enc`) to the recipient, i.e. your collaborator.
If you’re allowed to, transfer them by SSH to an agreed remote server. It is even safe to upload the files to a public file sharing service and tell the recipient to download them from there.
Decrypt a file encrypted with a public SSH key
First decrypt the symmetric key with `openssl rsautl -decrypt`, using the SSH private counterpart.
Now the (potentially big) file can be decrypted with `openssl enc -d`, using the symmetric key.
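The whole exchange, from encryption to decryption, can be replayed locally with the sketch below. It assumes a throwaway RSA key pair generated with `openssl` in place of the real SSH keys (so that the private key is in PEM format and the public key in PEM PKCS8 format); all file names are examples.

```shell
workdir="$(mktemp -d)" && cd "$workdir"

# Throwaway RSA key pair standing in for the recipient's SSH keys
openssl genrsa -out id_dst_rsa 2048 2>/dev/null
openssl rsa -in id_dst_rsa -pubout -out id_dst_rsa.pub.pkcs8 2>/dev/null

echo "sensitive content" > data.txt

# --- Sender side ---
openssl rand -base64 32 > key.bin
openssl enc -aes-256-cbc -salt -in data.txt -out data.txt.enc -pass file:./key.bin
openssl rsautl -encrypt -pubin -inkey id_dst_rsa.pub.pkcs8 -in key.bin -out key.bin.enc
rm key.bin

# --- Recipient side ---
openssl rsautl -decrypt -inkey id_dst_rsa -in key.bin.enc -out key.bin
openssl enc -d -aes-256-cbc -in data.txt.enc -out data.decrypted.txt -pass file:./key.bin

# The decrypted file must be identical to the original one
cmp data.txt data.decrypted.txt && echo "round trip OK"
```

Recent OpenSSL releases may print warnings on the way (deprecated key derivation for `enc`, `rsautl` superseded by `pkeyutl`); the commands are kept here as they appear in this post.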
Misc
For a ‘quick and dirty’ encryption/decryption of small files:
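One way to do this, sketched below under the assumption that the file is small enough for direct RSA encryption, skips the symmetric key entirely. The throwaway key pair and the file names are placeholders.

```shell
workdir="$(mktemp -d)" && cd "$workdir"

# Throwaway RSA key pair standing in for the recipient's keys
openssl genrsa -out id_dst_rsa 2048 2>/dev/null
openssl rsa -in id_dst_rsa -pubout -out id_dst_rsa.pub.pkcs8 2>/dev/null

# With a 2048-bit key and the default PKCS#1 v1.5 padding,
# the payload must not exceed 245 bytes
echo "short secret" > small.txt

# Encrypt with the public key, decrypt with the private key
openssl rsautl -encrypt -pubin -inkey id_dst_rsa.pub.pkcs8 -in small.txt -out small.txt.enc
openssl rsautl -decrypt -inkey id_dst_rsa -in small.txt.enc -out small.decrypted.txt

cmp small.txt small.decrypted.txt && echo "round trip OK"
```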
[1] PEM, defined in RFCs 1421 through 1424, is a container format for public/private keys or certificates used preferentially by open-source software such as OpenSSL. The name comes from Privacy Enhanced Mail (PEM), a failed method for secure email; the container format it used lives on, and is a base64 translation of the X.509 ASN.1 keys.