Vault Deployment on AWS
Vault Cluster Setup
-
Our vault uses consul as the storage backend and is deploy on baremetal instead of inside a container, as per recommendation.
-
I build our vault AMI based on the example from hashicorp.
-
Because vault is an internal service, I decide to use the default internal domain name vault.service.consul as its domain name. Note that in hashicorp’s example, dnsmasq, the dns forwarder, is set to forward .consul domains to consul’s DNS interface already.
-
Use your own CA to generate vault’s server certificate and private key with your internal domain name, and update
ca_public_key_path
,tls_public_key_path
,tls_private_key_path
in vault-consul.json -
The vault cluster terraform is very similar to the example cluster setup, minus the consul and vpc setup. Make sure you set ami_id, consul_cluster_tag_key, and consul_cluster_tag_value to your own value
-
Once terraform is run, log into any instance and init/unseal as normal. On the vault instance, you should be able to resolve vault.service.consul to the private AWS IP already
Vault Consumer ECS Cluster Setup
-
Because we need to set up consul agent and dnsmasq to resolve the internal domain name, we still need to build our custom AMI based on hashicorp’s example. Note that this packer template does not have pip and aws installed, I have to modify the install script to ensure that pip and aws are available on the sudo path, i.e., add symbolic link from /usr/local/bin to /usr/bin
-
The ecs cluster terraform setup is similar to the module. However, I have to modify the user_data in the launch configuration to include
#set up ECS agent
export ECS_CLUSTER=$MY_ECS_CLUSTER
cat <<'EOF' >> /etc/ecs/ecs.config
ECS_CLUSTER=$MY_ECS_CLUSTER
ECS_LOGFILE=/var/log/ecs/ecs-agent.log
EOF
#start consul agent
set -e
exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
/opt/consul/bin/run-consul --client --cluster-tag-key CONSUL_TAG_KEY --cluster-tag-value CONSUL_TAG_VALUE
Vault-consuming ECS Container Setup
-
Each service should have its own ECS Task role, ideally one for different enviroment. This ecs task role is mainly for vault’s IAM auth, because it is very hard to solve the secret bootstrap problem with token-based auth.
-
Because we use dns forwarding at the ami level, our service container needs to be deployed in the host mode
-
When we build the docker image
- Because we use a private domain for vault, we need to add our own ca.cert to the java trust store, i.e.,
ADD ca.crt.pem .
RUN mkdir -p $JAVA_HOME/lib/security && \
yes "yes" | $JAVA_HOME/bin/keytool -import -trustcacerts -alias $CA_ALIAS -file ca.crt.pem \
-keystore $JAVA_HOME/lib/security/cacerts --storepass $TRUST_PW
* Add our new cert to the java class path, i.e.,
-Djavax.net.ssl.trustStore=$JAVA_HOME/lib/security/cacerts
-Djavax.net.ssl.trustStorePassword=$TRUST_PW