Our friends at AWS published a great blog this week about working with Chef, Lambda and CloudWatch. We’re delighted to share Josh’s best practices around leveraging AWS Services like Lambda and CloudWatch to get the most out of Chef on AWS.
Originally posted here: https://aws.amazon.com/blogs/apn/automatically-delete-terminated-instances-in-chef-server-with-aws-lambda-and-cloudwatch-events/
The post and reference code assume that you’re using Chef Server version 12. You’ll need to create a user in Chef Server with the ability to query for and delete nodes. You’ll also need the private key (.pem file) for this user, which we will encrypt in the next section. I recommend creating a new, dedicated user for this rather than reusing an existing one.
The Lambda function described below, and in the sample code, expects every node/instance managed by Chef to have an attribute called ec2_instance_id with a value set to the EC2 instance ID (e.g., i-abcde123). Alternatives to this approach are described in the sample code, but ultimately, as long as you’re storing instance IDs somewhere in Chef Server, this approach should work.
You’ll also need to create an IAM role for your Lambda function and add the AWSLambdaBasicExecutionRole AWS managed policy to the role. You shouldn’t need to add any other policies to the role.
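If you would rather script the role than create it in the console, here is a minimal sketch using boto3. The role name chef-node-cleanup and the helper create_lambda_role are my own placeholders, not part of the original sample code; the trust policy shown is the standard one that lets the Lambda service assume a role.

```python
import json

# AWS managed policy named in the text above.
BASIC_EXECUTION_POLICY_ARN = (
    "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
)

# Trust policy that lets the Lambda service assume the role.
LAMBDA_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_lambda_role(iam_client, role_name="chef-node-cleanup"):
    """Create the role, attach the basic execution policy, return the role ARN.

    iam_client is expected to be boto3.client("iam"); the role name is a
    hypothetical placeholder.
    """
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(LAMBDA_TRUST_POLICY),
    )
    iam_client.attach_role_policy(
        RoleName=role_name, PolicyArn=BASIC_EXECUTION_POLICY_ARN
    )
    return response["Role"]["Arn"]
```

With real credentials this would be called as create_lambda_role(boto3.client("iam")). Passing the client in keeps the helper easy to exercise against a stub.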
Chef Server uses public key cryptography to authenticate API requests. This requires the client to sign a hash of requests using a valid private key. In this example, we’ll use KMS to encrypt a copy of our Chef Server certificate (with its associated private key) and then decrypt it on the fly with the Lambda function as needed. This allows us to safely store our Chef Server credentials at rest in encrypted form without the risk of unauthorized users discovering the decryption key needed to access the credentials.
aws kms encrypt --key-id KEY_ID_FROM_STEP_1 --plaintext file://your_private_key.pem
3. You will receive a response with a CiphertextBlob if successful. A successful response looks like this:
{
  "KeyId": "arn:aws:kms:us-east-1:123456789000:key/14d2aba8-5142-4612-a836-7cf17284c8fd",
  "CiphertextBlob": "CiCgJ6/K9CIXrDdsJ1fES7kBIJ0STEn+VwpMBjzsHVnH2xKQAQEBAgB4oCevyvQiF6w3bCdXxEu5ASCdEkxJ/lcKTAY87B1Zx9sAAABnMGUGCSqGSIb3DQEHBqBYMFYCAQAwUQYJKoZIhvcNAQcBMB4GCWCGSAFlAwQBLjARBAyk4nsWzRAWTiU4syoCARCAJDHOtYNdSYI6wlso8SgATXKJ0WF5s3qhLcVqMKxaTOO3bCI6Lw=="
}
4. Copy this CiphertextBlob into a new file and store it in the same directory as the Lambda function; this is required so it can be packaged up with the function itself. I’ve used encrypted_pem.txt as the file name in my example, given that the encrypted object is a certificate and private key, which is commonly named with the .pem file extension. Note that the AWS CLI base64-encodes the CiphertextBlob in its output; if you store it as a binary file instead, pass that file to later commands with the fileb:// prefix. See the AWS KMS CLI help for more information on input and output encoding.
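The same encryption step can be done from Python instead of the CLI. This is a sketch under assumptions, not the post's reference code: encrypt_pem is a hypothetical helper, kms_client is expected to be boto3.client("kms"), and the output file name follows the encrypted_pem.txt convention used above.

```python
import base64


def encrypt_pem(kms_client, key_id, pem_path, out_path="encrypted_pem.txt"):
    """Encrypt a private key file with KMS and store the result base64-encoded.

    boto3 returns CiphertextBlob as raw bytes, so it is base64-encoded here to
    match the text format the AWS CLI prints.
    """
    with open(pem_path, "rb") as f:
        plaintext = f.read()
    response = kms_client.encrypt(KeyId=key_id, Plaintext=plaintext)
    encoded = base64.b64encode(response["CiphertextBlob"]).decode("ascii")
    with open(out_path, "w") as f:
        f.write(encoded)
    return encoded
```

Keep in mind that direct KMS encryption only accepts small payloads (up to 4 KB), which is normally enough for a single private key.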
Your Lambda function will need to authenticate with your Chef Server in order to query for and delete nodes. The Chef Server authentication documentation explains how to authenticate your client (the Lambda function). Note that, as of this post, the recommended Python module (PyChef) does not support Amazon Linux, which Lambda runs on, but a quick fix can be applied before uploading your Lambda function to correct this. See this comment or the sample code for more information. You can also implement your own authentication provider in the language of your choice using the Chef documentation.
To make requests to the Chef Server you’ll need its URL, the username of a user with the permissions described above, and the private key we encrypted with KMS. Your function should already have the permissions required to decrypt the private key if you followed the previous sections, so you only need to add the code that decrypts it. You can find an example of decrypting with a KMS key in the KMS documentation, or check out the Slack Integration Blueprints for AWS Lambda for a great example. The only thing left to do is to query for and delete the nodes from Chef Server. I’ll discuss that next.
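As a rough illustration of the decryption step, the sketch below reads the packaged file and asks KMS to decrypt it. The helper name load_private_key is hypothetical; it assumes the file holds the base64 text produced by the CLI and that kms_client is boto3.client("kms").

```python
import base64


def load_private_key(kms_client, encrypted_path="encrypted_pem.txt"):
    """Return the decrypted PEM text for the Chef user.

    Assumes the file contains the base64-encoded CiphertextBlob; KMS needs the
    raw bytes, so the text is decoded before the decrypt call.
    """
    with open(encrypted_path) as f:
        ciphertext = base64.b64decode(f.read().strip())
    response = kms_client.decrypt(CiphertextBlob=ciphertext)
    # boto3 returns the plaintext as bytes.
    return response["Plaintext"].decode("utf-8")
```

Decrypting at invocation time means the plaintext key lives only in the function's memory and never has to be written to disk or checked into the deployment package.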
The CloudWatch Event rule, which we will set up last, will give us the instance ID of the terminated instance when triggered. No other unique identifying attributes (e.g., IP addresses or FQDN) will be given, so we need to make sure that the instance ID is present within Chef’s metadata. This is why I previously recommended that you include the ec2_instance_id attribute for each Chef node, if you aren’t already storing the instance IDs elsewhere. Again, there are other approaches, such as using AWS Config, for retrieving metadata about the terminated instance to use to identify a node.
The Lambda function should look at the event object passed in, which will contain data from CloudWatch Events, and parse it to retrieve the instance ID. In Python, this can be done with event['detail']['instance-id']. With the instance ID, you can now make your Chef Server request to delete that node. For example, in PyChef, you can use either the Nodes or the Search interface, as appropriate, to find the nodes and then delete them.
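Here is one way the lookup and delete could be sketched with PyChef's Search and Node interfaces. The helper names are my own, the query assumes the ec2_instance_id attribute is searchable as a top-level node attribute, and delete_nodes must run while a configured chef.ChefAPI is active (built from your server URL, decrypted key, and username). The search and node classes are injectable only so the logic can be exercised without a Chef Server.

```python
def get_instance_id(event):
    """Return the EC2 instance ID from a CloudWatch Events state-change event."""
    return event["detail"]["instance-id"]


def delete_nodes(instance_id, search_cls=None, node_cls=None):
    """Delete every Chef node whose ec2_instance_id matches; return their names.

    By default this uses PyChef's Search and Node classes, which rely on an
    active chef.ChefAPI context set up by the caller.
    """
    if search_cls is None or node_cls is None:
        import chef  # PyChef; imported lazily so it is only needed at call time

        search_cls = search_cls or chef.Search
        node_cls = node_cls or chef.Node

    deleted = []
    # Each search row exposes the matching node through its .object property.
    for row in search_cls("node", "ec2_instance_id:" + instance_id):
        name = row.object.name
        node_cls(name).delete()
        deleted.append(name)
    return deleted
```

A handler would typically call get_instance_id(event) first and pass the result to delete_nodes, logging the returned names. An empty list simply means the terminated instance was never registered with Chef.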
To deploy the Lambda function, you can use AWS CloudFormation or your tool of choice. I recommend using a tool like CloudFormation so you can easily deploy the function in multiple regions and automatically version your function as changes are made. The sample code uses HashiCorp’s Terraform to deploy the Lambda function and provision other infrastructure components like IAM roles and the CloudWatch Event rule, which we’ll discuss next.
To tie it all together, we need a CloudWatch Event that triggers the Lambda function created above whenever an instance is terminated. From the CloudWatch console:
Now any time an instance is terminated, the Lambda function will run and delete the node from Chef Server, which keeps things neat and tidy.
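For reference, the rule can also be created programmatically. The sketch below is an assumption-laden alternative to the console, not the sample code's Terraform: the rule name and the helper create_termination_rule are placeholders, and events_client is expected to be boto3.client("events"). The event pattern matches EC2 state-change notifications for instances entering the terminated state.

```python
import json

# Match EC2 state-change notifications for terminated instances only.
TERMINATED_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["terminated"]},
}


def create_termination_rule(events_client, lambda_arn, rule_name="chef-node-cleanup"):
    """Create the rule, point it at the Lambda function, return the rule ARN.

    The rule name and target Id are hypothetical placeholders.
    """
    response = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(TERMINATED_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": lambda_arn}],
    )
    return response["RuleArn"]
```

When you create the rule outside the console, the Lambda function also needs a resource-based permission that allows events.amazonaws.com to invoke it; the console normally adds this for you.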
By combining the forces of AWS Lambda, CloudWatch Events, and KMS, we’ve created a simple solution to keep our Chef Server organized and up to date automatically. You can also apply this process to other situations that require automatic cleanup after terminated instances. This is just another example that demonstrates the usefulness of combining AWS services to simplify maintaining your DevOps infrastructure.