In the cloud world, companies must be vigilant about the risks they take on when placing their code and infrastructure in the hands of others. Any time you place your business data in the hands of a third party, there's risk. Providers like Amazon, Rackspace or VMware have a certain amount of credibility behind their names, but some risk is still inherent. This is especially true with cloud providers, where the implementation and security behind the scenes are usually not visible to the end customer. You have to trust the vendor.
At CloudBees, we have a number of security measures in place to help safeguard your applications and code against external threats. Some of these are practices we've honed over the past few years; others are best practices that everyone should be following. In this post, I'll share a few of the things we're doing to help safeguard our customers.
While we have equipment in multiple data centers worldwide, a good portion of our Platform as a Service (PaaS) cloud environment runs on Amazon Web Services (AWS), so AWS will be the focus of this post.
AWS Credential Management
Our service offerings have been developed over the past few years and, as for many others who have spent a few years in Amazon's cloud, our approach to credential management has evolved along the way. Originally, Amazon offered one set of credentials that was universal across your AWS account. To tackle security, a lot of people (CloudBees included) kept multiple Amazon accounts as a layer of separation between services and access needs. In 2010, Amazon released a more fine-grained credential and access management system called AWS Identity and Access Management (IAM).
We have spent a significant amount of time revising our access management system to take advantage of this system. As noted, previously there was one centralized set of credentials for a specific AWS account. This meant that every developer who needed access had to be given these credentials. In addition, all of our services that utilized the EC2 API also had to have these credentials distributed to the instances they ran on.
I don't think I need to explain why having a single key to the kingdom is not the greatest. For starters, if a developer left the company, all of the credentials in circulation would have to be changed. Every developer would need to be given a new set of credentials, and every application would need to be updated with the new set. While we make use of automation, this is certainly not a desirable scenario.
When the company first started, the number of developers was small, and everyone did everything. However, as more people joined the team with different access needs, giving everyone full access to all systems became not only a security risk but also a risk to day-to-day development.
Over the past few months, we've done an extensive audit and redeployment using both developer- and service-specific credentials throughout our system. Not only can we now have specific credentials for each developer and each service, but we can also lock those credentials down to minimize security risks.
One such example is our DNA service. This is an internal-facing service that we use to monitor and manage instance and service health. The DNA application needs the ability to access instance lists, start and stop instances, and update IP address information, amongst other things. Not only does DNA now have its own service-specific credentials, but those credentials are locked to a single fixed IP address. AWS will not accept commands using those credentials unless they originate from that IP address. Now we worry much less about those credentials being used maliciously.
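To make the IP lockdown concrete, here is a minimal sketch of what such a policy document could look like. It is not our actual DNA policy: the action list and the address 203.0.113.10 (a documentation-only IP) are assumptions for illustration, and the helper name `build_ip_locked_policy` is hypothetical. The `IpAddress` condition operator and the `aws:SourceIp` key are standard IAM policy elements.

```python
import json


def build_ip_locked_policy(allowed_actions, source_ip):
    """Return an IAM policy document allowing the actions only from one IP."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(allowed_actions),
                "Resource": "*",
                # Requests from any other address do not match this statement,
                # so they fall through to IAM's default deny.
                "Condition": {"IpAddress": {"aws:SourceIp": f"{source_ip}/32"}},
            }
        ],
    }


# Assumed action list: roughly "list, start, stop, update IP information".
DNA_ACTIONS = [
    "ec2:DescribeInstances",
    "ec2:StartInstances",
    "ec2:StopInstances",
    "ec2:AssociateAddress",
]

policy = build_ip_locked_policy(DNA_ACTIONS, "203.0.113.10")
print(json.dumps(policy, indent=2))
```

Because the statement only allows, anything not listed (and anything from another address) is refused by default, which is the behavior described above.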
With developer-specific credentials comes the ability to rotate and disable access much more easily. We can now quickly remove access for a specific set of developer credentials without impacting other services or developers in the process. We can also limit each developer's access to the pieces of infrastructure they need in order to do their work. Our major concern isn't a rogue developer causing issues; we worry much more about someone's laptop being stolen, or someone at a coffee shop seeing login credentials on a screen. Restricted access also limits what a developer can do accidentally, if they target the wrong thing or try something without completely understanding the potential outcome.
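As an illustration of disabling one developer's access without touching anyone else's, the sketch below deactivates every active access key belonging to a single IAM user. It assumes a client object shaped like boto3's IAM client (`list_access_keys` and `update_access_key` are real boto3 IAM calls); the function name and the user name in the usage comment are hypothetical, and this is not a description of our internal tooling.

```python
def deactivate_developer_keys(iam_client, user_name):
    """Mark every active access key of one IAM user as inactive.

    Returns the list of access key IDs that were deactivated.
    """
    response = iam_client.list_access_keys(UserName=user_name)
    deactivated = []
    for key in response["AccessKeyMetadata"]:
        if key["Status"] == "Active":
            # Inactive keys are rejected by AWS but can be re-enabled later,
            # which is gentler than deleting them outright.
            iam_client.update_access_key(
                UserName=user_name,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",
            )
            deactivated.append(key["AccessKeyId"])
    return deactivated


# Example usage (requires boto3 and administrative credentials):
#   import boto3
#   deactivate_developer_keys(boto3.client("iam"), "some-developer")
```

Only the named user's keys are touched, so other developers and services keep working.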
Rolling out new credentials to all developers and services is not easy. This process has required considerable planning and execution, and you have to be ready for some of the potential pitfalls. For one, people with restricted access may no longer be able to react to major system issues when they crop up. Limiting service access can also cause problems later. For example, if new features are added that make use of restricted API calls, nobody may remember that those calls are restricted, and a significant amount of debugging time may be spent trying to figure out why things don't work.
You must also think through scenarios where developers still have access to change other developers' or services' permissions. For example, locking down our DNA service to a specific IP address increases security, but if any developer can go in later and change that lockdown, it may not be obvious that the change ever happened. During some of our initial audits, we found a lot of security-related changes that had been made ad hoc to quickly get something broken working again, and that were never reversed afterward.
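One way to shorten that kind of debugging session is to make permission failures announce themselves. The sketch below is an assumption about how that might be done, not something we ship: it takes the error code string that AWS returns and produces a message pointing at IAM when the code indicates a permissions problem. The function name and the set of codes checked are illustrative; `UnauthorizedOperation` is the code EC2 uses, while `AccessDenied` and `AccessDeniedException` appear in other services.

```python
# Error codes AWS commonly returns when credentials lack a permission.
PERMISSION_ERROR_CODES = {
    "UnauthorizedOperation",
    "AccessDenied",
    "AccessDeniedException",
}


def describe_failure(service_name, operation, error_code):
    """Turn an AWS error code into a log message that flags IAM restrictions."""
    if error_code in PERMISSION_ERROR_CODES:
        return (
            f"{service_name} was refused {operation}: these credentials are "
            "not permitted to make this call. Check the IAM policy before "
            "debugging the code."
        )
    return f"{service_name} call {operation} failed with {error_code}"


print(describe_failure("dna", "ec2:RebootInstances", "UnauthorizedOperation"))
```

With boto3, the error code would typically come from `error.response["Error"]["Code"]` on a caught `ClientError`.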
As a result, part of our policy now is to disallow developers and services from making AWS Identity and Access Management (IAM) changes. Those changes are handled by a group of three administrators, and via an administrative account only. This account is the only one able to make IAM changes (controlling access to this account will be discussed in another blog post).
By distributing credentials in this manner, we feel we have much better protection of our infrastructure in the cloud, which in turn allows us to keep our customers' data more secure.
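A rule like this can be expressed as an explicit deny added to every developer and service policy. The following is a sketch under assumptions rather than our actual policy: the statement ID, the helper name `with_iam_lockdown`, and the sample developer policy are all hypothetical. It relies on standard IAM behavior, where an explicit `Deny` overrides any `Allow`.

```python
import copy
import json

# Denies every IAM action, including read-only ones. A real policy might
# deny only the mutating actions; this blunt version is for illustration.
IAM_LOCKDOWN_STATEMENT = {
    "Sid": "DenyIamChanges",
    "Effect": "Deny",
    "Action": "iam:*",
    "Resource": "*",
}


def with_iam_lockdown(policy):
    """Return a copy of a policy document with the IAM deny statement added."""
    locked = copy.deepcopy(policy)
    locked.setdefault("Statement", []).append(copy.deepcopy(IAM_LOCKDOWN_STATEMENT))
    return locked


# Hypothetical developer policy granting read-only EC2 access.
developer_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "ec2:Describe*", "Resource": "*"}
    ],
}

locked_policy = with_iam_lockdown(developer_policy)
print(json.dumps(locked_policy, indent=2))
```

Only the administrative account would be issued a policy without this statement.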
Stay tuned for Part 2, where I will discuss how we manage remote access and network traffic security.
Caleb Tennis, Elite Developer
Read Parts 2 and 3 in Caleb's Securing the Cloud blog series: