Robert Pirsig’s seminal 1974 book, Zen and the Art of Motorcycle Maintenance, teaches us that quality is an attribute inherent to a work product. You cannot take something of low quality and add quality to it later. In software, this is why industry “QA teams” are problematic: you cannot charge a single team, and only that team, with accountability for software quality. What about the developers who wrote the program code in the first place? If you regard information security as just one aspect of quality, it follows that you cannot design a system and simply bolt on security.
In this post, I’ll discuss why today’s approaches for attempting to enforce information security when none existed during the design and implementation phases are doomed to fail. Jim Bird’s recent DZone article, “Towards Compliance as Code”, captures our views on this topic. We believe that, as a result of Chef’s extensible, non-ontology design, it is perfectly capable of realizing Bird’s notion of Compliance as Code. This is what we call Compliance at Velocity: making changes at the speed of business without giving up an inch of compliance.
Compliance regimes such as PCI, SOX, and HIPAA have been around for some time, but there is actually very poor “compliance with compliance”. Verizon’s annual PCI Compliance Report indicates that only about 25-30% of businesses required to be PCI compliant are actually fully compliant. Worse, that figure represents compliance only at the time of audit. Compliance drops after the auditors leave and companies go back to doing business as usual.
Existing compliance solutions rely mostly on post-hoc verification. In fact, some of these concepts are built right into the compliance regimes themselves. Take the infrastructure scans that an approved scanning vendor must perform as part of PCI-DSS. While this sounds good on paper, there is no way that external “black box” scanning can surface every vulnerability and attack surface, and under PCI-DSS it probably isn’t intended to. It doesn’t defend against internal actors who gain authorized access to a system through proper mechanisms and then use some local weakness on the server to compromise it. External scanning would not have caught the rogue agents who compromised Target’s POS system, for example.
More seriously, though, external scanning is the wrong approach to information security and compliance. As mentioned earlier, the only way to achieve quality is to bind it closely to the actual system under management and the applications running on it. Otherwise, whenever you change the system, your security will by definition lag behind, because it is an afterthought.
This is fundamentally what drove the design and implementation of Chef’s “audit mode” functionality: making the controls for testing and asserting compliance live alongside the same code you’re using to apply policy. We could only do this because Chef was explicitly designed not to be an ontology, which lets us easily define new language components (the control and control_group directives) to represent compliance controls. As a real-life example, we’ve translated the Center for Internet Security (CIS) Benchmarks into Chef audit-mode language, so that you can assess the compliance of your systems, for free, against their security recommendations.
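To make this concrete, here is a minimal sketch of what an audit-mode control looks like in a recipe. It assumes chef-client 12.1 or later running with audit mode enabled; the group and control names are illustrative, not taken from the CIS Benchmarks.

```ruby
# Evaluated when chef-client runs with --audit-mode enabled.
# Inside a control you can use Serverspec-style resources and matchers.
control_group 'SSH hardening' do
  control 'sshd configuration' do
    it 'disables root login over SSH' do
      expect(file('/etc/ssh/sshd_config').content).to match(/^PermitRootLogin no/)
    end

    it 'disables password authentication' do
      expect(file('/etc/ssh/sshd_config').content).to match(/^PasswordAuthentication no/)
    end
  end
end
```

Because these controls live in the same cookbook as the recipes that configure sshd, the compliance assertions evolve in lockstep with the policy they verify, rather than trailing behind it.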
Compliance has both a build-time and a run-time component, and we have only begun to scratch the surface of the former by building compliance-checking capabilities right into Chef Delivery. It should theoretically be possible to build additional helpers, or syntactic sugar, into Delivery’s recipe language to represent compliance checks, or to invoke additional ecosystem tools, such as HP Fortify Static Analysis, within the context of Chef Delivery for this purpose.
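As a thought experiment, such sugar might read something like the following. To be clear, this is purely hypothetical: the `compliance_check` helper and its properties do not exist in Delivery today; the sketch only illustrates the kind of declarative shape a build-time compliance phase could take.

```ruby
# Hypothetical Delivery build-cookbook phase (not a real API today):
# declare which benchmark to assess and what should fail the pipeline.
compliance_check 'pci-dss-build-gate' do
  benchmark 'cis-rhel7'          # hypothetical: a translated CIS benchmark
  fail_build_on :any_failure     # hypothetical: gate promotion on results
end
```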
Conclusions
Solving compliance with code was clearly not a use case contemplated when Chef was originally written back in 2008 as a configuration management tool. However, Chef’s native extensibility allows us to implement such features relatively easily. It is for this reason that we increasingly regard Chef as a correctness management platform, because the “configurations” we are discussing are no longer files on disk or services running on a server, but facets of your entire enterprise, for which you could theoretically write correctness rules to describe anything.
Describing enterprise correctness as code that can then be tested is an extremely powerful concept: testing leads to fewer errors, which lead to less downtime and fewer security breaches, which in turn lead to the ability to make changes at the enterprise level with higher velocity. It’s like a flywheel that keeps spinning faster, and it flies in the face of the traditional belief that achieving safety requires slowing things down. We invite you to join us on this exciting journey.