Thursday, April 30, 2009

Are configuration management tools still needed in the cloud?

Cloud is the buzzword of the year, and with the Ubuntu Enterprise Cloud available in Ubuntu 9.04, everyone will be able to build their own private cloud to experiment with. As a cloud-based infrastructure makes the computing infrastructure more flexible and dynamic, it seems that configuration management tools will become more and more important in the future.

What's the purpose of a configuration management tool?


According to Wikipedia:
Configuration management (CM) is a field of management that focuses on establishing and maintaining consistency of a system's or product's performance and its functional and physical attributes with its requirements, design, and operational information throughout its life. For information assurance, CM can be defined as the management of security features and assurances through control of changes made to hardware, software, firmware, documentation, test, test fixtures, and test documentation throughout the life cycle of an information system

While the definition above is quite generic, system administrators use configuration management tools such as Puppet or Cfengine to automate system deployments and to make sure that every instance providing a specific service has the same configuration. These tools also automatically distribute configuration changes to all running systems.

How does this apply to a cloud infrastructure?


The cloud model as implemented by the Ubuntu Enterprise Cloud is based on the golden image principle. Each system is based on a static image. The cloud infrastructure is then used to spawn new instances of a specific image. This is one of the characteristics of such an infrastructure: deploying new systems is easier, faster and cheaper. Potential resources are much larger than before.

However, one issue with the golden image model is that, over time, running systems drift away from the golden image. When a configuration update is made to the service, the offline golden image also needs to be updated. Moreover, a configuration management system is needed to push the changes to running systems.

Let's take the example of a web hosting infrastructure running 20 instances of an Apache server. How would a new virtual host be defined?

With a configuration management system, a new virtual host is defined in the central repository and the tool deploys the new virtual host definition to all running systems.
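As a rough illustration of that approach, the sketch below defines a virtual host once and computes what each of the 20 instances should receive. It is not how Puppet or Cfengine work internally: the host names, the `example.org` site and the helper functions are hypothetical, and the actual transport to each system is left out.

```python
from string import Template

# Hypothetical central definition of an Apache virtual host.
VHOST_TEMPLATE = Template(
    "<VirtualHost *:80>\n"
    "    ServerName $server_name\n"
    "    DocumentRoot $document_root\n"
    "</VirtualHost>\n"
)


def render_vhost(server_name, document_root):
    """Render the single, central virtual host definition."""
    return VHOST_TEMPLATE.substitute(
        server_name=server_name, document_root=document_root
    )


def plan_deployment(hosts, server_name, document_root):
    """Map every running system to the configuration it should receive."""
    config = render_vhost(server_name, document_root)
    return {host: config for host in hosts}


# Assumed inventory: 20 web instances named web01 .. web20.
hosts = [f"web{i:02d}" for i in range(1, 21)]
plan = plan_deployment(hosts, "example.org", "/var/www/example.org")
```

The point of the sketch is that the definition lives in one place and every instance ends up with an identical copy.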

Combining the golden image principle with the ease of deployment provided by a cloud infrastructure leads to a different workflow: define the new virtual host on one running system, update the golden image, spawn 20 new instances and swap them with the old ones in the web infrastructure.
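The swap described above can be modelled in a few lines. This is only a simulation under stated assumptions: `Instance`, `spawn` and `swap_pool` are hypothetical names, image revisions are plain integers, and no real cloud API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    image_revision: int


def spawn(image_revision, count):
    """Pretend to start `count` instances from a given golden image revision."""
    return [
        Instance(f"web-r{image_revision}-{i:02d}", image_revision)
        for i in range(1, count + 1)
    ]


def swap_pool(pool, new_revision):
    """Replace every instance in the pool with one built from the new image.

    Returns the new pool and the retired instances, which can be kept
    around for a while in case the change has to be rolled back.
    """
    replacements = spawn(new_revision, len(pool))
    return replacements, list(pool)


pool = spawn(1, 20)                  # 20 instances from golden image revision 1
pool, retired = swap_pool(pool, 2)   # golden image updated to revision 2
```

Keeping the retired instances until the new pool is verified is what makes the rollback discussed later cheap.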

It seems strange at first to re-bundle a new image and redeploy all of your servers just for a one-line configuration change. One reason may be that system re-installation has always been seen as a last-resort option in a traditional infrastructure. This assumption no longer holds in the cloud, where provisioning is fast and easy.

What are the advantages of the golden image pattern?


Rolling back a configuration change is much faster, as both revisions of the service are running at the same time. System administrators don't need to learn another tool and can just use their standard ways of administering a single server.

However some issues remain:

How about applying a configuration change to different golden images? A dozen images still need to be booted and the change made in each of them. We're back to square one. Configuration management tools have the concept of classes, and each system applies specific configurations according to the classes it belongs to. This avoids redundancy in the configuration definition. Having just a set of golden images creates redundant configuration. However, the number of images to change is much smaller than when dealing with hundreds of instances.
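The idea of classes can be sketched as follows. The class names, settings and systems are invented for illustration and do not follow the syntax of any particular tool.

```python
# Hypothetical classes: each setting is defined exactly once.
CLASSES = {
    "base": {"ntp_server": "ntp.example.org", "ssh_root_login": "no"},
    "webserver": {"apache_max_clients": 150},
    "database": {"mysql_max_connections": 200},
}

# Hypothetical systems and the classes they belong to.
SYSTEMS = {
    "web01": ["base", "webserver"],
    "db01": ["base", "database"],
}


def effective_config(system):
    """Merge the settings of every class the system belongs to, in order."""
    config = {}
    for cls in SYSTEMS[system]:
        config.update(CLASSES[cls])
    return config
```

Changing `ntp_server` in the `base` class affects both systems, whereas with golden images the same edit would have to be repeated in the web image and in the database image.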

How about tracking changes between configurations? Most configuration management tools suggest keeping the central repository under revision control so that changes made to the environment can easily be tracked. With golden images, we lack the tools to store multiple versions of an image and to perform image diffs: what is the difference between a running system and its base offline image, between two revisions of the same golden image, or between two running systems? Having access to such tools would help system administrators a great deal when debugging.
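One possible shape for such an image diff is to compare file manifests, that is, a mapping from each path to a digest of its content. The sketch below assumes the file contents have already been extracted from the images into dictionaries; reading a real image format is out of scope, and the function names and sample paths are hypothetical.

```python
import hashlib


def manifest(files):
    """Map each path to the SHA-256 digest of its content.

    `files` is a dict of path -> bytes, standing in for an image's filesystem.
    """
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_manifests(old, new):
    """List the paths added, removed or changed between two manifests."""
    old_paths, new_paths = set(old), set(new)
    return {
        "added": sorted(new_paths - old_paths),
        "removed": sorted(old_paths - new_paths),
        "changed": sorted(p for p in old_paths & new_paths if old[p] != new[p]),
    }


# Two made-up revisions of the same golden image.
revision_1 = {
    "/etc/hostname": b"web\n",
    "/etc/apache2/ports.conf": b"Listen 80\n",
}
revision_2 = {
    "/etc/hostname": b"web\n",
    "/etc/apache2/ports.conf": b"Listen 80\nListen 443\n",
    "/etc/apache2/sites-available/example.org": b"<VirtualHost *:80>\n</VirtualHost>\n",
}

changes = diff_manifests(manifest(revision_1), manifest(revision_2))
```

The same comparison would apply to a running system against its base image, or to two running systems, provided a manifest can be produced for each.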

In conclusion, configuration management tools have been used for some time by groups running big infrastructures with lots and lots of systems to manage. The dynamism of the cloud brings the same problems to its users, even if they only run a couple of instances. Configuration management tools should probably be considered essential when moving into the cloud.