Mathias' thoughts...

Thursday, April 8, 2010

Using puppet in UEC/EC2: Improving performance with Phusion Passenger

Now that we have an efficient process to start instances within UEC/EC2 and get them configured for their task by puppet we'll dive into improving the performance of the puppetmaster with Phusion Passenger.

Why?

The default configuration used by puppetmasterd is based on webrick which doesn't really scale well. One popular choice to improve puppetmasterd performance is to use mod passenger from the libapache2-mod-passenger package.

Apache2 setup

The configuration is based on the Puppet passenger documentation. It is available from the bzr branch as we'll use puppet to actually configure the instance running puppetmasterd.

The puppet module has been updated to make sure the apache2 and libapache2-mod-passenger packages are installed. It also creates the relevant files and directories required to run puppetmasterd as a rack application.

Passenger and SSL modules are enabled in the apache2 configuration. All of their configuration is done inside a virtual host definition. Note that the SSL options related to certificates and private keys files points directly to /var/lib/puppet/ssl/.

Apache2 is also configured to only listen on the default puppetmaster port by replacing apache2 default ports.conf and disabling the default virtual site.

Finally the configuration of puppetmasterd has been updated so that it can correctly process the certificate clients while being run under passenger.

Note that puppetmasterd needs to be run once in order to be able to generate its ssl configuration. This happens automatically when the puppetmaster package is installed since puppetmasterd is started during the package installation.

Deploying an improved puppetmaster

Log on the puppetmaster instance and update the puppet configuration using the bzr branch:

bzr pull --remember lp:~mathiaz/+junk/uec-ec2-puppet-config-passenger /etc/puppet/

Update the configuration:

sudo puppet --node_terminus=plain /etc/puppet/manifests/puppetmaster.pp

On the Cloud Conductor start a new instance with start_instance.py. If you're starting from scratch remember to update the start_instance.yaml
file with the puppetmaster CA and internal IP:

./start_instance.py -c start_instance.yaml AMI_NUMBER

Following /var/log/syslog on the puppetmaster you should see the new instance requesting a certificate:

Apr 8 00:40:08 ip-10-195-93-129 puppetmasterd[3353]: Starting Puppet server version 0.25.4
Apr 8 00:40:08 ip-10-195-93-129 puppetmasterd[3353]: 7d6b61a7-3772-4c41-a23d-471b417d9c47 has a waiting certificate request

Now that the puppetmasterd process is run by apache2 and mod-passenger you can check in /var/log/apache2/other_vhosts_access.logs.log the http requests made by the puppet client to get its certificate signed:

ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:06 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"
ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "GET /production/certificate_request/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"
ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "PUT /production/certificate_request/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 200 2082 "-" "-"
ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"
ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"

Once check_csr is run by cron the certificate will be signed and the puppet client is able to retrieve its certificate:

ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:42:08 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 200 2962 "-" "-"
ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:42:08 +0000] "GET /production/certificate_revocation_list/ca HTTP/1.1" 200 2450 "-" "-"

The puppet client ends up requesting its manifest:

ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:42:09 +0000] "GET /production/catalog/7d6b61a7-3772-4c41-a23d-471b417d9c47?facts_format=b64_zlib_yaml&facts=eNp [....] HTTP/1.1" 200 2354 "-" "-"

Conclusion

I've just outlined how to configure mod passeenger to run puppetmasterd which is a much more efficient setup than using the default webrick server. Most of the configuration is detailed in the files available in the bzr branch.

Wednesday, April 7, 2010

Using puppet in UEC/EC2: Node classification

In a previous article I discussed how to set up an automated registration process for puppet instances. We'll now have a look at how we can tell these instances what they should be doing.

Going back to the overall architecture the Cloud conductor is the component responsible for starting new instances. Of all the three components it's him that has the most knowledge about what an instance should be: it is the one responsible for starting a new instance after all.

Using S3 to store node definitions

We'll use the puppet external node feature to connect the Cloud conductor with the puppetmaster. The external node script -node_classifier.py - will be responsible for telling which classes each instance is supposed to have. Whenever a puppet client connects to the master the node_classifier.py script is called with the certificate name. It is responsible for providing a description of the classes, environments and parameters for the client on its standard output in a yaml format.

Given that the Cloud conductor creates a file with the certificate name for each instance it spawns we'll extend the start_instance.py script to store the node classification in the content of the file created in the S3 bucket.

You may have noticed that instances started by start_instance.py don't have an ssh public key associated with them. So we're going to create a login-allowed class that will install the authorized key for the ubuntu user.

Setup the puppetmaster to use the node classifier

We'll use the Ubuntu Lucid Beta2 image as the base image on which to build our Puppet infrastructure.

Start an instance of the Lucid Beta2 AMI using an ssh key. Once it's running write down its public and private DNS addresses. The public DNS address will be used to setup the puppetmaster via ssh. The private DNS address will be used as the puppetmaster hostname given out to puppet clients.

Log on the started instance via ssh to install and setup the puppet master:

Update apt files:

sudo apt-get update

Install the puppet and bzr packages:

sudo apt-get install puppet bzr

Change the ownership of the puppet directory so that the ubuntu user can directly edit the puppet configuration files:

sudo chown -R ubuntu:ubuntu /etc/puppet/

On the puppetmaster check out the tutorial3 bzr branch:

bzr branch --use-existing-dir lp:~mathiaz/+junk/uec-ec2-puppet-config-tut3 /etc/puppet/

You'll get a conflict for the puppet.conf file. You can ignore the conflict as the puppet.conf file from the branch is the one that supports an external node classifier:
bzr resolve /etc/puppet/puppet.conf

Edit the node classifier script scripts/node_classifier.py to set the correct location of your S3 bucket.

Note that the script is set to return 1 if the certificate name doesn't have a corresponding file in the S3 bucket. You may want to change the return code to 0 if you want to use the normal nodes definition. See the puppet external node documentation for more information.

The puppetmaster configuration in puppet.conf has been updated to use the external node script.

There is also the login-allowed class defined in the manifests/site.pp file. It sets the authorized key file for the ubuntu user.

On the puppetmaster edit manifests/site.pp to update the public key with your EC2 public key. You can get the public key from ~ubuntu/.ssh/authorized_key on the puppetmaster.

To bootstrap the new puppetmaster configuration run the puppet client:

sudo puppet --node_terminus=plain /etc/puppet/manifests/puppetmaster.pp

Note that you'll have to set the node_terminus to plain to avoid calling the node classifier script when configuring the puppetmaster itself. Otherwise the puppet run would fail since the puppetmaster certificate name (which defaults the to fqdn of the instance) doesn't have a corresponding file in the S3 bucket.

We have now our puppetmaster configured to look up the node classification for each puppet client.

Update start_instance.py to provide a node definition

It's time to update the Cloud conductor to provide the relevant node classification information whenever it starts a new instance.

Update the bzr branch on the Cloud conductor system:

bzr pull --remember lp:~mathiaz/uec-puppet-config-tut3

The start_instance.py script has been updated to write the node classification information when it creates the instance file in the S3 bucket. That information is actually set in the start_instance.yaml file under the node key. All of the node classification information expected by the puppetmaster from the external node classifier script is set under the node key in start_instance.yaml. See the puppet external node documentation for more information on the information that can be provided by the external node script.

Review the start_instance.yaml file to make sure the S3 bucket name, the puppetmaster server IP and CA certificate are still valid for your own setup.

Start an instance:

./start_instance.py -c start_instance.yaml AMI_NUMBER

Following /var/log/syslog you should see something similar to this:

Apr 7 19:15:37 domU-12-31-39-07-D6-52 puppetmasterd[1644]: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0 has a waiting certificate request

The instance has booted and registered with the puppetmaster.

Apr 7 19:16:01 domU-12-31-39-07-D6-52 CRON[2188]: (root) CMD (/usr/local/bin/check_csr --log-level=debug https://mathiaz-puppet-nodes-1.s3.amazonaws.com)
Apr 7 19:16:02 domU-12-31-39-07-D6-52 check_csr[2189]: DEBUG: List of waiting csr: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0
Apr 7 19:16:02 domU-12-31-39-07-D6-52 check_csr[2189]: DEBUG: Checking 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0
Apr 7 19:16:02 domU-12-31-39-07-D6-52 check_csr[2189]: DEBUG: Checking url https://mathiaz-puppet-nodes-1.s3.amazonaws.com/77ad2a3c-5d52-4ca7-9fea-b99b767b09d0
Apr 7 19:16:03 domU-12-31-39-07-D6-52 check_csr[2189]: INFO: Signing request: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0

The puppetmaster checked if the client request is expected and signs it.

Apr 7 19:17:39 domU-12-31-39-07-D6-52 node_classifier[2240]: DEBUG: Checking url https://mathiaz-puppet-nodes-1.s3.amazonaws.com/77ad2a3c-5d52-4ca7-9fea-b99b767b09d0
Apr 7 19:17:39 domU-12-31-39-07-D6-52 node_classifier[2240]: INFO: Getting node configuration: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0
Apr 7 19:17:39 domU-12-31-39-07-D6-52 node_classifier[2240]: DEBUG: Node configuration (77ad2a3c-5d52-4ca7-9fea-b99b767b09d0): classes: [login-allowed]
Apr 7 19:17:39 domU-12-31-39-07-D6-52 puppetmasterd[1644]: Compiled catalog for 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0 in 0.01 seconds

The puppetmaster compiled a manifest for the client according to the information provided by the node classifier script.

Make sure that the instance that has been started doesn't have any ssh key associated with it:

euca-describe-instances

Make a note of the instance ID and its public DNS name.

Login into the instance:

Run euca-get-console-output instance_ID to get the ssh fingerprint. You may need to scroll back to get the fingerprints.

Conclusion

The start_instance.py script is currently very simple and should be considered as a proof of concept.

Storing the node classification information into an S3 bucket makes it also easy to edit the content of the file. It also provides an easy way to get a list of the nodes that have been started by the Cloud Conductor as well as their classification.

If you look at the start_instance.py script you'll notice that the ACL on the S3 bucket is 'public-read'. That means anyone can read the list of your nodes as well as the list of classes and other node classification information for each of them. You may wanna use S3 private url instead.

We now have a puppet infrastructure where instances are started by a Cloud conductor in order to achieve a specific task. These instances automatically connect to the puppetmaster to get configured automatically for the task they've been created for. All of the instances configuration is stored in a reliable and scalable system: S3.

With instances being created on demand our puppet infrastructure can grow quickly. The puppetmaster can easily be responsible for managing hundreds of instances. Next we'll have a look at how improving the performance of the puppetmaster.

Tuesday, March 30, 2010

MySQL 5.1 Bug Zap: Bug day result

Today was targeted at looking through the mysql-dfsg-5.1 bugs to triage them. We ended up with all bugs having their importance set and their status set to Triaged or Incomplete.

Tomorrow will be dedicated to fixing most of them as well as some upgrade testing. I'll also have a look at the new mysql upstart job that has replaced the mysld_safe script.

Looking at the bugs today I've found a couple of bugs that look easy to fix:

Bug 552053: mysqld_safe should be available in mysql-server

Bug 498939: mysql- packages section on synaptic

To get started grab a copy of the package branch:

bzr init-repo mysql-dfsg-5.1/
cd mysql-dfsg-5.1/
bzr branch lp:ubuntu/mysql-dfsg-5.1

Fix a bug and push the branch to launchpad:

bzr push lp:~YOUR-LOGIN/ubuntu/mysql-dfsg-5.1/zap-bug-XXXXXX

And finish up by creating a merge proposal for the Lucid package branch. I'll take a look at the list of merge proposal throughout the day and include them in the upload schedule for tomorrow.

mysqld_safe should be available in mysql-server

Monday, March 29, 2010

Ubuntu Server Bug Zap: MySQL 5.1

Following up on the kvm and samba bug zap days I'm organizing a two day bug
zap around MySQL.

First phase: bug triaging

First in line is triaging all the bugs related to mysql-dfsg-5.1 package. As
of Tue Mar 30 00:23:04 UTC 2010 there are 27 bugs waiting to be looked at.

The goal is to have the importance set for all bugs and have as many bugs
status moved to either Triaged or Invalid/Won't Fix.

A few resources are available to help out:

The Debugging MySQL wiki page

The #ubuntu-server IRC channel

The up-to-date list of bugs that still need to be triaged.

Objective: get the list of bugs to zero.

Thursday, March 25, 2010

Using puppet in UEC/EC2: Automating the signing process

I outlined in the previous article how to setup a puppetmaster instance on UEC/EC2 and how to start instances that will automatically register with the puppetmaster. We're going to look at automating the process of signing puppet client certificate requests.

Overview

Our puppet infrastructure on the cloud can be broken down into three components:

The Cloud conductor responsible for starting new instances in our cloud.

A Puppetmaster responsible for configuring all the instances running in our cloud.

Instances acting as puppet clients asking to be setup correctly.

The idea is to have the Cloud conductor start instances and notify the puppetmaster that these new instances are coming up. The puppetmaster can then automatically sign their certificate requests.

We'll use S3 as the way to communicate between the Cloud conductor and the puppetmaster. The Cloud conductor will also assign a random certificate to each instance it starts.

The Cloud conductor will be located on a sysadmin workstation while the puppetmaster and instances will be running in the cloud. The bzr branch contains all the scripts necessary to setup such a solution.

The Cloud conductor: start_instance.py

Get the tutorial2 bzr on the Cloud conductor (an admin workstation):

bzr branch lp:~mathiaz/+junk/uec-ec2-puppet-config-tut2

In the scripts/ directory start_instance.py plays the role of the Cloud conductor. It creates new instances and stores their certname in S3. The start_instance.yaml configuration file provides almost the same information as the user-data.yaml file we used in the previous article.

Edit the start_instance.yaml file and update each setting:
- Choose a unique S3 bucket name.
- Use the private DNS hostname of the instance running the puppetmaster.
- Add the puppetmaster ca certificate found on the puppetmaster.

Make sure your AWS/UEC credentials are available in the environment. The start_instance.py uses these to access EC2 to start new instances and S3 to store the instance certificate names.

Start a new instance of the Lucid Beta1 AMI:

./start_instance.py -c ./start_instance.yaml ami-ad09e6c4

start_instance.py starts a new instance using the AMI specified on the command line. The instance user data holds a random UUID for the puppet client certificate name. start_instance.py also creates a new file in its S3 bucket named after the puppet client certificate name.

On the puppetmater looking at the puppetmaster log you should see a certificate request show up after some time:

Mar 19 19:09:33 ip-10-245-197-226 puppetmasterd[20273]: a83b0057-ab8d-426e-b2ab-175729742adb has a waiting certificate request

Automating the signing process on the puppetmaster

It's time to setup the puppetmaster to check if there are any certificate requests waiting and signs only the ones started by the Cloud conductor. We'll use the check_csr.py cron job that will get the list of waiting certificate requests via puppetca --list and checks whether there is a corresponding file in the S3 bucket.

On the puppetmaster get the tutorial2 bzr branch:

bzr pull --remember lp:~mathiaz/+junk/uec-ec2-config/tut2 /etc/puppet/

The puppetmaster.pp manifest has been updated to setup the check_csr.py cron job to run every 2 minutes. You need to update the cron job command line in /etc/puppet/manifests/puppetmaster.pp with your own S3 bucket name.

Update the puppetmaster configuration:

sudo puppet /etc/puppet/manifests/puppetmaster.pp

Watching /var/log/syslog you should see check_csr being run by cron every other minute:

Mar 19 19:10:01 ip-10-245-197-226 CRON[21858]: (root) CMD (/usr/local/bin/check_csr --log-level=debug https://mathiaz-puppet-nodes-1.s3.amazonaws.com)

check_csr gets the list of waiting certificate requests and checks if there is a corresponding file in its S3 bucket:

Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: DEBUG: List of waiting csr: a83b0057-ab8d-426e-b2ab-175729742adb
Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: DEBUG: Checking a83b0057-ab8d-426e-b2ab-175729742adb
Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: DEBUG: Checking url https://mathiaz-puppet-nodes-1.s3.amazonaws.com/a83b0057-ab8d-426e-b2ab-175729742adb

If so it will sign the certificate request:

Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: INFO: Signing request: a83b0057-ab8d-426e-b2ab-175729742adb

S3 bucket ACL

For now the S3 bucket ACL is set so that anyone can get the list files available in the bucket. However only authenticated requests can create new files in the bucket. Given that the filename are just random UUID this is not a big issue.

Using SQS instead of S3

Another implementation of the same idea is to use SQS to handle the notification of the puppetmaster by the Cloud conductor about new instances. While SQS would seem to be the best tool to provide that functionality it is not available in UEC in Lucid.

Conclusion

We end up with a puppet infrastructure where legitimate instances are automatically accepted. Now that instances can easily show up and be automatically enrolled what should these be configured as? We'll dive into this issue in the next article.

Wednesday, March 24, 2010

Using puppet in UEC/EC2: puppet support in Ubuntu images

One of the focus for the Lucid release cycle in the Ubuntu Server team is to improve the integration between puppet and UEC/EC2. I'll discuss in a series of articles how to setup a puppet infrastructure to manage Ubuntu Lucid instances running on UEC/EC2. I'll focus on the bootstrapping process rather than writing puppet recipes.

Today we'll look at configuring a puppetmaster into an instance and how to start instances that will register automatically with the puppetmaster.

We'll work with the Lucid Beta1 image on EC2. All the instances started through out this article will be based on this AMI.

Puppetmaster setup

Let's start by creating a puppetmaster running on EC2. We'll setup all the puppet configuration via ssh using a bzr branch on Launchpad: lp:~mathiaz/+junk/uec-ec2-puppet-config-tut1.

Start an instance of the Lucid Beta1 AMI using an ssh key. Once it's running write down its public and private DNS addresses. The public DNS address will be used to setup the puppetmaster via ssh. The private DNS address will be used as the puppetmaster hostname given out to puppet clients.

We'll actually install the puppetmaster using puppet itself.

Log on the started instance via ssh to install and setup the puppet master:

Update apt files:

sudo apt-get update

Install the puppet and bzr packages:

sudo apt-get install puppet bzr

Change the ownership of the puppet directory so that the ubuntu user can directly edit the puppet configuration files:

sudo chown ubuntu:ubuntu /etc/puppet/

Get the puppet configuration branch:

bzr branch --use-existing-directory lp:~mathiaz/+junk/uec-ec2-puppet-config-tut1 /etc/puppet/

Before doing the actual configuration let's have a look at the content of the /etc/puppet/ directory created from the bzr branch.

The layout follows the recommended puppet practices. The puppet module available in the modules directory defines a puppet::master class. The class makes sure that the puppetmaster package is installed and that the puppetmaster service is running. The manifests/puppetmaster.pp file defines the default node to be configured as a puppetmaster.

We'll now run the puppet client to setup the instance as a puppetmaster:

sudo puppet /etc/puppet/manifests/puppetmaster.pp

Starting a new instance

Now that we have puppetmaster available in our cloud we'll have look at how a new instances of the Lucid Beta1 AMI can be started and automatically setup to register with the puppetmaster.

We're going to use the cloud-config puppet syntax to boot an instance and have it configure itself to connect to the puppetmaster using its user data information:

On the puppetmaster instance create a user-data.yaml file to include the relevant puppetmaster configuration:

cp /usr/share/doc/cloud-init/examples/cloud-config-puppet.txt user-data.yaml

Update the server setting to point to the puppetmaster private dns hostname. I also strongly recommend to include the puppmaster ca certificate as the ca_cert setting.

The example certname setting uses a string extrapolation to make each puppet client certificate unique: for now %i is replace by the instance Id while %f is replaced by the FQDN of the instance.

The sample file has extensive comments about the format of the file. One of the key point is that you can set any of the puppet configuration options via the user data passed to the instance.

Note that you can remove all the comments to make the user-data.yaml file easier to copy and paste. However don't remove the first line (#cloud-config) as this is used by the instance boot process to start the puppet installation.

Launch a new instance using the content of the user-data.yaml file you've just created as the user-data option passed to the new instance.

You can watch the puppetmaster log on the puppetmaster instance to see when the new instance will request a new certificate:

tail -f /var/log/syslog

After some time you should see a request coming in:

puppetmasterd[2637]: i-fdb31b96.ip-10-195-18-227.ec2.internal has a waiting certificate request

During the boot process of the new instance the puppet cloud-config plugin used the user-data information to automatically install the puppet package, generate the /etc/puppet/puppet.conf file and start the puppetd daemon.

You can then approve the new instance:

sudo puppetca -s i-fdb31b96.ip-10-195-18-227.ec2.internal

Watching the puppetmaster log you'll see that after some time the new instance will connect and get its new manifest compiled and sent:

puppetmasterd[2637]: Compiled catalog for i-fdb31b96.ip-10-195-18-227.ec2.internal in 0.03 seconds

In conclusion we now have an instance acting as a puppetmaster and have a single user-data configuration for the whole puppet infrastructure. That user data can be passed to new instances which will automatically register with our puppetmaster.

Even though we're able to make all our instances automatically register with our puppetmaster we still need to manually sign each request as outlined in step 6 above. We'll have a look at automating this step in the next article.

Wednesday, February 17, 2010

FOSDEM 2010

I had the opportunity to attend FOSDEM this year. The most amazing (and frustrating) part of the event was the huge number of talks that were given. Making choices was sometimes hard. However the FOSDEM team recorded most of the sessions and the videos are available now. Perfect for such a busy event!

Here a few highlights of the presentations I attended:

Linux distribution for the cloud

I started the day by attending Peter Eisentraut (a Debian and PostgreSQL core developer) session about Linux distributions for the cloud. He focused on the provisioning aspect of clouds by giving a history of how operating systems were installed: from floppy drives to Cloud images. He dedicated one slide to Ubuntu's cloud offering including Canonical Landscape commenting that Ubuntu is clearly the leader of distributions in the cloud space. He also outlined what were the current problems such as lack of standards and integration of existing software stacks. He pointed out that linux distributions could drive this.

The second part of his talk was focused on the Linux desktop and the impact of cloud services on it. Giving ChromeOS as an example he outlined how applications themselves were moved to the cloud. He then listed the problems with Cloud services with regards to the Free Software principles: non-free server side code, non-free hosting, little or not control over data, lack of open-source community.

He concluded by outlining the challenge in the domain: how could free software principles be transposed to the Cloud and its services? One great reference is Eben Moglen talk named "Freedom in the Cloud".

Beyond MySQL GA

Kristian Nielsen, a MariaDB developer, gave an overview of the Developer ecosystem around MySQL. He listed a few patches that were available to add new functionalities and fix some bugs: the Google patch, Percona patches, eBay patches, Galera multi-master replication for InnoDB as well as a growing list of storage engines. Few options are available to use them:

packages from third party repositories (such as ourdelta and percona)

MariaDB maintains and integrates most of the patches

a more do-it-yourself approach where you maintain a patch serie.

I talked with Kristian after the presentation about leveraging bzr and LP to make the maintenance easier. It could look like this:

Each patch would be available and maintained in a bzr branch - in LP or else where.

The Ubuntu MySQL package branch available in LP would be used as the base for creating a debian package (or the Debian packaging branch since Debian packages are also available in LP via bzr)

bzr-looms would glue the package bzr branch with the patches bzr branches. The loom could be available from LP or elsewhere.

bzr-builder would be used to create a recipe to build binary packages out of the loom branch.

Packages would be published in PPAs ready to be installed on Ubuntu systems.

The Cassandra distributed database

I finally managed to get in the NoSQL room to attend Eric Evans overview of the Cassandra project. He is a full time developer and employee of Rackspace. The project was started by Facebook to power their inbox search. Even though the project had been available for some years the developer community really started to grow in March 2009. It is now part of the Apache project and about to graduate to a Top Level Project there.

It is inspired by Dynamo from Amazon and provide a O(1) DHT with eventual consistency and a consistent hashing. Multiple client APIs are available:

Thrift

Ruby

Python

Scala

I left before the end of the talk as I wanted to catch the complete presentation about using git for packaging.

Cross distro packaging with (top)git

Thomas Koch gave an overview of using git and related tools to help in maintaining Debian packaging. He works in web shop where every web application is deployed as a Debian package.

The upstream release tarball is imported in a upstream git branch using the pristine-tar tool. Packaging code (ie the debian/ directory) is kept in a different branch.

Patches to the upstream code are managed by topgit as seperate git branches. He also noted that topgit was able to export the whole stack of patches in the quilt Debian source format using the tg export command.

Here is the list of tools associated with his workflow:

pristine-tar

git-buildpackage

git-import-orig

git-dch

topgit

The workflow he outlined looked very similar to the one based around bzr and looms.

Scaling Facebook with OpenSource tools

David Recordon from Facebook gave a good presentation on the challenges that Facebook runs into when it comes to scale effectively.

Here are a few numbers I caught during the presentation to give an idea about the scale of the Facebook infrastructure (Warning: they may be wrong - watch the video to double check):

8 billion minutes spent on Facebook every day

2.5 billions of pictures uploaded every month

400 billion page/view per month

25 TB of log per day

40 billions pictures stored in 4 resolutions which bring a grand total of 160 billions photos

4 millions of line codes in php

Their overall architecture can be broken into the following components:

Load balancers

Web server (php): Most of the code is written in PHP: the language is simple, it fits well for fast development environments and there are a lot of developers available. A few of the problems are CPU, Memory, how to reuse the PHP logic in other systems and the difficulty to write extensions to speed up critical parts of the code. An overview of the HipHop compiler was given: a majority of their PHP code can be converted to C++ code which is then compiled and deployed on their webserver. An apache module is coming up soon probably as a fastcgi extension.

memcached (fast, simple): A core component of their infrastructure. It's robust and scales well: 120 millions queries/second. They wrote up some patches which are now making their way to upstream.

Services (fast, complicated): David gave an overview of some of the services that Facebook opensourced:
- Thrift: an rpc server, now part of the Apache incubator.
- Hive: build on top of hadoop it is now part of the Apache project. It's an SQL-like frontend to hadoop aiming at simplifying access to the hadoop infrastructure so that more people (ie non-engineers) can write and run data analysis jobs.
- Scribe: a performant and scalable logging system. Logs are stored in a hadoop/hive cluster to help in data analysis.

Databases (slow, simple): 1000's of MySQL servers are used as a persistence layer. InnoDB is used for the storage engine and multiple independent clusters are used for reliability. Joins are done at the webserver layer. The database layer is actually the persistence storage layer with memcached acting as a distributed index.

Other talks that seemed interesting

I had planned to a attend a few other talks as well. Unfortunately either their schedule was conflicting with another interesting presentation or the room was completely full (which seemed to happen all day long with the NoSQL room). Here is a list of them:

NoSQL for Fun & Profit

Introduction to MongoDB

Cloudlets: universal server images for the cloud

Continuous Packaging with Project-Builder.org