<h2>Overview of a Puppet Split CA architecture</h2><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeMPBSf22ixR04Y4fu6_DrhVLHzxAlw3k7MqA8VZEmvusX4u062zcTPSY7kg1ZsUNkTV9eSSoq3esNQJUP7onsh7pyz4loq_DMZPgJ_5MHWrp281a7i5EHdZgG_-fCUcaVNYqd1NGcI3Z7/s1600/overview.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeMPBSf22ixR04Y4fu6_DrhVLHzxAlw3k7MqA8VZEmvusX4u062zcTPSY7kg1ZsUNkTV9eSSoq3esNQJUP7onsh7pyz4loq_DMZPgJ_5MHWrp281a7i5EHdZgG_-fCUcaVNYqd1NGcI3Z7/s1600/overview.png" /></a></div><br />
The Puppet Master CA is the only Certificate Authority (CA) in the whole infrastructure. It issues certificates for all Puppet agents. It also manages the Puppet Master systems.<br />
<br />
The Puppet Masters are only responsible for compiling catalogs requested by Puppet Agents - they don't act as CAs themselves. They only accept Puppet Agents whose certificates have been issued by the Puppet Master CA.<br />
<br />
Puppet Agents retrieve their certificates from the Puppet Master CA the first time they run. Afterwards they connect to the Puppet Masters to get their catalogs and never contact the Puppet Master CA again.<br />
<h3>Puppet Master CA</h3><br />
The Puppet Master CA manages all Puppet Masters. In particular it distributes its own Certificate Revocation List (CRL) file to every Puppet Master. The Puppet Master CA also issues certificates to Puppet Agents.<br />
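To check that a Puppet Master actually received an up-to-date CRL, the file can be inspected with OpenSSL. A minimal sketch, assuming Puppet's default ssldir layout (adjust the path if <i>ssldir</i> is customized in <i>puppet.conf</i>):<br />
<pre># On a Puppet Master: list the serial numbers revoked by the Puppet Master CA
sudo openssl crl -in /var/lib/puppet/ssl/crl.pem -noout -text</pre><br />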
<h3>Puppet Master</h3><br />
A Puppet Master runs under Apache and Passenger. The Apache SSL module is configured to <b>require</b> client certificates signed by the Puppet Master CA (<i>/etc/apache2/sites-available/puppetmaster</i>):<br />
<pre># Require certificates to be valid
SSLVerifyClient require
SSLVerifyDepth 1</pre><br />
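A quick way to confirm that the requirement is enforced is to contact a Puppet Master without presenting a client certificate. A hedged sketch - PUPPET_MASTER is a placeholder and 8140 is the standard puppetmaster port:<br />
<pre># A request without a signed client certificate must be rejected
curl -k https://PUPPET_MASTER:8140/
# Expect an SSL handshake failure or an Apache 403, never a normal response</pre><br />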
The Puppet Master is also configured to not act as a Puppet CA (<i>/etc/puppet/puppet.conf</i>):<br />
<pre>[main]
ca = false</pre><br />
<h3>Puppet Agent</h3><br />
Puppet Agents retrieve their certificate from the Puppet Master CA and request their catalog from one of the Puppet Masters (<i>/etc/puppet/puppet.conf</i>):<br />
<pre>[agent]
ca_server = PUPPET_MASTER_CA
server = PUPPET_MASTER</pre><br />
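With this configuration the very first agent run fetches a certificate from the Puppet Master CA and then requests a catalog from a Puppet Master. A sketch of triggering that first run by hand (<i>puppetd</i> is the agent binary of the 0.25/2.6 era):<br />
<pre># One-off verbose agent run: certificate from PUPPET_MASTER_CA, catalog from PUPPET_MASTER
sudo puppetd --test</pre><br />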
<h3>Conclusion</h3><br />
From a security perspective, setting the <i>SSLVerifyClient</i> option to <b>require</b> protects the Puppet Masters from unknown requests and from revoked Puppet Agents. Having the Puppet Master CA manage the Puppet Masters also simplifies distributing the Puppet Master CA's CRL.<br />
<br />
On the reliability front, new systems can't be added to the infrastructure while the Puppet Master CA is unavailable. However existing Puppet Agents remain functional as long as they can connect to a Puppet Master.
<h2>Deploying a Hadoop cluster on EC2/UEC with Puppet and Ubuntu Maverick</h2><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4Qvn-G3VEzbZ22iF2uyerlEHy6akYeMxrrqoOuVN20dT0YmI-ocFF0f1xCuExi93r9ajWYjqswnfG31LNZQsUtGTruMv39iE0gKUnCY0NNFhhvfH6BzF_fdDTzwBOPvbCGgCIW_smGmMW/s1600/overview.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4Qvn-G3VEzbZ22iF2uyerlEHy6akYeMxrrqoOuVN20dT0YmI-ocFF0f1xCuExi93r9ajWYjqswnfG31LNZQsUtGTruMv39iE0gKUnCY0NNFhhvfH6BzF_fdDTzwBOPvbCGgCIW_smGmMW/s320/overview.png" width="320" /></a></div><br />
A Hadoop Cluster running on EC2/UEC, deployed by puppet on Ubuntu Maverick.<br />
<h3>How it works</h3><br />
The <a href="http://ubuntumathiaz.wordpress.com/2010/04/07/using-puppet-in-uecec2-node-classification-4/">Cloud Conductor</a> is located outside the AWS infrastructure since it holds the AWS credentials needed to start new instances. The Puppet Master runs in EC2 and uses S3 to check which clients it should accept.<br />
<br />
The Hadoop Namenode, Jobtracker and Worker are also running in EC2. The Puppet Master automatically configures them so that each Worker can connect to the Namenode and Jobtracker.<br />
<br />
The Puppet Master uses <a href="http://projects.puppetlabs.com/projects/puppet/wiki/Using_Stored_Configuration">Stored Configuration</a> to distribute configuration between all the Hadoop components. For example the Namenode IP address is automatically pushed to the Jobtracker and the Worker nodes so that they can connect to the Namenode.<br />
<br />
Ubuntu Maverick is used since Puppet 2.6 is required. The excellent <a href="http://www.cloudera.com/">Cloudera</a> CDH3 Beta2 packages provide the base Hadoop foundation.<br />
<br />
Puppet recipes and the Cloud Conductor scripts are available in a <a href="https://code.launchpad.net/%7Emathiaz/+junk/hadoop-cluster-puppet-conf/">bzr branch on Launchpad</a>.<br />
<h3>Setup the Cloud Conductor</h3><br />
The first part of the Cloud Conductor is the <b>start_instance.py</b> script. It takes care of starting new instances in EC2 and registering them in S3. Its configuration lives in <b>start_instance.yaml</b>. Both files are located in the <b>conductor</b> directory of the bzr branch.<br />
<br />
The following options are available on the cloud conductor:<br />
<ul><li><i>s3_bucket_name</i>: Sets the name of the S3 bucket used to store the list of instances started by the Cloud Conductor. The Puppet Master uses the same bucket to check which Puppet Client should be accepted.</li>
<li><i>ami_id</i>: Sets the id of the AMI the Cloud Conductor will use to start new instances.</li>
<li><i>cloud_init</i>: Sets specific cloud-init parameters. All of the puppet client configuration is defined here. Public ssh keys (for example from Launchpad) can be configured using the <b>ssh_import_id</b> option. The <a href="http://bazaar.launchpad.net/%7Eubuntu-branches/ubuntu/maverick/cloud-init/maverick/files/head%3A/doc/">cloud-init documentation</a> has more information about what can be configured when starting new instances.</li>
</ul><br />
A sample <b>start_instance.yaml</b> file looks like this:<br />
<br />
<code># Name of the S3 bucket to use to store the certname of started instances<br />
s3_bucket_name: mathiaz-hadoop-cluster<br />
# Base AMI id to use to start all instances<br />
ami_id: ami-c210e5ab<br />
# Extra information passed to cloud-init when starting new instances<br />
# see cloud-init documentation for available options.<br />
cloud_init: &site-cloud-init<br />
  ssh_import_id: mathiaz</code><br />
<br />
Once the Cloud Conductor is configured a Puppet Master can be started:<br />
<br />
<code>./start_instance.py puppetmaster</code><br />
<h3>Setup the Puppet Master</h3><br />
Once the instance has started and its ssh fingerprints have been verified, the puppet recipes can be deployed on the Puppet Master:<br />
<br />
<code>bzr branch lp:~mathiaz/+junk/hadoop-cluster-puppet-conf ~/puppet/<br />
sudo mv /etc/puppet/ /etc/old.puppet<br />
sudo mv ~/puppet/ /etc/</code><br />
<br />
The S3 bucket name is set in the Puppet Master configuration <b>/etc/puppet/manifests/puppetmaster.pp</b>:<br />
<br />
<code>node default {<br />
  class {<br />
    "puppet::ca":<br />
      node_bucket => "https://mathiaz-hadoop-cluster.s3.amazonaws.com";<br />
  }<br />
}</code><br />
<br />
And finally the Puppet Master installation can be completed by puppet itself:<br />
<br />
<code>sudo puppet apply /etc/puppet/manifests/puppetmaster.pp</code><br />
<br />
A Puppet Master is now running in EC2 with all the recipes required to deploy the different components of a Hadoop Cluster.<br />
<h3>Update the Cloud Conductor configuration</h3><br />
Since the Cloud Conductor starts instances that will connect to the Puppet Master, it needs two pieces of information about the Puppet Master - both can be gathered as shown below:<br />
<ul><li>the Puppet Master internal IP address or DNS name. For example the private DNS name of the instance (which is its FQDN) can be used.</li>
<li>the Puppet Master CA certificate (located in <b>/var/lib/puppet/ssl/ca/ca_crt.pem</b>).</li>
</ul><br />
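Both pieces can be gathered on the Puppet Master itself. A short sketch using the standard EC2 metadata service and the certificate path quoted above:<br />
<pre># On the Puppet Master instance: print the internal DNS name
curl http://169.254.169.254/latest/meta-data/local-hostname
# Print the CA certificate to paste into start_instance.yaml
sudo cat /var/lib/puppet/ssl/ca/ca_crt.pem</pre><br />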
On the Cloud Conductor the information gathered on the Puppet Master is added to <b>start_instance.yaml</b>:<br />
<br />
<code>agent:<br />
  # Puppet server hostname or IP<br />
  # In EC2 the Private DNS of the instance should be used<br />
  server: domU-12-31-38-00-35-98.compute-1.internal<br />
  # NB: the certname will automatically be added by start_instance.py<br />
  # when a new instance is started.<br />
  # Puppetmaster CA certificate<br />
  # located in /var/lib/puppet/ssl/ca/ca_crt.pem on the puppetmaster system<br />
  ca_cert: |<br />
    -----BEGIN CERTIFICATE-----<br />
    MIICFzCCAYCgAwIBAgIBATANBgkqhkiG9w0BAQUFADAUMRIwEAYDVQQDDAlQdXBw<br />
    [ ... ]<br />
    k0r/nTX6Tmr8TTU=<br />
    -----END CERTIFICATE-----</code><br />
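Certificates are easy to truncate when pasting them around. Assuming the certificate was also saved to a local file (here <b>ca_crt.pem</b>), it can be sanity-checked on the Cloud Conductor before use:<br />
<pre># Verify the copied CA certificate still parses and print its fingerprint
openssl x509 -in ca_crt.pem -noout -subject -fingerprint</pre><br />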
<h3>Start the Hadoop Namenode</h3><br />
Once the Puppet Master and Cloud Conductor are configured the Hadoop Cluster can be deployed. First in line is the Hadoop Namenode:<br />
<br />
<code>./start_instance.py namenode</code><br />
<br />
After a few minutes the Namenode puppet client requests a certificate:<br />
<code> puppet-master[7397]: Starting Puppet master version 2.6.1<br />
puppet-master[7397]: 53b0b7bf-723c-4a0f-b4b1-082ebec84041 has a waiting certificate request</code><br />
The Master signs the CSR:<br />
<br />
<code>CRON[8542]: (root) CMD (/usr/local/bin/check_csr https://mathiaz-hadoop-cluster.s3.amazonaws.com)<br />
check_csr[8543]: INFO: Signing request: 53b0b7bf-723c-4a0f-b4b1-082ebec84041</code><br />
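Since <b>check_csr</b> runs from cron, a signature can lag by a minute or two. Waiting requests can also be inspected and signed by hand with the stock Puppet CA tooling - a sketch:<br />
<pre># On the Puppet Master: list waiting certificate requests
sudo puppetca --list
# Sign one manually (this is what check_csr automates)
sudo puppetca --sign 53b0b7bf-723c-4a0f-b4b1-082ebec84041</pre><br />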
<br />
And finally the Master tries to compile the catalog for the Namenode:<br />
<br />
<code>node_classifier[8989]: DEBUG: Checking url https://mathiaz-hadoop-cluster.s3.amazonaws.com/53b0b7bf-723c-4a0f-b4b1-082ebec84041<br />
node_classifier[8989]: INFO: Getting node configuration: 53b0b7bf-723c-4a0f-b4b1-082ebec84041<br />
node_classifier[8989]: DEBUG: Node configuration (53b0b7bf-723c-4a0f-b4b1-082ebec84041): classes: ['hadoop::namenode']<br />
puppet-master[7397]: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find stage hadoop-base specified by Class[Hadoop::Base] at /etc/puppet/modules/hadoop/manifests/init.pp:142 on node 53b0b7bf-723c-4a0f-b4b1-082ebec84041</code><br />
<br />
Unfortunately there is a bug related to puppet stages. As a workaround the puppet agent can be restarted:<br />
<br />
<code>sudo /etc/init.d/puppet restart</code><br />
<br />
Looking at the syslog file on the Namenode, the Puppet Agent can be seen installing and configuring the Hadoop Namenode:<br />
<br />
<code>puppet-agent[1795]: Starting Puppet client version 2.6.1<br />
puppet-agent[1795]: (/Stage[apt]/Hadoop::Apt/Apt::Key[cloudera]/File[/etc/apt/cloudera.key]/ensure) defined content as '{md5}dc59b632a1ce2ad325c40d0ba4a4927e'<br />
puppet-agent[1795]: (/Stage[apt]/Hadoop::Apt/Apt::Key[cloudera]/Exec[import apt key cloudera]) Triggered 'refresh' from 1 events<br />
puppet-agent[1795]: (/Stage[apt]/Hadoop::Apt/Apt::Sources_list[canonical]/File[/etc/apt/sources.list.d/canonical.list]/ensure) created<br />
puppet-agent[1795]: (/Stage[apt]/Hadoop::Apt/Apt::Sources_list[cloudera]/File[/etc/apt/sources.list.d/cloudera.list]/ensure) created<br />
puppet-agent[1795]: (/Stage[apt]/Apt::Apt/Exec[apt-get_update]) Triggered 'refresh' from 3 events</code><br />
<br />
The first stage of the puppet run sets up the Canonical partner archive and the Cloudera archive. The Sun JVM is pulled from the Canonical archive while Hadoop packages are downloaded from the Cloudera archive.<br />
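The result of this first stage can be verified directly on the instance - the file names below are taken from the log above:<br />
<pre># Confirm both archives were set up by the apt stage
cat /etc/apt/sources.list.d/canonical.list /etc/apt/sources.list.d/cloudera.list
# Hadoop packages should now be installable from the Cloudera archive
apt-cache policy hadoop-0.20</pre><br />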
<br />
The following stage creates a common Hadoop configuration:<br />
<br />
<code>puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/File[/var/cache/debconf/sun-java6.seeds]/ensure) defined content as '{md5}1e3a7ac4c2dc9e9c3a1ae9ab2c040794'<br />
puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/Package[sun-java6-bin]/ensure) ensure changed 'purged' to 'latest'<br />
puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/Package[hadoop-0.20]/ensure) ensure changed 'purged' to 'latest'<br />
puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/File[/var/lib/hadoop-0.20/dfs]/ensure) created<br />
puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/File[/etc/hadoop-0.20/conf.puppet]/ensure) created<br />
puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/File[/etc/hadoop-0.20/conf.puppet/hdfs-site.xml]/ensure) defined content as '{md5}1f9788fceffdd1b2300c06160e7c364e'<br />
puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/Exec[/usr/sbin/update-alternatives --install /etc/hadoop-0.20/conf hadoop-0.20-conf /etc/hadoop-0.20/conf.puppet 15]) Triggered 'refresh' from 1 events<br />
puppet-agent[1795]: (/Stage[hadoop-base]/Hadoop::Base/File[/etc/default/hadoop-0.20]/content) content changed '{md5}578894d1b3f7d636187955c15b8edb09' to '{md5}ecb699397751cbaec1b9ac8b2dd0b9c3'</code><br />
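The <b>conf.puppet</b> directory is wired in through the alternatives system (see the update-alternatives line above); its state can be inspected with:<br />
<pre># Show which hadoop-0.20-conf alternative is active
update-alternatives --display hadoop-0.20-conf</pre><br />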
Finally the Hadoop Namenode is configured:<br />
<br />
<code>puppet-agent[1795]: (/Stage[main]/Hadoop::Namenode/Package[hadoop-0.20-namenode]/ensure) ensure changed 'purged' to 'latest'<br />
puppet-agent[1795]: (/Stage[main]/Hadoop::Namenode/File[hadoop-core-site]/ensure) defined content as '{md5}2f2445bf3d4e26f5ceb3c32047b19419'<br />
puppet-agent[1795]: (/Stage[main]/Hadoop::Namenode/File[/var/lib/hadoop-0.20/dfs/name]/ensure) created<br />
puppet-agent[1795]: (/Stage[main]/Hadoop::Namenode/Exec[format-dfs]) Triggered 'refresh' from 1 events<br />
puppet-agent[1795]: (/Stage[main]/Hadoop::Namenode/Service[hadoop-0.20-namenode]/ensure) ensure changed 'stopped' to 'running'<br />
puppet-agent[1795]: (/Stage[main]/Hadoop::Namenode/Service[hadoop-0.20-namenode]) Failed to call refresh: Could not start Service[hadoop-0.20-namenode]: Execution of '/etc/init.d/hadoop-0.20-namenode start' returned 1: at /etc/puppet/modules/hadoop/manifests/init.pp:177</code><br />
<br />
There is another bug, in the Hadoop init script this time: the Namenode fails to start on the first run. Restart the puppet agent, or let the next puppet run start it:<br />
<br />
<code>sudo /etc/init.d/puppet restart</code><br />
<br />
The Namenode daemon is running and logs information to its log file in <b>/var/log/hadoop/hadoop-hadoop-namenode-*.log</b>:<br />
<br />
<code>[...]<br />
INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070<br />
[...]<br />
INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8200: starting<br />
INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 8200: starting</code><br />
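At this point the Namenode can be queried directly. A quick sketch - the report will show zero datanodes until workers join:<br />
<pre># On the Namenode: summarize HDFS state (no datanodes are expected yet)
hadoop dfsadmin -report
# The embedded web UI answers on port 50070 (see the log above)
curl -s http://localhost:50070/ | head</pre><br />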
<h3>Start the Hadoop Jobtracker</h3><br />
The next component to start is the Hadoop Jobtracker:<br />
<br />
<code>./start_instance.py jobtracker</code><br />
<br />
After some time the Puppet Master compiles the Jobtracker catalog:<br />
<br />
<code>DEBUG: Checking url https://mathiaz-hadoop-cluster.s3.amazonaws.com/2faa4de9-c708-45ab-a515-ae041a9d0239<br />
node_classifier[30683]: INFO: Getting node configuration: 2faa4de9-c708-45ab-a515-ae041a9d0239<br />
node_classifier[30683]: DEBUG: Node configuration (2faa4de9-c708-45ab-a515-ae041a9d0239): classes: ['hadoop::jobtracker']<br />
puppet-master[23542]: Compiled catalog for 2faa4de9-c708-45ab-a515-ae041a9d0239 in environment production in 2.00 seconds</code><br />
<br />
On the instance the puppet agent configures the Hadoop Jobtracker:<br />
<br />
<code>puppet-agent[1035]: (/Stage[main]/Hadoop::Jobtracker/File[hadoop-mapred-site]/ensure) defined content as '{md5}af3b65a08df03e14305cc5fd56674867'<br />
puppet-agent[1035]: (/Stage[main]/Hadoop::Jobtracker/File[hadoop-core-site]/ensure) defined content as '{md5}2f2445bf3d4e26f5ceb3c32047b19419'<br />
puppet-agent[1035]: (/Stage[main]/Hadoop::Jobtracker/Package[hadoop-0.20-jobtracker]/ensure) ensure changed 'purged' to 'latest'<br />
puppet-agent[1035]: (/Stage[main]/Hadoop::Jobtracker/Service[hadoop-0.20-jobtracker]/ensure) ensure changed 'stopped' to 'running'<br />
puppet-agent[1035]: (/Stage[main]/Hadoop::Jobtracker/Service[hadoop-0.20-jobtracker]) Failed to call refresh: Could not start Service[hadoop-0.20-jobtracker]: Execution of '/etc/init.d/hadoop-0.20-jobtracker start' returned 1: at /etc/puppet/modules/hadoop/manifests/init.pp:135</code><br />
<br />
There is the same bug in the init script. Let's restart the puppet agent:<br />
<br />
<code>sudo /etc/init.d/puppet restart</code><br />
<br />
The Jobtracker connects to the Namenode and error messages are logged on a regular basis to both the Namenode and Jobtracker log files:<br />
<br />
<code>INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 8200, call<br />
addBlock(/hadoop/mapred/system/jobtracker.info, DFSClient_-268101966, null)<br />
from 10.122.183.121:54322: error: java.io.IOException: File<br />
/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes,<br />
instead of 1<br />
java.io.IOException: File /hadoop/mapred/system/jobtracker.info could only be<br />
replicated to 0 nodes, instead of 1</code><br />
<br />
This is normal, as there aren't any Datanode daemons available yet for data replication.<br />
<h3>Start Hadoop workers</h3><br />
It's now time to start the Hadoop Worker to get an operational Hadoop Cluster:<br />
<br />
<code>./start_instance.py worker</code><br />
<br />
The Hadoop Worker runs both a Datanode and a Tasktracker. The Puppet agent configures them to talk to the Namenode and the Jobtracker respectively.<br />
<br />
After some time the Puppet Master compiles the catalog for the Hadoop Worker:<br />
<br />
<code>node_classifier[8368]: DEBUG: Checking url https://mathiaz-hadoop-cluster.s3.amazonaws.com/b72a8f4d-55e6-4059-ac4b-26927f1a1016<br />
node_classifier[8368]: INFO: Getting node configuration: b72a8f4d-55e6-4059-ac4b-26927f1a1016<br />
node_classifier[8368]: DEBUG: Node configuration (b72a8f4d-55e6-4059-ac4b-26927f1a1016): classes: ['hadoop::worker']<br />
puppet-master[23542]: Compiled catalog for b72a8f4d-55e6-4059-ac4b-26927f1a1016 in environment production in 0.18 seconds</code><br />
<br />
On the instance the puppet agent installs the Hadoop worker:<br />
<br />
<code>puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/File[hadoop-mapred-site]/ensure) defined content as '{md5}af3b65a08df03e14305cc5fd56674867'<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/Package[hadoop-0.20-datanode]/ensure) ensure changed 'purged' to 'latest'<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/File[/var/lib/hadoop-0.20/dfs/data]/ensure) created<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/Package[hadoop-0.20-tasktracker]/ensure) ensure changed 'purged' to 'latest'<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/File[hadoop-core-site]/ensure) defined content as '{md5}2f2445bf3d4e26f5ceb3c32047b19419'<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/Service[hadoop-0.20-datanode]/ensure) ensure changed 'stopped' to 'running'<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/Service[hadoop-0.20-datanode]) Failed to call refresh: Could not start Service[hadoop-0.20-datanode]: Execution of '/etc/init.d/hadoop-0.20-datanode start' returned 1: at /etc/puppet/modules/hadoop/manifests/init.pp:103<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/Service[hadoop-0.20-tasktracker]/ensure) ensure changed 'stopped' to 'running'<br />
puppet-agent[1030]: (/Stage[main]/Hadoop::Worker/Service[hadoop-0.20-tasktracker]) Failed to call refresh: Could not start Service[hadoop-0.20-tasktracker]: Execution of '/etc/init.d/hadoop-0.20-tasktracker start' returned 1: at /etc/puppet/modules/hadoop/manifests/init.pp:103</code><br />
<br />
Again the same init script bug - let's restart the puppet agent:<br />
<br />
<code>sudo /etc/init.d/puppet restart</code><br />
<br />
Once the worker is installed the Datanode daemon connects to the Namenode:<br />
<br />
<code>INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.249.187.5:50010 storage DS-2066068566-10.249.187.5-50010-1285276011214<br />
INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.249.187.5:50010</code><br />
<br />
Similarly the Tasktracker daemon registers itself with the Jobtracker:<br />
<code> INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/domU-12-31-39-03-B8-F7.compute-1.internal</code><br />
<br />
The Hadoop Cluster is up and running.<br />
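A quick way to exercise the whole cluster is to submit one of the stock example jobs. A sketch, assuming the usual CDH3 location of the examples jar (the exact file name varies with the package version):<br />
<pre># Estimate pi with 2 maps - exercises the Jobtracker, Tasktrackers and HDFS
hadoop jar /usr/lib/hadoop-0.20/hadoop-*-examples.jar pi 2 1000</pre><br />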
<h3>Conclusion</h3><br />
Once the initial setup of the Puppet Master is done and the Hadoop Namenode and Jobtracker are up and running, adding new Hadoop Workers is just one command:<br />
<br />
<code> ./start_instance.py worker</code><br />
<br />
Puppet automatically configures them to join the Hadoop Cluster.
<h2>Vote for the Ubuntu stack exchange</h2>This morning <a href="https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2010-July/011822.html">Evan's email</a> hit my inbox: there is a <a href="http://area51.stackexchange.com/proposals/7716/ubuntu">suggestion to create a stack exchange for Ubuntu</a>.<br/><br/>I've always been impressed by the <a href="http://stackoverflow.com/">stackoverflow</a> and <a href="http://serverfault.com/">serverfault</a> web sites. Granted, forums have been around for a long time - however I love the user interaction provided by the folks behind <a href="http://www.stackexchange.com/">Stack Exchange</a>. A couple of months ago they created <a href="http://area51.stackexchange.com/">area51</a> to collect new ideas that could use the same framework behind stackoverflow and serverfault. And again the user experience for handling these requests is great.<br/><br/>In my opinion Stack Exchange provides an excellent user experience that fosters user contributions and collaboration - in line with the values of the Ubuntu community.<br/><br/>So I went over to area51 and voted on the on-topic and off-topic questions for the <a href="http://area51.stackexchange.com/proposals/7716/ubuntu">Ubuntu proposal</a>.
<h2>Velocity 2010: Fast by default - Thursday</h2>Thursday was the last day of the conference and followed the same format as Wednesday: keynotes in the morning, three parallel tracks in the afternoon.<br/><h3>Creating Cultural Change</h3><br/>John Rauser from Amazon <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/11793">shared a few experiences</a> about creating cultural change inside and outside organizations.<br/><br/>Here are some key takeaways:<br/><ul><br/> <li>Try something new</li><br/> <li>Seek group identity</li><br/> <li>Welcome newcomers</li><br/> <li>Be relentlessly happy</li><br/></ul><br/>These ideas reminded me of how the Ubuntu community has been built up.<br/><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/UL2WDcNu_3A"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/UL2WDcNu_3A;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object><br/><h3>In the Belly of the Whale: Operations at Twitter</h3><br/>John Adams of Twitter <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/12987">presented</a> a few insights on how operations are run at Twitter.<br/><br/>He outlined several principles to keep in mind when building their infrastructure:<br/><ul><br/> <li>Nothing works the first time.
Plan to rebuild everything more than once.</li><br/> <li>Deploy faster and more often as less code will change.</li><br/> <li>Detect problems as early as possible - to recover fast.</li><br/> <li>Disable/enable features in production aka feature dark mode.</li><br/></ul><br/>To support these guiding principles he listed some of the tools that are used:<br/><ul><br/> <li>Configuration management done with puppet and svn.</li><br/> <li>Reviewboard to review changes made to the infrastructure.</li><br/> <li>Ganglia to take care of monitoring.</li><br/> <li>Scribe to collect and aggregate all logs into Hadoop HDFS using LZO compression.</li><br/> <li>Murder to deploy their code to all of their systems via bittorrent.</li><br/> <li>Google analytics to track error pages while Whale Watcher tracks errors in logs.</li><br/> <li>Unicorn to power their Rails stack.</li><br/></ul><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/_7KdeUIvlvw"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/_7KdeUIvlvw;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object><br/><h3>Lightning talks</h3><br/>Thursday's lightning <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/15310">talks</a> covered another round of useful tools that help optimize page loads:<br/><ul><br/> <li>httpwatch: a commercial tool that loads web pages and analyses them</li><br/> <li>pagetest</li><br/> <li>speedtracer: a chrome browser extension that provides insight into what the browser is doing when loading a page</li><br/> <li>fiddler2</li><br/></ul><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/0daDZs0c0D0"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/0daDZs0c0D0;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object><br/><h3>Moving Fast</h3><br/>Robert Johnson of Facebook <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/14129">gave a talk</a> about the culture of moving fast at Facebook. Here are a few short sentences to summarize his points:<br/><ul><br/> <li>How to scale? Have a team that reacts fast.</li><br/> <li>The release cycle: make changes every day, as frequent small changes make it easier to figure out what went wrong.</li><br/> <li>Control and responsibility belong to one person.</li><br/></ul><br/>He finished with a few lessons that were learned:<br/><ul><br/> <li>New code is slow.</li><br/> <li>Give developers room to try things.</li><br/> <li>Nobody's job is to say no.</li><br/></ul><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/nEmJ_5UHs1g"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/nEmJ_5UHs1g;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object><br/><h3>Practice of Continuous Deployment</h3><br/>Throughout the conference I heard the idea of continuous deployment multiple times. With continuous integration being pushed on the developer side, its counterpart on the ops side is continuous deployment: test, build, deploy. Deploy multiple times a day with a good monitoring system to identify quickly when things go wrong. When things do go wrong it's easier to identify what changed, as the number of changes is rather low.
All the big shops have a deployment dashboard to review what went live, when and by whom.<br/><br/>The launchpad team is already following this idea: <a href="https://edge.launchpad.net/">Launchpad edge</a> has a daily update of the code running against the production database. Releases (with DB schema changes) are conducted on a monthly basis. And Ubuntu provides something similar, as the development version is always available for installation - and releases are cut every 6 months.
<h2>Velocity 2010: Fast by default - Tuesday and Wednesday</h2>Here is a report on <a href="http://en.oreilly.com/velocity2010/">Velocity 2010, the Web Performance and Operations conference</a>. In its third year it grew to more than 1100 attendees - this year was sold out.<br/><h2>Tuesday workshops</h2><br/>Tuesday was dedicated to workshops, even though most of them turned out to be presentations with demos given the number of participants - so not a lot of hands-on sessions. Here is a small selection of talks I found interesting throughout the day:<br/><h3>Infrastructure automation with Chef</h3><br/>An <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/14432">overview</a> of the chef project led by the high-energy and opinionated Adam Jacob from Opscode.<br/><br/>For me the most exciting part was that chef provides a complete view into your infrastructure and the ability to query that infrastructure any way you want.<br/><br/>Adam gave a few high impact principles:<br/><blockquote>Being able to reconstruct a business from a source code repository, a data backup and bare metal resources.</blockquote><br/>Another interesting feature of the knife tool was the ability to start/spawn new instances in EC2 from the command line. For example the following command will give you an ec2 instance running your rails role within a few minutes:<br/><blockquote>knife ec2 server create 'role[rails]'</blockquote><br/><h3>Protecting "Cloud" Secrets With Grendel</h3><br/>A technical overview of the <a href="http://github.com/wesabe/grendel">Grendel project</a>: OpenPGP as a software service.<br/><br/>The project gives the ability to share encrypted documents between multiple people. From the security perspective, each user's private key is stored in the cloud, encrypted with a passphrase known only to the user and transmitted via HTTP basic auth.<br/><h2>Wednesday sessions</h2><br/>Wednesday was the first day of the conference proper, with keynotes in the morning and three tracks in the afternoon.<br/><h3>Datacenter Infrastructure Innovation</h3><br/>James Hamilton from Amazon Web Services gave an <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/15429">interesting overview</a> of the different parts of building a data center.<br/><br/>An interesting point he made was that data centers should target 100% usage of their servers, while the industry standard is around 10 to 15% utilization on average. This objective led to the introduction of spot instances in EC2 so that resource usage could be maximized and Amazon's cloud infrastructure load flat-lined. That reminds me of comments from Google engineers stating that they try to pile as much work as possible on each of their servers.
At their scale, having a server powered off is costing money.<br/><br/>He covered other topics:<br/><ul><br/> <li>air conditioning: data centers could be run way hotter than they are now</li><br/> <li>power: the cost of power is a small part of the total cost of running a data center - server hardware being more than half of the cost. This is an interesting point with regards to the whole green computing movement.</li><br/></ul><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/kHW-ayt_Urk"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/kHW-ayt_Urk;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object><br/><h3>Speed matters</h3><br/>Urs Hölzle from Google <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/14371">covered</a> the importance of having web pages that load fast, and a range of improvements Google has been working on over the last years: from the web browser (via chrome) down to the infrastructure (such as dns).<br/><br/>He also highlighted that Google's page ranking process now takes into account the speed at which a page loads. As heard multiple times during the conference, there is now empirical evidence directly linking page load speed to revenue: the faster a page loads, the longer people will stay on the web site.<br/><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/MStKwEff_kY"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/MStKwEff_kY;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object><br/><h3>Lightning talks</h3><br/>Wednesday's <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/15306">lightning demos</a> showcased a list of tools focused on highlighting performance bottlenecks and tracking down why page loads are slow and how to improve them:<br/><ul><br/> <li><a href="http://developer.yahoo.com/yslow/">Yslow</a></li><br/> <li><a href="http://www.dynatrace.com/">dynaTrace</a></li><br/> <li><a href="http://code.google.com/speed/page-speed/">page speed</a></li><br/></ul><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/33R1FVCArA8"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/33R1FVCArA8;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object><br/><h3>Getting Fast: Moving Towards a Toolchain for Automated Operations</h3><br/>Lee Thompson and Alex Honor <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/13113">reported</a> on the work of the devtools-toolchain group. The group formed a few months ago to share experiences and build up a set of best practices. Of the use cases they outlined, KaChing's Continuous Deployment was the most interesting one:<br/><pre style="padding-left:30px;">Release is a marketing concern.</pre><br/><h3>Facebook operations</h3><br/>Tom Cook of Facebook <a href="http://en.oreilly.com/velocity2010/public/schedule/detail/13103">gave a sneak peek</a> at the life of operations at Facebook.<br/><br/>A very interesting talk about the development practices of one of the busiest websites on the internet. Facebook runs out of two data centers (one on the east coast, one on the west coast) while building its own data center in Oregon.<br/><br/>Their core OS is CentOS 5 with a customized kernel.
For system management, cfengine is set to update every 15 minutes, with a cfengine run taking around 30 seconds. All of the changes are peer reviewed.<br/><br/>On the deployment front, bug fixes are pushed out once a day while new features are rolled out on a weekly basis. Code is pushed to 10000s of servers using bittorrent swarms. Coordination is done via IRC with the engineer available in case something goes wrong.<br/><br/>The developer is responsible for writing the code as well as testing and deploying it. New code is then exposed to a subset of real traffic. Ops are embedded in engineering teams and take part in design decisions. They're actually an interface to other ops.<br/><br/>As a summary Tom gave a few points:<br/><ul><br/> <li>version control everything</li><br/> <li>optimize early</li><br/> <li>automate++</li><br/> <li>use configuration mgmt</li><br/> <li>plan to fail</li><br/> <li>instrument everything</li><br/> <li>don't waste time on dumb stuff</li><br/></ul><br/><object height="350" width="425"><br/> <param name="movie" value="http://www.youtube.com/v/T-Xr_PJdNmQ"><br/> <param name="wmode" value="transparent"><br/> <embed src="http://www.youtube.com/v/T-Xr_PJdNmQ;rel=0" type="application/x-shockwave-flash" wmode="transparent" height="350" width="425"><br/> </object>
<h2>Using puppet in UEC/EC2: Improving performance with Phusion Passenger</h2>Now that we have an efficient process to <a class="reference external" href="http://ubuntumathiaz.wordpress.com/2010/03/24/using-puppet-in-uecec2-puppet-support-in-ubuntu-images/">start instances within UEC/EC2</a> and get them <a class="reference external" href="http://ubuntumathiaz.wordpress.com/2010/03/25/using-puppet-in-uecec2-automating-the-signing-process/">configured</a> <a class="reference external" href="http://ubuntumathiaz.wordpress.com/2010/04/07/using-puppet-in-uecec2-node-classification-4/">for their task</a> by puppet, we'll dive into improving the performance of the <tt class="docutils literal">puppetmaster</tt> with <a class="reference external" href="http://www.modrails.com/">Phusion Passenger</a>.<br/><div id="why" class="section"><br/><h3>Why?</h3><br/>The default configuration used by puppetmasterd is based on webrick, which doesn't really scale well. One popular choice to improve puppetmasterd performance is to use mod passenger from the <a class="reference external" href="http://packages.ubuntu.com/lucid/libapache2-mod-passenger">libapache2-mod-passenger package</a>.<br/><br/></div><br/><div id="apache2-setup" class="section"><br/><h3>Apache2 setup</h3><br/>The configuration is based on the <a class="reference external" href="http://projects.reductivelabs.com/projects/puppet/wiki/Using_Passenger">Puppet passenger documentation</a>. It is available from the <a class="reference external" href="https://code.launchpad.net/~mathiaz/+junk/uec-ec2-puppet-config-passenger">bzr branch</a> as we'll use puppet to actually configure the instance running puppetmasterd.<br/><br/>The <a class="reference external" href="http://bazaar.launchpad.net/~mathiaz/%2Bjunk/uec-ec2-puppet-config-passenger/annotate/head%3A/modules/puppet/manifests/init.pp">puppet module</a> has been updated to make sure the apache2 and libapache2-mod-passenger packages are installed.
It also creates the relevant files and directories required to run puppetmasterd as a rack application.<br/><br/>The Passenger and SSL modules are enabled in the <a class="reference external" href="http://bazaar.launchpad.net/~mathiaz/%2Bjunk/uec-ec2-puppet-config-passenger/annotate/head%3A/modules/puppet/templates/apache2.conf">apache2 configuration</a>. All of their configuration is done inside a virtual host definition. Note that the SSL options related to certificate and private key files point directly to <tt class="docutils literal">/var/lib/puppet/ssl/</tt>.<br/><br/>Apache2 is also configured to only listen on the default puppetmaster port by replacing apache2's default <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-passenger/annotate/head%3A/modules/puppet/files/ports.conf">ports.conf</a> and <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-passenger/annotate/head%3A/modules/puppet/manifests/init.pp#L64">disabling</a> the default virtual site.<br/><br/>Finally the <a class="reference external" href="http://bazaar.launchpad.net/~mathiaz/%2Bjunk/uec-ec2-puppet-config-passenger/annotate/head%3A/puppet.conf">configuration</a> of puppetmasterd has been updated so that it can correctly process client certificates while being run under passenger.<br/><br/>Note that puppetmasterd needs to be run once in order to generate its ssl configuration. This happens automatically when the puppetmaster package is installed, since puppetmasterd is started during the package installation.<br/><br/></div><br/><div id="deploying-an-improved-puppetmaster" class="section"><br/><h3>Deploying an improved puppetmaster</h3><br/>Log on to the puppetmaster instance and update the puppet configuration using the <a class="reference external" href="https://code.launchpad.net/~mathiaz/+junk/uec-ec2-puppet-config-passenger">bzr branch</a>:<br/><blockquote>bzr pull --remember lp:~mathiaz/+junk/uec-ec2-puppet-config-passenger /etc/puppet/</blockquote><br/>Update the configuration:<br/><blockquote>sudo puppet --node_terminus=plain /etc/puppet/manifests/puppetmaster.pp</blockquote><br/>On the <tt class="docutils literal">Cloud Conductor</tt> start a new instance with <tt class="docutils literal">start_instance.py</tt>.
If you're starting from scratch remember to update the <tt class="docutils literal">start_instance.yaml</tt> file with the puppetmaster CA and internal IP:<br/><blockquote>./start_instance.py -c start_instance.yaml AMI_NUMBER</blockquote><br/>Following <tt class="docutils literal">/var/log/syslog</tt> on the puppetmaster you should see the new instance requesting a certificate:<br/><blockquote>Apr 8 00:40:08 ip-10-195-93-129 puppetmasterd[3353]: Starting Puppet server version 0.25.4<br/>Apr 8 00:40:08 ip-10-195-93-129 puppetmasterd[3353]: 7d6b61a7-3772-4c41-a23d-471b417d9c47 has a waiting certificate request</blockquote><br/>Now that the puppetmasterd process is run by apache2 and mod-passenger you can check in <tt class="docutils literal">/var/log/apache2/other_vhosts_access.log</tt> the http requests made by the puppet client to get its certificate signed:<br/><blockquote>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:06 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"<br/>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "GET /production/certificate_request/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"<br/>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "PUT /production/certificate_request/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 200 2082 "-" "-"<br/>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"<br/>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:40:08 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 404 2178 "-" "-"</blockquote><br/>Once <tt class="docutils literal">check_csr</tt> is run by cron the certificate will be signed and the puppet client is able to retrieve it:<br/><blockquote>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:42:08 +0000] "GET /production/certificate/7d6b61a7-3772-4c41-a23d-471b417d9c47 HTTP/1.1" 200 2962 "-" "-"<br/>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:42:08 +0000] "GET /production/certificate_revocation_list/ca HTTP/1.1" 200 2450 "-" "-"</blockquote><br/>The puppet client ends up requesting its catalog:<br/><blockquote>ip-10-195-93-129.ec2.internal:8140 10.195.94.224 - - [08/Apr/2010:00:42:09 +0000] "GET /production/catalog/7d6b61a7-3772-4c41-a23d-471b417d9c47?facts_format=b64_zlib_yaml&facts=eNp [....] HTTP/1.1" 200 2354 "-" "-"</blockquote><br/></div><br/><div id="conclusion" class="section"><br/><h3>Conclusion</h3><br/>I've just outlined how to configure mod passenger to run puppetmasterd, which is a much more efficient setup than using the default webrick server. Most of the configuration is detailed in the files available in the <a class="reference external" href="https://code.launchpad.net/~mathiaz/+junk/uec-ec2-puppet-config-passenger">bzr branch</a>.<br/><br/></div>
<h2>Using puppet in UEC/EC2: Node classification</h2>In a <a class="reference external" href="http://ubuntumathiaz.wordpress.com/2010/03/25/using-puppet-in-uecec2-automating-the-signing-process/">previous article</a> I discussed how to set up an automated registration process for puppet instances.
We'll now have a look at how we can tell these instances what they should be doing.<br/><br/>Going back to the <a class="reference external" href="http://ubuntumathiaz.wordpress.com/2010/03/25/using-puppet-in-uecec2-automating-the-signing-process/#overview">overall architecture</a>, the <tt class="docutils literal">Cloud conductor</tt> is the component responsible for starting new instances. Of the three components it is the one with the most knowledge about what an instance should be: it is responsible for starting each new instance after all.<br/><div id="using-s3-to-store-node-definitions" class="section"><br/><h3>Using S3 to store node definitions</h3><br/>We'll use the <a class="reference external" href="http://projects.reductivelabs.com/projects/puppet/wiki/External_Nodes">puppet external node</a> feature to connect the <tt class="docutils literal">Cloud conductor</tt> with the <tt class="docutils literal">puppetmaster</tt>. The external node script - <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-tut3/annotate/head%3A/scripts/node_classifier.py">node_classifier.py</a> - is responsible for telling which classes each instance is supposed to have. Whenever a puppet client connects to the master, the <tt class="docutils literal">node_classifier.py</tt> script is called with the certificate name. It must provide a description of the classes, environments and parameters for the client on its standard output in a yaml format.<br/><br/>Given that the <tt class="docutils literal">Cloud conductor</tt> creates a file with the certificate name for each instance it spawns, we'll extend the <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-tut3/annotate/head%3A/scripts/start_instance.py">start_instance.py</a> script to store the node classification in the content of the file created in the S3 bucket.<br/><br/>You may have noticed that instances started by <tt class="docutils literal">start_instance.py</tt> don't have an ssh public key associated with them. So we're going to create a <tt class="docutils literal"><span class="pre">login-allowed</span></tt> class that will install the authorized key for the ubuntu user.<br/><br/></div><br/><div id="setup-the-puppetmaster-to-use-the-node-classifier" class="section"><br/><h3>Setup the puppetmaster to use the node classifier</h3><br/>We'll use the <a class="reference external" href="http://uec-images.ubuntu.com/releases/lucid/beta-2/">Ubuntu Lucid Beta2 image</a> as the base image on which to build our Puppet infrastructure.<br/><br/>Start an instance of the Lucid Beta2 AMI using an ssh key. Once it's running, write down its public and private DNS addresses. The public DNS address will be used to set up the puppetmaster via ssh.
The private DNS address will be used as the puppetmaster hostname given out to puppet clients.<br/><br/>Log on to the started instance via ssh to install and set up the puppet master:<br/><ol class="arabic"><br/> <li><br/><p class="first">Update apt files:</p><br/><br/><blockquote>sudo apt-get update</blockquote><br/></li><br/> <li><br/><p class="first">Install the puppet and bzr packages:</p><br/><br/><blockquote>sudo apt-get install puppet bzr</blockquote><br/></li><br/> <li><br/><p class="first">Change the ownership of the puppet directory so that the ubuntu user can directly edit the puppet configuration files:</p><br/><br/><blockquote>sudo chown -R ubuntu:ubuntu /etc/puppet/</blockquote><br/></li><br/> <li><br/><p class="first">On the puppetmaster check out the <a class="reference external" href="https://code.launchpad.net/~mathiaz/+junk/uec-ec2-puppet-config-tut3">tutorial3</a> bzr branch:</p><br/><br/><blockquote>bzr branch --use-existing-dir lp:~mathiaz/+junk/uec-ec2-puppet-config-tut3 /etc/puppet/</blockquote><br/>You'll get a conflict for the puppet.conf file. You can ignore the conflict, as the puppet.conf file from the branch is the one that supports an external node classifier:<br/><blockquote>bzr resolve /etc/puppet/puppet.conf</blockquote><br/></li><br/></ol><br/>Edit the node classifier script <tt class="docutils literal">scripts/node_classifier.py</tt> to set the correct location of your S3 bucket.<br/><br/>Note that the script is set to return 1 if the certificate name doesn't have a corresponding file in the S3 bucket. You may want to change the return code to 0 if you want to use the normal nodes definition. See the <a class="reference external" href="http://projects.reductivelabs.com/projects/puppet/wiki/External_Nodes">puppet external node</a> documentation for more information.<br/><br/>The puppetmaster configuration in <tt class="docutils literal">puppet.conf</tt> has been updated to use the external node script.<br/><br/>There is also the <tt class="docutils literal"><span class="pre">login-allowed</span></tt> class defined in the <tt class="docutils literal">manifests/site.pp</tt> file. It sets the authorized key file for the ubuntu user.<br/><br/>On the puppetmaster edit <tt class="docutils literal">manifests/site.pp</tt> to update the public key with your EC2 public key. You can get the public key from <tt class="docutils literal"><span class="pre">~ubuntu/.ssh/authorized_keys</span></tt> on the puppetmaster.<br/><br/>To bootstrap the new puppetmaster configuration run the puppet client:<br/><blockquote>sudo puppet --node_terminus=plain /etc/puppet/manifests/puppetmaster.pp</blockquote><br/>Note that you'll have to set the node_terminus to plain to avoid calling the node classifier script when configuring the puppetmaster itself.
Otherwise the puppet run would fail, since the puppetmaster certificate name (which defaults to the fqdn of the instance) doesn't have a corresponding file in the S3 bucket.<br/><br/>We now have our puppetmaster configured to look up the node classification for each puppet client.<br/><br/></div><br/><div id="update-start-instance-py-to-provide-a-node-definition" class="section"><br/><h3>Update start_instance.py to provide a node definition</h3><br/>It's time to update the <tt class="docutils literal">Cloud conductor</tt> to provide the relevant node classification information whenever it starts a new instance.<br/><br/>Update the bzr branch on the <tt class="docutils literal">Cloud conductor</tt> system:<br/><blockquote>bzr pull --remember lp:~mathiaz/uec-puppet-config-tut3</blockquote><br/>The <tt class="docutils literal">start_instance.py</tt> script has been updated to write the node classification information when it creates the instance file in the S3 bucket. That information is set under the <tt class="docutils literal">node</tt> key in <tt class="docutils literal">start_instance.yaml</tt>: everything the puppetmaster expects from the external node classifier script goes there. See the <a class="reference external" href="http://projects.reductivelabs.com/projects/puppet/wiki/External_Nodes">puppet external node</a> documentation for more details on what can be provided by the external node script.<br/><br/>Review the <tt class="docutils literal">start_instance.yaml</tt> file to make sure the S3 bucket name, the puppetmaster server IP and CA certificate are still valid for your own setup.<br/><br/>Start an instance:<br/><blockquote>./start_instance.py -c start_instance.yaml AMI_NUMBER</blockquote><br/>Following <tt class="docutils literal">/var/log/syslog</tt> you should see something similar to this:<br/><blockquote>Apr 7 19:15:37 domU-12-31-39-07-D6-52 puppetmasterd[1644]: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0 has a waiting certificate request</blockquote><br/>The instance has booted and registered with the puppetmaster.<br/><blockquote>Apr 7 19:16:01 domU-12-31-39-07-D6-52 CRON[2188]: (root) CMD (/usr/local/bin/check_csr --log-level=debug <a class="reference external" href="https://mathiaz-puppet-nodes-1.s3.amazonaws.com">https://mathiaz-puppet-nodes-1.s3.amazonaws.com</a>)<br/>Apr 7 19:16:02 domU-12-31-39-07-D6-52 check_csr[2189]: DEBUG: List of waiting csr: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0<br/>Apr 7 19:16:02 domU-12-31-39-07-D6-52 check_csr[2189]: DEBUG: Checking 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0<br/>Apr 7 19:16:02 domU-12-31-39-07-D6-52 check_csr[2189]: DEBUG: Checking url <a class="reference external" href="https://mathiaz-puppet-nodes-1.s3.amazonaws.com/77ad2a3c-5d52-4ca7-9fea-b99b767b09d0">https://mathiaz-puppet-nodes-1.s3.amazonaws.com/77ad2a3c-5d52-4ca7-9fea-b99b767b09d0</a><br/>Apr 7 19:16:03 domU-12-31-39-07-D6-52 check_csr[2189]: INFO: Signing request: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0</blockquote><br/>The puppetmaster checked that the client request was expected and signed it.<br/><blockquote>Apr 7 19:17:39 domU-12-31-39-07-D6-52 node_classifier[2240]: DEBUG: Checking url <a class="reference external" href="https://mathiaz-puppet-nodes-1.s3.amazonaws.com/77ad2a3c-5d52-4ca7-9fea-b99b767b09d0">https://mathiaz-puppet-nodes-1.s3.amazonaws.com/77ad2a3c-5d52-4ca7-9fea-b99b767b09d0</a><br/>Apr 7 19:17:39
domU-12-31-39-07-D6-52 node_classifier[2240]: INFO: Getting node configuration: 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0<br/>Apr 7 19:17:39 domU-12-31-39-07-D6-52 node_classifier[2240]: DEBUG: Node configuration (77ad2a3c-5d52-4ca7-9fea-b99b767b09d0): classes: [login-allowed]<br/>Apr 7 19:17:39 domU-12-31-39-07-D6-52 puppetmasterd[1644]: Compiled catalog for 77ad2a3c-5d52-4ca7-9fea-b99b767b09d0 in 0.01 seconds</blockquote><br/>The puppetmaster compiled a catalog for the client according to the information provided by the node classifier script.<br/><br/>Make sure that the instance that has been started doesn't have any ssh key associated with it:<br/><blockquote>euca-describe-instances</blockquote><br/>Make a note of the instance ID and its public DNS name.<br/><br/>Log in to the instance:<br/><ol class="arabic"><br/> <li><br/><p class="first">Run <tt class="docutils literal"><span class="pre">euca-get-console-output</span> instance_ID</tt> to get the ssh fingerprints. You may need to scroll back to find them.</p><br/></li><br/> <li><br/><p class="first">Log in to the instance using your EC2 public key:</p><br/><br/><blockquote>ssh -i ~/.ssh/ec2_key ubuntu@public_dns</blockquote><br/></li><br/></ol><br/></div><br/><div id="conclusion" class="section"><br/><h3>Conclusion</h3><br/>The <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-tut3/annotate/head%3A/scripts/start_instance.py">start_instance.py</a> script is currently very simple and should be considered a proof of concept.<br/><br/>Storing the node classification information in an S3 bucket also makes it easy to edit the content of the file. It also provides an easy way to get a list of the nodes that have been started by the Cloud Conductor, as well as their classification.<br/><br/>If you look at the <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-tut3/annotate/head%3A/scripts/start_instance.py">start_instance.py</a> script you'll notice that the ACL on the S3 bucket is 'public-read'. That means anyone can read the list of your nodes as well as the list of classes and other node classification information for each of them. You may want to use private S3 URLs instead.<br/><br/>We now have a puppet infrastructure where <tt class="docutils literal">instances</tt> are started by a <tt class="docutils literal">Cloud conductor</tt> in order to achieve a specific task. These <tt class="docutils literal">instances</tt> automatically connect to the <tt class="docutils literal">puppetmaster</tt> to get configured for the task they've been created for. All of the <tt class="docutils literal">instances</tt> configuration is stored in a reliable and scalable system: S3.<br/><br/>With instances being created on demand our puppet infrastructure can grow quickly. The <tt class="docutils literal">puppetmaster</tt> can easily be responsible for managing hundreds of instances. Next we'll have a look at improving the performance of the <tt class="docutils literal">puppetmaster</tt>.<br/><br/></div>
<h2>MySQL 5.1 Bug Zap: Bug day result</h2>Today was targeted at looking through the <a href="https://bugs.launchpad.net/ubuntu/+source/mysql-dfsg-5.1">mysql-dfsg-5.1 bugs</a> to triage them.
We now have a puppet infrastructure where <tt class="docutils literal">instances</tt> are started by a <tt class="docutils literal">Cloud conductor</tt> in order to achieve a specific task. These <tt class="docutils literal">instances</tt> automatically connect to the <tt class="docutils literal">puppetmaster</tt> to get configured for the task they've been created for. All of the <tt class="docutils literal">instances</tt>' configuration is stored in a reliable and scalable system: S3.<br/><br/>With instances being created on demand our puppet infrastructure can grow quickly. The <tt class="docutils literal">puppetmaster</tt> can easily be responsible for managing hundreds of instances. Next we'll have a look at improving the performance of the <tt class="docutils literal">puppetmaster</tt>.<br/><br/></div>Unknownnoreply@blogger.com6tag:blogger.com,1999:blog-3358115540372447242.post-52245861497327213372010-03-30T17:03:00.000-04:002010-11-29T23:00:01.754-05:00MySQL 5.1 Bug Zap: Bug day resultToday was targeted at looking through the <a href="https://bugs.launchpad.net/ubuntu/+source/mysql-dfsg-5.1">mysql-dfsg-5.1 bugs</a> to triage them. We ended up with all bugs having their importance set and their status set to Triaged or Incomplete.<br/><br/>Tomorrow will be dedicated to fixing most of them as well as some upgrade testing. I'll also have a look at the new mysql upstart job that has replaced the mysqld_safe script.<br/><br/>Looking at the bugs today I've found a couple of bugs that look easy to fix:<br/><ul><br/> <li><a href="https://bugs.launchpad.net/ubuntu/+source/mysql-dfsg-5.1/+bug/552053">Bug 552053</a>: mysqld_safe should be available in mysql-server</li><br/> <li> <a href="https://bugs.launchpad.net/ubuntu/+source/mysql-dfsg-5.1/+bug/498939">Bug 498939</a>: mysql- packages section on synaptic</li><br/></ul><br/>To get started grab a copy of the package branch:<br/><pre>bzr init-repo mysql-dfsg-5.1/<br/>cd mysql-dfsg-5.1/<br/>bzr branch lp:ubuntu/mysql-dfsg-5.1<br/></pre><br/>Fix a bug and push the branch to Launchpad:<br/><pre>bzr push lp:~YOUR-LOGIN/ubuntu/mysql-dfsg-5.1/zap-bug-XXXXXX</pre><br/>And finish up by creating a merge proposal for the Lucid package branch. I'll take a look at the list of merge proposals throughout the day and include them in the upload schedule for tomorrow.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-8564329806878227372010-03-29T16:45:00.000-04:002010-11-29T19:08:41.815-05:00Ubuntu Server Bug Zap: MySQL 5.1<p>Following up on the <a class="reference external" href="http://blog.dustinkirkland.com/2010/03/server-bug-zapping-kvm-in-retrospective.html">kvm</a> and <a class="reference external" href="http://fnords.wordpress.com/2010/03/07/samba-bugzapping/">samba</a> bug zap days I'm organizing a two-day bug zap around MySQL.</p><br/><div class="section" id="first-phase-bug-triaging"><br/><h3>First phase: bug triaging</h3><br/><p>First in line is triaging all the bugs related to the <a class="reference external" href="http://lucid-mysql-bug-zap.notlong.com">mysql-dfsg-5.1 package</a>. 
As of Tue Mar 30 00:23:04 UTC 2010 there are 27 bugs waiting to be looked at.</p><br/><p>The goal is to have the <a class="reference external" href="https://wiki.ubuntu.com/Bugs/Importance">importance</a> set for all bugs and to move the <a class="reference external" href="https://wiki.ubuntu.com/Bugs/Status">status</a> of as many bugs as possible to either Triaged or Invalid/Won't Fix.</p><br/><p>A few resources are available to help out:</p><br/><blockquote><br/><ul class="simple"><br/><li>The <a class="reference external" href="https://wiki.ubuntu.com/DebuggingMySQL">Debugging MySQL wiki page</a></li><br/><li>The #ubuntu-server IRC channel</li><br/><li>The <a class="reference external" href="http://lucid-mysql-bug-zap.notlong.com">up-to-date list of bugs</a> that still need to be triaged.</li><br/></ul><br/></blockquote><br/><p>Objective: get the <a class="reference external" href="http://lucid-mysql-bug-zap.notlong.com">list of bugs</a> to <em>zero</em>.</p><br/></div>Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-3358115540372447242.post-50545353222118747012010-03-25T12:54:00.000-04:002010-11-29T19:08:41.817-05:00Using puppet in UEC/EC2: Automating the signing process<p>I outlined in the <a class="reference external" href="http://ubuntumathiaz.wordpress.com/2010/03/24/using-puppet-in-uecec2-puppet-support-in-ubuntu-images/">previous article</a> how to set up a puppetmaster instance on UEC/EC2 and how to start instances that will automatically register with the puppetmaster. We're going to look at automating the process of signing puppet client certificate requests.</p><br/><div class="section" id="overview"><br/><h3>Overview</h3><br/><p>Our puppet infrastructure on the cloud can be broken down into three components:</p><br/><ul class="simple"><br/><li>The <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt> responsible for starting new instances in our cloud.</li><br/><li>A <tt class="docutils literal"><span class="pre">Puppetmaster</span></tt> responsible for configuring all the instances running in our cloud.</li><br/><li><tt class="docutils literal"><span class="pre">Instances</span></tt> acting as puppet clients asking to be set up correctly.</li><br/></ul><br/><p>The idea is to have the <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt> start <tt class="docutils literal"><span class="pre">instances</span></tt> and notify the <tt class="docutils literal"><span class="pre">puppetmaster</span></tt> that these new instances are coming up. The <tt class="docutils literal"><span class="pre">puppetmaster</span></tt> can then automatically sign their certificate requests.</p><br/><p>We'll use S3 as the way to communicate between the <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt> and the <tt class="docutils literal"><span class="pre">puppetmaster</span></tt>. The <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt> will also assign a random certificate name to each instance it starts.</p><br/><p>The <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt> will be located on a sysadmin workstation while the <tt class="docutils literal"><span class="pre">puppetmaster</span></tt> and <tt class="docutils literal"><span class="pre">instances</span></tt> will be running in the cloud. 
The <a class="reference external" href="https://code.launchpad.net/~mathiaz/+junk/uec-ec2-puppet-config-tut2">bzr branch</a> contains all the scripts necessary to set up such a solution.</p><br/></div><br/><div class="section" id="the-cloud-conductor-start-instance-py"><br/><h3>The Cloud conductor: start_instance.py</h3><br/><ol class="arabic"><br/><li><p class="first">Get the tutorial2 bzr branch on the <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt> (an admin workstation):</p><br/><blockquote><br/><p>bzr branch lp:~mathiaz/+junk/uec-ec2-puppet-config-tut2</p><br/></blockquote><br/><p>In the <tt class="docutils literal"><span class="pre">scripts/</span></tt> directory <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-tut2/annotate/head%3A/scripts/start_instance.py">start_instance.py</a> plays the role of the <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt>. It creates new instances and stores their certname in S3. The <em>start_instance.yaml</em> configuration file provides almost the same information as the <tt class="docutils literal"><span class="pre">user-data.yaml</span></tt> file we used in the previous article.</p><br/></li><br/><li><p class="first">Edit the <em>start_instance.yaml</em> file and update each setting:</p><br/><ul class="simple"><br/><li>Choose a unique S3 bucket name.</li><br/><li>Use the private DNS hostname of the instance running the puppetmaster.</li><br/><li>Add the puppetmaster ca certificate found on the puppetmaster.</li><br/></ul><br/></li><br/><li><p class="first">Make sure your AWS/UEC credentials are available in the environment. <em>start_instance.py</em> uses these to access EC2 to start new instances and S3 to store the instance certificate names.</p><br/></li><br/><li><p class="first">Start a new instance of the Lucid Beta1 AMI:</p><br/><blockquote><br/><p>./start_instance.py -c ./start_instance.yaml ami-ad09e6c4</p><br/></blockquote><br/><p><em>start_instance.py</em> starts a new instance using the AMI specified on the command line. The instance user data holds a random UUID for the puppet client certificate name. <em>start_instance.py</em> also creates a new file in its S3 bucket named after the puppet client certificate name. A condensed sketch of these two steps follows this list.</p><br/></li><br/><li><p class="first">Looking at the puppetmaster log on the puppetmaster you should see a certificate request show up after some time:</p><br/><blockquote><br/><p>Mar 19 19:09:33 ip-10-245-197-226 puppetmasterd[20273]: a83b0057-ab8d-426e-b2ab-175729742adb has a waiting certificate request</p><br/></blockquote><br/></li><br/></ol><br/>
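The two side effects described in step 4 could be sketched with the boto library along these lines. This is a simplified sketch, not the actual <em>start_instance.py</em>: the real script reads its settings from <em>start_instance.yaml</em> and builds the complete cloud-config user data.<br/><pre>#!/usr/bin/env python
# Sketch of the Cloud conductor's two side effects: record a random
# certname in the S3 bucket, then boot an instance whose user data
# carries that certname to the puppet client.
import uuid
import boto

BUCKET = 'mathiaz-puppet-nodes-1'   # use your own bucket name
AMI = 'ami-ad09e6c4'                # Lucid Beta1

certname = str(uuid.uuid4())

# The file named after the certname is what the puppetmaster will
# look for before signing (the real script may store more in it).
key = boto.connect_s3().get_bucket(BUCKET).new_key(certname)
key.set_contents_from_string('')

# Only the certname line of the cloud-config user data is shown
# here; the section name may differ between puppet versions.
user_data = '#cloud-config\npuppet:\n  conf:\n    puppetd:\n      certname: %s\n' % certname
boto.connect_ec2().run_instances(AMI, user_data=user_data)</pre><br/>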
</div><br/><div class="section" id="automating-the-signing-process-on-the-puppetmaster"><br/><h3>Automating the signing process on the puppetmaster</h3><br/><p>It's time to set up the puppetmaster to check whether any certificate requests are waiting and to sign only the ones started by the <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt>. We'll use the <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-tut2/annotate/head%3A/scripts/check_csr.py">check_csr.py cron job</a>, which gets the list of waiting certificate requests via <tt class="docutils literal"><span class="pre">puppetca</span> <span class="pre">--list</span></tt> and checks whether there is a corresponding file in the S3 bucket (a condensed sketch of that logic follows the steps below).</p><br/><ol class="arabic"><br/><li><p class="first">On the puppetmaster get the <cite>tutorial2</cite> bzr branch:</p><br/><blockquote><br/><p>bzr pull --remember lp:~mathiaz/+junk/uec-ec2-config/tut2 /etc/puppet/</p><br/></blockquote><br/></li><br/><li><p class="first">The <a class="reference external" href="http://bazaar.launchpad.net/%7Emathiaz/%2Bjunk/uec-ec2-puppet-config-tut2/annotate/head%3A/manifests/puppetmaster.pp">puppetmaster.pp</a> manifest has been updated to set up the <tt class="docutils literal"><span class="pre">check_csr.py</span></tt> cron job to run every 2 minutes. You need to update the cron job command line in <tt class="docutils literal"><span class="pre">/etc/puppet/manifests/puppetmaster.pp</span></tt> with your own S3 bucket name.</p><br/></li><br/><li><p class="first">Update the puppetmaster configuration:</p><br/><blockquote><br/><p>sudo puppet /etc/puppet/manifests/puppetmaster.pp</p><br/></blockquote><br/></li><br/><li><p class="first">Watching <tt class="docutils literal"><span class="pre">/var/log/syslog</span></tt> you should see <em>check_csr</em> being run by cron every other minute:</p><br/><blockquote><br/><p>Mar 19 19:10:01 ip-10-245-197-226 CRON[21858]: (root) CMD (/usr/local/bin/check_csr --log-level=debug <a class="reference external" href="https://mathiaz-puppet-nodes-1.s3.amazonaws.com">https://mathiaz-puppet-nodes-1.s3.amazonaws.com</a>)</p><br/></blockquote><br/><p><em>check_csr</em> gets the list of waiting certificate requests and checks if there is a corresponding file in its S3 bucket:</p><br/><blockquote><br/><p>Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: DEBUG: List of waiting csr: a83b0057-ab8d-426e-b2ab-175729742adb<br/>Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: DEBUG: Checking a83b0057-ab8d-426e-b2ab-175729742adb<br/>Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: DEBUG: Checking url <a class="reference external" href="https://mathiaz-puppet-nodes-1.s3.amazonaws.com/a83b0057-ab8d-426e-b2ab-175729742adb">https://mathiaz-puppet-nodes-1.s3.amazonaws.com/a83b0057-ab8d-426e-b2ab-175729742adb</a></p><br/></blockquote><br/><p>If so it will sign the certificate request:</p><br/><blockquote><br/><p>Mar 19 19:10:03 ip-10-245-197-226 check_csr[21859]: INFO: Signing request: a83b0057-ab8d-426e-b2ab-175729742adb</p><br/></blockquote><br/></li><br/></ol><br/>
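Condensed to its essentials, the logic of the cron job looks roughly like this. This is a sketch only - the actual <em>check_csr.py</em> in the branch also takes the bucket URL as an argument and does proper logging and error handling.<br/><pre>#!/usr/bin/env python
# Sketch of the check_csr logic: sign a waiting certificate request
# only if a file with the same name exists in the S3 bucket written
# by the Cloud conductor.
import subprocess
import urllib2

BUCKET_URL = 'https://mathiaz-puppet-nodes-1.s3.amazonaws.com'

# puppetca --list prints the waiting certificate requests.
output = subprocess.Popen(['puppetca', '--list'],
                          stdout=subprocess.PIPE).communicate()[0]
for certname in output.split():
    try:
        urllib2.urlopen('%s/%s' % (BUCKET_URL, certname))
    except urllib2.HTTPError:
        continue  # unknown node: leave the request unsigned
    subprocess.check_call(['puppetca', '--sign', certname])</pre><br/>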
</div><br/><div class="section" id="s3-bucket-acl"><br/><h3>S3 bucket ACL</h3><br/><p>For now the S3 bucket ACL is set so that anyone can list the files available in the bucket. However only authenticated requests can create new files in the bucket. Given that the filenames are just random UUIDs this is not a big issue.</p><br/></div><br/><div class="section" id="using-sqs-instead-of-s3"><br/><h3>Using SQS instead of S3</h3><br/><p>Another implementation of the same idea is to use SQS to handle the notification of the <tt class="docutils literal"><span class="pre">puppetmaster</span></tt> by the <tt class="docutils literal"><span class="pre">Cloud</span> <span class="pre">conductor</span></tt> about new <tt class="docutils literal"><span class="pre">instances</span></tt>. While SQS would seem to be the best tool to provide that functionality, it is not available in UEC in Lucid.</p><br/></div><br/><div class="section" id="conclusion"><br/><h3>Conclusion</h3><br/><p>We end up with a puppet infrastructure where legitimate instances are automatically accepted. Now that instances can easily show up and be automatically enrolled, <em>what</em> should these be configured as? We'll dive into this issue in the next article.</p><br/></div>Unknownnoreply@blogger.com9tag:blogger.com,1999:blog-3358115540372447242.post-5129098033992647432010-03-24T12:56:00.000-04:002010-11-29T19:08:41.822-05:00Using puppet in UEC/EC2: puppet support in Ubuntu images<p>One of the focus areas for the Lucid release cycle in the Ubuntu Server team is to <a class="reference external" href="https://blueprints.launchpad.net/ubuntu/+spec/server-lucid-puppet-uec-ec2-integration">improve the integration between puppet and UEC/EC2</a>. I'll discuss in a series of articles how to set up a puppet infrastructure to manage Ubuntu Lucid instances running on UEC/EC2. I'll focus on the bootstrapping process rather than writing puppet recipes.</p><br/><p>Today we'll look at configuring a puppetmaster in an instance and how to start instances that will register automatically with the puppetmaster.</p><br/><p>We'll work with the <a class="reference external" href="http://uec-images.ubuntu.com/releases/lucid/beta-1/">Lucid Beta1 image</a> on EC2. All the instances started throughout this article will be based on this AMI.</p><br/><div class="section" id="puppetmaster-setup"><br/><h2>Puppetmaster setup</h2><br/><p>Let's start by creating a puppetmaster running on EC2. We'll set up all the puppet configuration via ssh using a bzr branch on Launchpad: <a class="reference external" href="https://code.launchpad.net/~mathiaz/+junk/uec-ec2-puppet-config-tut1">lp:~mathiaz/+junk/uec-ec2-puppet-config-tut1</a>.</p><br/><p>Start an instance of the Lucid Beta1 AMI using an ssh key. Once it's running write down its public and private DNS addresses. The public DNS address will be used to set up the puppetmaster via ssh. 
The private DNS address will be used as the puppetmaster hostname given out to puppet clients.</p><br/><p>We'll actually install the puppetmaster using puppet itself.</p><br/><p>Log into the started instance via ssh to install and set up the puppetmaster:</p><br/><ol class="arabic"><br/><li><p class="first">Update apt files:</p><br/><blockquote><br/><p>sudo apt-get update</p><br/></blockquote><br/></li><br/><li><p class="first">Install the puppet and bzr packages:</p><br/><blockquote><br/><p>sudo apt-get install puppet bzr</p><br/></blockquote><br/></li><br/><li><p class="first">Change the ownership of the puppet directory so that the ubuntu user can directly edit the puppet configuration files:</p><br/><blockquote><br/><p>sudo chown ubuntu:ubuntu /etc/puppet/</p><br/></blockquote><br/></li><br/><li><p class="first">Get the puppet configuration branch:</p><br/><blockquote><br/><p>bzr branch --use-existing-directory lp:~mathiaz/+junk/uec-ec2-puppet-config-tut1 /etc/puppet/</p><br/></blockquote><br/><p>Before doing the actual configuration let's have a look at the content of the <tt class="docutils literal"><span class="pre">/etc/puppet/</span></tt> directory created from the bzr branch.</p><br/><p>The layout follows the <a class="reference external" href="http://projects.reductivelabs.com/projects/puppet/wiki/Puppet_Best_Practice">recommended puppet practices</a>. The puppet module available in the <tt class="docutils literal"><span class="pre">modules</span></tt> directory defines a <tt class="docutils literal"><span class="pre">puppet::master</span></tt> class. The class makes sure that the puppetmaster package is installed and that the puppetmaster service is running. The <tt class="docutils literal"><span class="pre">manifests/puppetmaster.pp</span></tt> file defines the default node to be configured as a puppetmaster.</p><br/></li><br/><li><p class="first">We'll now run the <tt class="docutils literal"><span class="pre">puppet</span></tt> client to set up the instance as a puppetmaster:</p><br/><blockquote><br/><p>sudo puppet /etc/puppet/manifests/puppetmaster.pp</p><br/></blockquote><br/></li><br/></ol><br/></div><br/><div class="section" id="starting-a-new-instance"><br/><h2>Starting a new instance</h2><br/><p>Now that we have a puppetmaster available in our cloud, we'll look at how new instances of the Lucid Beta1 AMI can be started and automatically set up to register with the puppetmaster.</p><br/><p>We're going to use the <a class="reference external" href="http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/lucid/cloud-init/lucid/annotate/head%3A/doc/examples/cloud-config-puppet.txt">cloud-config puppet syntax</a> to boot an instance and have it configure itself to connect to the puppetmaster using its user data information:</p><br/><ol class="arabic"><br/><li><p class="first">On the puppetmaster instance create a <tt class="docutils literal"><span class="pre">user-data.yaml</span></tt> file to include the relevant puppetmaster configuration:</p><br/><blockquote><br/><p>cp /usr/share/doc/cloud-init/examples/cloud-config-puppet.txt user-data.yaml</p><br/></blockquote><br/></li><br/><li><p class="first">Update the <tt class="docutils literal"><span class="pre">server</span></tt> setting to point to the puppetmaster <em>private</em> DNS hostname. 
I also strongly recommend including the puppetmaster ca certificate as the <tt class="docutils literal"><span class="pre">ca_cert</span></tt> setting.</p><br/><p>The example <tt class="docutils literal"><span class="pre">certname</span></tt> setting uses string interpolation to make each puppet client certificate unique: for now %i is replaced by the instance ID while %f is replaced by the FQDN of the instance.</p><br/><p>The sample file has extensive comments about the format of the file. One of the key points is that you can set any of the puppet configuration options via the user data passed to the instance.</p><br/><p>Note that you can remove all the comments to make the <tt class="docutils literal"><span class="pre">user-data.yaml</span></tt> file easier to copy and paste. However don't remove the first line (<tt class="docutils literal"><span class="pre">#cloud-config</span></tt>) as this is used by the instance boot process to start the puppet installation. A trimmed-down sketch of the resulting file follows this list.</p><br/></li><br/><li><p class="first">Launch a new instance using the content of the <tt class="docutils literal"><span class="pre">user-data.yaml</span></tt> file you've just created as the <tt class="docutils literal"><span class="pre">user-data</span></tt> option passed to the new instance.</p><br/></li><br/><li><p class="first">You can watch the puppetmaster log on the puppetmaster instance to see when the new instance will request a new certificate:</p><br/><blockquote><br/><p>tail -f /var/log/syslog</p><br/></blockquote><br/></li><br/><li><p class="first">After some time you should see a request coming in:</p><br/><blockquote><br/><p>puppetmasterd[2637]: i-fdb31b96.ip-10-195-18-227.ec2.internal has a waiting certificate request</p><br/></blockquote><br/><p>During the boot process of the new instance the puppet cloud-config plugin used the user-data information to automatically install the puppet package, generate the <tt class="docutils literal"><span class="pre">/etc/puppet/puppet.conf</span></tt> file and start the <tt class="docutils literal"><span class="pre">puppetd</span></tt> daemon.</p><br/></li><br/><li><p class="first">You can then approve the new instance:</p><br/><blockquote><br/><p>sudo puppetca -s i-fdb31b96.ip-10-195-18-227.ec2.internal</p><br/></blockquote><br/></li><br/><li><p class="first">Watching the puppetmaster log you'll see that after some time the new instance will connect and get its new manifest compiled and sent:</p><br/><blockquote><br/><p>puppetmasterd[2637]: Compiled catalog for i-fdb31b96.ip-10-195-18-227.ec2.internal in 0.03 seconds</p><br/></blockquote><br/></li><br/></ol><br/>
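Stripped of all the comments, the <tt class="docutils literal"><span class="pre">user-data.yaml</span></tt> file sketched in step 2 boils down to something like the following. Take this as an illustration only: the certificate content is elided and the exact section name (<tt class="docutils literal">puppetd</tt> here, matching the puppet.conf section used by the puppet client on Lucid) should be checked against the sample file.<br/><pre>#cloud-config
puppet:
  conf:
    puppetd:
      server: "PUPPETMASTER_PRIVATE_DNS"
      certname: "%i.%f"
    ca_cert: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----</pre><br/>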
<p>In conclusion we now have an instance acting as a puppetmaster and a single user-data configuration for the whole puppet infrastructure. That user data can be passed to new instances which will automatically register with our puppetmaster.</p><br/><p>Even though we're able to make all our instances automatically register with our puppetmaster, we still need to manually sign each request as outlined in step 6 above. We'll have a look at automating this step in the <a href="http://ubuntumathiaz.wordpress.com/2010/03/25/using-puppet-in-uecec2-automating-the-signing-process/">next article</a>.</p><br/></div>Unknownnoreply@blogger.com5tag:blogger.com,1999:blog-3358115540372447242.post-82752020402122813062010-02-17T15:21:00.000-05:002010-11-29T19:08:41.825-05:00FOSDEM 2010I had the opportunity to attend <a href="http://fosdem.org/2010/">FOSDEM </a>this year. The most amazing (and frustrating) part of the event was the huge number of talks that were given. Making choices was sometimes hard. However the FOSDEM team recorded most of the sessions and the videos are available now. Perfect for such a busy event!<br/><br/>Here are a few highlights of the presentations I attended:<br/><br/><h3>Linux distribution for the cloud</h3><br/>I started the day by attending a session by Peter Eisentraut (a Debian and PostgreSQL core developer) about Linux distributions for the cloud. He focused on the provisioning aspect of clouds by giving a history of how operating systems were installed: from floppy drives to Cloud images. He dedicated one slide to Ubuntu's cloud offering including Canonical Landscape, commenting that Ubuntu is clearly the leader of distributions in the cloud space. He also outlined the current problems, such as lack of standards and integration of existing software stacks. He pointed out that linux distributions could drive this.<br/><br/>The second part of his talk was focused on the Linux desktop and the impact of cloud services on it. Giving ChromeOS as an example he outlined how applications themselves were moved to the cloud. He then listed the problems with Cloud services with regards to the Free Software principles: non-free server side code, non-free hosting, little or no control over data, lack of an open-source community.<br/><br/>He concluded by outlining the challenge in the domain: how could free software principles be transposed to the Cloud and its services? One great reference is Eben Moglen's <a href="http://www.isoc-ny.org/?p=1338">talk named "Freedom in the Cloud"</a>.<br/><br/><h3>Beyond MySQL GA</h3><br/>Kristian Nielsen, a MariaDB developer, gave an overview of the developer ecosystem around MySQL. He listed a few patches that were available to add new functionalities and fix some bugs: the Google patch, Percona patches, eBay patches, Galera multi-master replication for InnoDB as well as a growing list of storage engines. A few options are available to use them:<br/><ul><br/> <li>packages from third party repositories (such as ourdelta and percona)</li><br/> <li>MariaDB maintains and integrates most of the patches</li><br/> <li>a more do-it-yourself approach where you maintain a patch series.</li><br/></ul><br/><br/>I talked with Kristian after the presentation about leveraging bzr and LP to make the maintenance easier. It could look like this:<br/><ol><br/> <li>Each patch would be available and maintained in a bzr branch - in LP or elsewhere.</li><br/> <li>The Ubuntu MySQL package branch available in LP would be used as the base for creating a debian package (or the Debian packaging branch since Debian packages are also available in LP via bzr)</li><br/> <li>bzr-looms would glue the package bzr branch with the patches bzr branches. The loom could be available from LP or elsewhere.</li><br/> <li>bzr-builder would be used to create a recipe to build binary packages out of the loom branch.</li><br/> <li>Packages would be published in PPAs ready to be installed on Ubuntu systems.</li><br/></ol><br/><h3>The Cassandra distributed database</h3><br/>I finally managed to get into the NoSQL room to attend Eric Evans' overview of the Cassandra project. He is a full time developer and employee of Rackspace. The project was started by Facebook to power their inbox search. Even though the project had been available for some years the developer community really started to grow in March 2009. 
It is now part of the Apache project and about to graduate to a Top Level Project there.<br/><br/>It is inspired by Dynamo from Amazon and provides an O(1) DHT with eventual consistency and consistent hashing. Multiple client APIs are available:<br/><ul><br/> <li>Thrift</li><br/> <li>Ruby</li><br/> <li>Python</li><br/> <li>Scala</li><br/></ul><br/>I left before the end of the talk as I wanted to catch the complete presentation about using git for packaging.<br/><h3>Cross distro packaging with (top)git</h3><br/>Thomas Koch gave an overview of using git and related tools to help in maintaining Debian packaging. He works in a web shop where every web application is deployed as a Debian package.<br/><br/>The upstream release tarball is imported into an upstream git branch using the pristine-tar tool. Packaging code (ie the debian/ directory) is kept in a different branch.<br/><br/>Patches to the upstream code are managed by topgit as separate git branches. He also noted that topgit was able to export the whole stack of patches in the quilt Debian source format using the <em>tg export</em> command.<br/><br/>Here is the list of tools associated with his workflow:<br/><ul><br/> <li>pristine-tar</li><br/> <li>git-buildpackage</li><br/> <li>git-import-orig</li><br/> <li>git-dch</li><br/> <li>topgit</li><br/></ul><br/>The workflow he outlined looked very similar to the one based around bzr and looms.<br/><h3>Scaling Facebook with OpenSource tools</h3><br/>David Recordon from Facebook gave a <a href="http://fosdem.org/2010/schedule/events/scalingfacebook">good presentation</a> on the challenges that Facebook runs into when it comes to scaling effectively.<br/><br/>Here are a few numbers I caught during the presentation to give an idea about the scale of the Facebook infrastructure (Warning: they may be wrong - watch the video to double check):<br/><ul><br/> <li>8 billion minutes spent on Facebook every day</li><br/> <li>2.5 billion pictures uploaded every month</li><br/> <li>400 billion page views per month</li><br/> <li>25 TB of logs per day</li><br/> <li>40 billion pictures stored in 4 resolutions, which brings a grand total of 160 billion photos</li><br/> <li>4 million lines of PHP code</li><br/></ul><br/>Their overall architecture can be broken into the following components:<br/><ol><br/> <li> Load balancers</li><br/> <li>Web server (php): Most of the code is written in PHP: the language is simple, it fits well for fast development environments and there are a lot of developers available. A few of the problems are CPU, Memory, how to reuse the PHP logic in other systems and the difficulty of writing extensions to speed up critical parts of the code. An overview of the HipHop compiler was given: a majority of their PHP code can be converted to C++ code which is then compiled and deployed on their webservers. An apache module is coming up soon, probably as a fastcgi extension.</li><br/> <li>memcached (fast, simple): A core component of their infrastructure. It's robust and scales well: 120 million queries/second. They wrote some patches which are now making their way upstream.</li><br/> <li>Services (fast, complicated): David gave an overview of some of the services that Facebook opensourced:<br/> <ul><br/> <li>Thrift: an rpc server, now part of the Apache incubator.</li><br/> <li>Hive: built on top of hadoop, it is now part of the Apache project. 
It's an SQL-like frontend to hadoop aiming at simplifying access to the hadoop infrastructure so that more people (ie non-engineers) can write and run data analysis jobs.</li><br/> <li>Scribe: a performant and scalable logging system. Logs are stored in a hadoop/hive cluster to help in data analysis.</li><br/> </ul><br/></li><br/> <li>Databases (slow, simple): Thousands of MySQL servers are used as a persistence layer. InnoDB is used for the storage engine and multiple independent clusters are used for reliability. Joins are done at the webserver layer. The database layer is actually the persistent storage layer with memcached acting as a distributed index.</li><br/></ol><br/><h3>Other talks that seemed interesting</h3><br/>I had planned to attend a few other talks as well. Unfortunately either their schedule conflicted with another interesting presentation or the room was completely full (which seemed to happen all day long with the NoSQL room). Here is a list of them:<br/><ol><br/> <li> <a href="http://fosdem.org/2010/schedule/events/nosql_fun_profit">NoSQL for Fun & Profit</a></li><br/> <li><a href="http://fosdem.org/2010/schedule/events/nosql_mongodb_intro">Introduction to MongoDB</a></li><br/> <li><a href="http://fosdem.org/2010/schedule/events/cloudlets">Cloudlets: universal server images for the cloud</a></li><br/> <li><a href="http://fosdem.org/2010/schedule/events/dist_project_builder">Continuous Packaging with Project-Builder.org</a></li><br/></ol>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-58098833911788712492009-12-21T06:19:00.000-05:002010-11-29T19:08:41.829-05:00RFC: Boot-time configuration syntax for UEC/EC2 imagesAs part of the <a href="https://blueprints.launchpad.net/ubuntu/+spec/server-lucid-ec2-config">Boot-time configuration for UEC/EC2 images specification</a>, a configuration file can be passed to instances as user-data to customize parts of the instance without writing and maintaining custom scripts.<br/><br/>The goal is to support the most common operations performed at instance boot, as well as to help bootstrap the instance to be part of an existing configuration management infrastructure.<br/><br/>It currently supports:<br/><ul><br/> <li>apt configuration</li><br/> <li>package installation</li><br/></ul><br/>Other requested features looked into include:<br/><ul><br/> <li><a href="http://alestic.com/2009/08/runurl">runurl</a> support</li><br/> <li>ssh host keys setup</li><br/></ul><br/>Should these be included as well?<br/><br/>Here is an example of a configuration file (using <a href="http://en.wikipedia.org/wiki/YAML">YAML</a> as the syntax):<br/>
<pre># Update apt database on first boot<br/># (ie run apt-get update)<br/>#<br/># Default: true<br/>#<br/>apt_update: false<br/><br/># Upgrade the instance on first boot<br/># (ie run apt-get upgrade)<br/>#<br/># Default: false<br/>#<br/>apt_upgrade: true<br/><br/># Add apt repositories<br/>#<br/># Default: none<br/>#<br/>apt_sources:<br/><br/> # PPA shortcut:<br/> # * Setup correct apt sources.list line<br/> # * Import the signing key from LP<br/> #<br/> # See https://help.launchpad.net/Packaging/PPA for more information<br/> #<br/> - source: "ppa:user/ppa" # Quote the string<br/><br/> # Custom apt repository:<br/> # * Creates a file in /etc/apt/sources.list.d/ for the sources list entry<br/> # * [optional] Import the apt signing key from the keyserver<br/> # * Defaults:<br/> #   + keyserver: keyserver.ubuntu.com<br/> #   + filename: 00-boot-sources.list<br/> #<br/> # See the sources.list man page for more information about the format<br/> #<br/> - source: "deb http://archive.example.org lucid main restricted" # Quote the string<br/>   keyid: 12345678 # GPG key ID published on a key server<br/>   keyserver: keyserver.example.org<br/>   filename: 01-mirror-example.org.list<br/><br/> # Custom apt repository:<br/> # * The apt signing key can also be specified<br/> #   by providing a pgp public key block<br/> #<br/> # The apt repository will be added to the default sources.list file:<br/> # /etc/apt/sources.list.d/00-boot-sources.list<br/> #<br/> - source: "deb http://mirror.example.net/karmic/ ./" # Quote the string<br/>   key: | # The value needs to start with -----BEGIN PGP PUBLIC KEY BLOCK-----<br/>    -----BEGIN PGP PUBLIC KEY BLOCK-----<br/>    Version: SKS 1.0.10<br/><br/>    mI0ESXTsSQEEALuhrVwNsLIzCoaVRnrBIYraSUYCJatFcuvnhi7Q++kBBxx32JE487QgzmZc<br/>    ElIiiPxz/nRZO8rkbHjzu05Yx61AoZVByiztP0MFH15ijGocqlR9/R6BMm26bdKK22F7lTRi<br/>    lRxXxOsL2GPk5gQ1QtDXwPkHvAhjxGydV/Pcf81lABEBAAG0HUxhdW5jaHBhZCBQUEEgZm9y<br/>    IE1hdGhpYXMgR3VniLYEEwECACAFAkl07EkCGwMGCwkIBwMCBBUCCAMEFgIDAQIeAQIXgAAK<br/>    CRANXKLHCU0EIIJHBAC1NCwdLwchCPIQU2bd562/YWcB7QSgYD3j+Llqm8v6ghFQ0Bdygbn1<br/>    M6tzpwDiPxXQfZRqGhJsluCVHGLCQYNm0HDNisP4+YrZF3UkmAXDwZuh8K3LmvUPM+lLY8YJ<br/>    1qnFHp3eN9M8/SYEFN0wlaVAurZD13NaU34UePd46vPtzA==<br/>    =eVIj<br/>    -----END PGP PUBLIC KEY BLOCK-----<br/><br/># Add apt configuration files<br/># Add an apt.conf.d/ file with the relevant content<br/>#<br/># See apt.conf man page for more information.<br/>#<br/># Defaults:<br/>#   + filename: 00-boot-conf<br/>#<br/>apt_conf:<br/><br/> # Creates an apt proxy configuration in /etc/apt/apt.conf.d/01-proxy<br/> - filename: "01-proxy"<br/>   content: |<br/>    Acquire::http::Proxy "http://proxy.example.org:3142/ubuntu";<br/><br/> # Add the following line to /etc/apt/apt.conf.d/00-boot-conf<br/> # (run debconf at a critical priority)<br/> - content: |<br/>    DPkg::Pre-Install-Pkgs:: "/usr/sbin/dpkg-preconfigure --apt -p critical || true";<br/><br/># Provide debconf answers<br/>#<br/># See debconf-set-selections man page.<br/>#<br/># Default: none<br/>#<br/>debconf_selections: | # Need to preserve newlines<br/> # Force debconf priority to critical.<br/> debconf debconf/priority select critical<br/><br/> # Override default frontend to readline, but allow user to select.<br/> debconf debconf/frontend select readline<br/> debconf debconf/frontend seen false<br/><br/># Install additional packages on first boot<br/>#<br/># Default: none<br/>#<br/>packages:<br/> - openssh-server<br/> - postfix<br/></pre><br/>I would like to get feedback on the format as well as ideas for other features, either on the <a href="https://wiki.ubuntu.com/ServerLucidCloudConfig">wiki page</a> or in the comments section.Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-3358115540372447242.post-15170894194385376252009-11-30T14:57:00.000-05:002010-11-29T19:08:41.833-05:00RFP: packages to promote to main and demote to universe for Lucid Lynx
LTSThe Ubuntu Server team is requesting feedback on the <a href="https://wiki.ubuntu.com/LucidServerSeeds">list of packages</a> to be promoted to main and demoted to universe during this release cycle.<br/><br/>Lucid being an LTS release, we want to make sure that packages in main are maintainable for 5 years. Useful packages should be promoted to main while packages that provide duplicated functionality or are not maintained anymore should be demoted to universe.<br/><br/>The <a href="https://wiki.ubuntu.com/LucidServerSeeds">LucidServerSeeds wiki page</a> is used to track packages under discussion. If you want to add a package to this discussion you should edit the relevant section (either Proposed universe demotions or Proposed main promotions) of the wiki page.<br/><br/>For example the current list of proposed packages to be moved to universe includes<br/><ul><br/> <li>nis</li><br/> <li>elinks</li><br/> <li>lm-sensors</li><br/> <li>sensord</li><br/> <li>cricket</li><br/> <li>radvd</li><br/> <li>logwatch</li><br/> <li>vlock</li><br/> <li>lilo</li><br/> <li>libxp6</li><br/></ul><br/>The current packages being discussed for main promotion include acl, ctdb and tdb-tools (to support Samba clusters). A switch from autofs 4 to autofs 5 is also under discussion.<br/><br/>Any feedback is welcome and should be added to the <a href="https://wiki.ubuntu.com/LucidServerSeeds">wiki page</a>.Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-3358115540372447242.post-74492784052379697782009-10-16T16:27:00.001-04:002010-11-29T23:09:55.353-05:00Oct 13 - Oct 16 Wrap-up<h3>UEC</h3><br />
<ul><li>loads of testing. Uncovered new bugs and helped Dustin fix most of them.<br />
<ul><li> multiple installs on two sets of hardware in Montreal.</li>
<li>stress testing.</li>
</ul><br />
</li>
<li>helped Scott and others debug their UEC install.</li>
<li>reviewed and uploaded image-store-proxy (working now).</li>
</ul><br />
<h3>Upgrades testing</h3><br />
<ul><li> helped out mvo to add logic to handle the mysql 5.0 upgrade from jaunty to karmic.</li>
<li>Supported MySQL cluster setup.</li>
</ul><br />
<h3>Sponsoring</h3><br />
<ul><li>reviewed and sponsored checkbox for Marc.</li>
<li>reviewed and sponsored the new landscape-client upstream release.</li>
</ul>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-507901292221044672009-10-05T04:00:00.001-04:002010-11-29T23:10:28.442-05:00Sep 28 - Oct 02 Wrap-upLoads of Karmic Beta -server isos testing.<br />
<br />
One day of UEC Beta testing: chased down the failure of the auto-registration upstart scripts with Dustin and Matt. It turned out to be a bug in Upstart - already known by Scott, who has a simple fix (dbus related).<br />
<br />
Investigated a failed RAID installation: this is a known boot loader issue. Added a section about it to the Karmic release notes.<br />
<br />
Installed UNR Karmic beta on my mini 10v and wrote up a <a href="http://ubuntumathiaz.wordpress.com/2009/10/01/test-run-ubuntu-netbook-remix-9-10-beta-on-my-dell-mini-10v/">blog post</a> about it. Looks slick.<br />
<br />
Put shorewall back into main. It had fallen to universe due to a package rename in Debian.<br />
<br />
More work on the directory/krb5 infrastructure using puppet: added support for slapd modules and schemas to the puppet configuration. Slow progress towards a fully automated deployment of a directory+krb5 infrastructure for testing purposes in EC2.<br />
<br />
Updated the server team knowledge base with the lists of daily New/Undecided bugs so that daily triaging can be kicked off. The lists are automatically generated on <a href="http://qa.ubuntu.com/reports/ubuntu-server-team/">qa.ubuntu.com</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-89039473193888618572009-10-01T18:13:00.000-04:002010-11-29T19:08:41.850-05:00Test run: Ubuntu Netbook Remix 9.10 Beta on my Dell Mini 10vImpressive for a beta release. Of course there are a few glitches but overall it feels great: I'm writing this article from my Mini 10v running an Ubuntu Netbook Remix 9.10 Beta live system.<br/><br/>At the beginning of the week I received the Dell Mini 10v I had ordered a few weeks ago. I had chosen to upgrade some of the default components: my Mini 10v comes with 2 GB of RAM and a 16 GB SSD drive. And of course Ubuntu Hardy 8.04 LTS is installed by default at the factory. Now that the Beta of Karmic <a href="https://lists.ubuntu.com/archives/ubuntu-announce/2009-October/000125.html">has been released</a> I decided to take the opportunity to download <a href="http://releases.ubuntu.com/karmic/ubuntu-9.10-beta-netbook-remix-i386.iso">the Ubuntu Netbook Remix iso</a> and boot from a usb stick to see what this variant of Ubuntu looks like.<br/><h3>Load Ubuntu Netbook Remix on a USB key</h3><br/>But first things first. In order to be able to boot the UNR Beta iso, I had to put it on a usb stick. The <em>USB Startup Disk Creator</em> application located under System -> Administration proved to be the best option:<br/><ol><br/> <li>Download the <a href="http://releases.ubuntu.com/karmic/ubuntu-9.10-beta-netbook-remix-i386.iso">UNR Beta iso image</a>.</li><br/> <li>Connect your usb key to the computer. I was actually using a 1GB SD card from my camera with a USB adapter.</li><br/> <li>Open USB Startup Disk Creator.</li><br/> <li>Select the UNR beta iso image and the usb drive (which may need to be formatted).</li><br/> <li>And make the startup disk.</li><br/></ol><br/><h3>The boot experience</h3><br/>I plugged the usb stick into one of the Mini 10v usb ports, powered on my netbook and hit F12 early in the boot sequence to bring up the boot menu. And there - as the second choice - was my USB stick.<br/><br/>Loading the whole system took some time, during which I could admire the new boot experience - well I wasn't that surprised as my main laptop had been running Karmic for a while now. But it still looked slick as the new black and white theme matched my Mini 10v colours very well - black for most of the parts with a light grey stripe below the keyboard.<br/><br/>After being auto-logged in I was greeted with the new launcher and started to poke around. Turns out that tapping on the touch pad doesn't work. I had to use the buttons at the bottom to actually click (which is a bit annoying since the pad is sensitive around the click area - it can lead to some mouse movement while trying to click).<br/><h3>No wireless available</h3><br/>The restricted drivers manager popped up to tell me that I could install some non-free drivers. I had two choices, both related to the wireless card:<br/><ol><br/> <li>The B43xxx wireless driver. I tried to activate it: packages seemed to get installed - however the driver was still disabled after that.</li><br/> <li>The STA wireless driver. Tried to activate it as well. This time the driver seemed to install correctly. 
However a reboot of the system was required - which is a bit annoying when you run from a live USB key.</li><br/></ol><br/>Selecting each driver popped up a prompt for entering a password in order to be able to install packages. Turns out the password is empty and just pressing the Enter key makes things go away. I wonder if this dialogue could be completely deactivated during a live session - that would improve the experience for a completely new user.<br/><br/>So no wireless available on my Mini 10v running from the live USB key. Time to plug in a wired network cable. And a few seconds later I was connected to the Internet.<br/><h3>Application Names ...</h3><br/>In the Favorites sub menu - which is the first thing you see when your session starts - there are a couple of applications: Mozilla Web Browser, Evolution Mail and Calendar, Cheese, Empathy, Help, Ubuntu One and Install Ubuntu Netbook Remix 9.10. All of these choices have recognizable names except for Cheese and Empathy. Of course I know about these, being a long-time Ubuntu user - however it may be more difficult for a first-time user. Even though there is a small webcam as part of the Cheese icon and the Empathy icon kind of relates to communication, having a descriptive name would probably be helpful.<br/><h3>... and Ubuntu One ...</h3><br/>As for the Ubuntu One option, it doesn't give a clue about what it is. So my curious nature led me to start the application (well... I knew what Ubuntu One was as I had been an early beta-tester). The Ubuntu One icon appeared in the top menu bar. I could go to the web and log into my account by right-clicking on the icon. However I didn't find an obvious way to associate my local instance with my remote account.<br/><h3>... and sound</h3><br/>Further poking around led me to the Sound and Video sub menu where I tried to record a sound. The first attempt failed. Opening the Volume Control from the File Menu and going to the Input tab showed me that the input was actually muted. I unmuted it and voila - a few moments later I could hear my voice being played back!<br/><br/>So all in all I was pleasantly surprised by the beta version of UNR. A few glitches here and there (to <a href="https://bugs.launchpad.net/ubuntu/+filebug">be reported in LP</a> of course) but overall the experience was positive!<br/><h3>Next step:</h3><br/>Actually install the system on the local SSD drive and experience the fast boot of Ubuntu on my Mini 10v. With an SSD drive I expect it to be below (9.)<em>10 seconds</em>.Unknownnoreply@blogger.com15tag:blogger.com,1999:blog-3358115540372447242.post-30751280515517026822009-09-27T16:22:00.001-04:002010-11-29T23:10:35.553-05:00Sep 20 - Sep 25 Wrap-upSpent most of my week in Portland attending conferences.<br />
<h3>Conferences</h3><br />
<ul><li> Attended LDAPCon 2009 and published a <a href="http://ubuntumathiaz.wordpress.com/2009/09/25/a-summary-of-ldapcon-2009/">report</a>.</li>
</ul><br />
<ul><li>Attended LinuxCon 2009.</li>
</ul><br />
<h3>Image Store Proxy</h3><br />
<ul><li> Updated image-store-proxy to 1.0. This version brings support for gpg-signed images. It still needs testing against the real-world Canonical Image Store infrastructure.</li>
</ul>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-72773136627991062832009-09-25T16:02:00.000-04:002010-11-29T19:08:41.864-05:00A summary of LDAPCon 2009<div id="a-summary-of-ldapcon-2009" class="document"><br/><br/>On Sunday, September 20th and Monday, September 21st I attended <a class="reference external" href="http://www.symas.com/ldapcon2009/papers.shtml">LDAPCon 2009</a> in Portland, OR. Most of the open source projects were there - with the notable absence of <a class="reference external" href="http://port389.org/">Port 389</a> (Redhat) - as well as some vendors (Apple and <a class="reference external" href="http://www.unboundid.com/">UnboundID</a>). Most of the <a class="reference external" href="http://www.symas.com/ldapcon2009/papers.shtml">slides are available online</a>.<br/><div id="apache-directory-project" class="section"><br/><h3><a class="reference external" href="http://directory.apache.org/">Apache Directory</a> project</h3><br/>The <a class="reference external" href="http://directory.apache.org/">Apache Directory</a> folks gave several presentations:<br/><br/><a class="reference external" href="http://directory.apache.org/">Apache Directory Server</a> provides an integrated product with most of the standard network services: in addition to ldap, dns, dhcp, ntp and kerberos services can be enabled as part of a deployment. Kerberos support seems to be at an early stage as it <strong>almost</strong> works. Another interesting aspect of the project is its integration with the Eclipse environment. Apache Directory Server is embedded in Apache Directory Studio. The latter provides a management tool for directory administrators. If the Eclipse integration in Ubuntu is improved Apache Directory Studio would be a very good addition to the archive.<br/><br/>An overview of implementing replication in the Apache Directory Server project was given. <a class="reference external" href="http://www.rfc-editor.org/rfc/rfc4533.txt">RFC 4533</a> is used as the basis for LDAP replication in <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a>. The goal here was to be able to replicate between Apache Directory Server and <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a>. This may be the start of a standard replication protocol between different directory products.<br/><br/>Three components needed to be implemented:<br/><ul><br/> <li>the consumer part is the easiest and can be a standalone component. It receives LDAP entry updates and can do whatever it wants with them. It reminds me of similar requests I heard at the MySQL User Conference last April where people were interested in having easier access to the MySQL replication log.</li><br/> <li>the producer is more complex to implement as it requires keeping a log of the modifications done on the server.</li><br/> <li>conflict resolution is the hardest part and mandatory if multi-master is to be supported. The Apache Directory Server project decided to implement a <em>last writer wins</em> strategy as they're trying not to require any user intervention for conflict resolution. 
I'm not convinced this is the best strategy though.</li><br/></ul><br/>While implementing replication support they've also added support for stored procedures and triggers.<br/><br/></div><br/><div id="lsc-project-ldap-synchronization-connector" class="section"><br/><h3><a class="reference external" href="http://lsc-project.org/">LSC Project</a>: LDAP Synchronization Connector</h3><br/>Corporate environments usually have multiple identity repositories and keeping all of them in sync can be quite a challenge. The <a class="reference external" href="http://lsc-project.org/">LSC project</a> aims at automating the task of keeping all identity stores up-to-date. Written in java, it can read and write to any database or LDAP directory. On-the-fly transformations of data sources are possible and the framework tries to make it easy to implement new synchronisation policies.<br/><br/>Another great tool that could be added to the directory administrator toolbox to help integrate Ubuntu into existing infrastructures.<br/><br/></div><br/><div id="storing-ldap-data-in-mysql-cluster-openldap-and-opends" class="section"><br/><h3>Storing LDAP Data in MySQL Cluster (<a class="reference external" href="http://www.openldap.org/">OpenLDAP</a> and <a class="reference external" href="http://www.opends.org/">OpenDS</a>)</h3><br/>This was a joint presentation between the <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a> and <a class="reference external" href="http://www.opends.org/">OpenDS</a> projects. A new backend has been added to store entries using the MySQL Cluster NDB API. The main advantage is being able to access the same data over SQL and LDAP as well as providing a highly-available infrastructure with data distributed on multiple nodes. Both <a class="reference external" href="http://www.opends.org/">OpenDS</a> and <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a> have worked together to create a common data model, highlighting that cooperation does happen in the LDAP space.<br/><br/></div><br/><div id="a-panel-discussion-among-the-representatives-of-the-various-ldap-projects-on-roadmaps" class="section"><br/><h3>A Panel discussion among the representatives of the various LDAP Projects on roadmaps</h3><br/>Sunday ended with a panel where representatives of different directory vendors answered questions from the audience. Each open source project briefly outlined a few points they were trying to improve: documentation for <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a>, data migration for <a class="reference external" href="http://directory.apache.org/">Apache Directory</a> and multiple schema support for <a class="reference external" href="http://www.opends.org/">OpenDS</a>. The issue of virtual directories was also discussed, with the need for more GUIs to cover administration tools as well as workflows. Apache Directory Studio was given as a potential good starting point to build these higher level tools. The subject of standard ACLs was also covered. It seems that this is still a sensitive issue in the community and projects are still arguing about a common solution. One option put forward was to look at the X500 ACL model and start from there.<br/><br/>The last item of discussion covered how to expand the user base of directories. The world of directories is rather small and its use cases are usually associated with Identity Management (User and Group, Authentication). Having good client APIs was mentioned as an option. 
However the whole group ran out of ideas quickly and got somewhat stuck on this problem.<br/><br/></div><br/><div id="directory-standardization-status" class="section"><br/><h3>Directory Standardization Status</h3><br/>Directory standardization happens within two bodies: <a class="reference external" href="http://www.x500standard.com/">X500 in ISO/IEC</a> and LDAP in IETF. The most important topic currently discussed in both bodies is password policies. A new draft of an IETF document is being worked on by Howard Chu and Ludovic Poitou.<br/><br/><dl class="docutils"> <dt>Other topics being worked on include:</dt> <dd><br/><ul><br/> <li>Internationalization (with Unicode support in LDAPprep and SASLprep)</li><br/> <li>simple LDAP Transactions (to cover adding entries to different containers)</li><br/> <li>replacing DIGEST-MD5 with SCRAM</li><br/> <li>vCard support</li><br/></ul><br/></dd> </dl>On the front of Directory Application schemas, support for <strong>NFSv4 Federated Filesystem</strong> and <strong>an Information Model for Kerberos</strong> is currently being worked on, with drafts available for review.<br/><br/><dl class="docutils"> <dt>The question of starting a new LDAP working group within the IETF was raised. Topics that could be covered include:</dt> <dd><br/><ul class="first last simple"><br/> <li>LDAP Chaining Operation</li><br/> <li>Access controls: based on the X.500 model with extensibility added.</li><br/> <li>LDIF update</li><br/> <li>LDAP Sync/ LDAP Sync-based Replication</li><br/> <li>Complex Transactions</li><br/> <li>Password Policies</li><br/> <li>Directory views</li><br/> <li>Schema versioning</li><br/></ul><br/></dd> </dl></div><br/><div id="ldap-in-the-java-world" class="section"><br/><h3>LDAP in the java world</h3><br/>LDAP support in java is being actively worked on, especially on the SDK front. <a class="reference external" href="http://www.opends.org/">OpenDS</a>, <a class="reference external" href="http://directory.apache.org/">Apache Directory Server</a> and <a class="reference external" href="http://www.unboundid.com/">UnboundID</a> have released new open-sourced SDKs to improve on the aging JNDI and Netscape java SDKs. All of them are rather low-level implementations. The three projects are also working together to find common ground.<br/><br/>There is also some progress being made at the persistence level. The <a class="reference external" href="http://www.datanucleus.org/">DataNucleus</a> project gave an overview of adding LDAP support to the standard JDO interface. The goal is to provide a reference implementation of JDO for an LDAP data store.<br/><br/></div><br/><div id="unified-authentication-service-in-openldap" class="section"><br/><h3>Unified Authentication Service in <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a></h3><br/>Howard Chu gave an overview of the new modules developed in <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a> related to user authentication. Based on the work from nss-ldapd, the nssov overlay provides integration with the pam stack as well as the nss stack. Disconnected mode in the pcache overlay has been added in the latest version of openldap, as discussed during the Ubuntu Developer Summit last May. Most of this work is already available in Ubuntu Karmic and improvements should be made during the Lucid release cycle.<br/><br/>Another interesting module is the integrated certificate authority. 
If a search request for the userCertificate and userKey attributes of an entry is made and these attributes don't exist, they're generated on the fly. This should help in creating an X.509-based PKI.<br/><br/></div><br/><div id="ldap-innovations-in-the-opends-project" class="section"><br/><h3>LDAP Innovations in the <a class="reference external" href="http://www.opends.org/">OpenDS</a> project</h3><br/>The last session of the conference was given by Ludovic Poitou of the <a class="reference external" href="http://www.opends.org/">OpenDS</a> project. New features available in <a class="reference external" href="http://www.opends.org/">OpenDS</a> include tasks as well as extended syntax rules. Time matching rules have also been added so that queries like "give me entries that have a last login time older than 3 weeks" can be expressed directly in ldap and processed by the server. That brought up some interesting issues when clients and servers don't share the same timezone.<br/><br/></div><br/><div id="a-few-gems-from-beer-conversations" class="section"><br/><h3>A few gems from beer conversations</h3><br/>After the official sessions ended most of the attendees congregated to have dinner followed by beers. Howard showcased his G1 phone running slapd while Ludovic was showing off an LDAP client application on his iPhone. And of course by the end of the conference both systems were connected: the iPhone was able to look up contact information on the G1 running slapd.<br/><br/>On an unrelated note <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a> is faster than <a class="reference external" href="http://www.opends.org/">OpenDS</a>, even in beer drinking. However the <a class="reference external" href="http://www.openldap.org/">OpenLDAP</a> project was compared to a Beetle car with a Porsche engine whereas <a class="reference external" href="http://www.opends.org/">OpenDS</a> was actually building a Porsche.<br/><br/>Even though not all the players in the directory space were represented at the conference, most of the key players from the open source world were there presenting their work. Friendly competition exists amongst the different projects, which turns into cooperation on topics that matter such as interoperability and data formats.<br/><br/>It seems that the directory world is rather small and its use cases are restricted to specific situations compared to RDBMS. This is rather unfortunate as directories offer a compelling alternative to databases as a data store infrastructure. The community seems to be aware of this issue and is looking into breaking out of its traditional fields of application.<br/><br/></div><br/></div>Unknownnoreply@blogger.com9tag:blogger.com,1999:blog-3358115540372447242.post-19169012967627313982009-09-18T16:34:00.001-04:002010-11-29T23:10:41.813-05:00Sep 11 - Sep 18 Wrap-up<h2>Image-store-proxy</h2><br />
Packaged image-store-proxy to enable the Image Store tab in Eucalyptus. The package (python-image-store-proxy) has made its way into main and onto the -server isos in time for alpha6 with the help of Thierry and Kees.<br />
<h2>Server-karmic-directory-enabled-user-login</h2><br />
Kept investigating the use of puppet to build an ldap/krb5 infrastructure on EC2. Integrated the dnsmasq and puppetmaster configuration. Discovered a few bugs along the way and reported them upstream. My current work is available from <a href="https://code.launchpad.net/%7Emathiaz/+junk/puppet-config/">an LP branch</a>. And puppet is awesome!<br />
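<br />
The dnsmasq part follows the classic package/file/service pattern - a sketch, not the actual branch content (the config source path is made up):<br />
<pre># dnsmasq managed by puppet: install, configure, run
class dnsmasq {
  package { "dnsmasq": ensure => installed }

  file { "/etc/dnsmasq.conf":
    source  => "puppet:///modules/dnsmasq/dnsmasq.conf",
    require => Package["dnsmasq"],
    notify  => Service["dnsmasq"],
  }

  service { "dnsmasq":
    ensure => running,
    enable => true,
  }
}</pre>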
<h2>Alpha6 ISO testing</h2><br />
Loads of alpha6 testing.<br />
<h2>Landscape-client Stable Release Update</h2><br />
Reviewed the landscape-client and smart <a href="https://bugs.launchpad.net/bugs/347983">SRU requests</a> from the Landscape team.<br />
<h2>Bug scripts</h2><br />
With the help of Brian, <a href="https://code.launchpad.net/%7Emathiaz/+junk/multi-package-bugs-fixed/">my bug scripts</a> are now regularly run on qa.ubuntu.com. All bug lists used in the SRU review and the triaging process can be found <a href="http://qa.ubuntu.com/reports/ubuntu-server-team/">on qa.ubuntu.com</a>.<br />
<h2>Misc</h2><br />
Updated my status report script to publish a draft of my activity report on my blog as the weekly "wrap-up".Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-88762779113323100822009-09-11T15:22:00.001-04:002010-11-29T23:10:51.490-05:00Sep 07 - Sep 11 Wrap-up<h2><a href="https://blueprints.launchpad.net/ubuntu/+spec/server-karmic-directory-enabled-user-login"> Server-karmic-directory-enabled-user-login</a></h2><br />
Uploaded a new sssd package to fix lintian errors and pull in two fixes from upstream. Brainstormed with upstream about testing the package.<br />
<br />
Prepared and uploaded openldap 2.4.18 to Karmic once the FFe was granted. That completes the last part of the specification and brings disconnected mode support on the client via the pcache overlay.<br />
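<br />
For the curious, here is roughly what such a caching proxy looks like in <i>slapd.conf</i> - a sketch following the slapo-pcache(5) directives, with a made-up suffix, cache sizes and TTL:<br />
<pre># proxy to the remote server and cache search results locally
database        ldap
suffix          "dc=example,dc=com"
uri             "ldap://ldap.example.com"

overlay         pcache
# backend, max cached entries, number of attribute sets,
# entries per query limit, consistency check period
pcache          hdb 10000 1 50 1200
pcacheAttrset   0 cn sn uid
pcacheTemplate  (uid=) 0 3600
directory       /var/lib/ldap/cache</pre>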
<br />
Looked into using puppet to build an openldap/krb infrastructure to test all the directory-related components on the client side (sssd, openldap pcache overlay). The idea is to be able to bring up and tear down complete environments within minutes using a combination of EC2/UEC and puppet.<br />
<br />
Followed up on puppet's promotion into main for Karmic.<br />
<br />
Ended up writing a custom puppet type to handle slapd modules using the default Karmic configuration. This gave me a good overview of how puppet works.<br />
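<br />
A custom type boils down to a small chunk of Ruby - a sketch of the shape of it, not the actual branch content:<br />
<pre># lib/puppet/type/slapd_module.rb - declare slapd modules as resources
Puppet::Type.newtype(:slapd_module) do
  @doc = "Load a module into slapd via the cn=config backend."

  ensurable

  newparam(:name) do
    desc "The module name, e.g. back_hdb."
    isnamevar
  end
end</pre>
Manifests can then declare resources like <code>slapd_module { "back_hdb": ensure => present }</code> and leave the ldapmodify plumbing to the matching provider.<br />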
<h2>Image-store packaging</h2><br />
Looked at the packaging. Had a follow-up call with Gustavo. Should have a package ready on Monday in time for alpha6. More polishing will be done for beta.<br />
<h2>Apport in the default server install</h2><br />
Added apport to the default server install as requested by Steve Beattie for the <a href="https://blueprints.launchpad.net/ubuntu/+spec/karmic-qa-apport-in-ubuntu-server">karmic-qa-apport-in-ubuntu-server specification</a>.<br />
<h2>Linux-virtual missing virtio modules</h2><br />
Chased down and confirmed that the linux-virtual kernel doesn't have any of the virtio modules. <a href="https://bugs.launchpad.net/bugs/423426">Bug 423426</a> is milestoned and should be on the release team's radar. This is of high importance, as virtio VMs cannot boot in Karmic. Tim is working on it.<br />
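<br />
For the record, checking whether a given kernel flavour ships the modules only takes a couple of commands:<br />
<pre># list the virtio modules shipped with the running kernel
find /lib/modules/$(uname -r) -name 'virtio*'
# dry-run loading the block driver without touching the kernel
modprobe -n -v virtio_blk</pre>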
<h2>Mysql maintenance</h2><br />
Caught up on (lots of) mysql 5.0 and 5.1 bugs. Updated the <a href="https://wiki.ubuntu.com/DebuggingMySQL">DebuggingMySQL</a> page in the process of triaging bugs.<br />
<br />
Uploaded mysql 5.0 and 5.1 to fix a couple of bugs. Both mysql-server-core-5.{0,1} packages provide mysql-server-core, which should be used by packages requiring the mysqld binary (such as akonadi).<br />
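<br />
A package that only needs the mysqld binary can then follow the usual virtual package pattern in its debian/control - a sketch, not the actual akonadi packaging:<br />
<pre># prefer a real package, accept any provider of the virtual one
Depends: mysql-server-core-5.1 | mysql-server-core</pre>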
<h2>Sru-workflow</h2><br />
Wrote a script to get the list of ubuntu-server SRU bugs assigned to people. This produces the list of bugs remaining to be reviewed during the team meeting, as part of the updated SRU workflow in the ubuntu-server team.<br />
<h2>Sponsoring</h2><br />
Reviewed the checkbox merge proposal from Marc. Asked for an FFe as there is one new feature.<br />
<h2>LDAPcon/LinuxCon</h2><br />
Arranged travel for LDAPcon/LinuxCon in Portland, OR next week.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-26849920065012558832009-06-13T00:00:00.000-04:002010-11-29T19:08:41.872-05:00Merges of the Weekend: suggestions from the Ubuntu Server teamGot some time for a couple of merges this weekend? I've just updated the list of packages that look easy to merge on the <a href="https://wiki.ubuntu.com/ServerTeam/Roadmap">Ubuntu Server Team roadmap</a>:<br/><ul><br/> <li>vsftpd and amavisd-new in main</li><br/> <li>asterisk, heimdal and boinc in universe.</li><br/></ul><br/>The <a href="https://wiki.ubuntu.com/UbuntuDevelopment/Merging">Merging wiki page</a> gives an overview of the process, and once your debdiff is ready you can upload it to Karmic, or file a bug and ask for <a href="https://wiki.ubuntu.com/SponsorshipProcess">sponsorship</a> if you don't have access to the archive yet.<br/><br/>And if you knock down every package on the list above, <a href="https://merges.ubuntu.com/">Merge-O-Matic</a> provides a full list of packages waiting to be merged.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-43397875958807411322009-04-30T16:31:00.000-04:002010-11-29T19:08:41.874-05:00Are configuration management tools still needed in the cloud?Cloud is the buzzword of the year, and with the <a href="http://www.ubuntu.com/products/whatisubuntu/serveredition/cloud/uec">Ubuntu Enterprise Cloud</a> available in Ubuntu 9.04 everyone will be able to build their own private cloud to experiment with. As a cloud-based infrastructure makes computing resources more flexible and dynamic, it seems that configuration management tools will become more and more important in the future.<br/><h4>What's the purpose of a configuration management tool?</h4><br/>According to <a href="http://en.wikipedia.org/wiki/Configuration_management">Wikipedia</a>:<br/><blockquote><strong>Configuration management</strong> (CM) is a field of management that focuses on establishing and maintaining consistency of a system's or product's performance and its functional and physical attributes with its requirements, design, and operational information throughout its life.<sup class="reference"><a href="http://en.wikipedia.org/wiki/Configuration_management#cite_note-0"></a></sup> For information assurance, CM can be defined as the management of security features and assurances through control of changes made to hardware, software, firmware, documentation, test, test fixtures, and test documentation throughout the life cycle of an information system</blockquote><br/>While the definition above is quite generic, system administrators use configuration management tools such as <a href="http://reductivelabs.com/trac/puppet">puppet</a> or <a href="http://cfengine.org/">cfengine</a> to automate system deployments and to make sure that every instance providing a specific service has the same configuration. Another service provided by these tools is to automatically distribute configuration changes to all running systems.<br/><h4>How does this apply to a cloud infrastructure?</h4><br/>The cloud model as implemented by the Ubuntu Enterprise Cloud is based on the golden image principle. Each system is based on a static image. The cloud infrastructure is then used to spawn new instances of a specific image. This is one of the characteristics of such an infrastructure: deploying new systems is easier, faster and cheaper. 
Potential resources are much larger than before.<br/><br/>However one of the issues with the golden image model is that over time there is a drift between the running systems and the golden image. When a configuration update is made to the service, the offline golden image also needs to be updated. Moreover a configuration management system is still needed to push the changes to running systems.<br/><br/>Let's take the example of a web hosting infrastructure running 20 instances of an Apache server. How would a new virtual host be defined?<br/><br/>With a configuration management system a new virtual host is defined in the central repository and the tool deploys the new virtual host definition to all running systems.<br/><br/>Applying the combination of the golden image feature with the ease of deployment provided by a cloud infrastructure would lead to defining the new virtual host on one running system, updating the golden image, spawning 20 <strong>new</strong> instances and swapping them with the old ones in the web infrastructure.<br/><br/>It seems strange at first to re-bundle a new image and redeploy all of your servers just for a one-line configuration change. One reason may be that system re-installation has always been seen as a last-resort option in a traditional infrastructure. This assumption is no longer true in the cloud, with its fast and easy provisioning.<br/><h4>What are the advantages of the golden image pattern?</h4><br/>Rolling back a configuration change is much faster as both revisions of the service are running at the same time. System administrators don't need to learn another tool and can just use their standard ways of administrating a single server.<br/><br/>However some issues remain:<br/><br/>How about applying a configuration change to different golden images? A dozen images still need to be booted and the change made everywhere. We're back to square one. Configuration management tools have the concept of classes, and each system applies specific configurations according to its classes. This is done to avoid redundancy in the configuration definition. Having just a set of golden images creates redundant configuration. However the number of images to change is much smaller than the hundreds of instances that would otherwise need to be dealt with.<br/><br/>How about tracking changes between configurations? Most configuration management tools suggest keeping the central repository under revision control so that changes made to the environment can easily be tracked. With golden images we're lacking the tools to store multiple versions of golden images and perform image diffs: what is the difference between a running system and its base offline image, between two revisions of the same golden image, or between two running systems? Having access to such tools would be very useful to system administrators in the debugging process.<br/><br/>In conclusion configuration management tools have been used for some time by groups running big infrastructures with lots and lots of systems to manage. The dynamism of the cloud brings the same problems to its users even if they are only using a couple of instances to run their infrastructure. 
Configuration management tools should probably be considered an essential tool when moving into the cloud.Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-3358115540372447242.post-29792622423769453982009-03-05T19:01:00.000-05:002010-11-29T19:08:41.877-05:00March 12th, 2009: The Thursday samba bugs were exterminatedI call for Ubuntu Bug warriors to unite on the 12th day of the month of March of the year 2009 and march all together to squash bugs related to the samba package. Instructions for first-timers will be provided in a wiki page, as well as a list of prime targets. Veterans are also encouraged to join and focus on the most complex issues while providing support for the rest of the troops in the #ubuntu-bugs IRC channel on Freenode.<br/><br/>Join us in the battle to improve the robustness of the three daemons smbd, nmbd and winbindd, and turn the next Ubuntu Bug Day into a victory for all of the Samba users in Ubuntu.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3358115540372447242.post-100881094677941892008-09-18T18:20:00.002-04:002010-11-29T19:07:09.618-05:00Automate Ubuntu Server iso testingFor each milestone, the ubuntu-server isos need to be tested. There are two of them and <a href="https://wiki.ubuntu.com/Testing/Cases/ServerInstall">ten test cases defined</a>. That makes twenty installations to be performed and checked.<br />
<br />
The first one is fun and you get to discover all the new options. During the second one you're still amazed by all the shiny new features. The excitement fades away one installation after the other. By the time you've reached the thirteenth, it's been four hours and you're getting sick of the blue and red colors. Your mind starts to wander: "if I could automate all of this, I could watch the latest episode of [name your favorite TV show here] instead of triggering an epileptic fit due to the red and blue stroboscopic effect of the installer"...<br />
<br />
Virtualization, templating and scripting come to the rescue! Here is an overview of the workflow I've developed and refined to conduct ubuntu-server iso testing at each milestone.<br />
<h3>Generate one preseed for each test case</h3><br />
Although each test case has its own preseed file, most of the content is identical in all of them. Only the package installation and the partition setup can differ. The <a href="http://www.makotemplates.org/">mako template engine</a> is used to generate all of the preseed files.<br />
<br />
Each test case template inherits from a base template which holds all of the common content. The base template defines two functions - pkg_install and partition - that each test case template overrides to produce the package selection and partitioning sections of its preseed file.<br />
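<br />
A trimmed-down sketch of what the template pair can look like (file names and preseed values are illustrative, not the actual templates):<br />
<pre>## base.mako - content shared by every test case
d-i partman-auto/method string ${self.partition()}
tasksel tasksel/first multiselect ${self.pkg_install()}
&lt;%def name="pkg_install()"&gt;server&lt;/%def&gt;
&lt;%def name="partition()"&gt;regular&lt;/%def&gt;

## lamp.mako - a test case overrides only what differs
&lt;%inherit file="base.mako"/&gt;
&lt;%def name="pkg_install()"&gt;lamp-server&lt;/%def&gt;</pre>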
<h3>Remaster the iso for each test case</h3><br />
Once the preseed has been created, the original server iso is remastered. The preseed is added to the iso and the isolinux configuration file is modified to use it. The installer is also booted with a debconf priority of critical. All necessary debconf questions have preseeded answers, either via the kernel command line (on the isolinux append line) or via the generated preseed file. The goal is a fully automated installation without any interaction required.<br />
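<br />
In shell terms the remastering looks roughly like this (the paths, volume label and sed expression are illustrative):<br />
<pre># unpack the original iso and drop the preseed in
mount -o loop ubuntu-server.iso /mnt/iso
rsync -a /mnt/iso/ remaster/
cp preseed remaster/preseed.cfg

# boot the installer with the preseed and a critical debconf priority
sed -i 's#^  append .*#&amp; preseed/file=/cdrom/preseed.cfg debconf/priority=critical#' \
    remaster/isolinux/isolinux.cfg

# rebuild a bootable iso
mkisofs -r -V remaster -b isolinux/isolinux.bin -c isolinux/boot.cat \
    -no-emul-boot -boot-load-size 4 -boot-info-table \
    -o remaster.iso remaster/</pre>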
<h3>Create a guest for each test case</h3><br />
The next step is to define a virtual guest for each test case that is set to boot from the remastered iso. A qcow2 file is created to be used as the root partition. A template file is used to create the libvirt configuration file for the guest. It has a custom network address and correct paths to the root disk and remastered iso file.<br />
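<br />
The per-guest setup amounts to two commands (the disk size here is arbitrary):<br />
<pre># create the root disk and register the guest with libvirt
qemu-img create -f qcow2 vm/root.qcow2 8G
virsh define vm/libvirt.xml</pre>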
<h3>Automate the whole setup phase</h3><br />
The three operations outlined above are automated for each ubuntu-server iso to be tested. In the end there are twenty guests ready to be started.<br />
<br />
The directory looks like this:<br />
<code><br />
intrepid-server-amd64-default/<br />
intrepid-server-amd64-dns-server/<br />
intrepid-server-amd64-lamp/<br />
[...]<br />
intrepid-server-i386-default/<br />
intrepid-server-i386-dns-server/<br />
[...]</code><br />
<br />
Each intrepid-server-* directory has the following structure:<br />
<code><br />
preseed<br />
remaster.iso<br />
vm/libvirt.xml<br />
vm/root.qcow2<br />
vm/root.qcow2.orig<br />
</code><br />
<h3>Run the installation phase of each test case</h3><br />
Once the guests are set up, they are booted one after the other. However a guest installation generates IO load on the host. Once installed, the guest vm reboots and then idles. In order not to bring down the server, guests are only booted if the host load is below a certain threshold.<br />
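<br />
The throttling boils down to a few lines of shell (the threshold and guest name below are arbitrary):<br />
<pre># wait until the 1-minute load average drops below 4, then boot the next guest
while [ "$(awk '{print ($1 &lt; 4.0)}' /proc/loadavg)" -eq 0 ]; do
    sleep 60
done
virsh start intrepid-server-amd64-lamp</pre>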
<h3>Perform the test procedure and report the result</h3><br />
After a couple of hours all of the guests are installed, rebooted and ready to be tested. This part of the process boils down to logging in to each guest via ssh and following the test case procedure from the <a href="https://wiki.ubuntu.com/Testing/Cases/ServerInstall">ServerInstall wiki page</a>. The outcome is then reported to the <a href="http://iso.qa.ubuntu.com/">iso testing tracker</a>.<br />
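<br />
For example, spot-checking the lamp guest looks something like this (the commands are illustrative, not the wiki procedure itself):<br />
<pre># poke at the installed services from the host
ssh ubuntu@intrepid-server-amd64-lamp "wget -qO- http://localhost/ | head -1"
ssh ubuntu@intrepid-server-amd64-lamp "mysql -u root -e 'show databases;'"</pre>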
<br />
This part is still manual. The next step is to automate it: run the test procedure matching the installation that was performed and report the result to a central location. Once that is implemented, whenever a new ubuntu-server iso is available it can be automatically downloaded, configured, installed and tested. Continuous integration testing is just around the corner!Unknownnoreply@blogger.com0