I did my very first @ignitebristol talk yesterday evening and I loved every moment of it. 15 seconds a slide was quite an experience! There were great talks to listen to before I made my way to the stage – so many great people there.

The Cloud is a way of thinking, feeling and implementing platforms and software that enable us to move past the concerns of unitary machines and into those of autonomic services.

There are many blogs, books, talks and presentations which define ways to build a ‘cloud’ architecture for your services. Sadly most concentrate on how to work within one framework or another built by corporations trying to tie you to their way of thinking.

The cloud has always been much more than this. I speak of course of RackSpace, Amazon and any number of other ‘cloud’ providers. Put simply:

Neither virtualization nor tooling defines the Cloud

The plethora of tools and tooling frameworks which surround this group of companies is immense. All of them are designed to fix the woeful inadequacies of the underlying platform when dealing with the Cloud idea. These companies and their products have their place in the world and many a business owes them its existence and prosperity. However:

The general acceptance of a method of execution does not define the success of understanding of an idea.

These companies do not offer you access to a real cloud. They offer large-scale virtualization technologies which mimic what a cloud should do – it’s a bit like putting an aftermarket exhaust and a dump valve on an old Nissan. Sure it sounds good but you really won’t pull away from anything very fast at all.

The speed, ease, reliability and flexibility of deployment, scaling, monitoring and operation within these platforms is inadequate.

The idea isn’t to scale in 30 minutes. It’s to allow the machines to scale themselves within context and constraint within minutes of understanding the need to do so.

The idea isn’t for humans or machines to watch simple statistics to aid decision-making. It’s to allow your services to gain a contextual understanding of each constituent part’s operation and allow the convergent intelligence inherent to make real-time decisions.

The idea isn’t to re-invent the wheel when it comes to deployment but to leverage decades of experience to make sure we move past this triviality and tackle the hard problems.

That’s impossible in most current ‘cloud’ platforms unless dealing with the most trivial of services.  The last decade of my professional life has been in one ‘cloud’ environment or another. Uniformly, all have failed to show any glimmer of real understanding of the Cloud idea. Till last year, but more on that later.

This is the small, gentle introduction to what will be a series showing you how to build a real cloud services architecture.

You’ll need a GitHub account, a Joyent public cloud account, a local CFEngine installation and some understanding of Python and Bash.

It will be fun, and I hope you’ll join me.

Follow me (@khushil) or bookmark this blog to get each installment as it arrives. Click on the ‘About Me‘ page to find out how to get in touch with me.

One of the first things I do on any machine is lock its network access down. It makes sense to make sure you only run and expose what you need, to whom you need to.

Let’s take a look at what’s open by default on a standard smartmachine install:

Khushils-MacBook-Pro:~ kdep$ nmap -Pn SERVER_IP_ADDRESS
Starting Nmap 6.25 ( http://nmap.org ) at 2012-12-19 22:18 GMT
Nmap scan report for SERVER_IP_ADDRESS
Host is up (0.021s latency).
Not shown: 992 closed ports
22/tcp open ssh
37/tcp filtered time
119/tcp filtered nntp
135/tcp filtered msrpc
139/tcp filtered netbios-ssn
445/tcp filtered microsoft-ds
563/tcp filtered snews
6969/tcp filtered acmsoda
Nmap done: 1 IP address (1 host up) scanned in 43.24 seconds

That’s not very secure.

Update 20-12-2012 : As @jasonh pointed out, that is actually secure, with only port 22 being open. I’d managed to completely miss that the other ports are actually filtered, meaning that in response to the SYN packet, there was no response, not even a RST. In my defence it was late at night and I may have been slightly tipsy.
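Since I managed to misread the scan, here is a small Python sketch that groups the ports by the state nmap reports, so that open and filtered are hard to confuse. It assumes the plain-text report format shown above; the function name ports_by_state and the sample text are my own, not anything shipped with Nmap.

```python
import re
from collections import defaultdict

# Matches nmap port lines such as "22/tcp open ssh" or "37/tcp filtered time".
PORT_LINE = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)")


def ports_by_state(report):
    """Group port numbers from an nmap text report by their reported state."""
    states = defaultdict(list)
    for line in report.splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            port, _proto, state, _service = match.groups()
            states[state].append(int(port))
    return dict(states)


# A few lines copied from the scan above, used as sample input.
SAMPLE = """\
Host is up (0.021s latency).
22/tcp open ssh
37/tcp filtered time
119/tcp filtered nntp
135/tcp filtered msrpc
"""

if __name__ == "__main__":
    for state, ports in ports_by_state(SAMPLE).items():
        print(state, ports)
```

Only the ports listed under open accept connections; the filtered ones never answered the probe.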

In theory, you could stop at this point and wend your merry way hence. However, as I’m from Linux, where a lot of stuff comes installed and started on a default install, and I’m used to doing the following as a minimum when I build/configure, I’ll document it anyway:

We’re going to use IPFilter to lock down connectivity to our box at its own network edge.

Login and edit the file /etc/ipf/ipf.conf.  This file contains some basic pre-installed configuration. At first, we will want to lock this down quite hard. Let’s make sure that only SSH is exposed and then only to our IP address. The lines you’re looking for in the file are:

# Allow all out going connections
pass out from SERVER_IP_ADDRESS to any keep state
# Allow SSH
pass in quick from YOUR_IP_ADDRESS to SERVER_IP_ADDRESS port=22
# Block everything else coming in
block in from any to SERVER_IP_ADDRESS

That first line tells IPF to allow all outbound connections and maintain state.

The second allows port 22 connections but only from YOUR_IP_ADDRESS. This assumes that your SSH daemon is listening on port 22 – which it will be unless you’ve fiddled with it.

Ensure that you get at least the YOUR_IP_ADDRESS correct in the step above. Failure could lock you out of your own system, requiring support intervention – or as @AlainODea suggested, crontab a reset in now+20min just to be on the safe side. Something like the following should do the trick. Remove once you’re happy you haven’t locked yourself out of the system.

0,20 * * * * /usr/sbin/ipf -D > /dev/null 2>&1
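Bear in mind that a minute field of 0,20 fires at minute 0 and minute 20 of every hour, not 20 minutes from now. If you would rather have an entry pinned to a specific time, a rough Python sketch follows. The helper reset_cron_line is hypothetical, and the ipf path in the default command is an assumption about where the binary lives on your machine.

```python
from datetime import datetime, timedelta


def reset_cron_line(now, delay_minutes=20,
                    command="/usr/sbin/ipf -D > /dev/null 2>&1"):
    """Build a crontab entry that fires delay_minutes after `now`.

    Cron has no year field, so the entry would fire again a year later;
    remove it once you know you are not locked out.
    """
    when = now + timedelta(minutes=delay_minutes)
    # Fields: minute hour day-of-month month day-of-week command.
    return f"{when.minute} {when.hour} {when.day} {when.month} * {command}"


if __name__ == "__main__":
    print(reset_cron_line(datetime.now()))
```

Paste the printed line into your crontab before touching the ruleset.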

The last line of the ruleset blocks all other inbound traffic to any other ports on our SERVER_IP_ADDRESS.

Now that’s done, we can enable the service:

svcadm enable network/ipfilter

Now let’s make sure IPF actually picks up our ruleset:

ipf -Fa -f /etc/ipf/ipf.conf

Now log out and log back in. Let’s check that IPF picked it up:

[root@somewhere ~]# ipfstat -ioh
9 pass out from to any keep state
7 pass in quick from any to port = 22
2146 block in from any to

In this case I’ve connected a few times so my count is more than 1 (which yours should be at this stage). Take a look at the number of blocked connections though – interesting huh?
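If you want to keep an eye on those counters over time, the hit counts can be summed per action. This is a minimal Python sketch that assumes each line starts with a hit count followed by the action and direction, as in the output above; hits_by_action is my own helper name.

```python
def hits_by_action(ipfstat_output):
    """Sum hit counters from `ipfstat -ioh` style lines, keyed by (action, direction)."""
    totals = {}
    for line in ipfstat_output.splitlines():
        fields = line.split()
        # Expect "<hits> <action> <direction> ..."; skip anything else.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        key = (fields[1], fields[2])
        totals[key] = totals.get(key, 0) + int(fields[0])
    return totals


# The three rule lines from the output above.
SAMPLE_IPFSTAT = """\
9 pass out from to any keep state
7 pass in quick from any to port = 22
2146 block in from any to
"""

if __name__ == "__main__":
    for (action, direction), hits in hits_by_action(SAMPLE_IPFSTAT).items():
        print(action, direction, hits)
```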

You can now work on this machine knowing that you’re relatively safe. As you add services and want to allow those services access from the outside world, add to the /etc/ipf/ipf.conf file as needed – always remember that rule order is important. For more IPF filter samples see here (quite old but relevant).
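Why does rule order matter? IPFilter checks a packet against the rules in order and the last matching rule wins, unless a matching rule carries the quick keyword, which ends the search on the spot. Below is a toy Python model of that behaviour and nothing more. The Rule class, the decide function, the example addresses and the pass-by-default fallback are all my own assumptions for illustration, not IPFilter’s code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                   # "pass" or "block"
    quick: bool = False           # True stops evaluation at this rule
    src: Optional[str] = None     # None matches any source address
    port: Optional[int] = None    # None matches any destination port

    def matches(self, src, port):
        return self.src in (None, src) and self.port in (None, port)


def decide(rules, src, port, default="pass"):
    """Return the action for a packet: last match wins unless a rule is quick."""
    verdict = default
    for rule in rules:
        if rule.matches(src, port):
            verdict = rule.action
            if rule.quick:
                break
    return verdict


# Mirrors the inbound half of the ruleset above, with a made-up client address.
RULES = [
    Rule("pass", quick=True, src="203.0.113.10", port=22),  # SSH from our address
    Rule("block"),                                           # everything else inbound
]

if __name__ == "__main__":
    print(decide(RULES, "203.0.113.10", 22))
    print(decide(RULES, "198.51.100.7", 22))
```

Drop quick from the SSH rule in this model and the later block rule overrides it, which is exactly the sort of lock-out you want to avoid.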

The term ‘cloud’ is bandied about quite readily these days. To my mind there is one operating system that can actually be called a ‘cloud operating system’ – SmartOS. I’ll dedicate another post to the differences that distinguish SmartOS from other operating systems in this ‘cloud’ space, but for now, take my word for it 🙂

I’m going to be setting up a medium sized infrastructure based on Joyent ‘smartmachines’ – which you can think of as non-Global zones (for those from the Solaris world).

So, let’s get started.

First, we create our smartmachine – I’ve opted for a 2vCPU and 2GB machine for now, as I can easily resize it in the future thanks to SmartOS. Once I’ve logged in I make sure my base OS is up to date:

pkgin -fy update && pkgin -fy upgrade

This should pull down anything that needs upgrading and leave you with a nice new smartmachine ready to roll.

Now, handily the chaps at Joyent have provided a quick guide to install CFEngine which I’ll be basing this tutorial around at this stage. It’s pretty basic but it gets you to a good place to move forward from. Bear in mind that it seems to be geared toward installing in a Global Zone within the Joyent SDC product.

First we need to install some prerequisites using the pkgin installation system on smartmachines:

pkgin in gcc47 gmake tokyocabinet openssl pcre
export PATH=/opt/local/bin:/opt/local/sbin:/opt/local/gcc47/bin:$PATH

With those out of the way, let’s get the latest source and get it built:

cd /tmp
curl -k -o cfengine-3.5.1.tar.gz 'https://cfengine.com/source-code/download?file=cfengine-3.5.1.tar.gz'
gtar xfvz cfengine-3.5.1.tar.gz

This next step strays from the Joyent tutorial as we specify the working directory for CFEngine3 – a good idea IMHO.

cd cfengine-3.5.1

Whilst compiling this for ourselves, one of my engineers found a small bug as documented in https://cfengine.com/dev/issues/3097 – follow the advice therein for the fix.

Now issue the following command – all on one line:

CPPFLAGS="-I/opt/local/include" LDFLAGS="-L/opt/local/lib -R/opt/local/lib" ./configure --prefix=/opt/sfw/cfengine3 --enable-static --with-workdir=/var/cfengine

Now make and install CFEngine3

gmake install

You need to give it its default policy files. These would normally live in /var/cfengine/masterfiles but with the source install, they get dumped in /opt/sfw/cfengine3/share/CoreBase – so go ahead and

cp -R /opt/sfw/cfengine3/share/CoreBase/* /opt/sfw/cfengine3/var/cfengine/masterfiles/

Now take a look in /opt/sfw/cfengine3/var/cfengine/masterfiles/def.cf and in particular at the ‘acl => slist’ clause.

Make sure that it includes the network or IP of the install machine itself. Once you’re happy with that clause we can go ahead and bootstrap:

/opt/sfw/cfengine3/bin/cf-agent --bootstrap XXX.XXX.XXX.XXX

If no errors occur you will get a ‘Bootstrap completed’ message and you’re away!

That’s it – you’ve installed and got the hub to bootstrap.
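If the bootstrap does complain, the acl clause is the first thing I would re-check. Here is a quick Python sketch for testing whether the hub’s address falls inside a list of entries. It only understands plain IPs and CIDR networks, which is my simplifying assumption (a real acl list may also hold hostnames or patterns), and host_in_acl and the addresses are made up for the example.

```python
import ipaddress


def host_in_acl(host_ip, acl_entries):
    """Return True if host_ip falls inside any IP or CIDR entry in acl_entries."""
    address = ipaddress.ip_address(host_ip)
    for entry in acl_entries:
        # strict=False accepts entries written with host bits set, e.g. "192.0.2.17/24".
        if address in ipaddress.ip_network(entry, strict=False):
            return True
    return False


if __name__ == "__main__":
    print(host_in_acl("192.0.2.17", ["10.0.0.0/8", "192.0.2.0/24"]))
```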

Now come the hard bits – the configuration!

It’s worth remembering that smartmachines at Joyent Cloud are non-global zones.

As such they don’t have access to privileged syscalls like ‘adjtime’ so NTP isn’t going to work locally.

It’s best to use the time from the GLOBAL zone and adjust from there.

I spent a good 30 minutes today wondering why my ‘file’ clause wasn’t working.

Here’s what it looked like:

# Pull in rules which are GIT managed
 file { "/usr/local/zeus/zxtm/conf/rules":
 source => "puppet:///zeus/files/rules",
 owner => "root",
 group => "root",
 mode => 600,
 sourceselect => all,
 recurse => true,
 }

Now it just may be me, but I expect to be able to use ‘zeus/files/rules’ as, you know, that’s where the files live. However, as this is Puppet DSL and the ‘file’ type expects all files to live in a ‘files’ directory off the top of the module, the clause should look like:

# Pull in rules which are GIT managed
 file { "/usr/local/zeus/zxtm/conf/rules":
 source => "puppet:///zeus/rules",
 owner => "root",
 group => "root",
 mode => 600,
 sourceselect => all,
 recurse => true,
 }
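To see why the first version failed, it helps to spell out how the URL turns into a path on disk. This is a rough Python sketch of the mapping as I understand it for this style of URL. The function module_file_path and the /etc/puppet/modules module path are my assumptions, not something Puppet exposes.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def module_file_path(source, modulepath="/etc/puppet/modules"):
    """Map a puppet:///<module>/<path> URL to where the file would sit on disk."""
    parsed = urlparse(source)
    if parsed.scheme != "puppet":
        raise ValueError(f"not a puppet URL: {source}")
    module, _, rest = parsed.path.lstrip("/").partition("/")
    # The 'files' directory is implied by the URL, never written in it.
    return str(PurePosixPath(modulepath) / module / "files" / rest)


if __name__ == "__main__":
    print(module_file_path("puppet:///zeus/rules"))
    print(module_file_path("puppet:///zeus/files/rules"))
```

With ‘files’ written into the URL, the lookup ends up one ‘files’ directory too deep, and nothing lives there.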

It’s a small point of order but it’s things like this that really get under my skin about Puppet. Also – who the hell writes something like this in Ruby? 😉