Connecting Clouds
Thoughts on Cloud Computing

Amazon – the enterprise cloud

September 2nd, 2009 . by Maximilian Ahrens

Amazon recently announced some new and pretty nice features: the so-called VPC.

The VPC combines a VPN access gateway with network segmentation, which allows enterprises to isolate their instances on the EC2 platform from other customers’ instances.

This is certainly nice, but it also shows that bringing enterprise customers onto the highly standardized and simplified EC2 platform creates challenges for both the enterprise customers and Amazon as the platform owner. These challenges are technical, but also legal and procedural. Larger companies are used to first defining what they want and then checking it against the offerings in the market.

This habit hinders standardization big time. It is also why new features are being introduced into the EC2 platform – but does this really help?

Enterprises do have specific demands for their networks and so on, and they simply cannot easily switch to a different technology. But introducing new pieces into the EC2 platform makes it harder to maintain, and this will drive up costs, which in turn will push the smaller, more flexible enterprise customers towards other platforms. So there is no single “one-size-fits-all” platform – enterprises will connect to a set of different internal and external platforms. (And I haven’t even mentioned the legal aspects of this.)

Don’t get me wrong – I think the new VPC features are very important for many customers and will also help enterprise customers to use EC2.
Nevertheless, in the end no single platform can support every demand. Which features can be supported in a single cloud platform is therefore less a question of technology than a question of focus.
The technology is similar anyway – the key is to deliver the feature set and the legal terms that the customer demands, and to leave out everything else.

Cloud Computing is about simplification, cost saving and agility. The more complexity is introduced into cloud computing platforms, the less they are going to fulfill these goals. Complexity always drives costs and also makes adoption more difficult. So customers need to choose carefully which feature set they really need in the cloud computing environment – and this feature set is going to differ between customers and between customer projects. That is why customers will make use of different clouds.
BTW: This is not the only reason – avoiding strong lock-in effects is certainly another one. Lock-in is mainly driven by the data. Accessing the data from inside and outside the cloud is key to getting the flexibility that is required.

Rackspace outage: Enterprise Clouds need to be built differently

June 30th, 2009 . by Maximilian Ahrens

Rackspace on Monday experienced an outage that completely shut down its hosted sites – apparently due to a problem with its Dallas datacenter.

I think you know what’s coming – The Cloud.

Yep, a “breathable”, automated Private Cloud dynamically sharing resources across multiple datacenters could significantly mitigate outages like this.

Looking at all kinds of public clouds the issue becomes critical. Current providers focus primarily on providing best prices for a specific set of Web 2.0 applications. Now, as the cloud industry is moving towards the enterprise — this sets up completely new challenges. An outage like the Rackspace one will keep another good set of enterprise customers from hopping on the cloud.
Providing another level of failover security while keeping prices reasonable comes down to two things: being able to switch between different cloud infrastructures and, for highly secured and resilient infrastructures, having another level of hardware and software resources available. That is what the standard public clouds cannot easily provide, but it is available at the sites of the established outsourcing providers. In this case customers can choose between security (SLA and resilience) and price to find the right balance between these opposing parameters.
Our Zimory public cloud provides different levels and also connects high-standard outsourcing data centers to the cloud.

The point is – cheap infrastructure is easily broken – but there are opportunities in the Cloud.

Cloud Computing’s next big thing

June 23rd, 2009 . by Maximilian Ahrens

John Foley just wrote about the cloud’s next big thing: software testing.
That is exactly what we see with our enterprise customers. Whereas the typical Web 2.0 provider uses the public cloud offerings for running its consumer applications, enterprises are not yet ready for this.

They are really starting to use it for testing and developing their custom software – which is still being built on top of the known infrastructure stack. This has two implications for the cloud infrastructure. First, enterprises need standard stacks to really test their software under comparable conditions. How do you run a load test in the EC2 cloud for an app that is going to be deployed in a VMware ESX environment? Second, as the applications are being developed on the known stacks of software and hardware infrastructure, they will not benefit from the cloud once they are deployed in a production environment.

This is a more delicate challenge: To get scalability in the applications they need to be rewritten – or new technologies have to be created.

Our enterprise customers already see this challenge on the horizon – on the technology side, it is going to be the next big thing in the cloud!

IBM in the Clouds – what’s it all mean?

June 18th, 2009 . by Maximilian Ahrens

Anyone following clouds certainly noticed the activity Monday when IBM jumped into the fray. For many of us who have been in the open source game for a while, this came as good news indeed.

It was not that long ago that IBM jump started Linux and OSS into the mainstream when they announced that they would support Linux for their customers. Could their latest announcement jump start the cloud?

Certainly, the very technology that IBM already possesses (and Sun, and a few others) makes cloud computing ripe for them, and them right for the cloud. As IBM works out the advantages of the cloud, I think they’ll find opportunities to meld with many of us already in the thick of the cloud. We welcome their participation, and will be happy to help.

Implementing High-Availability Services on Zimory Public Cloud (Part 1)

June 17th, 2009 . by Benjamin Schmidt

The term “high availability” is defined by the Institute of Electrical and Electronics Engineers (IEEE) as: “Availability of resources in a computer system, in the wake of component failures in the system.”

A system can be called highly available if its applications and services remain available even in the case of an error, without direct human interaction. This implies that a user experiences no interruption, or only a short one. High availability does not mean permanent availability, though.

Reliability and Availability

There are two strategies to uphold and guarantee a service.

  1. Using highly reliable components with a low probability of downtime (error elimination): Errors can never be ruled out entirely, since they are unpredictable. In practice, reliability is measured as Mean Time Between Failures (MTBF). A higher MTBF generally comes with higher investment and maintenance costs.
  2. Using reliable components with the ability to recover automatically: Errors are compensated so that the system as a whole is still functional. Once an error has been detected, availability can be maintained by
     • error correction: restarting processes or the whole (operating) system
     • error compensation: redundant components keep the service running
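
To make the trade-off between the two strategies concrete, here is a minimal sketch using the common steady-state formula availability = MTBF / (MTBF + MTTR). The mean time to repair (MTTR) is not mentioned above and is an assumption introduced here to represent how fast automatic recovery works; the numbers are purely illustrative.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the share of time the system is up.

    mtbf_hours: mean time between failures (strategy 1 raises this).
    mttr_hours: mean time to repair (assumed metric; strategy 2 lowers this).
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative numbers only.
baseline = availability(mtbf_hours=1_000, mttr_hours=10)
more_reliable = availability(mtbf_hours=10_000, mttr_hours=10)  # strategy 1
faster_recovery = availability(mtbf_hours=1_000, mttr_hours=1)  # strategy 2

for label, value in [
    ("baseline", baseline),
    ("10x MTBF", more_reliable),
    ("10x faster recovery", faster_recovery),
]:
    print(f"{label:>20}: {value:.4%}")
```

Under this formula, recovering ten times faster buys the same availability as components that fail ten times less often, which is why automatic recovery is usually the cheaper route.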

Availability is the percentage of a predefined period during which the system functions properly. It says nothing about how often the system went down, so a system can experience multiple downtimes as long as it is restored quickly enough to meet its required availability. Availability is often expressed as a number of “nines”. The following table shows how availability translates into actual downtime per year and per week. Maintenance work also counts towards this downtime.

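| Availability class | Term              | Availability | Downtime per year | Downtime per week |
|--------------------|-------------------|--------------|-------------------|-------------------|
| 2                  | resilient         | 99%          | 3.7 days          | 1.7 hours         |
| 3                  | available         | 99.9%        | 8.8 hours         | 10.1 minutes      |
| 4                  | highly available  | 99.99%       | 52.6 minutes      | 1.0 minute        |
| 5                  | error insensitive | 99.999%      | 5.3 minutes       | 6 seconds         |
| 6                  | error tolerant    | 99.9999%     | 32 seconds        | 0.6 seconds       |
| 7                  | error resistant   | 99.99999%    | 3 seconds         | 0.06 seconds      |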
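
The downtime figures follow directly from the availability percentage. The short sketch below shows the arithmetic, assuming a 365-day year; the function and constant names are made up for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year
SECONDS_PER_WEEK = 7 * 24 * 3600


def downtime_seconds(availability_percent: float, period_seconds: float) -> float:
    """Maximum downtime allowed within a period at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100 - availability_percent) / 100 * period_seconds


for percent in (99.0, 99.9, 99.99, 99.999):
    per_year_hours = downtime_seconds(percent, SECONDS_PER_YEAR) / 3600
    per_week_minutes = downtime_seconds(percent, SECONDS_PER_WEEK) / 60
    print(f"{percent}%: {per_year_hours:.2f} h per year, "
          f"{per_week_minutes:.2f} min per week")
```

Each additional “nine” cuts the allowed downtime by a factor of ten.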

The Harvard Research Group (HRG) divides availability into six classes in its Availability Environment Classification (AEC):

  • Conventional (AEC-0): Function can be interrupted; data integrity is not essential.
  • Highly Reliable (AEC-1): Function can be interrupted, but data integrity must be ensured.
  • High Availability (AEC-2): Function may be interrupted only minimally within fixed times during the main operating hours.
  • Fault Resilient (AEC-3): Function must be maintained continuously within fixed times during the main operating hours.
  • Fault Tolerant (AEC-4): Function must be maintained continuously; 24x7 operation (24 hours a day, 7 days a week) must be ensured.
  • Disaster Tolerant (AEC-5): Function must be available under all circumstances.

In enterprises, high availability is frequently defined within the framework of service level agreements (SLAs) and represents a substantial criterion for evaluating IT services.

What About the Cloud?

The aforementioned strategies to uphold and guarantee a service are a little harder to implement in a cloud environment. Most, if not all, cloud providers will not offer any information regarding their data centers, e.g. their geographic location or tier classification. The service level agreements on offer refer to the platform as a whole, but do not take the uptime of an individual host into consideration.

Zimory’s Public Cloud seeks to combine both reliability and availability. Data centers and connected cloud providers are distinguished by three levels of quality – gold, silver and bronze – reflected in binding service level agreements. In addition, individual resources can carry further quality characteristics such as certifications, failover systems and a guaranteed support level.

The following definition for the classification of data centers is used:

  • Tier 4: Has multiple active supply paths for power and air-conditioning, has redundant components, is fault-tolerant and provides an availability of at least 99.995%
  • Tier 3: Has multiple supply paths for power and air-conditioning, with only one path active in standard use; has redundant components, can be maintained during operation and provides an availability of at least 99.982%
  • Tier 2: Has one path each for power and air-conditioning; has redundant components and provides an availability of at least 99.741%
  • Tier 1: Has one path each for power and air-conditioning; has no redundant components and provides an availability of at least 99.671%
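
As a rough illustration of how such a classification can be used, the sketch below picks the lowest tier whose guaranteed availability meets an application’s requirement. The table of percentages is taken from the list above; the function name is hypothetical, and the idea that a lower tier is the less expensive choice is an assumption of this example, not a statement about any provider’s pricing.

```python
from typing import Optional

# Minimum availability per data center tier, in percent (from the list above).
TIER_AVAILABILITY = {1: 99.671, 2: 99.741, 3: 99.982, 4: 99.995}


def minimum_tier(required_percent: float) -> Optional[int]:
    """Return the lowest tier that meets the requirement, or None if none does.

    Assumes, for illustration only, that lower tiers are cheaper, so the
    lowest sufficient tier is the one to pick.
    """
    for tier in sorted(TIER_AVAILABILITY):
        if TIER_AVAILABILITY[tier] >= required_percent:
            return tier
    return None


print(minimum_tier(99.0))    # a test system is fine on tier 1
print(minimum_tier(99.9))    # "three nines" needs tier 3
print(minimum_tier(99.999))  # no single data center tier guarantees "five nines"
```

The last case is the interesting one: beyond 99.995% a single data center is not enough, and the extra availability has to come from the software architecture.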

Zimory makes certifications and quality standards of the various data centers transparent and easily understandable. Users have the option to select higher-level certifications for specific applications and to choose less expensive services for other applications.

In addition to providing reliable hardware infrastructure, Zimory Public Cloud also allows the building of multi-tier software architectures. Multi-tier architectures are scalable, since the individual layers are logically separated. For example, in distributed system architectures, the data layer runs on a central database server, the logic layer runs on a remote application server, and the delivery is handled by a web server. In such an architecture, the individual components can be adapted to increasing load by replication. For example, if many users use the application, a clone of the application server can be created, which shares the requests with the first server. This clone operation can be triggered through the API of the Public Cloud with a rule-based trigger.
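
To give an idea of what a rule-based trigger could look like, here is a small sketch. It does not use the actual Public Cloud API: `ScaleOutRule`, `apply_rule` and the `clone_server` callback are hypothetical names, and the callback stands in for whatever call actually creates the clone.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScaleOutRule:
    """Hypothetical rule: clone when the average load per server is too high."""
    max_requests_per_server: float
    max_servers: int

    def should_clone(self, total_requests: float, server_count: int) -> bool:
        if server_count < 1:
            raise ValueError("at least one application server is required")
        if server_count >= self.max_servers:
            return False  # never grow beyond the configured limit
        return total_requests / server_count > self.max_requests_per_server


def apply_rule(rule: ScaleOutRule, total_requests: float, servers: List[str],
               clone_server: Callable[[str], str]) -> List[str]:
    """Return the server list after evaluating the rule once."""
    if rule.should_clone(total_requests, len(servers)):
        # clone_server is a placeholder for the real API call.
        return servers + [clone_server(servers[0])]
    return servers


def fake_clone(name: str) -> str:
    """Stand-in for the cloud API: only invents a name for the clone."""
    return f"{name}-clone"


rule = ScaleOutRule(max_requests_per_server=500, max_servers=3)
servers = apply_rule(rule, total_requests=1200, servers=["app-1"],
                     clone_server=fake_clone)
print(servers)
```

A real setup would evaluate the rule periodically against monitoring data and would also need a matching scale-in rule; both are left out here.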

Each user of the Public Cloud also has full control of the underlying network layer. All virtual machines of an individual user are deployed in that user’s own exclusive VLAN, which allows clusters to be built, e.g. to implement high availability services. In a series of articles, I will demonstrate how high availability application and database clusters can be built in the cloud using open source projects such as Heartbeat, DRBD, HAProxy and MySQL. So stay tuned to find out how you can squeeze that extra “nine” into the uptime of your multi-tier application.

Lightning hits Amazon

June 11th, 2009 . by Maximilian Ahrens

A lightning strike has brought parts of Amazon down (see here).

The question is: how can you avoid this without building untenable, worldwide distributed software? The answer is simple: multiple sites that are seamlessly connected (um, “real” cloud computing). If you have two sites available, the probability of lightning striking both at the same time is rather low. But the challenge is the seamless connectivity. You can have two compute sites, but restarting the workloads that were running on the site that was shut down is more complicated.
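
How low is “rather low”? A back-of-the-envelope sketch, assuming that site outages are statistically independent (which only holds if the sites do not share power, network or weather) and using a made-up outage probability:

```python
def all_sites_down_probability(site_outage_probability: float, sites: int) -> float:
    """Probability that every site is down at once, assuming independence."""
    if not 0 <= site_outage_probability <= 1:
        raise ValueError("probability must be between 0 and 1")
    if sites < 1:
        raise ValueError("at least one site is required")
    return site_outage_probability ** sites


# Illustrative figure: each site is down 0.1% of the time.
p = 0.001
for n in (1, 2, 3):
    print(f"{n} site(s): {all_sites_down_probability(p, n):.0e}")
```

The calculation covers only the probability side. It says nothing about how quickly the workloads come back up on the surviving site, which is the hard part.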

You need very efficient distributed snapshotting technologies to allow a failover to a recent state at a remote site. After all, isn’t that really the promise of cloud computing? Not a single mammoth Walmart-style (or, in this case, Amazon) data center, but a truly distributed, “breathing” datacenter.

So running a real multi-site cloud that adds value with cold standby instances means not only building an overlay management layer, but also mastering some pretty tough technologies. But it can be done and, frankly, is being done.

Enterprise Class Cloud

May 8th, 2009 . by Maximilian Ahrens

A lot has been said – and written – about Cloud Computing in general and much of that discussion has been about how, when and even if enterprises will fully adopt the cloud.

My feeling is, they will have to. Competitive cost pressures, stalled innovation and even peer pressures will all lead companies to adopt the cloud. But is the cloud ready for the complexities of true enterprise applications?

Today’s cloud is ready for many IT needs. At Zimory, we provide technology for companies looking for server capacity, as well as companies looking to monetize excess server capacity. But there is a third piece of the puzzle that we believe will enable full enterprise adoption, and that’s fully enabling transactional (database) applications for the cloud.

Stay tuned, this will get exciting.

Openness = Standards?

April 29th, 2009 . by Maximilian Ahrens

There are multiple ways to create an open cloud, but the idea is clear:
enterprises are afraid of putting all their money in a single bucket. That
is what you have to do if you are working on one of today’s cloud stacks.

A broadly discussed way is to provide a standard that makes different
systems interoperable – that is the open cloud.

Does this really work?

As I’ve mentioned previously, I have a history in SOA. I think this can
serve as an example of how standards do not always help create true openness
– true interoperability.

There are all kinds of WS-* standards that have been developed between
companies as well as by standards bodies. All these standards ultimately
define the lowest common denominator. That means that although all products
implement the standards, to really make use of the products people rely on
the differentiating parts, which do not conform to the standards. That’s why
it is such a pain to run multi-vendor SOA implementations: they always come
with a lot of coding and integration.

So my belief is that standards are helpful to create basic interoperability,
but they do not deliver the real openness that customers need in order to
switch clouds dynamically.

I think another lesson can be taken from the open source world – this I will
talk about in my next blog post.

Open Cloud again

April 24th, 2009 . by Maximilian Ahrens

After my keynote at the Cloud Slam 09 conference yesterday – today I am
following a panel discussion at Under the Radar. They are talking about
the enterprise perspective of cloud computing.

The guys are taking the exact same position as I did at Cloud Slam. The
net-net: enterprises will not run only on public clouds, but also not
only on internal clouds. Public clouds destroy too much of the existing
investment in technology and application development. Internal clouds, on
the other hand, do not work with internal machines alone, simply because
of the missing economies of scale.

So how to overcome this?

I think one important point is to create an open platform – only openness
can help to reduce lock-in and give companies the confidence to run real
workloads in the cloud. The integration between internal and external
clouds will also be supported by open technologies.

I will write a few blog posts in the near future about my ideas for the
open cloud platform.

The breathing cloud is the enterprise solution

April 17th, 2009 . by Maximilian Ahrens

Steve Lohr, the veteran New York Times reporter, recently blogged (see here) about the results of a McKinsey report on cloud computing.

The report, “Clearing the Air on Cloud Computing,” and Lohr bring up a very important point about the hype and reality of the cloud. The cloud is not about enterprises outsourcing their entire data centers. The true enterprise cloud will be about building “breathing clouds”: an enterprise’s ability to lower costs by provisioning computing power for average workloads, not peak workloads. The cloud will enable these companies to efficiently repurpose computing power internally (saving more money and power) and yet be able to pick up external capacity as needed for peak (even unexpected peak) loads.
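
A toy calculation makes the “breathing” idea tangible. The sketch below sizes internal capacity for the average load and works out how much external capacity would have to be picked up in each hour. The load figures, the unit (“servers”) and the function name are invented for this example.

```python
import math
from statistics import mean
from typing import List, Tuple


def breathing_plan(hourly_load: List[int]) -> Tuple[int, List[int]]:
    """Size internal capacity for the average load; burst the rest externally.

    Returns the internal capacity and, per hour, the external capacity needed.
    """
    if not hourly_load:
        raise ValueError("at least one load sample is required")
    internal_capacity = math.ceil(mean(hourly_load))
    external_needed = [max(0, load - internal_capacity) for load in hourly_load]
    return internal_capacity, external_needed


# Invented load profile, in servers needed per hour.
load = [4, 5, 6, 5, 12, 4]
internal, external = breathing_plan(load)
print(f"internal capacity: {internal} servers (peak sizing would need {max(load)})")
print(f"external capacity per hour: {external}")
```

In this profile the enterprise owns half the servers that peak sizing would require and rents the rest only for the one hour in which the peak actually occurs.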
