Cloud Architecture Patterns: App VM

November 29th, 2010 No comments


Provide packaged software stack as Platform-as-a-Service (PaaS) platform for running applications



As Known As



We all know the three different types of cloud services from Infrastructure-as-a-Service (IaaS), PaaS, to Software-as-a-Service (SaaS). If you want to leverage PaaS, you have to choose one of the PaaS service providers like Google or Microsoft. Leveraging an external PaaS has its own benefits.

What if you want to keep your applications running in-house but still enjoy the benefits of PaaS? Today you don’t have much choice. Google, for example, does not sell its App Engine as a product that you can install and run on premise. You have to run it on the Google cloud.


Really Simple Guidelines to Write Great Code Samples

November 24th, 2010 2 comments

When a developer learns a new programming language or API, the first thing is probably to try out a HelloWorld sample. As said, real programmers don’t read documents. Although I don’t fully agree on that, it has some truth in it.

In my own experiences, I normally continue with other samples after HelloWorld one. When something is not quite clear, I check out the API reference or read some tutorials. Anyway, I am not telling you how to learn a new language or API, but trying to make a point here on the importance of code samples for the developers. In my opinion, samples are the most effective way to empower your users.

I think you would agree with me, there are too many bad samples. Here are some typical symptoms:

  1. Too much boilerplate code to a point that the code illustrating the API usage got buried. Typical boilerplate code includes extensive exception handling, GUI, logging, etc. Some samples even have a common library that could confuse your users totally.
  2. Too many API calls in one sample. You may need several APIs for a use case, but don’t aim one sample for multiple use cases.
  3. Too much object oriented. Object oriented programming is a best practice for application development. But it could confuse your developers sometimes.
  4. Dependencies on other APIs. To run the sample, your users need to install other libraries which may or may not need extra configuration or tuning. To understand the sample, users need to understand additional APIs. Extra burden, really!
  5. Of course, typical bad smells of programming which are not unique for samples. For example, bad naming, unnecessary global variables, using object attributes for passing values between methods, etc.

Now, how you can develop great samples? Besides the best practices writing great applications, you want to follow the following guidelines:

Wire Compatibility of Web Services

November 23rd, 2010 No comments

As a software professional, you may have heard about the source compatibility and binary compatibility. With the Web Services, a new type of compatibility came up. This is what I call wire compatibility. It’s not related to the programming but the XML messages passed on the wire. Since we don’t use XML directly but programming APIs, the wire compatibility surfaces and affects the source and binary compatibility.

Too abstract? You bet. Let’s pick up an example here. Because VMware vSphere API is defined in WSDL, I will use it in the following discussion.

In vSphere 4.1, the method PowerOnMultiVM_Task() gets an additional parameter called option typed as OptionValue array. The following are related parts in the WSDL:

<operation name="PowerOnMultiVM_Task">
  <input message="vim25:PowerOnMultiVM_TaskRequestMsg" />
  <output message="vim25:PowerOnMultiVM_TaskResponseMsg" />
  <fault name="RuntimeFault" message="vim25:RuntimeFaultFaultMsg"/>
<complexType name="PowerOnMultiVMRequestType">
    <element name="_this" type="vim25:ManagedObjectReference" />
    <element name="vm" type="vim25:ManagedObjectReference" maxOccurs="unbounded" />
    <element name="option" type="vim25:OptionValue" minOccurs="0" maxOccurs="unbounded" />

As you can see, the minOccurs of the option element is zero, meaning it’s optional. If you have an application built with 4.0 (no option parameter by then), the SOAP request still works. So it’s compatible on the wire.

Critical Lessons Learned at Facebook on Scalability and Reliability

November 21st, 2010 1 comment is no doubt the biggest web site surpassing Google in terms of Web traffics in an article published half year ago. Given its scale, the lessons learned would be very helpful for others to build scalable IT infrastructures. This post is based on my notes taken at the talk by Robert Johnson and Sanjeev Kumar at LISA 2010 conference. Should there be any mistakes, they are all mine.

According to the speakers, the architecture of is relatively simple: Web servers in the front, databases at the back. In the middle is a caching layer with a lot of memcached servers. If you recall my previous post, they use PHP extensively.

Unlike other sites, like email sites, whose users are well mapped and isolated to different servers, social Websites like Facebook have unique challenges in that their users are linked together. Errors in one part of a system may cascade easily and bring down the whole site.

Here are several important lessons Facebook learned while building software and operating the site:

What You Can Learn from IBM Research on Designing Private Cloud

November 16th, 2010 No comments

IBM Researcher Kyung Ryu presented a private cloud RC2 at LISA 2010 conference. As a typical IBM project, the presentation has 20+ co-authors. The following is based on my notes taken from the session, therefore may contain my misunderstandings.

Having an internal cloud is not a big deal these days. You can find several products from the market. What is truly unique and challenging for RC2 is that it supports very different virtualization platforms from X86 based hypervisors on X-series servers, to IBM PowerVM on P-series, to the mainframe based native virtualization on Z-series. Therefore RC2 is really a hybrid private cloud.

The talk focused on system architecture with several diagrams. I cannot reproduce these diagrams but would list the key components of the system:

What Lessons You Can Learn from Google on Building Infrastructure

November 15th, 2010 No comments

Last week I attended a great talk by Google Fellow Jeffrey Dean at Stanford University. Jeff talked about his first hand experience on building software systems at Google since 1999 and lessons learned. The following summary is solely based on my notes, therefore may contain my misunderstandings.

A Brief History

During the past 10 years or so, the scale of the Google infrastructure has grown exponentially: # docs 1,000X; #query, 1,000X; per doc index, 3X; update rate from months to seconds, 50,000X; query latency, 5X; computer and computing powers, 1,000X. The underlying infrastructure has experienced 7 major revisions in the last 11 years.

At the concept level, the search infrastructure is simple. It has web servers upfront taking search queries. The queries are then passed on to two different types of servers: index servers and doc servers. For the index server, the input is the query string and the output is an array of doc-id and score pairs. For the doc servers, the input is the doc-id and query pair and the output is the title and snippet of the doc. Note that the snippet of the doc is query dependent so that you can find your keywords highlighted in the result pages. How to quickly and accurately calculate the output based on input involves a lot of advanced algorithms, and is not in the scope of Jeff’s talk.

Cloud Architecture Patterns: Façade VM

November 15th, 2010 No comments


Provide a single point of contact for a large-scale system consisting of many virtual machines so that they are viewed as one giant VM from outside



As Known As

Giant VM


When a system becomes big, you need multiple VMs to support the workload. For ease of use reasons, external users don’t want to manage multiple connections to each of the virtual machines. Who wants to remember a list of IP or DNS names for a service? Also, you just cannot expect your users to pick up the least-busy VMs for balanced workloads across your cluster of VMs. And to scale your application when your overall workload increases, you want a seamless way adding new capacities without notifying others.

Finally, if you offer a public service, you don’t want to allocate a public IP address for each of your VMs. These days, public IPs are scarce resources and may cost you money.

LISA 2010 Conference

November 14th, 2010 No comments

Later last week I attended 24th LISA conference by USENIX at San Jose Convention Center. The name LISA stands for Large Installation System Administration. It’s a great conference focusing on technology, training, and professional development for system administrators.

I am not a system administrator, but wanted to know more about system administration in general because of devops movement. So I attended many technical sessions covering from storage, networking, release engineering, cloud computing, social network website management, to career development as a system admin. I will blog some of these sessions I attended based on my notes.

Cloud Architecture Patterns: Stateless VM

November 8th, 2010 No comments


Ensure a virtual machine does not carry a permanent state so that it can be easily provisioned, migrated, and managed in the cloud.

Also Known As

Disposable VM




Virtualization is the cornerstone for cloud computing, especially at the infrastructure level. With many virtual machines created, managing them becomes a big challenge.

Among these challenges are system provisioning, backup, archiving, and patching different virtual machines. These administrative tasks take lots of CAPEX and OPEX.

We need a better way to architect applications for the cloud.


Making a VM stateless solves a lot of problems. For one thing, you force applications to save data outside of the virtual machine. No longer do you need to back up the virtual machine – only the data. It also makes the system provisioning easier without differentiating the different instances. When the stateless VMs crash for whatever reasons, you don’t lose much. Just add a new virtual machine and voila!

Cloud Architecture Patterns: Aspectual Centralization

November 7th, 2010 No comments


Separate concerns in large scales computing by leveraging different types of services in the cloud




The history of computing reveals different eras starting from mainframe to client/server to Web computing. With mainframes, computing is contained within the boundary of a mainframe. With client/server and web computing we see the separation of the presentation from the data. With all these computing models, the data is owned and maintained by different applications. The IT staffs who run and maintain the applications are responsible for backing up and maintaining data.

With the rise of cloud computing, I see a new trend that will fundamentally change the game and push productivity to all new levels. I call this “Aspectual Centralization” (AC). This is as important to cloud architecture as Model-View-Controller (MVC) is to software architecture.


With AC, different aspects of an application are extracted out and delegated to centralized services: data services, messaging services, logging services, and so on.

Top 3 Trends Every IT Professional Should Care About

November 6th, 2010 No comments

IBM DeveloperWorks recently published the result of a survey of 2000 IT professionals excluding IBM employees. The key findings are:

  1. Cloud Computing to overtake on-premise computing. For the question, “how do you rate the potential for cloud computing to overtake on-premise computing as the primary way organizations acquire IT by 2015?” 30.4% said likely, 21.6% most likely, and 13.6% definitely.
  2. Mobile application development to dominate. 55% of respondents see app development on mobile grows than other platforms in 5 years.
  3. IT professionals need, but often lack, industry-specific knowledge. 28.3% thought moderately important, 45.6% very important, and 15.9% extremely important. This is not an IT trend per se, but represents the demands for IT professionals.

The first two findings are mentioned and somewhat confirmed by another survey by Forrest and Dr. Dobbs, which is more developer oriented:

Cloud Computing Expo 2010 Silicon Valley

November 3rd, 2010 No comments

The Cloud Computing Expo takes place in the Santa Clara Convention Center from this Monday to Thursday. This twice-a-year event attracted thousands of attendees. Thanks to the invitation from Jeremy Geelan, I went to conference checking out several sessions and the exhibitions.

I found many familiar companies in the exhibition, from Oracle, Microsoft, VMware, and many other companies. Unlike VMworld, I don’t find many IHVs in the show. The ISVs demoed their products with strong focuses on Cloud. Microsoft for example demoed its Windows Azure family of services; VMware demoed its vCloud Director. I even found IBM booth which was much smaller than I expected. It turned out to be its recently acquired CastIron part.

Here are several companies I found interesting technologies from my trip:

Code2Cloud Reborn With a Greater Purpose

November 2nd, 2010 No comments

If you attended last year’s VMworld keynote by Steve Herrod or watched the online broadcasting, you may still recall the website (see the top banner here). That was a very simple Web application meant for the keynote attendees to submit names and email addresses to win a chance to go to the backstage with the Foreigner band. The website was hosted at Terramark vCloud and continued to run for about one month afterward.

Cloud Architecture Patterns: VM Pool

November 1st, 2010 4 comments


Provide a mechanism to fast provision virtual machines (VMs) and manage their lifecycles by maintaining a pool of virtual machines.




Virtual machines can be expensive to create. It takes several minutes to create a new virtual machine. Technologies like linked clone and storage offloading can help speed up the process, but it still takes time. And these alternative approaches, in some use cases, do not help when you need instant provisioning.


It’s generally a good practice to pool resources that are expensive to create. In programming, you pool threads and check them out on demand. When it’s done, you check them in back to the pool. This is what most Web Servers do for high performance.

You can leverage the same idea for VM provisioning. You create new virtual machines and put them into a pool. When there is a new request, you just check out one virtual machine from the pool. The following diagram shows how it works.

DoubleCloud Translated to Japanese

October 27th, 2010 2 comments

Several weeks ago, Christopher Wells sent me an email for permission for translating my blog into Japanese so that more people in Japan can benefit. Today he told me he had translated one of my posts into Japanese and posted it on his blog site. Here is the first paragraph in Japanese version:

オペレーティング・システム(OS)はソフトウェアの一部で、コンピュータのハードウェアを管理し、様々なアプリケーション用の一般サービスを提供しま す。クラウド・コンピューティングの台頭にともなって、OSがまだ関連するのか、また、将来のクラウドにおいてOSがどんな役割を果たすのか疑問に思う人 もいるでしょう。

Now a little quiz: which original article did Chris translate? I will leave it to you to guess out. :-) Hint: here is the link to the full article.

Cloud Architecture Patterns: VM Factory

October 26th, 2010 No comments

In my last blog, I wrote about pattern idea from the famous books “Design Pattern” and “A Pattern Language” and how it can be applied to cloud architecture design. Below and in later posts in this series I shall follow the content outline used there to illustrate the cloud architecture patterns.


Provide a standard way to create new virtual machines based on user requirements.




There are enormous combinations of virtual machines with different operating systems, middleware, and applications. And then we have user data to the mix! We need a standard way to create new virtual machines.

General speaking, there are three basic ways to create new virtual machines:

Undocumented VI and vSphere API Methods: A Little History

October 25th, 2010 No comments

Most developers may have noticed the asynchronous methods in vSphere API like PowerOnVM_Task method, but not so many know their synchronous peers like PowerOnVM before 4.1. VMware vSphere API Reference doesn’t mention them at all. But you can find them in WSDL(check out the WSDL snippets at the end of this article).

There is an exception however. In VI Perl, these synchronous methods are exposed. There, you can choose which one to use. In vSphere Java API 2.0, these methods are exposed only in the stub layer but not the object layer. You don’t want to use stub methods directly when you can use objects, therefore I don’t talk much about it even in my book. Somehow I came across a question in the forum asking about this. So I think it may be good to share a little history and insight here.

The differences of these twin methods are minimal. They have exactly same parameters but different returns. The methods whose names include _Task suffix have Task returned. When you have the Task return, the operation may not yet be done at the server side. But with the Task object, you can track the progress, and even get the result data objects.

How You Can Use vSphere APIs to Collect vCenter and ESX Logs

October 20th, 2010 3 comments

If you manage a vSphere infrastructure, you may want to collect logs for troubleshooting, debugging, etc. You can get these logs from vSphere Client manually. You can also use vSphere API to collect them automatically.

The related managed object type in vSphere API is the DiagnosticManager. It helps to access logs from either a vCenter server or ESX server. It has no property but three methods:

1. queryDescriptions() provides a list of diagnostic files for a given system. It takes in an optional parameter host for specifying the HostSystem to extract information from. When you connect to the ESX server directly, the parameter isn’t needed. In vSphere Java API, you just pass in a null. When you connect to the vCenter server and the parameter isn’t specified, the method assumes you’re looking for vCenter logs. The return of this method is an array of DiagnosticManagerLogDescriptor data objects. The data object includes six properties: creator, fileName, format, info, key, and mimeType.

Categories: vSphere API Tags: , , , ,

Cloud Architecture Patterns: Overview

October 18th, 2010 No comments

Design patterns have been very popular among software developers since the book by Gang of Four (Enrich Gamma, Richard Helm, Ralph Johnson, John Vlissides) in 1995. If you go to an interview for a software engineering position today, the chances are you most likely get one or more questions on design patterns. 

Like many concepts in software that from other disciplines, the “pattern” idea was “borrowed” from “A Pattern Language,” a book by Christopher Alexander on architectural patterns published 20 years before we started to talk about software design pattern. His book turns out to be a great way to summarize and document reusable design elements for different engineering works.

As Christopher points out, “Each pattern describes a problem which occurs over and over again in our environment, and then described the core of the solution to that problem, in such a way you can use the solution a million times over, without ever doing it the same way twice.”

Today I “borrow” the same idea and apply the concept to cloud computing on architecture designs. The main focus is on virtual machine-based architecture patterns so that you can best leverage virtual machines for your cloud computing.

What is a Cloud Architecture Pattern?

An architectural pattern extracts the common and re-usable design concepts and components. If we put it in the big picture, it should be somewhere below the overall system architecture and above software design as the middle layer.

Really Simple Tricks to Speed up Your CLIs 10 Times Using vSphere Java API

October 15th, 2010 4 comments

I recently had a short discussion with my colleague on implementing CLIs with vSphere Java API. One problem is that if you have multiple commands to run, each of them connects to the server and authenticate over and over. You’d better remember to logout the connection each time after you are done, or leave many un-used connections on the server that could significantly slow down your ESX or vCenter server (read this blog for details).

You can have two solutions to this problem. The first one is to have your own “interpreter” command. After you type the command, it shows you prompt for more sub-commands. It’s very much like the “ftp” command in that sense. You can have subcommands like “login” or “open” or “connect” for connecting to a server, and other commands. The “interpreter” command can then hold the ServiceInstance object until it’s closed in the end.

You can save about 0.3 to 0.5 second on creating new HTTPS connection and login for each command after the first one. It’s not a big deal given that vSphere Java API has hugely reduced that time from 3 to 4 seconds with Apache AXIS. So if you switch to vSphere Java API, you get instant 10 time performance gain. Still, if you have many commands to run, it could be a decent saving.

With this solution, you can also implement batch mode in which you can save all your commands into a file and then execute them all with one command. You can find many examples like PowerShell which support interactive mode and batch mode.

Another solution is just having normal commands. The problem becomes how to avoid the authentication for each command after the first. Luckily we have something for you in the API.

Categories: vSphere API Tags: , ,