Archive for November, 2010

Cloud Architecture Patterns: App VM

November 29th, 2010 No comments


Provide a packaged software stack as a Platform-as-a-Service (PaaS) platform for running applications



Also Known As



We all know the three types of cloud services: Infrastructure-as-a-Service (IaaS), PaaS, and Software-as-a-Service (SaaS). If you want to leverage PaaS, you have to choose one of the PaaS service providers like Google or Microsoft. Leveraging an external PaaS has its own benefits.

What if you want to keep your applications running in-house but still enjoy the benefits of PaaS? Today you don’t have much choice. Google, for example, does not sell its App Engine as a product that you can install and run on premises. You have to run it on the Google cloud.


Really Simple Guidelines to Write Great Code Samples

November 24th, 2010 2 comments

When a developer learns a new programming language or API, the first thing to do is probably to try out a HelloWorld sample. As the saying goes, real programmers don’t read documents. Although I don’t fully agree with that, it has some truth in it.

In my own experience, I normally continue with other samples after the HelloWorld one. When something is not quite clear, I check out the API reference or read some tutorials. Anyway, I am not telling you how to learn a new language or API, but trying to make a point about the importance of code samples for developers. In my opinion, samples are the most effective way to empower your users.

I think you would agree with me that there are too many bad samples. Here are some typical symptoms:

  1. Too much boilerplate code, to the point that the code illustrating the API usage gets buried. Typical boilerplate code includes extensive exception handling, GUI, logging, etc. Some samples even have a common library that could totally confuse your users.
  2. Too many API calls in one sample. You may need several APIs for a use case, but don’t aim one sample at multiple use cases.
  3. Too much object orientation. Object-oriented programming is a best practice for application development, but in a sample it can confuse your developers.
  4. Dependencies on other APIs. To run the sample, your users need to install other libraries which may or may not need extra configuration or tuning. To understand the sample, users need to understand additional APIs. Extra burden, really!
  5. Of course, typical bad smells of programming which are not unique to samples. For example, bad naming, unnecessary global variables, using object attributes for passing values between methods, etc.

Now, how can you develop great samples? Besides the best practices for writing great applications, you want to follow these guidelines:

Wire Compatibility of Web Services

November 23rd, 2010 No comments

As a software professional, you may have heard about source compatibility and binary compatibility. With Web Services, a new type of compatibility came up. This is what I call wire compatibility. It’s not related to the programming but to the XML messages passed on the wire. Since we don’t use XML directly but go through programming APIs, wire compatibility surfaces in, and affects, source and binary compatibility.

Too abstract? You bet. Let’s pick an example here. Because the VMware vSphere API is defined in WSDL, I will use it in the following discussion.

In vSphere 4.1, the method PowerOnMultiVM_Task() gets an additional parameter called option, typed as an OptionValue array. The following are the related parts of the WSDL:

<operation name="PowerOnMultiVM_Task">
  <input message="vim25:PowerOnMultiVM_TaskRequestMsg" />
  <output message="vim25:PowerOnMultiVM_TaskResponseMsg" />
  <fault name="RuntimeFault" message="vim25:RuntimeFaultFaultMsg"/>
</operation>
<complexType name="PowerOnMultiVMRequestType">
  <sequence>
    <element name="_this" type="vim25:ManagedObjectReference" />
    <element name="vm" type="vim25:ManagedObjectReference" maxOccurs="unbounded" />
    <element name="option" type="vim25:OptionValue" minOccurs="0" maxOccurs="unbounded" />
  </sequence>
</complexType>

As you can see, the minOccurs of the option element is zero, meaning it’s optional. If you have an application built with 4.0 (which had no option parameter back then), its SOAP request still works. So it’s compatible on the wire.
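To make the idea concrete, here is a minimal Python sketch of the check a server effectively performs. It is not the vSphere SDK and not a real SOAP stack: the namespace value, the `build_request` and `is_wire_compatible` helpers, and the simplified schema lists are all assumptions made for illustration. It only models the minOccurs rule from the WSDL excerpt above.

```python
import xml.etree.ElementTree as ET

NS = "urn:vim25"  # assumed namespace, used here only for illustration

# (element name, minOccurs) pairs mirroring the WSDL excerpt above.
SCHEMA_40 = [("_this", 1), ("vm", 1)]                 # before "option" existed
SCHEMA_41 = [("_this", 1), ("vm", 1), ("option", 0)]  # "option" is optional


def build_request(this_ref, vm_refs, options=None):
    """Build a simplified request body (hypothetical helper, not SDK code)."""
    body = ET.Element(f"{{{NS}}}PowerOnMultiVM_Task")
    ET.SubElement(body, f"{{{NS}}}_this").text = this_ref
    for vm in vm_refs:
        ET.SubElement(body, f"{{{NS}}}vm").text = vm
    for key, value in (options or {}).items():
        opt = ET.SubElement(body, f"{{{NS}}}option")
        ET.SubElement(opt, f"{{{NS}}}key").text = key
        ET.SubElement(opt, f"{{{NS}}}value").text = value
    return body


def is_wire_compatible(request, schema):
    """Accept a request if it has no unknown elements and meets every minOccurs."""
    counts = {}
    for child in request:
        name = child.tag.split("}")[1]  # strip the "{namespace}" prefix
        counts[name] = counts.get(name, 0) + 1
    known = {name for name, _ in schema}
    if any(name not in known for name in counts):
        return False
    return all(counts.get(name, 0) >= min_occurs for name, min_occurs in schema)


old_client = build_request("datacenter-1", ["vm-1", "vm-2"])
new_client = build_request("datacenter-1", ["vm-1"], {"someKey": "someValue"})
```

In this model a request from a 4.0-style client passes the 4.1 schema because the missing element has minOccurs of zero, while a request carrying option would be rejected by the older schema, which does not know that element.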

Critical Lessons Learned at Facebook on Scalability and Reliability

November 21st, 2010 1 comment

Facebook is no doubt the biggest web site, having surpassed Google in terms of Web traffic according to an article published half a year ago. Given its scale, the lessons learned would be very helpful for others building scalable IT infrastructures. This post is based on my notes taken at the talk by Robert Johnson and Sanjeev Kumar at the LISA 2010 conference. Should there be any mistakes, they are all mine.

According to the speakers, the architecture of Facebook is relatively simple: Web servers in the front, databases at the back. In the middle is a caching layer with a lot of memcached servers. If you recall my previous post, they use PHP extensively.

Unlike other sites, like email sites, whose users are well mapped and isolated to different servers, social Websites like Facebook have unique challenges in that their users are linked together. Errors in one part of a system may cascade easily and bring down the whole site.

Here are several important lessons Facebook learned while building software and operating the site:

What You Can Learn from IBM Research on Designing Private Cloud

November 16th, 2010 No comments

IBM Researcher Kyung Ryu presented a private cloud called RC2 at the LISA 2010 conference. As is typical for an IBM project, the presentation has 20+ co-authors. The following is based on my notes taken from the session and therefore may contain my misunderstandings.

Having an internal cloud is not a big deal these days. You can find several products on the market. What is truly unique and challenging for RC2 is that it supports very different virtualization platforms, from x86-based hypervisors on X-series servers, to IBM PowerVM on P-series, to the mainframe-based native virtualization on Z-series. Therefore RC2 is really a hybrid private cloud.

The talk focused on system architecture with several diagrams. I cannot reproduce these diagrams but will list the key components of the system:

What Lessons You Can Learn from Google on Building Infrastructure

November 15th, 2010 No comments

Last week I attended a great talk by Google Fellow Jeffrey Dean at Stanford University. Jeff talked about his firsthand experience building software systems at Google since 1999 and the lessons learned. The following summary is solely based on my notes and therefore may contain my misunderstandings.

A Brief History

During the past 10 years or so, the scale of the Google infrastructure has grown exponentially: # docs, 1,000X; # queries, 1,000X; per doc index, 3X; update rate (from months to seconds), 50,000X; query latency, 5X; computers and computing power, 1,000X. The underlying infrastructure has experienced 7 major revisions in the last 11 years.

At the conceptual level, the search infrastructure is simple. It has web servers up front taking search queries. The queries are then passed on to two different types of servers: index servers and doc servers. For the index servers, the input is the query string and the output is an array of doc-id and score pairs. For the doc servers, the input is a doc-id and query pair and the output is the title and snippet of the doc. Note that the snippet of the doc is query dependent so that you can find your keywords highlighted in the result pages. How to quickly and accurately calculate the output based on the input involves a lot of advanced algorithms, and is not in the scope of Jeff’s talk.
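The flow described above can be sketched in a few lines of Python. This is only a toy model of the inputs and outputs, not Google’s implementation: the `DOCS` data, the word-count scoring, and the function names `index_server`, `doc_server`, and `web_server` are all assumptions made for illustration.

```python
import re

# Toy corpus: doc-id -> (title, text). Purely illustrative data.
DOCS = {
    1: ("Cloud patterns", "A stateless VM keeps no permanent state."),
    2: ("VM pools", "A pool of virtual machines speeds up provisioning."),
}


def index_server(query):
    """Query string in, list of (doc_id, score) pairs out, best score first."""
    terms = query.lower().split()
    hits = []
    for doc_id, (_title, text) in DOCS.items():
        words = re.findall(r"\w+", text.lower())
        score = sum(words.count(term) for term in terms)  # naive scoring
        if score:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def doc_server(doc_id, query):
    """(doc_id, query) in, (title, snippet) out; the snippet is query dependent."""
    title, text = DOCS[doc_id]
    snippet = text
    for term in query.split():
        pattern = rf"\b({re.escape(term)})\b"
        snippet = re.sub(pattern, r"*\1*", snippet, flags=re.IGNORECASE)
    return title, snippet


def web_server(query):
    """Front end: ask the index servers first, then the doc servers."""
    return [doc_server(doc_id, query) for doc_id, _score in index_server(query)]
```

Calling `web_server("pool")` returns the title of the matching doc together with a snippet in which the keyword is marked with asterisks, which is the query-dependent part mentioned above.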

Cloud Architecture Patterns: Façade VM

November 15th, 2010 No comments


Provide a single point of contact for a large-scale system consisting of many virtual machines so that they are viewed as one giant VM from outside



Also Known As

Giant VM


When a system becomes big, you need multiple VMs to support the workload. For ease of use, external users don’t want to manage multiple connections to each of the virtual machines. Who wants to remember a list of IP addresses or DNS names for a service? Also, you just cannot expect your users to pick the least-busy VMs to balance workloads across your cluster of VMs. And to scale your application when your overall workload increases, you want a seamless way to add new capacity without notifying others.

Finally, if you offer a public service, you don’t want to allocate a public IP address for each of your VMs. These days, public IPs are scarce resources and may cost you money.
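A minimal Python sketch of the idea follows. It is only a routing model under stated assumptions, not a real load balancer: the `FacadeVM` class, its method names, and the addresses are hypothetical. Clients see one public address, the facade picks the least-busy backend, and new capacity is added without clients noticing.

```python
class FacadeVM:
    """Single point of contact in front of many backend VMs (illustrative only)."""

    def __init__(self, public_address):
        self.public_address = public_address  # the only address clients know
        self._load = {}                       # private address -> in-flight requests

    def add_backend(self, private_address):
        """Add capacity; clients keep using the same public address."""
        self._load.setdefault(private_address, 0)

    def handle(self, request):
        """Route a request to the least-busy backend and return its address."""
        if not self._load:
            raise RuntimeError("no backend VMs available")
        backend = min(self._load, key=self._load.get)
        self._load[backend] += 1
        return backend

    def done(self, backend):
        """Mark one request on the given backend as finished."""
        self._load[backend] -= 1


facade = FacadeVM("203.0.113.10")  # documentation-range address, hypothetical
facade.add_backend("10.0.0.1")
facade.add_backend("10.0.0.2")
```

Only the facade needs a public IP address in this model; the backends use private addresses that external users never see.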

LISA 2010 Conference

November 14th, 2010 No comments

Late last week I attended the 24th LISA conference by USENIX at the San Jose Convention Center. The name LISA stands for Large Installation System Administration. It’s a great conference focusing on technology, training, and professional development for system administrators.

I am not a system administrator, but wanted to know more about system administration in general because of the DevOps movement. So I attended many technical sessions covering everything from storage, networking, release engineering, cloud computing, and social network website management to career development as a system admin. I will blog about some of the sessions I attended based on my notes.

Cloud Architecture Patterns: Stateless VM

November 8th, 2010 No comments


Ensure a virtual machine does not carry a permanent state so that it can be easily provisioned, migrated, and managed in the cloud.

Also Known As

Disposable VM




Virtualization is the cornerstone for cloud computing, especially at the infrastructure level. With many virtual machines created, managing them becomes a big challenge.

Among these challenges are system provisioning, backup, archiving, and patching different virtual machines. These administrative tasks take lots of CAPEX and OPEX.

We need a better way to architect applications for the cloud.


Making a VM stateless solves a lot of problems. For one thing, you force applications to save data outside of the virtual machine. No longer do you need to back up the virtual machine, only the data. It also makes system provisioning easier because you no longer need to differentiate between instances. When a stateless VM crashes for whatever reason, you don’t lose much. Just add a new virtual machine and voila!
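Here is a minimal Python sketch of the pattern, assuming a simple in-memory class standing in for whatever external storage you actually use. The names `ExternalStore` and `StatelessVM` are hypothetical and exist only for illustration.

```python
class ExternalStore:
    """Stand-in for storage that lives outside any VM (an assumption of this sketch)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class StatelessVM:
    """A VM that keeps no permanent state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # every piece of durable data goes through the store

    def handle_visit(self, user):
        """Count a visit; the counter lives in the store, not in the VM."""
        count = self.store.get(user, 0) + 1
        self.store.put(user, count)
        return count


store = ExternalStore()
vm1 = StatelessVM("vm-1", store)
vm1.handle_visit("alice")

del vm1                            # the VM "crashes"
vm2 = StatelessVM("vm-2", store)   # just add a new one; no restore step
```

Because the replacement VM is created exactly like the first one and reads everything from the store, nothing is lost when an instance disappears.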

Cloud Architecture Patterns: Aspectual Centralization

November 7th, 2010 No comments


Separate concerns in large-scale computing by leveraging different types of services in the cloud




The history of computing reveals different eras, from mainframe to client/server to Web computing. With mainframes, computing is contained within the boundary of a mainframe. With client/server and Web computing we see the separation of the presentation from the data. With all these computing models, the data is owned and maintained by different applications. The IT staff who run and maintain the applications are responsible for backing up and maintaining data.

With the rise of cloud computing, I see a new trend that will fundamentally change the game and push productivity to all new levels. I call this “Aspectual Centralization” (AC). This is as important to cloud architecture as Model-View-Controller (MVC) is to software architecture.


With AC, different aspects of an application are extracted out and delegated to centralized services: data services, messaging services, logging services, and so on.
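As a rough Python sketch of what this could look like, the applications below contain only business logic and hand the data and logging aspects to shared services. All class and method names here are hypothetical, and the in-memory services merely stand in for real centralized cloud services.

```python
class DataService:
    """Centralized data service shared by all applications (in-memory stand-in)."""

    def __init__(self):
        self.records = {}

    def save(self, app, key, value):
        self.records[(app, key)] = value


class LoggingService:
    """Centralized logging service shared by all applications (in-memory stand-in)."""

    def __init__(self):
        self.entries = []

    def write(self, app, message):
        self.entries.append((app, message))


class OrderApp:
    """An application that owns neither its data storage nor its logs."""

    def __init__(self, name, data, log):
        self.name = name
        self.data = data
        self.log = log

    def place_order(self, order_id, item):
        self.data.save(self.name, order_id, item)            # data aspect delegated
        self.log.write(self.name, f"order {order_id} placed")  # logging aspect delegated


data, log = DataService(), LoggingService()
shop_a = OrderApp("shop-a", data, log)
shop_b = OrderApp("shop-b", data, log)
shop_a.place_order(1, "book")
shop_b.place_order(1, "lamp")
```

Backup, retention, and similar chores would then be handled once, inside the centralized services, rather than separately by each application.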

Top 3 Trends Every IT Professional Should Care About

November 6th, 2010 No comments

IBM DeveloperWorks recently published the result of a survey of 2000 IT professionals excluding IBM employees. The key findings are:

  1. Cloud Computing to overtake on-premise computing. For the question, “how do you rate the potential for cloud computing to overtake on-premise computing as the primary way organizations acquire IT by 2015?” 30.4% said likely, 21.6% most likely, and 13.6% definitely.
  2. Mobile application development to dominate. 55% of respondents expect app development on mobile to grow more than on other platforms within 5 years.
  3. IT professionals need, but often lack, industry-specific knowledge. 28.3% thought it moderately important, 45.6% very important, and 15.9% extremely important. This is not an IT trend per se, but represents the demands on IT professionals.

The first two findings are mentioned and somewhat confirmed by another survey by Forrester and Dr. Dobb’s, which is more developer oriented:

Cloud Computing Expo 2010 Silicon Valley

November 3rd, 2010 No comments

The Cloud Computing Expo takes place in the Santa Clara Convention Center from this Monday to Thursday. This twice-a-year event attracts thousands of attendees. Thanks to the invitation from Jeremy Geelan, I went to the conference to check out several sessions and the exhibition.

I found many familiar companies in the exhibition, including Oracle, Microsoft, VMware, and many others. Unlike at VMworld, I didn’t find many IHVs at the show. The ISVs demoed their products with a strong focus on Cloud. Microsoft, for example, demoed its Windows Azure family of services; VMware demoed its vCloud Director. I even found an IBM booth, which was much smaller than I expected. It turned out to belong to its recently acquired Cast Iron unit.

Here are several companies whose technologies I found interesting during my trip:

Code2Cloud Reborn With a Greater Purpose

November 2nd, 2010 No comments

If you attended last year’s VMworld keynote by Steve Herrod or watched the online broadcast, you may still recall the website (see the top banner here). That was a very simple Web application meant for the keynote attendees to submit names and email addresses for a chance to go backstage with the band Foreigner. The website was hosted on the Terremark vCloud and continued to run for about one month afterward.

Cloud Architecture Patterns: VM Pool

November 1st, 2010 4 comments


Provide a mechanism to quickly provision virtual machines (VMs) and manage their lifecycles by maintaining a pool of virtual machines.




Virtual machines can be expensive to create. It takes several minutes to create a new virtual machine. Technologies like linked clone and storage offloading can help speed up the process, but it still takes time. And these alternative approaches, in some use cases, do not help when you need instant provisioning.


It’s generally a good practice to pool resources that are expensive to create. In programming, you pool threads and check them out on demand. When a thread is done, you check it back in to the pool. This is what most Web servers do for high performance.

You can leverage the same idea for VM provisioning. You create new virtual machines and put them into a pool. When there is a new request, you just check out one virtual machine from the pool. The following diagram shows how it works.
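In code, the check-out idea might look like the following Python sketch. It is a simplified model rather than a real provisioning system: the `VMPool` class, the `create_vm` factory (a stand-in for the slow, minutes-long creation step), and the optional check-in step are all assumptions for illustration.

```python
import itertools
from collections import deque

_ids = itertools.count(1)


def create_vm():
    """Stand-in for the expensive VM creation step (hypothetical)."""
    return f"vm-{next(_ids)}"


class VMPool:
    """Keep pre-created VMs ready so that a request is served instantly."""

    def __init__(self, size, factory):
        self._factory = factory
        # Pay the creation cost up front, before any request arrives.
        self._idle = deque(factory() for _ in range(size))

    def idle_count(self):
        return len(self._idle)

    def check_out(self):
        """Hand out a ready VM; fall back to slow creation if the pool is empty."""
        if self._idle:
            return self._idle.popleft()
        return self._factory()

    def check_in(self, vm):
        """Return a VM to the pool for reuse (an optional part of this sketch)."""
        self._idle.append(vm)


pool = VMPool(size=2, factory=create_vm)
first = pool.check_out()  # served instantly from the pool
```

A real pool would also refill itself in the background and clean or recycle returned VMs; both are left out here to keep the sketch short.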