How to Be a Smart IT Customer?

January 5th, 2011 No comments

This is the last set of notes I took at the LISA 2010 conference. It’s a great talk by Loren Jan Wilson, drawing on his experience with vendors while working at a supercomputer center.

The supercomputer, Intrepid, consists of 40,960 nodes in 40 racks. Each node has a 4-core CPU. Of all the nodes, 640 are dedicated to I/O. There is no local storage on any node. The supercomputer links to a very large tape library for archiving.

While operating the supercomputer, the speaker ran into issues with the high-speed network switches, e.g. 6% of ports died at random, and 15% of quad ports were flaky but never failed 100%. To complicate the issue, the switches offered no logs or CLI for troubleshooting, only a Web interface.

I believe the trouble the speaker faced is not an isolated case in the industry, and never will be. As long as you have to buy equipment or software from vendors, there will be issues one way or another. A great thing the speaker did was to summarize and share tips on how a customer should work with an IT vendor for a successful IT project.

I find these tips very helpful, and think customers and vendors alike should know them. They are listed below:

Cloud Architecture Patterns: Cloud Broker

January 3rd, 2011 No comments


Provide a single point of contact and management for multiple cloud service providers and maximize the benefits of leveraging multiple external clouds.




When you are buying and selling stocks or other securities, you hire a broker to execute the trade on your behalf. One reason for that is convenience. You don’t need to take care of the details of placing orders and working with multiple stock exchanges, and whatever else is required to trade securities.

How about working with multiple cloud service providers? For sure, you can go online to any cloud provider as long as you have your credit card ready. But is the service provider the best fit for your requirements? Do you have a backup plan if you are not satisfied with your service provider? Can you easily switch among your service providers to minimize cost or maximize flexibility? If you are not sure, you may then need something like a cloud broker.
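The broker idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Provider` fields, the provider names, the prices, and the "cheapest usable provider wins" rule are all hypothetical and not taken from any real broker product.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """A hypothetical cloud provider offer; all fields are illustrative."""
    name: str
    price_per_hour: float
    regions: frozenset
    available: bool = True


class CloudBroker:
    """Single point of contact that picks a provider on the caller's behalf."""

    def __init__(self, providers):
        self.providers = list(providers)

    def candidates(self, region):
        # Providers that are usable and serve the region, cheapest first.
        usable = [p for p in self.providers
                  if p.available and region in p.regions]
        return sorted(usable, key=lambda p: p.price_per_hour)

    def place(self, region):
        # Take the cheapest usable provider; the others are the backup plan.
        ranked = self.candidates(region)
        if not ranked:
            raise LookupError(f"no provider available for region {region!r}")
        return ranked[0].name


broker = CloudBroker([
    Provider("alpha", 0.12, frozenset({"us", "eu"})),
    Provider("beta", 0.10, frozenset({"us"})),
    Provider("gamma", 0.15, frozenset({"eu", "asia"})),
])
first_choice = broker.place("us")
broker.providers[1].available = False  # "beta" no longer satisfies us
fallback = broker.place("us")
```

The caller only ever talks to `broker.place(...)`; switching providers when one becomes unsatisfactory needs no change on the caller's side.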


Top 5 Predictions on Cloud Computing for 2011

December 29th, 2010 No comments

With 2011 approaching, many technologists and media outlets are busy with predictions for the year. I got an email from the chief of Cloud Computing Journal, Jeremy Geelan (@jg21), asking for my predictions. Here are my thoughts on cloud computing for 2011 and beyond:

  1. The focus of cloud computing will gradually shift from IaaS to PaaS, which becomes the key differentiator in competition. Developer enablement becomes more important than ever, including ecosystem evangelism, full software lifecycle integration, IDE support, APIs and frameworks, etc.
  2. Many more mergers and acquisitions (M&As) will take place in the cloud space as companies build stronger cloud portfolios. For big players, a portfolio should include dual vertically complete stacks, both as services and as products. Whoever gets there first will gain enormous advantages over its competitors.

Join Me at Partner Exchange 2011

December 22nd, 2010 2 comments

VMware Partner Exchange takes place twice a year. One happens at the same time and location as VMworld US; the other is held in places like Las Vegas and Orlando. It’s a conference dedicated to educating and enabling partners for success with VMware. It has merged with Technology Exchange, where you can find many technical presentations. I have been speaking at Technology Exchange since I joined VMware in 2007. Here are the related articles I wrote earlier.

The coming Partner Exchange will be in Orlando, FL from Feb 7 to 11. Please join us to hear VMware’s plans for the coming year, learn of new technologies and partner programs, and understand the training roadmap. Here is the content catalog with all the sessions. Don’t forget the famous hands-on labs throughout the week. I will talk about securing vSphere infrastructure with the vSphere API.

Open Source In Action: Open Source Projects from VMware

December 21st, 2010 2 comments

As a leading-edge software company, VMware has a long history of supporting open source software in its products. It also contributes many patches and projects back to the open source community, including the vijava API that I created. With the SpringSource and Zimbra acquisitions, more open source projects are associated with the VMware brand.

Here is a list of 10 home-grown open source projects from VMware. Please feel free to click the links for more details and play with them.

1. Dr. Memory. It’s “a memory monitoring tool capable of identifying memory-related programming errors such as accesses of uninitialized memory, accesses to unaddressable memory (including outside of allocated heap units and heap underflow and overflow), accesses to freed memory, double frees, memory leaks, and (on Windows) access to un-reserved thread local storage slots.”

2. Virtual USB Analyzer. A “free and open source tool for visualizing logs of USB packets, from hardware or software USB sniffer tools. As far as we know, it’s the world’s first tool to provide a graphical visualization along with raw hex dumps and high-level protocol analysis.”

Cloud Architecture Patterns: Service VM

December 20th, 2010 No comments


Provide an easy way to provision new infrastructure and application services for a computing cloud




To run a large-scale computing infrastructure, you will need many different types of services, including compute, storage, and networking, among others. After virtualization has successfully detached compute from the physical hardware, it’s very easy to provision and scale compute. But compute requires storage and networking which are lagging behind. To maximize the benefits of virtualization and cloud computing, it’s natural to push the storage and networking in the same direction.

Looking beyond the infrastructure to consider applications, we need various types of services such as database, directory, messaging, and more. I’ve covered the App VM pattern that allows using IaaS for PaaS in a previous blog. While you can pack some of these services into an application VM, the problem is that it scales well but does not follow the aspectual centralization pattern.


Must Knows About Release Engineering: Lessons From Google

December 15th, 2010 No comments

This is yet another post based on my notes taken at the LISA 2010 conference. The talk is “The 10 commandments in release engineering” by Dinah McNutt from Google. Dinah did a great job of summarizing the basics of release engineering, so it’s worthwhile to compile my notes and share them here.

Note that although typical release engineering does not produce virtual appliances, the basic principles are the same. You will find these basics helpful as well.

Release engineering is a critical part of software engineering, and releases should be treated as products in their own right. But often there is a disconnect between the developers who write the code and the system administrators who install it. The release process is usually an afterthought.

Typical Release Process

The following steps are executed during a release run:

From Developer to Devops: What System Administration Skills Should You Know?

December 14th, 2010 2 comments

With the rising trend of the devops movement, I was curious about system administration from a software developer’s perspective. That’s why I sat through Adam Moskowitz’s session “The Path to Senior Sysadmin.” Adam summarized the system administrator’s skills into three categories: hard tech skills, squishy tech skills, and software skills, as detailed below. Again, this is based on my notes taken at the LISA 2010 conference. For other posts related to the conference, check here.

Hard Tech Skills

  • All the commands for system administration;
  • System backup;
  • Some programming skills like Shell scripting, Perl/Python, C (read);
  • Software engineering knowledge like versioning, process;

Cloud Architecture Patterns: VM Pipeline

December 13th, 2010 No comments


Provide a configurable structure for modularized information processing




Complicated data processing involves many distinctive and repetitive steps. Each of these steps can be mapped to a software module that is independently developed and assembled for particular cases of data processing.

Given the elastic nature of cloud computing, it’s a perfect platform for data processing. We need a solution that is flexible in two ways:

1.     Modularized components for data processing;

2.     Configurable so that different modules can be re-used easily in various cases.
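The two requirements above can be illustrated with a minimal Python sketch. Plain functions stand in for the modules here; in the pattern itself each stage would run in its own VM. The module names and the registry are hypothetical, chosen only for the example.

```python
def strip_blank(records):
    """Drop empty or whitespace-only records."""
    return [r for r in records if r.strip()]


def lowercase(records):
    """Normalize every record to lower case."""
    return [r.lower() for r in records]


def dedupe(records):
    """Remove duplicates while keeping first-seen order."""
    seen, out = set(), []
    for r in records:
        if r not in seen:
            seen.add(r)
            out.append(r)
    return out


# Registry of independently developed modules (requirement 1).
MODULES = {"strip_blank": strip_blank, "lowercase": lowercase, "dedupe": dedupe}


def build_pipeline(step_names):
    """Assemble a pipeline from configuration (requirement 2)."""
    steps = [MODULES[name] for name in step_names]

    def run(records):
        for step in steps:
            records = step(records)
        return records

    return run


cleanup = build_pipeline(["strip_blank", "lowercase", "dedupe"])
result = cleanup(["A", "", "a", "B"])
```

A different case of data processing reuses the same modules simply by passing a different list of step names.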


Gadget Talk: What Are the Newest and Most Innovative Gadgets?

December 9th, 2010 No comments

I attended the SDForum emerging technology SIG meeting on Wednesday night. It was about cool gadgets, presented by award-winning tech journalist and entrepreneur Fred Davis, who was part of Wired, CNET, Ask Jeeves, and Ziff-Davis publishing on PC Magazine, PC Week, MacUser, and A+. The following are the gadgets Fred talked about in his presentation. The list is based on my notes, which may not be accurate. Hopefully it gives you ideas on what to buy for the holidays.

  1. iPod Touch. It’s like an iPhone but with no phone support. You can use Skype over WiFi, however. Price: $189 with 8GB
  2. iPod Nano. Small music player priced at about $139.
  3. iPod Shuffle. Smallest music player in the iPod family. Price: $49
  4. iPad. This is the hottest consumer device, and it set a record for fastest adoption. Price: $499
  5. iPad keyboard/wireless keyboard.
  6. Kindle WiFi 3G. It’s an e-book reader from Amazon. Compared with other readers or general purpose devices, it has no color but offers the best selection of books and digital rights management. Price: $189
  7. MiFi 2200 mobile hotspot. It uses Verizon 3G to provide a wireless connection to 5 WiFi-enabled devices. Price: $269, and it could be cheaper with an iPad bundle.

How Twitter Operates Its IT infrastructure: From Process to Tools

December 6th, 2010 No comments

This post is based on my notes taken at the talk by John Adams at the LISA 2010 conference. Any mistakes are all mine. Should you be interested in other sites, check out Google, Facebook, LinkedIn.

As one of the leading social Web sites with 165M users, Twitter demands a huge infrastructure to support its operation. There are 700M searches, and tweets arrive at 1,000 per second, rising to almost 4,000 at peak. The number of tweets is not that impressive, but these tweets need to be distributed to numerous followers, which can number several million for a single account.

These days Twitter gets 75% of its traffic from the API and 25% from the Web. The new Web interface uses AJAX heavily and acts as an API client to its backend.

As John put it, “nothing works the first time.” His recommendation is to use the best available technology for scaling. You will need to plan and build more than once to get it right.

Squares Aren’t Rectangles? A Common Misunderstanding of Object Oriented Design From MSDN Magazine

December 5th, 2010 3 comments

While reading the recent Dec 2010 issue of MSDN magazine, I found an article (Multiparadigmatic .Net, Part 4: Object Orientation) with misunderstandings on object oriented design. I was surprised that the author reached conclusions like, “squares aren’t rectangles,” and “no happy ending here.” The conclusions are based on misunderstandings of object oriented design.

Let me show you what the root problem is and how to get a happy ending. After reading this, you won’t be bothered by “squares aren’t rectangles.”

What’s the problem?

As most people already know, inheritance or generalization (I prefer the latter) is an important feature of OOD. Using it effectively can lead to a good object model and a concise codebase. In an inheritance relationship, a subtype must maintain an “IS-A” relationship with its supertype; for example, a Student type IS-A Person. I think most people are just fine with this.

vSphere Performance Counters for Monitoring ESX and vCenter

December 3rd, 2010 11 comments

VMware vSphere provides comprehensive performance metrics for performance monitoring and diagnosis. These stats are available not only through the vSphere Client but also through the vSphere APIs. To understand the overall performance management concepts, you want to read this article: Fundamentals of vSphere Performance Management.
Once you have the basics, you may wonder what types of stats are exposed. The following table summarizes all the 315 performance counters available in vSphere 4.1. As you might have guessed, the information is generated using the open source vSphere Java API and then imported into WordPress using WP-Table Reloaded. You can easily sort and search the table.

Update: Carter Shanklin and Luc Dekens have articles on performance counters as well:

Cloud Architecture Patterns: App VM

November 29th, 2010 No comments


Provide packaged software stack as Platform-as-a-Service (PaaS) platform for running applications



Also Known As



We all know the three different types of cloud services from Infrastructure-as-a-Service (IaaS), PaaS, to Software-as-a-Service (SaaS). If you want to leverage PaaS, you have to choose one of the PaaS service providers like Google or Microsoft. Leveraging an external PaaS has its own benefits.

What if you want to keep your applications running in-house but still enjoy the benefits of PaaS? Today you don’t have much choice. Google, for example, does not sell its App Engine as a product that you can install and run on premise. You have to run it on the Google cloud.


Really Simple Guidelines to Write Great Code Samples

November 24th, 2010 2 comments

When a developer learns a new programming language or API, the first thing is probably to try out a HelloWorld sample. As the saying goes, real programmers don’t read documents. Although I don’t fully agree with that, it has some truth in it.

In my own experience, I normally continue with other samples after the HelloWorld one. When something is not quite clear, I check the API reference or read some tutorials. Anyway, I am not telling you how to learn a new language or API, but trying to make a point about the importance of code samples for developers. In my opinion, samples are the most effective way to empower your users.

I think you would agree with me that there are too many bad samples. Here are some typical symptoms:

  1. Too much boilerplate code, to the point that the code illustrating the API usage gets buried. Typical boilerplate code includes extensive exception handling, GUI, logging, etc. Some samples even have a common library that could confuse your users totally.
  2. Too many API calls in one sample. You may need several APIs for a use case, but don’t aim one sample at multiple use cases.
  3. Too much object orientation. Object oriented programming is a best practice for application development. But it could confuse your developers sometimes.
  4. Dependencies on other APIs. To run the sample, your users need to install other libraries which may or may not need extra configuration or tuning. To understand the sample, users need to understand additional APIs. Extra burden, really!
  5. Of course, the typical bad smells of programming, which are not unique to samples. For example, bad naming, unnecessary global variables, using object attributes for passing values between methods, etc.

Now, how can you develop great samples? Besides the best practices for writing great applications, you want to follow these guidelines:

Wire Compatibility of Web Services

November 23rd, 2010 No comments

As a software professional, you may have heard about source compatibility and binary compatibility. With Web Services, a new type of compatibility came up. This is what I call wire compatibility. It’s not about the programming interface but about the XML messages passed on the wire. Since we don’t use XML directly but through programming APIs, wire compatibility surfaces in, and affects, source and binary compatibility.

Too abstract? You bet. Let’s pick an example here. Because the VMware vSphere API is defined in WSDL, I will use it in the following discussion.

In vSphere 4.1, the method PowerOnMultiVM_Task() gets an additional parameter called option, typed as an OptionValue array. The following are the related parts of the WSDL:

<operation name="PowerOnMultiVM_Task">
  <input message="vim25:PowerOnMultiVM_TaskRequestMsg" />
  <output message="vim25:PowerOnMultiVM_TaskResponseMsg" />
  <fault name="RuntimeFault" message="vim25:RuntimeFaultFaultMsg"/>
</operation>

<complexType name="PowerOnMultiVMRequestType">
  <sequence>
    <element name="_this" type="vim25:ManagedObjectReference" />
    <element name="vm" type="vim25:ManagedObjectReference" maxOccurs="unbounded" />
    <element name="option" type="vim25:OptionValue" minOccurs="0" maxOccurs="unbounded" />
  </sequence>
</complexType>

As you can see, the minOccurs of the option element is zero, meaning it’s optional. If you have an application built with 4.0 (which had no option parameter), its SOAP request still works. So it’s compatible on the wire.
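To make the minOccurs reasoning concrete, here is a small Python sketch that checks a request body against occurrence rules copied from the excerpt above. It is a deliberate simplification, not a real schema validator: it ignores namespaces, element order, and element types, and the object IDs in the sample requests are made up.

```python
import xml.etree.ElementTree as ET

# (name, minOccurs, maxOccurs) from the PowerOnMultiVMRequestType excerpt.
# None stands for "unbounded"; an omitted attribute defaults to 1.
RULES_41 = [
    ("_this", 1, 1),
    ("vm", 1, None),
    ("option", 0, None),
]


def conforms(request_xml, rules):
    """Return True if the child element counts satisfy the occurrence rules."""
    root = ET.fromstring(request_xml)
    counts = {}
    for child in root:
        counts[child.tag] = counts.get(child.tag, 0) + 1
    allowed = {name for name, _, _ in rules}
    if set(counts) - allowed:
        return False  # an element the rules do not know about
    for name, lo, hi in rules:
        n = counts.get(name, 0)
        if n < lo or (hi is not None and n > hi):
            return False
    return True


# A request as a 4.0-era client would send it: no option element at all.
REQUEST_40 = """<PowerOnMultiVM_Task>
  <_this type="Datacenter">datacenter-2</_this>
  <vm type="VirtualMachine">vm-10</vm>
  <vm type="VirtualMachine">vm-11</vm>
</PowerOnMultiVM_Task>"""

# A broken request that omits the required vm element.
REQUEST_NO_VM = """<PowerOnMultiVM_Task>
  <_this type="Datacenter">datacenter-2</_this>
</PowerOnMultiVM_Task>"""

old_client_ok = conforms(REQUEST_40, RULES_41)
missing_vm_ok = conforms(REQUEST_NO_VM, RULES_41)
```

The old-style request passes because option may occur zero times, while leaving out vm fails because its minOccurs defaults to 1.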

Critical Lessons Learned at Facebook on Scalability and Reliability

November 21st, 2010 1 comment

Facebook is no doubt the biggest web site, surpassing Google in terms of Web traffic according to an article published half a year ago. Given its scale, the lessons learned would be very helpful for others building scalable IT infrastructures. This post is based on my notes taken at the talk by Robert Johnson and Sanjeev Kumar at the LISA 2010 conference. Should there be any mistakes, they are all mine.

According to the speakers, the architecture of Facebook is relatively simple: Web servers in the front, databases at the back. In the middle is a caching layer with a lot of memcached servers. If you recall my previous post, they use PHP extensively.
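The three tiers can be pictured with a toy Python sketch. It assumes a look-aside read path with invalidation on write, which is a common way to use a cache tier but is my assumption, not something the speakers spelled out. Plain dictionaries stand in for memcached and the database, and the class names are hypothetical.

```python
class Database:
    """Stand-in for the database tier; counts reads to show cache effect."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


class CachedStore:
    """What a Web server would call: cache first, database on a miss."""

    def __init__(self, db):
        self.db = db
        self.cache = {}  # stand-in for the memcached tier

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value

    def put(self, key, value):
        self.db.rows[key] = value
        self.cache.pop(key, None)  # invalidate so the next read is fresh


db = Database({"user:1": "alice"})
store = CachedStore(db)
first = store.get("user:1")   # miss: goes to the database
second = store.get("user:1")  # hit: served from the cache
store.put("user:1", "alice2")
third = store.get("user:1")   # miss again after invalidation
```

Only the first and third reads reach the database; the second is absorbed by the cache layer.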

Unlike other sites, like email sites, whose users are well mapped and isolated to different servers, social Websites like Facebook have unique challenges in that their users are linked together. Errors in one part of a system may cascade easily and bring down the whole site.

Here are several important lessons Facebook learned while building software and operating the site:

What You Can Learn from IBM Research on Designing Private Cloud

November 16th, 2010 No comments

IBM Researcher Kyung Ryu presented RC2, a private cloud, at the LISA 2010 conference. As is typical for an IBM project, the presentation has 20+ co-authors. The following is based on my notes taken from the session, and therefore may contain my misunderstandings.

Having an internal cloud is not a big deal these days. You can find several products on the market. What is truly unique and challenging for RC2 is that it supports very different virtualization platforms, from X86 based hypervisors on X-series servers, to IBM PowerVM on P-series, to the mainframe based native virtualization on Z-series. Therefore RC2 is really a hybrid private cloud.

The talk focused on system architecture with several diagrams. I cannot reproduce these diagrams but will list the key components of the system:

What Lessons You Can Learn from Google on Building Infrastructure

November 15th, 2010 No comments

Last week I attended a great talk by Google Fellow Jeffrey Dean at Stanford University. Jeff talked about his first-hand experience building software systems at Google since 1999 and the lessons learned. The following summary is solely based on my notes, and therefore may contain my misunderstandings.

A Brief History

During the past 10 years or so, the scale of the Google infrastructure has grown exponentially: number of docs, 1,000X; number of queries, 1,000X; per-doc index, 3X; update rate (from months to seconds), 50,000X; query latency, 5X; computers and computing power, 1,000X. The underlying infrastructure has experienced 7 major revisions in the last 11 years.

At the conceptual level, the search infrastructure is simple. It has web servers up front taking search queries. The queries are then passed on to two different types of servers: index servers and doc servers. For the index servers, the input is the query string and the output is an array of doc-id and score pairs. For the doc servers, the input is the doc-id and query pair and the output is the title and snippet of the doc. Note that the snippet of the doc is query dependent, so that you can find your keywords highlighted in the result pages. How to quickly and accurately calculate the output based on the input involves a lot of advanced algorithms, and is not in the scope of Jeff’s talk.
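The inputs and outputs described above can be mocked up in Python to show how the pieces fit. Everything here is a toy: the three documents, the term-count scoring, and the upper-casing used for highlighting are hypothetical stand-ins for the advanced algorithms the talk did not cover.

```python
# A tiny in-memory corpus: doc-id -> (title, body). Purely illustrative.
DOCS = {
    1: ("Cloud broker", "a broker manages several cloud providers"),
    2: ("Release engineering", "a release is built tested and shipped"),
    3: ("Cloud facade", "one address hides a cloud of machines"),
}


def index_server(query):
    """Query string in, list of (doc-id, score) pairs out, best first."""
    terms = query.lower().split()
    hits = []
    for doc_id, (_, body) in DOCS.items():
        words = body.lower().split()
        score = sum(words.count(t) for t in terms)  # toy scoring
        if score:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


def doc_server(doc_id, query):
    """(doc-id, query) in, (title, query-dependent snippet) out."""
    terms = set(query.lower().split())
    title, body = DOCS[doc_id]
    snippet = " ".join(w.upper() if w.lower() in terms else w
                       for w in body.split())
    return title, snippet


def web_server(query):
    """Front end: ask the index servers first, then the doc servers."""
    return [doc_server(doc_id, query) for doc_id, _ in index_server(query)]


results = web_server("cloud")
```

Because the doc server receives the query as well as the doc-id, the same document yields a different snippet for a different query.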

Cloud Architecture Patterns: Façade VM

November 15th, 2010 No comments


Provide a single point of contact for a large-scale system consisting of many virtual machines so that they are viewed as one giant VM from outside



Also Known As

Giant VM


When a system becomes big, you need multiple VMs to support the workload. For ease of use, external users don’t want to manage multiple connections to each of the virtual machines. Who wants to remember a list of IP addresses or DNS names for a service? Also, you just cannot expect your users to pick the least-busy VMs for balanced workloads across your cluster of VMs. And to scale your application when your overall workload increases, you want a seamless way to add new capacity without notifying others.

Finally, if you offer a public service, you don’t want to allocate a public IP address for each of your VMs. These days, public IPs are scarce resources and may cost you money.
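The forces above can be sketched as a small Python class. This is a minimal illustration, assuming a "least requests in flight" rule for choosing a backend; the class name, the method names, and the addresses are all hypothetical.

```python
class FacadeVM:
    """One public address in front of many privately addressed VMs."""

    def __init__(self, public_address):
        self.public_address = public_address
        self.load = {}  # private backend address -> requests in flight

    def add_backend(self, private_address):
        # New capacity joins behind the facade; clients are not notified.
        self.load[private_address] = 0

    def dispatch(self):
        # Pick the least-busy backend; ties are broken by address.
        if not self.load:
            raise RuntimeError("no backend VM registered")
        backend = min(self.load, key=lambda addr: (self.load[addr], addr))
        self.load[backend] += 1
        return backend

    def finish(self, backend):
        # Called when a request completes on the given backend.
        self.load[backend] -= 1


facade = FacadeVM("203.0.113.10")
facade.add_backend("10.0.0.1")
facade.add_backend("10.0.0.2")
picks = [facade.dispatch() for _ in range(3)]
facade.add_backend("10.0.0.3")  # scale out without telling anyone
next_pick = facade.dispatch()
```

External users only ever see `public_address`, so a single public IP serves the whole cluster, and the newly added VM starts taking work on the very next request.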