Author Archive

What You Can Learn from IBM Research on Designing Private Cloud

November 16th, 2010 No comments

IBM researcher Kyung Ryu presented RC2, a private cloud, at the LISA 2010 conference. As is typical for an IBM project, the presentation has 20+ co-authors. The following is based on the notes I took during the session and may therefore contain my misunderstandings.

Having an internal cloud is not a big deal these days. You can find several products on the market. What is truly unique and challenging about RC2 is that it supports very different virtualization platforms, from x86-based hypervisors on X-series servers, to IBM PowerVM on P-series, to the mainframe-based native virtualization on Z-series. RC2 is therefore really a hybrid private cloud.

The talk focused on system architecture and included several diagrams. I cannot reproduce those diagrams, but I will list the key components of the system:

What Lessons You Can Learn from Google on Building Infrastructure

November 15th, 2010 No comments

Last week I attended a great talk by Google Fellow Jeffrey Dean at Stanford University. Jeff talked about his firsthand experience building software systems at Google since 1999 and the lessons he learned. The following summary is based solely on my notes and may therefore contain my misunderstandings.

A Brief History

During the past 10 years or so, the scale of the Google infrastructure has grown exponentially: number of docs, 1,000X; number of queries, 1,000X; per-doc index, 3X; update rate (from months to seconds), 50,000X; query latency, 5X; computers and computing power, 1,000X. The underlying infrastructure has gone through 7 major revisions in the last 11 years.

At the conceptual level, the search infrastructure is simple. Web servers up front take the search queries. The queries are then passed on to two different types of servers: index servers and doc servers. For the index servers, the input is the query string and the output is an array of doc-id and score pairs. For the doc servers, the input is a doc-id and query pair and the output is the title and snippet of the doc. Note that the snippet is query dependent, which is why you find your keywords highlighted in the result pages. How to calculate the output from the input quickly and accurately involves a lot of advanced algorithms and was not in the scope of Jeff’s talk.
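To make the input and output of each server type concrete, here is a toy sketch in Python. It is only an illustration of the data flow described above, not how Google implements it: the tiny corpus, the term-count scoring, the `*word*` highlighting, and the function names are all my own assumptions.

```python
# Toy corpus keyed by doc-id; titles and bodies are made up for illustration.
DOCS = {
    1: ("Cloud patterns", "a pool of virtual machines speeds up provisioning"),
    2: ("Search basics", "an index maps each word to the documents containing it"),
    3: ("VM notes", "virtual machines can be expensive to create"),
}


def index_server(query: str) -> list[tuple[int, float]]:
    """Input: query string. Output: (doc-id, score) pairs, best score first."""
    terms = query.lower().split()
    hits = []
    for doc_id, (_title, body) in DOCS.items():
        words = body.lower().split()
        # Naive score (an assumption): how often the query terms occur.
        score = float(sum(words.count(term) for term in terms))
        if score > 0:
            hits.append((doc_id, score))
    # Sort by descending score, breaking ties by doc-id.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


def doc_server(doc_id: int, query: str) -> tuple[str, str]:
    """Input: (doc-id, query). Output: title and a query-dependent snippet."""
    title, body = DOCS[doc_id]
    terms = set(query.lower().split())
    # Highlight query terms so the snippet depends on the query.
    snippet = " ".join(
        f"*{word}*" if word.lower() in terms else word for word in body.split()
    )
    return title, snippet


def web_server(query: str) -> list[tuple[str, str]]:
    """Front end: ask the index server for doc-ids, then the doc server for results."""
    return [doc_server(doc_id, query) for doc_id, _score in index_server(query)]
```

Calling `web_server("virtual machines")` returns one title and snippet per matching doc, with the query words marked in each snippet.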

Cloud Architecture Patterns: Façade VM

November 15th, 2010 No comments

Intent

Provide a single point of contact for a large-scale system consisting of many virtual machines, so that from the outside they are viewed as one giant VM.

Category

Structural

Also Known As

Giant VM

Motivation

When a system grows large, you need multiple VMs to support the workload. For ease of use, external users don’t want to manage separate connections to each of the virtual machines. Who wants to remember a list of IP addresses or DNS names for a service? You also cannot expect your users to pick the least-busy VMs to balance the workload across your cluster. And to scale your application as the overall workload increases, you want a seamless way to add new capacity without notifying anyone.

Finally, if you offer a public service, you don’t want to allocate a public IP address for each of your VMs. These days, public IPs are scarce resources and may cost you money.
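The requirements above (one entry point, least-busy routing, and adding capacity without telling clients) can be sketched in a few lines of Python. This is a minimal in-memory illustration under my own assumptions: the `Facade` class, its method names, and the VM names are hypothetical, and a real façade would forward network traffic rather than return strings.

```python
class Facade:
    """Single point of contact that hides a set of backend VMs."""

    def __init__(self, backends):
        # Track the number of in-flight requests per backend VM.
        self._load = {name: 0 for name in backends}

    def add_backend(self, name):
        """Add capacity; clients keep talking to the facade and never notice."""
        self._load.setdefault(name, 0)

    def handle(self, request):
        """Route the request to the least-busy VM (ties broken by name)."""
        name = min(self._load, key=lambda n: (self._load[n], n))
        self._load[name] += 1
        return f"{name} handled {request}"

    def finish(self, name):
        """Mark one request on the given VM as completed."""
        if self._load[name] > 0:
            self._load[name] -= 1
```

A client only ever calls `handle()` on the façade; which VM serves the request, and how many VMs exist, stays an internal detail.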

LISA 2010 Conference

November 14th, 2010 No comments

Late last week I attended the 24th LISA conference, organized by USENIX, at the San Jose Convention Center. The name LISA stands for Large Installation System Administration. It’s a great conference focusing on technology, training, and professional development for system administrators.

I am not a system administrator, but I wanted to know more about system administration in general because of the DevOps movement. So I attended many technical sessions covering topics from storage, networking, release engineering, cloud computing, and social network website management to career development as a system admin. I will blog about some of the sessions I attended, based on my notes.

Cloud Architecture Patterns: Stateless VM

November 8th, 2010 No comments

Intent

Ensure a virtual machine does not carry a permanent state so that it can be easily provisioned, migrated, and managed in the cloud.

Also Known As

Disposable VM

Category

Behavioral

Motivation

Virtualization is the cornerstone for cloud computing, especially at the infrastructure level. With many virtual machines created, managing them becomes a big challenge.

Among these challenges are system provisioning, backup, archiving, and patching different virtual machines. These administrative tasks take lots of CAPEX and OPEX.

We need a better way to architect applications for the cloud.

Solution

Making a VM stateless solves a lot of problems. For one thing, you force applications to save data outside of the virtual machine. You no longer need to back up the virtual machine, only the data. It also makes system provisioning easier, because you don’t have to differentiate between instances. When a stateless VM crashes for whatever reason, you don’t lose much. Just add a new virtual machine and voila!
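Here is a minimal Python sketch of the idea, under my own assumptions: `ExternalStore` is a hypothetical stand-in for whatever data service lives outside the VM (a database, a storage service, and so on), and `StatelessVM` is a hypothetical application instance that keeps nothing locally.

```python
class ExternalStore:
    """Stand-in for a data service that lives outside any VM (an assumption)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class StatelessVM:
    """An application instance that keeps no permanent state of its own."""

    def __init__(self, name, store):
        self.name = name
        self._store = store  # all durable data goes through the external store

    def record_visit(self, user):
        """Read, update, and write the counter in the external store."""
        count = self._store.get(user, 0) + 1
        self._store.put(user, count)
        return count
```

If the first instance crashes, a replacement created with the same store picks up exactly where it left off, because the counter never lived inside the VM.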

Cloud Architecture Patterns: Aspectual Centralization

November 7th, 2010 No comments

Intent

Separate concerns in large-scale computing by leveraging different types of services in the cloud.

Category

Behavioral

Motivation

The history of computing reveals different eras, from mainframe to client/server to Web computing. With mainframes, computing is contained within the boundary of a mainframe. With client/server and Web computing, we see the separation of the presentation from the data. In all these computing models, the data is owned and maintained by different applications. The IT staff who run and maintain the applications are responsible for backing up and maintaining the data.

With the rise of cloud computing, I see a new trend that will fundamentally change the game and push productivity to entirely new levels. I call this “Aspectual Centralization” (AC). It is as important to cloud architecture as Model-View-Controller (MVC) is to software architecture.

Solution

With AC, different aspects of an application are extracted out and delegated to centralized services: data services, messaging services, logging services, and so on.

Top 3 Trends Every IT Professional Should Care About

November 6th, 2010 No comments

IBM developerWorks recently published the results of a survey of 2,000 IT professionals, excluding IBM employees. The key findings are:

  1. Cloud Computing to overtake on-premise computing. For the question, “how do you rate the potential for cloud computing to overtake on-premise computing as the primary way organizations acquire IT by 2015?” 30.4% said likely, 21.6% most likely, and 13.6% definitely.
  2. Mobile application development to dominate. 55% of respondents expect app development on mobile to grow more than on other platforms in 5 years.
  3. IT professionals need, but often lack, industry-specific knowledge. 28.3% rated it moderately important, 45.6% very important, and 15.9% extremely important. This is not an IT trend per se, but it reflects what is demanded of IT professionals.

The first two findings are also mentioned, and somewhat confirmed, by another survey by Forrester and Dr. Dobb’s, which is more developer oriented:

Cloud Computing Expo 2010 Silicon Valley

November 3rd, 2010 No comments

The Cloud Computing Expo is taking place at the Santa Clara Convention Center from Monday to Thursday this week. This twice-a-year event attracts thousands of attendees. Thanks to an invitation from Jeremy Geelan, I went to the conference to check out several sessions and the exhibition.

I found many familiar companies at the exhibition, including Oracle, Microsoft, and VMware. Unlike at VMworld, I didn’t find many IHVs at the show. The ISVs demoed their products with a strong focus on cloud. Microsoft, for example, demoed its Windows Azure family of services; VMware demoed its vCloud Director. I even found an IBM booth, which was much smaller than I expected. It turned out to belong to its recently acquired Cast Iron unit.

Here are several companies whose technologies I found interesting during my trip:

Code2Cloud Reborn With a Greater Purpose

November 2nd, 2010 No comments

If you attended last year’s VMworld keynote by Steve Herrod or watched the online broadcast, you may still recall the code2cloud.com website (see the top banner here). It was a very simple Web application that let keynote attendees submit their names and email addresses for a chance to go backstage with the band Foreigner. The website was hosted on Terremark vCloud and continued to run for about one month afterward.

Cloud Architecture Patterns: VM Pool

November 1st, 2010 4 comments

Intent

Provide a mechanism to quickly provision virtual machines (VMs) and manage their lifecycles by maintaining a pool of virtual machines.

Category

Creational

Motivation

Virtual machines can be expensive to create. It takes several minutes to create a new virtual machine. Technologies like linked clones and storage offloading can help speed up the process, but it still takes time. And in some use cases, these alternative approaches do not help when you need instant provisioning.

Solution

It’s generally a good practice to pool resources that are expensive to create. In programming, you pool threads and check them out on demand. When a task is done, you check the thread back in to the pool. This is what most Web servers do for high performance.

You can leverage the same idea for VM provisioning. You create new virtual machines and put them into a pool. When there is a new request, you just check out one virtual machine from the pool. The following diagram shows how it works.
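The check-out and check-in cycle can be sketched in Python as follows. This is a minimal illustration under my own assumptions, not a production design: `VMPool` and its method names are hypothetical, and `create_vm` is a placeholder for whatever slow provisioning call your platform offers.

```python
from collections import deque


class VMPool:
    """Keeps pre-created VMs ready so that a request is served instantly."""

    def __init__(self, size, create_vm):
        # create_vm is a caller-supplied factory (an assumption); in real
        # life it would be the slow call that provisions a virtual machine.
        self._create_vm = create_vm
        self._idle = deque(create_vm() for _ in range(size))
        self._in_use = set()

    def check_out(self):
        """Hand out an idle VM; fall back to slow creation if the pool is empty."""
        vm = self._idle.popleft() if self._idle else self._create_vm()
        self._in_use.add(vm)
        return vm

    def check_in(self, vm):
        """Return a VM to the pool so it can be reused."""
        self._in_use.remove(vm)
        self._idle.append(vm)

    def idle_count(self):
        return len(self._idle)
```

A pool created with `VMPool(2, create_vm)` pays the creation cost twice up front, after which `check_out()` returns immediately until the pool runs dry. Whether a checked-in VM should be reset or discarded before reuse is a design choice this sketch leaves open.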