Tag Archives: software architecture

Platform vs. Stairform

It’s probably fair to say anyone working in software knows a term called platform. It’s a term borrowed from transportation industry, where a raised and flat space on which passengers trains in a station. In software, it means something you can leverage, either an environment for running your software or a development library for building your applications.

Like many things in software, platform has never had a clear definition. Different people basically have their own versions of definitions. That is not necessarily a bad thing – at least it helped

Posted in Software Development | Also tagged , | Leave a comment

Non-Patentable Software Architecture

I was working at Rational Software when IBM bought it almost 10 years ago. After the acquisition, we got emails and trainings encouraging us to file patent disclosures. As you may have known, IBM has been the top patent holder in the USA, maybe worldwide as well. At that time, I coined the term “non-patentable software architecture.”

Pattern system was invented to protect and encourage innovations. There are three criteria for a valid pattern: novelty, usefulness, and non-obviousness. Those are all in relative terms. As I can tell, there is rarely anything

Posted in Software Development | Also tagged | 1 Response

What You Can Learn from IBM Research on Designing Private Cloud

IBM Researcher Kyung Ryu presented a private cloud RC2 at LISA 2010 conference. As a typical IBM project, the presentation has 20+ co-authors. The following is based on my notes taken from the session, therefore may contain my misunderstandings.

Having an internal cloud is not a big deal these days. You can find several products from the market. What is truly unique and challenging for RC2 is that it supports very different virtualization platforms from X86 based hypervisors on X-series servers, to IBM PowerVM on P-series, to the mainframe based native virtualization on Z-series. Therefore RC2 is really a hybrid private cloud.

The talk focused on system architecture with several diagrams. I cannot reproduce these diagrams but would list the key components of the system:

Posted in Cloud Computing | Also tagged , , , , | Leave a comment

What Lessons You Can Learn from Google on Building Infrastructure

Last week I attended a great talk by Google Fellow Jeffrey Dean at Stanford University. Jeff talked about his first hand experience on building software systems at Google since 1999 and lessons learned. The following summary is solely based on my notes, therefore may contain my misunderstandings.

A Brief History

During the past 10 years or so, the scale of the Google infrastructure has grown exponentially: # docs 1,000X; #query, 1,000X; per doc index, 3X; update rate from months to seconds, 50,000X; query latency, 5X; computer and computing powers, 1,000X. The underlying infrastructure has experienced 7 major revisions in the last 11 years.

At the concept level, the search infrastructure is simple. It has web servers upfront taking search queries. The queries are then passed on to two different types of servers: index servers and doc servers. For the index server, the input is the query string and the output is an array of doc-id and score pairs. For the doc servers, the input is the doc-id and query pair and the output is the title and snippet of the doc. Note that the snippet of the doc is query dependent so that you can find your keywords highlighted in the result pages. How to quickly and accurately calculate the output based on input involves a lot of advanced algorithms, and is not in the scope of Jeff’s talk.

Posted in Software Development | Also tagged , | 1 Response

Decomposition and Challenges in Parallel Programming: Is It Useful for Cloud Computing?

A recent article from Dr. Dobb’s introduced Fundamental Concepts of Parallel Programming. Richard Gerber and Andrew Binstock, authors of Programming with Hyper-Threading Technology, discussed three different forms of de-compositions for multi-threading:

  1. Functional decomposition. It’s one of the most common ways to achieve parallel execution. Using this approach, individual tasks are catalogued. If two of them can run concurrently, they are scheduled to do so by the developer.
  2. Producer/Consumer. It’s a form of functional decomposition in which one thread’s output is the input to a second. Can be hard to avoid, but frequently detrimental to performance.
  3. Data decomposition, a.k.a. “data level parallelism.” It breaks down tasks by the data they work on, rather than by nature of the task. Programs that are broken down via data decomposition generally have many threads performing the same work, just on different data items.

To make the three forms easy to understand, the authors used gardening as analogy where the threads map to gardners. For exmaple, the fuctional decomposition in gardening is to have one gardner to move the lawn and the other to weed. I find this analogy very intuitive and easy to follow. Even you don’t know multh-threading, you can guess it out from the gardening analogy.

The challenges while working with multi-threading are:

Posted in Cloud Computing | Also tagged , | Leave a comment

Zimbra Architecture Overview – A Must Read Document

It’s been a while after VMware acquired Zimbra. In a VMware Console blog, VMware CTO Steve Herrod explained how Zimbra fits in VMware’s mission to simplify IT.

After the acquisition, I actually tried the online demo at Zimbra website. The impression I had was that the Zimbra client was very much like Microsoft Outlook in a browser.

Posted in Cloud Computing, Software Development | Also tagged , | 2 Responses

Top 10 Best Practices Architecting Applications for VMware Cloud (part 4)

This 4th and last part contains best practice No.7 ~ 10. To be notified for future posts, feel free to subscribe to this feed, and follow me at Twitter.

#7 Levarage vApp

vApp is a new addition to vSphere. It’s essentially a group of VMs that work together as a solution. You can manage them as a basic unit like a VM. It provides you higher level granularity for resource allocation and management.

This is an ideal container for your application if you have multiple virtual machines involved. They may or may not form a cluster, but are bundled together for a same goal.

The vApps are not only easily managed by the vSphere, but also imported and export as a bundle. Therefore you can easily move it without worrying what should be included while copying it.

VMware provides tools like VMware Studio using which you can create and configure vApps easily. The VMware Studio has Web based console, customization and build engine, build process automation with CLI (command line interface).

Other alternatives include:

Posted in Cloud Computing, Software Development | Also tagged , | Leave a comment

Top 10 Best Practices Architecting Applications for VMware Cloud (part 3)

This 3rd part contains best practice No.4 ~ 6. To be notified for the rest, feel free to subscribe to this feed, and follow me at Twitter.

#4 Scale Applications as Needed

Most time, people think scalability is to handle more workload when needed. This is true, but not enough. A truly scalable system should scale back. This is how you will save money. This is equally important as the first case where you get more revenue by serving more traffic.

There are different ways to scale:

  • Up and down. This is unique in virtualized environment in which you adjust the memory or CPU allocation and use more or less of them instantly.
  • Out & in. This means you include more machines either physical or virtual into your application.

You have to think over several architecture decision points:

Posted in Cloud Computing, Software Development | Also tagged , | Leave a comment

Top 10 Best Practices Architecting Applications for VMware Cloud (part 2)

This second part contains best practice No.1 ~ 3.

#1 Move Up to Higher Level Software Stack 

“If I have seen a little further it is by standing on the shoulders of Giants.”

 Isaac Newton

Modern software development is all about leverage. You don’t want to build everything from bottom up. Whatever “giant shoulders” you can leverage, you should do so. Remember the keyword here is “giant shoulders.” You got to be selective on what is “a giant” and what is not, for the best quality of your system. Your application’s quality is a function of that of the systems underneath it.

The typical “giant shoulders” includes middleware, high level programming languages, tools. By leveraging these, you should expect your applications portable, and with much less code and higher developer productivity. The portability is very important when you want to deploy your applications in federated cloud environment.

I posted a short article after the cloud demo in VMworld 2009 keynotes. It’s reposted at SpringSource blog. The article introduced “DIY PaaS” concept which you could have a higher level of development platform inside your enterprise in a similar way as would you get from vendors like Google, but without vendor lock-in.

The “DIY PaaS” does not require you to use any specific platforms, middleware or framework. You could use any existing combination of systems on top of Java, .Net, Python, PHP. For example, you could use Java with Spring framework for building your web applications or enterprise integration frameworks. The choice is really yours, not of any vendors. When making a decision, you want to consider various factors like your team’s expertise and preference, total cost of software licenses, design constraints posted by for example existing investments on particular software.

Having decided the combination of software stack, you want to pack them into virtual machine templates that can be re-used by various teams. If you have multiple combinations, you can have multiple virtual machine templates in your catalog.

In general, you want as few templates as possible. Why?

Having less VM templates means less effort to build them, to manage, to upgrade, and to test. This might not seem like a big deal but could become a big deal in a longer term when you have to maintain multiple versions of these templates at the same time.

It also means less storage. vSphere has a special technology called linked clone. The new virtual machine doesn’t fully clone the disks, but links back to the template. If you have least templates, you can have a huge saving on the disk space. High quality storage can be very expensive.

Last but not least benefit is less memory. vSphere has memory page sharing technology which keeps one copy of same page contents, and converts others as pointers to the single copy. It’s only possible when you have identical memory pages. When you have virtual machines cloned from a same templates, the chance of identical memory pages increases dramatically.

There are several techniques to keep the least number of virtual machine templates:

Posted in Cloud Computing, Software Development | Also tagged , | Leave a comment

Top 10 Best Practices Architecting Applications for VMware Cloud (part 1)

Overview

With more data centers running on purely virtualized systems, more applications are therefore running on top of virtualized environment than ever before. These virtualized systems are interconnected with each other and become a cloud platform. Virtualization is the pre-requisite and starting point of the cloud computing at the IaaS level.

Posted in Cloud Computing, Software Development | Also tagged , | 5 Responses

vSphere Java API Architecture Deep Dive

In my previous blog, I talked about the object model of the vSphere API. Many people like the UML diagram that illustrates how the managed objects are inherited from each other.

Following that blog, I will introduce the object model of the open source Java API that is built on top of the Web Services, as well as some key design decisions I made while designing the API.

The following UML diagram is extracted from the overall model but adds much more details with properties and methods. If you can understand this diagram, you can then easily understand all other managed object types.

Posted in Software Development, vSphere API | Also tagged , | 3 Responses

Trying Self Paced Lab without Going to PEX 2010

I mentioned the vSphere API self paced lab at PEX in my previous blog. Not all the people who are interested in learning the API made it to PEX last month. A reader asked me when it can be online in his comment.

Here is the VI Java API part in the tutorial. We had the environment set up all together for you in the PEX lab including the Eclipse and all the related jar files. So it’s very easy to get started there. Without going to PEX, you need to do something extra by yourself. But that is not too hard at all. I promise it won’t take you much time at all. To get the basic one done, you probably need 5 to 30 minutes depending on your familiarity with Java.

Ready to learn?

Posted in Software Development, vSphere API | Also tagged , | Leave a comment

Two Books Every Top Software Architect Should Read

When asked what books to read, I always recommend the following two books:

Are Your Lights On?: How to Figure Out What the Problem Really Is by Donald C. Gause; Gerald M. Weinberg

The Design of Everyday Things by Donald A. Norman

“But, they seem nothing to do with software!” You say.

You are right. But remember the blog title has a keyword “top.” If you want to stand out in any profession, some of the extra skills may well be outside your typical set as others expect.

The same is true for software profession. I assume you already know the basics of software programming, process, design patterns, etc., so another programming book doesn’t help you much to the top technically.

Let me explain why and how these two books help you.

Posted in Software Development | Also tagged | Leave a comment
  • NEED HELP?


    My company has created products like vSearch ("Super vCenter"), vijavaNG APIs, EAM APIs, ICE tool. We also help clients with virtualization and cloud computing on customized development, training. Should you, or someone you know, need these products and services, please feel free to contact me: steve __AT__ doublecloud.org.

    Me: Steve Jin, VMware vExpert who authored the VMware VI and vSphere SDK by Prentice Hall, and created the de factor open source vSphere Java API while working at VMware engineering. Companies like Cisco, EMC, NetApp, HP, Dell, VMware, are among the users of the API and other tools I developed for their products, internal IT orchestration, and test automation.