This second part covers best practices No. 1 through 3.
＃1 Move Up to Higher Level Software Stack
“If I have seen a little further it is by standing on the shoulders of Giants.”
Modern software development is all about leverage. You don’t want to build everything from the bottom up. Whatever “giant shoulders” you can leverage, you should. The key phrase here is “giant shoulders”: you have to be selective about what counts as a giant and what does not, because your application’s quality is a function of the quality of the systems underneath it.
Typical “giant shoulders” include middleware, high-level programming languages, and tools. By leveraging these, you can expect your applications to be more portable, to need much less code, and to be built with higher developer productivity. Portability is especially important when you want to deploy your applications in a federated cloud environment.
I posted a short article after the cloud demo in the VMworld 2009 keynotes; it was reposted on the SpringSource blog. The article introduced the “DIY PaaS” concept: you can have a higher-level development platform inside your enterprise, similar to what you would get from vendors like Google, but without vendor lock-in.
“DIY PaaS” does not require you to use any specific platform, middleware, or framework. You can use any existing combination of systems on top of Java, .NET, Python, or PHP. For example, you could use Java with the Spring Framework to build your web applications or enterprise integration frameworks. The choice is really yours, not any vendor’s. When making the decision, consider factors such as your team’s expertise and preferences, the total cost of software licenses, and design constraints imposed by, for example, existing investments in particular software.
Once you have decided on your software stack combinations, pack them into virtual machine templates that can be reused by various teams. If you have multiple combinations, you can have multiple virtual machine templates in your catalog.
In general, you want as few templates as possible. Why?
Having fewer VM templates means less effort to build, manage, upgrade, and test them. This might not seem like a big deal, but it can become one in the longer term, when you have to maintain multiple versions of these templates at the same time.
It also means less storage. vSphere has a technology called linked clone: the new virtual machine doesn’t fully clone the template’s disks but links back to them. With fewer templates, more virtual machines share each base disk, so you can save a great deal of disk space. High-quality storage can be very expensive.
Last but not least, it means less memory. vSphere has memory page sharing technology, which keeps a single copy of pages with the same contents and converts the others into pointers to that copy. This is only possible when memory pages are identical. When virtual machines are cloned from the same template, the chance of identical memory pages increases dramatically.
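To make the page sharing idea concrete, here is a toy model in Python. It is only a sketch of the general content-based deduplication idea, not how vSphere implements it: the 4 KB page size, the use of SHA-256, and the function name share_pages are all assumptions made for illustration.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; assumed page size for this illustration


def share_pages(vm_memories):
    """Store each distinct page once and map every VM page to a stored copy.

    vm_memories: dict of VM name -> bytes (length a multiple of PAGE_SIZE).
    Returns (store, page_tables): store maps a content digest to the page
    bytes; page_tables maps each VM name to its list of digests.
    """
    store = {}
    page_tables = {}
    for name, memory in vm_memories.items():
        table = []
        for offset in range(0, len(memory), PAGE_SIZE):
            page = memory[offset:offset + PAGE_SIZE]
            digest = hashlib.sha256(page).hexdigest()
            store.setdefault(digest, page)  # keep only the first copy
            table.append(digest)
        page_tables[name] = table
    return store, page_tables


# Two VMs cloned from one template share the OS pages; only app pages differ.
os_pages = b"\x01" * PAGE_SIZE + b"\x02" * PAGE_SIZE
vm_a = os_pages + b"\xaa" * PAGE_SIZE
vm_b = os_pages + b"\xbb" * PAGE_SIZE
store, tables = share_pages({"vm_a": vm_a, "vm_b": vm_b})
total_pages = sum(len(t) for t in tables.values())
print(f"{total_pages} logical pages backed by {len(store)} stored pages")
```

In this toy run the two clones need six logical pages but only four are actually stored, because the two template pages are kept once.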
There are several techniques for keeping the number of virtual machine templates to a minimum:
- Standardize your OS. The OS is usually the biggest piece of your templates. Choosing one OS could be half of your success.
- Install everything into your templates. If you have multiple combinations, the differences are usually small pieces of software at the top of the stack. Install all of them into one template. This may waste some storage and cloning effort, but overall it should save you more than keeping separate templates.
- Externalize the configurations. Don’t create a new template just for a different configuration of a piece of software. Inject the configuration after the VM is cloned, or configure the system on the fly.
- Install software after cloning. This differs from the methods above in that it takes considerably more effort to manage software installation and exception handling. Not only do you need to design your installation logic, you also need to take care of things like firewalls. Some companies have policies that software must be pre-approved before it can be installed, so this approach may not always be feasible.
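Externalized configuration can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the defaults live in the template, per-VM values arrive as environment variables, and the APP_ prefix, the key names, and the load_config helper are hypothetical names chosen for the example.

```python
import json
import os


def load_config(defaults, env=None, prefix="APP_"):
    """Overlay defaults baked into the template with values injected per VM.

    Any environment variable named prefix + KEY overrides defaults[key].
    Values are parsed as JSON when possible, else kept as plain strings.
    """
    env = os.environ if env is None else env
    config = dict(defaults)
    for key in defaults:
        raw = env.get(prefix + key.upper())
        if raw is None:
            continue  # nothing injected for this key: keep the default
        try:
            config[key] = json.loads(raw)
        except json.JSONDecodeError:
            config[key] = raw
    return config


# Defaults shipped inside the template (hypothetical keys).
TEMPLATE_DEFAULTS = {"db_host": "localhost", "pool_size": 5, "debug": False}
# Values injected after cloning; a plain dict stands in for os.environ here.
injected = {"APP_DB_HOST": "db.internal.example", "APP_POOL_SIZE": "20"}
config = load_config(TEMPLATE_DEFAULTS, env=injected)
print(config)
```

The same template then serves every environment; only the injected values differ from one clone to the next.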
After your standard templates are ready, deploy and test your application on them. Depending on your application release cycle, you can package your application in one of two ways: build it into the template, or install it after cloning. The shorter the cycle, the better the latter approach works. Either way, think this through from the beginning of your system design, and make sure you loop in the operations team, because the choice affects them later.
＃2 Don’t Assume Anything
A virtualized cloud environment can be very dynamic unless you choose otherwise. Virtual machines or virtual appliances can move from one physical machine to another without noticeable disruption of service to the outside. Your application should not assume anything about the physical location where it is running.
The IP address, an often-used identifier, can also change each time the virtual machine is powered on or restarted. You could use static IP addresses as part of your network design, but to simplify operations you will mostly want to use DHCP to allocate dynamic IPs to virtual machines unless there is a strong reason not to. Even the MAC address of a virtual machine can be changed easily.
With this in mind, don’t use either the IP address or the MAC address to identify your virtual machine or application. Use a directory service and self-discovery instead.
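The idea of looking components up by name can be illustrated with a toy in-memory directory. This is a sketch only: a real deployment would use DNS, LDAP, or a dedicated registry service, and the ServiceRegistry class, the 30-second TTL, and the service name below are all made up for the example.

```python
import time


class ServiceRegistry:
    """Toy directory: maps a logical service name to its current address."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # name -> (address, time of last registration)

    def register(self, name, address):
        """Called by a VM at boot and periodically afterwards as a heartbeat."""
        self._entries[name] = (address, self._clock())

    def lookup(self, name):
        """Return the latest address, or None if unknown or stale."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, seen = entry
        if self._clock() - seen > self._ttl:
            return None  # no heartbeat within the TTL: treat as gone
        return address


registry = ServiceRegistry()
registry.register("orders-db", "10.0.0.12:5432")
# The VM restarts, gets a new DHCP lease, and re-registers under the same name.
registry.register("orders-db", "10.0.0.47:5432")
print(registry.lookup("orders-db"))
```

Callers hold on to the name "orders-db" and resolve it when they need it, so an address change after a restart does not break them.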
In most cloud environments, especially public clouds, there are private IPs and public IPs. Public IP addresses are the ones visible from outside. Given the limited total number of IPv4 addresses, public IPs are a scarce resource. You will therefore typically get only a few public IPs for all your virtual machines, or pay extra for more.
Even if you can pay for extra IPs, it’s not necessary most of the time. Typically, the front-end virtual machines get public IP addresses while the middle-tier and back-end virtual machines don’t; they still communicate with each other using private IP addresses.
This is mostly a restriction of public cloud environments such as Terremark vCloudExpress. For a private cloud it is less of a problem.
Beyond networking surprises, virtual machines can start and shut down over time, or even crash unexpectedly. This could be caused by the hypervisor, but also by the OS or the applications.
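One common way to cope with a peer VM that may be restarting is to retry with a growing delay. The sketch below assumes the failure surfaces as a ConnectionError; the call_with_retry helper, the attempt count, and the delays are illustrative choices, and flaky_backend is a made-up stand-in for a remote call.

```python
import time


def call_with_retry(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(); on ConnectionError wait and retry with doubling delay."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * (2 ** attempt))


# Hypothetical backend that is unreachable for the first two calls.
calls = {"count": 0}


def flaky_backend():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("backend VM not reachable")
    return "ok"


delays = []  # record the waits instead of really sleeping in this demo
result = call_with_retry(flaky_backend, sleep=delays.append)
print(result, delays)
```

Retrying only helps with transient failures, so the last failure is re-raised and the caller still has to handle a peer that never comes back.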
Another interesting aspect of cloud computing is that you can save money by controlling when you run your applications. There can be multiple reasons. For example, cloud service providers want to balance their workload between daytime and nighttime. They may need to keep the hypervisors running regardless of whether there is any workload. By offering cheaper rates at night, they can shift more workload to the night to offset their electricity bill.
For enterprises with the DPM (Distributed Power Management) feature enabled, idle power consumption is less of a problem because unused machines are powered off and then powered back on when workloads return. Still, you get cheaper rates from the power grid at night and may want to do more then if possible. If your applications are flexible enough on timing, you can end up saving big bucks.
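If a job is flexible on timing, deciding when to run it can be as simple as the sketch below. The 22:00 to 06:00 window is an assumption for illustration, not any provider’s actual rate schedule, and next_off_peak_start is a hypothetical helper name.

```python
from datetime import datetime, time


def next_off_peak_start(now, window_start=time(22, 0), window_end=time(6, 0)):
    """Return `now` if inside the overnight window, else the next window start.

    The window is assumed to wrap past midnight (start later than end).
    """
    current = now.time()
    if current >= window_start or current < window_end:
        return now  # already off-peak: run immediately
    return datetime.combine(now.date(), window_start)


afternoon = next_off_peak_start(datetime(2024, 1, 15, 14, 30))
late_night = next_off_peak_start(datetime(2024, 1, 15, 23, 10))
print(afternoon, late_night)
```

A job submitted in the afternoon is held until the evening window opens, while one submitted at night runs right away.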
Application lifecycles can bring surprises as well. Really large applications, for example, may take a while to upgrade, during which different versions of components coexist and run together. You therefore have to handle compatibility carefully. At a minimum, keep the interfaces among components relatively stable, if not the same, while evolving your applications. This demands extra attention to interface design: you can change internal interfaces easily, but not the ones shared with other components.
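One way to let old and new component versions run side by side is to make the receiver accept both message formats and ignore fields it does not know. The sketch below is only an illustration of that idea: the order message, its field names, and both versions are invented for the example.

```python
def parse_order(message):
    """Read an order message from either interface version.

    v1 messages carry 'amount' in whole currency units; v2 (hypothetical)
    renames it to 'amount_cents' and adds an optional 'currency'.
    Unknown fields are ignored so newer senders do not break older readers.
    """
    version = message.get("version", 1)
    if version == 1:
        cents = int(round(message["amount"] * 100))
        currency = "USD"
    elif version == 2:
        cents = message["amount_cents"]
        currency = message.get("currency", "USD")
    else:
        raise ValueError(f"unsupported message version: {version}")
    return {"order_id": message["order_id"], "cents": cents, "currency": currency}


old = parse_order({"order_id": "A1", "amount": 12.5})
new = parse_order({"version": 2, "order_id": "A2", "amount_cents": 1250,
                   "currency": "EUR", "note": "ignored extra field"})
print(old, new)
```

During a rolling upgrade, senders on either version keep working, and the v1 branch can be dropped once the last old component is retired.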
There are other unexpected things in cloud computing. You have to be prepared and handle them carefully.
＃3 Decouple Your Applications
This has long been advocated as a proven design best practice. The basic design principle is nothing new, but the context is.
Loosely coupled applications do not have to be distributed over the network. But if your applications are distributed, you are better off coupling them loosely. Generally speaking, loosely coupled applications scale better: you can put more resources into the bottleneck components whenever needed.
Loosely coupled applications are also easier to develop and test. You can naturally break the application down into smaller components and assign them to different engineers or teams. During development, you can unit test these components independently.
In a virtual cloud, you can use the virtual machine as the unit for componentizing your applications, and use IP networking in place of IPC (Inter-Process Communication) to wire the components together. This mostly means slower communication, but it breaks through the resource limit of a single VM for better scalability. Nothing comes for free. Architecture design is about making trade-offs for the best balance at the system level.
vSphere provides an alternative for faster communication: VMCI. But it requires your VMs to be on the same ESX hypervisor, which may limit your ability to scale your application. It’s not recommended unless you have to solve a performance bottleneck and understand the limitation.
Although you can use point-to-point IP communication, you should consider a messaging bus for really large-scale systems. Components or applications don’t need to know about each other, as the messaging bus takes care of these details.
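The decoupling a messaging bus gives you can be seen even in a tiny in-process model. The sketch below is not a real broker: the MessageBus class and the topic name are invented for illustration, and a production system would use an actual messaging product running across VMs.

```python
from collections import defaultdict


class MessageBus:
    """In-process stand-in for a message broker with topic-based routing."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver payload to every handler on the topic; return the count."""
        handlers = list(self._subscribers.get(topic, ()))
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = MessageBus()
billed, shipped = [], []  # two independent consumers of the same event
bus.subscribe("order.created", billed.append)
bus.subscribe("order.created", shipped.append)

# The publisher knows only the topic name, not who consumes it.
delivered = bus.publish("order.created", {"order_id": "A1"})
print(delivered, billed, shipped)
```

Adding a third consumer later means one more subscribe call; the publishing component does not change at all.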
To fully leverage the virtual cloud, you should consider the following:
- Decouple your application from the OS and middleware so that you can have stateless VM templates. This can save you the effort of pre-configuring VMs after they are cloned.
- Decouple data from code. Data should be externalized so that it can be changed without touching your code. This is especially important for compiled applications, so that they don’t need to be recompiled for small changes.
- Isolate mission-critical parts from non-mission-critical parts. You can then assign different people at development time and allocate different computing resources in the production environment. For example, for the mission-critical part, you may want to use FT (Fault Tolerance) to guard against hardware failure.
- Decouple utilities from the core. For example, billing is not a core part of a system. You can defer it to nighttime and run it with lower priority.
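Once utilities are separated from the core, giving them lower priority becomes straightforward. Here is a minimal sketch using a priority queue from the Python standard library; the JobQueue class, the two priority levels, and the job names are hypothetical.

```python
import heapq
import itertools

CORE, UTILITY = 0, 1  # lower number runs first


class JobQueue:
    """Run core jobs before utility jobs, first-in first-out within a level."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps submission order

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def drain(self):
        """Yield job names in the order they should run."""
        while self._heap:
            yield heapq.heappop(self._heap)[2]


queue = JobQueue()
queue.submit(UTILITY, "generate-billing-report")
queue.submit(CORE, "process-order-A1")
queue.submit(CORE, "process-order-A2")
run_order = list(queue.drain())
print(run_order)
```

Even though the billing report was submitted first, both order jobs run ahead of it.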
Please stay tuned for the rest of best practices by subscribing to this feed.