Rethinking the Virtual Machine Template: It’s Not What It Is
In the virtualization world, the virtual machine template (also known as a virtual machine image) is a major feature. It lets users deploy a new virtual machine quickly, without installing an operating system and other software from scratch. Because deployment is so easy, we now have a new problem: too many unused or useless virtual machines. That is a separate topic that deserves its own discussion.
Thinking more about the virtual machine template, it actually does more than its name suggests. Literally speaking, a virtual machine template is a template for virtual machines, which are the virtual counterparts of physical machines. In theory, it should include only the specification of the virtual hardware itself: the number of CPUs, the number and size of hard disks, NICs, and so on. In reality, it also includes a preloaded hard disk with the full stack (OS, middleware, and applications) installed. A virtual machine template is therefore really a virtual machine system template.
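To make the distinction concrete, here is a minimal sketch in Python. It is not any hypervisor's actual format; the class names, field names, and the disk image file name are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HardwareSpec:
    """What the name "virtual machine template" literally implies: hardware only."""
    cpus: int
    memory_mb: int
    disk_sizes_gb: Tuple[int, ...]
    nics: int


@dataclass(frozen=True)
class SystemTemplate:
    """What a template holds in practice: hardware plus a preloaded disk."""
    hardware: HardwareSpec
    disk_image: str                  # hypothetical file name of the preloaded disk
    software_stack: Tuple[str, ...]  # OS, middleware, applications


# Illustrative values only.
web_hw = HardwareSpec(cpus=2, memory_mb=4096, disk_sizes_gb=(40,), nics=1)
web_template = SystemTemplate(
    hardware=web_hw,
    disk_image="web-tier.vmdk",
    software_stack=("Linux", "web server", "web application"),
)
```

The hardware part is a handful of numbers. The bulk of a real template is whatever sits behind the disk image.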
The advantage of including the system is speed and consistency when provisioning a new virtual machine system. Within several minutes a new system is ready and is guaranteed to be (almost) the same as the original system.
The disadvantage is that a template can be huge in terms of storage. A production virtual machine template can easily exceed 10 GB, so moving it around takes time and network bandwidth. It’s also not easy to upgrade the existing software stack: you have to convert the template to a running virtual machine, apply the upgrade, and then convert it back to a template. In a diversified environment it’s not flexible enough either: for each type of virtual machine in a three-tier Web application, you need a corresponding template.
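A rough back-of-the-envelope calculation shows the cost. The sketch below assumes three templates of 10 GB each (one per tier) and an ideal 1 Gbit/s link with no protocol overhead. Both figures are assumptions for illustration, not measurements.

```python
def total_template_storage_gb(template_sizes_gb):
    """Storage needed when every VM type has its own full template."""
    return sum(template_sizes_gb.values())


def transfer_seconds(size_gb, link_gbit_per_s):
    """Ideal transfer time, ignoring protocol overhead and contention.

    Uses decimal units: 1 GB = 8 Gbit.
    """
    return size_gb * 8 / link_gbit_per_s


# Assumed sizes: one 10 GB template per tier of a three-tier Web application.
tiers = {"web": 10, "app": 10, "db": 10}

storage = total_template_storage_gb(tiers)      # 30 GB in total
one_copy = transfer_seconds(tiers["web"], 1.0)  # 80.0 seconds per template
```

Real transfers will be slower than this ideal figure, and the storage cost grows with every additional VM type.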
To avoid the real disk image, you can use metadata-based alternatives such as Chef and Puppet, which let you write scripts that describe what to install. These scripts are significantly smaller than an installed system. More importantly, you don’t need a virtual machine template for each type of virtual machine, which means a big saving on storage. For each type of virtual machine, you just need a separate set of scripts.
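The idea can be sketched in plain Python. This is not Puppet or Chef syntax, and the role names and package names below are assumptions picked for illustration. The point is only that a description of what to install is tiny compared with the installed result.

```python
import json

# Hypothetical metadata: one short entry per VM type instead of one disk image.
ROLE_PACKAGES = {
    "web": ["nginx"],
    "app": ["tomcat"],
    "db": ["postgresql"],
}


def install_plan(role):
    """Return the install steps for a role as plain strings (nothing is executed)."""
    if role not in ROLE_PACKAGES:
        raise ValueError(f"unknown role: {role}")
    return [f"install {pkg}" for pkg in ROLE_PACKAGES[role]]


def metadata_size_bytes():
    """Size of the whole metadata set when serialized as JSON."""
    return len(json.dumps(ROLE_PACKAGES).encode("utf-8"))


plan = install_plan("web")
size = metadata_size_bytes()
```

Here the metadata for all three roles serializes to well under a kilobyte. Real Puppet manifests or Chef recipes are larger than this toy dictionary, but they are still tiny next to a multi-gigabyte disk image.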
The downside of using metadata is that it’s much slower than the virtual machine template approach, and it requires network connectivity and a provisioning server such as a Puppet master.
As in most cases in engineering, there is no single best solution for every situation. A good engineer should consider both the problem domain and the technology domain to arrive at a good solution.
In most cases, a good trade-off is to have a virtual machine template with only the basic software (the lowest common denominator of all virtual machines, for example the OS itself plus the Puppet agent), which can be provisioned as usual. Puppet then kicks in to install the extra software packages.
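One way to picture this trade-off is as set arithmetic: the base template holds whatever every role needs, and the configuration tool installs the rest. The sketch below is illustrative only. The role names and package names are assumptions, and it does not call Puppet.

```python
# Hypothetical full software requirements per VM type.
ROLE_REQUIREMENTS = {
    "web": {"os", "puppet-agent", "nginx"},
    "app": {"os", "puppet-agent", "tomcat"},
    "db": {"os", "puppet-agent", "postgresql"},
}


def common_base(requirements):
    """Software needed by every role: the candidate content of the base template.

    Expects at least one role in `requirements`.
    """
    sets = [set(pkgs) for pkgs in requirements.values()]
    return set.intersection(*sets)


def extras_after_clone(role, requirements):
    """Packages still to be installed after cloning the base template."""
    return sorted(requirements[role] - common_base(requirements))


base = common_base(ROLE_REQUIREMENTS)                    # OS plus agent
web_extras = extras_after_clone("web", ROLE_REQUIREMENTS)
```

With these assumed requirements, a single base template serves all three tiers, and each clone needs only one extra package installed after it boots.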