My First Try of Hadoop Azure

During breaks in my vacation last week, I tried the Technology Preview of the Apache Hadoop-based Service on Windows Azure. The service is not yet publicly available and requires Microsoft approval. Here is the link that I used to file my application. It took several days for me to get the email with the invitation code. Sorry that I cannot include the code here. :-)

As with most other typical Microsoft products, the sign-in process is pretty smooth. Not much surprise there. After signing in, the first thing to do is create a new Hadoop cluster. For that, you have to pick a DNS name for the management virtual machine of the new cluster. The suffix of the DNS name is always the same; as you can imagine, I picked a name that was still available.


The cluster size defaults to 2 nodes with 1 TB of disk space, and cannot be changed. For evaluation purposes that's fine, not to mention that it's free for 5 days. I assume you will be able to change the cluster size in the real service, provided that you also supply your credit card info.

Below the cluster size, you enter a username and password (somehow, the password rule is a little strange, in that you cannot include any symbols, which are frequently required elsewhere). You need this credential to RDP into the management virtual machine later on. Optionally, you can select SQL Azure; I didn't choose that because I wanted to keep it simple. It took several minutes, probably more than 10, to create the new cluster.

When the cluster is created, the GUI looks like the following. Note that after doing private betas under NDA with VMware for years, I wondered whether the same restriction would apply here, especially since it's an invitation-only preview. Then I found an article on TechNet where more was shared publicly, and figured I should be fine.

The GUI is in the new Metro style, consistent with that of Windows 8. Using it is quite simple and straightforward – just click on one of the blocks. The primary one is the blue Create Job block. After a new Hadoop job is created, a new block appears for it. If you don't have a Hadoop application of your own, you can simply deploy any of the 9 samples, as I did.
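The samples are classic MapReduce jobs, and WordCount is the canonical example of the model. As a minimal sketch of what such a job computes (a local Python simulation for illustration only, not something you would run on the cluster), the map phase emits a (word, 1) pair per word and the reduce phase sums the counts per word:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

lines = ["hello hadoop", "hello azure"]
result = reduce_phase(map_phase(lines))
print(result)  # {'hello': 2, 'hadoop': 1, 'azure': 1}
```

On the real cluster, the framework handles the shuffle between the two phases and runs them across the worker nodes; the logic per record is the same.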

The management virtual machine is a Windows-based machine with a unique IP address accessible from outside. After RDPing into it, you can run Hadoop commands and do most things you would on a typical Windows machine, except that administrative privileges are restricted.
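For example, from the Hadoop command prompt on the management VM you can try the standard Hadoop CLI. This is a hypothetical session – the sample jar name and the paths are illustrative, and these commands only make sense against a live cluster:

```shell
# Check the Hadoop version installed on the cluster
hadoop version

# List the root of the cluster's HDFS file system
hadoop fs -ls /

# Copy a local file into HDFS (path is illustrative)
hadoop fs -put input.txt /user/steve/input.txt

# Run the bundled WordCount sample (jar name is illustrative)
hadoop jar hadoop-examples.jar wordcount /user/steve/input.txt /user/steve/output

# Inspect the job output
hadoop fs -cat /user/steve/output/part-r-00000
```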

After playing with the cluster and the management VM, I released the cluster. The overall experience is pretty good, with a nice balance of simplicity and features. After I tweeted about it, I got follow-ups from Matt Winkler, the program manager for Hadoop on Azure at Microsoft. Please feel free to ping him on Twitter.




Me: Steve Jin, VMware vExpert, who authored the VMware VI and vSphere SDK book published by Prentice Hall, and created the de facto open source vSphere Java API while working at VMware engineering. Companies like Cisco, EMC, NetApp, HP, Dell, and VMware are among the users of the API and other tools I developed, for their products, internal IT orchestration, and test automation.