GUI Front End for Hadoop

I went to LinkedIn last Wednesday for a tech talk by UC Berkeley professor Joseph Hellerstein titled "Programming for Distributed Consistency: CALM and Bloom." It is a highly specialized topic, so I am not going to go into the details. If you are interested in the new programming language Bloom, you can check the website (http://bloom-lang.org).

What I will discuss here is the data processing tool the speaker introduced at the end of his talk. The tool is called Wrangler and was developed by the Stanford University VIS group. According to the project page, “Wrangler is an interactive tool for data cleaning and transformation. Spend less time formatting and more time analyzing your data.”

The speaker quickly demoed how to convert a text file into a tabular data set. When I tried it later at home, it didn’t seem so easy to use. Lack of practice, I guess. :-)
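To give a feel for what "converting a text file to a tabular data set" means, here is a minimal sketch in Python. It is not Wrangler's code or its algorithm; the sample text, the "Sales in ..." header pattern, and the function names are all made up for illustration. It mimics the kind of steps such a tool automates: dropping blank lines, pulling a value out of a header line, filling it down onto the rows below, and splitting each line into columns.

```python
import csv
import io
import re

# Hypothetical sample input: a header line per region, then "year amount" lines.
RAW_TEXT = """\
Sales in North

2011 120.5
2012 130

Sales in South

2011 98
"""


def text_to_rows(text):
    """Turn header-plus-values text into (region, year, amount) rows."""
    rows = []
    region = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # drop blank lines
        header = re.match(r"Sales in (.+)$", line)
        if header:
            region = header.group(1)  # remember it and fill it down
            continue
        year, amount = line.split()  # split the data line into two columns
        rows.append((region, int(year), float(amount)))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["region", "year", "amount"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(text_to_rows(RAW_TEXT)))
```

Writing even this small script requires knowing a programming language and regular expressions, which is exactly the barrier an interactive tool tries to remove.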

This tool reminds me of the problem with Hadoop today. Although Hadoop is a data processing tool, it remains a game for developers. Making it easy for business users, such as data scientists who may not know Java or Pig, would surely accelerate the adoption of the open source project. More importantly, it would bring in more revenue, because these users are more willing to pay than developers are.

As I mentioned in my Hadoop Summit summary, Datameer has done a decent job of using an Excel-like web front end to hide the complexity. That significantly flattens the learning curve. At the same time, it is limited by the Excel processing model and its constraints. In other words, it may not be flexible enough to handle real-world cases.

As I see it, the Wrangler tool has the potential to be developed further into a more generic Hadoop front end for business users. It won’t be as flexible as Java, but it is probably sophisticated enough for most use cases.

Want to check out the tool for yourself? Here is the link (http://vis.stanford.edu/wrangler/) to the project home page, where you can find the button to launch the demo system, which is written in JavaScript. Note that it doesn’t work with Microsoft IE, so you will want to use Chrome, Safari, or Mozilla Firefox. On the same page, you can also find a demo video a little over 3 minutes long.

One Comment

  1. Posted July 12, 2013 at 1:10 am | Permalink

    One of the main open source UIs for Hadoop is Hue: http://gethue.com 😉

    About the author: Steve Jin is a VMware vExpert who authored the VMware VI and vSphere SDK, published by Prentice Hall, and created the de facto open source vSphere Java API while working in VMware engineering. Companies like Cisco, EMC, NetApp, HP, Dell, and VMware are among the users of the API and other tools I developed, using them in their products, internal IT orchestration, and test automation.