Hadoop vs. Tomcat

In my previous article, I described three different ways enterprises use Hadoop. On reflection, you may have noticed that these three usage patterns closely resemble how we use Tomcat. In this post I compare the two, covering what they have in common and where they differ.

First of all, both Hadoop and Tomcat are Java-based open source projects from the Apache Software Foundation, and both are released under the same Apache License. As a result, from a license compliance standpoint you can use Hadoop as freely as you have used Tomcat.


Secondly, both are container frameworks that simplify application development. Tomcat loads Web applications and dispatches each HTTP request to one of them; the application processes the request and returns an HTTP response to Tomcat, which forwards it to a Web client such as a browser. Developers targeting Tomcat don’t need to deal with networking or multithreading; they write a small piece of code that implements the interfaces required by the Java Servlet specification to process a request and return a response.
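
To make the container idea concrete, here is a minimal sketch in plain Java. It is not Tomcat or the Servlet API: WebApp and MiniContainer are hypothetical stand-ins I made up for illustration, and the real equivalent of WebApp would be a class extending HttpServlet. The point is only that the application code never touches sockets or threads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch of the container pattern. None of these types are Tomcat
// or Servlet API classes; they are hypothetical and exist only for illustration.
class ContainerSketch {

    // Hypothetical counterpart of a servlet: the only code an application developer writes.
    interface WebApp {
        String handle(String path, Map<String, String> params);
    }

    // Hypothetical counterpart of the container: it owns the routing table and,
    // in a real server, the sockets and worker threads as well.
    static class MiniContainer {
        private final Map<String, WebApp> apps = new LinkedHashMap<>();

        void deploy(String contextPath, WebApp app) {
            apps.put(contextPath, app);
        }

        // Dispatch a request to the application whose context path matches the URI.
        String dispatch(String uri, Map<String, String> params) {
            for (Map.Entry<String, WebApp> e : apps.entrySet()) {
                String ctx = e.getKey();
                if (uri.equals(ctx) || uri.startsWith(ctx + "/")) {
                    String body = e.getValue().handle(uri.substring(ctx.length()), params);
                    return "200 " + body;
                }
            }
            return "404 no application deployed for " + uri;
        }
    }

    public static void main(String[] args) {
        MiniContainer container = new MiniContainer();
        // The "application" is a few lines; routing is the container's job.
        container.deploy("/hello",
                (path, params) -> "Hello, " + params.getOrDefault("name", "world"));
        System.out.println(container.dispatch("/hello", Map.of("name", "Tomcat")));
        System.out.println(container.dispatch("/missing", Map.of()));
    }
}
```

In a real deployment the dispatch loop, connection handling, and thread pool all live inside Tomcat, which is why the application side can stay this small.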

Similarly, the Hadoop framework distributes MapReduce jobs to the nodes of a cluster, which read in data, process it, and save the results to HDFS. Hadoop developers don’t need to deal with distributed processing or job scheduling; they likewise write a small piece of code that implements Hadoop interfaces. Although the I/O channels differ (Tomcat reads from the network, Hadoop from HDFS), the way the two simplify the programming model is quite similar.
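
The same division of labor can be sketched for MapReduce. The code below is an in-memory illustration of the programming model only, not Hadoop's API: the Mapper and Reducer interfaces and the run method are hypothetical names of my own, and everything runs in one JVM with a list of strings standing in for files on HDFS. The developer supplies the two small functions; the "framework" does the grouping in between.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// In-memory sketch of the MapReduce programming model. This is NOT Hadoop's API;
// Mapper, Reducer, and run are hypothetical names used only for illustration.
class WordCountSketch {

    // What the developer implements: turn one input line into (key, value) pairs.
    interface Mapper {
        void map(String line, BiConsumer<String, Integer> emit);
    }

    // What the developer implements: fold all values for one key into a result.
    interface Reducer {
        int reduce(String key, List<Integer> values);
    }

    // The "framework": feeds input to the mapper, groups emitted values by key
    // (the shuffle step), and hands each group to the reducer.
    static Map<String, Integer> run(List<String> input, Mapper mapper, Reducer reducer) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : input) {
            mapper.map(line, (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reducer.reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    // Developer code: emit (word, 1) for every word on the line.
    static final Mapper WORD_MAPPER = (line, emit) -> {
        for (String w : line.toLowerCase().split("\\s+")) {
            if (!w.isEmpty()) {
                emit.accept(w, 1);
            }
        }
    };

    // Developer code: sum the counts collected for one word.
    static final Reducer SUM_REDUCER = (key, values) -> {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    };

    public static void main(String[] args) {
        List<String> lines = List.of("hadoop and tomcat", "tomcat and java");
        System.out.println(run(lines, WORD_MAPPER, SUM_REDUCER));
    }
}
```

In real Hadoop the run step is spread across the cluster and includes scheduling, data locality, and failure handling, none of which the two developer-written functions need to know about.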

Lastly, the usage patterns are very similar. Each can be used as a framework, a platform, or an application. You can find more details in the previous article.

Now, let’s look at the differences.

Obviously, Hadoop and Tomcat solve quite different problems: one distributes big data processing, and the other hosts Web applications. This difference makes Hadoop less suited to a hosted model than Tomcat, because transferring big data over the network between enterprises and service providers is neither efficient nor economical.

The difference also means the two frameworks complement each other very well. Tomcat can be used to build a Web front end for submitting jobs and managing Hadoop clusters.
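
As a rough sketch of how the two might fit together, the code below shows a request handler that validates a submission and hands it to a launcher. Everything here is a hypothetical stand-in written in plain Java: SubmitHandler plays the role a servlet would play inside Tomcat, and JobLauncher is a made-up interface behind which a real implementation would talk to the Hadoop cluster. The parameter names "input" and "output" are also my own assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a Web front end handing work to a cluster.
// All types here are hypothetical stand-ins, not Tomcat or Hadoop classes.
class JobFrontEndSketch {

    // Stand-in for the cluster side; a real implementation would submit a Hadoop job here.
    interface JobLauncher {
        void launch(String jobId, String inputPath, String outputPath);
    }

    // Stand-in for the servlet a Tomcat application would expose.
    static class SubmitHandler {
        private final JobLauncher launcher;
        private final AtomicInteger nextId = new AtomicInteger(1);
        private final Map<String, String> states = new ConcurrentHashMap<>();

        SubmitHandler(JobLauncher launcher) {
            this.launcher = launcher;
        }

        // Handle a "submit" request: validate parameters, launch the job, return its id.
        String submit(Map<String, String> params) {
            String in = params.get("input");
            String out = params.get("output");
            if (in == null || out == null) {
                return "400 input and output are required";
            }
            String jobId = "job-" + nextId.getAndIncrement();
            launcher.launch(jobId, in, out);
            states.put(jobId, "SUBMITTED");
            return "202 " + jobId;
        }

        // Handle a "status" request for a previously submitted job.
        String status(String jobId) {
            return states.getOrDefault(jobId, "UNKNOWN");
        }
    }
}
```

The front end only moves small requests and job ids over the network; the data itself stays on the cluster, which is what makes this pairing practical.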




    Me: Steve Jin, VMware vExpert who authored VMware VI and vSphere SDK, published by Prentice Hall, and created the de facto open source vSphere Java API while working at VMware engineering. Companies like Cisco, EMC, NetApp, HP, Dell, and VMware are among the users of the API and other tools I developed for their products, internal IT orchestration, and test automation.