Tuesday, May 8, 2012

Bringing Hadoop closer to live data!

Of late, I have been reading and listening to my colleagues talk about Hadoop, MapReduce, and the spouts and bolts of Twitter's Storm!

The most important parts of Hadoop, HDFS and MapReduce, are already being implemented in Nutanix. This allows us to bring live data to the Hadoop cluster in read-only mode by accessing the same vdisks, rather than waiting for a server to dispatch the data nightly. Instead of batch MapReduce, we could run spouts and bolts to map and reduce continuously.
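To make the "map reduce continuously" idea concrete, here is a minimal sketch in plain Python. It is not Storm's API and not anything Nutanix ships; the names line_spout, split_bolt, count_bolt and run_topology are made up for illustration. The point is only that totals are updated as each record arrives, instead of once per nightly batch.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def line_spout(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical spout: emits records one at a time as they arrive."""
    for line in lines:
        yield line


def split_bolt(stream: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map-like bolt: turns each line into (word, 1) pairs."""
    for line in stream:
        for word in line.lower().split():
            yield word, 1


def count_bolt(stream: Iterable[Tuple[str, int]]) -> Iterator[Counter]:
    """Reduce-like bolt: keeps running totals, emitting a snapshot per update."""
    totals: Counter = Counter()
    for word, n in stream:
        totals[word] += n
        yield totals.copy()


def run_topology(lines: Iterable[str]) -> Counter:
    """Wire spout -> split bolt -> count bolt and return the latest totals."""
    latest: Counter = Counter()
    for latest in count_bolt(split_bolt(line_spout(lines))):
        pass  # a real consumer would act on every snapshot, not just the last
    return latest
```

Calling run_topology(["vdisk read", "vdisk write"]) walks both lines through the chain and returns the running word counts. In a real deployment the spout would read from a live source rather than an in-memory list.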

If we plan to run Hadoop jobs nightly, we could even have an Adaptive Chameleon Compute Cluster, which runs regular workloads (VDI, etc.) during the day and Hadoop at night. VMware has a lot of tools, and PowerCLI commands could achieve this by powering off the Hadoop VMs or changing the resources assigned to them.
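The day/night switch could be planned along these lines. This is only a sketch of the scheduling decision, written in Python rather than PowerCLI; the 07:00 and 20:00 boundaries, the VM names, and the function names are all assumptions for illustration. It does not talk to vCenter, it just returns the actions an operator script would then carry out.

```python
from datetime import time
from typing import Dict, List, Tuple

# Assumed working hours; adjust to the site's real schedule.
DAY_START = time(7, 0)
NIGHT_START = time(20, 0)


def cluster_role(now: time) -> str:
    """Return which workload owns the cluster at the given time of day."""
    return "vdi" if DAY_START <= now < NIGHT_START else "hadoop"


def plan_actions(now: time, vms: Dict[str, str]) -> List[Tuple[str, str]]:
    """Map each VM (name -> role) to a hypothetical power action.

    VMs whose role matches the current cluster role are powered on;
    all others are powered off. Results are sorted by VM name.
    """
    role = cluster_role(now)
    return [
        ("power_on" if vm_role == role else "power_off", name)
        for name, vm_role in sorted(vms.items())
    ]
```

For example, plan_actions(time(23, 0), {"hadoop-01": "hadoop", "vdi-01": "vdi"}) asks for hadoop-01 to be powered on and vdi-01 to be powered off. Resizing VMs instead of powering them off would be a different action in the same plan.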

I am just so excited to work with Nutanix, and there is so much more we could do by moving from centralized storage to distributed storage.

We are just scratching the surface. My assignment is to read further and step into the future.

3 comments:

  1. It seems that we are turning the clock back to adopt local storage again. But the key differentiator here is definitely the virtualization piece, with the capability to run Hadoop jobs off-hours....

  2. Rajesh,
    I appreciate the comment. Please read my blog post http://nutanix.blogspot.com/2012/05/beyond-san-and-fcoe.html when you get a chance.
