Analysing large-scale data with Apache Hadoop

Apache Hadoop is a Java framework for large-scale distributed batch processing that runs on commodity hardware. Its biggest advantage is the ability to scale to hundreds or thousands of computers. Hadoop is designed to distribute and handle large amounts of work efficiently across a set of machines.

This talk will introduce Hadoop along with MapReduce and HDFS. It will discuss the scenarios where Hadoop fits as a robust solution and will include a case study from a project in which Hadoop is used for bulk inserts and large-scale data analytics.

  • What is Hadoop?
  • Why Hadoop?
  • What is MapReduce?
  • HDFS Architecture Overview
  • Demo with a use case from a real project
  • Who is on Hadoop?

This demo-driven presentation will help the audience see the power of Hadoop when it comes to processing terabytes of data on commodity hardware.


Salil Kalia has 7 years of experience with various Java-based platforms (including mobile, desktop and web application development). In his recent projects, he has used various bleeding-edge technologies, including Hadoop, with which he has processed thousands of gigabytes of data on the Amazon cloud. Cont…


The IndicThreads Conference On Software Development will be held on 13-14 July 2012 in Delhi, India.
