Hadoop is a Java-based framework for storing and processing very large data sets across clusters of machines.
What it does is crunch and aggregate huge volumes of data. An example would be: how many people view the MTV page on Facebook, and what is their age, location, gender, income, and so on? Which films do they like?
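The kind of counting described above is what Hadoop's MapReduce model does at scale. Here is a minimal single-machine sketch of that pattern in Python (real Hadoop jobs are typically written in Java and run across a cluster); the records and field names like `age_group` are invented for illustration:

```python
from collections import defaultdict

# Hypothetical sample of page-view records. The fields (page, age_group,
# location) are made up for illustration; real data would differ.
views = [
    {"page": "MTV", "age_group": "18-24", "location": "Pune"},
    {"page": "MTV", "age_group": "18-24", "location": "Mumbai"},
    {"page": "MTV", "age_group": "25-34", "location": "Pune"},
]

def map_views(records, key_field):
    # Map step: emit a (key, 1) pair for every record, keyed by the
    # attribute we want to count on (age group, location, etc.).
    for record in records:
        yield record[key_field], 1

def reduce_counts(pairs):
    # Shuffle + reduce step: group the pairs by key and sum the counts.
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

print(reduce_counts(map_views(views, "age_group")))
# → {'18-24': 2, '25-34': 1}
```

On a cluster, the map step runs in parallel on many machines, each holding a slice of the data, and the reduce step combines their partial counts; the logic stays this simple.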
But couldn’t this thing be done in some other software?
It could, but the genius of Hadoop is that it runs across a cluster of machines and keeps working even when one of them fails. Let's say you use 12 desktops to collate data from Facebook, and one machine just busts in the middle of the job; Hadoop reassigns that machine's unfinished work to the surviving machines, and because the underlying data is replicated across the cluster, none of the input is lost.
The job still finishes, just somewhat slower, instead of crashing and throwing away everything computed so far.
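The failure handling described above can be sketched as a toy coordinator that hands out tasks and reassigns the work of a dead machine to the survivors. The worker names and task IDs are invented for illustration; real Hadoop does this through its resource manager, not code like this:

```python
def assign(tasks, workers):
    """Round-robin the tasks across the available workers."""
    plan = {w: [] for w in workers}
    for i, task in enumerate(tasks):
        plan[workers[i % len(workers)]].append(task)
    return plan

# 12 chunks of data spread over 4 hypothetical desktops.
tasks = [f"chunk-{i}" for i in range(12)]
workers = [f"desktop-{i}" for i in range(4)]
plan = assign(tasks, workers)

# desktop-2 busts in the middle of the job: hand its work to the survivors.
failed = "desktop-2"
orphaned = plan.pop(failed)
survivors = list(plan)
for i, task in enumerate(orphaned):
    plan[survivors[i % len(survivors)]].append(task)

# Every chunk is still owned by some live machine, so the job completes.
assert sorted(t for ts in plan.values() for t in ts) == sorted(tasks)
```

The key idea is that the coordinator tracks which tasks are unfinished, so a machine failure costs only a rerun of those tasks, not the whole job.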
But wouldn’t that be a problem in many projects?
It can be. Hadoop shines in projects related to the social sciences, viz. marketing and economics, where you are crunching huge batches of data and raw throughput matters more than turnaround time. For scientific projects with strict latency or real-time requirements, other frameworks may be a better fit.
Why is that such a big deal? Lots of frameworks are around.
That’s because the world is getting digitised. Big data can solve many a problem that was earlier either intractable or a matter of guesswork.
An example would be, for a political party: which candidate should contest a given constituency? Take the voter data, with all its variables, and run it through Hadoop; surprising results can come out. If you do get access to such a system, run it for Pune and its neighbouring constituencies.