Big data is everywhere these days, and many of us are trying to get up to speed on the technology. Several books on the topic are now available, and here are some tips for figuring out which ones are worth reading and which are best for newbies.
A good place to start is Frank Ohlhorst’s Big Data Analytics: Turning Big Data into Big Money. This is a business process and workflow treatment of the topic: you won’t find any code samples or URLs of open source repositories here. Ohlhorst, who worked with me at CMP and still writes numerous product reviews for the IT trade press, talks about ways to secure your data, structure it, and mine it for value and insights. It is a great book to give your boss.
Next is Big Data Analytics: Disruptive Technologies for Changing the Game by Arvind Sathi, a data architect for IBM. This is another great book for beginners: it identifies use cases, goes into more detail on the business processes, and shows some of the main architectural elements of Big Data.
If you’re looking for something short and sweet and also free, try What Is Data Science? by Mike Loukides. You get plenty of concrete, real-world examples of different kinds of data analysis tools and techniques.
Then there is Enterprise Analytics: Optimize Performance, Process, and Decisions Through Big Data by Tom Davenport and several other authors. It covers wider ground than some of the other books here, addressing Big Data along with a variety of other analytic techniques.
A more general overview of the major players behind Big Data is The Little Book of Big Data by Noreen Burlingame. It is a short read and a quick way to see which vendors are making waves with this technology, including Hortonworks, Cloudera, Datameer and Karmasphere.
If you want to get more down and dirty with the technology, then Hadoop: The Definitive Guide by Tom White is for you. White will take you through building your first Apache Hadoop cluster, the ins and outs of the Hadoop file system, how to set up MapReduce jobs, and how to use some of the other tools such as Pig, ZooKeeper, and Hive. White works for Cloudera, one of the main commercial forces behind Hadoop.
When you are ready for more training, check out Cloudera University and Hortonworks University: both vendors have extensive programs on a multitude of topics relating to Hadoop and its offshoots, some paid and some free. Big Data University also has dozens of courses, all of them free.