Process distributed datasets using Hadoop

Description

Hadoop is a rapidly emerging technology and architecture. This session aims to make its fundamentals clear for Java developers.
On completion of this course, developers will understand:

  • Motivation behind using Hadoop
  • Hadoop architecture and concepts
  • Working with HDFS
  • HDFS file operations
  • Java interface for HDFS
  • Using HDFS archives
  • MapReduce for Java programmers
  • Mappers, Reducers and Combiners
  • Configuration API
  • Unit testing of MapReduce
  • MapReduce workflow and features
  • Introduction to Pig, Hive, HBase, ZooKeeper, Sqoop

Course Duration: 2 days

Target Audience

Data Scientists

Personas

Data Engineers and Data Analysts