Course Overview
This course provides instruction on the processes and practice of data science, including machine learning and natural language processing. Included are: tools and programming languages (Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn), th

Course Target Audience
Architects, software developers, analysts and data scientists who need to apply data science and machine learning on Hadoop.

Course Prerequisites
Students must have experience with at least one programming or scripting language, knowledge of statistics and/or mathematics, and a basic understanding of big data and Hadoop principles. Students new to Hadoop are encouraged to attend the HDP Overview: A

Course Objectives
Recognize use cases for data science on Hadoop
Describe the Hadoop and YARN architecture
Describe supervised and unsupervised learning differences
Use Mahout to run a machine learning algorithm on Hadoop
Describe the data science life cycle
Use Pig to tra

Course Outline
Format: 50% Lecture/Discussion, 50% Hands-on Labs
Hands-On Labs
Lab: Setting Up a Development Environment
Demo: Block Storage
Lab: Using HDFS Commands
Demo: MapReduce
Lab: Using Apache Mahout for Machine Learning
Demo: Apache Pig
Lab: Getting Started with A
