Course Description
You will learn how predictive analytics, the data-driven discovery and modeling of hidden patterns in large volumes of data, is an extension of an enterprise's decision support system (DSS) architecture. You will examine how predictive analytics complements other DSS capabilities, such as query and reporting, online analytical processing (OLAP), data visualization, and traditional statistical analysis. You will learn to produce and work with predictive analytics models that capture and represent hidden patterns and interactions in data, and you will learn to use those descriptive and prospective models to address why things happened and what is likely to happen next.
You will discover common strategic pitfalls of data mining, such as flawed or overlooked assessment, business understanding, project definition, and strategic planning, and how to avoid them. In this concentrated presentation and workshop, you will learn about the capabilities, limitations, risks, rewards, use cases, best practices, strategy, and lifecycle management of data mining.
Live working sessions will reveal the natural messiness of data mining, along with the corresponding breakthroughs that you will learn to interpret and apply. You will receive a download kit with a simple three-step installation guide to set up your own modeling environment before the lab sessions. During labs, you will examine and refine progress and results.
Certification
Attendees completing the full one-week Predictive Analytics and Data Mining series can take an exam to earn a Certificate of Comprehension from The Modeling Institute.
Agenda
INTRODUCTION
- What is Predictive Analytics?
- Goal-driven analysis of large data sets...
- to identify an approach for allocating organizational resources
- that enhances performance on the organization's self-defined performance metrics
- to better achieve the organization's business objectives
- using a repeatable, consistent strategy
- Beyond traditional statistics
- Shift Your Thinking
- The Goal of Modeling
- Physical systems
- Human behavior
- Behaviors of Interest
- Setting Up the Game
- Project team
- Phased development cycle
- Definitions
- Data sandbox
- Formulas vs. Model Development
- The Conflict Between Algorithm Objectives and Business Objectives
- Why Use Predictive Analytics?
- Definition of Data Mining
- What data mining is not
- Why mine data?
- The key question is "so what?"
- Successful Data Mining is Goal-Directed Analysis
- Traditional Statistics are Insufficient in Today's World
- What Can Data Mining Do?
- Data mining opportunities
- Data mining business goals
- Data mining analytic goals
- Why the Majority of Data Mining Projects Fail
- How Much Data is Needed to Develop a Model?
- How many variables?
- Rules of thumb
- Types of sampling
- Experimental Design
- Data sets used
- Types of data distribution
- Types of decision
- Predictive Analytics Key Technologies Overview (Methods and techniques are detailed in Predictive Analytics and Data Mining: Model Development)
- Who Needs Brains When You've Got Software?
- Low-Risk / High-ROI Project Design
- The Business Justification for Predictive Analytics: Goal-Driven Analytics
- Organizational Predictive Analytics Opportunity Identification
- Incremental Project Design
- Single-tailed model development: Identify positive impacts
- Single-tailed model development: Identify negative impacts
- Two-tailed model development: Conflict resolution
- Ranking across the continuum: Adding resolution
- Subdividing dimensions: Adding detail
- Forecasting model development
- A Real-World Standardized Development Process: The Cross-Industry Standard Process for Data Mining (CRISP-DM)
USE CASE WORKSHOP #1: Implement CRISP-DM for a Single-Tailed Model
- Business Understanding (CRISP-DM 1)
- Determine business objective
- Background and business objectives
- Identify decision process
- Business success criteria
- Identify performance metrics
- Calculate current baseline levels of performance
- Determine modeling objectives
- Requirements
- Assumptions
- Constraints
- Risks and contingencies
- Terminology
- Costs and benefits
- Modeling goals
- Modeling success criteria
- Assess resource availability
- Hardware resources
- Sources of data and knowledge
- Personnel sources
- Produce project plan
- Prepare Business Understanding deliverables
- Data Understanding (CRISP-DM 2)
- Review data availability
- Collect initial data
- Data requirements planning
- Selection criteria
- Insertion of data
- Construction of output variable
- Describe data
- Volumetric analysis of data
- Attribute types and values
- Keys
- Review assumptions and goals
- Explore data
- Statistical analysis
- Data exploration
- Suppositions for future analysis
- Verify data quality
- Data Understanding deliverables
- Initial data collection report
- Data description report
- Data exploration report
- Data quality report
Note: CRISP-DM Parts 3, 4, and 5 are detailed in Predictive Analytics and Data
Mining: Model Development and extended into practice in this course. It is
helpful, but not necessary, to have completed the tactical drill-down into
these parts before implementing them here.
- Data Preparation (CRISP-DM 3)
- Modeling (CRISP-DM 4)
- Evaluation (CRISP-DM 5)
- Deployment (CRISP-DM 6)
- Plan deployment
- Develop monitoring and maintenance plan
- Produce final report
- Project review
- Deliverables
- Deployment plan
- Monitoring and maintenance plan
- Final report
USE CASE WORKSHOP #2: Second CRISP-DM Pass for a Two-Tailed Model Implementation
- Business Understanding (CRISP-DM 1)
- Data Understanding (CRISP-DM 2)
- Data Preparation (CRISP-DM 3)
- Modeling (CRISP-DM 4)
- Evaluation (CRISP-DM 5)
- Deployment (CRISP-DM 6)
EXTENDED MODELING TOPICS
WRAP-UP AND NEXT STEPS
- Predictive Analytics and Data Mining: Model Development
- Certification exam (for those who complete the series)
- Product training courses
- Keep learning!
- Supplementary materials and resources
- Conferences and communities
- Get started on a project!
Audience
IT/IS executives and managers, including CIOs, CKOs, CTOs, stakeholders,
functional officers, technical directors and project managers; line-of-business
executives and functional managers, including risk managers, customer
relationship managers, business forecasters, inventory flow analysts, financial
forecasters, direct marketing analysts, medical diagnostic analysts, and
e-commerce company executives; technology planners; consultants