Learning Outcome
1. Understand the "Curse of Dimensionality" and why too many columns destroy models.
2. Explain the difference between Feature Selection and Feature Extraction (PCA).
3. Visualize how PCA finds the axes of maximum variance.
4. Determine how many components to keep using a Scree Plot.
5. Recognize the mandatory pre-processing step (Scaling) and the loss of interpretability.
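One symptom of the curse of dimensionality mentioned above can be checked numerically: as columns are added, the nearest and farthest points from a query end up almost equally far away, so distance-based models lose their signal. The following is a minimal sketch on uniformly random data; the function name `distance_contrast` and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def distance_contrast(n_features, n_points=200, seed=0):
    """Relative gap between the farthest and nearest point from one query point."""
    rng = np.random.default_rng(seed)
    points = rng.uniform(size=(n_points, n_features))  # random dataset
    query = rng.uniform(size=n_features)               # one random query row
    dists = np.linalg.norm(points - query, axis=1)     # Euclidean distances
    return (dists.max() - dists.min()) / dists.min()

# The contrast shrinks toward 0 as the number of features grows.
for d in (2, 10, 100, 10_000):
    print(f"{d:>6} features -> contrast {distance_contrast(d):.3f}")
```

A contrast near zero means "nearest" and "farthest" are barely different, which is one reason models struggle when there are far more columns than the data can support.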
The Story So Far
We've been looking at datasets with 5 to 10 features. But real-world data (like genetics or image processing) can have 10,000 columns!
The Problem
The Solution
We need to shrink the dataset from 10,000 columns down to 2 or 3 columns while keeping as much of the underlying information (variance) as possible.
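The shrinking step can be sketched with plain NumPy. This is an illustrative from-scratch version under stated assumptions, not a reference implementation: the data is synthetic (200 columns secretly driven by 2 hidden factors, rather than 10,000 real columns), and names such as `latent`, `mixing`, and `X_reduced` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 300 rows, 200 columns, driven by 2 hidden factors plus noise.
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 200))
X = latent @ mixing + 0.1 * rng.normal(size=(300, 200))

# Standardize each column (zero mean, unit variance) before PCA.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: rows of Vt are the principal directions, ordered by variance.
U, S, Vt = np.linalg.svd(X_scaled, full_matrices=False)
X_reduced = X_scaled @ Vt[:2].T        # project onto the first 2 components

explained = S**2 / np.sum(S**2)        # share of total variance per component
print(X.shape, "->", X_reduced.shape)
print(f"variance kept by 2 components: {explained[:2].sum():.1%}")
```

Because this toy data really has only two underlying factors, two components retain most of the variance. Real datasets are rarely this clean, so the share kept by 2 or 3 components should be checked rather than assumed.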
Summary
1. PCA is an unsupervised dimensionality reduction technique.
2. It uses Feature Extraction, blending old columns into new, synthetic Principal Components.
3. PC1 captures the maximum variance (the "perfect camera angle").
4. You must always Scale your data before applying PCA.
5. The trade-off for compressing your data is that you lose the direct interpretability of your original features: each component is a blend of many columns.
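The points about scaling, blended components, and variance can be illustrated with a small sketch. Everything here is an assumption made for the example: the three synthetic columns (`height_cm`, `weight_kg`, `income`), the helper name `pca`, and the sample size. It compares PC1 with and without scaling, and prints the cumulative variance values that a scree plot would display.

```python
import numpy as np

def pca(X):
    """Return (components, explained_variance_ratio) for X (rows = samples)."""
    Xc = X - X.mean(axis=0)                       # PCA always centers the data
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt, S**2 / np.sum(S**2)

rng = np.random.default_rng(0)
n = 500
height_cm = rng.normal(170, 10, n)
weight_kg = 0.9 * (height_cm - 170) + rng.normal(70, 5, n)  # correlated with height
income = rng.normal(50_000, 15_000, n)                      # much larger numeric range

X = np.column_stack([height_cm, weight_kg, income])
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

raw_components, raw_ratio = pca(X)
scaled_components, scaled_ratio = pca(X_scaled)

# Each row of `components` holds the weights that blend the original columns.
print("PC1 weights, unscaled:", np.round(raw_components[0], 3))
print("PC1 weights, scaled:  ", np.round(scaled_components[0], 3))
print("cumulative variance (scaled):", np.round(np.cumsum(scaled_ratio), 3))
```

Without scaling, PC1 is driven almost entirely by `income`, simply because its numbers are largest. After scaling, PC1 becomes a blend of height and weight, which is also why a component no longer maps to a single original column.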
Quiz
What’s the key difference between Feature Selection and Feature Extraction (PCA)?
A. Selection needs scaling; extraction doesn’t
B. Selection removes original columns; extraction creates new synthetic features from them
C. Selection is unsupervised; extraction is supervised
D. No mathematical difference
Quiz-Answer
What’s the key difference between Feature Selection and Feature Extraction (PCA)?
A. Selection needs scaling; extraction doesn’t
B. Selection removes original columns; extraction creates new synthetic features from them (correct answer)
C. Selection is unsupervised; extraction is supervised
D. No mathematical difference