- Dimensionality Reduction
  - Benefits
    - Computational ease
    - Less overfitting
  - Techniques
    - Feature Selection
      - Criteria
        - Correlation among features
        - Feature discriminability
      - Techniques
        - Forward/backward searches
        - Branch and bound
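A forward search can be sketched as a greedy loop that repeatedly adds whichever remaining feature most improves a scoring function. The `score` argument below is a hypothetical stand-in (in practice it would be, e.g., cross-validated accuracy of a classifier trained on those features); `toy_score` and its gain values are invented purely for illustration.

```python
# Greedy forward feature selection: a minimal sketch.

def forward_search(all_features, score, k):
    """Greedily add the feature that most improves `score` until k are chosen."""
    selected = []
    while len(selected) < k:
        best_feature, best_score = None, float("-inf")
        for f in all_features:
            if f in selected:
                continue
            s = score(selected + [f])
            if s > best_score:
                best_feature, best_score = f, s
        selected.append(best_feature)
    return selected

# Hypothetical score: each feature contributes a fixed gain, and duplicates
# add nothing (a stand-in for "correlation among features").
def toy_score(feats):
    gains = {0: 0.3, 1: 0.05, 2: 0.5, 3: 0.05}
    return sum(gains[f] for f in set(feats))

print(forward_search([0, 1, 2, 3], toy_score, 2))  # [2, 0]
```

A backward search is the mirror image: start from all features and greedily remove the one whose removal hurts the score least.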
    - Feature Extraction
      - Criteria
        - Align/deform/unfold the data so that it is discriminable in few dimensions
      - Techniques
        - Assumptions
        - Classification
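Principal Component Analysis (PCA) is not named above, but it is the standard linear example of such an alignment: it rotates the data so that most of the variance lies in a few dimensions, which can then be kept while the rest are dropped. A minimal sketch with NumPy (the toy points are invented):

```python
import numpy as np

def pca(X, d):
    """Project the rows of X onto the d directions of largest variance."""
    Xc = X - X.mean(axis=0)              # center the data
    cov = Xc.T @ Xc / (len(X) - 1)       # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    components = vecs[:, ::-1][:, :d]    # top-d eigenvectors
    return Xc @ components

# 2-D points that vary almost only along the diagonal: one dimension suffices.
X = np.array([[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9]])
Z = pca(X, 1)
print(Z.shape)  # (4, 1)
```

Note that PCA maximizes retained variance, not class discriminability; a supervised criterion (e.g. LDA) would be needed for the latter.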
- Machine
  - Finite State Automaton (FSA)
  - Quantum Computing (= FSA?)
  - Human? No, just machines :)
- Learning
  - What?
    - Produce output given input
  - Why?
    - Automation
      - Proper utilization of human resources
        - Delegate unpleasant jobs (producing output given input) to non-humans :)
      - Fewer errors
      - Fast
  - When?
    - When prediction works
  - How?
    - Input Processing
      - What?
        - Transform raw data into data having particular characteristics
      - Why?
        - Many prediction functions share certain requirements on their input
          - Representation
            - Structure
              - Point pattern
              - Vector
            - Values / data types
              - Nominal
              - Ordinal
              - Interval
              - Ratio
          - Noise model
            - Additive noise
            - Missing data
          - Size
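The data-type distinction above matters in practice: nominal values have no order and are typically one-hot encoded, while ordinal values have an order that integer ranks can preserve. A minimal sketch (the category names are hypothetical):

```python
# Encoding nominal vs. ordinal values: a minimal sketch.

def one_hot(value, categories):
    """Encode a nominal value as a 0/1 indicator vector (no order implied)."""
    return [1 if value == c else 0 for c in categories]

def ordinal_code(value, ordered_levels):
    """Encode an ordinal value as its rank in the given order."""
    return ordered_levels.index(value)

print(one_hot("red", ["red", "green", "blue"]))           # [1, 0, 0]
print(ordinal_code("medium", ["low", "medium", "high"]))  # 1
```

Interval and ratio values are already numeric and usually only need scaling.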
      - How?
        - Handling missing data
        - Dimensionality reduction
        - Changing the data type
        - Changing the data structure
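As a concrete instance of the first step, mean imputation replaces each missing value with the mean of the observed values in the same column. A minimal sketch, with `None` standing in for a missing entry:

```python
# Mean imputation: one simple way of handling missing data before training.

def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

print(impute_mean([1.0, None, 3.0]))  # [1.0, 2.0, 3.0]
```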
    - Training the Prediction Function
      - Assumptions
        - Function class restriction
          - Why?
            - Accuracy
          - How?
            - By design
              - Choose a linear or quadratic prediction function
            - Regularization
        - Input domain restriction
          - Structure restriction
          - Data type restriction
          - Noise restriction
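Regularization restricts the function class softly, by penalizing large weights rather than excluding them outright. Ridge regression is the standard example; its closed form is w = (XᵀX + λI)⁻¹Xᵀy. A sketch with NumPy (the toy data is invented):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Least squares with an L2 penalty of strength lam on the weights."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w_unreg = ridge_fit(X, y, 0.0)
w_reg = ridge_fit(X, y, 10.0)
# The penalty shrinks the weights toward zero.
print(np.linalg.norm(w_reg) < np.linalg.norm(w_unreg))  # True
```

Choosing a linear model "by design" and then regularizing it are complementary: the first fixes the function class, the second trades accuracy on the training data for less overfitting within it.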
      - Techniques, organized by
        - Output type
          - Classification
          - Regression
        - Training data
          - Input-output pairs
          - Only input
          - Link / no-link constraints
          - Availability
        - Function
          - What is modeled?
            - The data generation function: generative
            - The class-separating boundary: discriminative
            - A function combining outputs of existing prediction schemes: meta-learners
          - Complexity
        - Temporal consistency
          - Drifting objective learning
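As one concrete point in this taxonomy, the perceptron is a discriminative classifier: it models the class-separating boundary directly and learns from input-output pairs. A minimal sketch on invented, linearly separable toy data:

```python
# Perceptron: a discriminative classifier trained from input-output pairs.

def perceptron_train(samples, labels, epochs=20):
    """labels in {-1, +1}; returns (weights, bias) of a linear boundary."""
    d = len(samples[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Update only on misclassified (or boundary) points.
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

def perceptron_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

X = [(0.0, 0.0), (0.0, 1.0), (2.0, 2.0), (3.0, 2.0)]
y = [-1, -1, 1, 1]
w, b = perceptron_train(X, y)
print([perceptron_predict(w, b, x) for x in X])  # [-1, -1, 1, 1]
```

A generative counterpart would instead model how each class produces data (e.g. a Gaussian per class) and classify via Bayes' rule.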
    - Testing the Prediction Function
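The simplest test protocol is a holdout split: keep part of the labeled data aside, train on the rest, and measure accuracy on the held-out part. A sketch in which `train` and `predict` are hypothetical stand-ins for any prediction scheme:

```python
import random

def holdout_accuracy(samples, labels, train, predict, test_fraction=0.25, seed=0):
    """Train on a random subset, return accuracy on the held-out remainder."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(samples) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    model = train([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
    return correct / n_test

# Trivial baseline scheme: always predict the most common training label.
def train(xs, ys):
    return max(set(ys), key=ys.count)

def predict(model, x):
    return model

acc = holdout_accuracy([[0], [1], [2], [3]], ["a", "a", "a", "b"], train, predict)
print(0.0 <= acc <= 1.0)  # True
```

Crucially, the held-out data must never be seen during training, or the measured accuracy overstates how well the prediction function generalizes.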