Sources
ICML '96
Non-Linear Decision Trees -- NDT
Andreas Ittner
Michael Schlosser
Experiments with a New Boosting Algorithm
Subtopic 1
Yoav Freund
Robert E. Schapire
Induction
advantages
relatively inexpensive
thorough
"impurity"
essentially randomness
value between 0 and 1
Topic
lower is better
evaluation function
examples
entropy
gini
algorithms
ID3
top-down induction
C4.5
good performance
AdaBoost
test selection
information theory
evaluate information gain
accuracy
error rate
true
apparent
overfitting
smaller is better
pruning
tree size/complexity
larger
lower apparent error rate
higher true error rate
overfitting
good on training data
100% for decision trees
poor on novel test cases
smaller
higher apparent error rate
lower true error rate
more "generality"
fewer branches
fewer conjunctions
fewer attribute comparisons
pruning
alternatives
lookahead
effectiveness
empirically proven
improvements
reduced-error pruning
weakest link
train and test
resampling
problems
data issues
bad data
some attribute data missing
continuous attributes
large data sets
solutions
new learning algorithms
techniques
bagging
boosting
comparison
production rules
decision trees
mutually exclusive paths
easier to visualize
easier to generate
production rules
not mutually exclusive
may require ordering of rules
more complex learning system
more powerful
compatibility
easy mapping from decistion trees to rules
applications
fault detection
data mining
OCR
machine learning
organization of knowledge
static objects
decision list
inference network
concept hierarchy
decision trees
discrimination networks
change over time
state-transition networks
search-control rules
macro-operators
learning methods
nonincremental
incremental
problem types
online
offline
paradigms (langley 21)
neural networks
case-based learning
genetic algorithms
*rule induction
structures
condition-action rules
*decision trees
similar logical structures
methods
recursive partitioning
disjoint sets
"classes"
conjunction of logical conditions
analytic learning
uses search to solve multi-step problems
*backward chaining (me)
represents knowledge as rules
problems phrased as theorems
performance system searches for proofs
hybrid methods
becoming more common
field is maturing
convergence of paradigms
expert systems
encoding of expert knowledge
sometimes used as implementation
used to build expert systems
types
fuzzy logic
crisp dTrees
fuzzy dTrees
univariate
multivariate
oblique
non-linear multivariate