  1. Sources
    1. ICML '96
      1. Non-Linear Decision Trees -- NDT
        1. Andreas Ittner
        2. Michael Schlosser
      2. Experiments with a New Boosting Algorithm
        1. Yoav Freund
        2. Robert E. Schapire
  2. Induction
    1. advantages
      1. relatively inexpensive
      2. thorough
    2. "impurity"
      1. essentially randomness
      2. typically a value between 0 and 1 (entropy in bits can exceed 1 with more than two classes)
      3. lower is better
      4. evaluation function
        1. examples
          1. entropy
          2. gini
    3. algorithms
      1. ID3
        1. top-down induction
      2. C4.5
        1. good performance
      3. AdaBoost
    4. test selection
      1. information theory
        1. evaluate information gain
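A minimal sketch of the two impurity measures named above (entropy, gini) and of information gain for test selection. The function names, the dict-per-row data layout, and the toy weather data are assumptions made for illustration, not taken from the cited sources.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy in bits of a list of class labels (0 for a pure set)."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of the whole set minus the weighted entropy after splitting."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder

# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

# Test selection: pick the attribute with the highest information gain.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, labels, a))
print(best)
```

A top-down learner in the style of ID3 would apply this selection recursively to each partition until the partitions are pure or no attributes remain.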
  3. accuracy
    1. error rate
      1. true
      2. apparent
    2. overfitting
      1. smaller trees are better
      2. pruning
    3. tree size/complexity
      1. larger
        1. lower apparent error rate
        2. higher true error rate
          1. overfitting
          2. good on training data
          3. training accuracy can reach 100% for fully grown decision trees
          4. poor on novel test cases
      2. smaller
        1. higher apparent error rate
        2. lower true error rate
        3. more "generality"
          1. fewer branches
          2. fewer conjunctions
          3. fewer attribute comparisons
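A small, contrived illustration of apparent versus true error. A lookup table over all attributes stands in for a fully grown tree, and a one-attribute majority rule stands in for a smaller tree. The data (one noisy training label), the helper names, and the use of a clean held-out set as an estimate of true error are all assumptions for this sketch.

```python
from collections import Counter, defaultdict

def fit_majority(examples, key):
    """Group training examples by key(x) and keep the majority label per group."""
    groups = defaultdict(list)
    for x, y in examples:
        groups[key(x)].append(y)
    return {k: Counter(ys).most_common(1)[0][0] for k, ys in groups.items()}

def error_rate(model, key, examples):
    """Fraction of examples the model misclassifies (every key must be known)."""
    wrong = sum(1 for x, y in examples if model[key(x)] != y)
    return wrong / len(examples)

def all_attributes(x):
    return x        # "large tree": one leaf per distinct attribute vector

def first_attribute(x):
    return x[0]     # "small tree": a single test on the first attribute

# Target concept: label equals the first attribute. One training label is noisy.
train = [((0, 0), 0), ((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
held_out = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)]

large = fit_majority(train, all_attributes)
small = fit_majority(train, first_attribute)

print("large:", error_rate(large, all_attributes, train),
      error_rate(large, all_attributes, held_out))
print("small:", error_rate(small, first_attribute, train),
      error_rate(small, first_attribute, held_out))
```

In this toy case the larger model has the lower apparent error because it memorizes the noisy example, while the smaller model does better on the held-out cases.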
  4. pruning
    1. alternatives
      1. lookahead
    2. effectiveness
      1. empirically demonstrated
    3. improvements
      1. reduced-error pruning
      2. weakest link
      3. train and test
      4. resampling
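A rough sketch of reduced-error pruning, assuming a tree stored as nested dicts and a separate validation set. The dict layout ("attr", "children", "majority", "label") and the tie-breaking rule (prune when the leaf is no worse) are choices made for this example only.

```python
def predict(tree, x):
    """Follow attribute tests to a leaf; fall back to the node majority."""
    while "attr" in tree:
        child = tree["children"].get(x[tree["attr"]])
        if child is None:
            return tree["majority"]
        tree = child
    return tree["label"]

def errors(tree, examples):
    """Number of examples the tree misclassifies."""
    return sum(1 for x, y in examples if predict(tree, x) != y)

def reduced_error_prune(tree, validation):
    """Bottom-up: replace a subtree by a majority-label leaf whenever that
    does not increase the error on the validation examples reaching it."""
    if "attr" not in tree:
        return tree
    pruned_children = {}
    for value, child in tree["children"].items():
        reaching = [(x, y) for x, y in validation if x[tree["attr"]] == value]
        pruned_children[value] = reduced_error_prune(child, reaching)
    candidate = {"attr": tree["attr"], "children": pruned_children,
                 "majority": tree["majority"]}
    leaf = {"label": tree["majority"]}
    return leaf if errors(leaf, validation) <= errors(candidate, validation) else candidate

# Hypothetical overfit tree: the x2 test under x1 = 0 only fits a noisy example.
tree = {"attr": "x1", "majority": 1, "children": {
    0: {"attr": "x2", "majority": 0,
        "children": {0: {"label": 0}, 1: {"label": 1}}},
    1: {"label": 1},
}}
validation = [({"x1": 0, "x2": 0}, 0), ({"x1": 0, "x2": 1}, 0),
              ({"x1": 1, "x2": 0}, 1), ({"x1": 1, "x2": 1}, 1)]

pruned = reduced_error_prune(tree, validation)
print(errors(tree, validation), errors(pruned, validation))
```

Here the x2 subtree is collapsed into a leaf, while the root test on x1 is kept because removing it would add validation errors.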
  5. problems
    1. data issues
      1. bad data
      2. some attribute data missing
      3. continuous attributes
      4. large data sets
      5. solutions
        1. new learning algorithms
        2. techniques
          1. bagging
          2. boosting
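A minimal sketch of bagging, one of the two techniques listed above: train each base model on a bootstrap resample and combine predictions by majority vote. The one-dimensional threshold stump used as the base learner, the function names, and the toy data are assumptions for illustration. Boosting is not shown.

```python
import random
from collections import Counter

def bootstrap(examples, rng):
    """Sample len(examples) items with replacement."""
    return [rng.choice(examples) for _ in examples]

def fit_stump(examples):
    """1-D decision stump: choose the threshold and leaf labels that
    minimize training errors on (x, y) pairs with y in {0, 1}."""
    best = None
    for t in sorted({x for x, _ in examples}):
        for left, right in ((0, 1), (1, 0)):
            errs = sum((left if x <= t else right) != y for x, y in examples)
            if best is None or errs < best[0]:
                best = (errs, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bagging(examples, n_models, seed=0):
    """Fit n_models stumps on bootstrap resamples; predict by majority vote."""
    rng = random.Random(seed)
    models = [fit_stump(bootstrap(examples, rng)) for _ in range(n_models)]

    def predict(x):
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical data: class 0 for small x, class 1 for large x.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1)]
ensemble = bagging(data, n_models=25, seed=0)
print(ensemble(1), ensemble(6))
```

Individual stumps can differ because each sees a different resample; the vote averages out that variation.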
  6. comparison
    1. decision trees vs. production rules
      1. decision trees
        1. mutually exclusive paths
        2. easier to visualize
        3. easier to generate
      2. production rules
        1. not mutually exclusive
          1. may require ordering of rules
        2. more complex learning system
        3. more powerful
      3. compatibility
        1. easy mapping from decision trees to rules
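The tree-to-rules mapping can be sketched as one rule per root-to-leaf path, with the path's attribute tests joined as a conjunction. The nested-dict tree layout and the example tree are assumptions for illustration.

```python
def tree_to_rules(tree, conditions=()):
    """Return one (conditions, label) rule for every root-to-leaf path."""
    if "label" in tree:
        return [(list(conditions), tree["label"])]
    rules = []
    for value, child in tree["children"].items():
        rules.extend(tree_to_rules(child, conditions + ((tree["attr"], value),)))
    return rules

# Hypothetical tree: test "outlook" first, then "windy" on the sunny branch.
tree = {"attr": "outlook", "children": {
    "sunny": {"attr": "windy", "children": {
        True: {"label": "no"},
        False: {"label": "yes"},
    }},
    "rain": {"label": "no"},
}}

rules = tree_to_rules(tree)
for conds, label in rules:
    lhs = " AND ".join(f"{attr} = {value}" for attr, value in conds) or "TRUE"
    print(f"IF {lhs} THEN class = {label}")
```

Because the paths of a tree are mutually exclusive, the rules produced this way never conflict and need no ordering. The reverse mapping is not generally this direct, since arbitrary rule sets may overlap.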
  7. applications
    1. fault detection
    2. data mining
    3. OCR
  8. machine learning
    1. organization of knowledge
      1. static objects
        1. decision list
        2. inference network
        3. concept hierarchy
          1. decision trees
          2. discrimination networks
      2. change over time
        1. state-transition networks
        2. search-control rules
        3. macro-operators
    2. learning methods
      1. nonincremental
      2. incremental
    3. problem types
      1. online
      2. offline
    4. paradigms (Langley 21)
      1. neural networks
      2. case-based learning
      3. genetic algorithms
      4. *rule induction
        1. structures
          1. condition-action rules
          2. *decision trees
          3. similar logical structures
        2. methods
          1. recursive partitioning
          2. disjoint sets
          3. "classes"
          4. conjunction of logical conditions
      5. analytic learning
        1. uses search to solve multi-step problems
        2. *backward chaining (me)
          1. represents knowledge as rules
          2. problems phrased as theorems
          3. performance system searches for proofs
      6. hybrid methods
        1. becoming more common
        2. field is maturing
        3. convergence of paradigms
  9. expert systems
    1. encoding of expert knowledge
    2. sometimes used as implementation
    3. used to build expert systems
  10. types
    1. fuzzy logic
      1. crisp decision trees
      2. fuzzy decision trees
    2. univariate
    3. multivariate
      1. oblique
    4. non-linear multivariate
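A short sketch of how the node tests differ across the univariate, oblique, and non-linear multivariate types. The non-linear test below is built by applying a linear test to pairwise products of the attributes; that construction is an assumption chosen for illustration and is not claimed to be the exact method of the NDT paper listed under Sources. All names and numbers are hypothetical.

```python
def univariate_test(x, feature, threshold):
    """Axis-parallel test on a single attribute."""
    return x[feature] <= threshold

def oblique_test(x, weights, threshold):
    """Linear combination of attributes compared against a threshold."""
    return sum(w * xi for w, xi in zip(weights, x)) <= threshold

def quadratic_features(x):
    """Original attributes followed by all pairwise products x[i] * x[j], i <= j."""
    x = tuple(x)
    products = tuple(x[i] * x[j] for i in range(len(x)) for j in range(i, len(x)))
    return x + products

def nonlinear_test(x, weights, threshold):
    """Oblique test in the expanded feature space: non-linear in the original one."""
    return oblique_test(quadratic_features(x), weights, threshold)

point = (1.0, 2.0)
print(univariate_test(point, 0, 1.5))                 # tests x1 <= 1.5
print(oblique_test(point, (1.0, 1.0), 2.5))           # tests x1 + x2 <= 2.5
print(nonlinear_test(point, (0, 0, 1, 0, 1), 4.0))    # tests x1^2 + x2^2 <= 4
```

The univariate test splits with an axis-parallel boundary, the oblique test with an arbitrary hyperplane, and the last test with a curved boundary (here a circle of radius 2).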