Transcript of David_Ramos.ppt

  • 8/11/2019 David_Ramos.ppt

    1/8

    Decision Trees & the Iterative

    Dichotomiser 3 (ID3) Algorithm

    David Ramos

    CS 157B, Section 1

    May 4, 2006

  • 2/8

    Review of Basics

    What exactly is a Decision Tree?

    A tree where each branching node represents a choice between two or more alternatives, with every branching node being part of a path to a leaf node (bottom of the tree). The leaf node represents a decision, derived from the tree for the given input.

    How can Decision Trees be used to classify instances of data?

    Instead of representing decisions, leaf nodes represent a particular classification of a data instance, based on the given set of attributes (and their discrete values) that define the instance of data; for illustration, such an instance is roughly analogous to a relational tuple.
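    As a concrete illustration of this idea, a data instance can be modeled as a set of attribute/value pairs, and a small hand-built tree can classify it. This is a sketch only: the Outlook/Humidity/Wind attributes below are invented for the example, not taken from the slides.

```python
def classify(instance):
    """Walk a tiny hand-built decision tree and return a classification.

    Each `if` is a branching node testing one attribute; each `return`
    is a leaf node, i.e. the classification for the instance.
    """
    if instance["Outlook"] == "Sunny":
        return "No" if instance["Humidity"] == "High" else "Yes"
    elif instance["Outlook"] == "Overcast":
        return "Yes"
    else:  # Rain
        return "No" if instance["Wind"] == "Strong" else "Yes"

# An instance looks much like a relational tuple of attribute values:
print(classify({"Outlook": "Sunny", "Humidity": "High", "Wind": "Weak"}))  # prints No
```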

  • 3/8

    Review of Basics (contd)

    It is important that data instances have boolean or discrete data values for their attributes to help with the basic understanding of ID3, although there are extensions of ID3 that deal with continuous data.

    Because Decision Trees can classify data instances into different types, they can be interpreted as a good generalization of unobserved instances of data, which appeals to people because it makes the classification process self-evident [1]. They represent knowledge about these data instances.

  • 4/8

    How does ID3 relate to Decision Trees, then?

    ID3, the Iterative Dichotomiser 3 Algorithm, is a Decision Tree learning algorithm. The name is apt: it creates Decision Trees for dichotomizing data instances, classifying them discretely through branching nodes until a classification bucket (leaf node) is reached.

    By using ID3 and other machine-learning algorithms from Artificial Intelligence, expert systems can engage in tasks usually done by human experts, such as doctors diagnosing diseases by examining various symptoms (the attributes) of patients (the data instances) in a complex Decision Tree.

    Of course, accurate Decision Trees are fundamental to Data Mining and Databases.

  • 5/8

    ID3 relates to Decision Trees (contd)

    Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree. Decision tree learning is one of the most widely used and practical methods for inductive inference. [2]

    The input data of ID3 is known as the set of training or learning data instances, which the algorithm uses to generate the Decision Tree. The machine is learning from this set of preliminary data. For future exams, remember that ID3 was developed by Ross Quinlan in 1983.

  • 6/8

    Description of ID3

    The ID3 algorithm generates a Decision Tree by using a greedy search through the inputted sets of data instances to determine the nodes and the attributes they use for branching. The emerging tree is traversed in a top-down (root to leaf) approach through each of the nodes within the tree. This occurs RECURSIVELY, reminding you of those pointless tree traversal strategies in CS 146 that you hated doing.

    The traversal attempts to determine whether the decision attribute on which branching will be based, for any particular emerging node, is the most ideal branching attribute (by using the inputted sets of data). One metric that can be used to determine whether a branching attribute is adequate is INFORMATION GAIN, which is defined in terms of ENTROPY (abstractly, Entropy is inversely related to Information Gain).
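    The greedy, top-down, recursive procedure described above can be sketched in Python. This is a minimal sketch of the ID3 skeleton, assuming instances are dicts of discrete attribute values; it is not Quinlan's exact formulation, and the attribute names in the example are invented for illustration.

```python
from math import log2
from collections import Counter

def entropy(rows, target):
    """Entropy of the target-class distribution over a set of instances."""
    counts = Counter(row[target] for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def info_gain(rows, attr, target):
    """Expected reduction in entropy from branching on attr (INFORMATION GAIN)."""
    n = len(rows)
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [row for row in rows if row[attr] == value]
        remainder += len(subset) / n * entropy(subset, target)
    return entropy(rows, target) - remainder

def id3(rows, attrs, target):
    """Greedy, top-down, RECURSIVE construction of the decision tree."""
    classes = {row[target] for row in rows}
    if len(classes) == 1:                  # pure subset: emit a leaf node
        return classes.pop()
    if not attrs:                          # no attributes left: majority class
        return Counter(row[target] for row in rows).most_common(1)[0][0]
    # Greedy step: branch on the attribute with the highest information gain.
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, attrs - {best}, target)
    return tree

# Hypothetical training instances (attribute names invented for illustration):
training = [
    {"Outlook": "Sunny",    "Play": "No"},
    {"Outlook": "Sunny",    "Play": "No"},
    {"Outlook": "Overcast", "Play": "Yes"},
]
print(id3(training, {"Outlook"}, "Play"))
# a nested dict such as {'Outlook': {'Sunny': 'No', 'Overcast': 'Yes'}}
```

    The returned tree is a nested dict: each inner key is a branching attribute, each value maps an attribute value either to a subtree or to a leaf classification.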

  • 7/8

  • 8/8

    How is Entropy related to Information Gain, which is the other metric for determining if ID3 is choosing appropriate branching attributes from the training sets?

    Information Gain measures the expected reduction in Entropy: the higher the Information Gain, the greater the expected reduction in Entropy.

    It turns out that Entropy, a measure of the non-homogeneity within a set of learning instances, can be calculated in a straightforward manner.
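    For example, Entropy can be computed directly from the class proportions in a set of labels. This is a sketch; the 9-positive / 5-negative split below is chosen only for illustration.

```python
from math import log2

def entropy(labels):
    """Entropy = -sum over classes of p * log2(p), where p is the
    proportion of instances carrying that class label."""
    n = len(labels)
    return -sum(
        (labels.count(c) / n) * log2(labels.count(c) / n)
        for c in set(labels)
    )

# An even split is maximally non-homogeneous (Entropy 1.0); a set of
# 9 "yes" and 5 "no" labels falls somewhere in between.
print(round(entropy(["yes", "no"]), 3))                  # prints 1.0
print(round(entropy(["yes"] * 9 + ["no"] * 5), 3))       # prints 0.94
```

    A perfectly homogeneous set (all labels identical) has Entropy 0, which is why a pure subset terminates the recursion with a leaf node.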

    For a more detailed presentation of the definition of Entropy and its calculation, see Prof. Lee's Lecture Notes.