S7_WEKAIntro

  • 7/29/2019 S7_WEKAIntro

    1/28

    S7: Intro to Weka Lab

    Shawndra Hill

    Spring 2013

    TR 1:30-3pm and 3-4:30

  • 7/29/2019 S7_WEKAIntro

    2/28

    Preprocessing

  • 7/29/2019 S7_WEKAIntro

    3/28

    Preprocessing: supervised sampling

    When the class is unbalanced, supervised sampling can be used to balance the
    data set.

    Click Choose in the Filter section, and follow the path to the supervised
    Resample filter (shown in the slide screenshot).

  • 7/29/2019 S7_WEKAIntro

    4/28

    Preprocessing: supervised sampling

    Click on the Resample box next to the Choose button, and a pop-up window
    opens where you can set the desired parameters.

    After the filter is applied, a balanced sample is obtained.
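
    The idea behind balancing a data set by resampling can be sketched in a few lines of Python. This is only an illustration of the concept, not Weka's implementation: the function name `resample_balanced`, the toy data, and the choice to draw an equal number of rows per class with replacement are all assumptions made for the example. In Weka, the comparable behaviour is controlled through the parameters in the Resample pop-up.

```python
import random
from collections import Counter, defaultdict

def resample_balanced(rows, class_index, sample_size=None, seed=1):
    """Draw a sample with replacement in which every class is equally represented."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[class_index]].append(row)
    classes = sorted(by_class)
    if sample_size is None:
        sample_size = len(rows)
    per_class = sample_size // len(classes)
    sample = []
    for label in classes:
        # Sampling with replacement lets a small class be drawn many times.
        sample.extend(rng.choices(by_class[label], k=per_class))
    rng.shuffle(sample)
    return sample

# Hypothetical toy data: 90 "no" rows and 10 "yes" rows, class label in column 1.
data = [(i, "no") for i in range(90)] + [(i, "yes") for i in range(10)]
balanced = resample_balanced(data, class_index=1)
counts = Counter(label for _, label in balanced)
print(counts["no"], counts["yes"])
```

    With the toy data above, each class ends up with the same number of rows in `balanced`, at the cost of repeating rows from the minority class.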

  • 7/29/2019 S7_WEKAIntro

    5/28

    Preprocessing: unsupervised sampling

  • 7/29/2019 S7_WEKAIntro

    6/28

    Preprocessing: filters

    For a numeric transformation filter, click Choose, follow the path shown
    below, and select NumericTransform. Clicking the NumericTransform box next to
    the Choose button opens the pop-up where the transformation is set up. To
    revert the transformation, press the Undo button in the Preprocess tab.
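
    What a numeric transformation filter does can be illustrated with a short Python sketch: one function is applied to every numeric value, and non-numeric values are left alone. The helper name `numeric_transform`, its `skip_columns` argument, and the toy rows are hypothetical and only serve the example; they are not Weka option names.

```python
import math

def numeric_transform(rows, transform=abs, skip_columns=()):
    """Apply `transform` to every numeric value, leaving other values untouched."""
    out = []
    for row in rows:
        new_row = []
        for i, value in enumerate(row):
            is_numeric = isinstance(value, (int, float)) and not isinstance(value, bool)
            if is_numeric and i not in skip_columns:
                new_row.append(transform(value))
            else:
                new_row.append(value)
        out.append(tuple(new_row))
    return out

# Hypothetical rows: two numeric attributes and a nominal class.
rows = [(-4.0, 9.0, "yes"), (16.0, -1.0, "no")]
absolute = numeric_transform(rows)  # absolute value of both numeric columns
rooted = numeric_transform(absolute, transform=math.sqrt, skip_columns=(1,))
print(absolute)
print(rooted)
```

    The function returns a new list and leaves `rows` unchanged, so the untransformed data stays available, much as Undo restores it in the GUI.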

  • 7/29/2019 S7_WEKAIntro

    7/28

    Preprocessing: filters

  • 7/29/2019 S7_WEKAIntro

    8/28

    Feature selection

    Go to the Select attributes section to perform feature selection. You need to
    define two things in order to do so: which attribute evaluator and which
    search method to use.

    Click on the Choose buttons to access and set the options.

    Attribute evaluator options:

    Search method options:

  • 7/29/2019 S7_WEKAIntro

    9/28

    Feature selection: (InfoGain, Ranker)

    Click on the Choose buttons to access and set the options.
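
    To make the InfoGain plus Ranker combination concrete, the sketch below scores each attribute by its information gain with respect to the class and then sorts the attributes by that score. It is a simplified illustration that handles nominal attributes only; the function names and the toy data are assumptions made for the example, not Weka code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def info_gain(rows, attr_index, class_index):
    """Class entropy minus the expected class entropy after splitting on the attribute."""
    labels = [row[class_index] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[class_index])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_attributes(rows, attr_names, class_index):
    """Return (score, name) pairs sorted from most to least informative."""
    scores = [(info_gain(rows, i, class_index), name)
              for i, name in enumerate(attr_names) if i != class_index]
    return sorted(scores, reverse=True)

# Hypothetical toy data: "outlook" determines the class, "windy" carries no information.
names = ["outlook", "windy", "play"]
rows = [("sunny", "true", "no"), ("sunny", "false", "no"),
        ("rainy", "true", "yes"), ("rainy", "false", "yes")]
ranking = rank_attributes(rows, names, class_index=2)
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```

    In this toy case `outlook` is ranked first with a gain of 1 bit and `windy` last with a gain of 0.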

  • 7/29/2019 S7_WEKAIntro

    10/28

    Feature selection: (PCA, Ranker)

  • 7/29/2019 S7_WEKAIntro

    11/28

    Feature selection: (Wrapper, GreedySTW)

  • 7/29/2019 S7_WEKAIntro

    12/28

    Feature selection: models with selected attributes

    Remove the unselected attributes in the TRAINING set; save the modified
    TRAINING file.

    Then open the TEST set file, and remove the unselected attributes in the same
    way. Save the modified TRAINING and TEST set files, open them in WEKA, and
    run your model.
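
    The key point of this step is that the TRAINING and TEST sets must end up with exactly the same attributes. A minimal Python sketch of that idea follows; the helper `keep_attributes`, the attribute names, and the rows are hypothetical, and the data is held in plain lists rather than read from .arff files.

```python
def keep_attributes(header, rows, selected):
    """Return (header, rows) restricted to the selected attribute names, in original order."""
    missing = [name for name in selected if name not in header]
    if missing:
        raise ValueError(f"unknown attributes: {missing}")
    keep = [i for i, name in enumerate(header) if name in selected]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows

# Hypothetical data sets sharing one header.
header = ["age", "income", "zip", "class"]
train = [[25, 40000, "19104", "yes"], [52, 81000, "10027", "no"]]
test = [[37, 56000, "60614", "no"]]
selected = {"age", "income", "class"}  # the class attribute is always kept

train_header, train_small = keep_attributes(header, train, selected)
test_header, test_small = keep_attributes(header, test, selected)
print(train_header == test_header)
```

    Using one `selected` set for both calls is what guarantees that the model trained on the reduced TRAINING set can be evaluated on the reduced TEST set.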

  • 7/29/2019 S7_WEKAIntro

    13/28

    Classification: Test set options

  • 7/29/2019 S7_WEKAIntro

    14/28

    Classification: K-NN

    To create a K-NN classifier, go to the Classify tab and select the options as
    indicated above. Once you select IBk, you can click on the IBk box (right next
    to the Choose button) and set the parameters in the pop-up window (figure on
    the right).
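
    For readers who want to see what a K-NN classifier computes, here is a minimal Python sketch: find the k training points closest to the query and take a majority vote over their labels. It is not the IBk implementation; the function name, the toy points, and the use of plain Euclidean distance without any attribute scaling are assumptions of the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=1):
    """train: list of (features, label) pairs; query: a feature tuple."""
    # Sort training points by Euclidean distance to the query and keep the k nearest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional training points with two classes.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"),
         ((5.0, 5.0), "b"), ((5.5, 4.5), "b"), ((4.8, 5.2), "b")]
near_a = knn_predict(train, (1.1, 0.9), k=3)
near_b = knn_predict(train, (5.0, 5.0), k=3)
print(near_a, near_b)
```

    The value of k plays the role of the number-of-neighbours parameter that is set in the IBk pop-up window.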

  • 7/29/2019 S7_WEKAIntro

    15/28

    Classification: Naive Bayes

  • 7/29/2019 S7_WEKAIntro

    16/28

    Classification: Decision trees

  • 7/29/2019 S7_WEKAIntro

    17/28

    Numeric prediction: Linear regression

  • 7/29/2019 S7_WEKAIntro

    18/28

    Numeric prediction: Neural Networks

  • 7/29/2019 S7_WEKAIntro

    19/28

    Numeric prediction: Neural Networks

    By clicking on More, you can learn about the NN parameters. Weka descriptions
    for some of the important parameters are shown below:

  • 7/29/2019 S7_WEKAIntro

    20/28

    Classification: Output

  • 7/29/2019 S7_WEKAIntro

    21/28

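    Classification: Output

    Correctly classified instances: 5269/9639 = 54.68%

    TP rate: Class 1: 2056/(2056+2770) = 0.427

    Class 2: 3213/(3213+1957) = 0.668

    The diagonals of the TP rate and FP rate columns sum to 1 (with two classes,
    the TP rate of one class plus the FP rate of the other class equals 1)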
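
    The quantities on this slide all come from the confusion matrix, and the relationship between TP rate and FP rate is easy to check in code. The sketch below uses a small hypothetical confusion matrix (not the numbers from the slide), with rows as actual classes and columns as predicted classes; the function names are chosen for the example.

```python
def accuracy(cm):
    """Fraction of instances on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def tp_rate(cm, c):
    """Correctly classified instances of class c divided by all actual instances of c."""
    return cm[c][c] / sum(cm[c])

def fp_rate(cm, c):
    """Instances wrongly predicted as c divided by all instances that are not c."""
    false_pos = sum(cm[r][c] for r in range(len(cm)) if r != c)
    negatives = sum(sum(cm[r]) for r in range(len(cm)) if r != c)
    return false_pos / negatives

# Hypothetical two-class confusion matrix: rows = actual, columns = predicted.
cm = [[40, 10],
      [20, 30]]

print(accuracy(cm))
print(tp_rate(cm, 0), fp_rate(cm, 0))
print(tp_rate(cm, 1), fp_rate(cm, 1))
print(tp_rate(cm, 0) + fp_rate(cm, 1), fp_rate(cm, 0) + tp_rate(cm, 1))
```

    The last line prints the two diagonal sums. With two classes, every class 1 instance is either classified correctly (counted in the TP rate of class 1) or wrongly assigned to class 2 (counted in the FP rate of class 2), which is why each sum equals 1.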

  • 7/29/2019 S7_WEKAIntro

    22/28

    Classification: Output

  • 7/29/2019 S7_WEKAIntro

    23/28

    Classification: Output

    Predicted class

    Class probability estimate

    NOTES:

    1:1 means class 1, label 1

    2:0 means class 2, label 0

    Class labels can be anything.

    The first (second) column in the class probability estimates indicates the
    probability of being class 1 (class 2). E.g., the predicted prob. of obs. 1
    being class 1 is 0.578.

    + indicates mistakes in the classification

    * indicates the predicted class
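
    The notation in these notes can be decoded mechanically. The Python sketch below parses one prediction line into its parts. The example line is hypothetical and its exact column layout (instance number, actual, predicted, error flag, probabilities) is an assumption; the output of your Weka version may be spaced or separated differently, so treat this as an illustration of the `index:label`, `+`, and `*` conventions rather than a ready-made parser.

```python
import re

def parse_prediction(line):
    """Split one prediction line into instance number, classes, error flag and probabilities."""
    tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
    actual_index, actual_label = tokens[1].split(":", 1)
    predicted_index, predicted_label = tokens[2].split(":", 1)
    rest = tokens[3:]
    error = bool(rest) and rest[0] == "+"  # "+" marks a misclassified instance
    if error:
        rest = rest[1:]
    probabilities = [float(t.lstrip("*")) for t in rest]
    # "*" marks the probability column of the predicted class (1-based position).
    starred = [i + 1 for i, t in enumerate(rest) if t.startswith("*")]
    return {
        "instance": int(tokens[0]),
        "actual": (int(actual_index), actual_label),
        "predicted": (int(predicted_index), predicted_label),
        "error": error,
        "probabilities": probabilities,
        "starred_class": starred[0] if starred else None,
    }

# Hypothetical line: observation 1 is actually class 2 (label 0) but predicted as class 1 (label 1).
parsed = parse_prediction("1 2:0 1:1 + *0.578 0.422")
print(parsed)
```

    Here the first probability column belongs to class 1 and carries the `*`, matching the predicted class, and the `+` shows that the prediction disagrees with the actual class.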

  • 7/29/2019 S7_WEKAIntro

    24/28

    Classification: Output

    Predicted class

    Class probability estimate

    Class probability estimates can be stored using the command line; an example
    is given below.

    Get to the command line in Windows by typing cmd in the Run dialogue box.

    Copy your training and test .arff files to the Weka directory, and then use
    the following command line:

    java -cp weka.jar weka.classifiers.trees.J48 -t TRAIN.arff -T TEST.arff -p 0 > filename.probs

  • 7/29/2019 S7_WEKAIntro

    25/28

    Classification: ROC curves

  • 7/29/2019 S7_WEKAIntro

    26/28

    Visualization tab

  • 7/29/2019 S7_WEKAIntro

    27/28

    Visualization tab
