Getting started with Weka -...

69
Getting started with Weka Yishuang Geng, Kexin Shi, Pei Zhang, Angel Trifonov, Jiefeng He, Xiaolu Xiong

Transcript of Getting started with Weka -...

Page 1: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Getting started with WekaYishuang Geng, Kexin Shi, Pei Zhang, Angel

Trifonov, Jiefeng He, Xiaolu Xiong

Page 2: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Lesson 1.1 - Introduction

Page 3: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Purpose of this course

● Take the mystery out of data mining.

● How to use the Weka workbench for data mining.

● Explain the basic principles of several popular algorithms

Page 4: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Data mining with Weka

● What’s data mining?○ We are overwhelmed with data○ Data mining is about going from the raw data to

information.

● What could data mining do?○ You’re at the supermarket checkout and you’re

happy with your bargains … and the supermarket is happy you’ve bought some more stuff

○ You want a child, but you and your partner can’t have one.

○ You want to monitor the firefighters status but you cannot get into the burning houses to watch them.

Page 5: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

What is Weka?

1. A bird found only in New Zealand2. Waikato Environment for Knowledge

Analysis

● Weka includes:○ 100+ algorithms for classification○ 75 for data preprocessing○ 25 to assist with feature

selection○ 20 for clustering, finding

association rules, etc

Page 6: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Textbook

Data Mining: Practical machinelearning tools and techniques,by Ian H. Witten, Eibe Frank andMark A. Hall. Morgan Kaufmann, 2011

Page 7: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Learning outcome of the course

● Load data into Weka and look at it● Use filters to preprocess it● Explore it using interactive visualization● Apply classification algorithms● Interpret the output● Understand evaluation methods and their implications● Understand various representations for models● Explain how popular machine learning algorithms work● Be aware of common pitfalls with data mining

Use Weka on your own data… and understand what you are doing!

Page 8: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 9: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

A simple application

○ You want to monitor the firefighters status but you cannot get into the burning houses to watch them.

Page 10: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

A simple application

Motion Detection Using RF Signals for the FirstResponder in Emergency Operations● Firefighters

○ Sensor to monitor their physiological information, which personal area communication capability to a centroid node.

○ Centroid node has local area communication capability to link the terminals out of burning house.

○ If we want to monitor their motion, what should we do?

Page 11: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Existing approaches

● Pros○ High detection rate.○ Low computational cost.

● Cons○ Add extra load to firefighter.○ Limited sensor location, usually on shoes.○ Lack of capability on detecting multiple

motions,mainly used for fall detection.

Page 12: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Raw data

Page 13: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Data mining

Page 14: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Information from the raw data

Page 15: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Summary

● Why taking that course● Materials

○ Weka○ Textbook

● Course schedule○ Lectures○ Activities○ Assessments

● Learning outcome● A simple application

Page 16: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Lesson 1.2 - Exploring the Explorer

Page 17: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Setting up Weka

● Download latest (Weka 3.6.10) from http://www.cs.waikato.ac.nz/ml/weka/downloading.html

● Self-extracting executable ○ Java VM included (if needed)

● Create shortcut to Data folder in your Computer’s My Documents

● Use the Weka shortcut from the program folder

Page 18: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Weka Interface

● Weka interfaces○ Explorer○ Experimenter○ GUI○ Command-line

● Explorer will be used the most

Page 19: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Explorer Interface

● Explorer Panels○ Preprocess

● Opening datasets○ File

● Filter○ Supervised○ Unsupervised

Page 20: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Filters

● Difference

● An additional two kinds of filtering○ Instances○ Attributes

Page 21: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

More Preprocess Information

● Relation○ Attributes○ Instances

● Selected Attribute○ Name○ Type○ Other Info

● Attributes○ Editing○ Removing

● Class Visualization● Status and log

Page 22: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Lesson 1.3 - Exploring datasets

Page 23: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Classification

Page 24: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 25: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Nominal vs. Numerical

Page 26: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

ARFF file format

Page 27: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 28: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 29: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 30: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Lesson 1.4 - Building a classifier

Classifying the glass datasetInterpreting J48 outputJ48 configuration panel... option: pruned vs unpruned trees ... option: avoid small leaves

Jiefeng

Page 31: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 32: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Click Here

Page 33: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Use the confusion matrix to determine how many headlamps instances were misclassified as build wind float?

3What is the percentage ofcorrectly classified instances?

Page 34: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 35: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 36: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Turning pruning off results in larger trees, and often yields worse results because the classifier may "overfit" the data. However, in some cases the unpruned tree performs better than the pruned one.

Page 37: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 38: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 39: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 40: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 41: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 42: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 43: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

1.4 Summary

Building a classifierClassifying the glass datasetInterpreting J48 outputJ48 configuration panel... option: pruned vs unpruned trees ... option: avoid small leaves

Page 44: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Lesson 1.5 - Using a filter

Page 45: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Use a filter to remove an attribute

Open weather.nominal.arff

Page 46: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Check the filters

Page 47: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 48: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Set attributeIndices to 3 and click OK

Page 49: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Apply the filter

Page 50: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 51: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 52: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 53: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 54: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 55: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 56: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 57: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 58: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 59: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 60: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 61: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Lesson 1.6 - Visualizing your data

Page 62: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Raw data visualization

Page 63: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Sepalwidth vs. petalwidth

Page 64: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation
Page 65: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Zoom in

Page 66: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Zoom in

Page 67: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Error visualization

Page 68: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Error visualization

Page 69: Getting started with Weka - WPIweb.cs.wpi.edu/~cs525/f13b-EAR//cs525-homepage/lectures/TALKS-MOOC/MOOC-getting...Apply classification algorithms Interpret the output Understand evaluation

Thank you!

Questions?