Lab #4: Introducing Classification
Everything Data • CompSci 216 • Spring 2019
Announcements (Tue Feb 12)
• HW 1 & 2 graded – refer to solutions in repo
• Project team formation – due in two weeks (Tuesday 2/26)
– 5 is the ideal team size; talk to us if you need special arrangement
Format of this lab
• Introduction to classification
• Lab #4
– Team challenge: extra credit!
• Discussion of Lab #4 (~5 minutes)
Introducing Lab #4
Classification problem example: Given the set of movies a user rated, and the user’s occupation, predict the user’s gender
m1  m2  m3  …  m1682  o1  o2  …  o21  gender
0   0   1   …  0      1   0   …  0    M
1   0   0   …  1      0   1   …  0    F
1   0   1   …  1      0   1   …  0    M
…   …   …   …  …      …   …   …  …    …
1   1   0   …  0      1   0   …  0    ???
…   …   …   …  …      …   …   …  …    ???
Training data: the rows with a known gender label, used to teach your classifier
Test data: the rows with gender = ???, used to evaluate your classifier
Accuracy = (# test records classified correctly) / (# test records)
Each row is an “instance”; the movie and occupation columns are its “features”; gender is the categorical “outcome” to predict.
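In code, the accuracy formula is a one-liner. A minimal sketch in Python (the helper name `accuracy` and the example labels are made up for illustration, not part of the lab's scripts):

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)

# Example: 3 of 4 test records classified correctly
print(accuracy(["M", "F", "M", "F"], ["M", "F", "F", "F"]))  # 0.75
```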
Where is the test data?
• What if no test data is specified, or we don’t know the right answers?
• We can still evaluate our classifier by splitting the data given to us
[Figure: the labeled dataset (movie columns m1 … m1682, occupation columns o1 … o21, and the gender column) is randomly split into a training subset and a test subset. The classifier is trained on the training rows, then predicts gender for each test row; comparing its predictions against the held-out true labels gives the accuracy.]

Rookie mistake: training and testing on the same (whole) dataset
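A sketch of this split-train-test loop, using only the standard library (the function names and the trivial majority-class “classifier” here are illustrative stand-ins, not the lab's classifyA/B/C code):

```python
import random
from collections import Counter

def train_test_split(rows, test_frac=0.3, seed=42):
    """Shuffle the labeled rows and hold out a fraction for testing."""
    rows = rows[:]                       # copy; don't mutate the caller's list
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    return rows[n_test:], rows[:n_test]  # (train, test)

def majority_classifier(train):
    """'Train' by memorizing the most common label; predict it for everyone."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common

rows = [((0, 1), "M"), ((1, 0), "F"), ((1, 1), "M"), ((0, 0), "M"), ((1, 1), "F")]
train, test = train_test_split(rows)     # never evaluate on the training rows!
predict = majority_classifier(train)
acc = sum(1 for x, y in test if predict(x) == y) / len(test)
```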
Lucky splits, unlucky splits
• What if a particular split gets lucky or unlucky?
• Should we tweak the heck out of our classification algorithm just for this split?
☞ Answer: cross-validation, a smart way to make the best use of available data
r-fold cross-validation
• Randomly divide data into r groups (say 10)
• Hold out each group for testing; train on the remaining r – 1 groups
– r train-test runs and r accuracy measurements
– A better picture of performance
[Figure: the data divided into r groups, with one group at a time held out for testing.]
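The r-fold loop can be sketched in a few lines of Python (`make_folds` and `cross_validate` are illustrative helper names, not part of the lab's scripts):

```python
import random

def make_folds(n, r, seed=0):
    """Randomly partition row indices 0..n-1 into r roughly equal groups."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::r] for i in range(r)]   # fold i takes every r-th shuffled index

def cross_validate(rows, r, train_and_score):
    """Hold out each fold in turn; return the r accuracy measurements."""
    folds = make_folds(len(rows), r)
    scores = []
    for held_out in folds:
        test = [rows[j] for j in held_out]
        train = [rows[j] for f in folds if f is not held_out for j in f]
        scores.append(train_and_score(train, test))
    return scores
```

Averaging the r scores (or looking at their spread) gives a better picture of performance than any single lucky or unlucky split.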
Three little classifiers
• classifyA.py: a “mystery” classifier
– Read the code to see what it does
• classifyB.py: Naïve Bayes classifier
– Along the same lines as Homework #4, 3(C)
• classifyC.py: k-Nearest-Neighbor classifier
– Given x, choose the k training data points closest to x; predict the majority class among them
Image source: http://www.weirdspace.dk/Disney/ThreeLittlePigs.htm
More on the kNN classifier
[Figure: a query point x among labeled training points; with k = 1, x takes the class of its single nearest neighbor.]
Source: Daniel B. Neill’s slides for 90-866 at CMU
How do we determine nearest?
• Euclidean distance?– Two attributes x and y:
– Three attributes x, y, and z:
– and so on, but beware of the curse of dimensionality
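Putting the two ideas together, a toy kNN predictor with Euclidean distance might look like this (a sketch only; classifyC.py’s actual implementation may differ):

```python
import math
from collections import Counter

def euclidean(p, q):
    """Square root of the sum of squared per-attribute differences."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def knn_predict(train, x, k):
    """train: list of (features, label) pairs.
    Predict the majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: euclidean(point[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "M"), ((0, 1), "M"), ((5, 5), "F"), ((5, 6), "F")]
print(knn_predict(train, (1, 0), k=3))  # "M": two of the three nearest are M
```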
Team work
1. Train-Test Runs and the Mystery of A
(A) Which classifier seems to work best?
(B) What exactly does A do?
Get checked off on these
2. Tweaking kNN
(A) How does k affect accuracies on training vs. test data? Is big or small k better for this problem?
(B) How does k = 500 compare with A?
Get checked off on these
Team challenge
• The Evil SQL Splitters: find a train-test split such that the classifiers are great on training data but horrible on test data
• Redemption of Naïve Bayes: find a train-test split such that B beats A and C hands-down
• Extra credit worth 5% of a homework if 4× and B has ≥ 60% accuracy; must get checked off in class