Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School...

Data Analysis STA2300

Professor Shahjahan Khan, PhD

School of Agricultural, Computational and Environmental Sciences
Faculty of Health, Engineering and Sciences

University of Southern Queensland

Lecture 1

Professor Shahjahan Khan, PhD Lecture 1

Part I

Introduction

Welcome to STA2300

A warm welcome to the STA2300 Data Analysis course
My name is Shahjahan Khan

The Taj Mahal was built by Mughal Emperor Shahjahan
I have worked and lived in 9 different countries
I am the examiner of the course for this semester
I hope to work with you throughout the semester

General Announcements

Announcements

Tutorials start this week:
Bring textbook; Study Book (containing tutorial questions); calculator

Use StudyDesk via UConnect regularly
Check the News Forum at least once a week!
Check the Social/Topics Forum at least once a week!
Check your Umail regularly

Assessment
Three assignments and a 2-hour exam are compulsory

Three assignments (5%, 20% and 25%)
Online Assignment 1 (i.e., Quiz 1) due Friday of next week! (Complete online through StudyDesk by 11.55 pm)
Assignments 2 & 3: submit as a PDF document via StudyDesk by 11.55 pm on the due date

Exam (50% total weighting):
Part A is 20 multiple choice questions
Part B is 30 marks of short answer questions

Passing grade: (1) at least 50% of the total weighted marks and (2) at least 40% of the 50% (i.e., 20/50) in the final exam.
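The two passing conditions can be checked with a little arithmetic. The sketch below is only an illustration, not an official grade calculator: the function name `passes` and its arguments are hypothetical, and it assumes each mark has already been scaled to its weighting (Assignment 1 out of 5, Assignment 2 out of 20, Assignment 3 out of 25, exam out of 50).

```python
# Thresholds taken from the assessment slide; all names here are illustrative.
PASS_TOTAL = 50  # condition (1): at least 50% of the total weighted marks
PASS_EXAM = 20   # condition (2): at least 40% of the 50 exam marks, i.e. 20/50


def passes(a1, a2, a3, exam):
    """Return True if both passing conditions hold.

    Assumes every argument is a weighted mark:
    a1 out of 5, a2 out of 20, a3 out of 25, exam out of 50.
    """
    total = a1 + a2 + a3 + exam
    return total >= PASS_TOTAL and exam >= PASS_EXAM


# Full marks on all assignments, but 19/50 in the exam: condition (2) fails.
print(passes(5, 20, 25, 19))
# 30/50 on the assignments and exactly 20/50 in the exam: both conditions hold.
print(passes(3, 12, 15, 20))
```

Note that a strong assignment record alone is not enough under these rules: the exam hurdle is checked separately from the overall total.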

Materials you need
De Veaux, Velleman & Bock 4th edition (3rd ed is also OK)

Introductory Material (on StudyDesk)
Study Book (on StudyDesk)
Textbook: Intro Stats by De Veaux, Velleman & Bock (4th edition) with ActivStats CD (or Stats: Data and Models, 4th Global Edn)
Calculator (with STAT mode)
Access to the SPSS (also called IBM SPSS) software

Available for purchase from the bookshop, or
buy a 6-month licence online from Hearne Scientific: https://www.hearne.software/Software/SPSS-Grad-Packs-for-Students-by-IBM/Editions
Available in all USQ PC labs

We provide:
We provide them, but you must use them to get the benefit

Lectures and tutorials for on-campus students
SPSS Exercises (on StudyDesk)
Face-to-face, StudyDesk forum, telephone, and e-mail assistance via UAsk
LTS support (The Learning Centre)
MEET-UP program (see StudyDesk for details)
Feedback on assignments
Web resources: discussion forums etc. on StudyDesk

LTS support
The Learning Centre

The Learning Centre is located near the bookshop
Success in Mathematics for Statistics Online Tutorials
Calculator booklets
Drop-in and phone-in mathematics support
Academic Skills Workshops
More details from postings and links on the StudyDesk

Why do this course?

You are doing it because it is:
an essential skill of your degree
decided by your faculty lecturers/professors for you
a skill expected by good employers
essential for any research studies (e.g. honours)
providing statistical literacy for everyday life

Why Statistics?

Contributions of Statistics?

Statistical disciplines

Statistics at the top

The basics of what we will learn

We will learn
the language of statistics (all Modules)
how to summarise data (verbally, graphically and numerically) (Modules 1 to 4)
how to collect data (and how not to) (Module 5)
how to generalise our observations to the wider world (Modules 6 to 11)

Outline
General Announcements
What you need to have access to
What we provide

1 Appendix A: Mathematics Review

What is Appendix A of the Study Book?

Appendix A is the assumed mathematics knowledge for the course.
It provides some of the essential mathematical skills necessary for Data Analysis.
Return to this chapter if your mathematics skills need refreshing during the course.
Contact The Learning Centre for help with this material.
It is not directly examinable, and not supported by the course teaching team.

Part II

Module 1: Exploring and understanding data

§1.1 What is/are Statistics?
§1.2 About data
§1.3 Displaying categorical data

Outline

2 §1.1 What is/are Statistics?
    Why do Data Analysis?

3 §1.2 About data
    Some language
    Types of data

4 §1.3 Displaying categorical data
    Bar charts
    Pie charts
    Contingency tables

Why learn Data Analysis?
SB §1.1

Data are generated by repeated observation
Data analysis finds the information in data
Information is used to learn about the world and help decision making

Who needs to know statistics?

If your job is:      Statistics helps you:
Collecting data      know how much data you need
                     get max. information at min. cost
                     communicate with your analyst
Analysis             extract information
                     make correct decisions
Making decisions     justify your decisions
                     make informed decisions
                     communicate with your analyst

Why do Data Analysis?

Statistics used correctly can assist good decision making
Statistics used incorrectly can misinform
‘Figures don’t lie, but liars can figure’
‘Figures fool, when fools figure’
Your choice: user or victim!

Use, Abuse and Misuse of Statistics

On 5 October 2012 the US Bureau of Labor Statistics reported that the unemployment rate had dropped from 8.1% to 7.8%.

In November 2015, Donald Trump, then a Republican presidential candidate in the USA, tweeted

“Whites killed by blacks - 81%”, citing the “Crime Statistics Bureau of San Francisco”.

The US fact-checking site PolitiFact found that this “Bureau” did not exist, and that the true figure is around 15%.

When confronted, Trump shrugged and said, “Am I going to check every statistic?”

Some language
SB §1.2

Cases are the individuals or objects being described
A variable is any characteristic of a case
Data are the observed values of the variables
A data set contains the observed values of the variables for a group of individuals

Example
Gender is a variable; ‘Male’ and ‘Female’ are the observed values of the variable (i.e. data)

A typical data set
The data here is loaded in SPSS

Understanding the SPSS window

This is a dataset: it contains the observed values of variables for a group of individuals (cases)
Variables are in the columns
Cases are in the rows
The variable names are: gender; height; faculty; etc.
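
The same rows-and-columns layout can be sketched outside SPSS. The following is a minimal Python illustration, not the lecture's survey data: the three cases and their values are made up, and the helper names `get_variable` and `get_case` are hypothetical.

```python
# A tiny made-up data set: each dict is one case (a row),
# and each key is one variable (a column). Values are illustrative only.
dataset = [
    {"gender": "Female", "height": 165, "faculty": "Sciences"},
    {"gender": "Male", "height": 180, "faculty": "Business"},
    {"gender": "Female", "height": 172, "faculty": "Arts"},
]


def get_variable(data, name):
    """Return one column: the observed values of a variable across all cases."""
    return [case[name] for case in data]


def get_case(data, index):
    """Return one row: all observed values for a single case."""
    return data[index]


variable_names = list(dataset[0].keys())
heights = get_variable(dataset, "height")
second_case = get_case(dataset, 1)

print("Variables:", variable_names)
print("height column:", heights)
print("Second case:", second_case)
```

Reading down a column gives every observed value of one variable; reading across a row gives everything recorded about one case.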

An example of a variable in the data set

An example of a case in the data set

Types of data: Quantitative data
Knowing the type of data means we can select the correct techniques later

Quantitative data (Scale in SPSS) take on numerical values for which mathematical operations (like +, ÷) make sense
Quantitative data have a unit of measurement

Example
Quantitative variables in the SPSS data: height (in cm)

Quantitative data generally has units

Example
Height is quantitative. You must indicate if it is measured in inches, metres, or cm.
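
To see why both the arithmetic and the units matter, here is a small Python sketch. The heights are made-up values (not the lecture data) and `cm_to_inches` is a hypothetical helper; only the conversion factor of 2.54 cm per inch is a fixed fact.

```python
from statistics import mean

# Made-up heights in centimetres (illustrative values, not the lecture data).
heights_cm = [165.0, 180.0, 172.0, 158.0]

CM_PER_INCH = 2.54  # exact by definition of the inch


def cm_to_inches(value_cm):
    """Convert a length from centimetres to inches."""
    return value_cm / CM_PER_INCH


# Arithmetic is meaningful for quantitative data: an average height makes sense.
mean_cm = mean(heights_cm)
mean_in = cm_to_inches(mean_cm)

# The same quantity gives different numbers in different units,
# so a value reported without its unit is ambiguous.
print(f"Mean height: {mean_cm:.2f} cm = {mean_in:.2f} in")
```

The two printed numbers describe the same average height, which is why the unit has to travel with the value.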

Types of data: Categorical data

Categorical variable (Nominal in SPSS): values define categories

Example
Categorical variables in the SPSS data: gender (values ‘Male’ and ‘Female’), faculty (of enrolment)

Categorical (Nominal in SPSS) data is usually coded for use in SPSS

Example
Consider gender: Males may be coded as 1; females as 2. Or females coded as 0; males as 1.

Careful: Categorical data may look quantitative if the categories are given numerical codes
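
The warning above can be made concrete with a short Python sketch. It assumes the first coding scheme from the example (males as 1, females as 2); the coded observations are made up, and the names `GENDER_LABELS` and `misleading_mean` are hypothetical.

```python
from collections import Counter

# Assumed coding scheme, following the first example above:
# males coded as 1, females coded as 2.
GENDER_LABELS = {1: "Male", 2: "Female"}

# Made-up coded observations (illustrative only).
gender_codes = [1, 2, 2, 1, 2]

# The codes are only labels: decode them and count the cases in each category.
gender_values = [GENDER_LABELS[code] for code in gender_codes]
counts = Counter(gender_values)

# Python will happily average the codes, but the result depends on the
# arbitrary coding scheme and says nothing meaningful about gender.
misleading_mean = sum(gender_codes) / len(gender_codes)

print(counts)
print("Mean of the codes (not meaningful):", misleading_mean)
```

Counting cases per category is the sensible summary here; switching to the 0/1 coding would change the "mean" but not the counts.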

Types of data in the dataset

The variable ‘faculty’

The variable ‘faculty’ is categorical, with levels defined numerically
We can see these definitions using ‘Variable View’

Types of data: Ordinal data

Data is ordinal if it falls between categorical and quantitative: the values are categories, but the categories have a natural order

Example
Responses such as ‘Disagree’, ‘Neutral’ and ‘Agree’ are ordinal: they can be placed in a natural order, but are not quantitative

Example
‘How often do you smoke? Never; Sometimes; Often; Regularly’. This variable is also ordinal.
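
A brief Python sketch of what "ordered but not quantitative" means in practice. The category order comes from the smoking question above; the responses are made up, and the names `SMOKING_LEVELS` and `RANK` are hypothetical.

```python
# The response categories from the smoking question, in their natural order.
SMOKING_LEVELS = ["Never", "Sometimes", "Often", "Regularly"]

# Map each category to its position in the ordering.
RANK = {level: position for position, level in enumerate(SMOKING_LEVELS)}

# Made-up survey responses (illustrative only).
responses = ["Often", "Never", "Regularly", "Sometimes", "Never"]

# Ordering is meaningful for ordinal data, so sorting and comparing make sense...
sorted_responses = sorted(responses, key=RANK.get)
often_above_sometimes = RANK["Often"] > RANK["Sometimes"]

# ...but the gaps between categories are not defined, so the ranks should not
# be added or averaged as if they were measurements.
print(sorted_responses)
print("'Often' ranks above 'Sometimes':", often_above_sometimes)
```

The rank numbers exist only to express the order; they are not measurements with units.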

Understanding data
SB §1.2

Who: information about the cases
What, and in what units: The meaning of the variables
When
Where
Why
How

Understanding data: The SPSS data set
There’s no point having data if you don’t understand it

Who: students attending the first STA2300 lecture
What: various: height (in cm), gender, etc.
When: Semester, Year
Where: L209 (i.e. only on-campus students who came)
Why: To generate some data for use in lectures
How: A quick paper survey


Outline

2 §1.1 What is/are Statistics?
    Why do Data Analysis?

3 §1.2 About data
    Some language
    Types of data

4 §1.3 Displaying categorical data
    Bar charts
    Pie charts
    Contingency tables


Distributions and graphs
SB §1.3

The distribution of a variable tells us what values the variable takes and how often it takes them.

Example: The distribution of gender tells us how many Males and Females are in the dataset.

Different graphs are used for different reasons and different data types.
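As a rough illustration of "what values the variable takes and how often", the sketch below tallies a categorical variable in Python. The responses are made up for this example (they are not the lecture survey data), and the function name `distribution` is hypothetical.

```python
from collections import Counter

def distribution(values):
    """Return {value: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    n = len(values)
    return {value: (count, round(100 * count / n, 1))
            for value, count in counts.items()}

# Hypothetical responses, invented for this example only
gender = ["Female", "Male", "Female", "Female", "Male"]

for value, (count, percent) in distribution(gender).items():
    print(f"{value}: {count} ({percent}%)")
```

The counts and the percentages describe the same distribution; which one to report depends on whether the size of the group matters to the reader.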


Different graphs for displaying one variable

Categorical variables    Quantitative variables
Bar chart                Stem-and-leaf plot
Pie chart                Histogram
                         Boxplot

Which of these options we use depends on the data...


One categorical variable: bar chart

Categorical variable on the horizontal axis
Count or percentage on the vertical axis
Titles and labels essential!
Bars don’t touch
Can order bars alphabetically, from largest to smallest, etc.
Don’t add an artificial third dimension


Example bar chart
SPSS Exercise 5 explains how to draw bar charts

The bars could be ordered:

alphabetically
by size
by my preference
or any way I like

since the variable graphed is categorical


Get the little things right!
We look for these things when we mark assignments

Variable names on axes
Title
Scale on axes
Gaps between bars


The good and bad

Good things about bar charts:
    used for any categorical variable
    simple to construct

Bad things about bar charts:
    not so easy to see what fraction of the whole group a particular category is


One categorical variable: pie chart

Circle divided into areas of the appropriate size
Pie charts need all categories (must add to 100%)
Not all categorical data can be graphed with a pie chart
Don’t add an artificial third dimension
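One way to check the "must add to 100%" condition is to compare the category counts with the number of cases. The sketch below is only illustrative: the function name `covers_whole` is hypothetical, the major counts come from the business-major example later in this lecture, and the multiple-response "devices" data are invented.

```python
def covers_whole(counts, n_cases):
    """True when the category counts add up to the number of cases."""
    return sum(counts.values()) == n_cases

# Major counts from the business-major example in this lecture (n = 386)
majors = {"Accounting": 124, "Administration": 131,
          "Economics": 11, "Finance": 120}

# Hypothetical multiple-response question: 50 people, several answers each
devices = {"Phone": 45, "Laptop": 38, "Tablet": 20}

print(covers_whole(majors, 386))   # parts of a whole: a pie chart is possible
print(covers_whole(devices, 50))   # counts exceed 50: a bar chart is safer
```

When people can fall into more than one category, or some categories are left out, the segments no longer represent parts of one whole and a bar chart is the usual fallback.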


The good and bad

Good things about pie charts:
    used for showing parts of a whole

Bad things about pie charts:
    Pie charts are hard for the brain to understand
    Hard to compare the sizes of pie segments
    Adding an artificial third dimension makes them very misleading
    Can’t always use a pie chart


An example pie chart
SPSS Exercise 5 explains how to draw bar charts

This is one place where the pie chart can be drawn for the data
Notice it is hard to compare the sizes of the segments


A bad pie chart

The third dimension is unnecessary
The third dimension is misleading
Even harder to compare the sizes of the segments


Two categorical vars: Contingency tables

When we look at two categorical variables, use a contingency table
Contingency tables are also called two-way tables, or cross-tabulations (cross-tabs)
Can examine contingency tables in many ways:
    joint distributions
    marginal distributions
    conditional distributions


Consider an example
SPSS Exercise 4 explains how to construct contingency tables

Example: All 722 members of a senior class at the Uni. of Illinois were asked which business major they had chosen. Here are the data of those who responded:

                  Female   Male   Total
Accounting            68     56     124
Administration        91     40     131
Economics              5      6      11
Finance               61     59     120
Total                225    161     386
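The Total row and column can be rebuilt from the eight cell counts, which is a useful check when entering a table by hand. The following is a minimal Python sketch, not the SPSS procedure from Exercise 4; the variable names are chosen for illustration.

```python
# Cell counts copied from the table above: major -> (Female, Male)
counts = {
    "Accounting":     (68, 56),
    "Administration": (91, 40),
    "Economics":      (5, 6),
    "Finance":        (61, 59),
}

# Row totals: one per major
row_totals = {major: female + male
              for major, (female, male) in counts.items()}

# Column totals: one per gender
female_total = sum(female for female, _ in counts.values())
male_total = sum(male for _, male in counts.values())

# Grand total: the number of students who responded
grand_total = female_total + male_total

print(row_totals)
print(female_total, male_total, grand_total)
```

The grand total is 386, not 722, because only the students who responded appear in the table.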


Understanding the table

There are two categorical variables:
1 Gender of the student (with values ‘Female’ and ‘Male’)
2 The chosen business major (with values ‘Accounting’, ‘Administration’, etc.)

This is a 4 × 2 table: four rows and two columns (don’t count the Total row or column)


Questions

                  Female   Male   Total
Accounting            68     56     124
Administration        91     40     131
Economics              5      6      11
Finance               61     59     120
Total                225    161     386

What proportion of students are male Finance majors?
What proportion of students are female Economics majors?
What proportion of students are female?
What proportion of Accounting majors are male?
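The last question differs from the others because it restricts attention to one major, so the denominator is that major's row total rather than all 386 students. A short Python sketch of this conditional calculation follows; the function name `conditional_on_major` is hypothetical.

```python
# Cell counts from the table above
counts = {
    "Accounting":     {"Female": 68, "Male": 56},
    "Administration": {"Female": 91, "Male": 40},
    "Economics":      {"Female": 5,  "Male": 6},
    "Finance":        {"Female": 61, "Male": 59},
}

def conditional_on_major(major):
    """Percentage of each gender among the students in one major."""
    row = counts[major]
    row_total = sum(row.values())  # denominator: this major only, not 386
    return {gender: round(100 * count / row_total, 1)
            for gender, count in row.items()}

print(conditional_on_major("Accounting"))
```

For Accounting this gives 56 out of 124, or about 45.2% male, whereas 56 out of 386 (about 14.5%) answers the different question of what proportion of all students are male Accounting majors.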


The joint distribution

                  Female          Male            Total
Accounting        68/386 × 100    56/386 × 100
Administration    91/386 × 100    40/386 × 100
Economics          5/386 × 100     6/386 × 100
Finance           61/386 × 100    59/386 × 100
Total                                             100%


The joint distribution

                  Female   Male    Total
Accounting        17.6%    14.5%
Administration    23.6%    10.4%
Economics          1.3%     1.6%
Finance           15.8%    15.3%
Total                              100%

15.3% of students are male finance majors.
1.3% of students are female economics majors.
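The joint percentages can be reproduced by dividing every cell by the grand total. The sketch below does this in Python with the counts from the example; it is an illustration of the arithmetic, not SPSS output.

```python
# Cell counts from the business-major example
counts = {
    "Accounting":     {"Female": 68, "Male": 56},
    "Administration": {"Female": 91, "Male": 40},
    "Economics":      {"Female": 5,  "Male": 6},
    "Finance":        {"Female": 61, "Male": 59},
}

# Grand total of all cells (386 respondents)
n = sum(sum(row.values()) for row in counts.values())

# Joint distribution: every cell as a percentage of the grand total
joint = {major: {gender: round(100 * count / n, 1)
                 for gender, count in row.items()}
         for major, row in counts.items()}

for major, row in joint.items():
    print(f"{major:<15}{row['Female']:>6}%{row['Male']:>6}%")
```

Every cell shares the same denominator, which is why the eight percentages together account for all the respondents.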


The marginal distribution by rows

                  Female   Male   Total
Accounting                        124/386 × 100
Administration                    131/386 × 100
Economics                          11/386 × 100
Finance                           120/386 × 100
Total                             100%


The marginal distribution by rows

Major            Female   Male   Total
Accounting                       32.1%
Administration                   33.9%
Economics                        2.8%
Finance                          31.1%
Total                            100%

33.9% of students are administration majors
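As a rough sketch of the calculation behind this table, the Python below sums each row of counts and divides by the grand total. The counts are those of the lecture's contingency table; `row_totals` and `marginal_by_major` are names chosen here for illustration.

```python
# Cell counts from the lecture's gender-by-major contingency table.
counts = {
    "Accounting":     {"Female": 68, "Male": 56},
    "Administration": {"Female": 91, "Male": 40},
    "Economics":      {"Female": 5,  "Male": 6},
    "Finance":        {"Female": 61, "Male": 59},
}

# Row totals: number of students in each major, ignoring gender.
row_totals = {major: sum(row.values()) for major, row in counts.items()}
grand_total = sum(row_totals.values())

# Marginal distribution of major: row total as a percentage of all students.
marginal_by_major = {
    major: round(100 * total / grand_total, 1)
    for major, total in row_totals.items()
}
print(marginal_by_major)
```

Because each percentage is rounded separately, the four values need not add to exactly 100%.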

Page 162: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

Can draw a bar chart of major

[Figure: Bar plot of choice of majors (n=386). Horizontal axis: Accnt., Admin., Econ., Fin.; vertical axis: Per cent.]

Page 163: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The marginal distribution by column

Major            Female          Male            Total
Accounting
Administration
Economics
Finance
Total            225/386 × 100   161/386 × 100   100%

Page 164: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The marginal distribution by column

Female Male Total

Accounting

Administration

Economics

Finance

Total 58.3% 41.7% 100%

58.3% of students are female
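The column marginals work the same way, but the counts are added down each column instead of across each row. A minimal sketch, assuming the lecture's counts stored as `(major, female, male)` tuples (a layout chosen here only for illustration):

```python
# (major, female count, male count) rows from the lecture's contingency table.
rows = [
    ("Accounting", 68, 56),
    ("Administration", 91, 40),
    ("Economics", 5, 6),
    ("Finance", 61, 59),
]

# Column totals: add each gender's counts down the column.
female_total = sum(female for _, female, _ in rows)
male_total = sum(male for _, _, male in rows)
grand_total = female_total + male_total

# Marginal distribution of gender: column total as a percentage of all students.
marginal_by_gender = {
    "Female": round(100 * female_total / grand_total, 1),
    "Male": round(100 * male_total / grand_total, 1),
}
print(marginal_by_gender)
```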

Page 168: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

Conditional distributions

What proportion of the Accounting majors are males?

Major            Female   Male   Total
Accounting           68     56     124
Administration       91     40     131
Economics             5      6      11
Finance              61     59     120
Total               225    161     386

Of the 124 Accounting majors, 56 are male.

Page 169: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

Conditional distributions

What proportion of the Accounting majors are males?

Major            Female   Male   Total
Accounting           68     56     124
Administration       91     40     131
Economics             5      6      11
Finance              61     59     120
Total               225    161     386

The answer is 56/124 × 100 = 45.2%
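Written out as a worked equation, the calculation conditions on the Accounting row: the denominator is that row's total (124), not the grand total (386). The wording of the terms below is illustrative and is not taken from the slides.

```latex
\[
\text{percent male among Accounting majors}
  = \frac{\text{male Accounting majors}}{\text{all Accounting majors}} \times 100
  = \frac{56}{124} \times 100
  \approx 45.2\%
\]
```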

Page 170: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by rows

Major            Female          Male            Total
Accounting       68/124 × 100    56/124 × 100    100%
Administration   91/131 × 100    40/131 × 100    100%
Economics        5/11 × 100      6/11 × 100      100%
Finance          61/120 × 100    59/120 × 100    100%

Page 172: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by rows

Female Male Total

Accounting 54.8% 45.2% 100%

Administration 69.5% 30.5% 100%

Economics 45.5% 54.5% 100%

Finance 50.8% 49.2% 100%

The male–female ratio is similar for all majors, except Admin.
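The row conditional percentages in this table can be recomputed by dividing each cell by its own row total. The sketch below assumes the lecture's cell counts; the helper name `row_conditional` is hypothetical.

```python
# Cell counts from the lecture's gender-by-major contingency table.
counts = {
    "Accounting":     {"Female": 68, "Male": 56},
    "Administration": {"Female": 91, "Male": 40},
    "Economics":      {"Female": 5,  "Male": 6},
    "Finance":        {"Female": 61, "Male": 59},
}

def row_conditional(row):
    """Percentages within one row: each cell divided by that row's total."""
    total = sum(row.values())
    return {sex: round(100 * n / total, 1) for sex, n in row.items()}

# One conditional distribution of gender for each major.
conditional_by_major = {
    major: row_conditional(row) for major, row in counts.items()
}

for major, dist in conditional_by_major.items():
    print(f"{major:<15}{dist['Female']:>6}%{dist['Male']:>6}%")
```

Each major gets its own denominator here, which is why every row adds to 100%.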

Page 173: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by rows

Female Male Total

Accounting 54.8% 45.2% 100%

Administration 69.5% 30.5% 100%

Economics 45.5% 54.5% 100%

Finance 50.8% 49.2% 100%

45.2% of accounting majors are male.

Page 175: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by rows

Female Male Total

Accounting 54.8% 45.2% 100%

Administration 69.5% 30.5% 100%

Economics 45.5% 54.5% 100%

Finance 50.8% 49.2% 100%

50.8% of finance majors are female.

Page 177: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

Can draw a stacked bar chart
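The slide's figure is not preserved in this transcript. As a stand-in, here is a text-only sketch of a stacked bar chart, assuming the bars show the female/male split within each major (the row conditional distribution). The function `stacked_bar` and the 40-character width are choices made for this illustration.

```python
# Cell counts from the lecture's gender-by-major contingency table.
counts = {
    "Accounting":     {"Female": 68, "Male": 56},
    "Administration": {"Female": 91, "Male": 40},
    "Economics":      {"Female": 5,  "Male": 6},
    "Finance":        {"Female": 61, "Male": 59},
}

def stacked_bar(female, male, width=40):
    """Return a text bar of `width` characters split in proportion to the counts."""
    n_female = round(width * female / (female + male))
    return "F" * n_female + "M" * (width - n_female)

# One equal-length bar per major, so only the split differs between bars.
bars = {
    major: stacked_bar(row["Female"], row["Male"])
    for major, row in counts.items()
}

for major, bar in bars.items():
    print(f"{major:<15}{bar}")
```

Because all bars have the same length, a major whose split differs from the others stands out at a glance.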

Page 178: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by column

Major            Female          Male            Total
Accounting       68/225 × 100    56/161 × 100    124/386 × 100
Administration   91/225 × 100    40/161 × 100    131/386 × 100
Economics        5/225 × 100     6/161 × 100     11/386 × 100
Finance          61/225 × 100    59/161 × 100    120/386 × 100
Total            100%            100%

Page 180: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by column

Female Male

Accounting 30.2% 34.8%

Administration 40.4% 24.8%

Economics 2.2% 3.7%

Finance 27.1% 36.6%

Total 100% 100%

Females divide into majors in a similar way to the males, apart from Administration.
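The column conditional percentages in this table divide each cell by its column total (225 females, 161 males). A minimal sketch under the same assumed data layout as the lecture's table; `conditional_by_gender` is an illustrative name.

```python
# Cell counts from the lecture's gender-by-major contingency table.
counts = {
    "Accounting":     {"Female": 68, "Male": 56},
    "Administration": {"Female": 91, "Male": 40},
    "Economics":      {"Female": 5,  "Male": 6},
    "Finance":        {"Female": 61, "Male": 59},
}
sexes = ("Female", "Male")

# Column totals: all females and all males, across majors.
column_totals = {
    sex: sum(row[sex] for row in counts.values()) for sex in sexes
}

# One conditional distribution of major for each gender.
conditional_by_gender = {
    sex: {
        major: round(100 * row[sex] / column_totals[sex], 1)
        for major, row in counts.items()
    }
    for sex in sexes
}

for sex in sexes:
    print(sex, conditional_by_gender[sex])
```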

Page 181: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by column

Female Male

Accounting 30.2% 34.8%

Administration 40.4% 24.8%

Economics 2.2% 3.7%

Finance 27.1% 36.6%

Total 100% 100%

30.2% of females study accounting.

Page 183: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

The conditional distribution by column

Female Male

Accounting 30.2% 34.8%

Administration 40.4% 24.8%

Economics 2.2% 3.7%

Finance 27.1% 36.6%

Total 100% 100%

24.8% of males study administration.

Page 185: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

About contingency tables

The marginal distribution of the row variable comes from the row totals
The marginal distribution of the column variable comes from the column totals
The cell counts give rise to the joint distribution of the row and column variables

Page 188: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

About contingency tables

Each row of counts produces a conditional distribution
Each column of counts produces a conditional distribution
The conditional distributions provide evidence for or against an association between the variables

Page 191: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

Associations between variables

Example
Is there an association between gender and major?
Yes: the row conditional distributions are not the same.
The female–male split is similar for all majors except for Administration.

Female Male Total

Accounting 54.8% 45.2% 100%

Administration 69.5% 30.5% 100%

Economics 45.5% 54.5% 100%

Finance 50.8% 49.2% 100%
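One rough way to put a number on "not the same" is to look at how far the female share varies between majors, with and without Administration. This is only a descriptive sketch, not a formal test of association, and the helper `spread` is a hypothetical name introduced here.

```python
# Cell counts from the lecture's gender-by-major contingency table.
counts = {
    "Accounting":     {"Female": 68, "Male": 56},
    "Administration": {"Female": 91, "Male": 40},
    "Economics":      {"Female": 5,  "Male": 6},
    "Finance":        {"Female": 61, "Male": 59},
}

# Percent female within each major (from the row conditional distributions).
female_share = {
    major: 100 * row["Female"] / (row["Female"] + row["Male"])
    for major, row in counts.items()
}

def spread(shares):
    """Largest minus smallest share, in percentage points."""
    shares = list(shares)  # allow generators as well as lists
    return max(shares) - min(shares)

spread_all = spread(female_share.values())
spread_without_admin = spread(
    share for major, share in female_share.items() if major != "Administration"
)

print(f"Spread across all majors:      {spread_all:.1f} points")
print(f"Spread without Administration: {spread_without_admin:.1f} points")
```

With no association at all, every major would have the same female share and the spread would be zero. Economics has only 11 students, so its percentages move a lot with each individual student.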

Page 193: Data Analysis STA2300 - Transtutors · Data Analysis STA2300 Professor Shahjahan Khan, PhD School of Agricultural, Computational and Environmental Sciences Faculty of Health, Engineering

Keeping up

Material covered today: §Module 1 (Study Book)
Before next week, read §Module 2 (Study Book)
Tutorials commence this week:
    bring textbook; study book (or Module 1) with tutorial questions; calculator
Assignment 1 is due next week
Remember to check The Learning Centre website for additional assistance
