Algorithms for Modern Data Sets (COMP 3801)

Post on 12-Jul-2022

5 views 0 download

Transcript of Algorithms for Modern Data Sets (COMP 3801)

Algorithms for Modern Data Sets(COMP 3801)

Anil Maheshwari

anil@scs.carleton.caSchool of Computer ScienceCarleton University

1

Outline

Main Topics

Required Background

HyFlex Teaching

Course Evaluation

Help Me

2

Main Topics

Motivating Algorithmic Question

1. How search engines search?

2. How do recommendation systems recommend?

3. How to find a matching fingerprint?

4. How to find similar documents?

5. How to detect junk emails?

6. How to find what is trending in last week or last month?

7. How to find nearest neighbor of a query point in high dimensional data?

8. How to find communities on the web?

9. . . .

3

Our Focus

• Algorithmic Techniques

• Key Ideas

• Learn bit of Math

• High-level details

4

Required Background

Background

• Basic Data StructuresLists, Stacks, BST, Hashing, Searching and Sorting

See Pat Morin’s https://opendatastructures.org/

• Basic Probability TheoryExpectation, Variance, Concentration Bounds

Michiel Smid’s COMP 2804 Notes http://cglab.ca/~michiel/DiscreteStructures/

Joe Blitzstein https://projects.iq.harvard.edu/stat110/youtube

• Linear AlgebraEigenvalues and Eigenvectors, Null Space, Positive Definitive Matrices, QΛQT , SVD

Gilbert Strang

https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/

• Algorithmic TechniquesPseudocode, Correctness, Analysis (Recurrences + O,Ω-notation), Analysis of (randomized) Quicksort. Algorithms for

Connectivity, Strong Connectivity, MST, SSSP, Graph Traversal. Techniques: D& C, Greedy, Dynamic Programming.

5

HyFlex Teaching

(Off/On)line Course

• Lectures twice a week over zoom either from the HyFlex room or frommy office

• Ample references will be provided1. Mining of Massive Data Sets http://www.mmds.org/2. Notes on Algorithm Design on my web-page

http://people.scs.carleton.ca/~maheshwa/3. Numerous research articles

(accessible from Carleton Library/Web).4. Pointers to useful Videos.

6

Office Hours

Likely during and after the class, and setup over e-mail.

7

Course Evaluation

Evaluation Parameters

Principles

• Open-ended course by design

• Contents evolve

• Enjoyable learning experience

• Use forums effectively

Evaluation Components

• 4 Assignments: 40%

• 2 Tests: 24%

• Final Exam: 36%

8

Help Me

PLEASE, PLEASE, PLEASE, PLEASE

1. Post your questions on Forums

2. Post useful links/references/ideas

3. Take initiative to learn

4. E-mail now if you want to learn some specific algorithmic technique

5. Seek Help, Have Fun & Smile

9

Fall 2021

10