Algorithms for Modern Data Sets (COMP 3801)

15
Algorithms for Modern Data Sets (COMP 3801) Anil Maheshwari [email protected] School of Computer Science Carleton University 1

Transcript of Algorithms for Modern Data Sets (COMP 3801)

Page 1: Algorithms for Modern Data Sets (COMP 3801)

Algorithms for Modern Data Sets(COMP 3801)

Anil Maheshwari

[email protected] of Computer ScienceCarleton University

1

Page 2: Algorithms for Modern Data Sets (COMP 3801)

Outline

Main Topics

Required Background

HyFlex Teaching

Course Evaluation

Help Me

2

Page 3: Algorithms for Modern Data Sets (COMP 3801)

Main Topics

Page 4: Algorithms for Modern Data Sets (COMP 3801)

Motivating Algorithmic Question

1. How search engines search?

2. How do recommendation systems recommend?

3. How to find a matching fingerprint?

4. How to find similar documents?

5. How to detect junk emails?

6. How to find what is trending in last week or last month?

7. How to find nearest neighbor of a query point in high dimensional data?

8. How to find communities on the web?

9. . . .

3

Page 5: Algorithms for Modern Data Sets (COMP 3801)

Our Focus

• Algorithmic Techniques

• Key Ideas

• Learn bit of Math

• High-level details

4

Page 6: Algorithms for Modern Data Sets (COMP 3801)

Required Background

Page 7: Algorithms for Modern Data Sets (COMP 3801)

Background

• Basic Data StructuresLists, Stacks, BST, Hashing, Searching and Sorting

See Pat Morin’s https://opendatastructures.org/

• Basic Probability TheoryExpectation, Variance, Concentration Bounds

Michiel Smid’s COMP 2804 Notes http://cglab.ca/~michiel/DiscreteStructures/

Joe Blitzstein https://projects.iq.harvard.edu/stat110/youtube

• Linear AlgebraEigenvalues and Eigenvectors, Null Space, Positive Definitive Matrices, QΛQT , SVD

Gilbert Strang

https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/

• Algorithmic TechniquesPseudocode, Correctness, Analysis (Recurrences + O,Ω-notation), Analysis of (randomized) Quicksort. Algorithms for

Connectivity, Strong Connectivity, MST, SSSP, Graph Traversal. Techniques: D& C, Greedy, Dynamic Programming.

5

Page 8: Algorithms for Modern Data Sets (COMP 3801)

HyFlex Teaching

Page 9: Algorithms for Modern Data Sets (COMP 3801)

(Off/On)line Course

• Lectures twice a week over zoom either from the HyFlex room or frommy office

• Ample references will be provided1. Mining of Massive Data Sets http://www.mmds.org/2. Notes on Algorithm Design on my web-page

http://people.scs.carleton.ca/~maheshwa/3. Numerous research articles

(accessible from Carleton Library/Web).4. Pointers to useful Videos.

6

Page 10: Algorithms for Modern Data Sets (COMP 3801)

Office Hours

Likely during and after the class, and setup over e-mail.

7

Page 11: Algorithms for Modern Data Sets (COMP 3801)

Course Evaluation

Page 12: Algorithms for Modern Data Sets (COMP 3801)

Evaluation Parameters

Principles

• Open-ended course by design

• Contents evolve

• Enjoyable learning experience

• Use forums effectively

Evaluation Components

• 4 Assignments: 40%

• 2 Tests: 24%

• Final Exam: 36%

8

Page 13: Algorithms for Modern Data Sets (COMP 3801)

Help Me

Page 14: Algorithms for Modern Data Sets (COMP 3801)

PLEASE, PLEASE, PLEASE, PLEASE

1. Post your questions on Forums

2. Post useful links/references/ideas

3. Take initiative to learn

4. E-mail now if you want to learn some specific algorithmic technique

5. Seek Help, Have Fun & Smile

9

Page 15: Algorithms for Modern Data Sets (COMP 3801)

Fall 2021

10