INF 141 COURSE SUMMARY Crista Lopes. Lecture Objective Know what you know.


Lecture Objective

Know what you know

Problem space of this course: “Big Data”

How to collect it, index it, and search it for relevant information

Industry segment of this course: search engines

Google, MS Bing, nameless others

Web information retrieval is big $$$

Technical content of this course

Engineering; math

Lecture 2

Search engines history; search & advertising on the Web; Web corpus

Lecture 3

Characteristics of the Web: duplication, linkage, spam; how big; rate of change; evolution

Characteristics of Web search users: reasons for searching; characteristics of queries used; behavior towards results; the need behind the query

Lecture 4

Search Engine Optimization

Lectures 5 and 6

Web crawling: architecture of a crawling infrastructure; algorithms; constraints

Lectures 7, 8 and 9

Index construction: what an index is; efficient data structures; efficient algorithms for constructing it
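As a rough illustration of what an index is, the sketch below builds a small in-memory inverted index: a mapping from each term to the sorted list of documents that contain it. The function name `build_inverted_index`, the whitespace tokenizer, and the sample documents are assumptions made for this example, not material from the lectures.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a sorted list of IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (assumption): lowercase, split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    # Sorted postings lists make later merging (AND/OR queries) efficient.
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical toy corpus.
docs = {1: "web search engines", 2: "web crawling", 3: "search the web"}
index = build_inverted_index(docs)
print(index["web"])     # [1, 2, 3]
print(index["search"])  # [1, 3]
```

Real index construction has to cope with corpora that do not fit in memory, which is where the efficient algorithms mentioned on the slide come in; this sketch ignores that constraint.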

Lecture 10

Map Reduce; index compression
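One common way to compress a postings list is to store the gaps between consecutive document IDs and encode each gap with a variable number of bytes. The slide does not say which scheme was covered, so the choice of variable-byte encoding here is an assumption, and the function names are hypothetical.

```python
def vb_encode_number(n):
    """Variable-byte encode one non-negative integer (7 payload bits per byte)."""
    out = [n % 128]
    n //= 128
    while n > 0:
        out.insert(0, n % 128)
        n //= 128
    out[-1] += 128  # high bit set on the final byte marks the end of the number
    return bytes(out)

def compress_postings(doc_ids):
    """Encode a sorted list of doc IDs as variable-byte encoded gaps."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        out += vb_encode_number(doc_id - prev)
        prev = doc_id
    return bytes(out)

def decompress_postings(data):
    """Invert compress_postings: decode gaps and rebuild the doc IDs."""
    doc_ids, n, prev = [], 0, 0
    for byte in data:
        if byte < 128:
            n = n * 128 + byte          # continuation byte
        else:
            n = n * 128 + (byte - 128)  # final byte of this gap
            prev += n
            doc_ids.append(prev)
            n = 0
    return doc_ids

postings = [824, 829, 215406]
packed = compress_postings(postings)
print(len(packed))                   # 6
print(decompress_postings(packed))   # [824, 829, 215406]
```

Small gaps, which are typical for frequent terms, fit in a single byte, which is where the savings come from.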

Lecture 11

Retrieval: Boolean; zones; TF metrics
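A Boolean AND query over an inverted index comes down to intersecting two sorted postings lists. The sketch below shows the usual linear-time merge; the function name `intersect` and the sample postings are assumptions for illustration.

```python
def intersect(p1, p2):
    """Return the doc IDs present in both sorted postings lists."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1  # advance the list with the smaller current ID
        else:
            j += 1
    return result

# Hypothetical postings for two query terms.
print(intersect([1, 2, 4, 11, 31, 45, 173, 174], [2, 31, 54, 101]))  # [2, 31]
```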

Lecture 12

Ranked retrieval: weighting fields

Lecture 13

Better scoring: TF-IDF; corpus-wide statistics
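TF-IDF combines how often a term occurs in a document with how rare the term is across the corpus. Several weighting variants exist; the sketch below assumes the simplest one, raw term frequency times log10(N / df), and the function name `tf_idf` and the toy corpus are made up for this example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log10(N / df)."""
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]  # naive tokenizer (assumption)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log10(n / df[t]) for t in tf})
    return weights

corpus = ["web search", "web crawling", "web web index"]
weights = tf_idf(corpus)
print(weights[0]["web"])  # 0.0, since "web" appears in every document
```

A term that occurs in every document gets weight zero no matter how often it appears, which is the corpus-wide correction that plain term frequency lacks.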

Lecture 14

Vector space model: score by magnitude (Euclidean distance); score by angle (cosine similarity)
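The two scoring options can be compared on a pair of term-weight vectors. This is a minimal sketch assuming documents and queries are already represented as equal-length lists of weights; the function names and sample vectors are hypothetical.

```python
import math

def euclidean_distance(u, v):
    """Distance between the vector tips; sensitive to vector length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; independent of vector length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

query = [1.0, 1.0]
doc = [3.0, 3.0]  # same term proportions as the query, just a longer document
print(round(euclidean_distance(query, doc), 3))  # 2.828
print(round(cosine_similarity(query, doc), 6))   # 1.0
```

The document has the same direction as the query, so the angle-based score is maximal even though the Euclidean distance is large, which illustrates why scoring by angle is less biased by document length.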

Lecture 15

Language statistics

Language processing: tokenizing; stemming; stopping

Link analysis: PageRank
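PageRank can be approximated by repeatedly redistributing each page's score along its outgoing links. The sketch below is a simplified power iteration, assuming a damping factor of 0.85, a fixed number of iterations, and a link graph in which every link target also appears as a key; none of these choices come from the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: rank}; ranks sum to 1."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
            else:
                # Dangling page (no out-links): spread its rank over all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank

# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
print(max(ranks, key=ranks.get))  # c
```

Page c is linked from both a and b, so it ends up with the highest rank, while b, which only receives half of a's score, ends up lowest.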

Lecture 16

Hadoop

Lecture 17

Retrieval performance: precision and recall

Latent Semantic Analysis: Singular Value Decomposition
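Precision and recall, the retrieval performance measures named in this lecture, can be computed from the set of retrieved documents and the set of relevant ones. The helper name `precision_recall` and the sample document IDs below are assumptions for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 4 documents returned, 5 documents actually relevant.
print(precision_recall([1, 2, 3, 4], [2, 4, 6, 8, 10]))  # (0.5, 0.4)
```

Returning more documents tends to raise recall and lower precision, so the two are usually reported together.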

Lecture 18

Retrieval on LSI

Use of Latent Dirichlet Allocation (LDA)

All together

Search engines history
Search & advertising on the Web
Web corpus
Characteristics of the Web
Characteristics of Web search users
Search Engine Optimization
Web crawling
Index construction
Index compression
Map Reduce
Boolean retrieval
Parametric retrieval
Scored retrieval
TF-IDF and corpus-wide statistics
Language statistics
Language processing
Link analysis (PageRank)
Hadoop
Precision and Recall
LSI (and LDA)

Big Data jobs

plenty… not just traditional search

making sense of data

search on Google

Where to go from here

Data mining; machine learning