Introduction to Information Retrieval - Kangwoncs.kangwon.ac.kr/~leeck/IR/PostingList.pdf ·...
Introduction to Information Retrieval PostingList Park Cheon Eum
Introduction to Information Retrieval
PostingList
Park Cheon Eum
Algorithm
Start
Input: doc1, …, doc10
split(doc1, …, doc10)
doc1, …, doc10 < id (tag each token with its document ID)
append(docs, doc1, …, doc10)
sort, uniq
posting
Posting result
End
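The flow above can be sketched in a few lines of Python. This is a minimal sketch, not the author's implementation: it assumes lowercasing plus whitespace splitting stands in for split(), that document IDs are 1, 2, … in input order, and the name build_postings and the sample strings are hypothetical.

```python
def build_postings(docs):
    """Map each term to the sorted list of IDs of the documents containing it."""
    pairs = []                                   # plays the role of "docs" in the flow
    for doc_id, text in enumerate(docs, start=1):
        for token in text.lower().split():       # split: assumed whitespace tokenizer
            pairs.append((token, doc_id))        # tag the token with its document ID
    postings = {}
    for term, doc_id in sorted(set(pairs)):      # sort, uniq
        postings.setdefault(term, []).append(doc_id)
    return postings


sample = ["new home sales", "home sales rise"]   # hypothetical documents
index = build_postings(sample)
print(index["home"])   # [1, 2]
```

Because the pairs are sorted by (term, ID) before grouping, every posting list comes out in increasing ID order.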
Algorithm - Indexer steps: Token sequence
Split each document's content into tokens and tag every token with its document ID.
Doc 1: I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.
Doc 2: So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious
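As an illustration of this step, the two documents can be turned into a sequence of (token, ID) pairs. The tokenizer here is an assumption (lowercase, then keep runs of letters and apostrophes), and the names DOCS and token_sequence are hypothetical.

```python
import re

DOCS = {
    1: "I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious",
}


def token_sequence(docs):
    """Return (token, doc_id) pairs in document order."""
    pairs = []
    for doc_id, text in docs.items():
        # Assumed tokenizer: lowercase, keep runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", text.lower()):
            pairs.append((token, doc_id))
    return pairs


pairs = token_sequence(DOCS)
print(pairs[:5])
# [('i', 1), ('did', 1), ('enact', 1), ('julius', 1), ('caesar', 1)]
```

Duplicates such as ('caesar', 2) are kept at this stage; they are only merged in the later sort and uniq step.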
Algorithm - Indexer steps: Dictionary & Postings
Same term && same ID: keep only one entry (= frequency).
Same term && different ID: add the ID to the term's postings.
Sec. 1.2
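The two rules above might be sketched as follows. This sketch assumes that "frequency" means document frequency (the number of distinct IDs per term), which the slide does not state explicitly. The input pairs and the name dictionary_and_postings are hypothetical.

```python
from itertools import groupby

# Hypothetical (term, doc_id) pairs, as produced by a token-sequence step.
pairs = [("caesar", 1), ("brutus", 1), ("killed", 1), ("killed", 1),
         ("caesar", 2), ("brutus", 2), ("caesar", 2), ("noble", 2)]


def dictionary_and_postings(pairs):
    """Return {term: (doc_freq, [doc_id, ...])} from (term, doc_id) pairs."""
    unique = sorted(set(pairs))                  # same term && same ID: keep one
    index = {}
    for term, group in groupby(unique, key=lambda p: p[0]):
        ids = [doc_id for _, doc_id in group]    # same term && different ID: posting
        index[term] = (len(ids), ids)            # assumed: frequency = document frequency
    return index


index = dictionary_and_postings(pairs)
print(index["caesar"])   # (2, [1, 2])
print(index["killed"])   # (1, [1])
```

Sorting before groupby matters: groupby only merges adjacent items, so unsorted input would split one term into several groups.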