Named entity Fusepool project

Fusepool Named Entity Recognition Gábor Reményi, GeoX


Page 1: Named entity Fusepool project

Fusepool Named Entity Recognition
Gábor Reményi, GeoX

Page 2: Named entity Fusepool project

Named Entity Recognition (NER)

● Named Entities
  o Persons, Organizations, Locations, Diseases, Products, etc.
  o The aim of NER is to locate entities in unstructured text documents

● Domains (NER models)
  o News-based
  o Mobile technology
  o Chemical elements
  o Cancer
  o Diseases

Page 3: Named entity Fusepool project

Named Entity Recognition (NER) 2.

● Uses statistical models to locate entities in texts

● Predicts the entities based on the context of the text
  o Can recognize new entities (entities outside of the training data)
  o False positive entities
  o False negative entities

● Creating new models is very time-consuming
  o Well-defined domain
  o List of entities from the domain
  o Considerable amount of annotated training text
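The three outcomes named above (new entities, false positives, false negatives) can be illustrated with plain set operations. This is only a sketch: the function name `compare_entities` and all entity lists below are invented for illustration and are not output of any Fusepool model.

```python
def compare_entities(predicted, annotated, training_entities):
    """Compare a model's predictions with a hand-annotated reference.

    All three arguments are collections of (text, type) pairs.
    """
    predicted = set(predicted)
    annotated = set(annotated)
    training_entities = set(training_entities)
    return {
        # Correct predictions that never occurred in the training data.
        "new": (predicted & annotated) - training_entities,
        # Predicted by the model but absent from the annotation.
        "false_positive": predicted - annotated,
        # Present in the annotation but missed by the model.
        "false_negative": annotated - predicted,
    }


# Hypothetical values, made up for this example only.
training_entities = {("Hungary", "Location")}
annotated = {
    ("Gabor Remenyi", "Person"),
    ("Kiskunfalva", "Location"),
    ("Hungary", "Location"),
}
predicted = {
    ("Gabor Remenyi", "Person"),
    ("Kiskunfalva", "Location"),
    ("Iron", "Location"),
}

result = compare_entities(predicted, annotated, training_entities)
for outcome, entities in result.items():
    print(outcome, sorted(entities))
```

In this made-up case the model finds two entities it never saw in training, wrongly tags "Iron" as a location, and misses "Hungary".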

Page 4: Named entity Fusepool project

Dictionary Matching (SMA)

● Aho-Corasick dictionary-matching algorithm to locate keywords in texts
  o Alternative solution for entity extraction
  o Not model based, no training

● Digital search tree
  o Allows very fast search

● No prediction, only matching
  o Cannot find new keywords
  o No false positive or false negative entities
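A minimal sketch of Aho-Corasick matching may help make the points above concrete. It is a textbook-style version written for illustration, not the Fusepool SMA implementation; the function names and the four-word dictionary are assumptions.

```python
from collections import deque


def build_automaton(keywords):
    """Build the keyword trie (digital search tree) plus failure links."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> keywords that end at this state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Breadth-first pass: depth-1 states keep fail = 0, deeper ones are computed.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_keywords(text, automaton):
    """Return (start_index, keyword) for every dictionary hit in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            matches.append((i - len(word) + 1, word))
    return matches


# Hypothetical dictionary, chosen only for this example.
automaton = build_automaton(["Seattle", "Washington", "Budapest", "Hungary"])
text = ("Gabor Remenyi lives in Kiskunfalva. Gabor has a company named "
        "Remenyi Iron Co. that is the biggest iron producer in Hungary.")
print([word for _, word in find_keywords(text, automaton)])
```

With this dictionary only "Hungary" is reported for the sample sentence: "Kiskunfalva" is not in the dictionary, so it cannot be found. The text is scanned once, character by character, regardless of dictionary size. Note that this sketch matches raw substrings and does not check word boundaries.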

Page 5: Named entity Fusepool project

NER versus SMA examples 1.

http://82.141.158.251/cner_v1/

1.: Mr. Washington lives in Seattle. He has a company named Washington Iron Co. that is the biggest iron producer in Washington.

2.: Gabor Remenyi lives in Budapest. Gabor has a company named Remenyi Iron Co. that is the biggest iron producer in Hungary.

3.: Gabor Remenyi lives in Kiskunfalva. Gabor has a company named Remenyi Iron Co. that is the biggest iron producer in Hungary.

4.: Gabor Remenyi lives with Kiskunfalva. Gabor has a company named Remenyi Iron Co. that is the biggest iron producer in Hungary.

Page 6: Named entity Fusepool project

NER versus SMA examples 2.

● New entities
● False positive
● False negative