Named entity Fusepool project
Transcript of Named entity Fusepool project
Fusepool
Named Entity Recognition
Gábor Reményi, GeoX
Named Entity Recognition (NER)
● Named Entities
  o Persons, Organizations, Locations, Diseases, Products, etc.
  o The aim of NER is to locate entities in unstructured text documents
● Domains (NER models)
  o News-based
  o Mobile technology
  o Chemical elements
  o Cancer
  o Diseases
Named Entity Recognition (NER) 2.
● Uses statistical models to locate entities in texts
● Predicts the entities based on the context of the text
  o Can recognize new entities (entities outside of the training data)
  o False positive entities
  o False negative entities
● Creating new models is very time-consuming
  o Well-defined domain
  o List of entities from the domain
  o Considerable amount of annotated training text
Dictionary Matching (SMA)
● Aho-Corasick dictionary-matching algorithm to locate keywords in texts
  o Alternative solution for entity extraction
  o Not model based, no training
● Digital search tree
  o Allows very fast search
● No prediction, only matching
  o Cannot find new keywords
  o No false +/- entities
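As a rough illustration of the dictionary-matching idea above, here is a minimal Aho-Corasick sketch in Python. It is not the Fusepool SMA implementation. The function names (build_automaton, find_keywords) and the keyword list are assumptions made for this example. The keywords are stored in a digital search tree (trie), and failure links let the text be scanned in a single pass.

```python
from collections import deque


def build_automaton(keywords):
    """Build the trie, failure links and output lists for the keywords."""
    goto = [{}]   # goto[node][char] -> child node (the digital search tree)
    fail = [0]    # fail[node] -> node for the longest proper suffix in the trie
    out = [[]]    # out[node] -> keywords that end at this node
    for word in keywords:
        node = 0
        for ch in word:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(word)

    # Breadth-first pass: depth-1 nodes fail to the root, deeper nodes
    # follow their parent's failure link until the character fits.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure target.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_keywords(text, automaton):
    """Return (start_index, keyword) for every dictionary hit in text."""
    goto, fail, out = automaton
    node = 0
    matches = []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for word in out[node]:
            matches.append((i - len(word) + 1, word))
    return matches


# Hypothetical keyword list, for illustration only.
automaton = build_automaton(["Washington", "Seattle", "Budapest"])
print(find_keywords("Mr. Washington lives in Seattle.", automaton))
```

The automaton is built once per dictionary and can then be reused for any number of texts. Only strings present in the keyword list can ever be reported, which is the "no prediction, only matching" property.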
NER versus SMA
NER versus SMA examples 1.
http://82.141.158.251/cner_v1/
1.: Mr. Washington lives in Seattle. He has a company named Washington Iron Co. that is the biggest iron producer in Washington.
2.: Gabor Remenyi lives in Budapest. Gabor has a company named Remenyi Iron Co. that is the biggest iron producer in Hungary.
3.: Gabor Remenyi lives in Kiskunfalva. Gabor has a company named Remenyi Iron Co. that is the biggest iron producer in Hungary.
4.: Gabor Remenyi lives with Kiskunfalva. Gabor has a company named Remenyi Iron Co. that is the biggest iron producer in Hungary.
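To make the dictionary-matching side of these examples concrete, the sketch below runs a plain keyword lookup over sentences 1 to 3. It is an illustration only. The dictionary contents are hypothetical (the slides do not list the keywords behind the demo), and a regular-expression alternation stands in for the Aho-Corasick matcher.

```python
import re

# Hypothetical keyword dictionary; not taken from the Fusepool demo.
DICTIONARY = ["Washington", "Seattle", "Budapest", "Hungary"]

# Longest keywords first so that longer entries win over shorter prefixes.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in sorted(DICTIONARY, key=len, reverse=True)) + r")\b"
)


def match_dictionary(text):
    """Return every dictionary keyword found in text, in order of appearance."""
    return [m.group(1) for m in PATTERN.finditer(text)]


sentences = [
    "Mr. Washington lives in Seattle. He has a company named Washington Iron Co. "
    "that is the biggest iron producer in Washington.",
    "Gabor Remenyi lives in Budapest. Gabor has a company named Remenyi Iron Co. "
    "that is the biggest iron producer in Hungary.",
    "Gabor Remenyi lives in Kiskunfalva. Gabor has a company named Remenyi Iron Co. "
    "that is the biggest iron producer in Hungary.",
]

for number, sentence in enumerate(sentences, start=1):
    print(number, match_dictionary(sentence))
```

With this dictionary, all three occurrences of "Washington" in sentence 1 are reported as the same keyword because the lookup ignores context, and "Kiskunfalva" in sentence 3 is not reported because it is not in the dictionary.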
NER versus SMA
New entities
False positive
False negative
NER versus SMA examples 2.