Domain-Specific Insight Graphs (DIG)

Pedro Szekely, May 2017

Page 1

Domain-Specific Insight Graphs (DIG)

Pedro Szekely, May 2017

Page 2

dig.isi.edu

Page 3

Use the web to answer investigative questions

Page 4

Use Case: Human Trafficking

100 million pages, >5,000 Web sites

help victims & prosecute traffickers

Page 5

Investigating a Reported Victim

San Diego, where else?

Page 6

Locations Where A Potential Victim Was Advertised

Page 7

DIG Technology

raw · messy · disconnected
hard to query, analyze & visualize

clean · organized · linked
easy to query, analyze & visualize

Page 8

Steps To Build a DIG

Crawling

Extraction

Data Acquisition

Mapping To Ontology

Entity Linking & Similarity

Knowledge Graph Deployment

Query & Visualization

ElasticSearch

GraphDB

schema.org geonames

Page 9

Data Acquisition

batch · real-time

Web pages · Web service · database · CSV · Excel

XML · JSON
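
The source formats listed on this slide can be brought into one common record shape before extraction. The following is a minimal sketch using only the Python standard library. The helper name `load_records`, the field names, and the sample payloads are hypothetical and are not part of DIG.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def load_records(payload, fmt):
    """Parse a text payload into a list of flat dicts (hypothetical helper)."""
    if fmt == "csv":
        # The first CSV row is treated as the header.
        return [dict(row) for row in csv.DictReader(io.StringIO(payload))]
    if fmt == "json":
        data = json.loads(payload)
        # Accept either a single object or a list of objects.
        return data if isinstance(data, list) else [data]
    if fmt == "xml":
        root = ET.fromstring(payload)
        # Assumption: one child element per record, one grandchild per field.
        return [{field.tag: field.text for field in record} for record in root]
    raise ValueError(f"unsupported format: {fmt}")


# Hypothetical sample payloads that describe the same record.
csv_rows = load_records("site,pages\nexample.com,10\n", "csv")
json_rows = load_records('[{"site": "example.com", "pages": "10"}]', "json")
xml_rows = load_records(
    "<sites><s><site>example.com</site><pages>10</pages></s></sites>", "xml"
)
```

All three calls yield the same list of dictionaries, so later stages do not need to know which format a record came from.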

Page 10

Information Extraction

Text

Web pages

Web tables

Images

PDF

Page 11

“YOU don't wanna miss out on ME :) Perfect lil booty Green eyes Long curly black hair Im a Irish, Armenian and Filipino mixed princess :) ❤ Kim ❤7○7~7two7~7four77 ❤ HH 80 roses ❤ Hour 120 roses ❤ 15 mins 60 roses”

name: Kim
eye-color: green
hair-color: black
phone: 707-727-7477
rate: $60/15min, $80/30min, $120/60min
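
To illustrate how an obfuscated phone number and rate list like the ones above might be normalized, here is a minimal rule-based sketch. It is not DIG's extractor: the function names and regular expressions are assumptions made for this example, and "roses" is read as dollars only because the slide's extracted rates do so.

```python
import re

DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
WORD_RE = re.compile("|".join(DIGIT_WORDS), re.IGNORECASE)
# Duration label ("HH", "Hour", or "<n> mins") followed by "<price> roses".
RATE_RE = re.compile(r"(hh|hour|(\d+)\s*mins?)\s+(\d+)\s+roses", re.IGNORECASE)


def extract_phone(text):
    """Return the first 10-digit phone number as NNN-NNN-NNNN, or None."""
    cleaned = text.replace("○", "0")  # circle glyph used in place of zero
    cleaned = WORD_RE.sub(lambda m: DIGIT_WORDS[m.group(0).lower()], cleaned)
    match = re.search(r"\d(?:[~.\-]?\d){9}", cleaned)
    if match is None:
        return None
    digits = re.sub(r"\D", "", match.group(0))
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def extract_rates(text):
    """Return a dict mapping duration in minutes to price."""
    rates = {}
    for m in RATE_RE.finditer(text):
        label = m.group(1).lower()
        if label == "hh":        # assumed to mean half hour
            minutes = 30
        elif label == "hour":
            minutes = 60
        else:
            minutes = int(m.group(2))
        rates[minutes] = int(m.group(3))
    return rates


ad = "❤ Kim ❤7○7~7two7~7four77 ❤ HH 80 roses ❤ Hour 120 roses ❤ 15 mins 60 roses"
phone = extract_phone(ad)
rates = extract_rates(ad)
```

On the fragment of the ad quoted on this slide, `phone` and `rates` agree with the extracted values listed above. Hand-written rules like these are brittle, so a production extractor would need broader coverage of obfuscation styles.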

Page 14

Schema Alignment karma.isi.edu

Services

Relational Sources

{ JSON-LD }

Hierarchical Sources

Schema.org
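
As a rough illustration of what schema alignment produces, the sketch below maps an extracted record to a JSON-LD document that uses Schema.org terms. It is a hand-written mapping, not Karma's API or output format. The `FIELD_TO_SCHEMA` table and the `to_jsonld` helper are hypothetical; `name` and `telephone` are existing Schema.org properties of `Person`.

```python
import json

# Hypothetical mapping from extracted field names to Schema.org properties.
FIELD_TO_SCHEMA = {"name": "name", "phone": "telephone"}


def to_jsonld(record, schema_type="Person"):
    """Build a JSON-LD dict, keeping only fields that have a known mapping."""
    doc = {"@context": "http://schema.org", "@type": schema_type}
    for field, value in record.items():
        prop = FIELD_TO_SCHEMA.get(field)
        if prop is not None:
            doc[prop] = value
    return doc


record = {"name": "Kim", "phone": "707-727-7477", "hair-color": "black"}
doc = to_jsonld(record)
print(json.dumps(doc, indent=2))
```

Fields without a mapping, such as `hair-color` here, are dropped by this sketch. A real alignment would map them to a domain ontology term instead.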

Page 15

Linking Using Image Similarity
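
The slide does not say which similarity measure DIG uses. As a stand-in, the sketch below uses an average hash compared by Hamming distance, a simple and widely known technique, on tiny grayscale images given as nested lists. All function names and pixel values are hypothetical.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def similarity(pixels_a, pixels_b):
    """Return a score in [0, 1]; 1.0 means the hashes are identical."""
    hash_a, hash_b = average_hash(pixels_a), average_hash(pixels_b)
    return 1 - hamming_distance(hash_a, hash_b) / len(hash_a)


# Hypothetical 2x2 grayscale images (0-255).
image = [[10, 200], [220, 30]]
brighter = [[value + 20 for value in row] for row in image]  # same picture, lighter
different = [[200, 10], [30, 220]]                           # inverted pattern

same_score = similarity(image, brighter)
diff_score = similarity(image, different)
```

Because the hash depends only on each pixel's relation to the mean, a uniformly brightened copy scores as identical while the inverted pattern scores as fully different. Records whose images score above a chosen threshold could then be linked.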

Page 16

DIG Applications

Human Trafficking: Identify victims, prosecute traffickers

Cyber Attacks: Predict cyber attacks from dark web data

Firearms Trafficking: Identify illegal sales

Patents: Identify patent trolls

Securities Fraud: Identify fraudulent stocks in the Penny Stock market