Powerful Information Discovery with Big Knowledge Graphs - The Offshore Leaks Case
Ontotext, July 2016
Data - Content - User
• Psycho-graphic vs. demographic profiles
• Build behavioural profiles on the basis of semantic metadata associated with the assets
• Control results bias with runtime parameters
• Create semantic fingerprints of assets
• Driven off of a knowledge graph
• Automatically adapts through machine learning
• Semantic database
• Replication cluster for enterprise clients
• Connectors to 3rd-party indexing/storage products & hybrid queries
Data Layer - the Core
[Diagram: Semantic Fingerprints of Content; Instance Data (Relationships, Facts); Ontology (Schema, Domain Model). GraphDB node zoom-in showing Node 1, Node 3, Master 1 and Master 2 (Enterprise).]
Semantic Enrichment Overview
Personalization - User Actions Model
[Diagram: user actions (perform, comments, votes, posts, preview, read) linked to Article, Search Action and Result nodes through "contains", "leads to", "read" and "preview" edges; a Search Action carries Date, FTS Q, Tag, Cat and Tag set; also shown: results, cat taxonomy, Search Log.]
Quick news-analytics case
• Our Dynamic Semantic Publishing platform offers linking of text with big open data graphs
• One can navigate from text to concepts, get trends, related entities and news
• Try it at http://now.ontotext.com
FF-NEWS Data Integration and Loading
• DBpedia (the English version only): 496M statements
• Geonames (all geographic features on Earth): 150M statements
  - owl:sameAs links between DBpedia and Geonames: 471K statements
• Company registry data (GLEI): 3M statements
• News metadata (from NOW): 128M statements
• Total size: 986M statements
  - Mapped to FIBO; 667M explicit statements + 318M inferred statements
  - RDFRank and geo-spatial indices enabled to allow for ranking and efficient geo-spatial constraints
Open data integration for news analytics
Technology Semantic Content Enrichment
News Metadata
• Metadata from Ontotext's Dynamic Semantic Publishing platform
  - Automatically generated as part of the NOW.ontotext.com semantic news showcase
• News stream from Google since Feb 2015, about 10k news/month
  - ~70 tags (annotations) per news article
• Tags link text mentions of concepts to the knowledge graph
  - Technically these are URIs for entities (people, organizations, locations, etc.) and key phrases
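To make the idea of a tag concrete, here is a minimal sketch of an annotation that ties a text span to an entity URI. It is not the platform's actual pipeline: the `Tag` class, the `annotate` helper, the exact-match lookup and the `example.org` URIs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """One annotation: a character span in the text linked to an entity URI."""
    start: int
    end: int
    uri: str
    entity_type: str


# Hypothetical lookup table: surface form -> (URI, entity type).
GAZETTEER = {
    "Google": ("http://example.org/resource/Google", "Organization"),
    "London": ("http://example.org/resource/London", "Location"),
}

TEXT = "Google opened an office in London."


def annotate(text, gazetteer):
    """Naive exact-match tagging; a real system would disambiguate mentions."""
    tags = []
    for label, (uri, entity_type) in gazetteer.items():
        start = text.find(label)
        while start != -1:
            tags.append(Tag(start, start + len(label), uri, entity_type))
            start = text.find(label, start + len(label))
    return sorted(tags, key=lambda t: t.start)


TAGS = annotate(TEXT, GAZETTEER)
for tag in TAGS:
    print(TEXT[tag.start:tag.end], "->", tag.uri, f"({tag.entity_type})")
```

Because each tag stores a URI rather than a string, mentions can later be joined with anything the knowledge graph knows about that entity.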
Apr 2016, Hidden Relationships in Data and Risk Analytics

Category                     Count
International News          52 074
Science and Technology      23 201
Sports                      20 714
Business                    15 155
Lifestyle                   11 684
Total                      122 828

Mentions entity type         Count
Keyphrase                2 589 676
Organization             1 276 441
Location                 1 260 972
Person                   1 248 784
Work                       309 093
Event                      258 388
RelationPersonRole         236 638
Species                    180 946
Sample queries at http://ff-news.ontotext.com
F1: Big cities in Eastern Europe
F2: Airports near London
F3: People and organizations related to Google
F4: Top-level industries by number of companies
F5: Mentions in the news of an organization and its related entities
F7: Most popular companies per industry, including children
F8: Regional exposition of company - normalized
FF-NEWS is in beta: not officially launched, but available to play with
News Popularity Ranking: Automotive

Rank  Company                      News
1     General Motors               2722
2     Tesla Motors                 2346
3     Volkswagen                   2299
4     Ford Motor Company           1934
5     Toyota                       1325
6     Chevrolet                    1264
7     Chrysler                     1054
8     Fiat Chrysler Automobiles    1011
9     Audi AG                       972
10    Honda                         717

Rank  Company incl. mentions of child companies  News
1     General Motors                             4620
2     Volkswagen Group                           3999
3     Fiat Chrysler Automobiles                  2658
4     Tesla Motors                               2370
5     Ford Motor Company                         2125
6     Toyota                                     1656
7     Renault-Nissan Alliance                    1332
8     Honda                                       864
9     BMW                                         715
10    Takata Corporation                          547
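The second ranking differs from the first because mentions of child companies are counted towards their parents. A minimal sketch of that roll-up follows, assuming a simple child-to-parent map; the company names and counts are placeholders, not the figures from the slide, and the actual ranking is presumably computed by a query over the knowledge graph rather than by code like this.

```python
from collections import defaultdict

# Hypothetical child -> parent links (in the deck these come from the knowledge graph).
PARENT = {
    "Brand A1": "Parent A",
    "Brand B1": "Parent B",
    "Brand B2": "Parent B",
}

# Direct mention counts per company; illustrative numbers only.
DIRECT = {"Parent A": 10, "Brand A1": 4, "Brand B1": 7, "Brand B2": 3, "Company C": 5}


def root_of(company, parent):
    """Follow parent links up to the top-level company (guards against cycles)."""
    seen = set()
    while company in parent and company not in seen:
        seen.add(company)
        company = parent[company]
    return company


def rollup(direct, parent):
    """Add each company's mentions to its top-level parent."""
    totals = defaultdict(int)
    for company, count in direct.items():
        totals[root_of(company, parent)] += count
    return dict(totals)


def ranking(counts):
    """Order companies by descending count, breaking ties by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


ROLLED = rollup(DIRECT, PARENT)
for rank, (company, count) in enumerate(ranking(ROLLED), start=1):
    print(rank, company, count)
```

With these toy numbers, "Parent B" has no direct mentions at all yet ranks second once its brands are included, which is the same effect that reshuffles the table above.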
News Popularity: Finance

Rank  Company            News
1     Bloomberg LP       3203
2     Goldman Sachs      1992
3     JP Morgan Chase    1712
4     Wells Fargo        1688
5     Citigroup          1557
6     HSBC Holdings      1546
7     Deutsche Bank      1414
8     Bank of America    1335
9     Barclays           1260
10    UBS                 694

Rank  Company incl. mentions of controlled    News
1     Intra Bank                            261667
2     Hinduja Bank (Switzerland)             49731
3     China Merchants Bank                   38288
4     Alphabet Inc                           22601
5     Capital Group Companies                 4076
6     Bloomberg LP                            3611
7     Exor                                    2704
8     Nasdaq Inc                              2082
9     JP Morgan Chase                         1972
10    Sentinel Capital Partners               1053

Note: Including investment funds, stock exchanges, agencies, etc.
News Popularity: Banking

Rank  Company            News
1     Goldman Sachs       996
2     JP Morgan Chase     856
3     HSBC Holdings       773
4     Deutsche Bank       707
5     Barclays            630
6     Citigroup           519
7     Bank of America     445
8     Wells Fargo         422
9     UBS                 347
10    Chase               126

Rank  Company incl. mentions of controlled    News
1     China Merchants Bank                   38288
2     JP Morgan Chase                         1972
3     Goldman Sachs                           1030
4     HSBC                                     966
5     Bank of America                          771
6     Deutsche Bank                            742
7     Barclays                                 681
8     Citigroup                                630
9     Wells Fargo                              428
10    UBS                                      347
Offshore Leaks Database from ICIJ
• Published by the International Consortium of Investigative Journalists (ICIJ) on 9 May 2016
• A "searchable database" of about 320 000 offshore companies
  - 214 000 extracted from the Panama Papers (valid until 2015)
  - More than 100 000 from the 2013 Offshore Leaks investigation (valid until 2010)
• CSV extract from a graph database available for download
• https://offshoreleaks.icij.org
Offshore Leaks Database
Offshore Leaks DB as Linked Open Data
• Ontotext published the Offshore Leaks DB as Linked Open Data
• Available for exploration, querying and download at http://data.ontotext.com
• ONTOTEXT DISCLAIMERS
We use the data as is provided by ICIJ. We make no representations and warranties of any kind, including warranties of title, accuracy, absence of errors or fitness for particular purpose. All transformations, query results and derivative works are used only to showcase the service and technological capabilities and not to serve as basis for any statements or conclusions.
Enrichment and structuring of the data
• Relationship type hierarchy
  - About 80 relationship types in the original dataset were organized into a property hierarchy
• Classification of officers into Person and Company
  - In the original database there is no way to distinguish whether an officer is a physical person or a company
• Mapping to DBpedia
  - 209 countries referred to in the Offshore Leaks DB are mapped to DBpedia
  - About 3000 persons and 300 companies mapped to DBpedia
• Overall size of the repository: 22M statements (20M explicit)
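A property hierarchy is what lets the repository hold more statements than were loaded explicitly: a statement made with a specific relationship type also implies the same statement with each of its more general types. The sketch below imitates that rdfs:subPropertyOf-style inference in plain Python. The property names, the officers and the entity are invented for illustration and are not taken from the published dataset; in the real repository this inference is done by the triple store.

```python
# Hypothetical sub-property -> super-property links (rdfs:subPropertyOf style).
SUB_PROPERTY_OF = {
    "shareholderOf": "ownerOf",
    "beneficialOwnerOf": "ownerOf",
    "ownerOf": "relatedTo",
    "directorOf": "officerOf",
    "officerOf": "relatedTo",
}

# Explicit statements as (subject, property, object); all names are made up.
EXPLICIT = [
    ("officer1", "shareholderOf", "entityA"),
    ("officer2", "directorOf", "entityA"),
]


def super_properties(prop, hierarchy):
    """Return every ancestor of a property, nearest first (stops on a cycle)."""
    chain = []
    while prop in hierarchy and hierarchy[prop] not in chain:
        prop = hierarchy[prop]
        chain.append(prop)
    return chain


def infer(statements, hierarchy):
    """Derive the statements implied by the hierarchy but not stated explicitly."""
    inferred = set()
    for subject, prop, obj in statements:
        for ancestor in super_properties(prop, hierarchy):
            inferred.add((subject, ancestor, obj))
    return inferred - set(statements)


INFERRED = infer(EXPLICIT, SUB_PROPERTY_OF)
for statement in sorted(INFERRED):
    print(statement)
```

A query for the general property (here `relatedTo`) then matches both officers without having to enumerate every specific relationship type.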
The RDF-ization Process
• Linked data variant produced without programming
  - The raw CSV files are RDF-ized using TARQL (http://tarql.github.io)
  - Data was further interlinked and enriched in GraphDB using SPARQL
• The process is documented in a README file
• All relevant artifacts are open source, available at https://github.com/Ontotext-AD/leaks
• The entire publishing and mapping took about 15 person-days
  - Including data.ontotext.com portal setup, promotion, documentation, etc.
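For readers unfamiliar with RDF-ization, the sketch below shows what turning CSV rows into triples amounts to. It is a stand-in written in Python for illustration only: the deck states that the conversion itself was done with TARQL mappings, not with code like this. The column names, the sample rows and the `http://example.org/leaks/` namespace are assumptions.

```python
import csv
import io

BASE = "http://example.org/leaks/"  # hypothetical namespace, not the published one

# Hypothetical sample with made-up column names and rows.
SAMPLE_CSV = '''node_id,name,jurisdiction
1001,Example Holdings Ltd,XX
1002,"Sample ""Quoted"" Corp",YY
'''


def literal(value):
    """Serialize a string as an N-Triples literal with the required escapes."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(csv_text):
    """Emit one N-Triples line per (row, column) pair, keyed by node_id."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}entity/{row['node_id']}>"
        lines.append(f"{subject} <{BASE}name> {literal(row['name'])} .")
        lines.append(f"{subject} <{BASE}jurisdiction> {literal(row['jurisdiction'])} .")
    return lines


TRIPLES = rows_to_ntriples(SAMPLE_CSV)
print("\n".join(TRIPLES))
```

The resulting lines could be loaded into any triple store; interlinking and enrichment would then happen inside the store, as the slide describes.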
Sample queries at http://data.ontotext.com
Q1: Countries by number of entities related to them
Q2: Country pairs by ownership statistics
Q3: Statistics by incorporation year
Q4: Officers and entities by number of capital relations
Q5: Countries in Eastern Europe by number of owners
Q6: Intermediaries in Asia by name
Q7: The best connected officers
Q8: Countries by number of Person and Company officers
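Most of these queries are grouped counts over the graph. As a rough illustration of the shape of Q1, the sketch below counts distinct entities per country from a list of (entity, country) pairs. The pairs are invented, and the real queries are presumably written in SPARQL against the published repository rather than in Python.

```python
from collections import Counter

# Hypothetical (entity, country) relations; the duplicate pair is deliberate.
RELATIONS = [
    ("entityA", "XX"),
    ("entityB", "XX"),
    ("entityC", "YY"),
    ("entityA", "XX"),
]


def countries_by_entities(pairs):
    """Count distinct entities per country, most frequent country first."""
    distinct = set(pairs)  # each entity counts once per country
    counts = Counter(country for _, country in distinct)
    return counts.most_common()


RESULT = countries_by_entities(RELATIONS)
print(RESULT)
```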
Play with semantically enriched news at http://now.ontotext.com
Play with open data at http://data.ontotext.com and http://ff-news.ontotext.com
Personalization ndash User Actions Model
perform
comments
votes
posts
preview
read
contains leads to
readleads to
preview
Article
Search Action
Result
Date
FTS Q Tag
Cat
Tag set
results
cattaxonomy
Search Log
-------------
-------------
-------------
-------------
-------------
Quick news-analytics case
bull Our Dynamic Semantic
Publishing platform
offers linking of text
with big open data
graphs
bull One can navigate from
text to concepts get
trends related entities
and news
bull Try it at httpnowontotextcom
FF-NEWS Data Integration and Loading
bull DBpedia (the English version only) 496M statements
bull Geonames (all geographic features on Earth) 150M statementsminus owlsameAs links between DBpedia and Geonames 471K statements
bull Company registry data (GLEI) 3M statements
bull News metadata (from NOW) 128M statements
bull Total size 986М statementsminus Mapped to FIBO 667M explicit statements + 318M inferred statements
minus RDFRank and geo-spatial indices enabled to allow for ranking and efficient geo-spatial constraints
Open data integration for news analytics
Technology Semantic Content Enrichment
News Metadata
bull Metadata from Ontotextrsquos Dynamic Semantic Publishing platformminus Automatically generated as part of the NOWontotextcom semantic news showcase
bull News stream from Google since Feb 2015 about 10k newsmonthminus ~70 tags (annotations) per news article
bull Tags link text mentions of concepts to the knowledge graphminus Technically these are URIs for entities (people organizations locations etc) and key phrases
Apr 2016Hidden Relationships in Data and Risk Analytics
News Metadata
Apr 2016Hidden Relationships in Data and Risk Analytics
Category Count
International News 52 074
Science and Technology 23 201
Sports 20 714
Business 15 155
Lifestyle 11 684
122 828
Mentions entity type Count
Keyphrase 2 589 676
Organization 1 276 441
Location 1 260 972
Person 1 248 784
Work 309 093
Event 258 388
RelationPersonRole 236 638
Species 180 946
Sample queries at httpff-newsontotextcom
F1 Big cities in Eastern Europe
F2 Airports near London
F3 People and organizations related to Google
F4 Top-level industries by number of companies
F5 Mentions in the news of an organization and its related entities
F7 Most popular companies per industry including children
F8 Regional exposition of company ndash normalized
FF-NEWS is in Beta Not officially launched but available to play with
Open data integration for news analytics
News Popularity Ranking: Automotive
Rank  Company                      News  | Rank  Company (incl. mentions of child companies)    News
1     General Motors               2722  | 1     General Motors                                 4620
2     Tesla Motors                 2346  | 2     Volkswagen Group                               3999
3     Volkswagen                   2299  | 3     Fiat Chrysler Automobiles                      2658
4     Ford Motor Company           1934  | 4     Tesla Motors                                   2370
5     Toyota                       1325  | 5     Ford Motor Company                             2125
6     Chevrolet                    1264  | 6     Toyota                                         1656
7     Chrysler                     1054  | 7     Renault-Nissan Alliance                        1332
8     Fiat Chrysler Automobiles    1011  | 8     Honda                                           864
9     Audi AG                       972  | 9     BMW                                             715
10    Honda                         717  | 10    Takata Corporation                              547
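The right-hand ranking credits a parent company with the news mentions of its child companies. A minimal sketch of that roll-up follows, assuming a simple child-to-parent map; the company names and counts are made up, and the function names are hypothetical. Note that it sums counts, so a news item that mentions both a parent and its child is counted twice; a real query over the knowledge graph would more likely count distinct articles.

```python
from collections import defaultdict


def rollup_mentions(own_counts, parent_of):
    """Return mention counts that include all descendants of each company.

    own_counts: company -> number of news items mentioning it directly
    parent_of:  child company -> controlling (parent) company
    """
    totals = defaultdict(int)
    for company, count in own_counts.items():
        # Credit the company itself, then every ancestor up the chain.
        node = company
        seen = set()
        while node is not None and node not in seen:
            seen.add(node)
            totals[node] += count
            node = parent_of.get(node)
    return dict(totals)


def rank(counts):
    """Sort companies by descending count, ties broken by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up companies and counts, for illustration only.
own = {"ParentCo": 10, "BrandA": 7, "BrandB": 5, "RivalCo": 12}
parents = {"BrandA": "ParentCo", "BrandB": "BrandA"}

print(rank(own))
print(rank(rollup_mentions(own, parents)))
```

In this toy data the roll-up moves `ParentCo` ahead of `RivalCo`, the same kind of reordering visible between the two halves of the table.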
News Popularity: Finance
Rank  Company            News  | Rank  Company (incl. mentions of controlled)    News
1     Bloomberg LP       3203  | 1     Intra Bank                              261667
2     Goldman Sachs      1992  | 2     Hinduja Bank (Switzerland)               49731
3     JP Morgan Chase    1712  | 3     China Merchants Bank                     38288
4     Wells Fargo        1688  | 4     Alphabet Inc                             22601
5     Citigroup          1557  | 5     Capital Group Companies                   4076
6     HSBC Holdings      1546  | 6     Bloomberg LP                              3611
7     Deutsche Bank      1414  | 7     Exor                                      2704
8     Bank of America    1335  | 8     Nasdaq Inc                                2082
9     Barclays           1260  | 9     JP Morgan Chase                           1972
10    UBS                 694  | 10    Sentinel Capital Partners                 1053
Note: Including investment funds, stock exchanges, agencies, etc.
News Popularity: Banking
Rank  Company            News  | Rank  Company (incl. mentions of controlled)    News
1     Goldman Sachs       996  | 1     China Merchants Bank                     38288
2     JP Morgan Chase     856  | 2     JP Morgan Chase                           1972
3     HSBC Holdings       773  | 3     Goldman Sachs                             1030
4     Deutsche Bank       707  | 4     HSBC                                       966
5     Barclays            630  | 5     Bank of America                            771
6     Citigroup           519  | 6     Deutsche Bank                              742
7     Bank of America     445  | 7     Barclays                                   681
8     Wells Fargo         422  | 8     Citigroup                                  630
9     UBS                 347  | 9     Wells Fargo                                428
10    Chase               126  | 10    UBS                                        347
Offshore Leaks Database from ICIJ
• Published by the International Consortium of Investigative Journalists (ICIJ) on 9 May 2016
• A "searchable database" about 320 000 offshore companies
  - 214 000 extracted from Panama Papers (valid until 2015)
  - More than 100 000 from 2013 Offshore leaks investigation (valid until 2010)
• CSV extract from a graph database available for download
• https://offshoreleaks.icij.org
Offshore Leaks DB as Linked Open Data
• Ontotext published the Offshore Leaks DB as Linked Open Data
• Available for exploration, querying and download at http://data.ontotext.com
• ONTOTEXT DISCLAIMERS
  We use the data as is provided by ICIJ. We make no representations and warranties of any kind,
  including warranties of title, accuracy, absence of errors or fitness for particular purpose. All
  transformations, query results and derivative works are used only to showcase the service and
  technological capabilities and not to serve as basis for any statements or conclusions.
Enrichment and structuring of the data
• Relationship type hierarchy
  - About 80 relationship types in the original dataset were organized into a property hierarchy
• Classification of officers into Person and Company
  - In the original database there is no way to distinguish whether an officer is a physical person
• Mapping to DBpedia
  - 209 countries referred to in the Offshore Leaks DB are mapped to DBpedia
  - About 3000 persons and 300 companies mapped to DBpedia
• Overall size of the repository: 22M statements (20M explicit)
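The gap between explicit and total statements comes from inference, and a property hierarchy is one source of it: a statement made with a specific relationship type also implies the same statement with each broader type. The sketch below illustrates the idea in plain Python, in the spirit of `rdfs:subPropertyOf` reasoning. The property names and sample statements are hypothetical, and each property is assumed to have at most one parent; GraphDB's actual rule sets are not reproduced here.

```python
def super_properties(prop, sub_property_of):
    """All ancestors of prop in the hierarchy (transitive closure)."""
    result = []
    seen = {prop}
    current = sub_property_of.get(prop)
    while current is not None and current not in seen:
        seen.add(current)
        result.append(current)
        current = sub_property_of.get(current)
    return result


def infer(explicit, sub_property_of):
    """Return statements implied by the hierarchy but not stated explicitly."""
    inferred = set()
    for subject, prop, obj in explicit:
        for parent in super_properties(prop, sub_property_of):
            inferred.add((subject, parent, obj))
    return inferred - set(explicit)


# Hypothetical property names; the real hierarchy has about 80 relationship types.
hierarchy = {
    "shareholderOf": "hasCapitalRelationTo",
    "beneficialOwnerOf": "hasCapitalRelationTo",
    "hasCapitalRelationTo": "relatedTo",
    "directorOf": "relatedTo",
}
explicit = {
    ("officer1", "shareholderOf", "entityA"),
    ("officer2", "directorOf", "entityA"),
}

inferred = infer(explicit, hierarchy)
for statement in sorted(inferred):
    print(statement)
```

With a hierarchy like this, a query for the broad relationship finds every officer linked through any of its narrower types, without listing those types one by one.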
The RDF-ization Process
• Linked data variant produced without programming
  - The raw CSV files are RDF-ized using TARQL, http://tarql.github.io
  - Data was further interlinked and enriched in GraphDB using SPARQL
• The process is documented in this README file
• All relevant artifacts are open-source, available at https://github.com/Ontotext-AD/leaks
• The entire publishing and mapping took about 15 person-days
  - Including data.ontotext.com portal setup, promotion, documentation, etc.
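The conversion itself was done with TARQL mappings, not with custom code. Purely to show what "RDF-izing" a CSV row means, here is a swapped-in Python sketch that maps each row to N-Triples lines. The column names (`node_id`, `name`, `country_code`), the `example.org` namespace, and the sample rows are all assumptions; they do not reproduce the published mapping or the ICIJ file layout, and identifiers are assumed to be safe to embed in an IRI.

```python
import csv
import io

BASE = "http://example.org/leaks/"  # placeholder namespace, not the published one


def escape_literal(value):
    """Escape characters that are special inside an N-Triples string literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text):
    """Turn each CSV row into N-Triples lines, skipping empty cells."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}entity/{row['node_id']}>"
        name = escape_literal(row["name"])
        lines.append(f'{subject} <{BASE}name> "{name}" .')
        if row["country_code"]:
            country = f"<{BASE}country/{row['country_code']}>"
            lines.append(f"{subject} <{BASE}country> {country} .")
    return lines


# Made-up sample rows; the column names are assumptions too.
sample = 'node_id,name,country_code\n1,"ACME ""Holdings"" Ltd",PAN\n2,Example Corp,\n'

for line in csv_to_ntriples(sample):
    print(line)
```

Emitting the country as a resource, not a string, is the kind of choice that makes later interlinking possible, for example mapping countries to DBpedia.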
Sample queries at http://data.ontotext.com
Q1: Countries by number of entities related to them
Q2: Country pairs by ownership statistics
Q3: Statistics by incorporation year
Q4: Officers and entities by number of capital relations
Q5: Countries in Eastern Europe by number of owners
Q6: Intermediaries in Asia by name
Q7: The best connected officers
Q8: Countries by number of Person and Company officers
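The published queries are SPARQL and are not reproduced here. As a hedged illustration of the shape of a query like Q1 (group, count, order), the sketch below does the same aggregation over a small in-memory list of triples. The predicate name `hasCountry` and all the sample data are hypothetical.

```python
from collections import Counter


def countries_by_entity_count(triples, predicate="hasCountry"):
    """Count distinct entities per country, most frequent first."""
    pairs = {(s, o) for s, p, o in triples if p == predicate}
    counts = Counter(country for _, country in pairs)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up triples; "hasCountry" is a hypothetical predicate name.
triples = [
    ("entityA", "hasCountry", "Panama"),
    ("entityB", "hasCountry", "Panama"),
    ("entityC", "hasCountry", "Samoa"),
    ("entityA", "hasCountry", "Panama"),   # duplicate statement, counted once
    ("entityA", "hasName", "Example Ltd"),
]

print(countries_by_entity_count(triples))
```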
Play with semantically enriched news at
http://now.ontotext.com
Play with open data at
http://data.ontotext.com and http://ff-news.ontotext.com
News Metadata
Apr 2016Hidden Relationships in Data and Risk Analytics
Category Count
International News 52 074
Science and Technology 23 201
Sports 20 714
Business 15 155
Lifestyle 11 684
122 828
Mentions entity type Count
Keyphrase 2 589 676
Organization 1 276 441
Location 1 260 972
Person 1 248 784
Work 309 093
Event 258 388
RelationPersonRole 236 638
Species 180 946
Sample queries at httpff-newsontotextcom
F1 Big cities in Eastern Europe
F2 Airports near London
F3 People and organizations related to Google
F4 Top-level industries by number of companies
F5 Mentions in the news of an organization and its related entities
F7 Most popular companies per industry including children
F8 Regional exposition of company ndash normalized
FF-NEWS is in Beta Not officially launched but available to play with
Open data integration for news analytics
News Popularity Ranking Automotive
Open data integration for news analytics
Rank Company News RankCompany incl mentions of child
companies News
1 General Motors 2722 1 General Motors 4620
2 Tesla Motors 2346 2 Volkswagen Group 3999
3 Volkswagen 2299 3 Fiat Chrysler Automobiles 2658
4 Ford Motor Company 1934 4 Tesla Motors 2370
5 Toyota 1325 5 Ford Motor Company 2125
6 Chevrolet 1264 6 Toyota 1656
7 Chrysler 1054 7 Renault-Nissan Alliance 1332
8 Fiat Chrysler Automobiles 1011 8 Honda 864
9 Audi AG 972 9 BMW 715
10 Honda 717 10 Takata Corporation 547
News Popularity Finance
Open data integration for news analytics
Rank Company News Rank Company incl mentions of controlled News
1 Bloomberg LP 3203 1 Intra Bank 261667
2 Goldman Sachs 1992 2 Hinduja Bank (Switzerland) 49731
3 JP Morgan Chase 1712 3 China Merchants Bank 38288
4 Wells Fargo 1688 4 Alphabet Inc 22601
5 Citigroup 1557 5 Capital Group Companies 4076
6 HSBC Holdings 1546 6 Bloomberg LP 3611
7 Deutsche Bank 1414 7 Exor 2704
8 Bank of America 1335 8 Nasdaq Inc 2082
9 Barclays 1260 9 JP Morgan Chase 1972
10 UBS 694 10 Sentinel Capital Partners 1053
Note Including investment funds stock exchanges agencies etc
News Popularity Banking
Open data integration for news analytics
Rank Company News Rank Company incl mentions of controlled News
1 Goldman Sachs 996 1 China Merchants Bank 38288
2 JP Morgan Chase 856 2 JP Morgan Chase 1972
3 HSBC Holdings 773 3 Goldman Sachs 1030
4 Deutsche Bank 707 4 HSBC 966
5 Barclays 630 5 Bank of America 771
6 Citigroup 519 6 Deutsche Bank 742
7 Bank of America 445 7 Barclays 681
8 Wells Fargo 422 8 Citigroup 630
9 UBS 347 9 Wells Fargo 428
10 Chase 126 10 UBS 347
Offshore Leaks Database from ICIJ
• Published by the International Consortium of Investigative Journalists (ICIJ) on 9 May 2016
• A "searchable database" of about 320 000 offshore companies
  - 214 000 extracted from the Panama Papers (valid until 2015)
  - More than 100 000 from the 2013 Offshore Leaks investigation (valid until 2010)
• CSV extract from a graph database available for download
• https://offshoreleaks.icij.org
Offshore Leaks Database
Offshore Leaks DB as Linked Open Data
• Ontotext published the Offshore Leaks DB as Linked Open Data
• Available for exploration, querying and download at http://data.ontotext.com
• ONTOTEXT DISCLAIMERS
We use the data as is provided by ICIJ. We make no representations and warranties of any kind, including warranties of title, accuracy, absence of errors or fitness for particular purpose. All transformations, query results and derivative works are used only to showcase the service and technological capabilities and not to serve as basis for any statements or conclusions.
Enrichment and structuring of the data
• Relationship type hierarchy
  - About 80 relationship types from the original dataset were organized into a property hierarchy
• Classification of officers into Person and Company
  - The original database offers no way to tell whether an officer is a physical person or a company
• Mapping to DBpedia
  - 209 countries referred to in the Offshore Leaks DB are mapped to DBpedia
  - About 3000 persons and 300 companies are mapped to DBpedia
• Overall size of the repository: 22M statements (20M explicit)
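A property hierarchy lets a query on a general relationship also match statements made with its more specific sub-properties, and the statements implied this way are part of the difference between total and explicit statement counts. The sketch below illustrates the idea in plain Python rather than in GraphDB. The property names (`director_of`, `officer_of`, `related_to` and so on) and the sample statements are hypothetical, not the vocabulary actually used in the published dataset.

```python
# Hypothetical sub-property -> super-property links (in the style of rdfs:subPropertyOf).
SUB_PROPERTY_OF = {
    "director_of": "officer_of",
    "secretary_of": "officer_of",
    "shareholder_of": "owner_of",
    "beneficial_owner_of": "owner_of",
    "officer_of": "related_to",
    "owner_of": "related_to",
}


def super_properties(prop, hierarchy):
    """Return all ancestors of prop in the hierarchy, nearest first."""
    ancestors = []
    while prop in hierarchy:
        prop = hierarchy[prop]
        if prop in ancestors:  # guard against accidental cycles
            break
        ancestors.append(prop)
    return ancestors


def infer(explicit, hierarchy):
    """Return statements implied by the hierarchy that are not already explicit."""
    inferred = set()
    for subj, prop, obj in explicit:
        for parent in super_properties(prop, hierarchy):
            inferred.add((subj, parent, obj))
    return inferred - set(explicit)


# Hypothetical explicit statements: (subject, property, object).
explicit = {
    ("person_1", "director_of", "entity_1"),
    ("person_2", "shareholder_of", "entity_1"),
}
inferred = infer(explicit, SUB_PROPERTY_OF)
```

With the hierarchy in place, a question such as "who is related to entity_1 in any way" can be answered by matching the single top-level property instead of enumerating all specific relationship types.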
The RDF-ization Process
• Linked data variant produced without programming
  - The raw CSV files are RDF-ized using TARQL (http://tarql.github.io)
  - Data was further interlinked and enriched in GraphDB using SPARQL
• The process is documented in the README file
• All relevant artifacts are open source, available at https://github.com/Ontotext-AD/leaks
• The entire publishing and mapping took about 15 person-days
  - Including data.ontotext.com portal setup, promotion, documentation, etc.
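TARQL does the CSV-to-RDF step declaratively, with a SPARQL CONSTRUCT template applied to each row. The Python sketch below is not TARQL and not the authors' mapping; it only illustrates the same row-to-triples idea. The column names (`node_id`, `name`, `jurisdiction`), the sample rows and the `http://example.org/leaks/` namespace are assumptions made for the example.

```python
import csv
import io

BASE = "http://example.org/leaks/"  # hypothetical namespace, not the published one
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical CSV extract; the real ICIJ files have different columns.
SAMPLE_CSV = """node_id,name,jurisdiction
1001,Example Holdings Ltd,XX
1002,Sample Trading Inc,
"""


def literal(value):
    """Serialize a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def rows_to_ntriples(csv_text):
    """Yield a type triple per row plus one triple per non-empty mapped cell."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}entity/{row['node_id']}>"
        yield f"{subject} <{RDF_TYPE}> <{BASE}Entity> ."
        for column in ("name", "jurisdiction"):
            if row[column]:  # skip empty cells instead of emitting empty literals
                yield f"{subject} <{BASE}{column}> {literal(row[column])} ."


triples = list(rows_to_ntriples(SAMPLE_CSV))
```

The resulting lines could be loaded into any triple store; interlinking and enrichment would then happen inside the store with SPARQL updates, as the slide describes.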
Sample queries at http://data.ontotext.com
Q1 Countries by number of entities related to them
Q2 Country pairs by ownership statistics
Q3 Statistics by incorporation year
Q4 Officers and entities by number of capital relations
Q5 Countries in Eastern Europe by number of owners
Q6 Intermediaries in Asia by name
Q7 The best connected officers
Q8 Countries by number of Person and Company officers
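Most of these queries are group-and-count aggregations over the graph. The published versions are SPARQL queries; as a rough illustration of what Q1 computes, here is the same aggregation in Python over hypothetical (entity, country) links. The data and the function name `countries_by_entity_count` are made up for the example.

```python
from collections import Counter

# Hypothetical (entity, country) links; the real data lives in the Offshore Leaks repository.
ENTITY_COUNTRY = [
    ("entity_1", "Country A"),
    ("entity_2", "Country A"),
    ("entity_3", "Country B"),
    ("entity_1", "Country B"),
    ("entity_4", "Country A"),
    ("entity_2", "Country A"),  # duplicate link, counted once
]


def countries_by_entity_count(links):
    """Count distinct entities per country, largest count first, ties broken by name."""
    counts = Counter(country for _, country in set(links))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = countries_by_entity_count(ENTITY_COUNTRY)
```

Deduplicating the links before counting mirrors a `COUNT(DISTINCT ...)` in the query language, which matters when the same relation is stated more than once.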
Play with semantically enriched news at http://now.ontotext.com
Play with open data at http://data.ontotext.com and http://ff-news.ontotext.com