Talk of Europe @ DHBenelux2015
The possibilities and challenges of using linked data for academic research
The case of the Talk of Europe project
Laura Hollink, Centrum Wiskunde & Informatica, Amsterdam
Martijn Kleppe, Erasmus University Rotterdam
Max Kemman, University of Luxembourg
Astrid van Aggelen, VU University Amsterdam
Willem van Hage, SynerScope, Helvoirt
European Parliament as Linked Data
• Goal: publish the plenary debates of the European Parliament as Linked Open Data
• Why is this important?
A. Large-scale analysis across time spans
B. To residents of the European Union, access to the proceedings of the European Parliament is a formal right.
• Linked Data: a format for publishing data on the Web, with URIs as permanent identifiers, designed for connecting pieces of data.
Data
14M statements about the 30K speeches by 3K speakers in 1K session days that were held in the EU parliament between 1999 and 2014
Links
• Country names
• Members of Parliament
• Members of Parliament + Parties
• Members of Parliament: online database with background information about MEPs: “committee, party group and delegation membership, as well as leadership positions” [An Automated Database of the European Parliament. Bjørn Høyland, Indraneel Sircar, and Simon Hix, European Union Politics 10(1):143-152, 2009.]
Example 1: speeches that contain a certain keyword
Query: all speeches that contain the phrase “open data”
…. So let us go for open data, let us go for utilisation of all the instruments available to that end! …..
…. but there too governments are encouraging the use of open data to increase transparency, accountability and citizen participation ….
…. We already have many open data projects in the Member States and local authorities…..
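A keyword search of this kind can be written as a SPARQL query. The sketch below only builds the query text in Python. The prefix `ex:` and the names `ex:Speech` and `ex:text` are placeholders assumed for illustration; they are not the project's actual vocabulary and would need to be replaced with the terms used in the dataset.

```python
def build_keyword_query(phrase: str, limit: int = 10) -> str:
    """Build a SPARQL query selecting speeches whose text contains `phrase`."""
    # Escape backslashes and double quotes so the phrase is a valid string literal.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    # ex:Speech and ex:text are hypothetical names, not the real vocabulary.
    return f"""
PREFIX ex: <http://example.org/vocab/>
SELECT ?speech ?text
WHERE {{
  ?speech a ex:Speech ;
          ex:text ?text .
  FILTER(CONTAINS(LCASE(STR(?text)), LCASE("{escaped}")))
}}
LIMIT {limit}
""".strip()


print(build_keyword_query("open data"))
```

The filter lowercases both sides, so the match is case-insensitive; whether that is desirable depends on the research question.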
Example 2: speeches that contain a certain keyword by date
[Figure: Mentions of 'human rights' over time. x-axis: dates (1999 to 2013); y-axis: frequency (0 to 800).]
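A per-year count like this one could be computed once query results are available locally. The sketch below assumes the results arrive as (ISO date, text) pairs and counts speeches that mention the phrase rather than individual occurrences; both are assumptions, and the rows are invented for illustration, not taken from the dataset.

```python
from collections import Counter


def mentions_per_year(speeches, phrase):
    """Count speeches per year whose text contains `phrase` (case-insensitive)."""
    needle = phrase.lower()
    counts = Counter()
    for iso_date, text in speeches:
        if needle in text.lower():
            counts[iso_date[:4]] += 1  # ISO dates start with the four-digit year
    return dict(sorted(counts.items()))


# Toy rows invented for illustration; not real speeches.
toy = [
    ("1999-07-20", "A debate on human rights."),
    ("1999-09-14", "Human rights must be respected."),
    ("2004-05-03", "On fisheries policy."),
    ("2004-11-16", "Human rights and trade."),
]
print(mentions_per_year(toy, "human rights"))
```

Grouping by country instead of by year would follow the same pattern, with a country code in place of the date.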
Example 2: speeches that contain a certain keyword by country
[Figure: Mentions of 'human rights' by country. x-axis: country codes (AT to SK); y-axis ticks from 0 to 7000.]
Example 3: background info about the MEPs
• MEPs that were not born in Europe.
Members of Parliament
Integrate data from the EU parliament with external datasets
What other knowledge do we have available?
• GDP, region, population density, neighbouring countries
• Age, religion, education, spouse / children
• Previous occupations, place of birth/residence
• Speeches in the Italian parliament
• Membership of committees, leadership positions
DEMO tomorrow 14:20-16:00
Implications for use?
Credibility
• Who created it? How?
• The quality may vary: EP vs. Wikipedia
Completeness
• How complete is it? Is there a way to tell how complete it is?
• Completeness may vary: EP vs. Wikipedia
Update frequency
• When was the data last updated?
• Update frequency may vary: EP vs. “An automated database of the EP”
Credibility, completeness, update frequency of the links
• Who made them? How? When? How complete are they?
Message: the need for dataset evaluation is exacerbated when using linked data
How to use this data, in practice
The bad news: we don’t have a friendly user interface :’(
The good news: our data + all sources we link to are openly available for everyone :)
Options for use:
1. Tell us what you want to know and we will write you a query.
2. Go to our website, copy-paste an example query into the query editor.
3. Go to our website, write a SPARQL query in the query editor.
4. Query our SPARQL endpoint programmatically.
Website: http://talkofeurope.eu/data/
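For option 4, a minimal sketch of a programmatic query using only the Python standard library and the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter). The `ENDPOINT` address is a placeholder, not the project's real endpoint; the actual address should be taken from the website. `build_request` and `run_query` are illustrative helper names.

```python
import json
import urllib.parse
import urllib.request

# Placeholder: replace with the endpoint address listed on the project website.
ENDPOINT = "http://example.org/sparql"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Prepare a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query: str, endpoint: str = ENDPOINT) -> list:
    """Send the query and return the result bindings as a list of dicts."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as resp:
        return json.load(resp)["results"]["bindings"]


# Building the request does not touch the network; run_query() does.
request = build_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.full_url)
```

Calling `run_query(...)` with a real endpoint address would return one dict per result row, keyed by variable name.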
Use of the data during three Creative Camps
• 3 events of one week each, where people are invited to work with our data on-site.
• Outcome CC #1 in Hilversum:
  • Links to the Italian parliament.
  • Detection of people who speak about an unusual mix of topics.
  • Sentiment analysis