News Fact-checking: One Practical Application of Linked Statistics


description

This is the poster for SemStat at ISWC 2014 in Riva del Garda. SemStat 2014 was the "Second International Workshop on Semantic Statistics". Our poster is about a use case on fact-checking using the potential of Linked Statistics.


LinkedSTAT: http://linkedstat.spaziodati.eu

ISTAT SDMX SOAP Web Service: http://sdmx.istat.it/WS_T/NsiStdV20Service.asmx

SDMX-ML

SDMX-to-RDF XSL transformations: https://github.com/csarven/linked-sdmx

Virtuoso Quad Store
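
The conversion stage of this pipeline can be illustrated with a small sketch. LinkedSTAT itself relies on the linked-sdmx XSL transformations; the Python below is only a stand-in that shows the kind of mapping involved. The input fragment is a hypothetical, heavily simplified SDMX-ML-like snippet, the `http://example.org/obs/` base URI and the `to_ntriples` helper are made up for illustration, and the property namespace is assumed from the REF_AREA property URI that appears in the poster's queries. Unlike the real transformation, the sketch keeps dimension values as plain literals instead of linking them to code-list URIs.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified SDMX-ML-like fragment. Real messages are
# namespaced and much richer. 3.4 is the figure quoted in the news article.
SDMX_SAMPLE = """
<DataSet>
  <Series REF_AREA="ITD2" IND_TYPE="INCID_POVREL_FAM">
    <Obs TIME_PERIOD="2011" OBS_VALUE="3.4"/>
  </Series>
</DataSet>
"""

# Assumed property namespace (inferred from the REF_AREA property URI).
PROP = "http://linkedstat.spaziodati.eu/property/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB_OBSERVATION = "http://purl.org/linked-data/cube#Observation"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def to_ntriples(sdmx_xml, base="http://example.org/obs/"):
    """Emit N-Triples lines: one qb:Observation per <Obs> element."""
    lines = []
    root = ET.fromstring(sdmx_xml)
    for series in root.iter("Series"):
        for obs in series.iter("Obs"):
            # Series-level attributes apply to every observation in the series.
            attrs = {**series.attrib, **obs.attrib}
            subject = "<{}{}-{}-{}>".format(
                base, attrs["REF_AREA"], attrs["IND_TYPE"], attrs["TIME_PERIOD"]
            )
            lines.append(f"{subject} <{RDF_TYPE}> <{QB_OBSERVATION}> .")
            for name, value in sorted(attrs.items()):
                if name == "OBS_VALUE":
                    obj = f'"{value}"^^<{XSD_DECIMAL}>'
                else:
                    obj = f'"{value}"'  # simplification: literal, not a code URI
                lines.append(f"{subject} <{PROP}{name}> {obj} .")
    return lines


for line in to_ntriples(SDMX_SAMPLE):
    print(line)
```

In the actual pipeline the resulting RDF is loaded into the Virtuoso Quad Store; that loading step is not shown here.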

http://www.ladige.it/articoli/2012/07/17/poverta-trentino-resta-isola-felice

NEWS FACT-CHECKING is the process of verifying the accuracy of facts in publications

Fact-checking is a tedious, time- and resource-consuming, and error-prone process

“In 2011, according to Istat, ... the province of Trento (3.4%), Lombardy (4.2%), Valle d'Aosta and Veneto (4.3%) have the lowest values of the incidence of [relative] poverty.” *

* The original text is: “Nel 2011, secondo i dati Istat, ... la provincia di Trento (3,4%), la Lombardia (4,2%), la Valle d'Aosta e il Veneto (4,3%) presentano i valori più bassi dell'incidenza di povertà [relativa].”
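
Before a quoted figure can be compared with a published observation, it has to be read as a number, and the Italian original uses a decimal comma. The poster does not describe how figures are pulled out of an article, so the following is only an illustrative sketch; `extract_percentages` is a hypothetical helper, and the regular expression covers only the "(3,4%)" pattern seen in this quote.

```python
import re
from typing import List

# Matches a parenthesised percentage such as "(3,4%)" or "(3.4%)".
PERCENT_RE = re.compile(r"\((\d+(?:[.,]\d+)?)\s*%\)")


def extract_percentages(text: str) -> List[float]:
    """Return the parenthesised percentages in `text`, in order of appearance."""
    # Normalise the decimal comma used in Italian text to a decimal point.
    return [float(number.replace(",", ".")) for number in PERCENT_RE.findall(text)]


italian = ("la provincia di Trento (3,4%), la Lombardia (4,2%), "
           "la Valle d'Aosta e il Veneto (4,3%)")
english = ("the province of Trento (3.4%), Lombardy (4.2%), "
           "Valle d'Aosta and Veneto (4.3%)")

print(extract_percentages(italian))  # [3.4, 4.2, 4.3]
print(extract_percentages(english))  # [3.4, 4.2, 4.3]
```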

RDF/XML

● RDF Data Cube Vocabulary
● PROV-O Ontology
● SKOS and XKOS
● SDMX-RDF, ...

● Fact-checking: how to find the right set of dimension/value pairs for a given fact in order to construct queries for it?

● ISTAT: new possibilities to disseminate statistics and facilitate data certification

● DBpedia/Wikipedia: automatic updates of statistical data

“In 2011, according to Istat, ... the province of Trento (3.4%) ... value of the incidence of [relative] poverty.”

Dimension | Value

territory (linked-istat-property:REF_AREA) | “Provincia Autonoma Trento” <http://linkedstat.spaziodati.eu/code/1.1/CL_REFAREA/ITD2>

reference time period (linked-istat-property:TIME_PERIOD) | “2011” <http://reference.data.gov.uk/id/year/2011>

statistical indicator (linked-istat-property:IND_TYPE) | “incidenza di povertà relativa familiare” (incidence of relative poverty among households) <http://linkedstat.spaziodati.eu/code/1.1/CL_AGGREG_FAMIGLIE/INCID_POVREL_FAM>
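
Once the dimension/value pairs for a fact are known, the query for the matching observation follows mechanically from them. The sketch below shows one way to assemble it. `build_observation_query` is a hypothetical helper, and the expansion of the `linked-istat-property:` prefix is an assumption inferred from the full REF_AREA property URI that appears in the poster's code-list query.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "qb": "http://purl.org/linked-data/cube#",
    # Assumed expansion, inferred from the REF_AREA property URI on the poster.
    "linked-istat-property": "http://linkedstat.spaziodati.eu/property/",
}

# Dimension/value pairs for the Trento claim, as listed on the poster.
FACT = {
    "REF_AREA": "http://linkedstat.spaziodati.eu/code/1.1/CL_REFAREA/ITD2",
    "TIME_PERIOD": "http://reference.data.gov.uk/id/year/2011",
    "IND_TYPE": "http://linkedstat.spaziodati.eu/code/1.1/CL_AGGREG_FAMIGLIE/INCID_POVREL_FAM",
}


def build_observation_query(pairs):
    """Build a SPARQL SELECT that fixes every dimension and asks for OBS_VALUE."""
    header = "\n".join(f"PREFIX {name}: <{uri}>" for name, uri in PREFIXES.items())
    patterns = ["  ?obs rdf:type qb:Observation ."]
    for dimension, value_uri in pairs.items():
        patterns.append(f"  ?obs linked-istat-property:{dimension} <{value_uri}> .")
    patterns.append("  ?obs linked-istat-property:OBS_VALUE ?value .")
    body = "\n".join(patterns)
    return f"{header}\nSELECT DISTINCT ?obs ?value\nWHERE {{\n{body}\n}}"


print(build_observation_query(FACT))
```

Note that the poster's own observation query refers to the territory code under `code/1.2/CL_REFAREA/ITD2`, whereas the list of pairs gives `code/1.1/`; which version the endpoint actually serves would need to be checked against the code list.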

http://linkedstat.spaziodati.eu/sparql

MANUAL FACT-CHECKING – review of the citations' content

● dedicated fact-checking departments
  - only major infrequent periodicals can afford them (Der Spiegel, The Guardian, Esquire, Forbes); no budget in small publishing organisations
  - impractical for frequent publications

● nonprofit fact-checking organisations (FactCheck.org, PolitiFact.com)

● crowd-checking platforms (FactCheckEU.org)

Tatiana Tarasova, SpazioDati, Trento, Italy

“Poverty, Trentino remains a happy island”, l'Adige.it, 17 July 2012

What if the facts were linked to the underlying data sources?

Publishing ISTAT data (http://dati.istat.it/) as Linked Data

All the queries and scripts produced during the LinkedSTAT project are available at https://www.assembla.com/spaces/linked-istat/

Fact-checking with LinkedSTAT

Step 1: retrieve the structure of the relevant dataset

PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?dataset ?title ?structure
WHERE {
  ?dataset a qb:DataSet .
  ?dataset dcterms:title ?title .
  FILTER(contains(str(?title), "Incidenza di povertà relativa"))
  ?dataset qb:structure ?structure .
}

Step 2: retrieve the code lists that provide values

PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT DISTINCT ?codeList ?p ?o
WHERE {
  <http://linkedstat.spaziodati.eu/property/REF_AREA> qb:codeList ?codeList .
  ?codeList ?p ?o
}

Step 3: retrieve the value of the required observation

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX qb: <http://purl.org/linked-data/cube#>
# Prefix expansion assumed from the REF_AREA property URI used in Step 2
PREFIX linked-istat-property: <http://linkedstat.spaziodati.eu/property/>
SELECT DISTINCT ?obs ?value
WHERE {
  ?obs rdf:type qb:Observation .
  ?obs linked-istat-property:REF_AREA <http://linkedstat.spaziodati.eu/code/1.2/CL_REFAREA/ITD2> .
  ?obs linked-istat-property:TIME_PERIOD <http://reference.data.gov.uk/id/year/2011> .
  ?obs linked-istat-property:IND_TYPE <http://linkedstat.spaziodati.eu/code/1.1/CL_AGGREG_FAMIGLIE/INCID_POVREL_FAM> .
  ?obs linked-istat-property:OBS_VALUE ?value .
}
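
The last step, checking the returned value against the figure in the article, can be sketched as follows. This is a minimal illustration rather than the project's own tooling: it assumes the endpoint named on the poster accepts queries over HTTP GET and can return the standard SPARQL JSON results format, and the endpoint may no longer be online. `run_select`, `observed_values`, `claim_is_supported`, and the 0.05 tolerance are hypothetical choices, and `SAMPLE` is a hand-written stand-in for a response, not real endpoint output.

```python
import json
import urllib.parse
import urllib.request

# Endpoint named on the poster; availability is not guaranteed.
ENDPOINT = "http://linkedstat.spaziodati.eu/sparql"


def run_select(query, endpoint=ENDPOINT, timeout=30):
    """Send a SELECT query via HTTP GET and return the parsed JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def observed_values(results):
    """Extract the ?value bindings from a SPARQL JSON results document."""
    bindings = results["results"]["bindings"]
    return [float(b["value"]["value"]) for b in bindings if "value" in b]


def claim_is_supported(claimed, results, tolerance=0.05):
    """True if any returned observation lies within `tolerance` of the claim."""
    return any(abs(v - claimed) <= tolerance for v in observed_values(results))


# Hand-written sample in the SPARQL JSON results format (NOT real output).
SAMPLE = {
    "head": {"vars": ["obs", "value"]},
    "results": {"bindings": [
        {"obs": {"type": "uri", "value": "http://example.org/obs/1"},
         "value": {"type": "literal", "value": "3.4"}},
    ]},
}

print(claim_is_supported(3.4, SAMPLE))  # True
print(claim_is_supported(4.2, SAMPLE))  # False
```

Against a live endpoint, `SAMPLE` would be replaced by `run_select(query)`, where `query` is the text of the Step 3 query.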

Future