Confidential Data Sharing and the Semantic Web

Tim Finin
University of Maryland, Baltimore County

Conf. Data Collections Workshop, Redmond WA, 9 Sept 2009

Assured Information Sharing

Tim Finin
University of Maryland, Baltimore County

Conf. Data Collections Workshop, Redmond WA, 9 Sept 2009

Information Assurance?
• Information assurance (IA) is a generalization of information security
  – not just access control
• Goal: ensure confidentiality, integrity, authentication, availability and non-repudiation
  – each has challenging and complex issues
• Note focus on managing information-related risks!
  – managing ≠ guaranteeing, risks ≈ costs

Managing the Assured Information Sharing Lifecycle
• MURI project sponsored by AFOSR (2008-2013)
• Team includes six universities:
  – UMBC (Finin)
  – UT Dallas (Thuraisingham)
  – UT San Antonio (Sandhu)
  – Purdue (Bertino)
  – Illinois (Han)
  – Michigan (Adamic)

Assured Information Sharing?
• Traditional approaches to IA discourage information sharing
• The challenge for AIS is to encourage information sharing with IA
• And to address specific issues raised by information sharing, e.g.,
  – interoperability and integration
  – data semantics
  – data quality
  – etc.

Motivation for AIS
• 9/11 and related events illustrated problems in managing sensitive information
• Managing Web information & services with appropriate security, privacy and simplicity is increasingly important and challenging
• Autonomous devices (mobile phones, routers & medical equipment) need to share, too
• EMRs a national goal, raises privacy issues
• Business needs better models for DRM

Many underlying problems prevent/hinder sharing
• Sharing takes effort and maybe has risks. Why should I bother?
• How can I constrain how shared information is used?
• How do I know what information is available?
• Do I understand what the information means?
• Is the information accurate and timely?
• How can I safely let others know what I can share?
• What privacy will I have in sharing information?
• We’re under attack and I need this information to prevent a disaster!

Information value chain

[Figure: cycle of advertise → discover → acquire → use → release]

Potentially, everyone is both an information consumer and producer

A system discovers information it can use from the advertisements of others

The advertising/discovery process must be controlled to prevent inappropriate disclosure
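
One way to picture controlled advertising is a catalog in which each holding carries the lowest audience level allowed to learn that it exists. The sketch below is only an illustration under that assumption: the level names, the `Holding` type, and the `advertise` function are hypothetical and do not come from the AISL project.

```python
from dataclasses import dataclass

# Hypothetical audience levels, ordered from least to most trusted.
LEVELS = {"public": 0, "partner": 1, "internal": 2}


@dataclass(frozen=True)
class Holding:
    name: str
    description: str
    min_level: str  # lowest audience level allowed to learn this holding exists


def advertise(holdings, audience_level):
    """Return only the descriptions the given audience is allowed to see."""
    rank = LEVELS[audience_level]
    return [h.description for h in holdings if LEVELS[h.min_level] <= rank]


# Illustrative holdings, invented for this example.
holdings = [
    Holding("weather", "regional weather observations", "public"),
    Holding("incidents", "network incident reports", "partner"),
    Holding("sources", "identities of reporting sources", "internal"),
]

public_ads = advertise(holdings, "public")
partner_ads = advertise(holdings, "partner")
```

Here the advertisement itself is filtered, so a public audience never learns that the more sensitive holdings exist.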

The principals negotiate a policy for the information’s acquisition and use

Negotiation involves exchange of credentials & certificates, producing permissions & obligations
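
The negotiation step can be sketched as matching presented credentials against a provider's rules and collecting the resulting permissions and obligations. This is a minimal sketch under assumed names: `PolicyRule`, `Agreement`, `negotiate`, and the credential strings are all hypothetical, and a real negotiation would also verify certificates and may take several rounds.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PolicyRule:
    required_credentials: frozenset  # credentials the requester must present
    permissions: frozenset           # what the requester may then do
    obligations: frozenset           # what the requester must then do


@dataclass
class Agreement:
    permissions: set = field(default_factory=set)
    obligations: set = field(default_factory=set)


def negotiate(presented, rules):
    """Apply every rule whose required credentials were all presented."""
    agreement = Agreement()
    for rule in rules:
        if rule.required_credentials <= presented:
            agreement.permissions |= rule.permissions
            agreement.obligations |= rule.obligations
    return agreement


# Illustrative provider policy, invented for this example.
rules = [
    PolicyRule(frozenset({"member_of_coalition"}),
               frozenset({"read"}),
               frozenset({"log_access"})),
    PolicyRule(frozenset({"member_of_coalition", "analyst_certificate"}),
               frozenset({"summarize"}),
               frozenset({"no_resharing"})),
]

agreement = negotiate({"member_of_coalition"}, rules)
```

With only the coalition credential, the requester gains read permission and takes on the logging obligation; presenting the analyst certificate as well would add the second rule's permission and obligation.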

The information is used, often resulting in the discovery of new knowledge

We must assure correct semantics and information quality

which is screened, adapted and summarized for possible release

Enforce obligations on usage and re-sharing, privacy-preserving summaries, incentives for sharing

and appropriately characterized in advertisements for others to find

Incentives encourage offering to share information

Our AISL research areas
We’ve organized our research into four major areas
• New policy models, languages and tools
• Data mining, data quality and privacy-preserving systems
• Social networks and incentives
• AIS service/agent-oriented infrastructure
And we will evaluate our work in several integrated applications in the out years

Two AIS MURI Projects

http://aisl.umbc.edu/

http://www.projectpresidio.com/

Semantic Web? Big Data!
• Massive amounts of data drive many fields
• The Web is the biggest innovation in information sharing in our lifetime
• Focus on unstructured (text, audio, images) but increasingly structured and semi-structured data is online also
• Data and knowledge interoperability and integration are key problems
• Working toward a Web of data

Twenty years ago…
Tim Berners-Lee’s 1989 WWW proposal described a web of relationships among named objects unifying many information management tasks

Capsule history
• Guha’s MCF (~94)
• XML+MCF=>RDF (~96)
• RDF+OO=>RDFS (~99)
• RDFS+KR=>DAML+OIL (00)
• W3C’s SW activity (01)
• W3C’s OWL (03)
• SPARQL, RDFa (08)
• Rules (09)

http://www.w3.org/History/1989/proposal.html

Ten years ago…
• The W3C started developing standards for the Semantic Web
• The vision, technology and use cases are still evolving
• Moving from a web of documents to a web of data

Today

Tomorrow

DBpedia: Linked Data lynchpin

http://dbpedia.org/sparql/

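# Properties that link Barack Obama to resources typed as places
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/resource/>
PREFIX dbpo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?Property ?Place
WHERE {
  dbp:Barack_Obama ?Property ?Place .
  ?Place rdf:type dbpo:Place .
}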
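
The query above joins two triple patterns: one binds every property and value of dbp:Barack_Obama, and the other keeps only the values typed as dbpo:Place. The Python sketch below mimics that join over a small in-memory set of triples. The triples are toy data invented for illustration, not actual DBpedia output, and `properties_to_places` is a hypothetical helper, not part of any SPARQL library.

```python
RDF_TYPE = "rdf:type"

# Toy triples (subject, predicate, object); illustrative only.
triples = {
    ("dbp:Barack_Obama", "dbpo:birthPlace", "dbp:Honolulu"),
    ("dbp:Barack_Obama", "dbpo:spouse", "dbp:Michelle_Obama"),
    ("dbp:Honolulu", RDF_TYPE, "dbpo:Place"),
    ("dbp:Michelle_Obama", RDF_TYPE, "dbpo:Person"),
}


def properties_to_places(subject, triples):
    """Return (property, place) pairs, mirroring the two-pattern join."""
    # Pattern 2: ?Place rdf:type dbpo:Place
    places = {s for (s, p, o) in triples if p == RDF_TYPE and o == "dbpo:Place"}
    # Pattern 1: subject ?Property ?Place, restricted to known places.
    # Building a set gives the effect of DISTINCT.
    return {(p, o) for (s, p, o) in triples if s == subject and o in places}


result = properties_to_places("dbp:Barack_Obama", triples)
```

Only the birth-place triple survives the join, because the spouse value is not typed as a place in the toy data.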

Background knowledge
We’ve used this as background knowledge in two recent challenge tasks run by NIST
  – ACE 2008: Recognizing co-referent entity mentions in text documents
  – TAC 2009: Automatically populating Wikipedia infoboxes with new data extracted from text

Conclusion
• Assured information sharing involves techniques to encourage information sharing
  – while assuring appropriate confidentiality, privacy, data quality and interoperability
• Semantic Web technologies offer a way to share common policy concepts, policies, and domain models
  – as well as promote data and knowledge integration

http://ebiquity.umbc.edu/

Contrast with a non-Web approach

The W3C Semantic Web approach is
• Distributed
• Open
• Non-proprietary
• Standards based

DBpedia: Wikipedia in RDF
• A community effort to extract structured information from Wikipedia and publish it as RDF on the Web
• Effort started in 2006 with EU funding
• Data and software open sourced
• DBpedia doesn’t extract information from Wikipedia’s text, but from its structured information, e.g., links, categories, infoboxes