Page 1: Transparency, collaboration and information sharing solution architecture tools and techniques using the social data web george thomas, 1105 ea2009.

transparency, collaboration and information sharing

solution architecture tools and techniques using the social data web

george thomas, 1105 ea2009

Page 2:

agenda

• An overview of Web Oriented Architecture (WOA) design principles that have made the Web the most successful distributed computing platform ever created will be given.

• Technologies for exposing raw data and publishing semantically enriched structured data for persistence and syndication on the Web as public records will be described.

• Technologies that enable interoperability across these published assets and currently disparate data sources to achieve low cost, large scale data federation will be described.

• Widgets and services that consume and transform this data for interactive and integration purposes will be discussed in the context of different stakeholder views.

• A Web-scale approach to Business Intelligence leveraging Cloud Computing approaches to data archive analysis will be described.

• Finally, the applicability of the proposed solution architecture to the Federal Segment Architecture Methodology and tools like Visualization to Understand Expenditures in IT will be discussed.

Page 4:

Web Oriented Architecture (WOA)

• REpresentational State Transfer (REST)
– The architectural style of the World Wide Web
– aka Resource Oriented Architecture (ROA)
• hyperlinks dereference (information) resource representations
– HTTP URIs and content negotiation
• user agent prefers .htm, .xml, .rdf, etc.
• statefulness
– servers maintain resource state, clients maintain application state
• RESTful Web services
– HTTP uniform interface
• CRUD analog to HTTP PUT/GET/POST/DELETE
– contrast to Remote Procedure Call (RPC) style Web services
• SOAP/WSDL, you design the methods to invoke
• global visibility (the Web) and persistence (permalinks)
– caching, crawling, indexing
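
The CRUD analogy on this slide can be sketched as a toy in-memory server that keeps resource state and answers the four uniform-interface methods. This is only an illustration under stated assumptions: the `ResourceStore` class, its `handle` method and the `/records` paths are hypothetical names, and the method-to-CRUD mapping shown is one common reading of REST practice, not something the slide spells out.

```python
from typing import Dict, Optional, Tuple


class ResourceStore:
    """Toy server (hypothetical): holds resource state keyed by URI path."""

    def __init__(self) -> None:
        self._resources: Dict[str, str] = {}
        self._next_id = 1  # used to mint URIs for POSTed resources

    def handle(
        self, method: str, path: str, body: Optional[str] = None
    ) -> Tuple[int, Optional[str]]:
        """Apply one HTTP-style method and return (status code, payload)."""
        if method == "GET":  # Read
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method == "PUT":  # Create or replace at a client-chosen URI
            created = path not in self._resources
            self._resources[path] = body or ""
            return (201 if created else 200), self._resources[path]
        if method == "POST":  # Create a subordinate resource; server picks the URI
            new_path = f"{path.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[new_path] = body or ""
            return 201, new_path
        if method == "DELETE":  # Delete
            if path not in self._resources:
                return 404, None
            del self._resources[path]
            return 204, None
        return 405, None  # anything else is outside this toy interface


store = ResourceStore()
store.handle("PUT", "/records/a", "first version")
```

Unlike the RPC style, no method names are designed here: every resource answers the same four verbs, and only the URI and the representation vary.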

Page 6:

XForms - human data capture

• Orbeon server side XForms engine, Ajax browser GUIs
• catalog and builder apps
• create new XSD bound forms
• populate, persist, search
• Tomcat and eXist
• off-line capability
• transformation pipeline

Page 7:

Atom Publishing Protocol (APP)

• automated invocation of the RESTful Web service
– HTTP PUT/POST the spreadsheet or XML instance doc
• to
• where else is APP used?
– Google Data APIs, Microsoft Live Framework
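
As a rough sketch of the "HTTP POST the XML instance doc" step, the snippet below wraps an instance document in an Atom entry and prepares (but does not send) the POST request. The collection URI `http://example.org/collection`, the sample `<record>` document and the helper names `build_entry` and `build_post` are assumptions made for illustration. The entry is deliberately minimal: a fully valid Atom entry also needs elements such as `id` and `updated`.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("atom", ATOM_NS)


def build_entry(title: str, instance_xml: str) -> bytes:
    """Wrap an XML instance document in a minimal Atom entry (sketch only)."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    content = ET.SubElement(
        entry, f"{{{ATOM_NS}}}content", {"type": "application/xml"}
    )
    content.append(ET.fromstring(instance_xml))  # embed the captured document
    return ET.tostring(entry, encoding="utf-8")


def build_post(collection_uri: str, entry: bytes) -> urllib.request.Request:
    """Prepare the POST to an APP collection; nothing is sent here."""
    return urllib.request.Request(
        collection_uri,
        data=entry,
        method="POST",
        headers={"Content-Type": "application/atom+xml;type=entry"},
    )


# Hypothetical collection URI and record, for illustration only.
request = build_post(
    "http://example.org/collection",
    build_entry("Exhibit 53 record", "<record><agency>HHS</agency></record>"),
)
```

Passing `request` to `urllib.request.urlopen` would perform the actual call. A server following the protocol would normally answer 201 Created with a `Location` header pointing at the new member.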

Page 8:

Atom Syndication Format

• transform XForm or APP captured info into XHTML+RDFa
• (permalinked) public recordset in feed entry <content>
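
One way to picture the XHTML+RDFa transformation is a helper that renders a flat record as an XHTML `<div>` whose `about` and `property` attributes carry the RDFa. The vocabulary prefix `ex`, its URI, the subject URI and the field names below are all hypothetical. Only the wrapping `<div>` convention comes from the Atom format, which expects exactly one XHTML `div` inside `<content type="xhtml">`.

```python
from typing import Mapping
from xml.sax.saxutils import escape, quoteattr

XHTML_NS = "http://www.w3.org/1999/xhtml"


def record_to_rdfa(
    subject_uri: str, prefix: str, vocab_uri: str, fields: Mapping[str, object]
) -> str:
    """Render a flat record as an XHTML <div> carrying RDFa attributes."""
    spans = []
    for name, value in fields.items():
        curie = quoteattr(f"{prefix}:{name}")  # e.g. "ex:agency", quoted
        spans.append(f"<span property={curie}>{escape(str(value))}</span>")
    return (
        f"<div xmlns={quoteattr(XHTML_NS)} "
        f"xmlns:{prefix}={quoteattr(vocab_uri)} "
        f"about={quoteattr(subject_uri)}>" + "".join(spans) + "</div>"
    )


# Hypothetical record; the URIs and field names are made up for illustration.
fragment = record_to_rdfa(
    "http://example.org/e53/1",
    "ex",
    "http://example.org/vocab#",
    {"agency": "HHS", "title": "Research & Development"},
)
```

The resulting fragment is readable in a browser and, at the same permalink, parseable as triples by an RDFa-aware consumer.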

Page 10:

small, discrete, component ontology/data-domain-metamodels

Page 11:

web page = web service

Page 12:

RDFa enabled 'deep link' discovery

• Rich Snippets from Google
• SearchMonkey from Yahoo

Page 14:

goal: federated dataset correlation

• graph based dynamic schema evolution across silos
– centralization/normalization not required (or realistic/practical!)

Page 15:

Web as DB - Web API

• Linking Open (Government) Data (LOD)
• SPARQL endpoints

Page 16:

browse: from web of docs to web of data

Page 17:

• content negotiation, user agent prefers:
– human (html) or machine (rdf/xml) readable
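
A minimal sketch of how a server might pick between the human and machine representations from the `Accept` header. It is a simplification under stated assumptions: the function names are hypothetical, only exact media types and `*/*` are matched (no `text/*` ranges), and a real server would normally rely on its framework's negotiation support.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (media_type, q) pairs, most preferred first."""
    prefs = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's listed order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(accept: str, available: List[str]) -> Optional[str]:
    """Pick the representation to serve, or None if nothing is acceptable."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0] if available else None
        if media_type in available:
            return media_type
    return None


# One URI, two representations: HTML for people, RDF/XML for machines.
AVAILABLE = ["text/html", "application/rdf+xml"]
browser_choice = negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8", AVAILABLE)
crawler_choice = negotiate("application/rdf+xml", AVAILABLE)
```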


Page 18:

• now at the bottom of the same page/actor/10
– triple is Subject (S) Predicate (P) Object (O)
• 10 (S) vocabulary:property (P) <object> (O)
– properties link to other dataset instances
• that use different datatype definitions
– note D2R app, expose RDB as RDF, SPARQL to SQL

Page 19:

• <subject> has predicate {space} object1 , objectN ; repeat until .

<> foaf:page <> ,

<> ;

owl:sameAs <> , <> ;

rdf:type movie:actor ,

foaf:Person .

• this is an 'N3' RDF serialization, instead of RDF/XML (or others)

• some properties have RESTful SPARQL queries as <objects>

foaf:person rdfs:seeAlso <<>
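
The "object1 , objectN ; repeat until ." rule can be illustrated with a small serializer that groups triples by subject and predicate and emits the abbreviated N3 form. This is a sketch, not a complete N3 writer: terms are assumed to arrive already formatted (`<uri>` or `prefix:name`), prefix declarations are omitted, and the `example.org` URIs are placeholders because the slide's own URIs did not survive extraction.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def to_n3(triples: Iterable[Triple]) -> str:
    """Serialize triples with ',' between objects and ';' between predicates."""
    grouped: Dict[str, Dict[str, List[str]]] = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)
    blocks = []
    for s, predicates in grouped.items():
        # one "predicate object1 , objectN" clause per predicate
        clauses = [f"{p} {' , '.join(objs)}" for p, objs in predicates.items()]
        # ';' repeats the subject, '.' closes the statement
        blocks.append(f"{s} " + " ;\n    ".join(clauses) + " .")
    return "\n".join(blocks)


# Placeholder URIs; the vocabulary terms mirror those shown on the slide.
actor = "<http://example.org/actor/10>"
n3_text = to_n3(
    [
        (actor, "rdf:type", "movie:actor"),
        (actor, "rdf:type", "foaf:Person"),
        (actor, "foaf:page", "<http://example.org/page/10>"),
    ]
)
```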

Page 20:

Web based SPARQL query builder is powered by 'Virtuoso' that provides a 'SPARQL endpoint' (DRM 'query point')

Page 21:

creates query

• use response data in next query
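
Chaining queries against a SPARQL endpoint might look like the sketch below: the query is sent as a `query` URL parameter, the response is read in the SPARQL JSON results format, and a bound value is substituted into the next query. The endpoint URL, the `appearsIn` property and the canned response are assumptions for illustration, and no network call is made.

```python
import json
from typing import List
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint


def query_url(endpoint: str, sparql: str) -> str:
    """Build the GET URL for a SPARQL query (query passed as a URL parameter)."""
    return endpoint + "?" + urlencode({"query": sparql})


def bound_values(results_json: str, variable: str) -> List[str]:
    """Extract the values bound to one variable from SPARQL JSON results."""
    data = json.loads(results_json)
    return [
        row[variable]["value"]
        for row in data["results"]["bindings"]
        if variable in row
    ]


first_query = (
    "SELECT ?film WHERE { <http://example.org/actor/10> "
    "<http://example.org/vocab#appearsIn> ?film }"
)
first_url = query_url(ENDPOINT, first_query)

# Canned response standing in for what a live endpoint would return.
sample_response = json.dumps(
    {
        "head": {"vars": ["film"]},
        "results": {
            "bindings": [
                {"film": {"type": "uri", "value": "http://example.org/film/42"}}
            ]
        },
    }
)

# Use response data in the next query.
films = bound_values(sample_response, "film")
next_query = (
    f"SELECT ?title WHERE {{ <{films[0]}> "
    "<http://purl.org/dc/terms/title> ?title }"
)
```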

Page 22:

authoritative metadata - provided tags!!

• using standardized datatype and property specifications
• ontologies emerge from social folksonomy

Page 24:

indexing/searching the Data Web

Page 25:

aggregation and live data reporting

Page 26:

many to many set visualization used to aggregate data across multiple (data) 'bases' on

Page 27:

ad-hoc analyst/end-user 'meshups'

Page 28:


• = OMG BMM + CPIC (+SOA...)
– Obama is an instance of the Federal Enterprise type
• Federal Enterprise (S) Fed Ent Goal (P) Goal (O)

Page 29:

/rdf/bizmo.federal_enterprise (excerpt)

• (W3C/FBase) <subject/topic> <predicate/property> <object/topic>

<> <> "Federal


<> <> "1"^^<>.

<> <> <>.

<> <> <>.

<> <> <>.

<> <> <>.

<> <> <>.

<> <> <>.

<> <> <>.

<> <> <>.

Page 30:

connecting the data dots:

• create the following subject/predicate/object or topic/property/topic


Goal / amplifies / Vision

Objective / quantifies / Goal

Federal Enterprise / (has) Fed Ent Goal / (of type) Goal

Federal Agency / maintains / Exhibit 53

Exhibit 53 / contains (multiple) / Exhibit 53 Recordset(s)

Exhibit 53 Recordset / Supports Federal Goal / (of type) Goal

• then create instances with data from

Obama / is of type / Federal Enterprise

Obama / has a Fed Ent Goal / Health Care Reform

HHS / is of type / Federal Agency

HHS / maintains / HHS Exhibit 53

HHS Exhibit 53 / contains / Nat Health Info Network Connect

Nat Health Info Network Connect / supports Obama Goal / Health Care Reform
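
The schema and instance statements above can be held as plain (subject, predicate, object) tuples and queried by pattern. The sketch below copies the labels verbatim from the slide. The `match` helper and the use of `None` as a wildcard are illustrative assumptions, not the API of the tool shown in the talk.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Schema-level triples: topic / property / topic
SCHEMA: List[Triple] = [
    ("Goal", "amplifies", "Vision"),
    ("Objective", "quantifies", "Goal"),
    ("Federal Enterprise", "(has) Fed Ent Goal", "Goal"),
    ("Federal Agency", "maintains", "Exhibit 53"),
    ("Exhibit 53", "contains", "Exhibit 53 Recordset"),
    ("Exhibit 53 Recordset", "Supports Federal Goal", "Goal"),
]

# Instance-level triples
INSTANCES: List[Triple] = [
    ("Obama", "is of type", "Federal Enterprise"),
    ("Obama", "has a Fed Ent Goal", "Health Care Reform"),
    ("HHS", "is of type", "Federal Agency"),
    ("HHS", "maintains", "HHS Exhibit 53"),
    ("HHS Exhibit 53", "contains", "Nat Health Info Network Connect"),
    ("Nat Health Info Network Connect", "supports Obama Goal", "Health Care Reform"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything that points at the Health Care Reform goal, whatever the predicate.
linked_to_goal = match(INSTANCES, o="Health Care Reform")
```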

Page 31:

search all 'bases' for 'Exhibit 53' interface to

Page 32:

base/bizmo/e53 returns

• a collection (2 instances) of an Exhibit 53 topic
– one from HHS and GSA (data from
• triple in Exhibit 53 topic schema
– Exhibit 53 (S) contains (P) Exhibit 53 Recordset (O)

Page 33:

discovering unknown data structures

• the power of 'faceted' search and browsing
• interactive query – which of these?
– Ex53 Recordset (S) Supports Federal Goal (P) ? (O)
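
Faceted browsing boils down to counting the distinct values of a property across a result set and then filtering on the value the user clicks. A hedged sketch follows. The helper names are hypothetical, and "Recordset B" and "Recordset C" are invented placeholders, since the slide names only one recordset.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, str]

# Only the first record is named on the slides; the other two are placeholders.
RECORDSETS: List[Record] = [
    {
        "name": "Nat Health Info Network Connect",
        "agency": "HHS",
        "Supports Federal Goal": "Health Care Reform",
    },
    {"name": "Recordset B", "agency": "HHS"},
    {"name": "Recordset C", "agency": "GSA"},
]


def facet_counts(records: List[Record], facet: str) -> Counter:
    """Count how many records carry each value of the given property."""
    return Counter(r.get(facet, "(none)") for r in records)


def filter_by(records: List[Record], facet: str, value: str) -> List[Record]:
    """Keep only the records whose property has the chosen value."""
    return [r for r in records if r.get(facet) == value]


# The facet answers the "? (O)" part of the interactive query.
goal_facet = facet_counts(RECORDSETS, "Supports Federal Goal")
selected = filter_by(RECORDSETS, "Supports Federal Goal", "Health Care Reform")
```

The counts are what let a user discover a structure they did not know in advance: the facet lists whichever goal values actually occur, without a predefined schema.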

Page 34:

traversing the data graph

• from info about an IT investment
• to info about Administration priorities
• 2 Ex53s to 3 Recordsets to 1 that has Obama Goal
– <uri> (S) <uri> (P) <uri> (O)
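
Traversal from an agency to an Administration priority can be sketched as a breadth-first search that follows triples from subject to object. The edge labels are taken from the earlier "connecting the data dots" slide. The `find_path` function is an illustrative assumption, not how the demonstrated tool works internally.

```python
from collections import deque
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

GRAPH: List[Triple] = [
    ("HHS", "is of type", "Federal Agency"),
    ("HHS", "maintains", "HHS Exhibit 53"),
    ("HHS Exhibit 53", "contains", "Nat Health Info Network Connect"),
    ("Nat Health Info Network Connect", "supports Obama Goal", "Health Care Reform"),
]


def find_path(triples: List[Triple], start: str, goal: str) -> Optional[List[Triple]]:
    """Breadth-first search along subject -> object links; returns the hops taken."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for s, p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                queue.append((o, path + [(s, p, o)]))
    return None  # goal not reachable from start


hops = find_path(GRAPH, "HHS", "Health Care Reform")
```

Each hop is itself a triple, so the result reads as the chain of statements that links the IT investment to the goal it supports.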

Page 35:

more faceted filtering

Page 36:

scatter chart driven by tag clouds

Page 37:

more multi-dataset faceted meshups

Page 38:

drag & drop metadata/data 'curation'

Page 39:

publish new freemix merged dataset

choose a stylesheet, view lenses and facets to include for your end users to interact with

Page 41:

crowdsourced analytics

shown using 'Top Braid Composer Maestro' from

'SPARQLMotion' script – also see Yahoo | Deri |

Page 42:

cloud scale analytics (petabyte batch)

• proprietary Google
– GFS, BigTable and MapReduce
– page rank impl
• open source Apache Hadoop
– HDFS, HBase and MapReduce
– entity, RDFa extraction
• Amazon EMR, Cloudera
– COSS prof service providers
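
The MapReduce pattern behind these platforms can be shown in miniature: a map step emits (key, 1) pairs, a shuffle groups them by key, and a reduce step sums each group. The single-process sketch below counts RDFa `property` names across pages as a stand-in for the "RDFa extraction" use case. It does not use Hadoop's API, and the sample pages and regular expression are simplifying assumptions.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Naive extraction of RDFa property names; a real job would use an RDFa parser.
PROPERTY_ATTR = re.compile(r'property="([^"]+)"')


def map_page(page: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (property name, 1) for every RDFa property found in a page."""
    for name in PROPERTY_ATTR.findall(page):
        yield name, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the occurrences recorded for one key."""
    return key, sum(values)


def count_properties(pages: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over all pages."""
    pairs = (pair for page in pages for pair in map_page(page))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


# Made-up sample pages, for illustration only.
PAGES = [
    '<div about="#a"><span property="ex:agency">HHS</span>'
    '<span property="ex:title">A</span></div>',
    '<div about="#b"><span property="ex:agency">GSA</span></div>',
]
counts = count_properties(PAGES)
```

On a cluster, the map and reduce functions stay the same in spirit while the framework distributes the pages and the shuffle across machines.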

Page 43:

cloud graph store

• Software as a Service, enabling rapid development with zero deployment


• a simple, consistent web API for storing, managing and retrieving both structured and unstructured data

• flexible, schema-free metadata that allows applications to be easily evolved

• a range of data access and query options enabling easy integration into both new and existing applications

• access control options to support hosting of both public and private data

• a data hosting solution that is founded on open internet standards and web architectural best practices

• ...

• every resource in your (data)store has a unique URL from which its metadata can be retrieved with a single web request

• SPARQL queries can be used to perform more complex queries, retrieving results as a tabular result set or as RDF

• content negotiation can be used to retrieve data as RDF, XML, or JSON, allowing you to choose the right format for your application

Page 45:

application to EA discipline: getting there from here

– stop:
• publishing / analyzing / visualizing unstructured data
• using structured data only in file or message exchanges
– start:
• align Gov and Web architecture (including EA KBs!)
• publish component ontologies on the Web
• and begin linking their metadata and data
• using the Social Data Web
– continue:
• embrace emergent structure and continuous improvement
• using open source and enabling long-tail crowd-sourcing

Page 46:

q&a - discussion

• thanks for your time and attention!

• contact me


– GSA OCIO Chief Enterprise Architect
– FCIOC-AIC Services Subcommittee Chair
– W3C eGov IG invited expert
– OMG GovDTF Steering Committee
– Graduate School Faculty SOA Instructor