Evolution Towards Web30 Semantic Web 110531013826 Phpapp01


  • 8/6/2019 Evolution Towards Web30 Semantic Web 110531013826 Phpapp01



    Happy to go off track and follow the thread of any interesting questions and discussion

    that arise as we go.

    2


    See http://www.w3.org/2009/Talks/0427-web30-tbl/ for Tim Berners-Lee's take on many of these same themes.

    3


    I'm not going to dwell on this, because everyone in this class by now surely has a deeper and more sophisticated understanding of how we got to where we are. But looking at the steps to this point in the context of a timeline may help us understand the current Semantic Web landscape.

    Two key characteristics of the birth and success of Web 1.0:

    1. From the very beginning, it was founded on the democratic principle that no node in the Web is privileged: anyone can link to anyone.

    2. There were (relatively speaking) very few data publishers in the Web initially. Most users browsed only to consume information.

    4


    The AJAX technology stack allowed developers to create mature Web applications (approaching parity with fat-client applications) rather than (only) Web pages. It also began allowing Web content to be repurposed to applications beyond the browser (desktop, embedded devices, mobile devices, ...).

    Eventually, these Web applications began allowing Web users to contribute to parts of the Web rather than (only) consume Web pages.

    Beginning in 2002, Web thought leaders (esp. Dale Dougherty, Tim O'Reilly, John Battelle) began referring to the confluence of user-generated content, Web-as-platform, social Web, read-write Web, wisdom of crowds, ... as "Web 2.0".

    5


    At the physical level, computers are connected to switches, routers, etc. via network links.

    8


    The Internet directly links machines by abstracting away the network-link boundaries.

    9


    Each computer participating in the Web (Web server) is providing access to many

    documents (Web pages). The Web lets us make links between these documents.

    10


    The Web lets us abstract away the computers and the Internet and focus on the linked

    documents.

    11


    But people are rarely interested in the documents. They're interested in the information (the data) within the documents.

    12


    The Semantic Web abstracts away the documents (the sources of the information), and leaves us with data linked together. Linked Data, Web of Data, etc. This is the "Web" part of "Semantic Web".

    13


    It also gives us the tools to understand the Web of data and bring structure (understanding) to it. This is the "semantics" part of "Semantic Web".
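
    As a rough sketch of what "data linked together" can look like, the Python below models facts as (subject, predicate, object) triples, merges facts that came from two different documents, and follows a link from one source into the other. The `ex:` identifiers and the helper `objects` are made up for illustration; a real deployment would use RDF with full URIs rather than this toy representation.

```python
# Hypothetical facts extracted from two different Web documents.
doc_a = {
    ("ex:TimBL", "ex:name", "Tim Berners-Lee"),
    ("ex:TimBL", "ex:wrote", "ex:SciAm2001"),
}
doc_b = {
    ("ex:SciAm2001", "ex:title", "The Semantic Web"),
    ("ex:SciAm2001", "ex:publishedIn", "Scientific American"),
}

# Abstracting away the documents: only the combined set of facts remains.
web_of_data = doc_a | doc_b


def objects(graph, subject, predicate):
    """Return every object asserted for (subject, predicate), sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# Follow a link that starts in one source and ends in the other.
article = objects(web_of_data, "ex:TimBL", "ex:wrote")[0]
print(objects(web_of_data, article, "ex:title"))
```

    The query never needs to know which document a fact came from, which is the abstraction these notes describe.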

    14


    Tim Berners-Lee first used the term Semantic Web to describe a vision for the future of the Web as early as the first WWW conference in 1994.

    Along with Jim Hendler and Ora Lassila, he laid out this vision in a 2001 article in Scientific American.

    In 2007, the birth of the Linking Open Data project saw the first real concerted efforts to build out the Semantic Web by publishing data sets on the Web that could be queried and linked to one another.

    2008-2010 saw significant uptake in Semantic Web support on the Web and inside enterprises, highlighted by support from Google and Yahoo and data from Best Buy, the NY Times, and the US and UK governments. (Also: Semantic Web support in Drupal 7 and the Facebook Open Graph Protocol (2010).)

    This is a long time span, and yet many (myself included) would hesitate to say that the

    Semantic Web (Web 3.0) is here. When will that day come? How do we tell?

    15


    What's been happening this whole time? (Between the introduction of the vision and today.) A lot of technology, standards, tool, and product development. Also, a lot of advocacy.

    16


    This is the ultimate vision as per the original Scientific American article. Referred to last

    week as the top-down approach.

    17


    Many of the people that have been building the technologies, standards, and tools are

    doing so with these ends in mind. They have (disruptive, game-changing) problems

    today and these technologies provide a way to solve them today.

    18


    Different nuances, but the same actual thing. Still, you can often tell a lot about someone's view of the Semantic Web based on the terms they choose to describe it.

    "Linked Data Web" has been, relatively speaking, successful in gaining traction.

    20


    Ideas?

    * Incremental value -> improved efficiency
    * Link/integrate beyond traditional enterprise sources -> greater value, more appealing partner
    * Shadow data (emails, documents, spreadsheets, presentations, ...)
    * Partner data (upstream/downstream supply chain, customers, partners, channels, ...)
    * Needle in haystack (reasoning, inference, rules) -> greater value
    * Reach -> improved efficiency

    21


    On the Web, the advent of the Semantic Web is about moving the paradigm from

    document-centric to data-centric. The enterprise has elements of that too, but more

    often than not within enterprise IT semantics is about moving from relational database

    a


    23

    Important to the enterprise value proposition: this is overlay technology that does not require an ove


    (This slide best told with animation in the original PowerPoint.)

    The Semantic Web paradigm allows new and updated data to be brought into the fold incrementally, without starting over. This makes it particularly amenable to changing requirements.
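
    To make "incrementally, without starting over" concrete, here is a minimal sketch that assumes data is held as a plain Python set of (subject, predicate, object) triples. All identifiers and values are hypothetical. The point is that an update carrying properties nobody planned for is absorbed by a set union, with no schema redesign and no reload of existing facts.

```python
# Existing data, held as (subject, predicate, object) triples.
graph = {
    ("ex:compound42", "ex:type", "ex:Compound"),
    ("ex:compound42", "ex:assayedIn", "ex:cellLineA"),
}

# A later delivery introduces properties the original data never had.
update = {
    ("ex:compound42", "ex:ic50_nM", 12.5),
    ("ex:cellLineA", "ex:organism", "ex:Human"),
}

before = {p for _, p, _ in graph}
graph |= update  # no table redesign, no reload of the existing facts
after = {p for _, p, _ in graph}

# The predicates that appeared only with the update.
print(sorted(after - before))
```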

    24


    Databases that traditionally manage enterprise data are IT artifacts. They're crafted by IT, for IT: asking scientists or other business domain experts to understand a relational model with scores of tables, IDs, key/value tables, unused columns, etc. is completely unrealistic.

    The semantic model is a conceptual model. It eschews IDs, keys, etc. in favor of concepts and relationships expressed/expressible in human language. This is reflected in software that is built with Semantic Web data. This means that when a researcher is linking their results spreadsheet, they're dealing only in concepts that they're familiar with (organism, cell line, % inhibition, 4P, IC50, etc.). And that in turn means that this approach works regardless of whatever spreadsheet layout a particular collaborator is using: researchers can continue using their current spreadsheets, with no change.
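
    One way to picture the spreadsheet point is the sketch below. It assumes each collaborator supplies a small mapping from their own column headers to shared concept names; the sheet contents, header names, and the `to_concepts` helper are all invented for illustration and are not taken from any particular product.

```python
import csv
import io

# Two hypothetical collaborators with different spreadsheet layouts.
sheet_a = "Cell Line,Organism,IC50 (nM)\nHeLa,Human,12.5\n"
sheet_b = "species,ic50,line\nMouse,40.0,3T3\n"

# Per-sheet mapping from column header to a shared concept name.
mapping_a = {"Cell Line": "cell line", "Organism": "organism", "IC50 (nM)": "IC50"}
mapping_b = {"line": "cell line", "species": "organism", "ic50": "IC50"}


def to_concepts(text, mapping):
    """Re-key each row by concept, ignoring column order and header wording."""
    rows = csv.DictReader(io.StringIO(text))
    return [{mapping[col]: val for col, val in row.items()} for row in rows]


records = to_concepts(sheet_a, mapping_a) + to_concepts(sheet_b, mapping_b)
for record in records:
    print(record["cell line"], record["organism"], record["IC50"])
```

    Neither spreadsheet had to change; only the mapping differs per collaborator, and downstream code works purely in terms of the concepts.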

    26


    We're not yet at the point where the Semantic Web is a magic crank. It's not yet:

    * An automated way for pharmaceutical companies to discover new drugs for their pipelines
    * An automated way for oil and gas companies to identify productive drilling locations
    * The (generic, intelligent) travel butler, or other autonomous Web-based agent

    But nevertheless, a lot of people are embracing linked data in a lot of ways, and a lot of companies are using Semantic Web technologies and a linked data approach successfully today. What follows are some examples.

    28


    Web 1.0 and Web 2.0(+) are core parts of our lives, from reading CNN.com to buying things on amazon.com to Facebook and Twitter and Web-delivered mobile apps for scanning bar codes, looking up music, etc.

    Web 3.0 is not so obvious. The answer to the question is "at least occasionally, but you probably never see it." We'll see some examples of where you might be seeing the fringes of Web 3.0 in the coming slides, including:

    * Facebook Open Graph protocol
    * Drupal
    * RDFa in search results with Google and Yahoo!
    * BBC World Cup site

    29


    Courtesy W3C SWEO group, http://linkeddata.org/docs/eswc2007-poster-linking-open-data.pdf

    30


    31

    Courtesy of Richard Cyganiak, http://richard.cyganiak.de/2007/10/lod/


    32

    Courtesy of Chris Bizer, http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-05.html


    33

    Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/


    Possible answers:

    * Few people are driven by data ownership, data portability
    * People are drawn to specific sites
    * People _want_ to segment their online profiles (cf. Facebook vs. LinkedIn)

    Drupal, which runs 1% of the world's Web sites, is on the leading edge of adoption of the Semantic Web for content-driven sites. Drupal 7 exposes the semantics of Drupal sites' natural structures to Google/Yahoo! with RDFa. There are also modules for SIOC and Facebook OGP.

    34


    The key point here is that though FB published this protocol, it relies on open Semantic

    Web standards (RDFa) that anyone else can consume. The same semantics allow

    people to link the Like button to the type of artifact being liked (movie, here) and also

    can allow search engines to give more structure, query engines to find more data, etc.
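
    To illustrate that anyone can consume this markup, here is a small sketch using only Python's standard `html.parser`. It assumes the page carries Open Graph metadata as `<meta property="og:..." content="...">` elements; the sample page, its URL, and the `OpenGraphParser` class are hypothetical, and this is not a full RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical page head carrying Open Graph metadata for a movie.
PAGE = """
<html><head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://example.com/movies/the-rock" />
</head><body></body></html>
"""


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if tag == "meta" and prop.startswith("og:"):
            self.properties[prop] = attributes.get("content")


parser = OpenGraphParser()
parser.feed(PAGE)
print(parser.properties["og:type"])
```

    The same few properties that drive the Like button are available to a search engine or any other third-party consumer without screen scraping.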

    35


    Image courtesy of http://bio2rdf.org/ .

    Scientific data makes up a significant portion of the current Linked Data Web. This is information on proteins and genes, pathways and sequences, chemistry and genetics, ... This diagram shows some of the information available and how it's linked together. Nodes are sized according to their quantity of data, and links are sized according to the quantity of links.

    36


    Google (Rich Snippets) and Yahoo! (originally SearchMonkey) consume semantic markup to enhance search listings.

    42


    http://searchnewscentral.com/20110207129/Technical/rdfa-the-inside-story-from-best-buy.html

    43


    Many enterprise uses of Semantic Web / Linked Data are highlighted at: http://www.w3.org/2001/sw/sweo/public/UseCases/

    44


    Question: Where in this scenario do you think Semantic Web concepts and technologies are being employed? What would the alternative be?

    Answers: integrating data to get as large a universe as possible; rules and reasoning to intelligently filter the data

    45


    Combine manual tagging with ontology-driven reasoning and ontology-driven dynamic aggregation (700 index pages, more than the rest of the sports site combined) to produce a dynamic, cross-indexed, cross-linked, useful site for the World Cup.

    What is the semantic value here?

    * Produce an information-rich site at many levels of aggregation (player, team, geography, group, ...) without employing a large fleet of editors to curate the site's _content_. Instead, maintain an ontology and provide a content tagging process.

    * Use the ontology to help automate the tagging process (forward-chaining inference based on taxonomies)
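
    The tagging idea can be sketched in a few lines of Python. This is only an illustration of forward chaining over a taxonomy, not the BBC's implementation: the `BROADER` table, the entity names, and the `infer_tags` helper are all hypothetical.

```python
# Hypothetical fragment of a sports ontology: each entity points to its
# broader concepts (player -> team -> group / geography).
BROADER = {
    "player:Example Striker": ["team:Example FC"],
    "team:Example FC": ["group:A", "geo:Exampleland"],
}


def infer_tags(manual_tags):
    """Forward-chain: keep adding broader concepts until nothing new appears."""
    tags = set(manual_tags)
    frontier = list(manual_tags)
    while frontier:
        tag = frontier.pop()
        for parent in BROADER.get(tag, []):
            if parent not in tags:
                tags.add(parent)
                frontier.append(parent)
    return tags


# An editor tags a story with one player; the remaining tags are derived.
story_tags = infer_tags(["player:Example Striker"])
print(sorted(story_tags))
```

    A story tagged once at the player level can then appear on the team, group, and geography index pages without further editorial work.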

    For more details:

    http://www.bbc.co.uk/blogs/bbcinternet/2010/07/bbc_world_cup_2010_dynamic_sem.html

    http://www.bbc.co.uk/blogs/bbcinternet/2010/07/the_world_cup_and_a_call_to_ac.html

    46


    Other governments with similar efforts: Australia, Sweden, New Zealand, ..., various local governments

    47


    At TED2010, Tim Berners-Lee reported back on one year's worth of progress after the push for raw data began in 2009.

    Q: What's special about Semantic Web / Linked Data here? What would be different if this were all put out using "Web 2.0" approaches?

    * baked into the Web -- _easy to publish and consume via the existing Web infrastructure, flexible, heterogeneous_

    * "semantic" -- _easy for 3rd parties to understand - no screen scraping, "guessing" - lots you can do with it (layer cake)_

    48


    It's not all sunshine, rainbows, and puppies.

    (This slide better with animation, sorry!)

    50


    Industry: no different from any other investment. Expect to see ROI, whether in the form of time-to-market, competitive advantage, greater efficiencies, lessened resource requirements, etc. Look for disruptive (10x) improvements.

    Linked Data Web:

    * Putting raw data on the Linked Data Web takes work.
    * Scientific data is funded by government money, with requirements for openness.
    * Commercial data is driven by ROI (cf. Best Buy's experience).
    * Government data is tricky: it is at the whims of politics (cf. data.gov.uk with the change from Gordon Brown to David Cameron).
    * Maintaining links between data sets is tricky. Is it any trickier than building the document Web? (Maybe not.)

    52


    Another example: the NY Times, while embracing the Linked Data Web to some extent, is putting their real content behind a pay wall.

    Image copyright Scott Brinker, with attribution to http://www.chiefmartec.com/2010/01/7-business-models-for-linked-data.html .

    See also http://www.ldodds.com/blog/2010/01/thoughts-on-linked-data-business-models/ -- advertising is hard when people aren't the consumers and when all data is semantically identified! (Advertising via Ts&Cs possible)

    53


    A large (Fortune 100) company might have 10,000 databases. And some of those databases might be huge: 10 TB or bigger.

    But large enterprises are also sub-segmentable in ways in which the Web is not. There are divisional, departmental, geographic problems that can be solved as if solving the problems of a much smaller enterprise.

    There are social challenges (some of which are covered elsewhere in this section of the talk), but there are also pure technical challenges when working at Web scale:

    * Distributed query
    * Cache invalidation
    * Link rot
    * Data rot (Linked CT example)
    * Rules / reasoning across data sets

    54


    (While this is a challenge for being able to fully exploit the Linked Data Web, it's also an opportunity: before the Linked Data Web, there was little opportunity to find and improve these sorts of data quality issues. Linked Data gives us visibility into these data issues so that source data can be improved. But it is still a challenge to figure out a model for improving and verifying data quality before individual human interpretation can be removed from the chain.)

    56


    Possible ideas:

    * Up-vote/down-vote for data and data sets (wisdom of crowds)
    * Build agents off of authoritative (1st-party) sources
    * Certified sources, audited sources, regulated sources

    57


    Potential approaches to trust:

    * Digital signatures
    * Social network analysis
    * Multiple assertions of the same fact (voting, data quality all over again)
    * Provenance (how did we arrive at this data assertion)
    * Trust the contributions: it's data quality all over again! (specific facts, sets of facts, entire data sets)

    Related issue: uncertainty

    Within an enterprise: accepted sources of authority; default trust state
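
    The "multiple assertions of the same fact" approach could be sketched as a simple vote across sources. The sources, identifiers, and the `vote` helper below are hypothetical, and a plain majority is only one of many possible ways to weigh assertions; it ignores provenance and source reputation entirely.

```python
from collections import Counter

# Hypothetical assertions of the same property from different sources:
# (source, subject, predicate, value).
assertions = [
    ("source1", "ex:trial7", "ex:status", "completed"),
    ("source2", "ex:trial7", "ex:status", "completed"),
    ("source3", "ex:trial7", "ex:status", "recruiting"),
]


def vote(assertions, subject, predicate):
    """Return (value, share of sources) for the most asserted value."""
    values = Counter(
        v for _, s, p, v in assertions if s == subject and p == predicate
    )
    value, count = values.most_common(1)[0]
    return value, count / sum(values.values())


value, support = vote(assertions, "ex:trial7", "ex:status")
print(value, round(support, 2))
```

    The support share also gives a crude handle on the related uncertainty issue: a consumer could refuse to act on facts below some agreed threshold.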

    58


    Enterprises can derive incremental value via a small number of Semantic Web vendors and Semantic Web knowledgeable system integrators (SIs). To gain traction on a Web scale, however, requires the world of Web 1.0 & 2.0 (LAMP, JSON, ...) developers to adopt these new (and arguably more complex) technologies.

    59


    Only 9% of Linked Open Data datasets include machine-readable license information.

    (http://ivan-herman.name/2011/03/29/ldow2011-workshop/)

    Some links for further reading:

    http://www.opendefinition.org/guide/data/

    60
