From Books to Xanadu to Semantic Publishing
Scholars Communicating and Using Knowledge
Prof. Dr. Stefan Gradmann
Humboldt-Universität zu Berlin / School of Library and Information Science
Scholars Communicating and Using Knowledge / Prof. Dr. Stefan Gradmann / Texts and Literacy in the Digital Age / Den Haag 17-12-2010
… à la carte: A Three-Course Menu
Knowledge: what is it, actually?
… communicating: on the evolution of publishing knowledge
… using: towards future knowledge-based heuristics
Hors d'Oeuvre
Knowledge
(as part of the DIKT hierarchy)
(Very dirty) data
Data + Pattern: Information
Information + Context: Knowledge
..., 1941, 1943, , 1947, 1949, ...
1939 -
... and creative thinking
http://itunes.apple.com/de/album/dave-brubecks-greatest-hits/id157427923
DIKT: a Visualisation
DIKT: A Closer Look (1)
Data
  Discrete, atomistic portions of 'givens' without inherent structure or necessary relationship between them
  Data have no meaning in themselves
  Phonetic level in linguistics
Information
  Data + patterns
  Meaningful data
  Phonological / lexical level in linguistics
DIKT: A Closer Look (2)
Knowledge
  Information as part of a context and useful in this context
  Social or semantic context
  Contextualisation enables (simple!) interpolative and deterministic reasoning
  Syntactic level in linguistics
Thinking
  Replace the rich but diffuse concept 'wisdom' with 'thinking'
  Mental activity we cannot (entirely) delegate to machines
  Non-deterministic
  Semantic level in linguistics ('wisdom' would probably sit on the pragmatic level)
=> Knowledge = Information + Context
Main Course… Communicating Knowledge
Linear Document Continuum ...
… in the Gutenberg galaxy
Linear Document Continuum ...
… in emulation mode
Linear Document Continuum ...
… going digital (entering the Turing galaxy)
Web Based Scholarly Working Continuum ...
… a triple paradigm shift: Beyond Documents
  Decreasing functional determination by traditional cultural techniques
  Disintegration of the linear / circular functional paradigm
  Erosion of the monolithic document notion in hypertext paradigms
The Web of Documents
Information Management: A Proposal (TBL, 1989)
Ted Nelson's Xanadu: the document web radicalized ...
… and extended with a web of 'things'
… and 'publication' aggregations combining 'documents' and 'things'
Where do resource aggregations 'start'? Where do they 'end'? And what constitutes document boundaries?? And which node was connected to which one at a given time???
[Figure: graph with nodes A, B and C]
Machines can reason on triple sets!
Some reasoning preconditions ...
… and an automated inference!
There is quite some potential for generating scholarly heuristics here!
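To make the idea of reasoning on triple sets concrete, here is a minimal sketch in plain Python. It is not the example shown on the slides: the triples, the property names `type` and `subClassOf`, and the function `infer` are all assumptions chosen for illustration. The two rules loosely follow the RDFS subclass semantics.

```python
# Hypothetical triples (subject, property, object); names are illustrative only.
TRIPLES = {
    ("Epimenides", "type", "Cretan"),
    ("Cretan", "subClassOf", "Greek"),
    ("Greek", "subClassOf", "Human"),
}


def infer(triples):
    """Forward-chain two RDFS-style rules until no new triple appears."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                # Only chain through a subClassOf statement about o.
                if o != s2 or p2 != "subClassOf":
                    continue
                if p == "subClassOf":
                    # Rule 1: subClassOf is transitive.
                    new.add((s, "subClassOf", o2))
                elif p == "type":
                    # Rule 2: an instance also belongs to every superclass.
                    new.add((s, "type", o2))
        if new <= closure:
            return closure
        closure |= new


# Statements that were never asserted but follow from the asserted ones.
inferred = infer(TRIPLES) - TRIPLES
for triple in sorted(inferred):
    print(triple)
```

The point of the sketch is that the derived statements are produced mechanically from the asserted ones, which is the kind of deterministic reasoning the DIKT slides assign to the knowledge level.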
The Use of Inferences
Citation: van Haagen HHHBM, et al. (2009) Novel Protein-Protein Interactions Inferred from Literature Context. PLoS ONE 4(11): e7894. doi:10.1371/journal.pone.0007894
Example provided by Jan Velterop
LoD: Billions of Triples …
… and Semantic Publishing!
http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html
Semantic Publishing as Defined by Shotton
Shotton et al. (2009b) define semantic publication to include anything that
  enhances the meaning of a published journal article,
  facilitates its automated discovery,
  enables its linking to semantically related articles,
  provides access to data within the article in actionable form, or
  facilitates integration of data between articles.
Behind the Scene
Semantic Enrichment Tools
Generic:
  OpenCalais ()
  Temis ()
Specialised:
  Bio Taxon Finder ( ml_services)
  ConceptWebAlliance () (Biomedical, Jan Velterop)
Good critique by Roderic Page: “linking terms to HTML pages doesn't get us much further. Great for humans, not so good for computers.”
Too much focus on journal article format!
→ We need a little more!
Books: the Liquid Version
“Turning inked letters into electronic dots that can be read on a screen is simply the first essential step in creating this new library. The real magic will come in the second act, as each word in each book is cross-linked, clustered, cited, extracted, indexed, analyzed, annotated, remixed, reassembled and woven deeper into the culture than ever before. In the new world of books, every bit informs another; every page reads all the other pages.”
Kevin Kelly, The New York Times Magazine, May 14, 2006
Semantic Micro-Content: PAUX
A Semantic Wiki, not based on static HTML pages, but instead consisting of dynamic documents, provided at runtime from semantic microcontent (“PAUX-Objects”), semantically linked by “PAUX-Links”
Microcontent elements have HTTP URIs!
→ PAUX documents can be published as Linked (Open) Data aggregations with maximum granularity: down to word level.
PAUX creates “liquid books”
More at
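As a rough illustration of documents assembled at runtime from addressable microcontent, consider the following Python sketch. It does not reproduce the PAUX data model or API: the `example.org` URIs, the `MICROCONTENT` store and the `render` function are assumptions made up for this example.

```python
# Hypothetical microcontent store: every word-level object has its own HTTP URI.
# The example.org URIs are placeholders, not real PAUX addresses.
MICROCONTENT = {
    "http://example.org/obj/1": "books",
    "http://example.org/obj/2": "become",
    "http://example.org/obj/3": "liquid",
}

# A "document" is only an ordered list of references to microcontent objects.
DOCUMENT_A = [
    "http://example.org/obj/1",
    "http://example.org/obj/2",
    "http://example.org/obj/3",
]

# A second document reuses two of the same objects in a different order.
DOCUMENT_B = [
    "http://example.org/obj/3",
    "http://example.org/obj/1",
]


def render(document, store):
    """Resolve each URI against the store and join the pieces at runtime."""
    return " ".join(store[uri] for uri in document)


print(render(DOCUMENT_A, MICROCONTENT))
print(render(DOCUMENT_B, MICROCONTENT))
```

Because both documents point at the same objects, a link or annotation attached to one URI is visible from every document that includes it, which is one way to read the phrase “liquid books”.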
Granular Semantic Publishing: Paux (1)
Very Granular Semantic Publishing: Paux (2)
Semantic Publishing: Paux (3)
Linked Semantic Publishing: Paux (4)
Linked Semantic Publishing: Paux (5)
Social Semantic Publishing: Paux (6)
Paux live (1): Outline & Sentences
Paux live (2): Sentence & Linking Options
Paux live (3): Word & Hyperlinks
Paux live (4): Word & Link to Sentence
Data = Publication
The distinction between data and publication will become increasingly obsolete in semantic publishing environments …
… at least in the STM sector.
The move into semantic publication will be much slower in the SSH because of
  fuzzy and unstable terminology
  fuzzy linking semantics that are hard to formalise consistently
  the close relation between complex document formats and scholarly discourse
As a consequence, current examples are mostly from the medical and biomedical area.
=> More from Jan Velterop!
Dessert … using knowledge
“What do you do with a million books?” (G. Crane)
DL view: digitisation and (more and more) semantic publishing increase by one or more orders of magnitude
  Scale
  Linguistic heterogeneity of content
  Granularity of objects
  Noise (encoding and semantic)
  Audience
They may lead to a dramatic decrease in the number of collections and distributors
They render obsolete the very notion of a 'collection' ...
… as well as the notion of a 'catalogue'
→ Do we need more than one Digital Library in such a setting?
Re “What do you do with a million books?” (G. Crane)
Scholarly view: digitisation and (increasingly) semantic publishing result in
  growing quantity
  increased complexity
Both are well beyond scholarly processing capacity (= reading faculty)
Multiplication of collections or distributors is a nuisance → as few as possible, ideally just one (?)
Scientists and scholars will badly need help in these areas:
  Semantic abstracting, named entity recognition for “strategic reading” (Renear)
  Contextualisation of information objects
  Robust reasoning and inferencing yielding digital heuristics
Philospace: ontology based annotation as Linked Open Data
SwickyNotes: ontology selection
All Cretans are liars … annotated!
Perseus
All Cretans are liars … in Perseus!
→ Liddell-Scott … and further!
→ Europeana
→ WordNet, OpenCalais, GeoNames …: information in context!!!
The guiding paradigm is not so much the XML tree as the RDF graph, that is, a network structure
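The contrast between tree and graph can be sketched in a few lines of Python. The triples below are invented for illustration: the prefixes and property names are assumptions, not real identifiers from Perseus, Europeana or any lexicon. The sketch only shows that a node in a graph may be the target of several independent links, whereas an element in a strict tree has exactly one parent.

```python
from collections import defaultdict

# Hypothetical triples; prefixes and property names are illustrative only.
TRIPLES = [
    ("annotation:1", "annotates", "perseus:sentence-12"),
    ("perseus:sentence-12", "contains", "perseus:word-7"),
    ("lexicon:entry-a", "defines", "perseus:word-7"),
    ("europeana:object-x", "mentions", "perseus:word-7"),
]


def incoming(triples):
    """Map each node to the set of (source, property) pairs pointing at it."""
    index = defaultdict(set)
    for subject, prop, obj in triples:
        index[obj].add((subject, prop))
    return index


links = incoming(TRIPLES)

# Nodes with more than one incoming link have no single place in a strict tree,
# where every element has exactly one parent.
shared = sorted(node for node, sources in links.items() if len(sources) > 1)
print(shared)
```

In this toy data the word node is reached from a sentence, a lexicon entry and a collection object at once, which is the "information in context" that a single document hierarchy cannot express without duplication.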
Conclusion
Muruca / Philospace is a good example of
  Semantic publishing of knowledge
  In the sense of sharing information in context
  As a basis for scholarly semantic heuristics
A precondition of such an approach is the atomisation of the hitherto monolithic document object in the Linked Open Data paradigm, not so much in the sense of erosion as in the sense of de-construction.
We're only at the beginning of this process!
Addresses:
Suggested Reading
Gregory Crane (2006): What Do You Do with a Million Books? In: D-Lib Magazine, Vol. 12, March. (http://www.dlib.org/dlib/march06/crane/03crane.html)
David Shotton (2009a): Semantic Publishing. The coming revolution in scientific journal publishing. In: Learned Publishing, Vol. 22, No. 2, p. 85–94, April 2009. doi:10.1087/2009202
David Shotton et al. (2009b): Adventures in Semantic Publishing: Exemplar Semantic Enhancements of a Research Article. (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000361)
Barend Mons, Jan Velterop: Nano-Publication in the e-science era. (http://www.surffoundation.nl/SiteCollectionDocuments/Nano-Publication%20-%20Mons%20-%20Velterop.pdf)
Alan Renear, Carol Palmer (2009): Strategic Reading, Ontologies and the Future of Scientific Publishing. In: Science, August 2009, p. 828–832.
Thank you for your patience and attention