KnowIT, semantic informatics knowledge base
Transcript of KnowIT, semantic informatics knowledge base
knowIT: Mapping out Informatics systems
Laurent Alquier
Keith McCormick
Ed Jaeger
About
• Laurent Alquier
• Software engineer, Project lead
• Johnson & Johnson Pharmaceutical Research & Development, L.L.C
• [email protected]
Could you answer these questions?
• Can you give us a list of all of your applications, related servers and stakeholders, and send us an update every six months?
• All Linux servers need to be patched this weekend. Can you send an outage announcement with a list of affected applications by tomorrow?
• Is this server still in use? Can we retire it?
• What is the meaning of DRU?
(Based on real questions)
Systems knowledge
knowIT in a nutshell
• A collaborative database
  – Semantic wiki
• Capture knowledge about informatics systems
  – Information Systems components
    • Applications, Servers, Data sources, plugins
  – Map relationships between components
  – Capture Business context around them
    • Organizations, Companies, Locations
  – Document known issues, procedures, processes
Goals
• Answer recurring questions
  – Subject Matter Experts lists for Application Support
  – Application / License rationalization
  – Outage communications
• Increase knowledge retention
  – Many ways to contribute
• Facilitate “Transfer In / Transfer Out”
  – Capture knowledge from experts before they leave
  – Facilitate learning for new resources
• Enable self service
  – Many ways to search and explore
Pragmatic approach
• Bottom up knowledge management in a corporate, R&D environment
• Search is not enough
  – Complementary to a document library with search index
  – Capture details about individual components of systems
  – Rely on queries as much as search
• Change will happen
  – Plan for future integration and migration from the start
  – Import content from several sources
  – Export content to several formats
• “Know your content, respect your users.” – E. Tufte
  – Accept incomplete content
  – Evolve the data model as necessary
  – Let real data, use cases drive requirements
• Above all, remain flexible
Evolution
• Started as disconnected files
• Turned into a relational database
  – Rigid design
  – Lack of collaboration tools
• Solution: Collaborative Database using a Semantic Wiki
  – Collaborative features and flexibility of wiki
  – Structure from Semantic annotations
Collaborative database
• Flexible yet structured content management
  – Collaborative data model
  – Discussions, comments, community editing
• Knowledge management tools
  – Redirections, wanted pages
  – Automated maintenance tasks
    • Background jobs to enforce consistency and updates
  – Monitoring tools, change tracking
• Modular and extensible design
  – Templates
  – Open source components
Semantic MediaWiki
• Based on MediaWiki
  – Proven platform (Wikipedia)
  – Redirect, wanted pages, templates, API, bots
  – Active development, commercial support
  – No licensing fee (PHP, MySQL)
• Structure from Semantic annotations
  – Inline annotations
  – Supports forms and direct annotations
  – Map complex relationships between objects
  – Allow both Search and Queries
  – Multiple input / output formats
  – Compatible with Semantic Web integration
• Semantic Web in a bottle
Semantic Annotations
• Tags with meaning
• Syntax
  – Triple: Page -> Property -> Value
    • [[Has support contact::Help Desk]]
• Data types
  – Page, URL, Date, String, Text, Number, Geo-location
  – Custom units for Number
• Browse properties
  – Summary of all properties for a page
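To make the Page -> Property -> Value idea concrete, here is a minimal Python sketch that pulls inline annotations out of a page's wikitext and returns them as triples. It is not how Semantic MediaWiki parses pages internally; the regular expression only covers the simple `[[Property::Value]]` and `[[Property::Value|label]]` forms, and the page name "Example App", the property "Runs on" and the value "Server01" are made up for illustration.

```python
import re

# Matches [[Property::Value]] and [[Property::Value|label]].
# Plain links such as [[Main Page]] have no "::" and are ignored.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_triples(page_title, wikitext):
    """Return one (page, property, value) triple per inline annotation."""
    return [
        (page_title, prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]


# Hypothetical page content, for illustration only.
text = (
    "Supported by [[Has support contact::Help Desk]], "
    "hosted on [[Runs on::Server01|our server]]. See [[Main Page]]."
)
triples = extract_triples("Example App", text)
for triple in triples:
    print(" -> ".join(triple))
```

Each annotation becomes one statement about the page it appears on, which is what later makes queries across pages possible.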
Relationships
• Defined as links to other pages
  – Enhanced with semantic properties
• Tracking lists of things is not enough
  – Knowledge comes from understanding relationships
• SMW assisted Ontology design
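As a rough illustration of why relationships matter more than lists, the Python sketch below answers the "which applications are affected if all Linux servers are patched?" question by following two relationships over a small set of triples. All page names and the property names "Has operating system" and "Runs on" are assumptions made for the example, not KnowIT's actual data model.

```python
from collections import defaultdict

# Hypothetical triples (page, property, value); names are illustrative only.
TRIPLES = [
    ("Server01", "Has operating system", "Linux"),
    ("Server02", "Has operating system", "Windows"),
    ("Server03", "Has operating system", "Linux"),
    ("App A", "Runs on", "Server01"),
    ("App B", "Runs on", "Server02"),
    ("App C", "Runs on", "Server03"),
    ("App C", "Runs on", "Server02"),
    ("App A", "Has support contact", "Help Desk"),
]


def subjects_with(triples, prop, value):
    """Pages that carry the given property with the given value."""
    return {s for s, p, v in triples if p == prop and v == value}


def affected_applications(triples, operating_system):
    """Map each application to the servers of that OS it runs on."""
    servers = subjects_with(triples, "Has operating system", operating_system)
    affected = defaultdict(set)
    for subject, prop, value in triples:
        if prop == "Runs on" and value in servers:
            affected[subject].add(value)
    return {app: sorted(hosts) for app, hosts in affected.items()}


report = affected_applications(TRIPLES, "Linux")
for app, hosts in sorted(report.items()):
    print(f"{app}: {', '.join(hosts)}")
```

A flat list of servers or a flat list of applications could not answer this; the "Runs on" link between them does.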
Wiki? What wiki?
• Focus on content, not technology
• Occasional users are less intimidated when wiki tools are not visible
  – But keep wiki tools available to advanced users
• Use forms to standardize data capture
  – Make semantic annotations invisible using forms and templates
  – Enforce (some) naming conventions
    • Auto-completion
    • Automated page names
• Be ready to provide help with difficult tasks
  – Provide guidance and training
  – Front-load the wiki with data users care about
Content Migration
• From relational tables to Categories and Pages
  – Review data model, drop unnecessary attributes
  – Create forms, templates, properties in Semantic MediaWiki
  – One category per page
    • Separate ‘semantic categories’ from ‘supporting categories’
• Extract old content into tabular form
• Review, clean up, correct
  – Unique titles (Disambiguation)
  – Special characters in titles
• Load pages in bulk
  – using PHP API (bulkinsert.php)
• Consider specialized import forms if content needs detailed review
  – Example: Support articles
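The "extract, clean up, load" steps above can be sketched in Python as follows. This is only an assumption about what such a preparation step might look like; it does not reproduce bulkinsert.php. It turns tabular rows into page titles and template calls, replaces characters that MediaWiki does not allow in titles, and appends a counter when two rows would collide on the same title (a placeholder for real disambiguation, which needs human review). The template name "Server" and the column names are hypothetical.

```python
import csv
import io
import re

# Characters MediaWiki does not allow in page titles.
FORBIDDEN = re.compile(r"[#<>\[\]|{}]")


def clean_title(raw):
    """Replace forbidden characters, collapse spaces, capitalize first letter."""
    title = FORBIDDEN.sub(" ", raw)
    title = re.sub(r"\s+", " ", title).strip()
    return title[:1].upper() + title[1:]


def rows_to_pages(rows, template, title_field):
    """Build {page title: wikitext}; each page is a single template call."""
    pages = {}
    for row in rows:
        base = clean_title(row[title_field])
        title, n = base, 2
        while title in pages:  # naive disambiguation: "Title (2)", "Title (3)", ...
            title = f"{base} ({n})"
            n += 1
        params = "\n".join(
            f"|{key}={value.strip()}"
            for key, value in row.items()
            if key != title_field and value.strip()  # skip empty attributes
        )
        pages[title] = "{{" + template + "\n" + params + "\n}}"
    return pages


# Hypothetical extract of the old relational table.
SAMPLE = """name,operating system,location
server[01],Linux,Site A
server 01,Windows,
"""
pages = rows_to_pages(csv.DictReader(io.StringIO(SAMPLE)), "Server", "name")
for title, body in pages.items():
    print(f"== {title} ==\n{body}\n")
```

Keeping the page body to a single template call means the template (and its form) can assign the category and the semantic properties in one place.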
Queries
• Visualize structure of content
  – Ad-hoc reports
  – Interactive queries (Exhibit)
  – Automate system configuration pages
  – Architectural layers
    • Business, Functional, Process, Data, Applications, Physical
  – Network diagrams
• Concepts
  – Saved queries, dynamic categories
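Ad-hoc reports of this kind are written as inline #ask queries. The small Python helper below composes such a query string from a list of conditions and printouts. It is a hypothetical convenience function rather than part of Semantic MediaWiki, and the category and property names in the example are assumptions.

```python
def build_ask(conditions, printouts=(), fmt="table", **params):
    """Compose an inline #ask query.

    conditions: e.g. "Category:Server" or "Property::Value"; all must hold.
    printouts:  properties to show as extra columns.
    params:     further query parameters such as limit or sort.
    """
    parts = ["".join(f"[[{c}]]" for c in conditions)]
    parts += [f"?{p}" for p in printouts]
    parts.append(f"format={fmt}")
    parts += [f"{key}={value}" for key, value in sorted(params.items())]
    return "{{#ask: " + " | ".join(parts) + " }}"


# Hypothetical report: Linux servers with their applications and contacts.
query = build_ask(
    ["Category:Server", "Has operating system::Linux"],
    printouts=["Has application", "Has support contact"],
    limit=50,
)
print(query)
```

Pasting the resulting string into a wiki page renders the report in place, and it stays current as annotations change.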
Enhanced Search
• Default search replaced by Sphinx Search extension
• Faceted search
  – Drill down by properties
  – Search results grouped by Category
• Semantic search
  – Semantic summary instead of excerpt
• Customized by Category
  – Annotations used to improve results
    • Aliases, keywords
    • Related terms
    • Selection of default category
• Feedback option
  – Ask a question
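The faceted search ideas above (group results by category, drill down by property) can be illustrated with a short Python sketch. It works on an in-memory list of hits and says nothing about how the Sphinx Search extension is implemented; the hit structure, page names and property names are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical search hits: title, category, and annotated properties.
HITS = [
    {"title": "Server01", "category": "Server",
     "properties": {"Has operating system": "Linux"}},
    {"title": "Server02", "category": "Server",
     "properties": {"Has operating system": "Windows"}},
    {"title": "App A", "category": "Application",
     "properties": {"Has support contact": "Help Desk"}},
    {"title": "Server03", "category": "Server",
     "properties": {"Has operating system": "Linux"}},
]


def group_by_category(hits):
    """Group hit titles under their category, keeping result order."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["category"]].append(hit["title"])
    return dict(grouped)


def facet_counts(hits, prop):
    """Count how many hits carry each value of a property (one facet)."""
    return Counter(
        hit["properties"][prop] for hit in hits if prop in hit["properties"]
    )


def drill_down(hits, prop, value):
    """Keep only the hits whose property has the selected value."""
    return [hit for hit in hits if hit["properties"].get(prop) == value]


grouped = group_by_category(HITS)
facets = facet_counts(HITS, "Has operating system")
linux_only = drill_down(HITS, "Has operating system", "Linux")
print(grouped)
print(dict(facets))
print([hit["title"] for hit in linux_only])
```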
Input flexibility - Data capture
• Import
  – Manually using Forms
  – Remote CSV files, databases, LDAP
  – FOAF format to retrieve and provide vocabularies
  – OWL DL ontologies can be imported
    • Explicit statements only – no support for reasoning
• Query remote sources
  – Linked data import
    • SMW+ can enrich page annotations with queries across multiple sources
    • Supports OpenCalais, DBpedia, RSS feeds
Output flexibility - Data integration
• Export
  – HTML, PDF, CSV, XML, Email, Maps (Yahoo, Google, OpenLayers), Timeline (Simile), Google graphs, vCard, iCalendar
• Machine readable
  – Default RSS feed replaced by #ask query for recent content
  – RDF view for each page
  – RDFa, CSV index, FOAF files, Web Services (SMW+)
  – Ontology and content export
    • RDF dumps / SPARQL endpoint available
• Follows Linked Data principles
  – One page per entity
  – One HTTP URI for each entity
  – RDF information available from each page
  – RDF statements are browsable
Familiar look and feel
• Consistent with other intranet sites, familiar interface
  – Integration with MS SharePoint look and feel using RILPoint theme
  – Login using global directory
Make basic tasks explicit
• Search, Explore, Contribute
  – On main page and on side bar
Consistent navigation on every page
• ‘Table of Content’ links
  – Browse content
    • Using Semantic Drilldown
  – Categories
    • Using Nice Categories List for recursive tree view
  – Topic
    • #ask query for pages with Topic defined as a property
  – A-Z index / Glossary
    • Using a mix of Table of Content template, #urlget and #ask queries
• Single link to add New content
  – With list of forms available
Reduce clutter
Advanced tasks moved to the bottom of pages
• Maintenance tasks
• Upload file
• Page tools
• RDF link
• Browse properties
UI Simplification – Special Pages
Custom made administrative tasks page
UI Simplification – Recent changes
Simplified Recent changes using Dynamic Page List extension
UI Customization – Category:Location
Customization of categories according to page type
• Maps for locations
• Timelines for events
• A-Z index for people
UI Customization – Category:Events
Status - Usage
• After a year
  – 2900 pages of content (4600 pages total)
  – 31 registered users (5 active contributors)
  – Between 15 and 75 updates a day
  – 130 unique visitors / month
  – 400 visits / 600 searches a month
• Entering phase of growing interest
Status - Content
• Data imported from old system except for Articles and Persons
• Built an ontology of IT systems components
• 550+ Applications, 90+ Databases and 280+ Servers portfolio
  – mostly RED systems at this point
• 145 data sources
  – Semi-automated generation of Data landscape
• A Glossary of 950+ acronyms and definitions
  – imported from multiple sources within J&J and outside
• About 170 support articles, how-to and FAQs
  – Another 400 old articles pending review
• 340+ Organizations
  – Including 44 J&J Operating Companies
• Google Maps of J&J PRD sites
Features
• KnowIT currently includes:
  – An IT systems portfolio management (inventory)
  – A Configuration management tool for these systems (components and relationships)
  – A Communication component (calendar / timeline of announcements, outages and training sessions)
  – A Question / feedback list (similar to WikiAnswers)
  – A Logging mechanism (to track events, outages)
  – A Service Account Password expiration management (with notification by RSS and email)
  – Semantic / faceted search results
  – Dynamic maps of known locations (with built-in form for driving directions)
  – A Self service help system (knowledge base of solutions)
  – And an Advanced glossary (terms organized by domains, with synonyms, related terms, etc.)
• Future directions
  – Advanced bulk manipulations
  – Dynamic visualizations of relationships network
  – Automated annotations using internal and external sources
  – Improved Semantic search
Observations from day to day use
• SMW is structured yet flexible
  – Allows for exceptions, changes as well as standardization
• SMW doesn’t get in the way
  – New content can be added, edited very quickly
• Remember to monitor response time of page edits, search
  – Use PHP cache, optimization strategies to keep wiki as fast as possible
• Keep a single structure of ‘semantic categories’
  – Separate from other categories
  – Use semantic properties for complex categorizations of pages
• Keep realistic expectations
  – A long way to go before shared ownership and fully documented systems
Acknowledgements
• We would like to thank current and past contributors for their patience, ideas and support:
  – Jim Gainor
  – Brian Wegner
  – Deborah Yates
  – David Epstein
  – John Baum
  – Lisa Valetta
  – Dimitris Agrafiotis
  – Mario Dolbec
  – Brian Johnson
  – Emmanouil Skoufos
Resources
• Semantic MediaWiki
  – http://semantic-mediawiki.org
• Referata tips for SMW
  – http://smw.referata.com/wiki/Special:BrowseData/Tips
• Wiki Patterns
  – http://www.wikipatterns.com/display/wikipatterns/Wikipatterns
• Sphinx search extension
  – http://www.mediawiki.org/wiki/Extension:SphinxSearch
• RILPoint – SharePoint theme for MediaWiki
  – http://www.rilnet.com/en/rilpoint-sharepoint-look-alike-drupal-and-mediawiki-skin
• Gruff – Triple store browser for AllegroGraph (Relationships graph)
  – http://www.franz.com/agraph/gruff/
• Cytoscape – Network graph
  – http://www.cytoscape.org/