Becta Vms

Presentation on Becta Vocabulary management systems by Mike Taylor at the CETIS MDR SIG meeting on 2008-02-12

Using standards to make vocabularies available.

The Becta VMS (Vocabulary Management Service)

Mike Taylor <mike@miketaylor.org.uk>

Giraffatitan brancai reconstruction from Paul (1988)

Contents

Vocabularies

Becta

The Becta Vocabulary Management Service

The Zthes XML format

The Zthes web service

So what?

What now?

Vocabularies

Vocabularies are sets of terms used to tag documents.

Their use increases both recall and precision of searching.

At the simplest level, all Flickr tags form a vocabulary.

Richer vocabularies have semantics and structure.

Thesauri, taxonomies, ontologies, authority lists and control lists are all more or less the same thing as vocabularies. (Purists will hate me for saying that.)

Semantics and structure

Terms may carry scope notes.

Terms may be listed with synonyms.

Links may exist between terms:
  BT (broader term) e.g. car BT vehicle
  NT (narrower term) e.g. animal NT dog
  UF (use for, preferred term) e.g. dog UF hound
  USE (non-preferred term) e.g. hound USE dog
  RT (related term) e.g. vehicle RT travel

Mappings to other languages are possible.

(Some semantics and structure can be induced by usage patterns in unstructured vocabularies.)

Sample terms from a vocabulary

dog:
  UF hound, canine
  BT animal
  NT dachshund, dalmatian, poodle
  Scope note: includes domestic dogs only; wolves and African hunting dogs are listed separately.

animal:
  UF creature, beast, brute
  BT organism
  NT dog, cat, Brachiosaurus altithorax, slug
  RT life
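
The sample entries above can be held in a very small data structure. What follows is a minimal sketch, not anything taken from the Becta VMS: the `Term` and `Vocabulary` classes and the `INVERSE` table are hypothetical names invented for this illustration. It assumes that every link has a mirror image (BT/NT, UF/USE, and RT with itself), so recording one side also records the other.

```python
from dataclasses import dataclass, field

# Assumed inverse of each link type: BT/NT mirror each other,
# as do UF/USE; RT is treated as symmetric.
INVERSE = {"BT": "NT", "NT": "BT", "UF": "USE", "USE": "UF", "RT": "RT"}


@dataclass
class Term:
    name: str
    scope_note: str = ""
    links: dict = field(default_factory=dict)  # link type -> set of term names


class Vocabulary:
    def __init__(self):
        self.terms = {}

    def term(self, name):
        """Return the named term, creating it on first use."""
        return self.terms.setdefault(name, Term(name))

    def link(self, source, link_type, target):
        """Record a link and its inverse so both ends stay consistent."""
        self.term(source).links.setdefault(link_type, set()).add(target)
        self.term(target).links.setdefault(INVERSE[link_type], set()).add(source)


vocab = Vocabulary()
vocab.link("dog", "BT", "animal")
for narrower in ("dachshund", "dalmatian", "poodle"):
    vocab.link("dog", "NT", narrower)
for synonym in ("hound", "canine"):
    vocab.link("dog", "UF", synonym)
vocab.term("dog").scope_note = "includes domestic dogs only"

print(sorted(vocab.term("animal").links["NT"]))  # ['dog']
print(vocab.term("hound").links["USE"])          # {'dog'}
```

Storing the inverse at write time keeps lookups trivial; a real vocabulary tool might instead validate reciprocity when a file is loaded.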

Searching with a vocabulary

Two main ways to use a vocabulary:

1. Visible to the user. Can be browsed to find suitable search terms.

2. Behind the scenes: non-preferred terms mapped to preferred terms or synonyms expanded.

Expansion of query terms can include expansion to broader and narrower terms, or translated terms.

Relevance ranking can take term-closeness into account.
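
The behind-the-scenes use can be sketched in a few lines. This is an illustration only: the `USE` and `NARROWER` tables are a hypothetical miniature vocabulary built from the sample terms in this talk, and `expand` is an invented helper, not part of any Becta or Zthes software. It maps a non-preferred term to its preferred term and then adds narrower terms down to a chosen depth.

```python
# Hypothetical miniature vocabulary, based on the sample terms in this talk.
USE = {"hound": "dog", "canine": "dog", "creature": "animal"}  # non-preferred -> preferred
NARROWER = {
    "animal": ["dog", "cat"],
    "dog": ["dachshund", "dalmatian", "poodle"],
}


def preferred(term):
    """Map a non-preferred term to its preferred term; others pass through."""
    return USE.get(term, term)


def expand(term, depth=1):
    """Return the preferred form of `term` plus narrower terms, `depth` levels down."""
    root = preferred(term)
    result = [root]
    frontier = [root]
    for _ in range(depth):
        next_frontier = []
        for current in frontier:
            for narrower in NARROWER.get(current, []):
                if narrower not in result:
                    result.append(narrower)
                    next_frontier.append(narrower)
        frontier = next_frontier
    return result


print(expand("hound"))
# ['dog', 'dachshund', 'dalmatian', 'poodle']
```

The position of each term in the returned list reflects how far it is from the original query term, which is one simple way a ranking step could take term-closeness into account.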

Becta

British Educational Communications and Technology Agency.

An agency of the Department for Education and Skills.

Oversees procurement of IT equipment for schools.

In charge of e-learning strategy.

Becta VMS

Creating vocabularies is a pain.

Tools are expensive.

Becta needed to facilitate vocabulary creation for Curriculum Online.

Created the Vocabulary Management System (VMS)
-- Studio (not available without training)
-- Bank: http://bank.vocman.com/
-- Spine

Vocabulary bank

Downloaded XML

<Zthes xmlns:dc='http://purl.org/dc/elements/1.1'>
  <thes>
    <dc:title>Early Years Foundation Stage</dc:title>
    <dc:description>Curriculum guidance for the Foundation Stage in England</dc:description>
    <dc:date>22/10/2007</dc:date>
    <dc:identifier>eyfs</dc:identifier>
    <dc:language>En-GB</dc:language>
    <thesNote label='version'>1.0</thesNote>
    <thesNote label='globallyUniqueId'>1001-eyfs</thesNote>
    <thesNote label='authority' vocab='0001-Authority'>QCA</thesNote>
  </thes>
  <term>
    <termId>000639</termId>
    <termName>Early Support</termName>
    <termType>PT</termType>
    <termLanguage>En-GB</termLanguage>
    <termSortKey>3</termSortKey>
    <termNote label='globallyUniqueId'>1001-000639</termNote>
    <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
    <termNote label='source'>1001-eyfs</termNote>
    <termNote label='curriculumType' vocab='0001-CurriculumType'>category2</termNote>
    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
      <termType>PT</termType>
      <termLanguage>En-GB</termLanguage>
    </relation>
    [...]
  </term>
  [...]

Downloaded XML: the vocabulary

  <thes>
    <dc:title>Early Years Foundation Stage</dc:title>
    <dc:description>Curriculum guidance for the Foundation Stage in England</dc:description>
    <dc:date>22/10/2007</dc:date>
    <dc:identifier>eyfs</dc:identifier>
    <dc:language>En-GB</dc:language>
    <thesNote label='version'>1.0</thesNote>
    <thesNote label='globallyUniqueId'>1001-eyfs</thesNote>
    <thesNote label='authority' vocab='0001-Authority'>QCA</thesNote>
  </thes>

Downloaded XML: a term

  <term>
    <termId>000639</termId>
    <termName>Early Support</termName>
    <termType>PT</termType>
    <termLanguage>En-GB</termLanguage>
    <termSortKey>3</termSortKey>
    <termNote label='globallyUniqueId'>1001-000639</termNote>
    <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
    <termNote label='source'>1001-eyfs</termNote>
    <termNote label='curriculumType' vocab='0001-CurriculumType'>category2</termNote>
    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
      <termType>PT</termType>
      <termLanguage>En-GB</termLanguage>
    </relation>
    [...]
  </term>

Downloaded XML: a relation

    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
      <termType>PT</termType>
      <termLanguage>En-GB</termLanguage>
    </relation>
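
Reading a downloaded file of this shape needs nothing beyond a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a trimmed, well-formed copy of the sample above (the `[...]` elisions are left out and the root element is closed, which is an assumption about the full file). The function name `parse_zthes` and the dictionary layout it returns are invented for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample above, closed so that it is well-formed.
ZTHES = """\
<Zthes xmlns:dc='http://purl.org/dc/elements/1.1'>
  <thes>
    <dc:title>Early Years Foundation Stage</dc:title>
    <dc:identifier>eyfs</dc:identifier>
  </thes>
  <term>
    <termId>000639</termId>
    <termName>Early Support</termName>
    <termType>PT</termType>
    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
    </relation>
  </term>
</Zthes>
"""

# Namespace URI exactly as it appears in the sample file.
NS = {"dc": "http://purl.org/dc/elements/1.1"}


def parse_zthes(xml_text):
    """Return the vocabulary title and a list of terms with their relations."""
    root = ET.fromstring(xml_text)
    title = root.findtext("thes/dc:title", namespaces=NS)
    terms = []
    for term in root.findall("term"):
        relations = [
            (rel.findtext("relationType"), rel.findtext("termId"), rel.findtext("termName"))
            for rel in term.findall("relation")
        ]
        terms.append({
            "id": term.findtext("termId"),        # direct child only, not the relation's
            "name": term.findtext("termName"),
            "type": term.findtext("termType"),
            "relations": relations,
        })
    return title, terms


title, terms = parse_zthes(ZTHES)
print(title)
print(terms[0]["name"], terms[0]["relations"])
```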

The Zthes format

An open, freely available specification:

http://zthes.z3950.org/

Very simple – no attempt to generalise.

In use by various organisations in different domains:
  Becta (education)
  Synapse/Factiva (business intelligence)
  ELVIS/Decomate II/Elise II (European projects)
  Natural History Museum (biological taxonomy)
  OCLC (libraries)

Was considered (along with SKOS and MARC authorities) by the BS 8723-5:2007 part 5 committee.

Defeated by NIH syndrome.

The Z in Zthes ... some history

Zthes started life as a Z39.50 profile in 1999. (ANSI/NISO Z39.50 is a venerable search/retrieve standard.)

Zthes was quickly expanded by the addition of an XML format.

An SRU profile for Zthes followed in 2003. (SRU is Search/Retrieve via URL.)

XML format and SRU profile are currently at v1.0 (2006).

Some small additions on the way to support OCLC's use.

Zthes SRU in the Becta VMS

Requests are REST-like URLs:

http://bank.vocman.com/bank-webapp/sru/CurrentTerms
  ?operation=SearchRetrieve
  &maximumRecords=10
  &recordSchema=zthes
  &query=zthes.relType="BT" and zthes.termGuid="1000-KSWO-0005"

Search for records related by “BT” (broader term) to the term with identifier “1000-KSWO-0005”, and return the first ten.

query contains a CQL query: simple but powerful.

(This URL omits SRU's version parameter – naughty!)
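
A request like this is easy to assemble in code, which also takes care of URL-encoding the quotes and spaces in the CQL query. The following is a sketch under assumptions: `sru_url` is a hypothetical helper, the operation name is written as in the SRU specification (`searchRetrieve`, where the slide capitalises it), and `version` defaults to 1.1 only because that is the version reported in the sample response. Whether the Bank service accepts exactly these values is not confirmed here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL as shown on the slide.
BASE = "http://bank.vocman.com/bank-webapp/sru/CurrentTerms"


def sru_url(base, cql, maximum_records=10, version="1.1"):
    """Build an SRU searchRetrieve URL for a CQL query (hypothetical helper)."""
    params = {
        "version": version,              # included, unlike the URL on the slide
        "operation": "searchRetrieve",   # spelling from the SRU specification
        "maximumRecords": maximum_records,
        "recordSchema": "zthes",
        "query": cql,
    }
    return base + "?" + urlencode(params)


cql = 'zthes.relType="BT" and zthes.termGuid="1000-KSWO-0005"'
url = sru_url(BASE, cql)
print(url)

# Round-trip check: decoding the query string gives back the original CQL.
print(parse_qs(urlsplit(url).query)["query"][0] == cql)  # True
```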

Zthes SRU response

<srw:searchRetrieveResponse xmlns:srw='http://www.loc.gov/zing/srw/'>
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <term xmlns:k-int='http://www.k-int.com/' xmlns:dc='http://purl.org/dc/elements/1.1/'>
        <termId>KSWO-0005</termId>
        <termName>Working in groups</termName>
        <termType>PT</termType>
        <termNote label='source'>1000-QCA Metadata Standard: XTags</termNote>
        <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
        <termNote label='globallyUniqueId'>1000-KSWO-0005</termNote>
        <k-int:termRevisionNumber>0</k-int:termRevisionNumber>
        <k-int:termInstanceId>8931</k-int:termInstanceId>
        <relation>
          <relationType>BT</relationType>
          <termId>KSWO</termId>
          <termName>Key Skills: working with others</termName>
        </relation>
      </term>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>

Zthes SRU response: vehicle

<srw:searchRetrieveResponse xmlns:srw='http://www.loc.gov/zing/srw/'>
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      [...]
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>

Zthes SRU response: payload

      <term xmlns:k-int='http://www.k-int.com/' xmlns:dc='http://purl.org/dc/elements/1.1/'>
        <termId>KSWO-0005</termId>
        <termName>Working in groups</termName>
        <termType>PT</termType>
        <termNote label='source'>1000-QCA Metadata Standard: XTags</termNote>
        <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
        <termNote label='globallyUniqueId'>1000-KSWO-0005</termNote>
        <k-int:termRevisionNumber>0</k-int:termRevisionNumber>
        <k-int:termInstanceId>8931</k-int:termInstanceId>
        <relation>
          <relationType>BT</relationType>
          <termId>KSWO</termId>
          <termName>Key Skills: working with others</termName>
        </relation>
      </term>
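
Separating the vehicle from the payload takes only a namespace-aware parse. The sketch below works on a trimmed copy of the response shown above (the term notes and k-int elements are left out for brevity). The `extract_terms` function is a hypothetical helper, and the element paths follow the sample exactly as shown, so a different server may need different paths.

```python
import xml.etree.ElementTree as ET

# Namespace of the SRU "vehicle"; the Zthes payload elements are un-namespaced.
NS = {"srw": "http://www.loc.gov/zing/srw/"}

# Trimmed copy of the sample response above.
RESPONSE = """\
<srw:searchRetrieveResponse xmlns:srw='http://www.loc.gov/zing/srw/'>
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <term>
        <termId>KSWO-0005</termId>
        <termName>Working in groups</termName>
        <termType>PT</termType>
        <relation>
          <relationType>BT</relationType>
          <termId>KSWO</termId>
          <termName>Key Skills: working with others</termName>
        </relation>
      </term>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
"""


def extract_terms(xml_text):
    """Return (reported record count, list of term dicts) from an SRU response."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("srw:numberOfRecords", namespaces=NS))
    terms = []
    for term in root.findall("srw:records/srw:record/term", NS):
        terms.append({
            "id": term.findtext("termId"),
            "name": term.findtext("termName"),
            # Map each relation type to the related term's id.
            "relations": {
                rel.findtext("relationType"): rel.findtext("termId")
                for rel in term.findall("relation")
            },
        })
    return count, terms


count, terms = extract_terms(RESPONSE)
print(count, terms[0]["name"], terms[0]["relations"])
```

A term with several relations of the same type would need a list per type rather than a single id; the dictionary is kept flat here only because the sample has one relation.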

So what?

The advantage that all web services bring: loose coupling.

As useful as the Becta VMS Bank is, it is not the only useful application of the vocabularies.

Using the Zthes/SRU web service, anyone can make applications that search and navigate vocabularies.

(And they should work with other Zthes/SRU vocabularies.)

I will not insult your intelligence by using the word “mashup”.
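
As one illustration of loose coupling, navigation code need not know where its answers come from. In the sketch below, `broader_chain` is a hypothetical function that climbs BT links using any lookup it is given. Here the lookup is a plain dictionary standing in for the web service; in a real application it would presumably wrap a Zthes/SRU request, and could point at any server that speaks the same protocol.

```python
def broader_chain(term_id, broader_of):
    """Follow BT links upward from term_id.

    broader_of(term_id) must return the id of the broader term, or None
    when there is none. Stops at the top of the hierarchy or on a cycle.
    """
    chain = [term_id]
    seen = {term_id}
    while True:
        parent = broader_of(chain[-1])
        if parent is None or parent in seen:
            return chain
        chain.append(parent)
        seen.add(parent)


# Stand-in for a Zthes/SRU lookup, using the ids from the sample response.
BROADER = {"KSWO-0005": "KSWO", "KSWO": None}

print(broader_chain("KSWO-0005", BROADER.get))
# ['KSWO-0005', 'KSWO']
```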

What now?

Becta has to demonstrate that its facilities are useful ...

... which means it has to make them useful.

– Do these facilities help you?
– If so, how might you use them?
– If not, could they be made useful?
– How?

Feedback, please!
– Talk to me.
– Email me on <mike@miketaylor.org.uk>
– http://www.surveymonkey.com/s.aspx?sm=YJt7RtxHmJQEgQFvXHZSTQ%3d%3d