Structural Semantics for Accessibility and Device Independence

SADIe Structural-Semantics for Accessibility and Device Independence Darren Lunn


Presentation describing the SADIe transcoding platform to the Information Management Group (IMG)

Transcript of Structural Semantics for Accessibility and Device Independence

Page 1: Structural Semantics for Accessibility and Device Independence

SADIe: Structural-Semantics for Accessibility and Device Independence

Darren Lunn

Page 2: Structural Semantics for Accessibility and Device Independence

The Web…

• The World’s largest repository of information

• Designed with a focus on presenting information in a visual manner
  ● Images
  ● Animations
  ● JavaScript

• Some knowledge is only available implicitly from how the page looks

Page 3: Structural Semantics for Accessibility and Device Independence

Implicit Knowledge

= Advertisement

= Banner

= Main Content

= Menu

Page 4: Structural Semantics for Accessibility and Device Independence

Assistive Technologies

• Visually impaired users use assistive technologies, e.g. screen readers
  ● Render pages sequentially in audio
  ● Achieved by accessing the underlying HTML code

• But a focus on visual presentation rather than content hampers this
  ● Particularly if attention is not paid to coherent design
  ● Tags and markup can be abused (e.g. using <h2> to get large, bold text rather than to mark a heading)
  ● Subtleties of visual presentation can be lost

Page 5: Structural Semantics for Accessibility and Device Independence

CNN Example

Page 6: Structural Semantics for Accessibility and Device Independence

Assistive Technologies

• Traversal of content is in a serial “top-to-bottom”, “left-to-right” manner
  ● Based on the underlying HTML code

• Important information may not be encountered until later on

• Also, information such as menus or navigation may be repeated on every page of a site
  ● This can prove tiresome if the user has to wait for the reader to read the menu each time a new page is visited

• Chunked pages and non-linear presentation further complicate matters

Page 7: Structural Semantics for Accessibility and Device Independence

Existing Solution: Transcoding

• A method of adapting and reformatting Web content so that it is suitable for a wide range of client devices

• Heuristic Transcoding - Uses general rules and heuristics to find areas of the web page

• Semantic Transcoding - Uses annotations to add metadata to the Web page in order to explicitly state the meaning of the elements

Page 8: Structural Semantics for Accessibility and Device Independence

Heuristic Transcoding

• Use general rules and heuristics to find areas of the web page

• Once an area is found, then modify it in some way

• E.g.
  If (row is at the top of a table && number of characters between <a> tags > number of characters between <p> tags)
  then (element is a page menu, so do something)
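The rule above can be sketched in Python using only the standard library. This is an illustrative reading of the slide's rule, not the SADIe implementation: the names `FirstRowStats` and `looks_like_menu` are hypothetical, and "top of table" is taken to mean the first <tr> in the document.

```python
from html.parser import HTMLParser


class FirstRowStats(HTMLParser):
    """Counts link text and paragraph text inside the first <tr> seen."""

    def __init__(self):
        super().__init__()
        self.rows_seen = 0            # number of <tr> start tags so far
        self.in_first_row = False     # True only while inside the first row
        self.depth = {"a": 0, "p": 0}  # how many <a>/<p> elements are open
        self.link_chars = 0
        self.para_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows_seen += 1
            self.in_first_row = self.rows_seen == 1
        elif tag in self.depth:
            self.depth[tag] += 1

    def handle_endtag(self, tag):
        if tag == "tr":
            self.in_first_row = False
        elif tag in self.depth and self.depth[tag] > 0:
            self.depth[tag] -= 1

    def handle_data(self, data):
        if not self.in_first_row:
            return
        text = data.strip()
        if self.depth["a"]:
            self.link_chars += len(text)
        elif self.depth["p"]:
            self.para_chars += len(text)


def looks_like_menu(html: str) -> bool:
    """Heuristic: the first table row holds more link text than paragraph text."""
    stats = FirstRowStats()
    stats.feed(html)
    stats.close()
    return stats.link_chars > stats.para_chars
```

Because the rule only looks at the first row, an extra row inserted above the menu would already defeat it, which is the kind of fragility heuristic approaches have to live with.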

Page 9: Structural Semantics for Accessibility and Device Independence

Heuristic Transcoding

<table cellspacing="0" cellpadding="0" border="0" class="cnnCeilnav">
  <tr valign="middle" height="22">
    <td><a href="/">Home</a></td>
    <td><a href="/WORLD/">World</a></td>
    <td><a href="/US/">U.S.</a></td>
    <td><a href="/WEATHER/">Weather</a></td>
    <td><a href="http://money.cnn.com/index.html">Business</a>
    . . .
  </tr>

<ul>
  <li><a href="/">Home</a></li>
  <li><a href="/WORLD/">World</a></li>
  <li><a href="/US/">U.S.</a></li>
  <li><a href="/WEATHER/">Weather</a></li>
  <li><a href="http://money.cnn.com/index.html">Business</a></li>
  . . .
</ul>
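One way to perform the table-to-list rewrite shown on this slide is to collect the links from the layout table and emit them as list items. The sketch below is an assumption about how such a step could look, using Python's standard `html.parser`; `LinkCollector` and `table_menu_to_list` are hypothetical names, not part of SADIe.

```python
from html import escape
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> element in the input."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently open, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def table_menu_to_list(table_html: str) -> str:
    """Rewrites a table-based menu as an unordered list of the same links."""
    collector = LinkCollector()
    collector.feed(table_html)
    collector.close()
    items = [
        f'  <li><a href="{escape(href, quote=True)}">{escape(text)}</a></li>'
        for href, text in collector.links
    ]
    return "\n".join(["<ul>", *items, "</ul>"])
```

The layout attributes (cellspacing, valign and so on) are dropped on purpose, since they carry presentation rather than content.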

Page 10: Structural Semantics for Accessibility and Device Independence

Heuristic Transcoding

• General enough to be applied to a large number of web pages
  ● All CNN pages follow this pattern, as do other pages that have a similar layout template

• Can be inaccurate if the page is slightly different from what the pre-existing rules expect
  ● E.g. CNN inserts an additional row containing advertisements

Page 11: Structural Semantics for Accessibility and Device Independence

Semantic Transcoding

• Uses annotations to add metadata to the Web page in order to explicitly state the meaning of the elements

• E.g.

<menu>
  <table cellspacing="0" cellpadding="0" border="0" class="cnnCeilnav">
    <tr valign="middle" height="22">
      <td><a href="/">Home</a></td>
      <td><a href="/WORLD/">World</a></td>
      <td><a href="/US/">U.S.</a></td>
      <td><a href="/WEATHER/">Weather</a></td>
      <td><a href="http://money.cnn.com/index.html">Business</a>
      . . .
    </tr>
</menu>
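With an explicit annotation in place, a transcoder no longer has to guess where the menu is. The following sketch assumes the annotation is a wrapper element named `menu`, as in the example above, and reads the link labels inside it whatever markup is used within; `AnnotatedRegion` and `menu_labels` are hypothetical names.

```python
from html.parser import HTMLParser


class AnnotatedRegion(HTMLParser):
    """Collects the text of links that appear inside an annotation element."""

    def __init__(self, annotation: str):
        super().__init__()
        self.annotation = annotation
        self.depth = 0         # > 0 while inside the annotation element
        self.in_link = False   # True while inside an annotated <a>
        self.labels = []

    def handle_starttag(self, tag, attrs):
        if tag == self.annotation:
            self.depth += 1
        elif tag == "a" and self.depth:
            self.in_link = True
            self.labels.append("")

    def handle_endtag(self, tag):
        if tag == self.annotation and self.depth:
            self.depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.labels[-1] += data


def menu_labels(html: str, annotation: str = "menu") -> list:
    """Returns the link labels found inside the annotated region."""
    parser = AnnotatedRegion(annotation)
    parser.feed(html)
    parser.close()
    return [label.strip() for label in parser.labels]
```

The same call returns the same labels whether the annotated region is laid out as a table or as a list, which is what makes the annotation robust to layout changes.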

Page 12: Structural Semantics for Accessibility and Device Independence

Semantic Transcoding

• Very accurate
  ● We can modify the page layout, but as long as the annotations remain, the transcoding will work

• Every Web page must be annotated, limiting the number of pages that can be transcoded
  ● Time consuming
  ● Issues of document ownership

Page 13: Structural Semantics for Accessibility and Device Independence

CSS

• Cascading Style Sheets support the separation of presentation from content
  ● Information about fonts, colour, positioning etc. is held in the style sheet

• Style sheets often have some implicit semantics
  ● These semantics are encoded in the names of the elements rather than in some formal structure
  ● Use of terms like header, footer or nav
  ● Layout and presentation can add implicit meaning

Page 14: Structural Semantics for Accessibility and Device Independence

SADIe

• Semantics are implicitly encoded within the visual presentation of the Web page

• Cascading Style Sheets define the visual presentation of the pages within a Website

• Defining the role of the Cascading Style Sheet element, by association, defines the role of the Web page element

• Gain the best of both worlds
  ● Accurate transcoding in the same manner as Semantic Transcoding
  ● Element definitions of a single CSS can be applied to multiple Web pages in a manner similar to Heuristic Transcoding

Page 15: Structural Semantics for Accessibility and Device Independence

Annotating The CSS

[Figure labels: Upper Level Ontology; Extended Ontology; SADIe Application; cnnCSS, containing cnnCeilnav, cnnBodyText and cnnBottomNav]
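The idea of annotating the style sheet once and reusing it across pages can be sketched as a lookup from CSS class name to role. The class names come from the slide and the role names from the "Implicit Knowledge" slide, but the particular pairings in `CNN_ROLES` are assumptions made for illustration, as are the names `Role`, `RoleTagger` and `roles_in_page`. A plain dictionary stands in for the ontology.

```python
from enum import Enum
from html.parser import HTMLParser


class Role(Enum):
    MENU = "Menu"
    MAIN_CONTENT = "Main Content"
    BANNER = "Banner"
    ADVERTISEMENT = "Advertisement"


# Assumed pairings: one annotation per CSS class, written once per style sheet.
CNN_ROLES = {
    "cnnCeilnav": Role.MENU,
    "cnnBodyText": Role.MAIN_CONTENT,
    "cnnBottomNav": Role.MENU,
}


class RoleTagger(HTMLParser):
    """Records the role of every element whose class is annotated."""

    def __init__(self, class_roles):
        super().__init__()
        self.class_roles = class_roles
        self.found = []  # (tag, class name, role) in document order

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.class_roles:
                self.found.append((tag, name, self.class_roles[name]))


def roles_in_page(html: str, class_roles: dict) -> list:
    """Returns the annotated elements of a page, using the style sheet mapping."""
    tagger = RoleTagger(class_roles)
    tagger.feed(html)
    tagger.close()
    return tagger.found
```

Any page that uses the same style sheet can be passed to `roles_in_page` with the same mapping, so no per-page annotation is needed.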

Page 16: Structural Semantics for Accessibility and Device Independence

SADIe Implementation

• Implemented as a proxy
  ● All browsing requests from the client pass through the proxy, where transformation takes place
  ● Proxy rewrites HTML pages to provide an accessible version of the content

• Allows users to:
  ● Defluff – Removing non-essential elements
  ● Re-order – Promoting elements that are considered important to the top of the page
  ● Toggle Menus – Show/hide navigational menus
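Once each page element has a role, the three user options reduce to simple list operations. The sketch below models a page as an ordered list of role-tagged blocks. The `Block` type, the function names, and the choice of which roles count as non-essential or important are all assumptions for illustration, not SADIe's actual rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    role: str   # e.g. "Menu", "Banner", "Advertisement", "Main Content"
    html: str   # markup of the page element


def defluff(blocks, fluff=("Advertisement",)):
    """Removes elements whose role is considered non-essential."""
    return [b for b in blocks if b.role not in fluff]


def reorder(blocks, important=("Main Content",)):
    """Moves important elements to the top, keeping relative order otherwise."""
    # sorted() is stable, and False sorts before True.
    return sorted(blocks, key=lambda b: b.role not in important)


def toggle_menus(blocks, show: bool):
    """Shows or hides navigational menus."""
    return list(blocks) if show else [b for b in blocks if b.role != "Menu"]
```

The operations compose, e.g. `reorder(defluff(page))` drops advertisements and then promotes the main content.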


Page 17: Structural Semantics for Accessibility and Device Independence

SADIe Transcoder

Page 18: Structural Semantics for Accessibility and Device Independence

SADIefied CNN

Page 19: Structural Semantics for Accessibility and Device Independence

Evaluation

• We want to show that using SADIe decreases the time it takes to find information on the page

• Four methods of testing information retrieval on Web pages:
  ● Simple Fact Question: Involves the user finding a fact on the page that is either true or false.
  ● Judgement Question: Involves the user viewing a Website and providing a judgement.
  ● Comparison of Fact Questions: Involves the user finding a series of facts and then answering a question that is either true or false.
  ● Comparison of Judgement Questions: Involves the user viewing a Website, comparing the facts and reaching a conclusion.

Page 20: Structural Semantics for Accessibility and Device Independence

Evaluation Hypothesis

• H0: The time it takes to complete a fact-based task on a Web page is the same regardless of whether the page that is used is SADIefied.

• H1: The time it takes to complete a fact-based task on a SADIefied Web page is less than the time it takes to complete the task on a non-SADIefied page.

Page 21: Structural Semantics for Accessibility and Device Independence

Evaluation Methodology

• 20 pages with similar, predominantly text-based content
  ● News, e.g. CNN, BBC, New York Times…
  ● Blogs, e.g. Blogger, Xanga…

• Asked the user to find facts that were as similar as possible for each page
  ● E.g. for news sites: “What is the headline of the first story?”

• The user was presented with one page at a time, some of which were SADIefied

• We timed how long it took the user to answer the question

Page 22: Structural Semantics for Accessibility and Device Independence

Evaluation Results

• So far we have evaluated SADIe with a single user

• Results are encouraging and are significant using Randomization Testing…

• … but we would like more users to support our results.
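For readers unfamiliar with randomization testing, the sketch below shows one common form: an exact one-sided test on the difference in mean completion time between two groups of timings. The slides do not say which variant was used, so the two-group design, the test statistic and the function name `randomization_test` are assumptions.

```python
from itertools import combinations
from statistics import mean


def randomization_test(sadie_times, plain_times):
    """Exact one-sided randomization test.

    Returns the proportion of relabelings of the pooled timings in which
    (mean of 'plain' group - mean of 'SADIefied' group) is at least as large
    as the observed difference. Small values favour H1 (SADIefied is faster).
    """
    pooled = list(sadie_times) + list(plain_times)
    n_sadie = len(sadie_times)
    observed = mean(plain_times) - mean(sadie_times)

    total = 0
    at_least_as_extreme = 0
    # Enumerate every way of choosing which timings are labelled 'SADIefied'.
    for chosen in combinations(range(len(pooled)), n_sadie):
        chosen_set = set(chosen)
        group_sadie = [pooled[i] for i in chosen]
        group_plain = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        if mean(group_plain) - mean(group_sadie) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / total
```

With toy timings of 1, 2, 3 seconds for SADIefied pages and 4, 5, 6 seconds for unmodified pages (invented for illustration, not measured), only 1 of the 20 possible relabelings is as extreme as the observed one, so the function returns 0.05. Exhaustive enumeration is only practical for small samples; larger samples would normally use random relabelings instead.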

Page 23: Structural Semantics for Accessibility and Device Independence

Further Work

• This is still preliminary work, and much remains to do

• Analysis of coping strategies
  ● Informing our transformations and transcodes

• (Semi)-Automation of mappings for stylesheets

• Richer upper level ontology
  ● Currently the ontology is essentially a taxonomy

• More User Evaluations

Page 24: Structural Semantics for Accessibility and Device Independence

Conclusions

• Browsing the Web can be difficult for those who are visually impaired

• SADIe can apply transcoding by using implicit information extracted from the CSS

• Initial evaluation results are promising and show that SADIe can help visually impaired users reach content more quickly

• More work still needs to be done

Page 25: Structural Semantics for Accessibility and Device Independence

Questions?

http://www.cs.manchester.ac.uk/img/sadie