
FingerReader: A Wearable Device to Support Text Reading on the Go

Roy Shilkrot 1
Pattie Maes 1
Jochen Huber 1,2
Suranga C. Nanayakkara 2
Connie K. Liu 1

1 MIT Media Lab, 75 Amherst Street, Cambridge, MA 02139, USA

2 Singapore University of Technology and Design, 20 Dover Drive, Singapore

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). Copyright is held by the author/owner(s). CHI 2014, April 26-May 1, 2014, Toronto, Ontario, Canada. ACM 978-1-4503-2474-8/14/04. http://dx.doi.org/10.1145/2559206.2581220

Abstract

Visually impaired people report numerous difficulties with accessing printed text using existing technology, including problems with alignment, focus, accuracy, mobility and efficiency. We present a finger-worn device that assists the visually impaired with effectively and efficiently reading paper-printed text. We introduce a novel, local-sequential manner for scanning text which enables reading single lines, blocks of text, or skimming the text for important sections while providing real-time auditory and tactile feedback. The design is motivated by preliminary studies with visually impaired people, and it is small-scale and mobile, which enables a more manageable operation with little setup.

    Author Keywords

Assistive technology; Text reading; Wearable camera; Finger-worn interface

ACM Classification Keywords

K.4.2 [Social Issues]: Assistive technologies for persons with disabilities; B.4.2 [Input/Output Devices]: Voice; I.4.8 [Scene Analysis]

Introduction

Accessing text documents is troublesome for visually impaired (VI) people in many scenarios, such as reading text on the go and accessing text in less-than-ideal conditions (e.g., low lighting, columned text, unique page orientations). Interviews we conducted with VI users revealed that available technologies, such as screen readers, desktop scanners, smartphone applications, eBook readers, and embossers, are commonly under-utilized due to slow processing speeds or poor accuracy. Technological barriers inhibit VI people's abilities to gain more independence, a characteristic widely identified as important by our interviewees.

In this paper, we present our work towards creating a wearable device that could overcome some issues that current technologies pose to VI users. The contribution is twofold:

First, we present results of focus group sessions with VI users that uncovered salient problems with current text-reading solutions and the users' ideas of future assistive devices and their capabilities. The results serve as grounds for our design choices.

Second, we present the concept of local-sequential text scanning, where the user scans the text progressively with the finger, which presents an alternative solution for problems found in existing methods for the VI to read printed text. Through continuous auditory and tactile feedback, our device allows for non-linear reading, such as skimming or skipping to different parts of the text, without visual guidance. To demonstrate the validity of our design, we conducted early user studies with VI users to assess its real-world feasibility as an assistive reading device.

Related Work

Giving VI people the ability to read printed text has been a topic of keen interest in academia and industry for the better part of the last century. The earliest attainable evidence of an assistive text-reading device for the blind is the Optophone from 1914 [6]; however, the more notable effort from the mid-20th century is the Optacon [10], a steerable miniature camera that controls a tactile display. In Table 1 we present a comparison of recent methods of text-reading for the VI based on key features: adaptation for less-than-perfect imaging, target text, UI tailored for the VI, and method of evaluation. We found a general presumption that the goal is to consume an entire block of text at once, while our approach focuses on local text and gives the option of skimming over the text as well as reading it thoroughly. We also handle non-planar and non-uniformly lit surfaces gracefully for the same reason of locality, and provide truly real-time feedback.

Prior work presents much of the background on finger-worn devices for general public use [12, 15], although in this paper we focus on a wearable reading device for the VI.

    Assistive mobile text reading products

Academic effort is scarcely the only work in this space of assistive technology, with end-user products readily available. As smartphones became today's personal devices, the VI adopted them, among other things, as assistive text-reading devices with applications such as the kNFB kReader [1] and Blindsight's Text Detective [5]. Naturally, specialized devices still exist, such as ABiSee's EyePal ROL [3]; interestingly, however, the scene of wearable assistive devices is rapidly growing, with OrCam's assistive eyeglasses [2] leading the charge.


Table 1: Comparison of recent text-reading methods for the VI (P = precision, R = recall).

Publication            | Year | Interface        | Target               | Feedback | Adaptation        | Evaluation | Reported Accuracy
Ezaki et al. [7]       | 2004 | PDA              | Signage              |          |                   | ICDAR 2003 | P 0.56, R 0.70
Mattar et al. [11]     | 2005 | Head-worn        | Signage              |          | Color, Clutter    | Dataset    | P ?.??, R 0.90
Hanif and Prevost [8]  | 2007 | Glasses, Tactile | Signage              | 43-196 s |                   | ICDAR 2003 | P 0.71, R 0.64
SYPOLE [14]            | 2007 | PDA              | Products, Book cover | 10-30 s  | Warping, Lighting | VI users   | P 0.98, R 0.90
Pazio et al. [13]      | 2007 |                  | Signage              |          | Slanted text      | ICDAR 2003 |
Yi and Tian [18]       | 2012 | Glasses          | Signage, Products    | 1.5 s    | Coloring          | VI users   | P 0.68, R 0.54
Shen and Coughlan [16] | 2012 | PDA, Tactile     | Signage              |          |                   |            |


FingerReader: A wearable reading device

FingerReader is an index-finger wearable device that supports the VI in reading printed text by scanning with the finger (see Figure 1c). The design continues the work we have done on the EyeRing [12]; however, this work features novel hardware and software that include haptic response, video-processing algorithms and different output modalities. The finger-worn design helps focus the camera at a fixed distance and utilizes the sense of touch when scanning the surface. Additionally, the device does not have many buttons or parts, in order to provide a simple interface for users and to make the device easy to orient.

Figure 1: The ring prototypes: (a) old prototype, (b) new prototype, (c) ring in use.

Hardware details

The FingerReader hardware expands on the EyeRing by adding multimodal feedback via vibration motors, a new dual-material case design and a high-resolution mini video camera. Two vibration motors are embedded on the top and bottom of the ring to provide haptic feedback, via distinctive signals, on which direction the user should move the camera. The dual-material design provides flexibility in the ring's fit, helps dampen the vibrations and reduces confusion for the user (Fig. 1b). Early tests showed that users preferred signals with different patterns, e.g. pulsing, rather than vibrating different motors, because they are easier to tell apart.
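The paper does not include firmware or driver listings, so the following Python sketch is purely illustrative of how such distinguishable pulse patterns might be generated; the Motor class and its set_level method are hypothetical stand-ins for whatever motor driver the actual hardware uses.

import time

class Motor:
    # Hypothetical stand-in for a PWM vibration-motor driver.
    def __init__(self, name):
        self.name = name

    def set_level(self, level):
        # A real driver would write a PWM duty cycle here.
        print(f"{self.name}: {level:.2f}")

def pulse(motor, period=0.1, count=3):
    # A pulsing pattern; per the early tests, distinct patterns are
    # easier to tell apart than steady vibration of different motors.
    for _ in range(count):
        motor.set_level(1.0)
        time.sleep(period)
        motor.set_level(0.0)
        time.sleep(period)

top, bottom = Motor("top"), Motor("bottom")
pulse(top)                   # e.g. a fast "move up" cue
pulse(bottom, period=0.25)   # a slower pattern for "move down"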

Software details

To accompany the hardware, we developed a software stack that includes a text-extraction algorithm, a hardware control driver, and an integration layer with Tesseract OCR [17] and Flite Text-to-Speech (TTS) [4], currently in a standalone PC application.

The text-extraction algorithm expects as input a close-up view of printed text (see Fig. 2). We start with image binarization and selective contour extraction. Thereafter we look for text lines by fitting lines to triplets of pruned contours; we then prune for lines with feasible slopes. We look for supporting contours for the candidate lines based on distance from the line, and then eliminate duplicates using a 2D histogram of slope and intercept. Lastly, we refine the line equations based on their supporting contours. We extract words from characters along the selected text line and send them to the OCR engine. Words with high confidence are retained and tracked as the user scans the line. For tracking we use template matching, utilizing image patches of the words, which we accumulate with each frame. We record the motion of the user to predict where the word patches might appear next, in order to use a smaller search region. Please refer to the code1 for complete details.
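The complete implementation is in the linked repository; the sketch below only illustrates the general shape of such a pipeline, in Python with OpenCV, NumPy and pytesseract (assumptions for illustration, not necessarily the authors' stack). The function names and thresholds are placeholders, and the single least-squares line fit stands in for the triplet fitting, slope pruning and histogram de-duplication described above.

import cv2
import numpy as np
import pytesseract

def extract_line_words(frame):
    # Sketch: binarize, keep character-sized contours, fit one text
    # line through their centers, then OCR the supporting patches.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 31, 15)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours
             if 20 < cv2.contourArea(c) < 2000]      # placeholder size bounds
    if len(boxes) < 3:
        return []
    pts = np.float32([(x + w / 2.0, y + h / 2.0) for x, y, w, h in boxes])
    vx, vy, x0, y0 = cv2.fitLine(pts, cv2.DIST_L2, 0, 0.01, 0.01).ravel()
    slope = vy / (vx + 1e-9)
    if abs(slope) > 0.3:                             # prune infeasible slopes
        return []
    # Keep contours that "support" the line: small distance to it.
    support = [b for b, p in zip(boxes, pts)
               if abs((p[1] - y0) - slope * (p[0] - x0)) < 10]
    words = []
    for x, y, w, h in sorted(support):
        data = pytesseract.image_to_data(gray[y:y + h, x:x + w],
                                         config="--psm 8",
                                         output_type=pytesseract.Output.DICT)
        for text, conf in zip(data["text"], data["conf"]):
            if text.strip() and float(conf) > 70:    # high-confidence words only
                words.append(text)
    return words

def track_word(gray, patch, prev_x, motion, search=40):
    # Template-match a stored word patch, searching only near its
    # predicted horizontal position (previous x plus recent motion).
    x0 = max(0, int(prev_x + motion) - search)
    x1 = min(gray.shape[1], x0 + patch.shape[1] + 2 * search)
    res = cv2.matchTemplate(gray[:, x0:x1], patch, cv2.TM_CCOEFF_NORMED)
    _, score, _, loc = cv2.minMaxLoc(res)
    return x0 + loc[0], score                        # new x and match quality

Restricting the template-matching search to the predicted region, as in track_word above, is what keeps the per-frame cost low enough for real-time feedback.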

When the user veers from the scan line, we trigger tactile and auditory feedback. When the system cannot find more word blocks along the line, we trigger an event to let users know they have reached the end of the printed line. New high-confidence words incur an event and invoke the TTS engine to utter the word aloud. When skimming, users hear one or two words that are currently under their finger and can decide whether to keep reading or move to another area.
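Again purely as an illustration and not the authors' code, the per-frame event logic described above could be organized along these lines; on_frame, speak and cue are hypothetical names, with speak wrapping a TTS engine and cue driving the haptic/audio channels.

def on_frame(uttered, words, deviation, speak, cue):
    # Dispatch feedback events for one processed video frame.
    # `deviation` is the finger's offset from the tracked line (pixels);
    # `words` are the high-confidence words currently under the finger.
    if abs(deviation) > 15:        # user veered from the scan line
        cue("off_line")
    if not words:                  # no more word blocks on this line
        cue("end_of_line")
        return
    for word in words:
        if word not in uttered:    # a new high-confidence word
            speak(word)            # the TTS engine utters it aloud
            uttered.add(word)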

Our software runs on Mac and Windows machines, and the source code is available for download1. We focused on runtime efficiency: typical frame-processing time on our machine is within 20 ms, which is suitable for real-time processing. A low running time is important both for randomly skimming text and for feedback, as the user gets an immediate response once a text region is detected.

    1 Source code is currently hosted at: http://github.com/royshil/SequentialTextReading


Figure 2: Our software in the midst of reading, showing the detected line, the words and the extracted text.

    Evaluation

We evaluated FingerReader in a two-step process: an evaluation of FingerReader's text-extraction accuracy, and a user feedback session on the actual FingerReader prototype with four VI users. We measured the accuracy of the text-extraction algorithm in optimal conditions at 93.9% (σ = 0.037), in terms of character misrecognition, on a dataset of test videos with known ground truth, which tells us that this part of the system is working properly.
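The paper does not spell out how the misrecognition score is computed; one common way to score character-level OCR accuracy against a ground-truth transcript is via edit distance, as in this illustrative snippet (not the authors' evaluation code).

def char_accuracy(truth, ocr):
    # Character-level accuracy as 1 - (edit distance / reference length).
    m, n = len(truth), len(ocr)
    d = list(range(n + 1))        # DP row: distances against empty prefix
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            prev, d[j] = d[j], min(d[j] + 1,       # deletion
                                   d[j - 1] + 1,   # insertion
                                   prev + (truth[i - 1] != ocr[j - 1]))
    return 1.0 - d[n] / max(m, 1)

print(char_accuracy("finger", "fingor"))  # one substitution -> 0.833...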

User Feedback

We conducted a qualitative evaluation of FingerReader with 4 congenitally blind users. The goals were (1) to explore potential usability issues with the design and (2) to gain insight on the various feedback modes (audio, haptic, or both). The two types of haptic feedback were: fade, which indicated deviation from the line by gradually increasing the vibration strength, and regular, which vibrated in the direction of the line (up or down) if a certain threshold was passed. Participants were introduced to FingerReader and given a tablet with displayed text to test the different feedback conditions. Each single-user session lasted one hour on average, and we used semi-structured interviews and observation as data-gathering methods.

Each participant was asked to trace through three lines of text using the feedback as guidance, and to report their preference and impressions of the device. The results showed that all participants preferred the haptic fade to the other cues and appreciated that the fade could also convey the degree of deviation from the text line. Additionally, the haptic response provided the advantage of continuous feedback, whereas audio was fragmented. One user reported that "when [the audio] stops talking, you don't know if it's actually the correct spot because there's no continuous updates, so the vibration guides me much better." Overall, the users reported that they could envision the FingerReader helping them fulfill everyday tasks, explore and collect more information about their surroundings, and interact with their environment in a novel way.

Discussion and Summary

We contribute a novel concept of text reading for the VI, the local-sequential scan, which enables continuous feedback and non-linear text skimming. It is implemented in a finger-wearable device with a novel tracking-based algorithm that extracts text from a close-up camera view.


FingerReader presents a new way for VI people to read printed text locally and sequentially, rather than in blocks as existing technologies dictate. The design is motivated by a user-needs study that shows the benefit of continuous multimodal feedback for text scanning, a benefit that also shows in the qualitative analysis we performed. We plan to hold a formal user study with VI users that will contribute an in-depth evaluation of FingerReader.

Acknowledgements

We wish to thank the Fluid Interfaces and Augmented Senses research groups, K. Ran and J. Steimle for their help. We also thank the VIBUG group and all VI testers.

References

[1] KNFB kReader Mobile, 2010. http://www.knfbreader.com/products-kreader-mobile.php.

[2] OrCam, 2013. http://www.orcam.com/.

[3] ABiSee. EyePal ROL, 2013. http://www.abisee.com/products/eye-pal-rol.html.

[4] Black, A. W., and Lenzo, K. A. Flite: a small fast run-time synthesis engine. In ITRW on Speech Synthesis (2001).

[5] Blindsight. Text Detective, 2013. http://blindsight.com/textdetective/.

[6] d'Albe, E. F. On a type-reading optophone. Proc. of the Royal Society of London, Series A 90, 619 (1914), 373-375.

[7] Ezaki, N., Bulacu, M., and Schomaker, L. Text detection from natural scene images: towards a system for visually impaired persons. In ICPR (2004).

[8] Hanif, S. M., and Prevost, L. Texture based text detection in natural scene images: a help to blind and visually impaired persons. In CVHI (2007).

[9] Kane, S. K., Frey, B., and Wobbrock, J. O. Access Lens: a gesture-based screen reader for real-world documents. In Proc. of CHI, ACM (2013), 347-350.

[10] Linvill, J. G., and Bliss, J. C. A direct translation reading aid for the blind. Proc. of the IEEE 54, 1 (1966).

[11] Mattar, M. A., Hanson, A. R., and Learned-Miller, E. G. Sign classification for the visually impaired. UMass Amherst Technical Report 5, 14 (2005).

[12] Nanayakkara, S., Shilkrot, R., Yeo, K. P., and Maes, P. EyeRing: a finger-worn input device for seamless interactions with our surroundings. In Augmented Human (2013).

[13] Pazio, M., Niedzwiecki, M., Kowalik, R., and Lebiedz, J. Text detection system for the blind. In EUSIPCO (2007), 272-276.

[14] Peters, J.-P., Thillou, C., and Ferreira, S. Embedded reading device for blind people: a user-centered design. In ISIT (2004).

[15] Rissanen, M. J., Vu, S., Fernando, O. N. N., Pang, N., and Foo, S. Subtle, natural and socially acceptable interaction techniques for ringterfaces: finger-ring shaped user interfaces. In Distributed, Ambient, and Pervasive Interactions. Springer, 2013, 52-61.

[16] Shen, H., and Coughlan, J. M. Towards a real-time system for finding and reading signs for visually impaired users. In Computers Helping People with Special Needs. Springer, 2012, 41-47.

[17] Smith, R. An overview of the Tesseract OCR engine. In ICDAR (2007), 629-633.

[18] Yi, C., and Tian, Y. Assistive text reading from complex background for blind persons. In Camera-Based Document Analysis and Recognition. Springer, 2012, 15-28.