Building a phonics engine for automated text guidance

Post on 14-Apr-2017

Building a Phonics Engine for Automated Text Guidance

Dominik Lukeš, Dyslexia Action
Chris Litsas, NTUA

www.ilearnrw.eu

Outline

• Struggling readers' needs
• Linguistic background
• Phonics engine need
• Phonics engine specification
• Phonics engine implementation
• Phonics engine applications
• Next steps

Needs of dyslexic people

• Identifying the syllables in a word
• Recognising the structure of words (stem, prefix, suffix)
• Highlighting typical or repeated patterns of English orthography
• Identifying phoneme/grapheme correspondence
• Learning the pronunciation of a word
• Learning the meaning of a word

Linguistic background

Dearest creature in creation
Studying English pronunciation,
I will teach you in my verse
Sounds like corpse, corps, horse and worse.
Though the difference seems little,
We say actual, but victual,
Seat, sweat, chaste, caste, Leigh, eight, height,
Put, nut, granite, and unite.

Gerard Nolst Trenité - The Chaos (1922)

Linguistic background

• tough, though, through, bough, thought, cough, hiccough
• hosp.i.tal vs. hos.pit.al
• kitt.en vs. kit.ten
• walked, stopped, faked, tried
• exgirlfriend vs. exigent vs. exit
• English vs. Greek

Phonics engine need

• Finding all examples of ‘a’ spelled to rhyme with ‘hay’ in a text or a corpus.

• Sorting words by their phoneme/grapheme ratio.

• Identifying appropriate syllable boundaries in the written form of a multi-syllable word based on knowledge of the syllable boundaries in pronunciation.
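
The first two queries above can be sketched as follows. This is a minimal illustration, assuming each word is stored with a list of (grapheme, phoneme) pairs and that each pair carries exactly one phoneme. The mini-lexicon and the function names `words_with`, `phoneme_grapheme_ratio` and `sort_by_ratio` are hypothetical and are not the engine's actual API.

```python
# Hypothetical mini-lexicon: word -> list of (grapheme, phoneme) pairs.
# The entries are illustrative and not taken from the project's dictionary.
LEXICON = {
    "hay": [("h", "h"), ("ay", "eɪ")],
    "cat": [("c", "k"), ("a", "æ"), ("t", "t")],
    "paper": [("p", "p"), ("a", "eɪ"), ("p", "p"), ("er", "ə")],
    "though": [("th", "ð"), ("ough", "əʊ")],
}


def words_with(lexicon, grapheme, phoneme):
    """Return the words in which `grapheme` is pronounced as `phoneme`."""
    return sorted(w for w, pairs in lexicon.items() if (grapheme, phoneme) in pairs)


def phoneme_grapheme_ratio(word, pairs):
    """Phonemes per letter, assuming each pair carries exactly one phoneme."""
    return len(pairs) / len(word)


def sort_by_ratio(lexicon):
    """Order words from fewest to most phonemes per letter (ties alphabetical)."""
    return sorted(lexicon, key=lambda w: (phoneme_grapheme_ratio(w, lexicon[w]), w))


print(words_with(LEXICON, "a", "eɪ"))  # words where 'a' rhymes with 'hay'
print(sort_by_ratio(LEXICON))
```

In this toy lexicon, "though" sorts first because six letters carry only two phonemes, while "cat" sorts last with one phoneme per letter.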

Phonics engine specification

• provide automated guidance to students and teachers reading texts (using highlighting as well as explicit information)

• generate more extensive word lists for practice activities within the serious games

• provide information about word structure to the game engine

Phonics engine implementation

• Profile of phonic difficulties
• Annotated phonic dictionary
• Look up routines

Phonics profile - categories

Based on a modified and expanded version of the Dyslexia Action Literacy Programme

• Consonants (49)
• Vowels (71)
• Blends and letter patterns (131)
• Syllables (13)
• Suffixes (92)
• Prefixes (42)
• Confusing letters (15)

Phonics profile (JSON)

{
  "descriptions": ["a-æ"],
  "problemType": "LETTER_EQUALS_PHONEME",
  "humanReadableDescription": "a=æ (at) <> Pronounce a as æ. For example: at, as, and",
  "cluster": 3,
  "character": "Short vowel"
}
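
One way such a profile entry could be used is to test whether a word exercises the difficulty it describes. The sketch below assumes that each item in `descriptions` has the form `grapheme-phoneme`, which is inferred from the single example above, and that a word's mapping is a comma-separated list of such pairs. The helper names `parse_pairs` and `entry_matches` and the sample mappings are hypothetical.

```python
import json

# The profile entry from the slide, parsed from its JSON form.
PROFILE_ENTRY = json.loads("""
{
  "descriptions": ["a-æ"],
  "problemType": "LETTER_EQUALS_PHONEME",
  "humanReadableDescription": "a=æ (at) <> Pronounce a as æ. For example: at, as, and",
  "cluster": 3,
  "character": "Short vowel"
}
""")


def parse_pairs(mapping):
    """Split 'a-æ,t-t' into [('a', 'æ'), ('t', 't')]."""
    return [tuple(pair.split("-", 1)) for pair in mapping.split(",")]


def entry_matches(entry, mapping):
    """True if any grapheme/phoneme pair of the word is listed in the entry.

    Assumes every description is a 'grapheme-phoneme' string.
    """
    targets = {tuple(d.split("-", 1)) for d in entry["descriptions"]}
    return any(pair in targets for pair in parse_pairs(mapping))


# Hypothetical mappings for "at" and "hay".
print(entry_matches(PROFILE_ENTRY, "a-æ,t-t"))
print(entry_matches(PROFILE_ENTRY, "h-h,ay-eɪ"))
```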

Phonic dictionary

Word form: feelings
Related stem: feeling
Pronunciation: ˈfiː.lɪŋz
Phoneme/Grapheme Mapping: f-f,ee-iː,l-l,i-ɪ,ng-ŋ,s-z
Orthographic syllabification: fee.lings
Number of letters: 8
Number of phonemes: 6
Number of syllables: 2
Frequency band: 4
Suffix type: SUFFIX_ADD
Suffix form: s
Prefix type: PREFIX_NONE
Prefix form: NULL
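
The entry for "feelings" shows how the orthographic syllabification can follow from the pronunciation and the mapping. The sketch below is one possible way to do this, not the project's documented routine: it walks the grapheme/phoneme pairs in order and copies each syllable dot from the pronunciation into the spelling. The function names are hypothetical, and the code assumes the mapping lists graphemes first and covers the whole pronunciation.

```python
def parse_mapping(mapping):
    """Split 'f-f,ee-iː' into [('f', 'f'), ('ee', 'iː')] (grapheme first)."""
    return [tuple(pair.split("-", 1)) for pair in mapping.split(",")]


def orthographic_syllables(pronunciation, mapping):
    """Project the '.' syllable boundaries of the pronunciation onto the spelling."""
    # Drop primary and secondary stress marks; keep the '.' separators.
    phonemes = pronunciation.replace("ˈ", "").replace("ˌ", "")
    out = []
    pos = 0
    for grapheme, phoneme in parse_mapping(mapping):
        if phonemes.startswith(".", pos):
            out.append(".")  # boundary falls before this grapheme
            pos += 1
        if not phonemes.startswith(phoneme, pos):
            raise ValueError(f"mapping and pronunciation disagree at {phoneme!r}")
        out.append(grapheme)
        pos += len(phoneme)
    return "".join(out)


def counts(word, pronunciation, mapping):
    """Letters, phonemes (one per pair) and syllables, as listed in the entry."""
    return len(word), len(parse_mapping(mapping)), pronunciation.count(".") + 1


MAPPING = "f-f,ee-iː,l-l,i-ɪ,ng-ŋ,s-z"
print(orthographic_syllables("ˈfiː.lɪŋz", MAPPING))  # fee.lings
print(counts("feelings", "ˈfiː.lɪŋz", MAPPING))      # (8, 6, 2)
```

Because the boundary is copied from the pronunciation, this reproduces "fee.lings" rather than the morphological split "feel.ings".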

Building the phonic dictionary

• 5,000 most frequent words based on COCA
• Generated derived forms by reversing Hunspell
• Used online tool to generate pronunciation
• Create rules for matching pronunciation with spelling patterns
• Create rules for displaying
• Mark suffixes and prefixes and types
• Adjust frequencies
• Manual fine tuning (lots of regex)
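
The slides do not say how Hunspell was "reversed". One plausible reading is that affix rules, which a spell checker normally uses to strip endings, were applied in the other direction to expand each stem into its derived forms. The sketch below illustrates that idea with a simplified rule table written in the spirit of Hunspell suffix entries (strip, add, condition). The rule list and the `derive` function are assumptions made for illustration and do not use the Hunspell library itself.

```python
import re

# Simplified plural rules in the spirit of Hunspell SFX entries:
# (characters to strip, characters to add, condition on the stem ending).
PLURAL_RULES = [
    ("y", "ies", r"[^aeiou]y"),  # baby -> babies
    ("", "s", r"[aeiou]y"),      # day -> days
    ("", "es", r"[sxzh]"),       # box -> boxes
    ("", "s", r"[^sxzhy]"),      # feeling -> feelings
]


def derive(stem, rules):
    """Apply every rule whose condition matches the end of the stem."""
    forms = []
    for strip, add, condition in rules:
        if re.search(condition + "$", stem):
            base = stem[: len(stem) - len(strip)] if strip else stem
            forms.append(base + add)
    return forms


for stem in ["feeling", "baby", "box", "day"]:
    print(stem, derive(stem, PLURAL_RULES))
```

Each derived form can then be stored with its related stem and suffix type, as in the "feelings" entry.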

Phonics engine applications

• Phonics aware reader
• Game support – generating word lists
• Game support – provide word structure
• Game support – link word structures to profile
• Text classification tool
• Online text annotation tool

Phonics aware reader

Game support

Online text tools

Next steps

• Bigger dictionary with more information on words
• Fine-tuning of look up routines
• More sophisticated highlighting routines
• More sophisticated NLP
  – PoS
  – Sentence structure
  – Semantics
    • WordNet, FrameNet
    • Named Entities
    • Collocations
