Orthography Development: The ‘Midwife’ Approach
Mike Cahill Keren Rice
Colleen Fitzgerald Gwen Hyslop Kristine Stenzel
1
CoLang 2014: Institute on Collaborative Language Research
Contents of PowerPoint
• Introductory discussion (slides 5-9)
• Introduction to ‘Midwife’ approach (slides 10-28)
• Overview of linguistic issues (slide 29)
• Dealing with allophones (slides 30-42)
• Dealing with allomorphs (slides 43-48)
• Suprasegmental problems (slides 49-52)
• “New” sounds: Dene and Kurtöp (slides 53-79)
• Variation and standardization (slides 80-97)
• Review of Methodology (slides 98-101)
• Further issues (slides 102-119)
• A final political example (slides 120-129)
• Summary (slides 130-131)
• References/Contact info (slides 132-133)
2
Some background
• These slides were developed for a course at InField, taught in 2008 by Keren Rice and Kristine Stenzel, in 2010 by Gwen Hyslop and Keren Rice, in 2012 by Colleen Fitzgerald and Keren Rice, and now in 2014 by Keren Rice and Mike Cahill.
• All have first-hand experience in orthography development (detailed at the end of this presentation).
3
Our Goals:
1. Discuss important questions and parameters (socio-political, technical-linguistic, psycho-cognitive) related to orthography development
2. Consider an approach to orthography development (o.d.) based on community involvement, writing practice, and analysis
3. Provide opportunities for hands-on analysis
4. Exchange experiences, brainstorm, expand resources
4
Initial Discussion Questions
• What is an orthography?
• How would you define the role of the linguist in the process of orthography development?
• What do you think a language community expects from the linguist and from the orthography development process in general?
• What are the features of a ‘good’ orthography and what kinds of things do we need to know in order to develop one?
5
What is an orthography? Some thoughts for discussion
• Agreed upon system to represent sounds/words/concepts of a language
• Practical tool for communication
• …
6
What is the role of the linguist?
• Facilitator
• Mediator
• …
7
What does the community expect from the linguist?
• Intervention around different spellings and competing orthographies
• Expertise and connections that are not present in the community
• Legitimacy of the language
• …
8
What are features of a ‘good’ orthography?
• Easy to learn and to produce
• Minimize number of characters, maximize what they represent
• Culturally relevant
• Transfer from matrix language
• Visually contrastive
• …
9
The ‘Midwife’ approach
What is it?
10
The ‘midwife’ approach to the development of an orthography
• overall goal: to approach o.d. as a process
  • based on exchange and integration of knowledge and experiences of linguist and language community (LC)
  • with LC as active participant, sharing ‘joint responsibility’ for final outcome
• methodology: practice of writing and analysis of the language feed into each other
• linguist’s role: facilitator/guide in the practice-analysis dialectic
  • What kind of practice can help identify and focus the issues so that the analysis becomes clearer?
  • What kinds of appropriate metaphors can be useful tools?
11
Basic principles of the approach
• notion of o.d. as a process whereby members of a language community (LC) come to analyze aspects of their own language and develop a new practice: writing
• during the process (which may continue over an extended period of time), orthographic variation is ok
• continuous and reflective practice (LC writing and reading) is always the primary input to language-analysis activities
• LC linguistic knowledge and social interpretations are also a fundamental input
12
An overview of the ‘midwife’ approach to orthography development
Getting started 1: Discussion with LC
Getting started 2: Types of writing systems
Getting started 3: Learnability
13
Getting started in developing a writing system – 1: discussion with the LC
Why do we need to study our own language in order to think about writing it?
Discussion: How are oral language and written language similar and how are they different?
14
Written Language
• Communication between people in different places/times (extended).
• Requires more complete forms and additional symbols to aid understanding – tools to make sure that the writer’s message will reach the reader intact.
• Tends to reflect stable forms, changes more slowly.
• May include (or not) variation that represents differences between groups of speakers of the same language, especially during initial phases.
• May be unified (or not) as a result of a process of practice, analysis (discussion of variations and what they represent), and political decision-making.
Oral Language
• Communication between people in same place/time (immediate).
• Allows for reductions, use of body language and abbreviated deictic references, because misunderstandings / doubts can be resolved then and there.
• Is where innovations and change appear first.
• Always includes more types of variation, which may show different origins, group affiliations, or contexts requiring different registers (e.g. formal/informal).
15
Getting started – 2: presentation/discussion of types of writing systems and the symbols they use
What do symbols represent in different types of writing systems?
1. ‘Morphographic’ / ‘Logographic’ representations of words or morphemes
16
2. ‘Phonographic’ systems: representations of syllabic combinations
Cree
17
3. ‘Alphabetic’ representations of individual sounds
The traditional thought in o.d. is that each symbol in an alphabet should represent a phonological segment, ideally corresponding as directly as possible to the phonemes of the language.
Consonantal alphabets: symbols represent consonants
Full alphabets: symbols represent consonants and vowels (e.g. Greek and Latin alphabets)
18
Mayan
make by he light-fire Itzamna our Godfriction
‘Our God Itzamna made his fire using friction.’
19
All orthographies change over time
Roman Alphabet (2,600+ years old)
20
Getting started – 3: discussion of ‘Learnability’
Who is the writing system for? Will it be used primarily by native speakers? Learners of the language?
What kinds of orthographic features might help increase ‘learnability’ for each of these target groups?
21
An important assumption
• There is a writing system to begin from. (For instance, learners are literate in another language such as English or Spanish.)
• In such cases, familiarity with an existing system will probably lead the LC to adopt a similar type of orthographic representation, but will require analysis so that they can recognize where adjustments need to be made.
22
A ‘getting started’ exercise for the LC
This type of exercise works well in workshop-type situations, and will likely provide activities for many days of work. It is a good way to get a large variety of members of the LC involved in the discussions. If activities are organized in groups, literate and non-literate individuals and speakers with varying degrees of fluency can have input.
23
A. LC participants choose a theme (or themes) and write short texts (individually or in groups)
B. Participants exchange texts to read, making lists of doubts they encounter or alternative ways to write specific words
24
C. Participants present their doubts and suggestions to the entire group – this is the data that will guide the analysis and inform decision-making
25
What kinds of information are likely to be revealed by this initial exercise?
In terms of orthographic symbols, that:
• Various symbols are being used for the same sound
• No symbol is available for a sound in the language
Both cases may result from the effect of literacy in a different language or from alternate existing orthographies. Recognition of where the problems lie is a first step in analysis.
26
In terms of phonology:
• Sets of examples of important phonological elements and indications as to their ‘functional loads’
• Evidence of allophonic variation
• Indication of variation between speakers of different ages or from different regions
In terms of morphology:
• Questions as to word boundaries and other morphological issues such as what to do about compound words or complex constructions
27
Continuing the exercise . . .
D. As the participants present the results of these activities, the linguist should be able to recognize and group together the different categories of ‘doubts’ and begin to think about how to work on them with the LC
E. Subsequent activities should focus on individual issues, analyzing them with the LC so that informed decisions can be made collectively
28
Linguistic issues: what to do about . . .
• Allophones
• Allomorphs
• Suprasegmentals
• Sounds in the language that are not represented in a known writing system
29
Representing allophones
30
Allophones in English
pool [pʰ]
spool [p]
These allophones have the same representation in the orthography: both are written p.
31
An example of allophones and their representation in the orthography in o.d.:
Kotiria (Eastern Tukanoan) [d] and [r]
In this language, as onset consonants, these sounds occur in complementary distribution: [d] word-initially and [r] word-internally.
dukuri ‘manioc roots’
duhire ‘you/he/she/they sat’
diero ‘a dog’
What decision was made in this case?
32
Analysis with the LC
a) Participants in the language workshop compiled a list of words containing the two sounds from their own written texts
b) All occurrences of the sounds were highlighted, so that participants could visually observe their distribution
c) Participants were asked if they could think of other words with different sounds in the positions of [d] and [r] (in other words, to find minimal pairs), leading to analysis and recognition of /d/ as a ‘basic sound’ (phoneme) and [r] as a ‘variant’ (allophone)
33
d) Once speakers had observed and analyzed for themselves that [r] was a variant of /d/ in a specific position, it was possible to discuss whether or not to represent it with a different symbol
e) Collectively, several of the texts were re-written using only ‘d’ and speakers were asked to evaluate how they felt, as writers and readers, about the use of a single symbol
34
Coming to conclusions
f) While recognizing /d/ as the underlying sound, participants felt uncomfortable using the symbol ‘d’ in both positions. They argued that it contradicted a well-established surface distribution of sounds, making the written and spoken versions of the language look too different. Additionally, use of ‘d’ in word-internal position made the written texts look like they represented the pronunciation of closely-related languages in the family, in which the d-r distribution does not occur.
g) Thus, the LC has opted to use different symbols for the ‘d’ and ‘r’ sounds in the orthography, a decision informed by linguistic analysis but respectful of input from the LC as the end users of the system.
35
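The highlighting step in (b) and (c) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each letter is treated as one sound, the function names are made up for this sketch, and the word list is just the three Kotiria examples above. It checks where each sound occurs; it does not replace the search for minimal pairs or the LC's own judgment.

```python
from collections import defaultdict

def positions_by_sound(words, sounds):
    """Record whether each sound was seen word-initially or word-internally."""
    seen = defaultdict(set)
    for word in words:
        for index, letter in enumerate(word):
            if letter in sounds:
                seen[letter].add("initial" if index == 0 else "internal")
    return dict(seen)

def in_complementary_distribution(words, a, b):
    """True if both sounds occur in the list and never share a position."""
    seen = positions_by_sound(words, {a, b})
    return bool(seen.get(a)) and bool(seen.get(b)) and not (seen[a] & seen[b])

# The three example words from the workshop list (assumed spelling: one letter per sound)
words = ["dukuri", "duhire", "diero"]
print(positions_by_sound(words, {"d", "r"}))
print(in_complementary_distribution(words, "d", "r"))
```

On a longer word list compiled from participants' own texts, the same tally would show at a glance whether [d] ever turns up word-internally or [r] word-initially.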
What if…?
• Kɔnni (Gur, northern Ghana) has a similar distribution of [d] and [r]:
  • dàáŋ ‘stick’, dígí ‘to cook’
  • kʊrʊbâ ‘bowl’, chʊrʊ ‘husband’
• These appear to be allophones of /d/.
But, some complications: [d] is intervocalic when it’s
• lexeme-initial (in a compound word): jùò-dìkkíŋ ‘cooking room’ (cf. digi ‘to cook’)
• in borrowed words and ideophones: kòdú ‘banana’ (Twi), bìn-dúdù ‘dung-beetle’
• Discuss: Does this make a difference? What other questions would you ask?
Factors to check:
• Speakers’ preferences
• Other neighboring languages
• Any other linguistic or psycholinguistic evidence?
• …
• Almost totally illiterate group, not informed enough to express a preference
• All related languages have both <d> and <r>. Sometimes separate phonemes, sometimes not. Also, influence of English.
• And…
A test…
• Other voiced stops lenite intervocalically; couldn’t /d/ also? Stops occur in careful speech, fricatives in casual.
  • bɔbɪ ~ bɔβɪ ‘to tie’
  • hɔgʊ ~ hɔɣʊ ‘woman’
• However, Kɔnni speakers can tell the difference between [d] and [r], and corrected my pronunciation when I attempted *[hààdɪŋ] rather than [hààrɪŋ] ‘boat’
• Conclusion: /d/ and /r/ have recently become separate phonemes and, in conformity with other languages, are written with two symbols.
Another Allophone Example
• Choctaw, a Native American language in Mississippi and Oklahoma, has had a number of orthographies.
• The language has three vowel phonemes: /i o a/, which can be short, long, or nasalized.
• However, the writing systems often use six symbols, following how the language was written in the 19th century, associated with Cyrus Byington.
40
Choctaw Vowels
• Two of the allophones of phoneme /a/ get represented in the writing system. Allophone [a] tends to appear in open syllables, written as a.
  • chaha 'tall'
  • taloowa 'sing'
• The other allophone, [ə], tends to appear in short closed syllables and is written using a symbol not used as a vowel in English, ν.
  • anνmpa 'word, language'
  • kνllo 'hard'
(from Alphabet links at Choctaw Language School online)
41
Conclusions from the Choctaw allophones
• This example shows that communities sometimes choose to write allophones.
• Not all Choctaw vowel allophones are represented with unique orthographic symbols, but some are.
• We will see some parallels in the upcoming allomorph examples from English, where the writing system represents some of the variation overtly.
42
Questions of allomorphy
43
Problems of allomorphy
Shallow vs. deep orthographies
Shallow: close to pronunciation
Deep: preserves graphic identity of meaningful elements
44
English allomorphy
A combination of deep:
cats [s], dogs [z]
And shallow:
intangible [n], impossible [m]
45
Allomorphs: Dene voicing alternations
sa ‘watch’     sezá ‘my watch’
xa ‘hair’      seghá ‘my hair’
shá ‘knot’     sezhá ‘my knot’
Shallow orthography? sa, sezá (“phonetic”)
Deep orthography? sa, sesá (“phonemic”)
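The difference between the two options can be made concrete with a small sketch. It assumes, for illustration only, that the possessive prefix is written se-, that the stem is supplied with the tone it carries in the possessed form, and that the only alternation involved is the voicing of a stem-initial fricative (s/z, x/gh, sh/zh) seen in the examples above. The function name and mapping table are hypothetical.

```python
# Voiceless fricative -> voiced counterpart, taken from the three examples above.
VOICED = {"sh": "zh", "s": "z", "x": "gh"}

def possessed_form(stem, shallow=True, prefix="se"):
    """Spell prefix + stem under a shallow or a deep convention.

    shallow=True  writes the voiced fricative that is actually pronounced.
    shallow=False keeps the stem spelled as it is in isolation.
    """
    if shallow:
        # Try the digraph "sh" before the single letter "s".
        for voiceless in sorted(VOICED, key=len, reverse=True):
            if stem.startswith(voiceless):
                stem = VOICED[voiceless] + stem[len(voiceless):]
                break
    return prefix + stem

print(possessed_form("sá", shallow=True))   # shallow spelling of 'my watch'
print(possessed_form("sá", shallow=False))  # deep spelling of 'my watch'
```

A deep spelling keeps the stem visually constant across forms; a shallow spelling lets a writer work from pronunciation alone.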
46
The process 1
An orthography standardization committee was established to make decisions about the orthography.
A few decisions involved symbols; most involved spelling conventions.
The committee considered basic principles – audience, goals of writing, transfer from English, …
47
The process, continued
The committee identified areas of concern with the different choices.
People experimented with the different ways of writing words with these alternations.
Decision: shallow orthography
Why? Easier to figure out from the pronunciation
48
Beyond the segment: suprasegmental problems
49
Nasalization in Tukanoan languages
In Tukanoan languages, nasalization is a property of the morpheme rather than of individual segments. It therefore functions as a suprasegmental, and the question quickly arises of how it should be represented in the orthography of these languages.
50
Analysis of nasalization with the LC:
Finding metaphors to help speakers understand how nasalization works in different kinds of languages . . .
‘raincoat’ nasalization – ‘covers’ individual segments (e.g. in Portuguese)
‘umbrella’ nasalization – ‘covers’ the entire morpheme (Tukanoan languages)
51
Some nasalization proposals
Over time, after analyzing and understanding how nasalization operates in Tukanoan languages, participants in language workshops ‘tested’ a number of different proposals for how to mark nasalization. In each case, writing and reading exercises using the different possibilities were proposed, practiced, and then evaluated. Eventually, it was collectively decided that:
• Morphemes with nasal consonants (m, n, ñ) require no further marking, the nasal C being sufficient to identify the morpheme as +nasal
• In morphemes with no nasal C, the first vowel is marked with a tilde (ṽ) to indicate the morpheme as +nasal
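The two-part convention can be written out as a short rule. The sketch below is illustrative only: the vowel inventory (a, e, i, o, u, ʉ) and the function name are assumptions made for the example, and the caller has to say whether the morpheme is nasal, since that is a lexical property rather than something the spelling rule can discover.

```python
import unicodedata

NASAL_CONSONANTS = set("mnñ")
VOWELS = set("aeiouʉ")      # assumed vowel letters, for illustration
COMBINING_TILDE = "\u0303"

def write_nasal_morpheme(morpheme, is_nasal):
    """Apply the convention: a nasal consonant is enough; otherwise tilde the first vowel."""
    if not is_nasal:
        return morpheme
    if any(letter in NASAL_CONSONANTS for letter in morpheme):
        return morpheme  # m, n or ñ already identifies the morpheme as +nasal
    for index, letter in enumerate(morpheme):
        if letter in VOWELS:
            marked = morpheme[:index + 1] + COMBINING_TILDE + morpheme[index + 1:]
            return unicodedata.normalize("NFC", marked)
    return morpheme  # no vowel found; leave unchanged

print(write_nasal_morpheme("waha", is_nasal=True))  # tilde on the first vowel only
print(write_nasal_morpheme("wama", is_nasal=True))  # unchanged: contains m
```

Marking only the first vowel keeps the page uncluttered while still letting a reader recover the ‘umbrella’ of nasalization over the whole morpheme.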
52
‘New’ sounds: sounds not distinctly represented in a known writing system
53
‘New’ sounds 1: Dene mid front open and closed vowels
•Early orthography used the symbol {e} for both an open and closed front vowel.
•Both these vowels exist in some dialects.
54
The process
• A question: Should these vowels be differentiated?
• The answer: yes!
• Why?
  • Accurate representation of sounds of the dialect
  • Ease of reading
55
The process: What symbol to use?
• A new symbol is needed.
• The open vowel is more common than the closed vowel.
• Choice: symbol {e} for the open vowel; schwa (‘upside down e’) for the closed vowel
• This choice was made because the open vowel is more common, and it meant fewer changes in how people were already writing.
• This decision was a surprise for some of the linguists involved, but people liked it because they knew that schwa was used in linguistics.
56
‘New’ sounds 2: The case of Kurtöp
• Kurtöp is a Tibeto-Burman language of Bhutan
• About 15,000 speakers
• Speakers who are literate are usually familiar with 1) English and 2) Dzongkha
• Roman orthography was a natural product but the ’Ucen system was suggested by the community and the Dzongkha Development Commission
57
Which version of ’Ucen?
<tshugs.yig> tshui
<mgyogs.yig> joyi
• We opted to begin with joyi, since it was what children learned and was purely Bhutanese (as opposed to tshui, which is shared with Tibetan).
58
The ’Ucen syllable
In the Classical Tibetan orthography, an abugida derived from Brahmi and devised in 632 AD, syllables are represented according to this diagram.
The “R” represents a simple onset, or in the case of an onset-less syllable, the vowel. C1, C2, and C4 may be used to add consonants to the onset, making it complex. The V slots are for vowels (i, e, o go above; u goes below). C3 represents a single coda (if present) and C5 makes a complex coda (rarely occurs).
59
The ’Ucen syllable
<bsgrubs>
For example, this is how the Classical Tibetan word /bsgrubs/ was written. The complex onset is represented by <b> in C1 position, <s> in the C2 position, <g> in the root position, and <r> in the C4 position. The vowel /u/ is represented below the C4. <b> in C3 and <s> in C5 indicate the complex coda.
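The slot structure can be modelled as a small record, which makes it easier to see which slot each letter of <bsgrubs> fills. This is a sketch for discussion, not a description of any font or encoding: the class and field names are invented here, it works with Roman transliteration rather than ’Ucen glyphs, and it assumes that a syllable with no vowel mark is read with the vowel a.

```python
from dataclasses import dataclass

@dataclass
class UcenSyllable:
    """One syllable described by the slots in the diagram (Roman transliteration)."""
    root: str          # R: the simple onset
    c1: str = ""       # written before the root
    c2: str = ""       # written above the root
    c4: str = ""       # written below the root
    vowel: str = ""    # i, e, o above; u below; empty = no vowel mark
    c3: str = ""       # single coda
    c5: str = ""       # second coda consonant

    def transliterate(self):
        # Assumption for this sketch: an unmarked syllable is read with "a".
        vowel = self.vowel or "a"
        return self.c1 + self.c2 + self.root + self.c4 + vowel + self.c3 + self.c5

bsgrubs = UcenSyllable(root="g", c1="b", c2="s", c4="r", vowel="u", c3="b", c5="s")
print(bsgrubs.transliterate())
```

Laying a word out this way also shows quickly when a language needs a combination that the traditional slots do not provide.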
60
The ’Ucen syllable
Traditionally, there is a fixed number of symbols available for each slot. C1 may be one of five symbols; R may be one of 30; C2 may be one of three (one of which is modified from its occurrence elsewhere); C3 draws from ten possible symbols; C4 draws from a set of five (mainly) ‘half’ symbols; and C5 may be one of two. The top V may be one of three vowel diacritics and the lower V is reserved for one diacritic.
In Joyi, various combinations of C2 with R, or C4 with R, lead to unique symbols reserved for the exclusive representation of the combination, similar to ‘conjuncts’ in Devanagari.
61
’Ucen and Tibetan
• Classical Tibetan phonology had around 28 consonants (labial, dental, palatal, velar).
• And complex onsets
• And five vowels
• No tone
• ’Ucen was designed for this phonology
62
’Ucen and Tibetan
• However, after almost 1,400 years of change, Lhasa Tibetan (the prescribed standard) has:
  • A new series of retroflex consonants
  • Two new vowels (front high and mid rounded)
  • High and low tonal registers; level and falling tonal contours
  • Changes in voicing/aspiration contrasts
  • Simplified onsets
• Words are NOT pronounced as written!
63
’Ucen and Bhutan
• The modern use of ’Ucen assumes the 1,400 years of change from Classical Tibetan to modern Lhasa Tibetan.
• ’Ucen is used this way in Bhutan; for example, words with complex onsets in Classical Tibetan are still written as such in modern Tibetan/Dzongkha, but not pronounced as such.
• Representing any pronunciation using ’Ucen requires the reader to infer the sound change.
• There is no way to represent various aspects of the phonology – such as the complex onsets – in the history of Bhutanese education.
64
’Ucen and Tibetan
• For example, the spelling <bsgrubs> is pronounced: ɖùp
65
’Ucen and Kurtöp
• Kurtöp is not a descendant of Classical Tibetan.
• The phonology of Kurtöp is different from the phonology of Classical Tibetan or Dzongkha.
• Kurtöp tone, vowel length, and complex onsets are particularly difficult to represent.
• The following is an illustration of how we chose to represent complex onsets.
66
Kurtöp phonology
Kurtöp complex onsets
67
The problem
<pr-> is pronounced as a voiceless retroflex, but in Kurtöp /pra/ = ‘monkey’
68
Midwife process
• So what do you do with the previously unwritten Kurtöp?
• We presented ideas to a small group of literate Kurtöp speakers
• Consulted local teachers
• Consulted highly educated speakers of related languages with similar phonologies
69
Midwife process
Idea 1: Use ’Ucen in a way similar to Roman.
But the following problem developed: How to represent vowels other than /ɑ/?
<pra>
70
Midwife process
<ble>: This would be confused with /lé/ in Dzongkha/Tibetan conventions.
<bele>: This leads people to tend to pronounce the word correctly, but does not follow the traditional conventions and is unattractive.
71
Midwife process
• In 2009 we organized a workshop with the Dzongkha Development Commission, Scott DeLancey, local leaders and interested community members to address all the issues
72
Proposed solution
• We will add ‘half’ letters to be used directly below the root consonant.
• Based on existing (but rarely used) conventions established in Tibetan to represent different languages.
• Should not affect Dzongkha transference issues
• Aesthetically pleasing
• Kurtöp speakers find it intuitive and easy to read
73
Proposed solution
•Existing computer fonts do not allow the needed combinations
•Chris Fynn, DDC font developer, agreed to adapt the Bhutan ’Ucen fonts (joyi and tshui) to accommodate the new combinations
• In addition to the complex onsets, the adapted fonts will be able to mark tone
74
Proposed solution
• The Tshui font is finished but the Joyi font has been held up indefinitely for unknown reasons.
• In addition to handling the ‘new’ complex onsets, we also have a way (marks above the other symbols in top row) to mark tone, another ‘new’ sound.
75
Moving forward (the midwife process continues)
• The Kurtöp/English/Dzongkha dictionary is expected to be published in 2013.
• Kurtöp entries will use the new font and proposed combinations, in Joyi if it is made soon, or else using Tshui.
• Testing will continue…
76
Complex scripts
• SIL’s “Non-Roman Script Initiative” (NRSI) works to develop computer solutions for complex scripts. (http://scripts.sil.org/cms/scripts/page.php?item_id=Welcome)
• Also see scriptsource.org for a participative site.
What have we seen – and not
• Linguistically, we looked at orthography choices with respect to the implications of representing:
  • allomorphs, allophones, suprasegmentals, and sounds in the language that are not represented in a known writing system
• With sounds that are not represented in a known writing system, different choices might be made, with different pros and cons to each choice.
• Let's consider Choctaw, which has a voiceless lateral, IPA symbol /ɬ/.
78
Considering implications of symbol choice and the language's phonology
• Using IPA:
  • Pro: linguistic representation, new representation for unfamiliar sound
  • Con: font, no transference
• Adapt English symbols:
  • Pro: familiar symbols
  • Con: symbols used in unfamiliar ways
• We could imagine lh and hl. Choctaw uses both, in different environments.
  • lh before a consonant (pνlhki 'fast') and hl before a vowel (hlampko 'strong')
  • Pro: uses familiar symbols, no font challenges
  • Con: confusion with sequences of the phonemes /h/ and /l/ in words like mahli 'wind' and asil.hah 'to request'
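The lh/hl convention is context-dependent, so it can be stated as a rule over a phonemic form. The sketch below is only an illustration: the vowel set is limited to the plain vowel letters seen in the examples above, word-final /ɬ/ is arbitrarily given lh because the slides do not say what happens there, and the function name is invented.

```python
LATERAL = "\u026c"                      # IPA ɬ, the voiceless lateral
VOWELS = set("aio") | {"\u03bd"}        # assumed vowel letters, incl. the ν of the examples

def spell_lateral(phonemic):
    """Replace each ɬ with hl before a vowel, and with lh otherwise."""
    spelled = []
    for index, symbol in enumerate(phonemic):
        if symbol != LATERAL:
            spelled.append(symbol)
            continue
        following = phonemic[index + 1:index + 2]   # "" at the end of the word
        spelled.append("hl" if following in VOWELS else "lh")
    return "".join(spelled)

print(spell_lateral("\u026campko"))   # word-initial ɬ before a vowel
```

Running the rule in the other direction is where the ‘Con’ above shows up: a reader meeting hl or lh cannot tell from the spelling alone whether it stands for /ɬ/ or for a sequence of /h/ and /l/.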
79
The realities of language: variation, and standardization
80
Orthographic Variation
Discussion questions:
• Is orthographic variation a problem and if so, why?
• What kinds of variation are we likely to encounter?
• What kinds of things can variation represent?
81
Standardization
•What are some of the advantages and disadvantages of ‘standardization’ or ‘unification’ of an orthography?
82
Kinds of variation
Variation at a regional level
Variation at a local level
How can these be dealt with?
83
Between community variation: an example from Dene dialects
South Slavey   Mountain   Déline   Hare
-tthí          -pí        -kwí     -fí     ‘head’
tth’ih         p’ih       kw’ih    w’i     ‘mosquito’
tha            fa         wha      wa      ‘sand’
-dhe           -ve        -we      -we     ‘belt’
Should there be a common spelling for the different dialects?
84
The process
- Discussion of dialects: systematic differences
- Discussion of spelling possibilities
  - one spelling for all dialects?
  - different spellings for each dialect?
85
The decision
Write each dialect with its own symbols (e.g., tth’ih in South Slavey and w’i in Hare)
Reasons
- transferability from English
- dialect identity
86
Within community variation: an example from Dene
zha ~ ya ‘snow’
zhú ~ yú ‘clothing’
-zhíi ~ -yíi ‘inside’
• Some questions to ask: What might underlie this variation? Is the variation really free?
87
The first decision
We began with a discussion of variation and the different ways of dealing with it.
The first decision: standardization
- Write zh if it is ever used in that word.
- If only y is used, write y.
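Stated as a procedure, the first decision looks like this. The sketch is purely illustrative: the function name is invented, and it assumes someone has already collected every attested spelling of a given word, which is exactly the kind of resource a community may not have.

```python
def standard_spelling(attested):
    """Pick the spelling for one word from the forms speakers actually use.

    attested: list of spellings collected for the same word.
    Rule: if any speaker uses zh in the word, write zh; otherwise write y.
    """
    zh_forms = [form for form in attested if "zh" in form]
    return zh_forms[0] if zh_forms else attested[0]

print(standard_spelling(["ya", "zha"]))  # zh is attested, so zh is written
print(standard_spelling(["ya"]))         # only y is attested, so y is written
```

The rule needs complete knowledge of variation across speakers for every word, a requirement worth keeping in mind when evaluating how such a decision fares in practice.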
88
And the development over time
This did not work in practice
- variation among individuals
- no resource materials
Consequence: Both zh and y are used.
Lesson: Early decisions might have to be changed based on practice.
89
From related dialects to related languages
• The Dene example shows how different sounds are treated in closely related varieties.
• What choices might be made in representing similar sounds in closely related languages?
  • One possibility would be to choose the same symbol.
  • Another would be to represent the same sound in different ways. This is what has happened in Muskogean languages.
90
How to represent similar sounds in closely related languages?
• The Muskogean languages include Muscogee (Creek), Seminole Creek, Choctaw, Chickasaw, Alabama and Coushatta/Koasati.
• All have a phoneme /ɬ/, a voiceless lateral, but the languages make different orthographic choices.
• Choctaw uses lh before a consonant (pνlhki 'fast') and hl before a vowel (hlampko 'strong')
• Chickasaw uses lh consistently (hilha 'dance')
• Muscogee (Creek) uses r (rvrŏ 'fish')
• Alabama uses ɬ (ɬaɬo 'fish')
• Coushatta uses th (thatho 'fish')
• Linguists vary in documentation, mostly lh or ɬ
91
Writing and Variation in O'odham
• The O'odham varieties include Tohono O'odham (formerly Papago), 'Akimel O'odham (formerly Pima), and the Mexican variety, Sonoran O'otam.
• Multiple writing systems are in use, developed in a variety of contexts.
  • Tohono O'odham Nation and the Salt River community use the Alvarez and Hale orthography, which was developed as a linguist-native speaker collaboration.
  • The Saxton orthography is a practical orthography that was tested with native speakers, and is used in the Gila River Indian Community.
  • The influence of Spanish as a source of transfer is leading the Sonoran O'otam to consider another option.
  • Linguist Madeleine Mathiot uses yet another system in linguistic documentation.
92
Some differences in the four writing systems for shared sounds
                    A&H    Saxton   Mathiot   Sonoran proposal
Long vowels         a:     ah       aa        aa
Palatals            ñ      ni       ñ         ñ
Retroflexes         ḍ ṣ    d sh     ḍ x       sh th
Voiceless vowels    ĭ      n/a      ï         n/a
Palatal affricate   c      ch       c         ch
Glottal stop        '      '        ˀ         '
Lateral flap        l      l        l         r
93
Sounds which vary across dialects
• [w] vs. [v]/[ʋ]:
  • Alvarez and Hale goes with w
  • Saxton goes with w
  • Mathiot goes with v
  • Sonoran proposal goes with v
• Dialect variation within Tohono O'odham for certain vowel sequences, like io or eo: hiosig vs. heosig 'flower'
• These are acknowledged and both end up being used.
94
Two types of standardization
• “Unilectal” – the most prestigious speech variety is chosen. The rest adapt to this.
• “Multilectal” – some elements are chosen from several dialects. No dialect is favored.
• What are some advantages and challenges of each?
95
Pros and cons
• Unilectal
  • Advantage – simplicity. Once the dialect is chosen, don’t have to focus on the others.
  • Challenge – picking the dialect! What counts as “most prestigious”?
  • Appropriate when everyone can agree on “the dialect”
• Multilectal
  • Advantage – doesn’t favor one group over another.
  • Challenge – doesn’t represent anyone’s actual speech
  • Appropriate when no clear “prestige dialect”
96
Standardization
Standardization often emerges as the writing system is used; it may not be the best starting point.
What do potential users want from writing?
97
A review of the method
98
Methodology: a review
The ‘midwife’ approach views input from LC as fundamental, this input consisting of:
• practice (written material produced by the LC that concretely reveals issues for analysis, discussion, and decision-making)
• LC insights (about the language itself, socio-political issues, and their experiences)
• Do members of the LC regularly write/read in any language? Are writing/reading themselves new experiences for them? How can these new practices be expanded and reinforced?
99
• The approach also relies on interwoven activities of analysis leading to periods of experimentation of whatever ‘decisions’ have been agreed upon, with ongoing evaluation by the LC in both the roles of writers and readers.
• The LC may be viewed more broadly, as in Bhutan, in which the government is necessarily involved.
• Throughout the o.d. process, the linguist should build an ongoing written record with explanations and examples of the analysis and discussion that went into each decision.
100
The role of the Linguist
[Diagram: the LC within a cycle of practice, analysis, choices, and evaluation]
• monitor and interpret written input
• organize analysis and discuss options
• suggest further practice and record decisions
• interpret LC feedback, looking for clues as to:
  • the functional loads of phonological features
  • other important cognitive issues
  • interference issues
  • socio-political issues
101
Some further issues
• functional load
• cognitive needs
• socio-political issues
• technological issues
• who is the audience?
102
Evaluating the ‘functional load’ of supra-segmental features: examples from Kotiria
In Kotiria, three suprasegmentals are associated with root morphemes: nasalization, glottalization, and tone
• Minimal pairs are found for all three:
Nasalization: waha ‘drag/row’ / wãhã ‘kill’; hu ‘smoke’ / hu ‘worm’; sa’a ‘dig’ / sã’ã ‘electric eel’; kha ‘hawk’ / khã ‘chop’
Glottalization: doa ‘envy’ / do’a ‘cook’; maa ‘stream’ / ma’a ‘be small’; waa ‘give’ / wa’a ‘go’; wama ‘name’ / wa’ma ‘young/new’
Tone: kóró ‘rain’ / kórò ‘umbrella’; khòá ‘leave’ / khóá ‘part/half’; sóà ‘grind’ / sóá ‘rest’; báa ‘decompose’ / baá ‘swim’
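When a workshop produces a long list of candidate minimal pairs, it can help to sort them by which suprasegmental distinguishes the two words. The sketch below does this from the spelling alone, under assumptions that are ours rather than the slides': nasalization is written with a tilde, glottalization with an apostrophe, and tone with acute and grave accents. The function names are hypothetical.

```python
import unicodedata

NASAL_MARKS = {"\u0303"}               # combining tilde
GLOTTAL_MARKS = {"'", "\u2019"}        # straight or curly apostrophe
TONE_MARKS = {"\u0301", "\u0300"}      # combining acute and grave accents

def without(word, marks):
    """Return the word, decomposed, with the given marks removed."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(symbol for symbol in decomposed if symbol not in marks)

def contrast(first, second):
    """Name the single suprasegmental that separates two words, if there is one."""
    if without(first, set()) == without(second, set()):
        return None  # identical spellings: nothing to classify
    for name, marks in (("nasalization", NASAL_MARKS),
                        ("glottalization", GLOTTAL_MARKS),
                        ("tone", TONE_MARKS)):
        if without(first, marks) == without(second, marks):
            return name
    return None

print(contrast("waha", "w\u00e3h\u00e3"))   # the 'drag/row' vs 'kill' pair
```

Counting how many pairs fall under each label gives a first, rough indication of functional load, to be weighed against what speakers actually do in spontaneous writing.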
103
Nasalization (***)
• ++ salient
• (Roots / Suffixes)
• ++ minimal pairs
• value unaffected by morphological processes
• always marked in spontaneous writing
Glottalization (**)
• + salient
• (Roots only)
• ++ minimal pairs
• reductions occur in morphological processes
• marked most, but not all, of the time in spontaneous writing
Tone (*)
• + salient
• (Roots, few Suffixes)
• + minimal pairs
• melody variable in morphological processes
• not marked in spontaneous writing
However, despite shared phonemic status, each suprasegmental feature has a different functional load. This variation is manifested in spontaneous writing and has been discussed throughout the o.d. process.
104
Recognizing cognitive issues:
[dahpo] ‘head’
[mahsa] ‘people/beings’
[tuhti] ‘to bark’
[puhka] ‘blowgun’
[dahʧo] ‘day’
In Kotiria root morphemes, internal voiceless Cs are always pre-aspirated, a regular allophonic variation. From the purely linguistic perspective, this aspiration would not need to be represented in the orthography.
Thus, the words could be written as:
dapo, masa, tuti, puka, dacho
105
However, given the salience of this aspiration and the fact that when written, it helps readers identify the root morpheme in a word, the decision was made to represent this pre-aspiration in the orthography.
+ articulatory salience
+ root recognition in reading
Thus, the words are written as:
[dahpo] ‘head’: dahpo
[mahsa] ‘people/beings’: mahsa
[tuhti] ‘to bark’: tuhti
[puhka] ‘blowgun’: puhka
[dahʧo] ‘day’: dahcho
106
Examples of symbolic-political choices in o.d. in Kotiria
• Use of the symbol ‘k’ over ‘c/q(ui/e)’: a macro-level choice, to distinguish the writing system of the indigenous language from those of the national languages (Spanish/Portuguese)
• Use of the symbol ‘ʉ’ over ‘ɨ’: a regional-level choice, to differentiate the orthography of a minority indigenous language from that of the locally dominant indigenous language (Tukano proper)
• Variation between use of the symbols ‘w/v’ among the Kotiria from different regions: a group-internal choice distinguishing sub-groups within the Kotiria population
107
An attempt to standardize
• In the 1980s, the Ghana Alphabet Standardization Committee was formed to standardize the set of symbols that could be used in Ghanaian language alphabets.
• Case: the [tʃ] sound was written as: <tʃ>, <ts>, <c>, <ch>, <ky>, <tsch>
• Which one to choose? The answer was obvious, both to me (just observing) and to others on the Committee…
• I thought, “Of course, <ch>.” Why?
• A Committee member said, “The choice is obvious: <ky>!”
• That was used in his language, Akan, the biggest language in Ghana.
“Obviousness” depends on your background.
Cognitive/social issues: Phonemic-based systems might prove unpopular
Choosing English-based writing systems over phonemic systems
Navajo code talkers    Young and Morgan dictionary
wol-la-chee ‘ant’    wóláchíí’
shush ‘bear’    shash
moa-si ‘cat’    mósí
klizzie ‘goat’    tliízí
109
Familiarity with English-based systems
Eastern Pomo
phonemic orthography    local English-based orthography
káli    caw lee ‘one’
do:l    dole ‘four’
lé:ma    leh ma ‘five’
110
One more factor
• What is the writing system for? What does a writer/reader want from it?
  • primacy of written text?
  • valuable information about the speaker?
  • symbolic system?
  • something else?
111
Writing Systems for Endangered Language Communities
• Issues arise when literacy is used for second language teaching, because of transfer effects.
• O'odham has a high central vowel, IPA /ɨ/. All the writing systems in the U.S. use the symbol e to represent this. The language uses l to represent a flap (IPA /ɺ/), another possible point of confusion.
• Muscogee (Creek) uses r for the voiceless lateral (IPA /ɬ/).
• Literacy in the majority language can hinder learner awareness of the unique sounds of the endangered language.
112
Parameters: Socio-political
• need for community involvement in o.d. process
• acceptability of orthography (locally and in larger context)
• relationship with dominant language: use of conventions
  • symbolic issues (±differentiation)
  • literacy transference issues (±learnability)
• standardization / variation
113
Parameters: Techno-linguistic
Representation: what to represent, how to represent it, where to represent it
• choice of script, symbols, conventions
• identification of phonemes / allophonic processes / other phonological processes / morphological processes
• evaluation of functional loads
• evaluation of resources where information can be registered, if not in the orthography itself (practical grammar, dictionary, etc.)
114
Parameters: Psycho-cognitive
‘Learnability’ (Orthographic depth)
• shallow O: (close to pronunciation)
  • + learnability for beginners and non-(fluent) speakers
  • - readability (may obscure morpheme identities)
  • harder to standardize dialect variation
• deep O: (preserves graphic ID of meaningful elements)
  • - learnability for beginners and non-(fluent) speakers
  • + readability
  • easier to standardize dialect variation
115
More on reading and writing
• Underrepresentation: using fewer symbols than phonemes that exist in the language
  • Can you think of an example?
  • Example: Akan (Ghana) has contrastive nasalization on vowels, contrastive tone, and 9 phonemic vowels. Tone and nasalization are not marked, and 7 vowels are represented in the orthography (developed over a hundred years ago).
• Underrepresentation
  • What are the general implications for reading?
    • Since you can’t distinguish phonemic contrasts, reading is more difficult
  • For writing?
    • Writing could be easier, since you don’t have as many choices to make
  • What can complicate this picture?
    • Reading can be more difficult, but context often can disambiguate, and fluent readers may be able to cope with this.
• Overrepresentation: using more symbols than phonemes that exist in the language
  • Can you think of an example?
  • Kotiria <d>, <r> for /d/.
  • Choctaw <a>, <v> for /a/.
  • All cases where different allophones are represented
• Overrepresentation
  • What are the general implications for reading?
    • Need to be taught two symbols for a phoneme, but the shallow orthography can be easier to read
  • For writing?
    • Writing could be harder, since you have to deliberately think about which symbol to use.
  • What can complicate this picture?
    • The salience of different allophones can make a big difference. If speakers are aware of the allophones, then fewer problems.
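The two definitions can be checked mechanically once a phoneme-to-grapheme table exists. The sketch below is only an illustration under assumptions: `check_representation` and the toy table are hypothetical, the /d/ entry follows the Kotiria example above, and the oral/nasal vowel pair sharing <a> is a stand-in for any unmarked contrast rather than documented data.

```python
from collections import defaultdict


def check_representation(spellings):
    """Classify a phoneme-to-grapheme mapping.

    `spellings` maps each phoneme to the set of graphemes used to write it.
    Returns (under, over): graphemes shared by several phonemes, and
    phonemes written with several graphemes.
    """
    phonemes_by_grapheme = defaultdict(set)
    for phoneme, graphemes in spellings.items():
        for grapheme in graphemes:
            phonemes_by_grapheme[grapheme].add(phoneme)

    under = {g: sorted(ps) for g, ps in phonemes_by_grapheme.items() if len(ps) > 1}
    over = {p: sorted(gs) for p, gs in spellings.items() if len(gs) > 1}
    return under, over


# /d/ written <d> or <r> is the Kotiria case from the slides; the oral and
# nasal vowels sharing <a> are a hypothetical stand-in for an unmarked contrast.
toy = {"d": {"d", "r"}, "a": {"a"}, "a-nasal": {"a"}}
under, over = check_representation(toy)
print("underrepresented:", under)
print("overrepresented:", over)
```

Such a check only flags where the mismatches are; whether they cause problems still depends on context, fluency, and the salience of the sounds involved.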
More on Politics: SE Asia
(condensed from Adams, Larin. 2014. Case studies of orthography decision making in Mainland Southeast Asia. In Cahill & Rice (eds.), Developing Orthographies for Unwritten Languages.)
• Scripts are not neutral. Commonly:
  • i) use a variation of the national script, sometimes by governmental decree
  • ii) use a romanized script
• Complications
  • But languages can cross borders, complicating matters. Which national script?
  • Competing religious identities: Buddhist, Christian, Muslim, Animist. Buddhist and Christian (Protestant and/or Catholic) often have local associations.
Case study: E and H
• A man, “J”, was sent to the capital to find help in developing an orthography, starting a literacy program (train teachers, provide production workshops, pay for publishing) and translating the Bible into H. He contacted SIL, as a known organization.
• At first, there was no way to verify J as a legitimate representative of H. (He was.)
• J said the project should include H and E (he said E was a very close dialect of H).
• H had formed a literacy committee.
• 3 people from H were invited to a literacy workshop.
E and H: money
• After the workshop, participants were given funds to promote literacy in their villages: teaching non-readers, publication of ‘literate’ by-products such as calendars or brochures for special events.
• It is difficult to monitor how these funds are actually used.
• One effect of participants returning from the workshop in the capital with money was to create interest. However, in this case the interest now appears to have been more about money and less about literacy.
E and H: contact by E
• The next literacy workshop, a few months later, included a new delegation of E speakers. They claimed to represent the E group mentioned by J.
• E had no organizational equivalent of H that could have deputized this delegation of E speakers. However, this was not known, and they were treated as co-owners of the language project in the workshop.
E and H: conflict
• During the workshop, differences developed between H and E. The E deputation demanded their own project (and funding). Attempts to mediate failed.
• In retrospect, the E deputation probably cared more about money than literacy. However, an outside organization like SIL could not know that and instead opted to fund both, legitimizing the E deputation.
• While SIL accepted the E, conflicting information led SIL to seek more objective evidence by surveying the “E” and “H” villages.
E and H: survey
• In a survey, one needs willing involvement of the groups. The contact for the E group eventually agreed to the survey but said that the E villages should be done last. The survey (wordlist collection, some sociolinguistic data collection, comprehension testing) proceeded in the H area.
• Surprise: some H villages had a substantial S minority, with only 30% lexical similarity with H and E. H speakers understood S only if they had been raised around S speakers. S was clearly another language.
• During the survey some S speakers said that they were really H people and the H were just a splinter of the E people. Soon it was apparent that some E people were trying to influence the survey outcome by running ahead and planting information in H villages with S people.
E and H: the rise of S
• The national government has a finite number of categories for minority groups. H and E both had an official government identity, but S did not. If S took over H’s identity then it would now be identifiable to the government and to NGOs like SIL. So the S went along with the attempt of some E people to skew the survey results.
• Survey found no pure E villages; they are always part of a village whose majority is another ethnic group, M. Further, E children primarily speak M.
• There are a number of H-only villages, and a long-standing cultural committee which has representatives of the major religious groups. The E group has nothing like this. Further, the S are actually a group whose language is like the M language.
E and H: Decline into conflict
• For some time both H and E came to literacy workshops, eventually accompanied by an S group demanding their own language development project separate from the M language.
• What once looked like a viable single language development project had now devolved into 4 different groups, at least 2 of which were probably not represented by legitimate community members.
• High level of conflict between the groups. This conflict was either started or accelerated by beginning a language development project, and the resulting fragmentation actually is creating disunity, delaying literacy for all the groups.
E and H: End of involvement
• Given this situation and a growing number of legal and physical threats against SIL personnel if they did not meet demands of one or more of the groups, SIL decided to cease working with any of the groups.
• Thus, language development that was stimulated by external involvement resulted in accentuating division in a group that needs to work together if it is to survive in the face of a growing national culture.
E and H: Observations
1. Unity matters
2. Know who you’re working with
3. An orthography cannot extend group identity beyond any pre-existing political or social organization.
4. Literacy and orthographic decisions are often a proxy forum for other social, religious or political issues.
5. Most of the time money creates more problems than it solves.
Summary: the goals
130
End goals of the ‘midwife’ approach
For the LC:
• a practical orthography that is a comfortable tool for both writers and readers
• a new means of expression developed collectively, with their own input
• empowerment, incorporating skills and resources for future decision-making
For the Linguist:
• an experience where some of the ‘heat’ is taken off, but where creativity is crucial
• a richer analysis, the result of L’s technical knowledge + LC input
131
Some references
Good starting places:
Cahill, Michael, and Keren Rice (eds.) 2014. Developing Orthographies for Unwritten Languages. Dallas: SIL International.
Grenoble, Lenore and Lindsay Whaley. 2006. Orthography. Chapter 6 in Saving languages. An introduction to language revitalization. Cambridge: Cambridge University Press. 137-159.
Hinton, Leanne. 2001. New writing systems. In Leanne Hinton and Ken Hale (editors). The Green Book of language revitalization in practice. San Diego. Academic Press. 239-250.
Lüpke, Frederike. 2011. Orthography development. In Peter K. Austin and Julia Sallabank (editors). The Cambridge handbook of endangered languages. Cambridge: Cambridge University Press. 312-336.
Sebba, Mark. 2007. Spelling and society: The culture and politics of orthography around the world. Cambridge: Cambridge University Press.
Seifart, Frank. 2006. Orthography development. In Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel. Essentials of language documentation. Berlin: Mouton de Gruyter. 275-299.
A more exhaustive list can be obtained from the CoLang course website.
132
About us
• Mike Cahill worked on the Kɔnni orthography in Ghana in the 1980s, and has advised on several African languages since, especially in the Gur family.
• Keren Rice has been working on Dene languages in northern Canada since the 1970s, and served on an orthography standardization committee in the 1980s.
• Colleen Fitzgerald has been working on Tohono O'odham for nearly 2 decades, and on Native languages of Oklahoma since 2009.
• Gwen Hyslop has been working on languages in Bhutan since 2006, including development of ’Ucen orthographies for Bhutan’s endangered languages.
• Kris Stenzel has been working since 2000 on Kotiria and Wa’ikhana, two Eastern Tukanoan languages spoken in northwestern Amazonia.
• We welcome your feedback/comments/questions!
133