Weaving a Web of linguistic diversity

David Crystal explains how the internet is turning out to be a friend to all the world's languages

The global English debate

David Crystal is honorary professor of linguistics at the University of Wales, Bangor. His book Language And The Internet will be published by Cambridge University Press later this year.

The World Wide Web is an eclectic medium, holding a mirror up to our linguistic nature. Not only does it offer a home to all linguistic styles within a language; it offers a home to all languages - once their communities have a functioning computer technology. And its increasingly multilingual character has been the most notable change since it started out as a totally English medium.

For many people the language of the internet is English. "World, Wide, Web: three English words" was the headline of a piece by Michael Specter in the New York Times a few years ago. The article went on to comment: "If you want to take full advantage of the internet there is only one real way to do it: learn English."

Specter did acknowledge the arrival of other languages: "As the Web grows, the number of people on it who speak French, say, or Russian will become more varied and that variety will be expressed on the Web. That is why it is a fundamentally democratic technology. But it won't necessarily happen soon."

The evidence is growing that this conclusion was wrong. With the internet's globalisation the presence of other languages has steadily risen. By the mid-90s a widely quoted figure was that about 80% of the Net was in English - a figure supported by the first big study of language distribution on the internet, carried out in 1997 by Babel, a joint initiative of the Internet Society and Alis Technologies. This showed English well ahead, but with several other languages - notably German, Japanese, French and Spanish - entering the ring.

Since then the estimates for English have been falling, with some commentators predicting that before long the Web (and the internet as a whole) will be predominantly non-English, as communications infrastructure develops in Europe, Asia, Africa and South America. A Global Reach survey has estimated that people with internet access in non-English-speaking countries increased from 7m to 136m between 1995 and 2000. In 1998 the total number of new non-English websites passed the number of new English websites.

At a conference on search engine strategies last April, Alta Vista was predicting that by next year less than half of the Web would be in English. English-language author David Graddol has predicted an even lower figure in due course, 40%. In parts of the world the local language is already dominant. According to the Japanese internet author Yoshi Mikami, 90% of Web pages in Japan are now in Japanese.

The Web is increasingly reflecting the distribution of language presence in the real world, and many sites provide the evidence. They range from individual businesses doing their best to present a multilingual identity to big sites collecting data on many languages. Under the first heading we encounter such newspapers as the Belgian daily Le Soir, which is represented by six languages - French, Dutch, English, German, Italian and Spanish. Under the latter heading we find such sites as the University of Oregon Font Archive, providing 112 fonts in its archives for more than 40 languages.

A World Language Resources site lists products for 728 languages. An African resource list covers several local languages; Yoruba, for example, is illustrated by some 5,000 words, along with proverbs, naming patterns and greetings. Another site deals with 87 European minority languages. Some sites are small in content, but extensive in range: one gives the Lord's Prayer in nearly 500 languages.

Nobody has yet worked out just how many languages have obtained a modicum of presence on the Web. I have found more than 1,000. It is not difficult to find evidence of a Net presence for the vast majority of the more frequently used languages, and for a large number of minority languages too. I would guess that about a quarter of the world's languages have some sort of internet presence.

In all these examples we are encountering language presence in a real sense. These are not sites that only analyse or talk about languages; they allow us to see languages as they are. In many cases, the total Web presence, in terms of number of pages, is small. The crucial point is that the languages are out there, even if represented by only a sprinkling of sites. It is the ideal medium for minority languages, given the relative cheapness and ease of creating a Web page, compared with the costs of print, TV or radio.

However, developing a significant cyber-presence is not easy. Until a critical mass of internet penetration in a country builds up, and a corresponding mass of content exists in the local language, the motivation to switch from English-language sites will be limited to those for whom issues of identity outweigh issues of information. The future is also dependent on the levels of English-speaking ability in individual countries, and the further growth in those levels.

Students at Cairo's al-Azhar University join the small but growing number of Arabic speakers using the internet. Photograph: Mohammed Al-Sehiti

There are also practical problems, though a great deal has been done since the mid-90s to address them. First, the Ascii character set still fails to adequately support the array of letter shapes in Arabic, Hindi, Chinese, Korean and the many other languages in the world that do not use the Latin alphabet. The Unicode coding system, the alternative to Ascii, allows more than 65,000 characters; but the implementation of this system is still in its infancy. The Web consortium has an internationalisation activity looking specifically at different alphabets, so that operating systems can support a page in any alphabet.

The future looks good for Web multilingualism. As Ned Thomas commented last year in an editorial for Contact, the bulletin of the European Bureau of Lesser Used Languages: "It is not the case ... that all languages will be marginalised on the Net by English. On the contrary, there will be a great demand for multilingual websites, for multilingual data retrieval, for machine translation, for voice recognition systems to be multilingual."

And Tyler Chambers, the creator of various Web language projects, agrees: "The future of the internet is even more multilingualism and cross-cultural exploration and understanding than we've already seen."

I agree. The Web offers a World Wide Welcome for global linguistic diversity.