I18n with PHP 5.3
PHP Internationalization with ICU
By Stas Malyshev, Zend Technologies
2
What and why?
•ICU - http://icu-project.org/ (IBM)
•Unicode
•CLDR - http://cldr.unicode.org/
3
Intl extension
•Locale
•Collator
•Number & Currency formatter
•Date & Time formatter
•Message & Choice formatter
•Normalizer
•Graphemes
•IDN
•Calendars
•Resources
4
Intl extension
•Dual API OO and procedural
•Same implementation underneath
collator_create() == new Collator()
numfmt_format() == NumberFormatter::format()
locale_get_default() == Locale::getDefault()
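To make the dual API concrete, here is a minimal sketch that formats the same number through both styles. The locale and the value are arbitrary choices for illustration, not taken from the talk.

```php
<?php
// Object-oriented form.
$fmtOo = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$oo = $fmtOo->format(1234);

// Procedural form of the same two calls.
$fmtProc = numfmt_create('en_US', NumberFormatter::DECIMAL);
$proc = numfmt_format($fmtProc, 1234);

// Both styles share one implementation, so the strings are identical.
var_dump($oo === $proc); // bool(true)
```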
5
Locale
•Relies on ICU locales
<language>[_<script>]_<country>[_<variant>][@<keywords>]
•Default locale
new Collator(Locale::DEFAULT_LOCALE)
Locale::setDefault(), Locale::getDefault()
Passing null as the locale also selects the default
6
Locale
Locale pieces
getPrimaryLanguage($locale)
getScript($locale)
getRegion($locale)
getAllVariants($locale)
getKeywords($locale)
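As a rough illustration of these accessors, the sketch below takes one locale ID apart. The ID itself (Serbian, Latin script, Serbia, with a currency keyword) is an assumed example, not one from the slides.

```php
<?php
// Illustrative locale ID with a script subtag and one keyword.
$locale = 'sr_Latn_RS@currency=RSD';

$language = Locale::getPrimaryLanguage($locale); // "sr"
$script   = Locale::getScript($locale);          // "Latn"
$region   = Locale::getRegion($locale);          // "RS"
$keywords = Locale::getKeywords($locale);        // array("currency" => "RSD")

printf("%s / %s / %s\n", $language, $script, $region);
print_r($keywords);
```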
7
Locale
Locale display pieces
getDisplayName($locale, $in_locale = null)
getDisplayLanguage($locale, $in_locale = null)
getDisplayScript($locale, $in_locale = null)
getDisplayRegion($locale, $in_locale = null)
Example:
Locale::getDisplayScript("zh-Hant-TW", "en-US") returns the script name in English, e.g. “Traditional Han” (the exact string depends on the ICU data version)
8
Locale building blocks
•parseLocale() - returns an array of the locale's subtags
•composeLocale() - creates a locale ID out of subtags
parseLocale('sr-Latn-RS') returns
array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')
composeLocale(array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')) returns 'sr_Latn_RS'
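The round trip between a locale ID and its subtags can be sketched as follows. Note that composeLocale() joins subtags with underscores, so the recomposed ID is not byte-identical to a hyphenated input.

```php
<?php
// Split a locale ID into its subtags.
$parts = Locale::parseLocale('sr-Latn-RS');
print_r($parts); // language => sr, script => Latn, region => RS

// Build a locale ID back from the subtag array.
$id = Locale::composeLocale($parts);
echo $id, "\n"; // sr_Latn_RS
```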
9
Locale guessing
•acceptFromHttp - Accept-Language to locale
•lookup – find in the list
•filterMatches – are they the same?
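A minimal sketch of the three guessing helpers, assuming a hypothetical Accept-Language header and a hypothetical list of locales the application supports:

```php
<?php
// Hypothetical Accept-Language header sent by a browser.
$header = 'fr-CA,fr;q=0.8,en;q=0.5';
$best = Locale::acceptFromHttp($header); // "fr_CA"

// Hypothetical list of supported locales; "en" is the fallback.
$supported = array('de', 'fr', 'en');
$match = Locale::lookup($supported, 'fr-CA', false, 'en'); // "fr"

// Does the language tag fall inside the given language range?
$isFrench  = Locale::filterMatches('fr-CA', 'fr', false); // true
$isEnglish = Locale::filterMatches('fr-CA', 'en', false); // false

echo $best, ' ', $match, "\n";
```

lookup() falls back from the most specific tag to less specific ones, which is why "fr-CA" resolves to "fr" when only the bare language is supported.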
10
Collator
•Comparing, sorting strings
•Collation level (strength)
•All ICU collator attributes
Numeric collation
Ignoring punctuation
•Not yet: custom “tailoring” rules
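Two of these knobs, numeric collation and strength, can be sketched as below. The file names and the en_US locale are illustrative assumptions.

```php
<?php
$coll = new Collator('en_US');

// Numeric collation: runs of digits compare by numeric value.
$coll->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$files = array('file10', 'file2', 'file1');
$coll->sort($files);
print_r($files); // file1, file2, file10

// Primary strength: accent and case differences are ignored.
$coll->setStrength(Collator::PRIMARY);
$same = $coll->compare('cote', "C\xC3\xB4te"); // "Côte" in UTF-8
var_dump($same); // int(0)
```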
11
Collator
$coll = new Collator("fr_CA");
if ($coll->compare("côte", "coté") < 0) {
    echo "less\n";
} else {
    echo "greater\n";
}
// côte < coté
12
Collator
$strings = array("cote", "côte", "Côte", "coté","Coté", "côté", "Côté", "coter");
$coll = new Collator("fr_CA");
$coll->sort($strings);
Result:
cote
côte
Côte
coté
Coté
côté
Côté
coter
Sorting methods:
sort($array, $flags)
asort($array, $flags)
sortWithSortKeys($array)
13
NumberFormatter
•Formatting and parsing
•Numbers and currency
numfmt_create($locale, $style, $pattern = null)
NumberFormatter::PATTERN_DECIMAL
NumberFormatter::DECIMAL
NumberFormatter::CURRENCY
NumberFormatter::PERCENT
NumberFormatter::ORDINAL
NumberFormatter::DURATION
NumberFormatter::SCIENTIFIC
NumberFormatter::SPELLOUT
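To give a feel for a few of these styles, here is a short sketch for en_US. The values and the custom pattern are illustrative assumptions.

```php
<?php
$percent  = new NumberFormatter('en_US', NumberFormatter::PERCENT);
$spellout = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$currency = new NumberFormatter('en_US', NumberFormatter::CURRENCY);
// PATTERN_DECIMAL takes an explicit ICU decimal pattern as third argument.
$custom   = new NumberFormatter('en_US', NumberFormatter::PATTERN_DECIMAL, '#,##0.00');

$p = $percent->format(0.25);                    // 25%
$s = $spellout->format(42);                     // forty-two
$c = $currency->formatCurrency(1234.5, 'USD');  // $1,234.50
$d = $custom->format(1234.5);                   // 1,234.50

echo implode("\n", array($p, $s, $c, $d)), "\n";
```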
14
NumberFormatter
Formatting
$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
echo $fmt->format(1234);
// result is 1,234
$fmt = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
echo $fmt->format(1234);
// result is 1'234
15
NumberFormatter
Parsing
$fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
$num = '1.234,567 min';
$fmt->parse($num, NumberFormatter::TYPE_DOUBLE, $pos);
// result is 1234.567, $pos = 9
$fmt->parse($num, NumberFormatter::TYPE_INT32);
// result is 1234
16
MessageFormatter
•Formatting and parsing whole messages, including data inside
•Also allows choice between things printed:
0≤are no files|1≤is one file|1<are many files
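The choice pattern above can be embedded in a message as sketched below, with "#" as the ASCII spelling of "≤". The surrounding sentence and the parse example are assumptions added for illustration.

```php
<?php
// Choice argument: picks a sub-message based on the numeric value.
$files = new MessageFormatter(
    'en_US',
    'There {0,choice,0#are no files|1#is one file|1<are many files}.'
);
$none = $files->format(array(0)); // There are no files.
$one  = $files->format(array(1)); // There is one file.
$many = $files->format(array(5)); // There are many files.

// Parsing: recover the data from an already formatted message.
$monkeys = new MessageFormatter(
    'en_US',
    '{0,number,integer} monkeys on {1,number,integer} trees'
);
$values = $monkeys->parse('4,560 monkeys on 123 trees'); // array(4560, 123)

echo $none, "\n", $one, "\n", $many, "\n";
print_r($values);
```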
17
MessageFormatter
$fmt = new MessageFormatter("en_US", "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree");
echo $fmt->format(array(4560, 123, 4560/123));
$fmt = new MessageFormatter("de", "{0,number,integer} Affen über {1,number,integer} Bäume um {2,number} Affen pro Baum");
echo $fmt->format(array(4560, 123, 4560/123));
18
IntlDateFormatter
•Allows using locale-dependent canned patterns
•Full, long, medium, short date & time
Full: Tuesday, April 12, 1952 AD or 3:30:42pm PST
Long: January 12, 1952 or 3:30:32pm
Medium: Jan 12, 1952
Short: 12/13/52 or 3:30pm
•Also allows free-form patterns
"yyyy.MM.dd G 'at' HH:mm:ss vvvv"
1996.07.10 AD at 15:08:56 Pacific Time
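Both approaches can be sketched in a few lines. The timestamp 0 and the Los Angeles time zone are assumptions chosen so the output is easy to check by hand, and the pattern omits the zone field because its wording varies between ICU versions.

```php
<?php
// Canned pattern: short date, no time.
$canned = new IntlDateFormatter(
    'en_US', IntlDateFormatter::SHORT, IntlDateFormatter::NONE,
    'America/Los_Angeles', IntlDateFormatter::GREGORIAN
);
$short = $canned->format(0); // 12/31/69

// Free-form pattern passed as the sixth constructor argument.
$free = new IntlDateFormatter(
    'en_US', IntlDateFormatter::FULL, IntlDateFormatter::FULL,
    'America/Los_Angeles', IntlDateFormatter::GREGORIAN,
    "yyyy.MM.dd G 'at' HH:mm:ss"
);
$stamp = $free->format(0); // 1969.12.31 AD at 16:00:00

echo $short, "\n", $stamp, "\n";
```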
19
IntlDateFormatter
$fmt = new IntlDateFormatter("en_US", IntlDateFormatter::FULL, IntlDateFormatter::FULL,
    'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
echo $fmt->format(0);
// Wednesday, December 31, 1969 4:00:00 PM PT
$fmt = new IntlDateFormatter("de-DE", IntlDateFormatter::FULL, IntlDateFormatter::FULL,
    'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
echo $fmt->format(0);
// Mittwoch, 31. Dezember 1969 16:00 Uhr GMT-08:00
20
Normalizer
•Brings Unicode text to one of the normal forms: NFC, NFD, NFKC, NFKD
•normalize(), isNormalized()
$combining_ring_above = "\xCC\x8A"; // 'COMBINING RING ABOVE' (U+030A)
$chars = Normalizer::normalize('A' . $combining_ring_above, Normalizer::FORM_C);
echo urlencode($chars);
// %C3%85, i.e. 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
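The companion isNormalized() check and the reverse (decomposing) direction can be sketched like this, reusing the same two byte sequences:

```php
<?php
$decomposed = "A\xCC\x8A"; // 'A' followed by COMBINING RING ABOVE
$composed   = "\xC3\x85";  // LATIN CAPITAL LETTER A WITH RING ABOVE

// The decomposed sequence is valid NFD but not NFC.
$isNfc = Normalizer::isNormalized($decomposed, Normalizer::FORM_C); // false
$isNfd = Normalizer::isNormalized($decomposed, Normalizer::FORM_D); // true

// Normalizing the composed character to NFD splits it back apart.
$back = Normalizer::normalize($composed, Normalizer::FORM_D);
var_dump($back === $decomposed); // bool(true)
```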
21
Grapheme functions
•Graphemes are multi-char entities, like letter + accent mark(s)
•Same as string functions, but operate on grapheme units
•grapheme_strlen, grapheme_substr, grapheme_strpos, grapheme_strstr
•Extraction function (grapheme_extract) – extract to fill limited buffer, but always keep graphemes whole
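A small sketch of the difference between byte-oriented and grapheme-oriented functions. The sample string and the 3-byte buffer size are assumptions for illustration.

```php
<?php
// "b", then "a" + COMBINING RING ABOVE (one grapheme, three bytes), then "c".
$text = "ba\xCC\x8Ac";

$bytes     = strlen($text);                // 5 bytes
$graphemes = grapheme_strlen($text);       // 3 graphemes
$second    = grapheme_substr($text, 1, 1); // "a" plus its combining mark
$pos       = grapheme_strpos($text, 'c');  // 2, counted in graphemes

// Fill a 3-byte buffer: only "b" fits without splitting the next grapheme.
$fits = grapheme_extract($text, 3, GRAPHEME_EXTR_MAXBYTES);

// Plain substr() cuts through the combining mark and yields invalid UTF-8.
$cut = substr($text, 0, 3);

printf("%d bytes, %d graphemes, extract gives %d byte(s)\n",
    $bytes, $graphemes, strlen($fits));
```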
22
IDN
עברית.idn.icann.org ↔ xn--5dbqzzl.idn.icann.org
русский.idn.icann.org ↔ xn--h1acbxfam.idn.icann.org
•idn_to_ascii
•idn_to_utf8
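The Russian example above can be reproduced roughly as follows. Passing INTL_IDNA_VARIANT_UTS46 explicitly is an assumption about the runtime (it needs PHP 5.4 or later built against a recent ICU), and the source file is assumed to be saved as UTF-8.

```php
<?php
$unicode = 'русский.idn.icann.org';

// Unicode host name to its ASCII (Punycode) form.
$ascii = idn_to_ascii($unicode, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
echo $ascii, "\n"; // xn--h1acbxfam.idn.icann.org

// And back again.
$roundTrip = idn_to_utf8($ascii, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
var_dump($roundTrip === $unicode); // bool(true)
```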
23
TODO
•ResourceHandler
•Transliteration
•StringSearch
•Tighter integration with other modules in 6.0
24
Thanks!
See http://php.net/intl for further information.