Package ‘rebus’
April 25, 2017
Type Package
Title Build Regular Expressions in a Human Readable Way
Version 0.1-3
Date 2017-04-25
Author Richard Cotton [aut, cre]
Maintainer Richard Cotton <richierocks@gmail.com>
Description Build regular expressions piece by piece using human readable code. This package is designed for interactive use. For package development, use the rebus.* dependencies.
Depends R (>= 3.1.0)
Imports rebus.base (>= 0.0-3), rebus.datetimes, rebus.numbers, rebus.unicode (>= 0.0-2)
Suggests testthat
License Unlimited
LazyLoad yes
LazyData yes
Acknowledgments Development of this package was partially funded by the Proteomics Core at Weill Cornell Medical College in Qatar <http://qatar-weill.cornell.edu>. The Core is supported by 'Biomedical Research Program' funds, a program funded by Qatar Foundation.
RoxygenNote 6.0.1
Collate 'export-base.R' 'export-datetimes.R' 'export-numbers.R' 'export-unicode.R' 'imports.R' 'regex-package.R'
NeedsCompilation no
Repository CRAN
Date/Publication 2017-04-25 21:42:46 UTC
R topics documented:
Anchors
as.regex
Backreferences
capture
CharacterClasses
char_class
ClassGroups
Concatenation
DateTime
escape_special
exactly
format.regex
get_weekdays
IsoClasses
literal
lookahead
modify_mode
number_range
or
rebus
recursive
regex
repeated
ReplacementCase
roman
SpecialCharacters
Unicode
UnicodeGeneralCategory
UnicodeOperators
UnicodeProperty
whole_word
WordBoundaries
Index
Anchors The start or end of a string
Description
See Anchors.
as.regex Convert or test for regex objects
Description
See as.regex.
Backreferences Backreferences
Description
See Backreferences.
capture Capture a token, or not
Description
See capture.
CharacterClasses Class Constants
Description
See CharacterClasses.
char_class A range or char_class of characters
Description
See char_class.
ClassGroups Character classes
Description
See ClassGroups.
Concatenation Combine strings together
Description
See Concatenation.
DateTime Date-time regexes
Description
See DateTime.
escape_special Escape special characters
Description
See escape_special.
exactly Make a regex exact
Description
See exactly.
format.regex Print or format regex objects
Description
See format.regex.
get_weekdays Get the days of the week or months of the year
Description
See get_weekdays.
IsoClasses ISO 8601 date-time classes
Description
See IsoClasses.
literal Treat part of a regular expression literally
Description
See literal.
lookahead Lookaround
Description
See lookahead.
modify_mode Apply mode modifiers
Description
See modify_mode.
number_range Generate a regular expression for a number range
Description
See number_range.
or Alternation
Description
See or.
rebus rebus: Regular Expression Builder, Um, Something
Description
Build regular expressions in a human readable way.
Details
Regular expressions are a very powerful tool, but the syntax is terse enough to be difficult to read. This makes bugs easy to introduce, and hard to find. This package contains functions to make building regular expressions easier.
Author(s)
Richard Cotton <richierocks@gmail.com>
See Also
regex and regexpr. The 'stringr' and 'stringi' packages provide tools for matching regular expressions and nicely complement this package. http://www.regular-expressions.info has good advice on using regular expressions in R. In particular, see http://www.regular-expressions.info/rlanguage.html and http://www.regular-expressions.info/examples.html. https://www.debuggex.com is a visual regex debugging and testing site.
Examples
### Match a hex colour, like `"#99af01"`
# This reads *Match a hash, followed by six hexadecimal values.*
"#" %R% hex_digit(6)
# To match only a hex colour and nothing else, you can add anchors to the
# start and end of the expression.
START %R% "#" %R% hex_digit(6) %R% END
### Simple email address matching.
# This reads *Match one or more letters, numbers, dots, underscores, percents,
# plusses or hyphens. Then match an 'at' symbol. Then match one or more letters,
# numbers, dots, or hyphens. Then match a dot. Then match two to four letters.*
one_or_more(char_class(ASCII_ALNUM %R% "._%+-")) %R%
  "@" %R%
  one_or_more(char_class(ASCII_ALNUM %R% ".-")) %R%
  DOT %R%
  ascii_alpha(2, 4)
### IP address matching.
# First we need an expression to match numbers between 0 and 255. Both the
# following syntaxes read *Match two then five then a number between zero and
# five. Or match two then a number between zero and four then a digit. Or match
# an optional zero or one followed by an optional digit followed by a compulsory
# digit. Make this a single token, but don't capture it.*
# Using the %|% operator
ip_element <- group(
  "25" %R% char_range(0, 5) %|%
    "2" %R% char_range(0, 4) %R% ascii_digit() %|%
    optional(char_class("01")) %R% optional(ascii_digit()) %R% ascii_digit()
)
# The same again, this time using the or function
ip_element <- or(
  "25" %R% char_range(0, 5),
  "2" %R% char_range(0, 4) %R% ascii_digit(),
  optional(char_class("01")) %R% optional(ascii_digit()) %R% ascii_digit()
)
# It's easier to write using number_range, though it isn't quite as optimal
# as handcrafted regexes.
number_range(0, 255, allow_leading_zeroes = TRUE)
# Now an IP address consists of 4 of these numbers separated by dots. This
# reads *Match a word boundary. Then create a token from an `ip_element`
# followed by a dot, and repeat it three times. Then match another `ip_element`
# followed by a word boundary.*
BOUNDARY %R%
  repeated(group(ip_element %R% DOT), 3) %R%
  ip_element %R%
  BOUNDARY
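As a rough cross-check of the hex colour and IP address examples above, the sketch below applies hand-written patterns with the same intent using base R's grepl. The patterns, variable names and test strings are assumptions written for illustration; they are not output produced by the rebus functions, which may spell the equivalent regex differently.

```r
# Hand-written patterns (assumed for illustration, not generated by rebus).
# A hash followed by exactly six hexadecimal digits, anchored at both ends.
hex_colour <- "^#[[:xdigit:]]{6}$"

# A number between 0 and 255, as a non-capturing group.
ip_element <- "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])"

# Three "element then dot" tokens, then a final element, between word boundaries.
ip_address <- paste0("\\b(?:", ip_element, "\\.){3}", ip_element, "\\b")

# perl = TRUE because the patterns use PCRE syntax such as (?: ).
hex_hits <- grepl(hex_colour, c("#99af01", "#99af0", "99af01"), perl = TRUE)
ip_hits <- grepl(ip_address, c("192.168.0.1", "256.1.1.1", "1.2.3"), perl = TRUE)

print(hex_hits)  # TRUE FALSE FALSE
print(ip_hits)   # TRUE FALSE FALSE
```

Checking a built expression against a few strings that should and should not match, as here, is a cheap way to catch mistakes before the pattern is used on real data.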
recursive Make the regular expression recursive.
Description
See recursive.
regex Create a regex
Description
See regex.
repeated Repeat values
Description
See repeated.
ReplacementCase Force the case of replacement values
Description
See ReplacementCase.
roman Roman numerals
Description
See roman.
SpecialCharacters Special characters
Description
See SpecialCharacters.
Unicode Unicode classes
Description
See Unicode.
UnicodeGeneralCategory Unicode General Categories
Description
See UnicodeGeneralCategory.
UnicodeOperators Unicode Operators
Description
See UnicodeOperators.
UnicodeProperty Unicode Properties
Description
See UnicodeProperty.
whole_word Match a whole word
Description
See whole_word.
WordBoundaries Word boundaries
Description
See WordBoundaries.
Index
%R% (Concatenation)
%c% (Concatenation)
ADDITIONAL_ARROWS (Unicode)
additional_arrows (Unicode)
AEGEAN_NUMBERS (Unicode)
aegean_numbers (Unicode)
ALCHEMICAL_SYMBOLS (Unicode)
alchemical_symbols (Unicode)
ALNUM (CharacterClasses)
alnum (ClassGroups)
ALPHA (CharacterClasses)
alpha (ClassGroups)
ALPHABETIC_PRESENTATION_FORMS (Unicode)
alphabetic_presentation_forms (Unicode)
AM_PM (DateTime)
Anchors
ANCIENT_GREEK_MUSICAL_NOTATION (Unicode)
ancient_greek_musical_notation (Unicode)
ANCIENT_GREEK_NUMBERS (Unicode)
ancient_greek_numbers (Unicode)
ANCIENT_SYMBOLS (Unicode)
ancient_symbols (Unicode)
ANY_CHAR (CharacterClasses)
any_char (ClassGroups)
ARABIC (Unicode)
arabic (Unicode)
ARABIC_EXTENDED_A (Unicode)
arabic_extended_a (Unicode)
ARABIC_MATHEMATICAL_ALPHANUMERIC_SYMBOLS (Unicode)
arabic_mathematical_alphanumeric_symbols (Unicode)
ARABIC_PRESENTATION_FORMS_A (Unicode)
arabic_presentation_forms_a (Unicode)
ARABIC_PRESENTATION_FORMS_B (Unicode)
arabic_presentation_forms_b (Unicode)
ARABIC_SUPPLEMENT (Unicode)
arabic_supplement (Unicode)
ARMENIAN (Unicode)
armenian (Unicode)
ARMENIAN_LIGATURES (Unicode)
armenian_ligatures (Unicode)
as.regex
as_lower (ReplacementCase)
as_upper (ReplacementCase)
ASCII_ALNUM (CharacterClasses)
ascii_alnum (ClassGroups)
ASCII_ALPHA (CharacterClasses)
ascii_alpha (ClassGroups)
ASCII_DIGIT (CharacterClasses)
ascii_digit (ClassGroups)
ASCII_LOWER (CharacterClasses)
ascii_lower (ClassGroups)
ASCII_UPPER (CharacterClasses)
ascii_upper (ClassGroups)
AVESTAN (Unicode)
avestan (Unicode)
Backreferences
BACKSLASH (SpecialCharacters)
BALINESE (Unicode)
balinese (Unicode)
BAMUN (Unicode)
bamun (Unicode)
BAMUN_SUPPLEMENT (Unicode)
bamun_supplement (Unicode)
BASSA_VAH (Unicode)
bassa_vah (Unicode)
BATAK (Unicode)
batak (Unicode)
BENGALI_AND_ASSAMESE (Unicode)
bengali_and_assamese (Unicode)
BLANK (CharacterClasses)
blank (ClassGroups)
BLOCK_ELEMENTS (Unicode)
block_elements (Unicode)
BOPOMOFO (Unicode)
bopomofo (Unicode)
BOPOMOFO_EXTENDED (Unicode)
bopomofo_extended (Unicode)
BOUNDARY (WordBoundaries)
BOX_DRAWING (Unicode)
box_drawing (Unicode)
BRAHMI (Unicode)
brahmi (Unicode)
BRAILLE_PATTERNS (Unicode)
braille_patterns (Unicode)
BUGINESE (Unicode)
buginese (Unicode)
BUHID (Unicode)
buhid (Unicode)
BYZANTINE_MUSICAL_SYMBOLS (Unicode)
byzantine_musical_symbols (Unicode)
capture
CARD_SUITS (Unicode)
card_suits (Unicode)
CARET (SpecialCharacters)
CARIAN (Unicode)
carian (Unicode)
case_insensitive (modify_mode)
CAUCASIAN_ALBANIAN (Unicode)
caucasian_albanian (Unicode)
CENTURY (DateTime)
CENTURY_IN (DateTime)
CHAKMA (Unicode)
chakma (Unicode)
CHAM (Unicode)
cham (Unicode)
char_class
char_range (ClassGroups)
CharacterClasses
CHEROKEE (Unicode)
cherokee (Unicode)
CHESS_CHECKERS_DRAUGHTS (Unicode)
chess_checkers_draughts (Unicode)
CJK_COMPATIBILITY (Unicode)
cjk_compatibility (Unicode)
CJK_COMPATIBILITY_FORMS (Unicode)
cjk_compatibility_forms (Unicode)
CJK_COMPATIBILITY_IDEOGRAPHS (Unicode)
cjk_compatibility_ideographs (Unicode)
CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT (Unicode)
cjk_compatibility_ideographs_supplement (Unicode)
CJK_IDEOGRAPHIC_DESCRIPTION_CHARACTERS (Unicode)
cjk_ideographic_description_characters (Unicode)
CJK_STROKES (Unicode)
cjk_strokes (Unicode)
CJK_SYMBOLS_AND_PUNCTUATION (Unicode)
cjk_symbols_and_punctuation (Unicode)
CJK_UNIFIED_IDEOGRAPHS (Unicode)
cjk_unified_ideographs (Unicode)
CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A (Unicode)
cjk_unified_ideographs_extension_a (Unicode)
CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B (Unicode)
cjk_unified_ideographs_extension_b (Unicode)
CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C (Unicode)
cjk_unified_ideographs_extension_c (Unicode)
CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D (Unicode)
cjk_unified_ideographs_extension_d (Unicode)
ClassGroups
CLOSE_BRACKET (SpecialCharacters)
CLOSE_PAREN (SpecialCharacters)
CNTRL (CharacterClasses)
cntrl (ClassGroups)
COMBINING_DIACRITIC_EXTENDED (Unicode)
combining_diacritic_extended (Unicode)
COMBINING_DIACRITIC_MARKS (Unicode)
combining_diacritic_marks (Unicode)
COMBINING_DIACRITIC_MARKS_FOR_SYMBOLS (Unicode)
combining_diacritic_marks_for_symbols (Unicode)
COMBINING_DIACRITIC_SUPPLEMENT (Unicode)
combining_diacritic_supplement (Unicode)
COMBINING_HALF_MARKS (Unicode)
combining_half_marks (Unicode)
COMMON_INDIC_NUMBER_FORMS (Unicode)
common_indic_number_forms (Unicode)
Concatenation
CONTROL_PICTURES (Unicode)
control_pictures (Unicode)
COPTIC (Unicode)
coptic (Unicode)
COPTIC_EPACT_NUMBERS (Unicode)
coptic_epact_numbers (Unicode)
COUNTING_ROD_NUMERALS (Unicode)
counting_rod_numerals (Unicode)
CUNEIFORM (Unicode)
cuneiform (Unicode)
CUNEIFORM_NUMBERS_AND_PUNCTUATION (Unicode)
cuneiform_numbers_and_punctuation (Unicode)
CURRENCY_SYMBOLS (Unicode)
currency_symbols (Unicode)
CYPRIOT_SYLLABARY (Unicode)
cypriot_syllabary (Unicode)
CYRILLIC (Unicode)
cyrillic (Unicode)
CYRILLIC_EXTENDED_A (Unicode)
cyrillic_extended_a (Unicode)
CYRILLIC_EXTENDED_B (Unicode)
cyrillic_extended_b (Unicode)
CYRILLIC_SUPPLEMENT (Unicode)
cyrillic_supplement (Unicode)
DateTime
datetime (DateTime)
DAY (DateTime)
DAY_IN (DateTime)
DAY_OF_YEAR (DateTime)
DAY_OF_YEAR_IN (DateTime)
DAY_SINGLE (DateTime)
DESERET (Unicode)
deseret (Unicode)
DEVANAGARI (Unicode)
devanagari (Unicode)
DEVANAGARI_EXTENDED (Unicode)
devanagari_extended (Unicode)
DGT (CharacterClasses)
dgt (ClassGroups)
DIGIT (CharacterClasses)
digit (ClassGroups)
DINGBATS (Unicode)
dingbats (Unicode)
DMY (DateTime)
DMY_IN (DateTime)
DOLLAR (SpecialCharacters)
DOMINO_TILES (Unicode)
domino_tiles (Unicode)
DOT (SpecialCharacters)
DTSEP (DateTime)
duplicate_group_names (modify_mode)
DUPLOYAN (Unicode)
duployan (Unicode)
DYM (DateTime)
DYM_IN (DateTime)
EGYPTIAN_HIEROGLYPHS (Unicode)
egyptian_hieroglyphs (Unicode)
ELBASAN (Unicode)
elbasan (Unicode)
EMOTICONS (Unicode)
emoticons (Unicode)
ENCLOSED_ALPHANUMERIC_SUPPLEMENT (Unicode)
enclosed_alphanumeric_supplement (Unicode)
ENCLOSED_ALPHANUMERICS (Unicode)
enclosed_alphanumerics (Unicode)
ENCLOSED_CJK_LETTERS_AND_MONTHS (Unicode)
enclosed_cjk_letters_and_months (Unicode)
ENCLOSED_IDEOGRAPHIC_SUPPLEMENT (Unicode)
enclosed_ideographic_supplement (Unicode)
END (Anchors)
engroup (capture)
escape_special
ETHIOPIC (Unicode)
ethiopic (Unicode)
ETHIOPIC_EXTENDED (Unicode)
ethiopic_extended (Unicode)
ETHIOPIC_EXTENDED_A (Unicode)
ethiopic_extended_a (Unicode)
ETHIOPIC_SUPPLEMENT (Unicode)
ethiopic_supplement (Unicode)
exactly
FLOORS_AND_CEILINGS (Unicode)
floors_and_ceilings (Unicode)
format.regex
FRACTIONAL_SECOND (DateTime)
FRACTIONAL_SECOND_IN (DateTime)
free_spacing (modify_mode)
FULLWIDTH_ASCII_DIGITS (Unicode)
fullwidth_ascii_digits (Unicode)
FULLWIDTH_ASCII_PUNCTUATION (Unicode)
fullwidth_ascii_punctuation (Unicode)
GENERAL_PUNCTUATION (Unicode)
general_punctuation (Unicode)
GEOMETRIC_SHAPES (Unicode)
geometric_shapes (Unicode)
GEOMETRIC_SHAPES_EXTENDED (Unicode)
geometric_shapes_extended (Unicode)
GEORGIAN (Unicode)
georgian (Unicode)
GEORGIAN_SUPPLEMENT (Unicode)
georgian_supplement (Unicode)
get_months (get_weekdays)
get_weekdays
GLAGOLITIC (Unicode)
glagolitic (Unicode)
GOTHIC (Unicode)
gothic (Unicode)
GRANTHA (Unicode)
grantha (Unicode)
GRAPH (CharacterClasses)
graph (ClassGroups)
GRAPHEME (CharacterClasses)
grapheme (ClassGroups)
GREEK_AND_COPTIC (Unicode)
greek_and_coptic (Unicode)
GREEK_EXTENDED (Unicode)
greek_extended (Unicode)
group (capture)
GUJARATI (Unicode)
gujarati (Unicode)
GURMUKHI (Unicode)
gurmukhi (Unicode)
HALFWIDTH_AND_FULLWIDTH_FORMS (Unicode)
halfwidth_and_fullwidth_forms (Unicode)
HANGUL_COMPATIBILITY_JAMO (Unicode)
hangul_compatibility_jamo (Unicode)
HANGUL_JAMO (Unicode)
hangul_jamo (Unicode)
HANGUL_JAMO_EXTENDED_A (Unicode)
hangul_jamo_extended_a (Unicode)
HANGUL_JAMO_EXTENDED_B (Unicode)
hangul_jamo_extended_b (Unicode)
HANGUL_SYLLABLES (Unicode)
hangul_syllables (Unicode)
HANUNOO (Unicode)
hanunoo (Unicode)
HEBREW (Unicode)
hebrew (Unicode)
HEX_DIGIT (CharacterClasses)
hex_digit (ClassGroups)
HIRAGANA (Unicode)
hiragana (Unicode)
HM (DateTime)
HM_IN (DateTime)
HMS (DateTime)
HMS_IN (DateTime)
HOUR12 (DateTime)
HOUR12_IN (DateTime)
HOUR12_SINGLE (DateTime)
HOUR24 (DateTime)
HOUR24_IN (DateTime)
HOUR24_SINGLE (DateTime)
ICU_REF1 (Backreferences)
ICU_REF2 (Backreferences)
ICU_REF3 (Backreferences)
ICU_REF4 (Backreferences)
ICU_REF5 (Backreferences)
ICU_REF6 (Backreferences)
ICU_REF7 (Backreferences)
ICU_REF8 (Backreferences)
ICU_REF9 (Backreferences)
IMPERIAL_ARAMAIC (Unicode)
imperial_aramaic (Unicode)
INVISIBLE_OPERATORS (Unicode)
invisible_operators (Unicode)
IPA_EXTENSIONS (Unicode)
ipa_extensions (Unicode)
is.regex (as.regex)
ISO_DATE (DateTime)
iso_date (IsoClasses)
ISO_DATE_IN (DateTime)
ISO_DATETIME (DateTime)
iso_datetime (IsoClasses)
ISO_DATETIME_IN (DateTime)
ISO_TIME (DateTime)
iso_time (IsoClasses)
ISO_TIME_IN (DateTime)
IsoClasses
IsoDateTime (IsoClasses)
JAPANESE_CHESS (Unicode)
japanese_chess (Unicode)
JAVANESE (Unicode)
javanese (Unicode)
KAITHI (Unicode)
kaithi (Unicode)
KANA_SUPPLEMENT (Unicode)
kana_supplement (Unicode)
KANBUN (Unicode)
kanbun (Unicode)
KANGXI_RADICALS (Unicode)
kangxi_radicals (Unicode)
KANGXI_RADICALS_SUPPLEMENT (Unicode)
kangxi_radicals_supplement (Unicode)
KANNADA (Unicode)
kannada (Unicode)
KATAKANA (Unicode)
katakana (Unicode)
KATAKANA_PHONETIC_EXTENSIONS (Unicode)
katakana_phonetic_extensions (Unicode)
KAYAH_LI (Unicode)
kayah_li (Unicode)
KHAROSHTHI (Unicode)
kharoshthi (Unicode)
KHMER (Unicode)
khmer (Unicode)
KHMER_SYMBOLS (Unicode)
khmer_symbols (Unicode)
KHOJKI (Unicode)
khojki (Unicode)
KHUDAWADI (Unicode)
khudawadi (Unicode)
LAO (Unicode)
lao (Unicode)
LATIN (Unicode)
latin (Unicode)
LATIN_1_PUNCTUATION (Unicode)
latin_1_punctuation (Unicode)
LATIN_1_SUPPLEMENT (Unicode)
latin_1_supplement (Unicode)
LATIN_EXTENDED_A (Unicode)
latin_extended_a (Unicode)
LATIN_EXTENDED_ADDITIONAL (Unicode)
latin_extended_additional (Unicode)
LATIN_EXTENDED_B (Unicode)
latin_extended_b (Unicode)
LATIN_EXTENDED_C (Unicode)
latin_extended_c (Unicode)
LATIN_EXTENDED_D (Unicode)
latin_extended_d (Unicode)
LATIN_EXTENDED_E (Unicode)
latin_extended_e (Unicode)
LATIN_LIGATURES (Unicode)
latin_ligatures (Unicode)
lazy (repeated)
LEPCHA (Unicode)
lepcha (Unicode)
LETTERLIKE_SYMBOLS (Unicode)
letterlike_symbols (Unicode)
LIMBU (Unicode)
limbu (Unicode)
LINEAR_A (Unicode)
linear_a (Unicode)
LINEAR_B_IDEOGRAMS (Unicode)
linear_b_ideograms (Unicode)
LINEAR_B_SYLLABARY (Unicode)
linear_b_syllabary (Unicode)
LISU (Unicode)
lisu (Unicode)
literal
lookahead
lookbehind (lookahead)
LOWER (CharacterClasses)
lower (ClassGroups)
LYCIAN (Unicode)
lycian (Unicode)
LYDIAN (Unicode)
lydian (Unicode)
MAHAJANI (Unicode)
mahajani (Unicode)
MAHJONG_TILES (Unicode)
mahjong_tiles (Unicode)
MALAYALAM (Unicode)
malayalam (Unicode)
MANDAIC (Unicode)
mandaic (Unicode)
MANICHAEAN (Unicode)
manichaean (Unicode)
MATH_ARROWS (Unicode)
math_arrows (Unicode)
MATHEMATICAL_ALPHANUMERIC_SYMBOLS (Unicode)
mathematical_alphanumeric_symbols (Unicode)
MDY (DateTime)
MDY_IN (DateTime)
MEETEI_MAYEK (Unicode)
meetei_mayek (Unicode)
MEETEI_MAYEK_EXTENSIONS (Unicode)
meetei_mayek_extensions (Unicode)
MENDE_KIKAKUI (Unicode)
mende_kikakui (Unicode)
MEROITIC_CURSIVE (Unicode)
meroitic_cursive (Unicode)
MEROITIC_HIEROGLYPHS (Unicode)
meroitic_hieroglyphs (Unicode)
MIAO (Unicode)
miao (Unicode)
MINUTE (DateTime)
MINUTE_IN (DateTime)
MISCELLANEOUS_MATHEMATICAL_SYMBOLS_A (Unicode)
miscellaneous_mathematical_symbols_a (Unicode)
MISCELLANEOUS_MATHEMATICAL_SYMBOLS_B (Unicode)
miscellaneous_mathematical_symbols_b (Unicode)
MISCELLANEOUS_SYMBOLS_AND_PICTOGRAPHS (Unicode)
miscellaneous_symbols_and_pictographs (Unicode)
MISCELLANEOUS_TECHNICAL (Unicode)
miscellaneous_technical (Unicode)
MODI (Unicode)
modi (Unicode)
MODIFIER_TONE_LETTERS (Unicode)
modifier_tone_letters (Unicode)
modify_mode
MONGOLIAN (Unicode)
mongolian (Unicode)
MONTH (DateTime)
MONTH_IN (DateTime)
MRO (Unicode)
mro (Unicode)
MS (DateTime)
MS_IN (DateTime)
multi_line (modify_mode)
MUSICAL_SYMBOLS (Unicode)
musical_symbols (Unicode)
MYANMAR (Unicode)
myanmar (Unicode)
MYANMAR_EXTENDED_A (Unicode)
myanmar_extended_a (Unicode)
MYANMAR_EXTENDED_B (Unicode)
myanmar_extended_b (Unicode)
MYD (DateTime)
MYD_IN (DateTime)
NABATAEAN (Unicode)
nabataean (Unicode)
negate_and_group (char_class)
negated_char_class (char_class)
negative_lookahead (lookahead)
negative_lookbehind (lookahead)
NEW_TAI_LUE (Unicode)
new_tai_lue (Unicode)
NEWLINE (CharacterClasses)
newline (ClassGroups)
NKO (Unicode)
nko (Unicode)
no_backslash_escaping (modify_mode)
NOT_BOUNDARY (WordBoundaries)
NOT_DGT (CharacterClasses)
not_dgt (ClassGroups)
NOT_SPC (CharacterClasses)
not_spc (ClassGroups)
NOT_WRD (CharacterClasses)
not_wrd (ClassGroups)
NUMBER_FORMS (Unicode)
number_forms (Unicode)
number_range
OGHAM (Unicode)
ogham (Unicode)
OL_CHIKI (Unicode)
ol_chiki (Unicode)
OLD_ITALIC (Unicode)
old_italic (Unicode)
OLD_NORTH_ARABIAN (Unicode)
old_north_arabian (Unicode)
OLD_PERMIC (Unicode)
old_permic (Unicode)
OLD_PERSIAN (Unicode)
old_persian (Unicode)
OLD_SOUTH_ARABIAN (Unicode)
old_south_arabian (Unicode)
OLD_TURKIC (Unicode)
old_turkic (Unicode)
one_or_more (repeated)
OPEN_BRACE (SpecialCharacters)
OPEN_BRACKET (SpecialCharacters)
OPEN_PAREN (SpecialCharacters)
OPT_LEADING_0 (DateTime)
OPTICAL_CHARACTER_RECOGNITION (Unicode)
optical_character_recognition (Unicode)
optional (repeated)
or
or1 (or)
ORIYA (Unicode)
oriya (Unicode)
ORNAMENTAL_DINGBATS (Unicode)
ornamental_dingbats (Unicode)
OSMANYA (Unicode)
osmanya (Unicode)
PAHAWH_HMONG (Unicode)
pahawh_hmong (Unicode)
PAHLAVI_INSCRIPTIONAL (Unicode)
pahlavi_inscriptional (Unicode)
PAHLAVI_PSALTER (Unicode)
pahlavi_psalter (Unicode)
PALMYRENE (Unicode)
palmyrene (Unicode)
PAU_CIN_HAU (Unicode)
pau_cin_hau (Unicode)
PHAGS_PA (Unicode)
phags_pa (Unicode)
PHAISTOS_DISC (Unicode)
phaistos_disc (Unicode)
PHOENICIAN (Unicode)
phoenician (Unicode)
PHONETIC_EXTENSIONS (Unicode)
phonetic_extensions (Unicode)
PHONETIC_EXTENSIONS_SUPPLEMENT (Unicode)
phonetic_extensions_supplement (Unicode)
PIPE (SpecialCharacters)
PLAYING_CARDS (Unicode)
playing_cards (Unicode)
PLUS (SpecialCharacters)
PRINT (CharacterClasses)
print.regex (format.regex)
printable (ClassGroups)
PRIVATE_USE_AREA (Unicode)
private_use_area (Unicode)
PUNCT (CharacterClasses)
punct (ClassGroups)
QUESTION (SpecialCharacters)
rebus
rebus-package (rebus)
recursive
REF1 (Backreferences)
REF2 (Backreferences)
REF3 (Backreferences)
REF4 (Backreferences)
REF5 (Backreferences)
REF6 (Backreferences)
REF7 (Backreferences)
REF8 (Backreferences)
REF9 (Backreferences)
regex
regexpr
REJANG (Unicode)
rejang (Unicode)
repeated
ReplacementCase
ROMAN (roman)
roman
RUMI_NUMERAL_SYMBOLS (Unicode)
rumi_numeral_symbols (Unicode)
RUNIC (Unicode)
runic (Unicode)
SAMARITAN (Unicode)
samaritan (Unicode)
SAURASHTRA (Unicode)
saurashtra (Unicode)
SECOND (DateTime)
SECOND_IN (DateTime)
SHARADA (Unicode)
sharada (Unicode)
SHAVIAN (Unicode)
shavian (Unicode)
SHORTHAND_FORMAT_CONTROLS (Unicode)
shorthand_format_controls (Unicode)
SIDDHAM (Unicode)
siddham (Unicode)
single_line (modify_mode)
SINHALA (Unicode)
sinhala (Unicode)
SINHALA_ARCHAIC_NUMBERS (Unicode)
sinhala_archaic_numbers (Unicode)
SMALL_FORM_VARIANTS (Unicode)
small_form_variants (Unicode)
SORA_SOMPENG (Unicode)
sora_sompeng (Unicode)
SPACE (CharacterClasses)
space (ClassGroups)
SPACING_MODIFIER_LETTERS (Unicode)
spacing_modifier_letters (Unicode)
SPC (CharacterClasses)
spc (ClassGroups)
SpecialCharacters
SPECIALS (Unicode)
specials (Unicode)
STAR (SpecialCharacters)
START (Anchors)
SUNDANESE (Unicode)
sundanese (Unicode)
SUNDANESE_SUPPLEMENT (Unicode)
sundanese_supplement (Unicode)
SUPERSCRIPTS_AND_SUBSCRIPTS (Unicode)
superscripts_and_subscripts (Unicode)
SUPPLEMENTAL_ARROWS_A (Unicode)
supplemental_arrows_a (Unicode)
SUPPLEMENTAL_MATHEMATICAL_OPERATORS (Unicode)
supplemental_mathematical_operators (Unicode)
SUPPLEMENTAL_PUNCTUATION (Unicode)
supplemental_punctuation (Unicode)
SUPPLEMENTARY_PRIVATE_USE_AREA_A (Unicode)
supplementary_private_use_area_a (Unicode)
SUPPLEMENTARY_PRIVATE_USE_AREA_B (Unicode)
supplementary_private_use_area_b (Unicode)
SYLOTI_NAGRI (Unicode)
syloti_nagri (Unicode)
SYRIAC (Unicode)
syriac (Unicode)
TAGALOG (Unicode)
tagalog (Unicode)
TAGBANWA (Unicode)
tagbanwa (Unicode)
TAGS (Unicode)
tags (Unicode)
TAI_LE (Unicode)
tai_le (Unicode)
TAI_THAM (Unicode)
tai_tham (Unicode)
TAI_VIET (Unicode)
tai_viet (Unicode)
TAI_XUAN_JING_SYMBOLS (Unicode)
tai_xuan_jing_symbols (Unicode)
TAKRI (Unicode)
takri (Unicode)
TAMIL (Unicode)
tamil (Unicode)
TELUGU (Unicode)
telugu (Unicode)
THAANA (Unicode)
thaana (Unicode)
THAI (Unicode)
thai (Unicode)
TIBETAN (Unicode)
tibetan (Unicode)
TIFINAGH (Unicode)
tifinagh (Unicode)
TIMEZONE (DateTime)
TIMEZONE_OFFSET (DateTime)
TIRHUTA (Unicode)
tirhuta (Unicode)
token (capture)
TRANSPORT_AND_MAP_SYMBOLS (Unicode)
transport_and_map_symbols (Unicode)
UGARITIC (Unicode)
ugaritic (Unicode)
UGC_CASED_LETTER (UnicodeGeneralCategory)
ugc_cased_letter (UnicodeGeneralCategory)
UGC_CLOSE_PUNCTUATION (UnicodeGeneralCategory)
ugc_close_punctuation (UnicodeGeneralCategory)
UGC_CONNECTOR_PUNCTUATION (UnicodeGeneralCategory)
ugc_connector_punctuation (UnicodeGeneralCategory)
UGC_CONTROL (UnicodeGeneralCategory)
ugc_control (UnicodeGeneralCategory)
UGC_CURRENCY_SYMBOL (UnicodeGeneralCategory)
ugc_currency_symbol (UnicodeGeneralCategory)
UGC_DASH_PUNCTUATION (UnicodeGeneralCategory)
ugc_dash_punctuation (UnicodeGeneralCategory)
UGC_DECIMAL_NUMBER (UnicodeGeneralCategory)
ugc_decimal_number (UnicodeGeneralCategory)
UGC_ENCLOSING_MARK (UnicodeGeneralCategory)
ugc_enclosing_mark (UnicodeGeneralCategory)
UGC_FINAL_PUNCTUATION (UnicodeGeneralCategory)
ugc_final_punctuation (UnicodeGeneralCategory)
UGC_FORMAT_CONTROL (UnicodeGeneralCategory)
ugc_format_control (UnicodeGeneralCategory)
UGC_INITIAL_PUNCTUATION (UnicodeGeneralCategory)
ugc_initial_punctuation (UnicodeGeneralCategory)
UGC_LETTER (UnicodeGeneralCategory)
ugc_letter (UnicodeGeneralCategory)
UGC_LETTER_NUMBER (UnicodeGeneralCategory)
ugc_letter_number (UnicodeGeneralCategory)
UGC_LINE_SEPARATOR (UnicodeGeneralCategory)
ugc_line_separator (UnicodeGeneralCategory)
UGC_LOWERCASE_LETTER (UnicodeGeneralCategory)
ugc_lowercase_letter (UnicodeGeneralCategory)
UGC_MARK (UnicodeGeneralCategory)
ugc_mark (UnicodeGeneralCategory)
UGC_MATH_SYMBOL (UnicodeGeneralCategory)
ugc_math_symbol (UnicodeGeneralCategory)
UGC_MODIFIER_LETTER (UnicodeGeneralCategory)
ugc_modifier_letter (UnicodeGeneralCategory)
UGC_MODIFIER_SYMBOL (UnicodeGeneralCategory)
ugc_modifier_symbol (UnicodeGeneralCategory)
UGC_NONSPACING_MARK (UnicodeGeneralCategory)
ugc_nonspacing_mark (UnicodeGeneralCategory)
UGC_NUMBER (UnicodeGeneralCategory)
ugc_number (UnicodeGeneralCategory)
UGC_OPEN_PUNCTUATION (UnicodeGeneralCategory)
ugc_open_punctuation (UnicodeGeneralCategory)
UGC_OTHER (UnicodeGeneralCategory)
ugc_other (UnicodeGeneralCategory)
UGC_OTHER_LETTER (UnicodeGeneralCategory)
ugc_other_letter (UnicodeGeneralCategory)
UGC_OTHER_NUMBER (UnicodeGeneralCategory)
ugc_other_number (UnicodeGeneralCategory)
UGC_OTHER_PUNCTUATION (UnicodeGeneralCategory)
ugc_other_punctuation (UnicodeGeneralCategory)
UGC_OTHER_SYMBOL (UnicodeGeneralCategory)
ugc_other_symbol (UnicodeGeneralCategory)
UGC_PARAGRAPH_SEPARATOR (UnicodeGeneralCategory)
ugc_paragraph_separator (UnicodeGeneralCategory)
UGC_PRIVATE_USE_CONTROL (UnicodeGeneralCategory)
ugc_private_use_control (UnicodeGeneralCategory)
UGC_PUNCTUATION (UnicodeGeneralCategory)
ugc_punctuation (UnicodeGeneralCategory)
UGC_SEPARATOR (UnicodeGeneralCategory)
ugc_separator (UnicodeGeneralCategory)
UGC_SPACE_SEPARATOR (UnicodeGeneralCategory)
ugc_space_separator (UnicodeGeneralCategory)
UGC_SPACING_MARK (UnicodeGeneralCategory)
ugc_spacing_mark (UnicodeGeneralCategory)
UGC_SURROGATE_CONTROL (UnicodeGeneralCategory)
ugc_surrogate_control (UnicodeGeneralCategory)
UGC_SYMBOL (UnicodeGeneralCategory)
ugc_symbol (UnicodeGeneralCategory)
UGC_TITLECASE_LETTER (UnicodeGeneralCategory)
ugc_titlecase_letter (UnicodeGeneralCategory)
UGC_UNASSIGNED_CONTROL (UnicodeGeneralCategory)
ugc_unassigned_control (UnicodeGeneralCategory)
UGC_UPPERCASE_LETTER (UnicodeGeneralCategory)
ugc_uppercase_letter (UnicodeGeneralCategory)
Unicode
unicode_intersect (UnicodeOperators)
unicode_inverse (UnicodeOperators)
unicode_setdiff (UnicodeOperators)
unicode_union (UnicodeOperators)
UnicodeGeneralCategory
UnicodeOperators
UnicodeProperty
UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS (Unicode)
unified_canadian_aboriginal_syllabics (Unicode)
UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS_EXTENDED (Unicode)
unified_canadian_aboriginal_syllabics_extended (Unicode)
UNMATCHABLE (CharacterClasses)
UP_ALPHABETIC (UnicodeProperty)
up_alphabetic (UnicodeProperty)
UP_ASCII_HEX_DIGIT (UnicodeProperty)
up_ascii_hex_digit (UnicodeProperty)
UP_BIDI_CONTROL (UnicodeProperty)
up_bidi_control (UnicodeProperty)
UP_BIDI_MIRRORED (UnicodeProperty)
up_bidi_mirrored (UnicodeProperty)
UP_CASE_IGNORABLE (UnicodeProperty)
up_case_ignorable (UnicodeProperty)
UP_CASE_SENSITIVE (UnicodeProperty)
up_case_sensitive (UnicodeProperty)
UP_CASED (UnicodeProperty)
up_cased (UnicodeProperty)
UP_CHANGES_WHEN_CASEFOLDED (UnicodeProperty)
up_changes_when_casefolded (UnicodeProperty)
UP_CHANGES_WHEN_CASEMAPPED (UnicodeProperty)
up_changes_when_casemapped (UnicodeProperty)
UP_CHANGES_WHEN_LOWERCASED (UnicodeProperty)
up_changes_when_lowercased (UnicodeProperty)
UP_CHANGES_WHEN_NFKC_CASEFOLDED (UnicodeProperty)
up_changes_when_nfkc_casefolded (UnicodeProperty)
UP_CHANGES_WHEN_TITLECASED (UnicodeProperty)
up_changes_when_titlecased (UnicodeProperty)
UP_CHANGES_WHEN_UPPERCASED (UnicodeProperty)
up_changes_when_uppercased (UnicodeProperty)
UP_DASH (UnicodeProperty)
up_dash (UnicodeProperty)
UP_DEFAULT_IGNORABLE_CODE_POINT (UnicodeProperty)
up_default_ignorable_code_point (UnicodeProperty)
UP_DEPRECATED (UnicodeProperty)
up_deprecated (UnicodeProperty)
UP_DIACRITIC (UnicodeProperty)
up_diacritic (UnicodeProperty)
UP_EXTENDER (UnicodeProperty)
up_extender (UnicodeProperty)
UP_HEX_DIGIT (UnicodeProperty)
up_hex_digit (UnicodeProperty)
UP_HYPHEN (UnicodeProperty)
up_hyphen (UnicodeProperty)
UP_ID_CONTINUE (UnicodeProperty)
up_id_continue (UnicodeProperty)
UP_ID_START (UnicodeProperty)
up_id_start (UnicodeProperty)
UP_IDEOGRAPHIC (UnicodeProperty)
up_ideographic (UnicodeProperty)
UP_LOWERCASE (UnicodeProperty)
up_lowercase (UnicodeProperty)
UP_MATH (UnicodeProperty)
up_math (UnicodeProperty)
UP_NONCHARACTER_CODE_POINT (UnicodeProperty)
up_noncharacter_code_point (UnicodeProperty)
UP_POSIX_ALNUM (UnicodeProperty)
up_posix_alnum (UnicodeProperty)
UP_POSIX_BLANK (UnicodeProperty)
up_posix_blank (UnicodeProperty)
UP_POSIX_GRAPH (UnicodeProperty)
up_posix_graph (UnicodeProperty)
UP_POSIX_PRINT (UnicodeProperty)
up_posix_print (UnicodeProperty)
UP_POSIX_XDIGIT (UnicodeProperty)
up_posix_xdigit (UnicodeProperty)
UP_QUOTATION_MARK (UnicodeProperty)
up_quotation_mark (UnicodeProperty)
UP_SOFT_DOTTED (UnicodeProperty)
up_soft_dotted (UnicodeProperty)
UP_TERMINAL_PUNCTUATION (UnicodeProperty)
up_terminal_punctuation (UnicodeProperty)
UP_UPPERCASE (UnicodeProperty)
up_uppercase (UnicodeProperty)
UP_WHITE_SPACE (UnicodeProperty)
up_white_space (UnicodeProperty)
UPPER (CharacterClasses)
upper (ClassGroups)
VAI (Unicode)
vai (Unicode)
VARIATION_SELECTORS (Unicode)
variation_selectors (Unicode)
VARIATION_SELECTORS_SUPPLEMENT (Unicode)
variation_selectors_supplement (Unicode)
VEDIC_EXTENSIONS (Unicode)
vedic_extensions (Unicode)
VERTICAL_FORMS (Unicode)
vertical_forms (Unicode)
WARANG_CITI (Unicode)
warang_citi (Unicode)
WEEK_OF_YEAR (DateTime)
WEEK_OF_YEAR_IN (DateTime)
WEEKDAY0 (DateTime)
WEEKDAY1 (DateTime)
whole_word
WordBoundaries
WRD (CharacterClasses)
wrd (ClassGroups)
YDM (DateTime)
YDM_IN (DateTime)
YEAR (DateTime)
YEAR2 (DateTime)
YEAR4 (DateTime)
YI_RADICALS (Unicode)
yi_radicals (Unicode)
YI_SYLLABLES (Unicode)
yi_syllables (Unicode)
YIJING_HEXAGRAM_SYMBOLS (Unicode)
yijing_hexagram_symbols (Unicode)
YIJING_MONO_DI_AND_TRIGRAMS (Unicode)
yijing_mono_di_and_trigrams (Unicode)
YMD (DateTime)
YMD_IN (DateTime)
zero_or_more (repeated)