Post on 28-Jan-2015
Math Editing and Display in Word 2007
Murray Sargent III
Publisher Text Services
28-May-2008
Overview
• 8 math infrastructures enable better math display/editing
• New Office math edit/display environment
• Interoperate with math programs such as Mathematica, Maple, publisher workflow
• Input methods and formats
• Layout
• Math font
Complex Project
• Intricacies of math typesetting
• Creating and using a large set of glyph variants
• Vagaries of math notation
• Embedding math zones into international text environments
• Interaction with complex scripts
• Math in other objects like hyperlinks, ruby
• Input with non-ASCII keyboards
Eight Math Infrastructures
• [La]TeX: current tech-doc standards
• Unicode 5.0: includes ~2000 math symbols
• MathML 2.0: math K–12 and beyond
• OpenType font technology: special math tables
• New math font (Cambria Math)
• Math layout handler
• Shared math input components
• MS Office environment, autocorrect
[La]TeX
• Widely used, high-quality tech document preparation language
• Simple ASCII keyboard entry
• Usage and math typography are very well documented
• Stable since 1990
• Complex scenarios are hard to edit
• Numerous dialects, user macros, and lack of Unicode complicate interchange
• Fonts aren’t well suited to screen display
Unicode 5.0
• 340 math chars exist in ASCII, U+2200 block, arrows, combining marks
• 1016 math alphanumeric characters are in Unicode Plane 1 or Letterlike Symbols
• 591 new math symbols and operators are on BMP
• One math variant selector
• One new combining character (reverse solidus)
• New math characters were requested by STIX
Basic Set of Alphanumeric Characters
• Latin digits (0 - 9)
• Upper- & lowercase Latin letters (a - z, A - Z)
• Uppercase Greek letters Α - Ω plus the nabla ∇ and a variant of theta ϴ
• Lowercase Greek letters α - ω plus the partial differential sign ∂ and glyph variants of ε, θ, κ, φ, ρ, and π
• Only unaccented forms of letters are used
Legibility Loss
Without math alphabetics, the Hamiltonian formula
ℋ = ∫ dτ [εE² + μH²]
becomes an integral equation
H = ∫ dτ [εE² + μH²]
Math Alphanumeric Characters
• Math needs various Latin and Greek styles like normal, bold, italic, script, Fraktur, and open-face
• May appear to be font variations, but have distinct semantics and spacings
• Without these distinctions, you get gibberish, violating Unicode rule: plain text must contain enough info to permit text to be rendered legibly, and nothing more
• Plain-text searches should distinguish between alphabets, e.g., a search for script ℋ shouldn’t match H, etc.
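The point about distinct semantics can be checked with a short Python sketch. It is only an illustration: the choice of script H (U+210B) and math italic H (U+1D43B), the names `H_VARIANTS`, `describe`, and `hamiltonian`, and the exact characters used to spell the Hamiltonian formula are assumptions made for this example.

```python
import unicodedata

# Three look-alike letters that are separate Unicode characters.
H_VARIANTS = {
    "plain": "H",                  # U+0048
    "script": "\u210B",            # Letterlike Symbols block
    "math italic": "\U0001D43B",   # Plane 1 math alphanumerics
}

def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Hamiltonian formula as plain text (character choices are an assumption).
hamiltonian = "\u210B = \u222B d\u03C4 [\u03B5E\u00B2 + \u03BCH\u00B2]"

# A plain-text search keeps script H and plain H apart.
script_hits = hamiltonian.count("\u210B")
plain_hits = hamiltonian.count("H")

# Compatibility folding (NFKC) erases the distinction between the alphabets.
folded = unicodedata.normalize("NFKC", hamiltonian)

if __name__ == "__main__":
    for label, ch in H_VARIANTS.items():
        print(label, describe(ch))
    print(hamiltonian, "->", folded)
```

After folding, both the energy and the magnetic field are spelled with the same plain H, which is the legibility loss described earlier.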
MathML
• MathML 1.0 (April, 1998) was the first World Wide Web Consortium (W3C) endorsed XML vocabulary
• Low-level format for describing mathematics as a basis for machine-to-machine communication
• MathML facilitates the use and re-use of scientific content on the Web
• MathML 2.0 (Second Edition released in late 2003) is now widely used in exchanging mathematical text
• MathML 2.0 spec has a wealth of math info
MathML Presentation Markup
<mrow>
  <mi>E</mi>
  <mo>=</mo>
  <mrow>
    <mi>m</mi>
    <mo>&InvisibleTimes;</mo>
    <msup>
      <mi>c</mi>
      <mn>2</mn>
    </msup>
  </mrow>
</mrow>
Presentation markup directs how the math should be rendered.
E = mc²
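As a rough illustration of how a program might consume presentation markup, the following Python sketch parses the E = mc² example with the standard library and flattens it to a linear string. The function name `to_linear` and the decision to drop invisible operators are assumptions for this example, not part of MathML or of Word; the invisible times operator is written as the literal character U+2062 so that no entity definitions are needed.

```python
import xml.etree.ElementTree as ET

# Presentation MathML for E = mc^2; U+2062 is the invisible times operator.
MATHML = (
    "<mrow><mi>E</mi><mo>=</mo><mrow><mi>m</mi><mo>\u2062</mo>"
    "<msup><mi>c</mi><mn>2</mn></msup></mrow></mrow>"
)

# Invisible math operators carry semantics but are not shown in the output.
INVISIBLES = {"\u2061", "\u2062", "\u2063", "\u2064"}

def to_linear(node: ET.Element) -> str:
    """Flatten a small subset of presentation MathML to linear text."""
    if node.tag in ("mi", "mn", "mo"):
        text = (node.text or "").strip()
        return "" if text in INVISIBLES else text
    if node.tag == "msup":
        base, sup = node  # msup has exactly two children: base and superscript
        return f"{to_linear(base)}^{to_linear(sup)}"
    # mrow and anything else: concatenate the children in order
    return "".join(to_linear(child) for child in node)

linear = to_linear(ET.fromstring(MATHML))

if __name__ == "__main__":
    print(linear)
```

Only `mi`, `mn`, `mo`, `msup`, and `mrow` are handled; a real converter would need the full element set.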
Office MathML (OMML)
<m:oMath>
  <m:r><m:t>E=m</m:t></m:r>
  <m:sSup>
    <m:e>
      <m:r><m:t>c</m:t></m:r>
    </m:e>
    <m:sup>
      <m:r><m:t>2</m:t></m:r>
    </m:sup>
  </m:sSup>
</m:oMath>
E = mc²
MathML with Custom XML
• Can put arbitrary namespace attributes in MathML tags
• More complicated embellishments can use
<semantics>
  MathML representation
  <annotation-xml>
    Enhancements
  </annotation-xml>
</semantics>
MathML Parsing
MathML can be tricky to parse. For sin x:
<mrow>
  <mi>sin</mi>
  <mo>&ApplyFunction;</mo>
  <mi>x</mi>
</mrow>
A parser doesn’t know it has a function-apply object until it reaches &ApplyFunction;, so it has to analyze expressions as with the linear format
Linear Format
E=mc^2
E = mc²
Math RTF
• Math RTF is OMML in RTF syntax
• Somewhat simplified (doesn’t need text tag)
• For example, <m:f> ... </m:f> → {\mf ... }
• Thoroughly defined in latest RTF spec
• Reading spec is great way to learn how Word represents math
Accented characters
• Accents are handled by math accent object
• Accents may apply to multiple characters
• Accents may be flattened
Vagaries of Math Notation
• Choice of subscript/superscript base
• Function arguments like [figure omitted]
• Integrands and n-aryands
• Absolute value ambiguities like |a|-|b|. Actually this example is unambiguous, but |a|b - c|d| has two possible meanings
• Context-sensitive ellipses: … vs ⋯
Math Spacing
• Operators have math spacing given by extended TeX spacing rules
• Function object gives correct spacing between object and neighbors, and between function name and argument
• n-aryand object gives correct spacing between n-ary operator and its n-aryand
• Automates much of the need for TeX spacing “tweaks”
• Context-dependent operator spacing like + - . , :
Font Sizing
• Text style, script style (70%), script-script style (60%)
• Sub/sups…, fractions in line
• Cramped
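The percentages above can be turned into a small size calculation. This Python sketch assumes that both percentages are relative to the text-style size and that anything nested deeper than script-script stays at 60%; neither assumption is stated on the slide, and the names `SCRIPT_SCALE` and `scaled_size` are made up for the example.

```python
# Scale factors by script level: 0 = text, 1 = script, 2 = script-script.
# Assumption: both percentages are relative to the text-style size.
SCRIPT_SCALE = (1.00, 0.70, 0.60)

def scaled_size(text_size_pt: float, script_level: int) -> float:
    """Return the font size in points for a given script nesting level."""
    if script_level < 0:
        raise ValueError("script_level must be non-negative")
    # Assumption: levels deeper than script-script do not shrink further.
    level = min(script_level, len(SCRIPT_SCALE) - 1)
    return text_size_pt * SCRIPT_SCALE[level]

if __name__ == "__main__":
    for level in range(4):
        print(level, round(scaled_size(11, level), 2))
```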
Confusables
• 1 vs l
• 𝑎 vs 𝛼
• 𝑣 vs 𝜈 vs 𝜐
• 𝒳 vs 𝜒
• Y vs Υ
• Other letter similarities are so close that they are avoided, e.g., UC alpha and LC omicron are never used.
Math Input Methods
• Linear format input and manual buildup
• Formula autobuildup (FAB)
• Math ribbons
• Recognition of handwritten formulae
• Hex code input
• WYSIWYG editing
• Hybrid editing (combination of WYSIWYG and FAB)
Hex to Unicode Input Method
• Type Unicode character hexadecimal code
• Make corrections as need be
• Type Alt+x to convert to character
• Type Alt+x to convert back to hex (useful especially for “missing glyph” character)
• Resolve ambiguities by selection
• Input higher-plane chars using 5 or 6-digit code
• MS Word and RichEdit standard
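The two directions of the Alt+x conversion amount to a hex parse and a hex format. The Python sketch below shows only that arithmetic; the function names are hypothetical, and it does not model how Word decides which preceding characters belong to the hex code.

```python
def hex_to_char(hex_code: str) -> str:
    """Convert a hexadecimal code such as '2207' or '1D43B' to its character."""
    value = int(hex_code, 16)
    return chr(value)  # raises ValueError above U+10FFFF

def char_to_hex(ch: str) -> str:
    """Convert a single character back to its hexadecimal code (at least 4 digits)."""
    return f"{ord(ch):04X}"

if __name__ == "__main__":
    nabla = hex_to_char("2207")
    print(nabla, char_to_hex(nabla))
    italic_h = hex_to_char("1D43B")  # higher-plane character, 5-digit code
    print(italic_h, char_to_hex(italic_h))
```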
Autocorrect Examples
• Type \delta and get δ, \Delta and get Δ
• Define \quadratic to be x = (-b ± √(b^2 - 4ac))/2a
• Then typing \quadratic<space> inserts the built-up quadratic formula [figure omitted]
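A minimal Python sketch of the lookup behind these examples follows. It assumes a plain dictionary of entries and treats the terminating space as consumed when an entry matches; the names `AUTOCORRECT` and `autocorrect` are illustrative, and the subsequent build-up of the linear-format text is not modeled.

```python
import re

# Illustrative autocorrect table; the \quadratic entry is user-defined.
AUTOCORRECT = {
    "\\delta": "\u03B4",
    "\\Delta": "\u0394",
    "\\quadratic": "x = (-b \u00B1 \u221A(b^2 - 4ac))/2a",
}

# An entry is a backslash followed by letters, terminated by a space.
_ENTRY = re.compile(r"(\\[A-Za-z]+) ")

def autocorrect(text: str) -> str:
    """Replace known \\name entries; the terminating space is deleted."""
    def expand(match: re.Match) -> str:
        name = match.group(1)
        # Unknown names are left untouched, including their trailing space.
        return AUTOCORRECT.get(name, match.group(0))
    return _ENTRY.sub(expand, text)

if __name__ == "__main__":
    print(autocorrect("\\delta "))
    print(autocorrect("\\quadratic "))
```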
Math Alphabetics
• \scriptA, \frakturA, \doubleA, etc., are used to insert math script, Fraktur, and double-struck alphabetics
• Italic and bold are controlled by italic & bold format tools and only apply to math alphabetics
• Italic and/or bold is ignored for characters that don’t have corresponding Unicode characters
Linear format math
• Simple operand is a span of alphanumeric characters
• E.g., simple numerator or denominator is terminated by any nonalphanumeric character
• abc/d gives the built-up fraction abc over d
• More complicated operands use parentheses ( ), brackets [ ], or braces { }
• Outermost parens in fractions aren’t displayed in built-up form
Linear format math (cont)
• E.g., plain text (a + c)/d displays as the built-up fraction a + c over d
• Easier to read than TeX’s, e.g., {a + c\over d}
• MathML: <mfrac><mrow><mi>a</mi><mo>+</mo><mi>c</mi></mrow><mrow><mi>d</mi></mrow></mfrac>
• Neat feature: linear-format text looks like math
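The fraction rules from these two slides can be sketched in a few lines of Python: split at the first slash that is not nested inside brackets, then drop one pair of outermost parentheses from each operand. This is a simplified illustration and not Word's parser; `strip_outer_parens`, `split_fraction`, and `to_tex` are hypothetical names, and operator precedence beyond a single fraction is ignored.

```python
def strip_outer_parens(operand: str) -> str:
    """Remove one pair of outermost parentheses if they enclose the whole operand."""
    operand = operand.strip()
    if not (operand.startswith("(") and operand.endswith(")")):
        return operand
    depth = 0
    for i, ch in enumerate(operand):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # The opening paren closed before the end, e.g. "(a)(b)".
            if depth == 0 and i < len(operand) - 1:
                return operand
    return operand[1:-1].strip()

def split_fraction(linear: str) -> tuple[str, str]:
    """Split linear-format text at the first top-level '/'."""
    depth = 0
    for i, ch in enumerate(linear):
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "/" and depth == 0:
            return (strip_outer_parens(linear[:i]),
                    strip_outer_parens(linear[i + 1:]))
    raise ValueError("no top-level '/' found")

def to_tex(linear: str) -> str:
    """Render the fraction in the TeX \\over form quoted on the slide."""
    numerator, denominator = split_fraction(linear)
    return f"{{{numerator}\\over {denominator}}}"

if __name__ == "__main__":
    print(split_fraction("abc/d"))
    print(to_tex("(a + c)/d"))
```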
Subscripts and Superscripts
• Unicode has numeric subscripts and superscripts along with some operators (U+2070-U+208E): convert these to regular subscripts and superscripts
• Others need some kind of markup like <msup>…</msup>
• Use TeX’s _ and ^ subscript/superscript ops for input; they can be displayed as a subscripted down arrow and superscripted up arrow
• Use parentheses as for fractions to overrule built-in precedence order
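One way to picture the conversion of Unicode subscripts and superscripts to regular ones is to rewrite them with the _ and ^ operators. The Python sketch below does this with the compatibility decompositions in the Unicode character database; restricting it to U+2070-U+208E follows the slide, while the function names and the choice to parenthesize runs of several characters are assumptions for this example.

```python
import unicodedata

def classify(ch: str):
    """Return ('<super>' or '<sub>', base char) for U+2070-U+208E, else (None, ch)."""
    if not 0x2070 <= ord(ch) <= 0x208E:
        return None, ch
    parts = unicodedata.decomposition(ch).split()
    if len(parts) == 2 and parts[0] in ("<super>", "<sub>"):
        return parts[0], chr(int(parts[1], 16))
    return None, ch

def expand_scripts(text: str) -> str:
    """Rewrite Unicode sub/superscript characters using _ and ^ operators."""
    out = []
    run_kind, run = None, ""

    def flush():
        nonlocal run_kind, run
        if run_kind is not None:
            op = "^" if run_kind == "<super>" else "_"
            # Parenthesize multi-character runs so they stay one operand.
            out.append(op + (run if len(run) == 1 else f"({run})"))
        run_kind, run = None, ""

    for ch in text:
        kind, base = classify(ch)
        if kind is None:
            flush()
            out.append(ch)
        else:
            if kind != run_kind:
                flush()
                run_kind = kind
            run += base
    flush()
    return "".join(out)

if __name__ == "__main__":
    print(expand_scripts("x\u2074 + a\u2081\u2082"))
```

Superscript characters outside that range, such as U+00B2, are left unchanged by this sketch.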
Formula Autobuildup
• Enter formulas in linear format in a math zone
• When a character is typed that renders an expression syntactically unambiguous, the expression is built up
• Edit expressions in built-up form or in linear form
• For integrals, type \int (which autocorrects to ∫) optionally followed by subscript and superscript for limits, which auto build up
• Can autocorrect \<letters> to built-up characters or expressions
Roles of Space (U+0020)
• The ASCII space is rarely needed inside math expressions, since math spacing is automatic
• Use to terminate autocorrect entries and to terminate expressions. When so used, it is deleted
• Use as command to build up math objects
• Use to define spacings for , . and : and to force a unary operator to display with binary spacing
• A space builds up one subexpression; other operators build up as many as they can
Unicode Spaces
Space Unicode Autocorrect
0 em U+200B \zwsp
1/18 em U+200A \hairsp
3/18 em U+2009 \thinsp
4/18 em U+205F \medsp
5/18 em U+2005 \thicksp
6/18 em U+2004 \vthicksp
9/18 em U+2002 \ensp
18/18 em U+2003 \emsp
(digit width) U+2007 \numsp
(space width) U+00A0 \nbsp
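The table can be held as data and queried, for example to convert a space width to points at a given font size. The Python sketch below copies the widths and autocorrect names from the table; the structure `MATH_SPACES` and the helper names are assumptions, and the two font-dependent rows are stored without a fixed width.

```python
import unicodedata
from fractions import Fraction

# (autocorrect name, code point, width in em; None when it depends on the font)
MATH_SPACES = [
    ("\\zwsp", 0x200B, Fraction(0)),
    ("\\hairsp", 0x200A, Fraction(1, 18)),
    ("\\thinsp", 0x2009, Fraction(3, 18)),
    ("\\medsp", 0x205F, Fraction(4, 18)),
    ("\\thicksp", 0x2005, Fraction(5, 18)),
    ("\\vthicksp", 0x2004, Fraction(6, 18)),
    ("\\ensp", 0x2002, Fraction(9, 18)),
    ("\\emsp", 0x2003, Fraction(18, 18)),
    ("\\numsp", 0x2007, None),   # digit width
    ("\\nbsp", 0x00A0, None),    # space width
]

def lookup(name: str):
    """Return (character, width in em) for an autocorrect name."""
    for entry_name, code_point, width in MATH_SPACES:
        if entry_name == name:
            return chr(code_point), width
    raise KeyError(name)

def width_in_points(name: str, font_size_pt: float) -> float:
    """Convert a table width to points; fails for font-dependent spaces."""
    _, width = lookup(name)
    if width is None:
        raise ValueError(f"{name} has a font-dependent width")
    return float(width * font_size_pt)

if __name__ == "__main__":
    for entry_name, code_point, _ in MATH_SPACES:
        print(entry_name, f"U+{code_point:04X}", unicodedata.name(chr(code_point)))
```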
Operators
Operator Precedence
CR 0
opOpen 1
opClose 2
opSeparator 3
concatenation 4
/ \atop 5
opNary 6
_ ^ opFApply \above \below 7
□ ∛ ∜ ■ opHbracket 8
opAccent 9
opUniSubSup 10
Four Math Invisibles
• There are four “invisible” math control codes
• Used for semantic content and usually don’t display a glyph. May have a small width, e.g., Function Apply has \thinsp
Math control code Unicode
Invisible Function Apply U+2061
Invisible Times U+2062
Invisible Comma U+2063
Invisible Plus U+2064
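A short Python sketch shows how these control codes behave as ordinary characters in a string. The labels in `MATH_INVISIBLES` are the ones used in the table (the formal Unicode character names differ slightly, e.g. U+2061 is FUNCTION APPLICATION), and the helper functions are hypothetical.

```python
import unicodedata

# Labels follow the table above; code points are from the same table.
MATH_INVISIBLES = {
    "Invisible Function Apply": "\u2061",
    "Invisible Times": "\u2062",
    "Invisible Comma": "\u2063",
    "Invisible Plus": "\u2064",
}

def apply_function(name: str, argument: str) -> str:
    """Join a function name and its argument with U+2061."""
    return f"{name}\u2061{argument}"

def strip_invisibles(text: str) -> str:
    """Remove the four invisible operators, e.g. before a naive text compare."""
    drop = set(MATH_INVISIBLES.values())
    return "".join(ch for ch in text if ch not in drop)

sin_x = apply_function("sin", "x")

if __name__ == "__main__":
    for label, ch in MATH_INVISIBLES.items():
        print(label, f"U+{ord(ch):04X}", unicodedata.category(ch))
    print(len(sin_x), strip_invisibles(sin_x))
```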
Math Layout
Collaboration between 5 entities:
• Unicode rich-text processing program such as Word or RichEdit
• LineServices math handler
• Page/TableServices math handler
• Math font, e.g., Cambria Math
• Math-font handler
Equation Breaking & Numbering
• Page/TableServices (PTS) math handler can break equations into multiple lines automatically or by user breaks
• PTS can handle layout of equation numbers
• Client needs to support “math paragraph”
• Two kinds of user breaks: at operator via context menu, at line break (Shift+Enter)
• At operator indentation: each TAB indents to next binary/relational operator
• Line break: align at specific operators, e.g., =
Math Engine Objects
Glyph Variants
• Subscripts/superscripts
• Primes
• Dotless i, j used in bases of accent objects
• Flattened and wide accents
• Growable brackets, integrals, arrows
• Display of differentials using U+2146
• Mirror images for right-to-left math
• Variation selector U+FE00
Cambria Math Font
• Cambria typeface designed by Jelle Bosma
• Extended for math by Ross Mills and Andrei Burago in collaboration with the ClearType and math-layout groups
• Contains extensive math tables, glyph variants and much of the Unicode math set
• Is designed with ClearType and excellent screen readability in mind
• Enables best screen-resolution display of math
New Math Fonts
• Cambria Math has new version with more math characters, e.g., U+2900..U+2AFF
• 202 math characters still needed for Unicode 5.1
• STIX Times Roman math font is in beta; doesn’t support Word 2007 math well
• STIX has full math character set + some
• STIX font is Type 1, so it doesn’t work with the Office PDF writer
• Font demos
Font Math Tables
• Specialized math tables have been created to control glyph placements
• Position subscripts/superscripts horizontally using cut-ins and italic corrections
• Many math constants: axis height, fraction rule thickness, etc.
• Compare kerning of [figure omitted]
• The math tables are formalized as OpenType tables accessible via mathfont.dll
Math Constants
User Spacing Adjustments
• Layout engine attempts to render with high typographic quality
• Users can spoil layout by inserting space where engine would insert it automatically
• Have autocorrect procedure to reduce this
• Users can insert Unicode spaces
• Phantoms and smashes
• Size and placement overrides
Phantoms and Smashes
• Phantoms have size but no display. Can have both width & height, ascent only, descent only
• Smashes display, but remove one or more sizes, e.g., descent, ascent, and/or width
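The difference between the two objects can be modeled with a simple box that has a width, an ascent, a descent, and a visibility flag. This Python sketch is a conceptual model only; `Box`, `phantom`, and `smash` are hypothetical names and do not correspond to any Word or OMML API.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Box:
    """Layout box: dimensions plus whether the content is drawn."""
    width: float
    ascent: float
    descent: float
    visible: bool = True

def phantom(box: Box, *, width=True, ascent=True, descent=True) -> Box:
    """Keep the selected dimensions but never draw the content."""
    return Box(
        width=box.width if width else 0.0,
        ascent=box.ascent if ascent else 0.0,
        descent=box.descent if descent else 0.0,
        visible=False,
    )

def smash(box: Box, *, width=False, ascent=False, descent=False) -> Box:
    """Draw the content but zero out the selected dimensions."""
    return replace(
        box,
        width=0.0 if width else box.width,
        ascent=0.0 if ascent else box.ascent,
        descent=0.0 if descent else box.descent,
    )

if __name__ == "__main__":
    glyph = Box(width=5.0, ascent=7.0, descent=2.0)
    print(phantom(glyph))                 # full-size, invisible
    print(phantom(glyph, descent=False))  # width and ascent only
    print(smash(glyph, descent=True))     # visible, descent removed
```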
Word 2007 Math Facility
• Elegant math entry and display
• Display is competitive with TeX
• Automatic line breaking, special kerning
• More math semantics than TeX: greater interoperability (Presentation MathML)
• Input with math ribbon, context menus
• Formula autobuildup input method
• WYSIWYG editing as well as linear format
• MS Math graphing calculator add-in
What Word 2007 doesn’t have
• Built-in equation numbering
• Math Find/Replace
• OpenType enhancements (aside from math table functionality)
• Optimal line breaking
• Configurable math-zone vertical spacing
• [La]TeX import/export
• Document-wide MathML support (only MathML for a single math zone)
Conclusions
• Eight infrastructures allow us to do math display and editing better than ever before
• High-quality math handler and font enable typography competitive with or better than TeX
• Best screen-resolution display of mathematics
• Streamlined input methods such as Formula Autobuildup
• Incorporated into Word 2007, Word down-level converter, Microsoft Math calculator
• Cambria Math font: state-of-the-art math font