SIGNWRITING IN UNICODE 8 ISSUES 2015 by Stephen E Slevinski Jr
Issues with SignWriting in Unicode 8
Prepared for UTC #144 / L2 #241 (July 27-31, 2015), a Unicode Technical Committee meeting in Redmond, WA
by Stephen E Slevinski Jr
in association with the Center for Sutton Movement Writing
My Background
Bachelor of Science in Mathematics
Raised two kids with sign language
Started collaboration with Valerie Sutton from 2004 until today
Complete symbol encoding model on PUA Plane 16 (37,811 characters)
Complete script encoding model on PUA Plane 15 (1,179 characters)
Argued with Unicode in 2011 and then walked away
Released the ISWA 2010 symbol set in 2010
Finalized Formal SignWriting in ASCII on Jan 12, 2012
5 years of stability with the symbol set and font design
3 1/2 years of stability with the character encoding models
Involved with dozens of sign languages around the world
Foundation for all online use and modern publishing efforts
SignWriting in Software
All major SignWriting editors and viewers are compatible.
• SignPuddle Online: Primary source of written sign language.
• Delegs Editor: Educational software from Germany for bilingual education.
• SignWriter Studio: General-purpose SignWriting editor with an integrated dictionary and printing.
• SWift: SignWriting improved fast transcriber that aims to simplify the editing process.
• JSPad: SignWriting editor for Japanese Sign Language, based at Gifu University.
• Tunisigner: Interact with SignWriting notations through a 3D virtual signer able to reproduce the exact gestures represented within the sign language transcription.
• SignTyp: A linguistic coding system developed by Rachel Channon through an NSF grant that is being integrated with SignWriting.
SignMaker 2015
Cross-browser, drag-and-drop sign editor, with dictionary and advanced sign searching
http://www.signbank.org/signmaker.html
[Chart: Code Breakdown in KB, by Configuration, Support Libraries, and Custom HTML, JS, and CSS]
SignWriting in Software
JavaScript-based SignWriting Keyboard
Keyboard editing has returned to SignWriting.
Wikimedia Incubator: The keyboard editor is enabled on Wikimedia Incubator for the American Sign Language Wikipedia and every other sign language project.
Bookmarklet: Store JavaScript in a bookmark and you can use SignWriting on any web page in any text field.
Any Website: Add a few KB of JavaScript and the keyboard editor can be enabled on any website using standard edit boxes and visual presentation.
http://www.signwriting.org/symposium/presentation0041.html
SignWriting in Software
What about Unicode?
PUA Plane 15 design (1,179 characters)
The symbol-only design removed 2-D layout by dropping 5 structural markers and 500 number characters:
N4015 Preliminary Unicode (674 characters)
N4090 Revised Unicode (672 characters)
N4342 Unicode Proposal (672 characters)
A new inherent design removes 2 characters (F1 and R1) and breaks collation, as stated in the proposal.
A new facial diacritic design is proposed that is unsupported and untested.
The original design is still compatible with the community efforts.
Issues with SignWriting in Unicode 8
The Unicode 8 specification will not be used for any SignWriting project around the world.
The Unicode 8 specification for SignWriting is politically valuable, but unhelpful for developers.
Issues with SignWriting in Unicode 8
The issue of the moment is sorting, but there are three main issues.
If we address all of the issues for SignWriting, the existing international community of SignWriters is ready, able, and willing to embrace the standard.
Issue 1: Unicode 8 is incomplete
http://signbank.org/SignWriting_Character_Viewer.html
Unicode 8 only encodes the symbols and ignores the issue of layout.
Unicode 8 is missing the structural markers and number characters required for 2-D layout.
Unicode 8 requires SVG for the visual presentation.
Unicode 8 requires additional characters/markup to write a sign.
Issue 2: Unicode 8 is flawed
The idea of inherent characters breaks from community use, both today and historically.
Because of inherent modifiers, sorting is broken, searching is ambiguous, and replacements can be destructive.
Symbol base tokens: w, s, P
Symbol modifier tokens: i (fill), o (rotation)
Triadic symbol: identified with a string of 3 tokens.
Writing symbol: w i o
Punctuation symbol: P i o
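The triadic model can be sketched in code. The example below converts an ASCII symbol key (base, fill, rotation) into the Unicode 8 character sequence, where the first fill and first rotation are inherent and therefore omitted. The offsets are assumptions inferred from code points quoted elsewhere in this presentation (1D800 for base S100, 1DA8B for base S38b, 1DA9B for FILL MODIFIER-2, 1DAA1 for ROTATION MODIFIER-2), and the function name `key_to_unicode8` is hypothetical.

```python
def key_to_unicode8(key: str) -> str:
    """Convert a symbol key such as 'S38b01' to Unicode 8 characters.

    Offsets are inferred from the code points shown in the slides;
    treat them as an illustration, not a reference implementation.
    """
    base = int(key[1:4], 16)       # three hex digits, e.g. 0x100 or 0x38b
    fill = int(key[4], 16)         # 0 is the inherent fill (F1)
    rotation = int(key[5], 16)     # 0 is the inherent rotation (R1)

    chars = [chr(0x1D700 + base)]  # S100 -> U+1D800
    if fill:                       # F1 is never written in Unicode 8
        chars.append(chr(0x1DA9A + fill))      # fill 1 -> U+1DA9B (F2)
    if rotation:                   # R1 is never written in Unicode 8
        chars.append(chr(0x1DAA0 + rotation))  # rotation 1 -> U+1DAA1 (R2)
    return "".join(chars)


# One symbol becomes 1, 2, or 3 characters depending on its modifiers.
for key in ("S10000", "S10001", "S10010", "S10011"):
    print(key, [f"{ord(c):X}" for c in key_to_unicode8(key)])
```

The variable length of the result is what the following slides take issue with: a bare base character is both a complete symbol and a prefix of other symbols.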
Issue 2: Unicode 8 is flawed
Sorting is broken
1D800 SIGNWRITING HAND-FIST INDEX (HFI)
1DAA1 SIGNWRITING ROTATION MODIFIER-2 (R2)
1DA9B SIGNWRITING FILL MODIFIER-2 (F2)
Correct sorting with F1 & R1:
1. HFI F1 R1
5. HFI F1 R1 HFI F1 R1
2. HFI F1 R2
6. HFI F1 R2 HFI F1 R1
3. HFI F2 R1
7. HFI F2 R1 HFI F1 R1
4. HFI F2 R2
Incorrect sorting without F1 & R1:
1. HFI
5. HFI HFI
3. HFI F2
7. HFI F2 HFI
4. HFI F2 R2
2. HFI R2
6. HFI R2 HFI
http://www.unicode.org/L2/L2015/15184-signwriting-ducet.txt
http://signpuddle.net/15184-signwriting-ducet-response.txt
http://www.unicode.org/L2/L2015/15202-signwriting-ducet-aux.txt
Issue 2: Unicode 8 is flawed
Sorting is broken
1D800 SIGNWRITING HAND-FIST INDEX (HFI)
1DAA1 SIGNWRITING ROTATION MODIFIER-2 (R2)
1DA9B SIGNWRITING FILL MODIFIER-2 (F2)
DUCET Fix: HFI weight of 100, F2 weight of 420, R2 weight of 410
Correct sorting with DUCET:
1. HFI (100)
5. HFI HFI (100 100)
2. HFI R2 (100 410)
6. HFI R2 HFI (100 410 100)
3. HFI F2 (100 420)
7. HFI F2 HFI (100 420 100)
4. HFI F2 R2 (100 420 410)
1, 5, 2, 6, 3, 7, 4: Correct Sort Order
1, 2, 3, 4, 5, 6, 7: Incorrect Sort Order
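The two orderings on these slides can be reproduced with a short sketch. It compares the seven example sequences first by raw code point and then by the weights quoted for the DUCET fix. This is a plain single-level lexicographic comparison, which is an assumption made for illustration; it is not the full Unicode Collation Algorithm.

```python
# Code points quoted on the slide.
HFI, F2, R2 = 0x1D800, 0x1DA9B, 0x1DAA1

# The seven example sequences, numbered as on the slide (F1 and R1 omitted).
SIGNS = {
    1: [HFI],
    2: [HFI, R2],
    3: [HFI, F2],
    4: [HFI, F2, R2],
    5: [HFI, HFI],
    6: [HFI, R2, HFI],
    7: [HFI, F2, HFI],
}

# Weights quoted for the DUCET fix: rotation sorts before fill.
WEIGHTS = {HFI: 100, F2: 420, R2: 410}


def order(key):
    """Return the slide numbers sorted by the given key function."""
    return sorted(SIGNS, key=lambda n: key(SIGNS[n]))


by_code_point = order(lambda seq: seq)
by_weight = order(lambda seq: [WEIGHTS[c] for c in seq])

print(by_code_point)  # [1, 5, 3, 7, 4, 2, 6]
print(by_weight)      # [1, 5, 2, 6, 3, 7, 4]
```

Code point order places the fill modifier before the rotation modifier, which yields the incorrect list; swapping their weights restores the order obtained when F1 and R1 are written explicitly.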
Issue 2: Unicode 8 is flawed
Searching is ambiguous
1D800 SIGNWRITING HAND-FIST INDEX (HFI)
1DAA1 SIGNWRITING ROTATION MODIFIER-2 (R2)
1DA9B SIGNWRITING FILL MODIFIER-2 (F2)
Searching with F1 & R1:
1. HFI F1 R1
5. HFI F1 R1 HFI F1 R1
2. HFI F1 R2
6. HFI F1 R2 HFI F1 R1
3. HFI F2 R1
7. HFI F2 R1 HFI F1 R1
4. HFI F2 R2
Searching for the symbol HFI F1 R1 correctly finds 4 matches.
Searching without F1 & R1:
1. HFI
5. HFI HFI
3. HFI F2
7. HFI F2 HFI
4. HFI F2 R2
2. HFI R2
6. HFI R2 HFI
Searching for the symbol HFI incorrectly finds 10 matches without negative lookaheads.
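A minimal sketch of the ambiguity, using the seven example sequences in their Unicode 8 form. A naive search for the bare base character hits every occurrence, while a negative lookahead restricts the search to occurrences that are not followed by a modifier. The lookahead class below lists only the two modifiers named on the slide; a real pattern would need to cover every fill and rotation modifier.

```python
import re

# Characters quoted on the slide.
HFI, F2, R2 = "\U0001D800", "\U0001DA9B", "\U0001DAA1"

# The seven example sequences in slide order 1..7 (F1 and R1 are inherent).
ENTRIES = [
    HFI,
    HFI + R2,
    HFI + F2,
    HFI + F2 + R2,
    HFI + HFI,
    HFI + R2 + HFI,
    HFI + F2 + HFI,
]

# Naive search: any occurrence of the base character.
naive = re.compile(HFI)
# Strict search: the base character NOT followed by a modifier.
strict = re.compile(HFI + "(?![" + F2 + R2 + "])")

naive_hits = sum(len(naive.findall(entry)) for entry in ENTRIES)
strict_entries = [n for n, entry in enumerate(ENTRIES, start=1)
                  if strict.search(entry)]

print(naive_hits)      # 10
print(strict_entries)  # [1, 5, 6, 7]
```

With explicit F1 and R1 characters the plain three-character string would already be unambiguous, so no lookahead would be needed.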
Issue 2: Unicode 8 is flawed
Searching is ambiguous
Query String: QS10000S20500
Searching for signs that include 2 exact symbols will return these results from the ASL Dictionary.
Issue 2: Unicode 8 is flawed
Searching is ambiguous
Query String: QS100uuS205uu
Plus 6 more pages of signs.
In Unicode 8, searching for a symbol base without fill or rotation modifiers will return 6 times as much noise as signal.
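To illustrate how such query strings relate to regular expressions over Formal SignWriting text, here is a simplified sketch. It handles only the two forms shown above (exact symbol keys and keys with `uu` for any fill and rotation); the function names are hypothetical and the real query language has more features than this covers. The test sign is the example string used later in this presentation.

```python
import re

# A query symbol: S + 3 hex digits + fill (0-5 or u) + rotation (0-f or u).
QUERY_SYMBOL = r"S[0-9a-f]{3}[0-5u][0-9a-fu]"
# A coordinate pair such as 490x483.
COORD = r"[0-9]{3}x[0-9]{3}"


def symbol_pattern(q: str) -> str:
    """Regex for one spatial symbol; 'u' means any fill or any rotation."""
    fill = "[0-5]" if q[4] == "u" else q[4]
    rotation = "[0-9a-f]" if q[5] == "u" else q[5]
    # Requiring coordinates skips the symbols listed in the sequence prefix.
    return q[:4] + fill + rotation + COORD


def query_to_regex(query: str) -> re.Pattern:
    """Build a regex that matches signs containing every queried symbol."""
    symbols = re.findall(QUERY_SYMBOL, query)
    lookaheads = "".join(f"(?=.*{symbol_pattern(s)})" for s in symbols)
    return re.compile(lookaheads)


sign = "AS18711S20500M514x517S18711490x483S20500486x506"
print(bool(query_to_regex("QS20500").search(sign)))        # True
print(bool(query_to_regex("QS187uuS205uu").search(sign)))  # True
print(bool(query_to_regex("QS10000S20500").search(sign)))  # False
```

Because every ASCII symbol key has a fixed width, the exact query and the wildcard query differ only in two character classes.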
Issue 2: Unicode 8 is flawed
Replacements can be destructive
The TrueType Fonts use ligatures to support multiple character sets.
https://github.com/Slevinski/signwriting_2010_tools
Plane 15 Characters:
sub uFD830 uFD810 uFD820 by S10000;
sub uFD830 uFD810 uFD821 by S10001;
sub uFD830 uFD810 uFD822 by S10002;
sub uFD830 uFD810 uFD823 by S10003;
sub uFD830 uFD810 uFD824 by S10004;
sub uFD830 uFD810 uFD825 by S10005;
sub uFD830 uFD810 uFD826 by S10006;
sub uFD830 uFD810 uFD827 by S10007;
Ordering the rules by increasing or decreasing symbol key works without issue.
Unicode 8 Characters:
sub u1DA8B u1DAA7 by S38b07;
sub u1DA8B u1DAA6 by S38b06;
sub u1DA8B u1DAA5 by S38b05;
sub u1DA8B u1DAA4 by S38b04;
sub u1DA8B u1DAA3 by S38b03;
sub u1DA8B u1DAA2 by S38b02;
sub u1DA8B u1DAA1 by S38b01;
sub u1DA8B by S38b00;
The rules must be ordered by decreasing symbol key to avoid destruction.
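The ordering constraint can be shown with a small sketch that applies two of the Unicode 8 rules above as sequential string replacements. This is an analogy chosen for illustration, not a model of how a font engine applies ligature lookups; the helper name `substitute` is hypothetical.

```python
# Characters taken from the Unicode 8 rules above.
BASE, R2 = "\U0001DA8B", "\U0001DAA1"

# Two of the rules, ordered by decreasing symbol key (longest match first).
RULES_DECREASING = [(BASE + R2, "S38b01"), (BASE, "S38b00")]
# The same rules ordered by increasing symbol key (bare base first).
RULES_INCREASING = list(reversed(RULES_DECREASING))


def substitute(text: str, rules) -> str:
    """Apply each replacement rule in order, one after another."""
    for source, target in rules:
        text = text.replace(source, target)
    return text


symbol = BASE + R2  # the Unicode 8 spelling of S38b01

safe = substitute(symbol, RULES_DECREASING)
broken = substitute(symbol, RULES_INCREASING)

print(safe)             # S38b01
print(broken[:6])       # S38b00, the base was consumed too early
print(broken[6] == R2)  # True, a stray rotation modifier is left behind
```

With the Plane 15 characters every symbol is a fixed three-character string, so no rule is a prefix of another and the order does not matter.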
Issue 3: Unicode 8 is fictional
Facial diacritics do not exist. There is no font support, no software support, and no data.
Facial diacritics are described in one document, using 177 words.
Facial diacritics have never been tested on any individual, let alone an international group.
Facial expressions are created using overlap and overlay of many symbols, using Cartesian coordinates for each.
Facial diacritics should be handled in software rather than the character encoding.
Facial diacritics development was quietly abandoned at the end of 2012.
[Diagram: SignWriting encodings and the technologies built on them]
Community Use: Formal SignWriting with Regular Expressions and Query Strings (search results; processes millions of characters per second)
Encodings shown: PUA Plane 15, PUA Plane 16, Unicode 8
Rendering: SVG, TTF, Graphite Font, CSS style text, JS (6 KB zipped), ASCII Lite Markup
Other labels: isomorphic; preferred; unused; prototype; 10% to 50% reduction; 15 to 50 times expansion; 15 times expansion; twice the size; single character per symbol; ligatures of 1 to 3 characters; Cartesian coordinates with GPOS
Community Use: Formal SignWriting
AS18711S20500M514x517S18711490x483S20500486x506
Time: AS18711S20500
Space: M514x517S18711490x483S20500486x506
Tokens: A S18711 S20500 M514x517 S18711490x483 S20500486x506
A: Sequence Marker, followed by Symbols S18711 and S20500
M 514x517: Middle Lane SignBox with Max Coord (514,517)
S18711 490x483: Spatial Symbol at (490,483)
S20500 486x506: Spatial Symbol at (486,506)
Standard ASCII format is isomorphic to PUA Plane 15.
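The breakdown above can be expressed as a small parser. This is a minimal sketch assuming the structure shown on the slide: an optional sequence prefix, a lane marker with a maximum coordinate, and a list of spatial symbols. The slide shows only the `M` (middle lane) marker; accepting `B`, `L`, and `R` as well is an assumption, and the function name `parse_sign` is hypothetical.

```python
import re

KEY = r"S[0-9a-f]{3}[0-5][0-9a-f]"  # symbol key: base, fill, rotation
COORD = r"[0-9]{3}x[0-9]{3}"        # coordinate pair such as 514x517

# Optional time prefix (A + keys), then lane marker, max coord, spatial symbols.
# Only M appears on the slide; B, L, and R are assumed here.
SIGN = re.compile(
    rf"(?:A((?:{KEY})+))?([BLMR])({COORD})((?:{KEY}{COORD})*)"
)


def point(coord: str) -> tuple:
    """Turn '490x483' into (490, 483)."""
    x, y = coord.split("x")
    return int(x), int(y)


def parse_sign(fsw: str) -> dict:
    """Split one Formal SignWriting sign into its time and space parts."""
    match = SIGN.fullmatch(fsw)
    if match is None:
        raise ValueError(f"not a Formal SignWriting sign: {fsw!r}")
    sequence, lane, max_coord, spatials = match.groups()
    return {
        "sequence": re.findall(KEY, sequence or ""),
        "lane": lane,
        "max": point(max_coord),
        "symbols": [(key, point(coord)) for key, coord
                    in re.findall(rf"({KEY})({COORD})", spatials)],
    }


parsed = parse_sign("AS18711S20500M514x517S18711490x483S20500486x506")
print(parsed["sequence"])  # ['S18711', 'S20500']
print(parsed["max"])       # (514, 517)
print(parsed["symbols"])   # [('S18711', (490, 483)), ('S20500', (486, 506))]
```

Every token has a fixed width, which is why a single regular expression is enough to validate and split a sign.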
Unicode 9: Ideal Solution
[Diagram labels: Regular Expressions; Query Strings; search results; processes millions of characters per second; Graphite Font; TTF; Cartesian coordinates with GPOS; CSS style text; JS (6 KB zipped); 10% to 50% reduction; 15 to 50 times expansion]
Prototype font uses Cartesian coordinates for 2-D layout with Graphite.
http://signpuddle.net/iswa/#smartfont
Too Late?
SignWriting is spreading around the world and exploding online. All of the SignWriting projects are using an ASCII solution and have no plans to switch to the Unicode 8 design for the symbols. Without a full script solution for SignWriting, Unicode will not be used for SignWriting, especially the Unicode 8 design, which complicates otherwise simple routines. Using Unicode for SignWriting is a great idea in theory, but there are few advantages and too many disadvantages to seriously consider applying the Unicode 8 design, even if sorting is fixed.
I left the Unicode effort at the end of 2011. In 2012, I was shown the latest proposal (N4342). I objected privately and asked that they produce a working font before they contact me again. In 2014, I was told that SignWriting would be in Unicode 8. I reiterated my objections, pointing out the issues, and was told it was too late to change the design in any way.
Discussion Ideas
2-Color Fonts: SignWriting relies on a 2-color font. Currently, SignWriting mimics a 2-color font by using 2 TrueType Fonts: one for the line and another for the filling. If you have any experience with 2-color fonts, let's discuss the possibilities.
2-Dimensional Layout with Graphite and Cartesian coordinates: SignWriting has a prototype font that uses Cartesian coordinates to control the 2-dimensional layout with Graphite and PUA Plane 15 characters. If you have any experience with 2-dimensional layout using Cartesian coordinates, let's discuss the possibilities.
Alternate designs for a 2-dimensional script: This type of discussion is interesting, but it will not affect the SignWriting community. The standards are stable and widely used. This would make for an interesting project, but it is not work that I will be doing myself.
Discussion Ideas
Unicode 9 or 10: Can we deprecate Unicode 8? The community design has been stable for 3 1/2 years. There is an interested community and there are many possibilities for 2-color fonts and 2-dimensional layout.
Unicode 8: I will not be using Unicode 8. I partially support Unicode 8 with the SignWriting 2010 Fonts, but not the facial diacritics. I have suggested that people avoid using SignWriting in Unicode 8. I'm willing to discuss any of the 3 issues that I have outlined, but I'm not invested in any tweaks to the Unicode 8 design.
Symbol Encoding Model: PUA Plane 16 (37,811 characters)
Script Encoding Model: PUA Plane 15 (1,179 characters)
Both designs are productive and used today.
Issues with SignWriting in Unicode 8
by Stephen E Slevinski Jr
http://slevinski.github.io
http://www.slideshare.net/StephenSlevinski/presentations