.NET Managed HTML Rendering Engine


Transcript of .NET Managed HTML Rendering Engine

Page 1: .NET Managed HTML Rendering Engine

Introduction to Lightweight Pure .NET Managed HTML Rendering Engine With Multi-Language Script Host

[fâr-dll-fâr]

Project “.fair[dll=‘fair’]”

Page 2: .NET Managed HTML Rendering Engine

Standard .NET WebBrowser Control Pros and Cons

Basics of System.Windows.Forms.WebBrowser Control
• Officially introduced in the System.Windows.Forms.* namespace in .NET Framework 2.0.
 - .NET 1.0 and .NET 1.1 did not include an HTML rendering control. A custom interop assembly had to be created with TlbImp.exe.
• The WebBrowser control interops with IE components such as mshtml.dll and ieframe.dll, known as ‘Trident’ (similar to the RichTextEdit control).
• Trident itself is a native x86 or x64 binary component. It performs DOM processing very fast and is a trustworthy, market-standard browser engine provided by Microsoft.
• Trident calls WinInet (wininet.dll) to access the Web; WinInet is widely used by other desktop applications.

[Diagram labels: AppDomain; Managed Control; Managed XML; Managed Data; Managed Sockets; Managed WebRequest; WebBrowser (Trident); Unmanaged from System32: JScript, MSHTML, WinInet, ieframe; Cache Management]

In 2015, a “cutting edge” browser called ‘Microsoft Edge’ was bundled with Windows 10.

Page 3: .NET Managed HTML Rendering Engine

Our company has no affiliation whatsoever with Mitsubishi Corporation (三菱商事) or SIGMAXYZ (シグマクシス).

Page 4: .NET Managed HTML Rendering Engine

IE Versioning Issue
• Trident is normally ‘one single version per OS’, not ‘side-by-side’.
• Not all clients use the latest version of IE, which may result in a different HTML look-and-feel.
 - The WebBrowser control (including a custom interop assembly) may invoke anything from IE 5.x to IE 10.x.
 - XP can install up to Internet Explorer 8. Vista can install Internet Explorer 9.
• Not all browser functionality is available to .NET programmers.
• New WebBrowser control features introduced in future versions of .NET may not work on .NET 2.0.
• ChakraCore was open sourced under the MIT License on 2016/01/14 on GitHub. (https://github.com/Microsoft/ChakraCore)
 - x86, x64 and ARM support
 - Linux will be supported soon.

Page 5: .NET Managed HTML Rendering Engine

Why 100 % Managed Browser?

• Upgrading a company’s browser is a political issue.
• Pointer-less managed objects may protect the system from security attacks such as buffer overrun vulnerabilities.
• 100% managed class objects may be faster and safer for the runtime. (No COM marshaling cost)
• Stay with the standard .NET Dispose() and Finalize(), not the ReleaseComObject() API for COM objects.
• Consistent output regardless of the client environment.
• Both 32-bit and 64-bit (or future 128-bit) processors (Intel, ARM, PowerPC) need to run one rendering library.
• Be free from the IE versioning issue (no component dependency).
• Looking for a light-weight component capable of processing HTML DOM, CSS and JavaScript. (HTML DOM alone is not good enough)

Page 6: .NET Managed HTML Rendering Engine

『 .fair[dll=‘fair’] 』 Target .NET Runtime Version

.NET Framework    HTML    CSS    JavaScript
1.0               N/A     N/A    N/A
1.1               N/A     N/A    N/A
2.0               ✓       ✓      ✓
3.0               ✓       ✓      ✓
4.0               ✓       ✓      ✓
4.5               ✓       ✓      ✓
4.5.1             ✓       ✓      ✓
4.5.2             ✓       ✓      ✓
4.6               ✓       ✓      ✓

Target Operating System

Windows 2000, Windows XP, Windows 7, Windows 8 (and 8.1), Windows 10

• Any WinNT platform that supports .NET Framework 2.0 or greater can be supported.
• No support for .NET Compact, .NET Mini or Mono currently (it may work on Mono, but there is no Mono support).
• Unless the jscript.dll engine is chosen, an IE installation is NOT required.
• .fair[dll=‘fair’] has its own cache management scheme, apart from Microsoft.Win32.WinInetCache.
• .NET 1.0 and .NET 1.1 are out of the mainstream support phase.

Page 7: .NET Managed HTML Rendering Engine

『 .fair[dll=‘fair’] 』 Basic Architecture

[Architecture diagram; recoverable labels follow]

Common Language Runtime (2.0 – 4.6.1)

CHtmlMultiversalWindowRenderer
• Render HTMLDocument
• Process UI Event
• Control CHtmlDocument

CHtmlMultiversalWindow
• Almost identical to the ‘window’ object
• Hosts ScriptProcessor
• Acts as the global object for script

CHtmlDocument (XML, SVG Document)
• Parse HTML
• Layout
• Process CSS
• Process Script

Other labels: CHtmlElement, CHtmlCSSStyleSheet, CHtmlCssRule, CHtmlCssRuleMergeQueue, CHtmlCssRuleGroundList, StyleKey, MediaQuery, CHtmlContext (Canvas), XmlHttpRequest, DOMParser, Event, Navigator, Screen, Collection, GeoLocation, ActiveXWrapper, Flash, 3rd Party, Add-In

ScriptProcessor labels: Rhino, JINT, OSSScriptingAssembly, [Plan] ChakraCore, Other Lang (e.g. PHP, Perl, etc.)

Page 8: .NET Managed HTML Rendering Engine

Supported HTML Tags

Most HTML 3.2, HTML4 and HTML5 standard tags are supported and generate a CHtmlElement. (By default, undefined tags are processed as block elements within the HTML document.)

HTML4

<a><applet><b><base><big><blockquote><body><br><caption><dd><div><dl><dt><em><embed><font><form><h><head><hr><html><i><img><input><li><link><meta><nobr><noembed><object><ol><option><p><pre><s><script><select><small><span><strike><strong><sub><sup><table><tbody><td><textarea><tfoot><th><thead><title><tr><tt><u><ul><!DOCTYPE>

HTML5

<canvas><audio><video><source><track><bdi><aside><nav><ruby><rt><section><time><wbr><footer><header><progress><figure><output><datalist>

XML: any XML tags

SVG: all SVG tags, e.g. <svg><text><rect>

Page 9: .NET Managed HTML Rendering Engine

Rendering with Multiple Managed Threads

[Diagram: the UI Thread requests a CHtmlDocument and a CHtmlDocument is returned. Threads shown: I/O Thread, UI Thread, DOM Thread, CSS Thread, Image Thread, Script Thread. Sample document parts shown: <HTML>, <head>, <link…>, <script..>, </head>, <body>, <canvas>, <image>, <video>, <audio>, </body>, </HTML>]

Page 10: .NET Managed HTML Rendering Engine

Choosing GDI+ or System.Windows.Control

• CHtmlMultiversalWindowRenderer is a System.Windows.Controls.Control.
• Most CHtmlElement classes (HTML tags) are drawn by GDI+/GDI.
• Control tags, e.g. <IFrame>, <Input>, <button>, host a managed control.
• <Object> or <Embed> creates an ActiveXObject if the ActiveXObject option is enabled.
• <svg> is partially supported.
• Canvas 2D will be supported with the standard System.Drawing API set, e.g. DrawText, DrawLine, FillRectangle etc.

fair[dll=fair] basic design guidance

HTML can grow beyond 10000 HTML/XML tags, so it is recommended to keep the number of Controls as low as possible.

“Control resource is limited by OS.”

Page 11: .NET Managed HTML Rendering Engine

Script Engine Support

• Rhino (via IKVM) (https://developer.mozilla.org/ja/docs/Rhino)
 - originates from the Netscape/Mozilla ‘javagator’ project in 1997.
 - constantly improved for over 15 years.
 - supports legacy JavaScript mode with the setLanguageVersion() API.
 - supports JavaScript 1.7 from version 1.7R1.
 - supports interpreter mode and compile mode (with the setOptimizationLevel() API).
 - stable and widely used among Java users.
 - supports execution timeout in interpreter mode.
 - the default script engine of .fair[dll=‘fair’].

[Diagram labels: MultiversalWindow; IMultiversalInterface; RhinoProcessor; Rhino Interpreter]

IMultiversalInterface defines the script interpreter function interface to be called. It is a separate module from the browser assembly. Any script language engine can be implemented as a browser compiler if it uses the IMultiversal interface.
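
The plug-in idea behind the script engine interface can be sketched in JavaScript as follows. This is only an illustration under assumptions: `ScriptHost`, `register`, `run` and the two toy engines are hypothetical names chosen for this sketch, not the actual .fair[dll=‘fair’] or IMultiversalInterface API.

```javascript
// Hypothetical sketch: a host that accepts any engine exposing
// a `name` string and an `evaluate(source, globals)` function.
class ScriptHost {
  constructor() {
    this.engines = new Map();
  }

  // Register an engine; reject objects that do not follow the contract.
  register(engine) {
    if (typeof engine.name !== 'string' || typeof engine.evaluate !== 'function') {
      throw new TypeError('engine must provide name and evaluate()');
    }
    this.engines.set(engine.name, engine);
  }

  // Dispatch a script to the engine registered under `engineName`.
  run(engineName, source, globals = {}) {
    const engine = this.engines.get(engineName);
    if (!engine) {
      throw new Error('no engine registered for ' + engineName);
    }
    return engine.evaluate(source, globals);
  }
}

// Toy engine 1: evaluates a JavaScript expression with the given globals.
const expressionEngine = {
  name: 'javascript',
  evaluate(source, globals) {
    const names = Object.keys(globals);
    const fn = new Function(...names, 'return (' + source + ');');
    return fn(...names.map((n) => globals[n]));
  },
};

// Toy engine 2: a stand-in for "another language" that just upper-cases text.
const shoutEngine = {
  name: 'shout',
  evaluate(source) {
    return source.toUpperCase();
  },
};

const host = new ScriptHost();
host.register(expressionEngine);
host.register(shoutEngine);

console.log(host.run('javascript', 'a + b', { a: 1, b: 2 })); // prints 3
console.log(host.run('shout', 'hello'));                      // prints HELLO
```

The host never looks inside an engine; it only relies on the two-member contract, which is why a second language can be added without touching the host.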

Page 12: .NET Managed HTML Rendering Engine

DOM API and AJAX Support
• getElementById(), getElementsByClassName(), getElementsByTagName(), getElementsByTagNameNS()
• createElement(), createTextNode(), createStyleSheet(), createDocumentFragment() etc.
• appendChild(), insertBefore(), replaceChild() etc.
• querySelector(), querySelectorAll(), matchesSelector()
• XMLHttpRequest(), DOMParser() (embedded in the System.Net.HttpWebRequest object) (XDomainRequest support has been dropped)
• image(), document.all() - some features only
• ArrayBuffer, IntxArray, UIntxArray, FloatxArray (we will change to the Rhino 1.7R5 spec)
• document.cookie, Element.attributes, Element.classList, Element.dataset
• window.localStorage, window.sessionStorage
• createElement(‘<iframe name=“abc”/>’)
• postMessage(), window.onmessage
• new ActiveXObject(“TypeID”) (some object features only, e.g. XMLHTTP, XMLDOM, Shockwave etc.)
• setInterval(), setTimeout(), clearTimeout()
• getContext(), window.requestAnimationFrame()
• document.evaluate() is now supported through its own System.Xml.XPath.XPathNavigator (which is based on XPath 1.0)

CSS 3.0
• :nth-last-child(), :nth-last-of-type(), :only-child, :only-of-type (done; best-effort implementation), :matches(), :not()

Prototype Extension Support
• HTML prototype objects such as window.Element, window.Event and window.HTMLElement are defined.

• Most of the HTML 3.2/4 standard API has been implemented.
• Most standard jQuery scripts can be compiled.
• Melon.js (HTML5 Canvas game framework) can be compiled.
• Utility scripts such as embed.js, core130.js, core131.js, chartbeat.js and modernizr (version 2.7.1) can also be compiled.
• prototype.js and mootools.js may compile (some versions only).
• Some utility scripts that call minor HTML5 APIs remain difficult to compile.
• Dynamic element CSS recalculation, triggered by altering an element's class from script, is now supported.
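
A short script that combines a few of the listed calls (querySelectorAll() and Element.classList) might look like the sketch below. The function name `tagMatches` and the class name `seen` are made up for illustration, and the script has not been verified against .fair[dll=‘fair’] itself; it uses only standard DOM calls.

```javascript
// Adds `className` to every element matching `selector` and
// returns how many elements matched.
// `doc` is any object with a DOM-style querySelectorAll().
function tagMatches(doc, selector, className) {
  const nodes = doc.querySelectorAll(selector);
  for (let i = 0; i < nodes.length; i++) {
    nodes[i].classList.add(className);
  }
  return nodes.length;
}

// Only touch the real DOM when one exists (e.g. inside a browser host).
if (typeof document !== 'undefined') {
  const count = tagMatches(document, 'a', 'seen');
  console.log(count + ' link(s) tagged');
}
```

Passing the document in as a parameter keeps the function usable in any host that exposes a DOM-like object.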

Page 13: .NET Managed HTML Rendering Engine

Rendering Result(Beta)

Page 14: .NET Managed HTML Rendering Engine

Scripting with Prototype Objects and forEach

In “.fair[dll=‘fair’]”, most query results are returned as an HTMLCollection:

document.getElementsByTagName('*') instanceof HTMLCollection;

A ‘forEach’ call through the HTMLCollection prototype is supported, as in the following script:

HTMLCollection.prototype.forEach = function (fun, thisp) {
    return Array.prototype.forEach.call(this, fun, thisp);
};

var elems = document.getElementsByTagName('*');
elems.forEach(function (node, index, nodeList) {
    console.log(node, index);
});
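
The same borrowing technique can be tried without any DOM. In the sketch below, `FakeCollection` is a hypothetical stand-in for HTMLCollection (an array-like object with indexed entries and a `length`); it is not part of any browser or of .fair[dll=‘fair’].

```javascript
// Hypothetical array-like type standing in for HTMLCollection.
function FakeCollection(items) {
  for (var i = 0; i < items.length; i++) {
    this[i] = items[i];
  }
  this.length = items.length;
}

// Borrow Array.prototype.forEach, exactly as done for HTMLCollection above.
FakeCollection.prototype.forEach = function (fun, thisp) {
  return Array.prototype.forEach.call(this, fun, thisp);
};

var seen = [];
new FakeCollection(['html', 'head', 'body']).forEach(function (node, index) {
  seen.push(index + ':' + node);
});

console.log(seen.join(', ')); // prints "0:html, 1:head, 2:body"
```

Array.prototype.forEach only needs `length` and indexed properties on its receiver, which is why it works on any array-like collection.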

Page 15: .NET Managed HTML Rendering Engine

About Layout Engine…
• Unlike the sophisticated layout engines of standard browsers (e.g. Chrome, Firefox or IE), the layout engine of fair[dll=‘fair’] performs only a single-phase element layout.
• fair[dll=‘fair’] can now perform dynamic CSS recalculation driven by JavaScript; however, the actual layout may not work as expected.
• Currently, 30% of commercial news web sites are rendered in ‘relatively good’ output condition.
• fair[dll=‘fair’] works well on classic web sites with plain, basic HTML and CSS (e.g. blogs, Wikipedia, GitHub, Stack Overflow, government web sites).
• If a left float follows a centered block, the layout can be mixed up. [Diagram labels: Center, Left]

A multiple-phase layout scheme may be introduced in the future…

Our current priority (due to our limited resources) is:

Script handling > performance > layout

Page 16: .NET Managed HTML Rendering Engine

Hardware Requirement
• .fair[dll=‘fair’] can operate on a low-end CPU (e.g. Celeron or Pentium) whose CPU PassMark score is below 1000-2000 points.
• Because of the system design, a multi-core CPU is strongly recommended.
• .fair[dll=‘fair’] is mainly designed for broadband networks rather than narrowband (less than 500 Kbps). 1 Mbps or higher is recommended.
• If the network speed is less than 200 Kbps, rendering of script-rich and content-rich web sites tends to slow down noticeably.

Even an old low-spec CPU (like the Dothan Pentium M 1.7 GHz) can run this rendering engine as long as the network speed is high enough.

Page 17: .NET Managed HTML Rendering Engine

Multimedia Support for HTML5 <Video> <Audio> Tags

• The HTML5 Video element and Audio element are supported through the Windows Media API.

Page 18: .NET Managed HTML Rendering Engine

Canvas 2D with JavaScript - Part 1
• Fundamental 2-dimensional Canvas drawing scripts now work.

Page 19: .NET Managed HTML Rendering Engine

Canvas 2D with JavaScript - Part 2

Page 20: .NET Managed HTML Rendering Engine

Canvas 2D with JavaScript - Part 3

• The drawback is performance. Pixel array manipulation scripts in particular tend to be slow.
• Animation performance is worse than in a normal browser, as you would expect (almost 1/3 for 2D).
• Complex canvas games, such as ‘Full Screen Mario’ and ‘Gradius’, are not ready to play.
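
For readers unfamiliar with the term, a "pixel array manipulation script" is a loop over every RGBA byte of a canvas image, as in the sketch below. The function name `toGrayscale` and the simple channel-average formula are choices made for this illustration. In a browser the buffer would normally come from `ctx.getImageData(...).data` and be written back with `ctx.putImageData(...)`; here a small hand-made buffer is used so the script runs without a canvas.

```javascript
// Converts an RGBA byte buffer to grayscale in place by averaging
// the red, green and blue channels; alpha is left untouched.
function toGrayscale(pixels) {
  for (let i = 0; i < pixels.length; i += 4) {
    const gray = Math.round((pixels[i] + pixels[i + 1] + pixels[i + 2]) / 3);
    pixels[i] = gray;     // red
    pixels[i + 1] = gray; // green
    pixels[i + 2] = gray; // blue
  }
  return pixels;
}

// Two pixels: opaque red, and a half-transparent blue-gray.
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 30, 60, 90, 128]);
toGrayscale(pixels);

console.log(Array.from(pixels).join(',')); // prints "85,85,85,255,60,60,60,128"
```

The loop touches four bytes per pixel, so even a modest 800 x 600 canvas means 480000 iterations per frame, which is where a slower script engine shows most.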

Page 21: .NET Managed HTML Rendering Engine

Canvas WebGL/3D Progress

• Approximately 95% of the WebGL-related API exists.
• ‘Three.js’ can be compiled.
• Currently, the WebGL API does almost nothing.
• The current WebGL is basically derived from classic OpenGL, which is not thread efficient.
• DirectX 12 and OpenGL 4.5 bring big changes that make it easier to interact with multiple GPUs, so WebGL will change in its next version (CanvasMantra[?]). We are waiting for the next, ‘better’ version of WebGL.

Page 22: .NET Managed HTML Rendering Engine

As Canvas WebGL Alternative…

“phoria.js” is an “excellent” JavaScript library for simple 3D graphics on a canvas 2D renderer, which is still slow, but works on “.fair[dll=‘fair’]”.

Page 23: .NET Managed HTML Rendering Engine

DOM Structure, CSS, and JavaScript Viewer

[Screenshots: DOM Tree View; Script Compilation Result View; CSS Process View]

Page 24: .NET Managed HTML Rendering Engine

Current JavaScript Script Processing Update – Part 1

Status codes: 200 : Success; 400, 500x : Script Error

The left picture shows the result view after visiting money.cnn.com. Most JavaScript compilation succeeded [Status : 200], including jQuery 1.5.1. However, one script still has a script error [Status : 500]. The script error ratio depends on the complexity of the web page.

[Second screenshot caption: 200 : all Success]

Page 25: .NET Managed HTML Rendering Engine

Current JavaScript Script Processing Error Count on Well-Known Web Sites

Tested against the build of 2014/09/30

Site Name                     URL                                       Script Count   Error Count
Computer World                http://www.computerworld.com              74             1
NBC                           http://www.nbc.com                        50             1
PC World                      http://www.pcworld.com                    75             2
Yahoo                         http://www.yahoo.com                      20             0
Youtube                       http://www.youtube.com                    17             1
New York Times                http://www.nytimes.com                    23             0
China news                    http://www.chinanews.com/                 69             0
Wired.com                     http://www.wired.com/                     71             0
Innovation Excellence         http://www.innovationexcellence.com/      50             2
CIA                           http://www.cia.gov                        16             0
Code Project                  http://www.codeproject.com                15             0
Wikipedia English Main Page   http://en.wikipedia.org/wiki/Main_Page    13             1
Stackoverflow                 http://stackoverflow.com                  21             0
Recode.net                    http://recode.net                         88             1
San Jose Mercury News         http://www.mercurynews.com/               104            2
SFGate                        http://www.sfgate.com                     74             1
Xconomy                       http://www.xconomy.com                    66             1
Bokete                        http://bokete.jp                          19             0
Passmark home                 http://www.passmark.com                   6              0
Toms hardware home            http://www.tomshardware.com/              62             0
Computer Shopper              http://www.computershopper.com/           57             0
Amazon com                    http://www.amazon.com                     70             1
Bloomberg Businessweek        http://www.businessweek.com               49             0

Note 1) Async or deferred scripts, and element onload scripts (e.g. img), may not be counted in ‘script count’.
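
The "script error ratio" mentioned earlier can be worked out from the counts in this table. The sketch below uses four of the rows as sample data; the function names `errorRatio` and `summarize` are made up for this illustration and are not part of the engine.

```javascript
// Four sample rows copied from the table above.
const rows = [
  { site: 'Computer World', scripts: 74, errors: 1 },
  { site: 'PC World', scripts: 75, errors: 2 },
  { site: 'Yahoo', scripts: 20, errors: 0 },
  { site: 'San Jose Mercury News', scripts: 104, errors: 2 },
];

// Fraction of scripts that ended in a script error (0 when there are no scripts).
function errorRatio(scripts, errors) {
  return scripts === 0 ? 0 : errors / scripts;
}

// Adds up script and error counts across all rows.
function summarize(list) {
  return list.reduce(
    (acc, row) => ({
      scripts: acc.scripts + row.scripts,
      errors: acc.errors + row.errors,
    }),
    { scripts: 0, errors: 0 }
  );
}

for (const row of rows) {
  const percent = (errorRatio(row.scripts, row.errors) * 100).toFixed(2);
  console.log(row.site + ': ' + percent + '%');
}

const totals = summarize(rows);
console.log('sample total: ' + totals.errors + ' errors in ' + totals.scripts + ' scripts');
```

Per Note 1, async, deferred and onload scripts may be missing from the script counts, so ratios computed this way cover only the scripts that were counted.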

Page 26: .NET Managed HTML Rendering Engine

Latest Build Page Load Performance Result
HTML DOM + CSS with/without JavaScript Processing

Google Top Page (http://www.google.com)
 - DOM + CSS: Before: 280 - 400 ms; New: 78 - 110 ms
 - DOM + CSS + JavaScript: Before: 395 - 402 ms; New: 85 - 125 ms

CNN (http://edition.cnn.com/)
 - DOM + CSS: 2300 - 3200 ms
 - DOM + CSS + JavaScript: Before: 5534 - 7750 ms; New: 3050 - 3900 ms

USA Today (http://www.usatoday.com/)
 - DOM + CSS: Before: 2100 - 2500 ms; New: 850 - 1100 ms
 - DOM + CSS + JavaScript: Before: 1500 - 3500 ms; New: 900 - 1400 ms

Computerworld (http://www.computerworld.com)
 - DOM + CSS: 600 - 1300 ms
 - DOM + CSS + JavaScript: Before: 9600 - 26000 ms; New: 3200 - 4300 ms

ChinaView Cn (http://www.chinaview.cn/)
 - DOM + CSS: New: 1300 - 2500 ms
 - DOM + CSS + JavaScript: New: 2500 - 4800 ms

Information Week (http://www.informationweek.com)
 - DOM + CSS: New: 1300 - 1800 ms
 - DOM + CSS + JavaScript: Before: 10500 - 33600 ms; New: 5100 - 5900 ms

ZDNET (http://www.zdnet.com)
 - DOM + CSS: New: 770 - 950 ms
 - DOM + CSS + JavaScript: New: 2500 - 4500 ms

Wikipedia (http://en.wikipedia.org/wiki/Main_Page)
 - DOM + CSS: New: 650 - 750 ms
 - DOM + CSS + JavaScript: New: 1300 - 1700 ms

Yahoo (US) (http://www.yahoo.com)
 - DOM + CSS: Old: 3300 - 4500 ms; 2013/12: 500 - 2000 ms
 - DOM + CSS + JavaScript: Before: 5900 - 12000 ms; 2013/12: 935 - 5800 ms

Testing environment: Pentium G640 2.8 GHz, 2 cores, 8 GB RAM, Windows 8 + Rhino JavaScript engine with .NET Framework 4.5. Some scripts may fail to execute. Web data may be cached locally or remotely.

On average, 30 - 60% performance improvement!
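
One way to turn a "Before" and "New" range into a single improvement figure is sketched below. The slides do not say how the average was calculated, so comparing range midpoints is an assumption of this sketch, and `midpoint` and `timeReductionPercent` are hypothetical helper names.

```javascript
// Midpoint of a [min, max] range in milliseconds.
function midpoint(range) {
  return (range[0] + range[1]) / 2;
}

// Percentage by which the midpoint load time dropped, rounded to a whole number.
// Assumption: midpoints are a fair summary of each measured range.
function timeReductionPercent(before, after) {
  return Math.round((1 - midpoint(after) / midpoint(before)) * 100);
}

// DOM + CSS + JavaScript figures from the list above.
const usaToday = timeReductionPercent([1500, 3500], [900, 1400]);
const cnn = timeReductionPercent([5534, 7750], [3050, 3900]);

console.log('USA Today: ' + usaToday + '% less load time'); // prints 54
console.log('CNN: ' + cnn + '% less load time');            // prints 48
```

A reduction in load time is not the same quantity as a speed-up factor, so these numbers are only one possible reading of the measurements.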

Page 27: .NET Managed HTML Rendering Engine

fair[dll=“fair”] on Xamarin Mono Project

fair[dll=“fair”] now runs under Mono 4.0. Most functions, such as HTML DOM, CSS and JavaScript functions, also work properly on Linux.

• With Mono, fair[dll=“fair”] now has a high-performance timer (not WM_TIMER). The performance of requestAnimationFrame() has been improved.
• Although Mono 4.0 has improved, some minor issues remain in graphics and decompression.
• No ActiveXObject (e.g. Flash) on Mono.
• No plan to support <Video> and <Audio> in the near future.

[Screenshots: Mono on Windows; Mono on Linux (OpenSUSE 13.2); Mono on Mac (no image)]

Page 28: .NET Managed HTML Rendering Engine

WebP Image Format Support
• WebP is a relatively new, BSD-licensed format developed by On2 Technologies, which was acquired by Google. [For details, see Wikipedia.]
• Although it is not a widely used image format on the Net, some popular sites, such as YouTube, use it. It was announced in 2010.
• .NET Framework 4.5 does not currently support the WebP image format.
• Thanks to the Java VP8 Decoder (http://sourceforge.net/p/javavp8decoder/news/2010/11/pure-java-webp-decoder-available/), most WebP images are now decoded in the rendering engine.

[Diagram labels: Java VP8 Decoder; WebP Image]

Page 29: .NET Managed HTML Rendering Engine

Current Issues - a lot!
• Only a few events are currently supported.
 - onclick, onload, onreadystatechange, onmousemove, onkeydown, onkeypress, domcontentloaded, visibilityChange, onfocus etc.
• Poor layout engine output. (An undefined width on a block element may result in unexpected layout.)
• @font-face supports WOFF 1.0 (not WOFF 2.0 yet).
• Hang or crash issues due to stack overflow.
• The HTML5 Canvas Context API is in progress. (Currently, 70% - 80% of the 2D Canvas API is working.)
• Needs visualization support for 3D/WebGL/WebCL through DirectX or OpenGL.
• Multiple background images for one element are now supported on .fair[dll=‘fair’], but the actual rendering output has many issues.
• Incomplete Worker/SharedWorker threading objects.

long way to go…

Page 30: .NET Managed HTML Rendering Engine

Not Supported Features (in near future)
• In-progress rendering. (Currently, the full document DOM must be loaded before the first render.)
• Live script debugging.
• <applet> needs a Java runtime, so there is no support for <applet>. (iPhone and Android do not support <applet> either.)
• CSS expression()
• Browser-specific CSS hacks (“* html+xyz”) (*:first-child+etc)
• XSL DOM transformation
• Unpopular HTML5 APIs (Web FileSystem API, Microdata, Web SQL etc.).
• The Web Real-Time Communication API set (WebRTC etc.) is beyond the current .NET and hardware spec.

Page 31: .NET Managed HTML Rendering Engine

Our Mission
• Create an open-standard, light-weight, HTML5-compatible managed browser with multiple script languages by 2017.
• Open leadership in the HTML community.
• We are going to release .fair[dll=‘fair’] as a contribution to the “2011 Japan Tohoku Fukushima Earthquake Disaster 311 Recovery” and to express our gratitude to Ken Shama, who is a great pioneer of supply chain management, and Satoru Murakami (who was the best Exchange engineer and the best mentor).

[Diagram: Connect! Windows, Linux, Apple OS X, AIX, FreeBSD, Solaris, Tron]