.NET Managed HTML Rendering Engine with Multi-language script host


description

CSS, JavaScript, DOM parser and rendering

Transcript of .NET Managed HTML Rendering Engine with Multi-language script host

Page 1: .NET Managed HTML Rendering Engine with Multi-language script host

Project “.fair[dll=‘fair’]”

Introduction to a Lightweight, Pure .NET Managed HTML Rendering Engine with a Multi-Language Script Host

[fâr-dll-fâr]

Page 2: .NET Managed HTML Rendering Engine with Multi-language script host

Standard .NET WebBrowser Control: Pros and Cons

Basics of the System.Windows.Forms.WebBrowser control
• Officially introduced in the System.Windows.Forms namespace in .NET Framework 2.0.
  - .NET 1.0 and .NET 1.1 did not include an HTML rendering control. A custom interop assembly had to be created with TlbImp.exe.
• The WebBrowser control interoperates with IE components such as mshtml.dll and ieframe.dll, known as ‘Trident’ (similar to the RichTextEdit control).
• Trident itself is a native x86 or x64 binary component. It performs DOM processing very fast and is a trustworthy, market-standard browser engine provided by Microsoft.
• Trident calls WinInet (wininet.dll) to access the Web. WinInet is widely used by other desktop applications.

[Diagram labels: AppDomain; Managed Control, Managed XML, Managed Data, Managed Sockets, Managed WebRequest; WebBrowser (Trident); unmanaged from System32: JScript, MSHTML, WinInet, ieframe; Cache Management]

Page 3: .NET Managed HTML Rendering Engine with Multi-language script host

IE Versioning Issue
• Trident is normally ‘one single version per OS’, not ‘side-by-side’.
• Not all clients use the latest version of IE, which may result in a different HTML look and feel.
  - The WebBrowser control (including a custom interop assembly) may end up invoking anything from IE 5.x to IE 10.x.
  - Windows XP can install up to Internet Explorer 8. Windows Vista can install up to Internet Explorer 9.
• Currently, Microsoft’s latest script engine, ‘Chakra’, cannot be installed or distributed separately.
• Not all browser functionality is available to .NET programmers.
• New WebBrowser control features introduced in future versions of .NET may not work on .NET 2.0.

Page 4: .NET Managed HTML Rendering Engine with Multi-language script host

Why a 100% Managed Browser?
• Upgrading a company’s browser is a political issue.
• Pointer-less managed objects may protect the system from security attacks such as buffer overrun vulnerabilities.
• 100% managed class objects may be faster and safer for the runtime (no COM marshaling cost).
• Stay with the standard .NET Dispose() and Finalize() rather than the ReleaseComObject() API for COM objects.
• Consistent output regardless of the client environment.
• Both 64-bit and 32-bit (or a future 128-bit) application modes need to run one HTML rendering library.
• Freedom from the IE versioning issue (no component dependency).
• We are looking for a lightweight component that can process HTML DOM, CSS and JavaScript (HTML DOM alone is not good enough).

Page 5: .NET Managed HTML Rendering Engine with Multi-language script host

『.fair[dll=‘fair’]』 Target .NET Runtime Version

.NET Framework   HTML   CSS   JavaScript
1.0              N/A    N/A   N/A
1.1              N/A    N/A   N/A
2.0              ✓      ✓     ✓
3.0              ✓      ✓     ✓
4.0              ✓      ✓     ✓
4.5              ✓      ✓     ✓
4.5.1            ✓      ✓     ✓

Target Operating System

• Any WinNT platform that supports .NET Framework 2.0 or later can be supported.
• .NET Compact, .NET Mini and Mono are currently not supported (it may work on Mono, but there is no Mono support).
• Unless the jscript.dll engine is chosen, an IE installation is NOT required.
• .fair[dll=‘fair’] has its own cache management scheme, separate from Microsoft.Win32.WinInetCache.
• .NET 1.0 and .NET 1.1 are out of the mainstream support phase.

Windows 2000, Windows XP, Windows 7, Windows 8 (and 8.1)

Page 6: .NET Managed HTML Rendering Engine with Multi-language script host

『.fair[dll=‘fair’]』 Basic Architecture

100% from scratch!

[Architecture diagram; everything runs on the Common Language Runtime (2.0 – 4.5.1).]

CHtmlMultiversalWindowRenderer
• Renders the HTML document
• Processes UI events
• Controls CHtmlDocument

MultiversalWindow
• Almost identical to the ‘window’ object
• Hosts ScriptProcessor
• Acts as the global object for scripts

CHtmlDocument (XML, SVG Document)
• Parses HTML
• Layout
• Processes CSS
• Processes script

ScriptProcessor back ends shown: Jscript.dll; OSS scripting assemblies (JINT, Rhino, Nashorn); other languages (e.g. PHP, Perl, etc.)

Other components shown: CHtmlElement, StyleElement, StyleElementKeyClass, ActiveXWrapper (Flash, 3rd party), CHtmlContext (Canvas), XmlHttpRequest, DOMParser, MediaQuery, GeoLocation, Navigator, Screen, Collection, Event, Add-In

Page 7: .NET Managed HTML Rendering Engine with Multi-language script host

Supported HTML Tags

Most standard HTML 3.2, HTML 4 and HTML5 tags are supported and generate a CHtmlElement. (By default, undefined tags are processed as block elements within the HTML document.)

HTML4

<a><applet><b><base><big><blockquote><body><br><caption><dd><div><dl><dt><em><embed><font><form><h><head><hr><html><i><img><input><li><link><meta><nobr><noembed><object><ol><option><p><pre><s><script><select><small><span><strike><strong><sub><sup><table><tbody><td><textarea><tfoot><th><thead><title><tr><tt><u><ul><!DOCTYPE>

HTML5

<canvas><audio><video><source><track><bdi><aside><nav><ruby><rt><section><time><wbr><footer><header><progress><figure><output><datalist>

XML: any XML tags

SVG: all SVG tags, e.g. <svg><text><rect>

Page 8: .NET Managed HTML Rendering Engine with Multi-language script host

Rendering with Multiple Managed Threads

[Diagram labels: Request CHtmlDocument, Returns CHtmlDocument. Threads shown: UI Thread, I/O Thread, DOM Thread, CSS Thread, Image Thread, Script Thread.]

Sample markup shown in the diagram:

<HTML> <head> <link…> <script..> </head> <body> <canvas> <image> <video> <audio> </body> </HTML>

Page 9: .NET Managed HTML Rendering Engine with Multi-language script host

Choosing GDI+ or System.Windows.Control

• CHtmlMultiversalWindowRenderer is a System.Windows.Controls.Control.
• Most CHtmlElement classes (HTML tags) are drawn with GDI+/GDI.
• Control tags such as <IFrame>, <Input> and <button> host a managed control.
• <Object> or <Embed> creates an ActiveXObject if the ActiveXObject option is enabled.
• <svg> is partially supported.
• Canvas 2D is supported with the standard System.Drawing API set, e.g. DrawText, DrawLine, FillRectangle etc.

fair[dll=fair] basic design guidance

“Control resource is limited by OS.”

HTML can grow beyond 10,000 HTML/XML tags, so it is recommended to keep the number of controls as low as possible.

Page 10: .NET Managed HTML Rendering Engine with Multi-language script host

Multiple Script Engine Support

• IE script engine (jscript.dll)
  - through an IActiveScript wrapper.
• JINT (http://jint.codeplex.com/)
  - pure .NET-based interpreter (not a compiler).
  - simple access to .NET objects.
• Rhino (via IKVM) (https://developer.mozilla.org/ja/docs/Rhino)
  - originated from the Netscape/Mozilla ‘javagator’ project in 1997.
  - constantly improved for over 15 years.
  - supports legacy JavaScript modes with the setLanguageVersion() API.
  - supports JavaScript 1.7 from version 1.7R1.
  - supports interpreter mode and compile mode.
  - stable and widely used among Java users.
  - the default script engine of .fair[dll=‘fair’].

[About Nashorn] The current performance of Nashorn with IKVM is far worse than that of Rhino or JINT, so Nashorn support via IKVM is postponed until its performance is better than Rhino’s.

Page 11: .NET Managed HTML Rendering Engine with Multi-language script host

DOM API and AJAX Support
• getElementById(), getElementsByClassName(), getElementsByTagName(), getElementsByTagNameNS()
• createElement(), createTextElement(), createStyleSheet(), createDocumentFragment() etc.
• appendChild(), insertBefore(), replaceChild() etc.
• querySelector(), querySelectorAll(), matchesSelector()
• XMLHttpRequest(), DOMParser() (built on the System.Net.HttpWebRequest object) (XDomainRequest support has been dropped)
• image(), document.all(): some features only
• ArrayBuffer, IntxArray, UIntxArray, FloatxArray
• document.cookie, Element.attributes, Element.classList, Element.dataset
• window.localStorage, window.sessionStorage
• createElement(‘<iframe name=“abc”/>’)
• postMessage(), window.onmessage
• new ActiveXObject(“TypeID”) (some object features only, e.g. XMLHTTP, XMLDOM, Shockwave etc.)
• setInterval(), setTimeout(), clearTimeout()
• getContext(), window.requestAnimationFrame()
• document.evaluate() is now supported through its own System.Xml.XPath.XPathNavigator (which is based on XPath 1.0)

CSS 3.0
• :nth-last-child(), :nth-last-of-type(), :only-child, :only-of-type (done; best-effort implementation), matches(), :not()

Prototype Extension Support
• HTML prototype objects such as window.Element, window.Event and window.HTMLElement are defined.

• Most of the HTML 3.2/4 standard API has been implemented.
• Most standard jQuery scripts can be compiled.
• Melon.js (HTML5 Canvas game framework) can be compiled.
• Utility scripts such as embed.js, core130.js, core131.js, chartbeat.js and modernizr (version 2.7.1) can also be compiled.
• prototype.js and mootools.js may compile (some versions only).
• Some utility scripts that call minor HTML5 APIs remain difficult to compile.
• Dynamic element CSS recalculation, triggered by a script altering an element’s class, is now supported.
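The ArrayBuffer and typed-array entries in the API list refer to the standard JavaScript typed-array objects. A minimal sketch of the kind of script that support is meant to run, using only standard JavaScript and no engine-specific API; the helper name packSamples is hypothetical and does not come from .fair[dll=‘fair’]:

```javascript
// Illustrative only: standard typed-array usage, not code from the engine.
// packSamples is a hypothetical helper name.
function packSamples(values) {
  // One raw buffer, 4 bytes per 32-bit float.
  const buffer = new ArrayBuffer(values.length * 4);
  const floats = new Float32Array(buffer); // float view of the buffer
  const bytes = new Uint8Array(buffer);    // byte view over the same memory
  floats.set(values);
  return { buffer, floats, bytes };
}

const packed = packSamples([1, 0.5, -2]);
console.log(packed.bytes.length); // 12
```

Both views share one buffer, so a write through either view is visible through the other. A script host has to preserve that aliasing for scripts that use typed arrays to behave as they do in a browser.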

Page 12: .NET Managed HTML Rendering Engine with Multi-language script host

Rendering Result (Beta)

Page 13: .NET Managed HTML Rendering Engine with Multi-language script host

About Layout Engine…
• Unlike the sophisticated layout engines of standard browsers (e.g. Chrome, Firefox or IE), the layout engine of fair[dll=fair] performs only a single-phase element layout.
• If a left float follows a centered block, the layout can get mixed up.
• fair[dll=‘fair’] can now perform dynamic CSS recalculation from JavaScript; however, the actual layout may not work as expected.
• Currently, 30% of commercial news web sites are rendered in ‘relatively good’ condition.

A multiple-phase layout scheme may be introduced in the future…

Our current priority (due to our limited resources) is:

Script handling > performance > layout

[Diagram labels: Center, Left]

Page 14: .NET Managed HTML Rendering Engine with Multi-language script host

Hardware Requirement
• .fair[dll=‘fair’] can run on a low-end CPU (e.g. Celeron or Pentium) whose CPU PassMark score is below 1000-2000 points.
• Because of the system design, a multi-core CPU is strongly recommended.
• .fair[dll=‘fair’] is designed mainly for broadband rather than narrowband (less than 500 Kbps) networks. 1 Mbps or higher is recommended.
• If the network speed is below 200 kbps, rendering of script-rich and content-rich web sites tends to slow down noticeably.

Even an old low-spec CPU (such as a 1.7 GHz Dothan Pentium M) can run this rendering engine as long as the network is fast enough.

Page 15: .NET Managed HTML Rendering Engine with Multi-language script host

Multimedia Support for HTML5 <Video> <Audio> Tags

・ The HTML5 Video and Audio elements are supported through the Windows Media API.

Page 16: .NET Managed HTML Rendering Engine with Multi-language script host

Canvas 2D with JavaScript - Part 1

• Fundamental two-dimensional Canvas drawing scripts now work.

Page 17: .NET Managed HTML Rendering Engine with Multi-language script host

Canvas 2D with JavaScript - Part 2

Page 18: .NET Managed HTML Rendering Engine with Multi-language script host

Canvas 2D with JavaScript - Part 3

• The drawback is performance. Pixel array manipulation scripts in particular tend to be slow.
• Animation performance is, as you would expect, worse than in a normal browser (roughly 1/3 for 2D).
• Complex canvas games such as ‘Full Screen Mario’ and ‘Gradius’ are not yet playable.
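“Pixel array manipulation” here means scripts that loop over the RGBA byte array a Canvas 2D context exposes as ImageData.data. The sketch below shows such a loop in plain JavaScript. It builds the array directly so that it runs without a canvas; the function name toGrayscale and the Rec. 601 luma weights are choices made for this illustration, not taken from the slides:

```javascript
// Illustrative only: an RGBA buffer laid out like ImageData.data
// (4 bytes per pixel). toGrayscale is a hypothetical name.
function toGrayscale(data) {
  for (let i = 0; i < data.length; i += 4) {
    // Rec. 601 luma weights, chosen for the example.
    const y = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
    // Uint8ClampedArray rounds and clamps each stored value to 0..255.
    data[i] = data[i + 1] = data[i + 2] = y;
    // data[i + 3] (alpha) is left unchanged.
  }
  return data;
}

// A 2x1 image: one pure-red pixel and one white pixel, both opaque.
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 255, 255, 255, 255]);
toGrayscale(pixels);
```

A loop like this touches every byte of every frame, so the script engine's per-element overhead is multiplied by the pixel count.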

Page 19: .NET Managed HTML Rendering Engine with Multi-language script host

Canvas WebGL/3D Progress

• Approximately 95% of the WebGL-related API exists.
• ‘Three.js’ can be compiled.
• Implementation of the visualization effects is in progress.

Page 20: .NET Managed HTML Rendering Engine with Multi-language script host

As Canvas WebGL Alternative…

“phoria.js” is an “excellent” JavaScript library for simple 3D graphics on a canvas 2D renderer. It is still slow, but it works on “.fair[dll=‘fair’]”.

Page 21: .NET Managed HTML Rendering Engine with Multi-language script host

DOM Structure, CSS, and JavaScript Viewer

DOM Tree View

Script Compilation Result View

CSS Process View

Page 22: .NET Managed HTML Rendering Engine with Multi-language script host

Current JavaScript Script Processing Update – Part 1

200: Success
400, 500x: Script Error

The picture on the left shows the result view after visiting money.cnn.com. Most JavaScript compilation succeeded [Status: 200], including jQuery 1.5.1. However, one script still reports a script error [Status: 500]. The script error ratio depends on the complexity of the web page.

200: all Success

Page 23: .NET Managed HTML Rendering Engine with Multi-language script host

Current JavaScript Script Processing Error Count on Well-Known Web Sites

Tested against the build of 2014/09/30.

Site Name                    URL                                     Script Count  Error Count
Computer World               http://www.computerworld.com            74            1
NBC                          http://www.nbc.com                      50            1
PC World                     http://www.pcworld.com                  75            2
Yahoo                        http://www.yahoo.com                    20            0
Youtube                      http://www.youtube.com                  17            1
New York Times               http://www.nytimes.com                  23            0
China news                   http://www.chinanews.com/               69            0
Wired.com                    http://www.wired.com/                   71            0
Innovation Excellence        http://www.innovationexcellence.com/    50            2
CIA                          http://www.cia.gov                      16            0
Code Project                 http://www.codeproject.com              15            0
Wikipedia English Main Page  http://en.wikipedia.org/wiki/Main_Page  13            1
Stackoverflow                http://stackoverflow.com                21            0
Recode.net                   http://recode.net                       88            1
San Jose Mercury News        http://www.mercurynews.com/             104           2
SFGate                       http://www.sfgate.com                   74            1
Xconomy                      http://www.xconomy.com                  66            1
Bokete                       http://bokete.jp                        19            0
Passmark home                http://www.passmark.com                 6             0
Toms hardware home           http://www.tomshardware.com/            62            0
Computer Shopper             http://www.computershopper.com/         57            0
Amazon com                   http://www.amazon.com                   70            1
Bloomberg Businessweek       http://www.businessweek.com             49            0

Note 1) Async or deferred scripts, and element onload scripts (e.g. img), may not be counted in ‘script count’.
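The per-site counts above can be combined into one overall error ratio. A small sketch: the counts array transcribes the Script Count and Error Count columns in row order, and the helper name errorRatio is hypothetical. Since async, deferred and onload scripts may be missing from the script count (Note 1), the result is only a rough indicator.

```javascript
// [scriptCount, errorCount] pairs transcribed from the table, in row order.
const counts = [
  [74, 1], [50, 1], [75, 2], [20, 0], [17, 1], [23, 0], [69, 0], [71, 0],
  [50, 2], [16, 0], [15, 0], [13, 1], [21, 0], [88, 1], [104, 2], [74, 1],
  [66, 1], [19, 0], [6, 0], [62, 0], [57, 0], [70, 1], [49, 0],
];

// Sums both columns and returns the share of scripts that ended in an error.
function errorRatio(rows) {
  const scripts = rows.reduce((sum, [s]) => sum + s, 0);
  const errors = rows.reduce((sum, [, e]) => sum + e, 0);
  return { scripts, errors, ratio: errors / scripts };
}

const total = errorRatio(counts);
console.log(`${total.errors} errors in ${total.scripts} scripts ` +
            `(${(total.ratio * 100).toFixed(2)}%)`);
```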

Page 24: .NET Managed HTML Rendering Engine with Multi-language script host

Latest Build Page Load Performance Result

HTML DOM + CSS with/without JavaScript Processing

Google Top Page (http://www.google.com)
  DOM + CSS: Before 280 - 400 ms; New 78 - 110 ms
  DOM + CSS + JavaScript: Before 395 - 402 ms; New 85 - 125 ms

CNN (http://edition.cnn.com/)
  DOM + CSS: 2300 - 3200 ms
  DOM + CSS + JavaScript: Before 5534 - 7750 ms; New 3050 - 3900 ms

USA Today (http://www.usatoday.com/)
  DOM + CSS: Before 2100 - 2500 ms; New 850 - 1100 ms
  DOM + CSS + JavaScript: Before 1500 - 3500 ms; New 900 - 1400 ms

Computerworld (http://www.computerworld.com)
  DOM + CSS: 600 - 1300 ms
  DOM + CSS + JavaScript: Before 9600 - 26000 ms; New 3200 - 4300 ms

ChinaView Cn (http://www.chinaview.cn/)
  DOM + CSS: New 1300 - 2500 ms
  DOM + CSS + JavaScript: New 2500 - 4800 ms

Information Week (http://www.informationweek.com)
  DOM + CSS: New 1300 - 1800 ms
  DOM + CSS + JavaScript: Before 10500 - 33600 ms; New 5100 - 5900 ms

ZDNET (http://www.zdnet.com)
  DOM + CSS: New 770 - 950 ms
  DOM + CSS + JavaScript: New 2500 - 4500 ms

Wikipedia (http://en.wikipedia.org/wiki/Main_Page)
  DOM + CSS: New 650 - 750 ms
  DOM + CSS + JavaScript: New 1300 - 1700 ms

Yahoo (US) (http://www.yahoo.com)
  DOM + CSS: Old 3300 - 4500 ms; 2013/12: 500 - 2000 ms
  DOM + CSS + JavaScript: Before 5900 - 12000 ms; 2013/12: 935 - 5800 ms

Testing environment: Pentium G640 2.8 GHz, 2 cores, 8 GB RAM, Windows 8, Rhino JavaScript engine with .NET Framework 4.5. Some scripts may fail to execute. Web data may be cached locally or remotely.

On average, a 30 - 60% performance improvement!
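The slide does not say how the 30 - 60% figure was derived. As one plausible reading, the sketch below compares the midpoints of a “before” and a “new” range and reports the reduction in load time. The midpoint method and the function names are assumptions made for illustration only.

```javascript
// Midpoint of a [min, max] range in milliseconds.
function midpoint([min, max]) {
  return (min + max) / 2;
}

// Fractional reduction in load time going from `before` to `after`.
function reduction(before, after) {
  return 1 - midpoint(after) / midpoint(before);
}

// DOM + CSS + JavaScript figures for USA Today from the results above.
const usaToday = reduction([1500, 3500], [900, 1400]);
console.log(`${(usaToday * 100).toFixed(0)}% shorter load time`);
```

Other summaries are equally defensible, for example comparing the worst cases only, and they would give different percentages for the same rows.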

Page 25: .NET Managed HTML Rendering Engine with Multi-language script host

Current Issues - a lot!
• Only a few events are currently supported.
  - onclick, onload, onreadystatechange, onmousemove, onkeydown, onkeypress, DOMContentLoaded, visibilitychange, onfocus etc.
• Poor layout engine output (an undefined width on a block element may result in unexpected layout).
• @font-face, web fonts and the related API set are under research.
• Hang or crash issues due to stack overflow.
• The HTML5 Canvas context API is in progress (currently, 70% - 80% of the 2D Canvas API is working).
• Visualization support for 3D/WebGL/WebCL through DirectX or OpenGL is needed.
• While Shockwave Flash works, the Silverlight plugin does not load properly.
• Multiple background images for one element are now supported on .fair[dll=‘fair’], but the actual rendering output has many issues.
• Incomplete Worker/SharedWorker threading objects.

Long way to go…

Page 26: .NET Managed HTML Rendering Engine with Multi-language script host

Not Supported Features (in the near future)
• Progressive rendering (currently the full document DOM must be loaded before anything is rendered).
• Live script debugging.
• <applet> needs a Java runtime, so <applet> is not supported (iPhone and Android do not support <applet> either).
• CSS expression()
• Browser-specific CSS hacks (“* html+xyz”) (*:first-child+etc)
• XSL DOM transformation
• Unpopular HTML5 APIs (Web FileSystem API, Microdata, Web SQL etc.).
• The Web Real-Time Communication API set (WebRTC etc.) is beyond the current .NET and hardware spec.

Page 27: .NET Managed HTML Rendering Engine with Multi-language script host

Our Mission
• Create an open-standard, lightweight, HTML5-compatible managed browser with multiple script languages by 2017.
• Open leadership in the HTML community.
• We are going to release .fair[dll=‘fair’] as a contribution to the “2011 Japan Tohoku Fukushima Earthquake Disaster 311 Recovery” and to express our gratitude to Ken Shama, a great pioneer of supply chain management.