XML Data Binding: Encoding for High-Performance Content-Based Event Routing
Gail Kaiser
Phil Gross
Columbia University
Programming Systems Lab
Overview
PSL Intro
MEET Project
Encoding Conversion Efficiency
Encoding Size Efficiency
Encoding Classification Efficiency
Programming Systems Lab
“PSL conducts research on Web technologies, collaborative work, virtual worlds, process/workflow, extended transaction models, software development environments and tools, software engineering, information management, and distributed programming systems”
Lately, lots of XML stuff
PSL XML-related Research
FlexML: Flexible XML
– Open-ended XML streams that may include “new” tags
– Dynamic schema and semantics discovery and composition
XUES: XML-based Universal Event Service
– Event Packager: Data mining over XML structured data
– Event Distiller: XML event poset pattern matching
– Learning new application-domain events to recognize
DISCUS: Decentralized Information Spaces for Composition and Unification of Services
– Rapid and secure application composition using Web Services
– Trust Evolution: PGP Trust + KeyNote + real-world business
MEET
Multiply Extensible Event Transport
Content-based multicast routing
Must be efficient enough for embedded and high-performance applications
MEET Motivations
Personal Life Recorder (sensor oriented)
GroupWork Recorder (computer/DB oriented)
Parallel/Grid computing
Distributed simulation
Battlefield C4I
Last, but not least:
– Dissertation submission
Relationship to Other Work
Generally modeling communication like
[Figure: Machine A (Relational) and Machine B (XML)]
What actually goes over the line is an afterthought
But with N-way Internet-scale communication
– Millions of publishers and subscribers
We can (must!) do better than ASCII text…
– Line speed => ≈250 assembly instructions per packet
MEET Extensibility
Want to scale up, to millions of pubs and subs
Want to scale down, to embedded and wireless
No single solution satisfactory at all scales
Composed of hot-swappable subsystems
– Router, transports, clock/causality, types, etc.
Why Types
Event data is not just an opaque bag of bits
Subscriptions are Boolean functions over events
Type safety would be nice
What type system to use?
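A minimal sketch of the idea that subscriptions are Boolean functions over typed events. It is not MEET code: the `TemperatureEvent` type, its fields, the `Router` class, and the subscription names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TemperatureEvent:
    """Hypothetical typed event; the field names are illustrative only."""
    sensor_id: str
    celsius: float


# A subscription is a Boolean function over an event.
Subscription = Callable[[TemperatureEvent], bool]


class Router:
    """Toy content-based router: returns every subscriber whose predicate holds."""

    def __init__(self) -> None:
        self._subs: Dict[str, Subscription] = {}

    def subscribe(self, name: str, predicate: Subscription) -> None:
        self._subs[name] = predicate

    def route(self, event: TemperatureEvent) -> List[str]:
        # Routing depends on the event content, not on a destination address.
        return [name for name, pred in self._subs.items() if pred(event)]


router = Router()
router.subscribe("overheat", lambda e: e.celsius > 80.0)
router.subscribe("lab-sensors", lambda e: e.sensor_id.startswith("lab-"))

matched = router.route(TemperatureEvent(sensor_id="lab-7", celsius=91.5))
print(matched)
```

Because the event is a typed record and not an opaque bag of bits, a predicate that names a missing field can be caught by a type checker before any event is sent.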
Initial MEET Type Design
Initial design calls for supporting Java, C#, and XML Schema defined objects “out of the box”
XML Schema used as Ur-language/Esperanto for conversions
Subscriptions are arbitrary boolean functions on datatypes
XML Schema is not ideal ur-type
– Excessively complex, verbose, etc.
Encodings for Efficiency
Java, C#, XML, ASN.1 have well-defined but proprietary encodings for instances
Would be nice to have an independent encoding scheme with some desirable properties missing from the above
– Fast serialization/deserialization
– Elimination of redundant information from message sequences
– Data organized for rapid classification/routing
Conversion Efficiency
Need to get to and from wire format as fast as possible
Leverage homogeneity to eliminate unnecessary conversions, e.g., network byte order
ECho system from Eisenhauer et al., Georgia Tech
– Using “native data” for ultra-low latency
– Necessary for HPC
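One way to picture skipping unnecessary byte-order conversion is a "receiver makes right" scheme: the sender writes its native byte order plus a one-byte flag, and the receiver swaps only when the flag differs from its own order. The sketch below assumes a made-up record layout and flag values; it is not ECho's actual wire format.

```python
import struct
import sys

LAYOUT = "id"  # hypothetical record: 32-bit int followed by 64-bit float
ORDER_CODES = {"little": 0, "big": 1}   # assumed flag values
ORDER_PREFIX = {0: "<", 1: ">"}         # struct byte-order prefixes


def encode_native(seq: int, value: float, byteorder: str = sys.byteorder) -> bytes:
    """Write the record in the sender's own byte order, prefixed by an order flag."""
    code = ORDER_CODES[byteorder]
    return bytes([code]) + struct.pack(ORDER_PREFIX[code] + LAYOUT, seq, value)


def decode(message: bytes):
    """Read the record; report whether a byte swap was needed on this machine."""
    code = message[0]
    seq, value = struct.unpack(ORDER_PREFIX[code] + LAYOUT, message[1:])
    swapped = code != ORDER_CODES[sys.byteorder]
    return seq, value, swapped


# Homogeneous case: sender and receiver share a byte order, so no swap.
same = encode_native(42, 3.5)
print(decode(same))

# Heterogeneous case: simulate a sender with the opposite byte order.
other = "big" if sys.byteorder == "little" else "little"
foreign = encode_native(42, 3.5, byteorder=other)
print(decode(foreign))
```

With a fixed network byte order, two little-endian machines would each pay for a swap; here the homogeneous pair pays for none.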
Size Efficiency
Ideal for single message is self-describing data
With multiple messages of same type, one can pull out redundant type info, e.g., schema
Goal is to go further: If 90% of content of messages is the same, generate a new subtype with fixed values
From self-describing to all-schema is a continuum
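The continuum can be illustrated with three encodings of the same event: fully self-describing, schema-based (field names sent once, out of band), and a derived subtype whose constant fields are fixed in the type itself. This is a sketch under simplifying assumptions: JSON stands in for the wire format, the sample events are invented, and a field counts as "fixed" only if it is identical in every sampled message, not in 90% of them.

```python
import json

# Hypothetical sample of messages of one type.
events = [
    {"source": "lab-7", "unit": "C", "kind": "temp", "value": 21.5},
    {"source": "lab-7", "unit": "C", "kind": "temp", "value": 21.7},
    {"source": "lab-7", "unit": "C", "kind": "temp", "value": 22.0},
]
schema = ("source", "unit", "kind", "value")


def self_describing(event):
    """Every message carries its own field names."""
    return json.dumps(event).encode()


def schema_based(event):
    """Field names live in the shared schema; only values travel."""
    return json.dumps([event[f] for f in schema]).encode()


def derive_subtype(sample, fields):
    """Split fields into those constant across the sample and those that vary."""
    first = sample[0]
    fixed = {f: first[f] for f in fields if all(e[f] == first[f] for e in sample)}
    varying = tuple(f for f in fields if f not in fixed)
    return fixed, varying


fixed, varying = derive_subtype(events, schema)


def subtype_based(event):
    """Only the varying fields travel; fixed values are part of the subtype."""
    return json.dumps([event[f] for f in varying]).encode()


def decode_subtype(payload):
    """Rebuild the full event from the subtype definition plus the payload."""
    return {**fixed, **dict(zip(varying, json.loads(payload)))}


sizes = [len(enc(events[0])) for enc in (self_describing, schema_based, subtype_based)]
print(sizes)
```

Each step moves information out of the message and into the agreed type, so the receiver needs the subtype definition to decode the payload.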
Classification Efficiency
When bits start arriving serially at the router, would like to begin cut-through routing as soon as possible
– Avoid the curse of IP/IPv6: source address first
Want key routing bits as close to the front as possible
Want data in fixed locations
Fast Classifying: First Things First
In the packet, type info first (after magic)
– Would like to represent type codes as bit string with “most significant” info, e.g., parent type first, followed by subtype identifier, sub-subtype, etc.
– Need access to type hierarchy
Popular classification fields at the front
– Need to tag with popularity metadata
– “subscribers will want to select on me”
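A sketch of hierarchical type codes, assuming a small invented type hierarchy: each type's code is its parent's code followed by a fixed-width child index, so a subscription to a type matches every subtype by prefix. The hierarchy, the code widths, and the function names are assumptions, not the MEET encoding.

```python
from typing import Dict, List

# Hypothetical type hierarchy: parent -> ordered list of children.
HIERARCHY: Dict[str, List[str]] = {
    "Event": ["Sensor", "Log"],
    "Sensor": ["Temp", "Accel", "Gps"],
    "Log": [],
    "Temp": [],
    "Accel": [],
    "Gps": [],
}


def assign_codes(root: str) -> Dict[str, str]:
    """Give each type a bit string: the parent's code, then its own child index."""
    codes = {root: ""}
    stack = [root]
    while stack:
        parent = stack.pop()
        children = HIERARCHY[parent]
        # Fixed width per level, wide enough to number all siblings.
        width = max(1, (len(children) - 1).bit_length())
        for index, child in enumerate(children):
            codes[child] = codes[parent] + format(index, f"0{width}b")
            stack.append(child)
    return codes


def matches(event_code: str, subscribed_code: str) -> bool:
    """An event matches a subscription to its own type or any ancestor type."""
    return event_code.startswith(subscribed_code)


codes = assign_codes("Event")
print(codes)
print(matches(codes["Gps"], codes["Sensor"]))
print(matches(codes["Log"], codes["Sensor"]))
```

Since the parent bits arrive first, a router can rule out whole subtrees of subscriptions after reading only the leading bits of the type code.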
Fast Classifying: Fixed Positions
Would like to avoid scanning through long or variable-length fields
Long/Variable data needs to be in a separate channel/section
Primitives and fixed-length references at the front
– References point into data section
– Classifier can jump over large, uninteresting data quickly
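The layout described above might look like the following sketch: a fixed-size header holding primitives and an (offset, length) reference, followed by a separate data section. The magic value, field widths, and field names are hypothetical.

```python
import struct

MAGIC = 0x4D45  # hypothetical magic number

# Fixed-size header, big-endian:
#   magic (u16), type code (u16), priority (u32),
#   payload offset (u32), payload length (u32)
HEADER = struct.Struct(">HHIII")


def encode(type_code: int, priority: int, payload: bytes) -> bytes:
    """Place primitives and the payload reference up front, bulk data after."""
    offset = HEADER.size  # the data section starts right after the header
    return HEADER.pack(MAGIC, type_code, priority, offset, len(payload)) + payload


def classify(packet: bytes):
    """Read only fixed-position fields; the payload is never scanned."""
    magic, type_code, priority, _offset, _length = HEADER.unpack_from(packet, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    return type_code, priority


def payload_of(packet: bytes) -> bytes:
    """Follow the reference into the data section."""
    _magic, _type, _priority, offset, length = HEADER.unpack_from(packet, 0)
    return packet[offset:offset + length]


packet = encode(type_code=7, priority=3, payload=b"x" * 10_000)
print(classify(packet))
print(len(payload_of(packet)))
```

The cost of `classify` is the same for a 10-byte payload and a 10,000-byte one, which is the point of keeping variable-length data out of the classified region.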
Plus: Schema Format
We’d like the schema format to be amenable to programmatic manipulation and analysis
For instance, when negotiating formats, we’d like to be able to compute how our original format offer differs from the counter-offer
XML Schema is pretty good for this
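As a sketch of why a machine-readable schema helps negotiation, the code below parses two small XML Schema fragments with Python's standard `xml.etree.ElementTree` and reports how a counter-offer differs from the original offer. The schemas, element names, and the `diff` result shape are invented for illustration, and only flat `name`/`type` attributes are compared.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical format offer.
OFFER = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Reading">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="sensor" type="xs:string"/>
        <xs:element name="value" type="xs:double"/>
        <xs:element name="note" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Hypothetical counter-offer: drops "note", adds "seq", narrows "value".
COUNTER = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Reading">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="sensor" type="xs:string"/>
        <xs:element name="value" type="xs:float"/>
        <xs:element name="seq" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def fields(schema_text: str) -> dict:
    """Map each typed element name to its declared type."""
    root = ET.fromstring(schema_text.strip())
    return {
        el.get("name"): el.get("type")
        for el in root.iter(XS + "element")
        if el.get("type") is not None
    }


def diff(offer: str, counter: str) -> dict:
    """Summarize removed, added, and retyped fields between two formats."""
    a, b = fields(offer), fields(counter)
    return {
        "removed": sorted(a.keys() - b.keys()),
        "added": sorted(b.keys() - a.keys()),
        "retyped": {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]},
    }


changes = diff(OFFER, COUNTER)
print(changes)
```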
Conclusions
Efficient instance transfer is an interesting case for data-binding
Special needs for efficiency
But we can negotiate our own format among the communicating parties
Some explicit support for this in a general data-binding solution could help acceptance