Data Serialization Using Google Protocol Buffers
DATA SERIALIZATION WITH GOOGLE PROTOCOL BUFFERS
By: William Kibira
What is Data Serialization?
● The process of translating a data structure and its object state into a format that can be stored in a memory buffer or file, or transported over a network.
● The end goal is that it can be reconstructed in another computer environment.
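A minimal sketch of that round trip, using JSON from the Python standard library as the wire format. The Account class and its fields are hypothetical, made up only for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Account:
    # Hypothetical object whose state we want to move between environments.
    owner: str
    deposit_money: int


def serialize(account: Account) -> bytes:
    """Translate the object state into bytes that can be stored or sent."""
    return json.dumps(asdict(account)).encode("utf-8")


def deserialize(data: bytes) -> Account:
    """Reconstruct an equivalent object from the bytes."""
    return Account(**json.loads(data.decode("utf-8")))


original = Account(owner="kibira", deposit_money=12345678)
wire_bytes = serialize(original)    # could be written to a file or a socket
restored = deserialize(wire_bytes)  # could happen on another machine
print(restored == original)
```

The receiving side only needs the bytes and an agreed format, not the sender's memory layout, language or operating system.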
Reasons Why We Do This
● Persist objects [store and later retrieve them]
● Perform Remote Procedure Calls
● Create distributed objects [CORBA, Java RMI, ICE]
Key Words
● Computer Environment
- Programming Languages
- Operating Systems
- Architectures and processors
● Platform Independent Solutions
Popular Platform Independent Solutions
● JSON and XML
● BSON and Binary XML
● Google Protocol Buffers, Thrift, Avro
Ref
http://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats
JSON AND XML
● Most popular
● Human-readable, to some extent
● Most web-based APIs use them by default
● Lots of generators for this stuff
How It Works
● You write an IDL [Interface Description Language]. Kind of like CORBA IDLs, but much cleaner and more flexible.
● Pass it through a C++ based code generator
● Get your boilerplate code in the language you specified
GOOGLE PROTOCOL BUFFERS
● This is a platform-independent, language-independent data serialization solution, similar to XML in structure but much smaller in size and easier to structure.
● In use at Google since 2001, open-sourced in 2008
JSON vs BINARY FORMATS
● JSON is darn easy to read. If you can read binary, you definitely need to see a doctor.
● JSON gets fat even on little data; binary is really compact. Example: {"deposit_money": "12345678"}
JSON:   '0x6d', '0x6f', '0x6e', '0x65', '0x79', '0x31', '0x32', '0x33', '0x34', '0x35', '0x36', '0x37', '0x38'
BINARY: '0x01', '0xBC614E'
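To put rough numbers on the comparison above, here is a small sketch that encodes the same value as JSON text and as a Protocol Buffers style varint field. It hand-rolls the encoding instead of using the protobuf library, and it assumes the value is stored as an integer in field number 1; both choices are for illustration only.

```python
import json


def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a base-128 varint, low 7 bits first."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # last byte, continuation bit clear
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Key is (field_number << 3) | wire_type, with wire type 0 for varints."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


json_bytes = json.dumps({"deposit_money": "12345678"}).encode("utf-8")
binary_bytes = encode_int_field(1, 12345678)  # field number 1 is an assumption

print(len(json_bytes), "bytes as JSON")
print(len(binary_bytes), "bytes as binary:", binary_bytes.hex(" "))
```

The binary form drops the field name entirely, since both sides already know from the schema what field 1 means.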
SPEED AT PARSING
● JSON is fairly fast, but binary is close to machine speed since it needs no text parsing.
FLOW
Schema / IDL -> C++ Code Generator -> C++, Java, Python, JavaScript -> Server / Client application bases
What does a Schema Look Like ?
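As a rough idea of what such a schema contains, here is a hypothetical proto2 schema for a chat message, held in a Python string so it can be inspected. The package, message and field names are invented for illustration and are not the schema from the original slide.

```python
import re

# Hypothetical schema: every field has a type, a name and a unique number.
CHAT_SCHEMA = """\
syntax = "proto2";

package demo.chat;

message ChatMessage {
  required string sender  = 1;
  required string text    = 2;
  optional int64  sent_at = 3;
}
"""


def field_numbers(schema: str) -> list:
    """Collect the numeric tags that identify each field on the wire."""
    return [int(n) for n in re.findall(r"=\s*(\d+)\s*;", schema)]


numbers = field_numbers(CHAT_SCHEMA)
print(numbers)
# Tags must be unique within a message; the names never go on the wire.
print(len(numbers) == len(set(numbers)))
```

Saved to a file such as chat.proto, a schema like this is the input to the code generator.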
How to Generate the Code
● Use the protocol buffer compiler (protoc), specifying the output language you want and the file.proto
● protoc -I=/DIR_TO_SCHEMA/ --LANGUAGE_out=OUTPUT_FOLDER/ /DIR_TO_SCHEMA/file.proto, where LANGUAGE is for example java, python or cpp
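The command above can be assembled programmatically. The sketch below only builds and prints the argument list; the directory and file names are placeholders, and running the command requires protoc to be installed.

```python
import shlex

# Output flags that protoc provides for three of its built-in generators.
OUT_FLAGS = {"cpp": "--cpp_out", "java": "--java_out", "python": "--python_out"}


def protoc_command(schema_dir, out_dir, proto_file, language):
    """Build the protoc argument list for one schema and one target language."""
    flag = OUT_FLAGS[language]
    return [
        "protoc",
        f"-I={schema_dir}",            # where to look for .proto files
        f"{flag}={out_dir}",           # where to write the generated code
        f"{schema_dir}/{proto_file}",  # the schema to compile
    ]


# Placeholder paths, chosen only for this example.
cmd = protoc_command("schemas", "generated", "chat.proto", "java")
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would run the compiler and leave the generated Java sources in the output folder.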
A Look in my Terminal
What is Inside My XX.java
SIZE COMPARISON
[Bar chart, scale 0 to 1000]
RMI   905
GPB   250
JSON  559
XML   836
Runtime Performance
           Server CPU AVG   Client CPU AVG   Time
Protobuf   30.0%            37.75%           01:19:48
JSON       20.0%            75.00%           04:44:83
XML        12.00%           80.75%           05:27:45
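Versioning
● This concerns backward and forward compatibility between older and newer versions of a Protocol Buffers schema
● Old server with new client, and vice versa
● Even if fields have been added or removed, the data will still be parsed, provided existing field numbers keep their meaning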
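The reason an old reader survives a newer message is that every field on the wire carries its number and wire type, so unknown fields can be skipped. The sketch below hand-rolls a tiny subset of the wire format (varint and length-delimited fields only) to show the idea; the field names and numbers are hypothetical, and real applications would use the generated classes instead.

```python
def encode_varint(n):
    out = bytearray()
    while True:
        low7, n = n & 0x7F, n >> 7
        out.append(low7 | 0x80 if n else low7)
        if not n:
            return bytes(out)


def decode_varint(buf, pos):
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_message(fields):
    """fields maps field number -> int (wire type 0) or bytes (wire type 2)."""
    out = bytearray()
    for number, value in fields.items():
        if isinstance(value, int):
            out += encode_varint(number << 3 | 0) + encode_varint(value)
        else:
            out += encode_varint(number << 3 | 2) + encode_varint(len(value)) + value
    return bytes(out)


def decode_message(buf, known):
    """known maps field number -> name; fields not listed are skipped."""
    pos, message = 0, {}
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:
            message[known[number]] = value
    return message


# A newer client sends field 3, which the older server has never heard of.
new_client_bytes = encode_message({1: b"kibira", 2: 12345678, 3: b"new-field"})
old_server_view = decode_message(new_client_bytes, {1: "sender", 2: "amount"})
print(old_server_view)
```

The old server still gets the fields it understands, which is why field numbers should never be reused for a different meaning.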
Alternatives to Protocol Buffers
● MessagePack [many languages, including .NET]
● Thrift [Facebook]
● Avro
Reasons To use Protocol Buffers
● They are smaller to push around over networks
● Easier [if not easiest] to structure
● Give a sense of object-oriented structuring
Reasons Not to Use Them
● Well, you will have to maintain both the server and the clients.
● In most cases they may not be easy to learn
● They are not an industry standard.
● I am just trying to be fair here :)
SIMPLE DEMO CHAT APPS
● Simple chat application working on both desktops and laptops, and also across different operating systems
● Partial Inspiration from the Fifth Estate
THE END
Links to Check Out
● Google Protocol Buffers Main Page
https://developers.google.com/protocol-buffers/
● Apache Thrift
https://thrift.apache.org/