Web-Services in C++
Maximilian Haupt || MUC++ || 2016/06/30
Building fast, memory-efficient and maintainable web-services in C++
Background
• Professional C/C++ developer since 2006
• VR, Real-Time Audio, Game-Engine, Big-Data, Machine-Learning
When talking about web-services
• Plain REST-like HTTP APIs talking JSON
• a.k.a. Micro-Services
• No fully fledged web-stack → Ruby on Rails, Django, MEAN, …
• No static file serving → nginx/apache
• No reverse proxy / load balancing → nginx/haproxy
• No SSL termination → nginx/haproxy
• No template rendering → frontend (Angular, React, …)
RTB detour
https://github.com/openrtb/OpenRTB/raw/master/OpenRTB-API-Specification-Version-2-3-1-FINAL.pdf
mbr targeting bidder wrap-up
• Started as a research project written in python
• Small RTBkit Intermezzo
• Later, rewritten in node.js
• Performance issues due to increasing traffic
rtb stack before
[Diagram: nginx, bidder, DB/Cache, Logging, Graphite; connections labeled HTTP]
rtb stack after
[Diagram: nginx, broker, bidder, DB/Cache, Logging, Graphite; connections labeled HTTP and ZeroMQ (CBFC)]
More than HTTP
• Proxygen: https://github.com/facebook/proxygen
• lwan: https://github.com/lpereira/lwan
• Wt: https://www.webtoolkit.eu/wt
• http-parser: https://github.com/nodejs/http-parser
• httpp: https://github.com/daedric/httpp - Hi Thomas! :)
• served: https://github.com/datasift/served
• libevhtp: https://github.com/ellzey/libevhtp
JSON
• JsonCpp https://github.com/open-source-parsers/jsoncpp
• RapidJSON https://github.com/miloyip/rapidjson
• Json11 https://github.com/dropbox/json11
• yajl https://lloyd.github.io/yajl/
• gason https://github.com/vivkin/gason
• ArduinoJson https://github.com/bblanchon/ArduinoJson
For a more complete list, e.g.: https://github.com/miloyip/nativejson-benchmark
TODO: put some nice broker graphs here
[Broker diagram. Labels: Request, Response, Adx/Adscale/…, Augmentation, Routing, Timeout, JSON, Dispatch, ZeroMQ + JSON, Sanity Checks, (expensive stuff)]
Linux, C++14 (g++ 4.8), boost::asio, httpp, JsonCpp, ZeroMQ, tcmalloc, Intel TBB
Btw: use latest JsonCpp and link it statically!
Broker evolution
• Augmentation with additional data
• Logging to Kafka
• QoS, load balancing, rate limiting
• Data scientists love data, i.e. traffic
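As an illustration of the rate-limiting bullet, here is a minimal token-bucket sketch. It is not the broker's actual implementation; the class name, the parameters, and the choice of a token bucket are assumptions. Time is passed in explicitly so the behaviour stays deterministic.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical token-bucket limiter: refills `rate_per_sec` tokens per second,
// holds at most `burst` tokens, and spends one token per admitted request.
class TokenBucket {
public:
    TokenBucket(double rate_per_sec, double burst)
        : rate_(rate_per_sec), capacity_(burst), tokens_(burst), last_(0.0) {}

    // now_sec must come from a monotonic clock (assumption of this sketch).
    bool allow(double now_sec) {
        // Refill according to elapsed time, capped at the bucket capacity.
        tokens_ = std::min(capacity_, tokens_ + (now_sec - last_) * rate_);
        last_ = now_sec;
        if (tokens_ < 1.0) return false;  // bucket empty: reject the request
        tokens_ -= 1.0;
        return true;
    }

private:
    double rate_;
    double capacity_;
    double tokens_;
    double last_;
};
```

With a rate of 1 request/s and a burst of 2, two requests at t=0 pass, a third is rejected, and one more passes at t=1.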
JSON love & hate
Schemas to the rescue
• JsonSchema helps. a bit. maybe
• Typos are still an issue
• Knowledge about types still spread throughout the code
• Still accessing json members through strings
• Start writing your own serialization
• Structs plus hand-crafted de-/serialization code
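A minimal sketch of what "structs plus hand-crafted serialization" can look like. The struct and its field names are made up for illustration and are not taken from the talk. Note that the member names still live in string literals, so a typo there goes unnoticed by the compiler.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical message type; the fields are illustrative only.
struct Bid {
    std::string id;
    double price;
};

// Escape characters that must not appear raw inside a JSON string.
std::string escape_json(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters become \u00XX escapes.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Hand-written serializer. Assumes `price` is finite; NaN or infinity
// would produce output that is not valid JSON.
std::string to_json(const Bid& bid) {
    std::ostringstream os;
    os << "{\"id\":\"" << escape_json(bid.id) << "\",\"price\":" << bid.price << '}';
    return os.str();
}
```

Every new field needs a matching line in `to_json` (and in the parser), which is the maintenance cost the next slides try to remove.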
Protobuf to the rescue!!!11elf
Protobuf to the rescue!!!11elf – generated Classes
Protobuf to the rescue!!!11elf – Reflection
Protobuf to the rescue!!!11elf – Serialization
Protobuf to the rescue!!!11elf – Message::Clear()
Protobuf to the rescue!!!11elf – Documentation
• Thanks Google: openrtb.proto (github.com/google/openrtb)
Json to protobuf and back
• json-protobuf mapper using the generated reflection
• https://github.com/yinqiwen/pbjson (RapidJSON based)
• https://github.com/shramov/json2pb (jansson based)
• https://github.com/shafreeck/pb2json (jansson based)
• https://github.com/bivas/protobuf-java-format (Java)
• https://github.com/mafintosh/protocol-buffers (node.js)
• → Single place for schema validation and performance improvements
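The idea of a single, table-driven mapping point can be sketched without protobuf. In the sketch below a hand-written field table plays the role that the generated reflection plays in the mappers listed above. The `Device` struct, the field table, and the pre-decoded `Member` pairs are all assumptions made for illustration; none of this is protobuf API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical target type; stands in for a generated protobuf class.
struct Device {
    std::string ua;
    std::string ip;
};

// One already-decoded JSON member (string values only, for brevity).
using Member = std::pair<std::string, std::string>;

// Field table: the one place that knows which keys exist and where they go.
// With protobuf this information would come from the message descriptor.
const std::map<std::string, std::string Device::*>& device_fields() {
    static const std::map<std::string, std::string Device::*> fields = {
        {"ua", &Device::ua},
        {"ip", &Device::ip},
    };
    return fields;
}

// Copies known members into `out`; stops with an error on an unknown key,
// which is how a typo in the input gets caught in one place.
bool map_members(const std::vector<Member>& members, Device& out, std::string& error) {
    for (const auto& m : members) {
        auto it = device_fields().find(m.first);
        if (it == device_fields().end()) {
            error = "unknown field: " + m.first;
            return false;
        }
        out.*(it->second) = m.second;
    }
    return true;
}
```

Because validation lives in `map_members` alone, tightening the rules or speeding up the lookup does not touch the rest of the code.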
Json to protobuf and back
Drop the bass response times
Now ~3 billion requests/day
Let it sink
…
Use protobuf for all data structures!
http://knowyourmeme.com/memes/x-all-the-y
Sky is the limit
• Still more memory used than needed (raw buffer + json + protobuf)
• Use event-based json parser (e.g. yajl or RapidJSON) for memory-efficiency
• Get rid of reflection by also generating parser code
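To make the event-based idea concrete, here is a small sketch: the parser pushes events into a handler, and the handler writes straight into the target struct, so no intermediate JSON tree is ever built. The `JsonEvents` interface, the `Impression` struct, and the toy driver are assumptions for illustration; this is not the API of yajl or RapidJSON, and the driver only handles flat objects with unescaped strings and numbers.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical event interface, loosely modelled on SAX-style parsers.
struct JsonEvents {
    virtual ~JsonEvents() = default;
    virtual void on_key(const std::string& key) = 0;
    virtual void on_string(const std::string& value) = 0;
    virtual void on_number(double value) = 0;
};

struct Impression {  // illustrative target struct
    std::string id;
    double bidfloor;
};

// Keeps only the fields it cares about; everything else is dropped on the fly.
class ImpressionHandler : public JsonEvents {
public:
    explicit ImpressionHandler(Impression& out) : out_(out) {}
    void on_key(const std::string& key) override { key_ = key; }
    void on_string(const std::string& v) override { if (key_ == "id") out_.id = v; }
    void on_number(double v) override { if (key_ == "bidfloor") out_.bidfloor = v; }
private:
    Impression& out_;
    std::string key_;
};

// Toy driver for flat objects like {"id":"1","bidfloor":0.5}.
// No escapes, no nesting; strtod is more lenient than the JSON grammar.
bool parse_flat_object(const std::string& s, JsonEvents& h) {
    std::size_t i = 0;
    auto skip_ws = [&] {
        while (i < s.size() && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
    };
    auto read_string = [&](std::string& out) {
        if (i >= s.size() || s[i] != '"') return false;
        std::size_t end = s.find('"', i + 1);
        if (end == std::string::npos) return false;
        out = s.substr(i + 1, end - i - 1);
        i = end + 1;
        return true;
    };
    skip_ws();
    if (i >= s.size() || s[i] != '{') return false;
    ++i;
    skip_ws();
    if (i < s.size() && s[i] == '}') { ++i; skip_ws(); return i == s.size(); }
    for (;;) {
        std::string key, value;
        skip_ws();
        if (!read_string(key)) return false;
        h.on_key(key);
        skip_ws();
        if (i >= s.size() || s[i] != ':') return false;
        ++i;
        skip_ws();
        if (i < s.size() && s[i] == '"') {
            if (!read_string(value)) return false;
            h.on_string(value);
        } else {
            const char* begin = s.c_str() + i;
            char* end = nullptr;
            double d = std::strtod(begin, &end);
            if (end == begin) return false;  // not a number
            i += static_cast<std::size_t>(end - begin);
            h.on_number(d);
        }
        skip_ws();
        if (i < s.size() && s[i] == ',') { ++i; continue; }
        if (i < s.size() && s[i] == '}') { ++i; break; }
        return false;
    }
    skip_ws();
    return i == s.size();
}
```

Memory use is bounded by the target struct plus one key and one value at a time, independent of how many unused members the input carries.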
Generate json state machine from proto file
[Diagram: JSON state machine with states 0 to 8; transitions labeled {, }, key:"a", key:"b", key:"c", <number>, null, <string>]
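A hand-written sketch of the kind of state machine such a generator could emit for a tiny message with a numeric field "a" and a string field "b". The message layout, the state names, and the event methods are assumptions; this is not the code protog actually generates.

```cpp
#include <cassert>
#include <string>

// Hypothetical message: { "a": <number>, "b": <string> }
struct Message {
    double a;
    std::string b;
};

enum class State { Start, InObject, AfterKeyA, AfterKeyB, Done, Error };

// Each parser event drives exactly one transition. Unknown keys and values
// of the wrong type lead to Error, and Error is never left again.
class MessageParser {
public:
    explicit MessageParser(Message& out) : out_(out), state_(State::Start) {}

    void on_start_object() {
        state_ = (state_ == State::Start) ? State::InObject : State::Error;
    }
    void on_end_object() {
        state_ = (state_ == State::InObject) ? State::Done : State::Error;
    }
    void on_key(const std::string& k) {
        if (state_ != State::InObject) { state_ = State::Error; return; }
        if (k == "a")      state_ = State::AfterKeyA;
        else if (k == "b") state_ = State::AfterKeyB;
        else               state_ = State::Error;  // key not in the schema
    }
    void on_number(double v) {
        if (state_ == State::AfterKeyA) { out_.a = v; state_ = State::InObject; }
        else state_ = State::Error;  // number where none is expected
    }
    void on_string(const std::string& v) {
        if (state_ == State::AfterKeyB) { out_.b = v; state_ = State::InObject; }
        else state_ = State::Error;  // string where none is expected
    }
    bool ok() const { return state_ == State::Done; }

private:
    Message& out_;
    State state_;
};
```

Parsing and schema validation happen in the same pass, and no reflection lookup is needed because the field names are compiled into the transitions.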
protog
A protobuf json-parser generator!
https://github.com/0x7f/protog
$ protoc --cpp_out=. openrtb.proto
$ protog -p openrtb.proto -i openrtb.pb.h \
    -m com.google.openrtb.BidRequest
Will generate openrtb.pb.{cc,h} and bidrequest_parser.pb.{cc,h}
Disclaimer
• protog is not ready for production!
• Prototype which contains bugs
• Does not support self-referencing messages
• Many rough edges
Benchmark preface
• Never trust benchmarks other people present to you
• Highly recommend Brendan Gregg’s talks and blog: brendangregg.com
• All code is online: github.com/0x7f/cpp-meetup
[Bar chart: one bar per setup, x-axis from 0 to 3; bar values and unit not recoverable]
• httpp + rapidjson + validation
• node.js + ajv
• node.js + JSON
• nodejs (raw)
• httpp + pbjson
• httpp + rapidjson
• httpp + jsoncpp
• httpp + protog
• httpp + gason
• proxygen (raw)
• httpp (raw)
EC2 c4.4xlarge (14.04 LTS), 4 threads, wrk (c=200,t=12,d=60s), POST bidrequest.2.json (~1.5kb)
https://lisavandore.com/2015/04/17/8-popular-countertop-materials-the-pros-and-the-cons/
There is no silver bullet
• Serialization vs meat
• Large vs small JSON
• Multi-purpose backend vs single task
• Green field vs existing code
• Standalone vs embedded http server
• Normal vs high throughput
• Hard memory constraints
Summary
• Measure! Measure! Measure!
• Web-services are more than HTTP+JSON
• Mastering JSON is hard
• Schemas and generated code help
• C++ can be very fast when used correctly
• Which library/tool for which use-case
Thank you!