Optimizing your cloud for millions of connections: A Case Study · 2012. 5. 25.
Optimizing your cloud for millions of connections: A Case Study
Zbyněk Šlajchrt
Overall architecture
DeliveryRate vs. Cost
• Max(DeliveryRate)
• Min(Cost)
• DeliveryRate ~ Servers*BandwidthPerServer
• Servers = ActiveClients/MaxConsPerServer
• Cost = ServerCost + NetworkCost
• ServerCost = CostPerServer*Servers
• CostPerServer ~ 1/Servers
• NetworkCost ~ MinimalDeliveryRate
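To make the sizing relations above concrete, here is a minimal sketch that evaluates two of them. The client count and per-server bandwidth are made-up inputs, and the proportionality constant in the delivery-rate relation is assumed to be 1. It only shows how raising MaxConsPerServer shrinks the number of servers, and with it the aggregate bandwidth.

```python
def servers_needed(active_clients: int, max_cons_per_server: int) -> int:
    """Servers = ActiveClients / MaxConsPerServer, rounded up to whole machines."""
    return -(-active_clients // max_cons_per_server)  # integer ceiling division


def delivery_rate(servers: int, bandwidth_per_server: float) -> float:
    """DeliveryRate ~ Servers * BandwidthPerServer (constant assumed to be 1)."""
    return servers * bandwidth_per_server


# Hypothetical inputs, chosen only for illustration.
ACTIVE_CLIENTS = 10_000_000
BANDWIDTH_PER_SERVER_MBIT = 1_000.0

for max_cons in (100_000, 1_000_000):
    n = servers_needed(ACTIVE_CLIENTS, max_cons)
    print(max_cons, n, delivery_rate(n, BANDWIDTH_PER_SERVER_MBIT))
```

Fewer, denser servers lower the server cost but also lower the achievable delivery rate, which is the trade-off the slide states.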
Pull vs. Push
• Pull
+ Existing distribution technology, simple
- Danger of SYN flood, minimal control over connected clients
• Push
+ Better control over clients, kept-alive connections => fewer SYNs
- Unproven technology, more complex
- How many connections can we keep on one server? Less than 1M => Pull, otherwise Push
System Configuration
• CentOS 6
• Max open file descriptors (startApp.sh)
    ulimit -n 3000000
• System wide settings for file descriptors (/etc/sysctl.conf)
    fs.nr_open = 6291456
    fs.file-max = 6291456
• System wide limit for open files per user (/etc/security/limits.conf)
    * hard nofile 6291456
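Each open connection consumes one file descriptor, so it helps to confirm that the settings above actually took effect for the running process. The sketch below is one way to check this from Python on a Linux host. The helper names `fd_limits` and `can_hold` are hypothetical, and the `/proc/sys/fs` entries are read only if present.

```python
import resource
from pathlib import Path
from typing import Dict


def fd_limits() -> Dict[str, int]:
    """Collect the per-process and system-wide file descriptor limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    limits = {"soft_nofile": soft, "hard_nofile": hard}
    for name in ("nr_open", "file-max"):
        path = Path("/proc/sys/fs") / name
        try:
            limits["fs." + name] = int(path.read_text().split()[0])
        except (OSError, ValueError, IndexError):
            pass  # not Linux, or the entry is not readable here
    return limits


def can_hold(connections: int, limits: Dict[str, int]) -> bool:
    """True if the soft limit allows this many simultaneously open sockets."""
    soft = limits["soft_nofile"]
    return soft == resource.RLIM_INFINITY or soft >= connections


if __name__ == "__main__":
    current = fd_limits()
    print(current)
    print("room for 3,000,000 connections:", can_hold(3_000_000, current))
```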
Technologies
Netty
• Asynchronous event-driven network application framework written in Java by Trustin Lee
• Very good performance thanks to NIO and underlying epoll or kqueue kernel functions
• Easy programming model
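Netty itself is a Java framework, so the following is not Netty code. It is a small standard-library Python sketch of the same readiness-based, event-driven idea: the selector (epoll or kqueue where the platform offers them) reports which sockets are readable, and a registered handler is invoked for each. The handler name `echo_upper` and the socket pair are illustrative assumptions.

```python
import selectors
import socket

# DefaultSelector picks the most efficient mechanism available
# (epoll on Linux, kqueue on BSD/macOS, otherwise poll/select).
sel = selectors.DefaultSelector()


def echo_upper(conn: socket.socket) -> None:
    """Handler invoked when the socket becomes readable."""
    data = conn.recv(1024)
    if data:
        conn.sendall(data.upper())


# A connected socket pair stands in for a real client connection.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)
sel.register(server_side, selectors.EVENT_READ, data=echo_upper)

client_side.sendall(b"ping")

# One turn of the event loop: dispatch every ready socket to its handler.
for key, _events in sel.select(timeout=1.0):
    key.data(key.fileobj)

reply = client_side.recv(1024)
print(reply)

sel.unregister(server_side)
server_side.close()
client_side.close()
sel.close()
```

A single thread can serve many sockets this way because it only touches the ones the kernel reports as ready.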
Google Protocol Buffers
• Mechanism for serializing structured data
• Language neutral
• Platform neutral
• Java, C++, Python
• Very easy to grasp
• Suitable for specifications (SRS)
Components
Key components
• Comet Session Handler
• Distributor
• Commands Queue Store
• Bandwidth Limiter
• Client module
Comet Session Handler
• Netty handler managing comet sessions
• Creates, registers and destroys comet sessions
• Decodes command results from HTTP requests
• Encodes commands as HTTP responses
Distributor
• Responsible for continuous delivery of commands to clients (scheduled thread)
• Loops over all currently connected sessions
• For each session selects commands to be sent
• Uses client’s Queue Status held in session
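The distributor loop described above can be sketched roughly as follows. This is not the original Java implementation. `Session`, `distribute_once`, the `outbox` list standing in for the network channel, and the single hard-coded queue are all assumptions made for illustration. In the real system a scheduled thread would run such a pass repeatedly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Session:
    client_id: str
    # Queue Status: queue priority -> last command ID the client has seen.
    queue_status: Dict[int, int] = field(default_factory=dict)
    # Stands in for the network channel of a connected client.
    outbox: List[int] = field(default_factory=list)


def distribute_once(sessions: List[Session],
                    select: Callable[[Dict[int, int]], List[int]]) -> int:
    """One distributor pass: visit every connected session and send what it lacks."""
    sent = 0
    for session in sessions:
        for command_id in select(session.queue_status):
            session.outbox.append(command_id)
            sent += 1
    return sent


def select_from_queue(status: Dict[int, int]) -> List[int]:
    """Hypothetical store: one queue (priority 2) holding command IDs 1..3."""
    last_seen = status.get(2, 0)
    return [cid for cid in (1, 2, 3) if cid > last_seen]


sessions = [Session("a"), Session("b", {2: 2})]
sent = distribute_once(sessions, select_from_queue)
print(sent, [s.outbox for s in sessions])
```

Keeping the selection logic behind a callable keeps the loop itself trivial, which matters when it has to walk a very large number of sessions on every pass.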
Commands Queue Store
• Storage for commands to be distributed
• Populated by the Viruslab
• Commands are stored in one or more queues
• Every queue has a priority ordinal (zero-based)
• Client holds and sends the Queue Status
– List of last command IDs for every queue (used)
– Used by the Distributor to select commands for a client
Command Scopes
• Priority 0 – request scope
– Not kept in the client’s Queue Status => Command with the same ID can be delivered multiple times
– Similar to JMS Topic (publish-subscribe) – commands are distributed only to the currently connected clients
• Priority 1 – session scope
– Kept in the Queue Status only while the application is running => No command is delivered twice as long as the application runs
• Priority >1 – persistent scope
– Last cmd IDs are kept persistently on the client
– No command is sent twice unless the client is reinstalled
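The three scopes can be illustrated with a small client-side model of the Queue Status. This is a sketch under two assumptions not stated on the slides: command IDs grow monotonically within a queue, and a command is due whenever its ID is greater than the last ID the client reported. The class and function names are hypothetical.

```python
from typing import Dict, Optional


class ClientQueueStatus:
    """Tracks the last command ID seen per queue, following the three scopes."""

    def __init__(self, persisted: Optional[Dict[int, int]] = None) -> None:
        # Survives application restarts (priority > 1).
        self.persistent: Dict[int, int] = dict(persisted or {})
        # Lost when the application restarts (priority 1).
        self.session: Dict[int, int] = {}

    def record(self, priority: int, command_id: int) -> None:
        if priority == 0:
            return  # request scope: nothing is remembered
        target = self.session if priority == 1 else self.persistent
        target[priority] = max(target.get(priority, 0), command_id)

    def report(self) -> Dict[int, int]:
        """Queue Status as it would be sent to the server."""
        return {**self.persistent, **self.session}


def should_deliver(priority: int, command_id: int, status: Dict[int, int]) -> bool:
    """Server-side check: deliver only commands newer than the reported ID."""
    return command_id > status.get(priority, 0)


status = ClientQueueStatus()
for prio in (0, 1, 2):
    status.record(prio, 7)  # the client received command 7 from each queue

# Simulate an application restart: only the persistent part is carried over.
restarted = ClientQueueStatus(persisted=status.persistent)
```

After the restart, command 7 from the priority 1 queue becomes deliverable again, while the one from the priority 2 queue stays suppressed.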
Bandwidth Limiter
• Protects kernel’s socket memory from exhaustion
• In case of big packets the distribution rate may exceed the bitrate of the NIC
• Limiter continuously computes the distribution rate and compares it with the predefined max bitrate
• Auto-regulating algorithm
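The slides do not spell out the auto-regulating algorithm, so the following is only one plausible shape for it, not the author's method. It measures the rate achieved in the last interval, compares it with the configured maximum, and shrinks or grows the number of commands allowed in the next interval. The class name, the batch-size knob and the 10% growth step are assumptions.

```python
class BandwidthLimiter:
    """Keeps the measured distribution rate under a predefined max bitrate."""

    def __init__(self, max_bits_per_sec: float, initial_batch: int = 1000,
                 min_batch: int = 1) -> None:
        self.max_bits_per_sec = max_bits_per_sec
        self.batch = initial_batch  # commands allowed per interval
        self.min_batch = min_batch

    def observe(self, bytes_sent: int, interval_sec: float) -> int:
        """Feed one measurement; return the batch size for the next interval."""
        rate = bytes_sent * 8 / interval_sec
        if rate > self.max_bits_per_sec:
            # Over the limit: scale down in proportion to the overshoot.
            scaled = int(self.batch * self.max_bits_per_sec / rate)
            self.batch = max(self.min_batch, scaled)
        else:
            # Under the limit: probe upward by roughly 10%.
            self.batch += max(1, self.batch // 10)
        return self.batch


limiter = BandwidthLimiter(max_bits_per_sec=1e9, initial_batch=1000)
after_overshoot = limiter.observe(bytes_sent=250_000_000, interval_sec=1.0)
after_recovery = limiter.observe(bytes_sent=50_000_000, interval_sec=1.0)
print(after_overshoot, after_recovery)
```

Throttling at the application level like this keeps unsent data from piling up in kernel socket buffers when large packets push the rate past what the NIC can drain.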
Client side
• Command interpreter
• DLL module written in C
• Integrated with Avast AV
• CURL library
Monitoring
JMX Aggregator
• From a defined set of servers
– Aggregates read-only attributes into one tabular JMX attribute
– Corresponding read-write attributes on servers are represented by one attribute on the aggregator
Aggregated read-only JMX attribute example:
HeapMemoryUsage.used[Node1]=18G
HeapMemoryUsage.used[Node2]=16G
HeapMemoryUsage.used[Node3]=19G
Aggregated read-write attribute example:
MaxThroughput = 40000
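The aggregation idea can be modelled without JMX at all. In the sketch below, plain dictionaries stand in for the per-server MBeans. Reading produces one row per node, as in the example above, and writing is assumed to fan the value out to every node, which is an interpretation of "represented by one attribute on the aggregator". The function names and the initial throughput values are hypothetical.

```python
from typing import Any, Dict

Nodes = Dict[str, Dict[str, Any]]


def read_aggregated(nodes: Nodes, attribute: str) -> Dict[str, Any]:
    """Read-only view: one row per node, keyed like 'attribute[node]'."""
    return {f"{attribute}[{name}]": attrs[attribute]
            for name, attrs in nodes.items()}


def write_aggregated(nodes: Nodes, attribute: str, value: Any) -> None:
    """Read-write attribute: a single write is applied to every node."""
    for attrs in nodes.values():
        attrs[attribute] = value


# Plain dicts stand in for the servers' MBeans.
nodes: Nodes = {
    "Node1": {"HeapMemoryUsage.used": "18G", "MaxThroughput": 30000},
    "Node2": {"HeapMemoryUsage.used": "16G", "MaxThroughput": 30000},
    "Node3": {"HeapMemoryUsage.used": "19G", "MaxThroughput": 30000},
}

write_aggregated(nodes, "MaxThroughput", 40000)
table = read_aggregated(nodes, "HeapMemoryUsage.used")
for key, value in table.items():
    print(f"{key}={value}")
```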
Zenoss
• Network management platform
• Based on Zope app server (Python)
• GPL v2
• Supports JMX
• Nice web interface
• Maintains history
SLOGAN
• Analyzer of log streams
• Command line tool
• SQL-like syntax (filtering, grouping, sorting)
• Events are POJOs
• Can be represented in Google Protobuf (very handy for Streaming Updates)
• Events are serialized and appended to a log file
SLOGAN - Examples
Usage:
tail -f app.log | slogan <query>
Query examples:
tstamp etype
tstamp -f etype=='NOP'
tstamp max(tstamp)-min(tstamp) -g reqID
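SLOGAN's exact semantics are not documented on the slides, so the following is an interpretation rather than a description of the tool. It shows in plain Python what the filter query and the grouped query above appear to compute, over a few made-up events whose field names follow the queries.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def filter_by_type(events: Iterable[dict], etype: str) -> List[int]:
    """Rough equivalent of: tstamp -f etype=='NOP'"""
    return [e["tstamp"] for e in events if e["etype"] == etype]


def duration_by_request(events: Iterable[dict]) -> Dict[str, int]:
    """Rough equivalent of: max(tstamp)-min(tstamp) -g reqID"""
    stamps: Dict[str, List[int]] = defaultdict(list)
    for e in events:
        stamps[e["reqID"]].append(e["tstamp"])
    return {req: max(ts) - min(ts) for req, ts in stamps.items()}


# Made-up events; a real log would carry serialized event objects.
events = [
    {"tstamp": 100, "etype": "REQ", "reqID": "r1"},
    {"tstamp": 130, "etype": "NOP", "reqID": "r1"},
    {"tstamp": 105, "etype": "REQ", "reqID": "r2"},
    {"tstamp": 185, "etype": "RES", "reqID": "r2"},
]

nop_stamps = filter_by_type(events, "NOP")
durations = duration_by_request(events)
print(nop_stamps, durations)
```

Read this way, the grouped query gives the elapsed time between the first and last event of each request.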
H2 Database
• Used as the storage for the commands
• Very fast in-memory database
• Both embedded and server modes
• JDBC API
• Nice web-based console