Nyc storm meetup_robdoherty
robert-doherty
What is Outbrain?
Before Storm
● Custom distributed processing system
● Python and ZMQ
● Advantages:
  ○ Simple components
  ○ Well-understood
● Disadvantages:
  ○ Did not scale
  ○ Batch-processing
Kafka + Storm
● Kafka: high-throughput distributed messaging
● Storm: distributed, real-time computation
Why Kafka?
● Need method to buffer clicks into “stream”
● Kafka + Storm common pattern for click tracking
Why Storm?
● “Real time” (15s latency requirements)
● Fault tolerance
● Easy to manage parallelism
● Stream grouping
● Active community
● Open-source project
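Stream grouping decides which bolt task receives each tuple. A minimal sketch of the idea behind a fields grouping, written in plain Python rather than against Storm's API: tuples that share a key are hashed to the same task index, so per-key state stays on one worker. The function name `choose_task`, the field name `customer_id`, and the use of CRC32 are illustrative assumptions, not how Storm itself implements grouping.

```python
import zlib

def choose_task(key: str, num_tasks: int) -> int:
    """Map a grouping key to a task index in [0, num_tasks)."""
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_tasks

# Hypothetical click events; only the grouping field matters here.
clicks = [
    {"customer_id": "acme", "url": "/a"},
    {"customer_id": "globex", "url": "/b"},
    {"customer_id": "acme", "url": "/c"},
]

# Route each click to one of 4 tasks by its customer_id.
routed = [(choose_task(c["customer_id"], 4), c["url"]) for c in clicks]
```

Both "acme" clicks land on the same task index, which is the property a per-customer aggregation relies on.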
Architecture
[Diagram components: Customer Traffic, Elastic Load Balancer, Nginx Servers, Kafka Cluster, Storm Topology, MongoDB, Redis, Algo, API, AWS]
Kafka Cluster
● 40 Producers (8 m1.large instances)
  ○ Python brod
● 4 Brokers (4 m1.large instances)
10k Clicks per second (peak)
14B Clicks per month
Kafka v0.7.2
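As a rough consistency check on the two traffic figures above, the monthly volume can be converted into an average rate and compared with the stated peak. This is back-of-the-envelope arithmetic that assumes a 30-day month; the variable names are illustrative.

```python
CLICKS_PER_MONTH = 14_000_000_000       # "14B Clicks per month" from the slide
PEAK_CLICKS_PER_SECOND = 10_000         # "10k Clicks per second (peak)" from the slide
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # assumption: 30-day month

average_clicks_per_second = CLICKS_PER_MONTH / SECONDS_PER_MONTH
peak_to_average = PEAK_CLICKS_PER_SECOND / average_clicks_per_second

print(f"average: {average_clicks_per_second:.0f} clicks/s")
print(f"peak is {peak_to_average:.2f}x the average")
```

Under that assumption the average works out to roughly 5,400 clicks per second, so the stated peak is a little under twice the average load.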
Storm Topology
● 40 Supervisors (c1.xlarge instances)
● 35 Bolts, 1 Kafka spout
● 250+ Executors (worker threads)
160k+ tuples executed per second
Storm v0.8.2
Leiningen v1.7
Storm Topology
[Diagram nodes: Customer Traffic, Kafka Spout, Aggregate 15s, Aggregate 5m, Position, Customer, Social, Arrangement, Front Page, @Handle]
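The "Aggregate 15s" stage in the diagram suggests that clicks are counted in fixed 15-second windows. A minimal sketch of that idea in plain Python, not the talk's actual bolt code: each click's timestamp is floored to the start of its window, and a count is kept per (window, key). The `(timestamp, key)` event shape and the function name `aggregate_clicks` are hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 15  # matches the "Aggregate 15s" label; 300 would give a 5m window

def aggregate_clicks(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window_start, key) using fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, key in events:
        # Floor the timestamp to the start of the window that contains it.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return counts

# Hypothetical events: (seconds since some epoch, grouping key).
events = [
    (0.5, "front_page"),
    (14.9, "front_page"),
    (15.0, "front_page"),
    (16.2, "social"),
]
counts = aggregate_clicks(events)
```

In a running topology the counts for a window would be emitted downstream once the window closes; this sketch only shows the bucketing step.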
Challenges
● Shell Bolts
● Anchor Bolts/Replaying Stream
● Acking Tuples
● Monitoring
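Anchoring and acking are how Storm tracks whether a spout tuple has been fully processed. Storm's acker keeps one value per spout tuple and XORs tuple ids into it as tuples are anchored and again as they are acked; when the value returns to zero, the tuple tree is treated as complete. The sketch below simulates that bookkeeping in plain Python. The class and method names are illustrative and are not Storm's API.

```python
import random

class AckTracker:
    """Toy model of XOR-based ack tracking for a single spout tuple."""

    def __init__(self):
        self.ack_val = 0

    def emit(self):
        """Register a new tuple anchored to this tree and return its id."""
        # Random non-zero 64-bit id; zero would be invisible to the XOR.
        tuple_id = random.getrandbits(64) or 1
        self.ack_val ^= tuple_id
        return tuple_id

    def ack(self, tuple_id):
        """Mark a tuple as processed by XORing its id in a second time."""
        self.ack_val ^= tuple_id

    def is_complete(self):
        # Every id XORed in twice cancels out, leaving zero.
        return self.ack_val == 0

tracker = AckTracker()
root = tracker.emit()      # tuple from the spout
child_a = tracker.emit()   # tuples emitted by a bolt, anchored to root
child_b = tracker.emit()

tracker.ack(root)
tracker.ack(child_a)
pending = tracker.is_complete()   # child_b is still outstanding

tracker.ack(child_b)
done = tracker.is_complete()
```

If a bolt emits without anchoring or never acks, the value never returns to zero and the spout tuple eventually times out and is replayed, which is one way the "Replaying Stream" challenge can show up.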
Monitoring
● Scribe Logging
● Munin + Nagios
● JMX-JMXTrans + Ganglia
● Storm UI
● Thrift interface into Nimbus + D3
Future Plans
● Load testing
● Break topology into smaller pieces
● Move from AWS to private data center