Server Monitoring (Scaling while bootstrapped)
By Ajibola Aiyedogbon
About me
Co-founder, Amebo App
Mobile Developer (Jobberman, GTBank WP, etc)
DevOps enthusiast
Before before...
1 server for everything
-1 users
J2ME only
What throughput!
Cloudinary as CDN
Deployment fails
High costs, ignorance is very expensive
Now
5+ servers
Hundreds of thousands of users
Multi Platform apps
18000 req/min throughput
Cloudflare as CDN
Deployments with zero downtime
Managed costs
Scaling rhymes with Failing!
Server Stack
1 load balancer (layer 4, high availability, failover*)
3 web servers (vertically & horizontally scalable)
1 database server (replication*, redundancy*)
1 staging server
$65 monthly serving over 100 million requests
Cloudflare is the secret weapon: it caches static requests (70%).
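To put the figures on this slide in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes the 70% cache share applies to the same 100 million monthly requests, treats "over 100 million" as exactly 100 million, and assumes a 30-day month. The function names are made up for illustration.

```python
# Figures quoted on the slides.
MONTHLY_COST_USD = 65
MONTHLY_REQUESTS = 100_000_000   # "over 100 million", so costs below are upper bounds
CDN_CACHED_SHARE = 0.70          # assumption: share of those requests served by the CDN
PEAK_REQ_PER_MIN = 18_000


def cost_per_million(cost_usd, requests):
    """Dollars spent per one million requests."""
    return cost_usd / (requests / 1_000_000)


def origin_requests(requests, cached_share):
    """Requests that still reach the servers after CDN caching."""
    return round(requests * (1 - cached_share))


def average_req_per_min(requests, days=30):
    """Average request rate over an assumed 30-day month."""
    return requests / (days * 24 * 60)


if __name__ == "__main__":
    print(cost_per_million(MONTHLY_COST_USD, MONTHLY_REQUESTS))   # 0.65
    print(origin_requests(MONTHLY_REQUESTS, CDN_CACHED_SHARE))    # 30000000
    print(round(average_req_per_min(MONTHLY_REQUESTS)))           # 2315
    print(PEAK_REQ_PER_MIN / 60)                                  # 300.0
```

Under these assumptions the stack costs about 65 cents per million requests, and the quoted 18000 req/min throughput is several times the monthly average rate.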
Technology Stack
HAProxy (load balancer)
Nginx, PHP-FPM (web server, PHP interpreter)
Phalcon, Php-Resque (framework, scheduler)
Redis, MongoDB, MariaDB (in-memory cache, datastores)
Git (BitBucket), Packer, Ansible (server provisioning, code provisioning)
SetCronJob, CloudFlare, Fastly (3rd party)
Why IaaS not PaaS?
All about the pricing page!
Bandwidth costs too high
Code optimizations are hidden behind computing power
Mission critical? Offload to PaaS selectively, e.g. Parse EOL, death by acquisition...
Why Monitor?
Don’t end up like these guys...
Why monitor?
Get visibility
Improve usability & stability
Complicated technology stacks with hard to trace errors
Mission critical
More sleep!
What to monitor apart from everything?
Server Metrics (infrastructure)
RAM usage, spikes
Bandwidth usage, highs vs lows
CPU usage over time, peak usage
Disk I/O
Open source vs SaaS
Free mostly
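As an illustration of what an infrastructure check boils down to, here is a minimal sketch using only the Python standard library. It is not one of the tools from the talk: it covers just disk usage and load average (RAM, bandwidth and disk I/O need platform-specific sources or a real agent), and the metric names and thresholds are hypothetical.

```python
import os
import shutil


def snapshot(path="/"):
    """Collect a couple of host metrics without third-party packages."""
    usage = shutil.disk_usage(path)
    try:
        load1, _, _ = os.getloadavg()          # Unix only
    except (AttributeError, OSError):
        load1 = None                           # not available on this platform
    cpus = os.cpu_count() or 1
    return {
        "disk_used_pct": 100 * usage.used / (usage.total or 1),
        "load_per_cpu": None if load1 is None else load1 / cpus,
    }


def breaches(metrics, limits):
    """Return the names of metrics that exceed their (hypothetical) limits."""
    return [
        name
        for name, limit in limits.items()
        if metrics.get(name) is not None and metrics[name] > limit
    ]


if __name__ == "__main__":
    limits = {"disk_used_pct": 90, "load_per_cpu": 1.5}   # example thresholds
    current = snapshot()
    print(current, breaches(current, limits))
```

Run from cron every minute, something like this only gives point-in-time values; the spikes, highs vs lows and usage over time on the slide are what a proper monitoring tool adds.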
Server Metrics (services)
HAProxy stats
Nginx stats
MySQL performance etc.
Service *something* status
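HAProxy can export its stats as CSV, with a header line that starts with `# pxname,svname,...` and a `status` column per row. The sketch below shows how a service check could pick out servers that are down. The sample data is hypothetical and trimmed to four columns (real output has many more), and `down_servers` is an illustrative name, not part of any tool.

```python
import csv
import io

# Hypothetical, heavily trimmed sample of HAProxy's CSV stats output.
SAMPLE = """# pxname,svname,scur,status
web,FRONTEND,12,OPEN
web,web1,4,UP
web,web2,0,DOWN
web,web3,8,UP
web,BACKEND,12,UP
"""


def down_servers(stats_csv):
    """Return (proxy, server) pairs whose status starts with DOWN."""
    text = stats_csv.lstrip("# ")              # drop the "# " before the header
    rows = csv.DictReader(io.StringIO(text))
    return [
        (row["pxname"], row["svname"])
        for row in rows
        # FRONTEND/BACKEND rows are aggregates, not individual servers.
        if row["svname"] not in ("FRONTEND", "BACKEND")
        and row["status"].startswith("DOWN")
    ]


if __name__ == "__main__":
    print(down_servers(SAMPLE))   # [('web', 'web2')]
```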
Application Errors
Catch-all exception handler (PHP)
User defined errors
3rd party Library errors
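The catch-all idea can be sketched as a single handler that receives every otherwise uncaught exception and records it. The talk's stack is PHP; this example swaps in Python's `sys.excepthook` purely to illustrate the pattern. The `captured` list stands in for whatever error tracker would receive the report, and the names `report` and `install` are hypothetical.

```python
import sys
import traceback

captured = []   # stand-in for an error-tracking service


def report(exc_type, exc, tb):
    """Record an exception with its type, message and stack trace."""
    captured.append({
        "type": exc_type.__name__,
        "message": str(exc),
        "trace": "".join(traceback.format_exception(exc_type, exc, tb)),
    })


def install():
    """Route every uncaught exception in this process through report()."""
    sys.excepthook = report


if __name__ == "__main__":
    # User-defined errors can be reported explicitly through the same path.
    try:
        raise ValueError("boom")
    except ValueError:
        report(*sys.exc_info())
    print(captured[-1]["type"], captured[-1]["message"])   # ValueError boom
```

Errors raised inside third-party libraries reach the same handler, which is why one hook covers all three bullets above.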
Tech Stack (Application Performance Monitoring)
Request throughput
Resource usage
Service Health
Database monitoring
Infrastructure bottlenecks
Failure Alerts
Code Errors
High level overview with deep dive
Log Tracking
Better way to tail -f
Http stack errors & anomalies
Multiple log files from different services
Manual tailing is difficult
Get pre configured graphs based on logs
All server traffic is logged, access_log
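To show the kind of summary a log tracker derives from an access_log, here is a small sketch that counts responses by status class. It assumes lines in Nginx's default "combined" format, where the status code follows the quoted request; the sample lines and function names are invented for illustration.

```python
import re
from collections import Counter

# In the combined format the status code follows the closing quote of the request.
STATUS_RE = re.compile(r'" (\d{3}) ')

# Hypothetical sample lines.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2016:13:55:36 +0000] "GET /feed HTTP/1.1" 200 512 "-" "okhttp/3.2"',
    '203.0.113.9 - - [10/Oct/2016:13:55:37 +0000] "GET /feed HTTP/1.1" 200 498 "-" "okhttp/3.2"',
    '203.0.113.7 - - [10/Oct/2016:13:55:38 +0000] "POST /post HTTP/1.1" 502 166 "-" "okhttp/3.2"',
    '203.0.113.5 - - [10/Oct/2016:13:55:39 +0000] "GET /nope HTTP/1.1" 404 162 "-" "curl/7.47"',
]


def status_classes(lines):
    """Count responses per status class (2xx, 4xx, 5xx, ...)."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)[0] + "xx"] += 1
    return counts


def server_error_rate(lines):
    """Share of parsed requests that ended in a 5xx response."""
    counts = status_classes(lines)
    total = sum(counts.values())
    return counts["5xx"] / total if total else 0.0


if __name__ == "__main__":
    print(dict(status_classes(SAMPLE_LINES)))   # {'2xx': 2, '5xx': 1, '4xx': 1}
    print(server_error_rate(SAMPLE_LINES))      # 0.25
```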
Client Errors (Mobile)
Client side stack traces post deployment
Valuable version & device insight
Very handy at debug time & post
Catch all errors …. mostly
Memory leaks & stack traces
3rd party library errors or platform errors
Open Source vs Proprietary
Vendor lock-in
Community support
DIY vs training
Industry standards & experience
Fault tolerance
Enterprise customer experience
3rd Party vs Native monitoring tools
Core business?
Pricing again!
Support lifecycle and responsiveness
Product version, beta or 5.0?
Dashboard simplicity
Security implications? firewalled? https? localhost only? Install certs?
Too many alerts….!
What now?
Congratulations, your reward is more work!
Customize alerts
Fix errors
Webhooks
Send to Slack
Ignore at own risk
Be like this guy….or not!
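One way to customize alerts so they stay useful is to suppress repeats of the same alert within a cooldown window before forwarding anything to a webhook. The sketch below is an assumption about how that could look, not the setup from the talk: `AlertThrottle` and `slack_payload` are hypothetical names, and the payload is only built, not sent. Slack incoming webhooks accept a JSON body with a `text` field, which is all this uses.

```python
import json
import time


class AlertThrottle:
    """Let an alert through only if it has not fired within the cooldown."""

    def __init__(self, cooldown_seconds=300, clock=time.monotonic):
        self.cooldown = cooldown_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.last_sent = {}

    def should_send(self, key):
        now = self.clock()
        last = self.last_sent.get(key)
        if last is not None and now - last < self.cooldown:
            return False            # same alert fired recently: stay quiet
        self.last_sent[key] = now
        return True


def slack_payload(key, detail):
    """Build the JSON body for a Slack incoming webhook (not sent here)."""
    return json.dumps({"text": f"[{key}] {detail}"})


if __name__ == "__main__":
    throttle = AlertThrottle(cooldown_seconds=300)
    for _ in range(3):
        if throttle.should_send("db-down"):
            print(slack_payload("db-down", "database server is not responding"))
```

In this run only the first of the three identical alerts produces a payload; the other two fall inside the cooldown.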
Graphs on graphs on graphs on graphs
Information overload is real
Customize dashboard
Overviews only
Deep dive early to be familiar with dashboard
What next? Set up Bugsnag
Conclusion
Why monitor
What to Monitor
How to monitor
Pricing
Dashboards
Discuss your stack with peers
Thank You
@Ajibz