Post on 14-May-2015
Gearman and asynchronous processing in PHP applications
Pham Cong Dinh (a.k.a. pcdinh), @pcdinh on Twitter
BarCampSaiGon 2010
Skunkworks, @teamskunkworks on Twitter
2
The aim of my talk
Discuss a solution that helps scale your high-traffic PHP web applications
3
Introduction
PHP developer since 2002. 8 years in PHP development and counting
Presenter at Hanoi PHP Day in 2008, 2009
Founder and maintainer of PHPVietnam mailing list (Google Group) since 2004
Very interested in Linux, server farm, big data, database, distributed processing, scalability, high performance web systems
Involved in clip.vn development at Vega Corporation 1 year ago
Software developer at Skunkworks
4
Agenda
Challenges in developing large scale PHP applications for high traffic web sites
Resolve the challenge: How to distribute workload
Gearman: an open source high performance job server
Develop PHP clients and workers
Challenges in managing workers – a case study of Gearman Agent Manager
5
What is large scale? How high is high traffic?
Challenges in developing large scale PHP applications for high traffic web sites (1)
6
Large Scale?
Challenges in developing large scale PHP applications for high traffic web sites (2)
Traffic
Data graph
Storage
Code base
Development team
7
Typical challenges: limitation of resources
CPU
Disk speed
Memory
Bandwidth: router, NIC
Architecture: application and system
Challenges in developing large scale PHP applications for high traffic web sites (3)
8
Major challenges
No preparation for growth
No idea how to scale your application past a certain point
No in-depth understanding of your system
No proper system capacity monitoring
Lack of proper skills
Challenges in developing large scale PHP applications for high traffic web sites (4)
9
Our challenge today
Resolve the challenge: How to distribute workload (1)
TOO MUCH WORKLOAD FOR A SINGLE SERVER
10
Many solutions
Load balancing
• Hardware: F5, Cisco Content Services Switch
• Software: Bind, LVS, HAProxy, Varnish ...
Precalculate data
Multi-tier application architecture
Resolve the challenge: How to distribute workload (2)
11
Our solution today
Queue up the workload
Categorize workload pattern
Optimize processing model, security
Job server
Resolve the challenge: How to distribute workload (3)
12
Is queuing the final answer?
Keep up with peak workload?
Handle backlog gracefully
Resolve the challenge: How to distribute workload (4)
13
Concepts
Synchronous and asynchronous
Job, job queue and job server
Who
Used at LiveJournal, Yahoo!, Digg, BackType and many more
Used at Vega (clip.vn, vega.com.vn) for sending mails.
At Skunkworks?
Gearman: an open source high performance job server (1)
14
Architecture
Client
Worker
Job server
Gearman: an open source high performance job server (2)
Fail-over cluster
15
Features
Fast
Programming language neutral
A bridge between a message queue server and a pub/sub engine
Enables applications to outsource tasks to other servers in a synchronous or asynchronous manner
Fault-tolerant
Poison message and retries
Persistent queues for background jobs
Timeout
Gearman: an open source high performance job server (3)
16
How it works
Worker
• connects to all gearmand servers
• registers what functions it supports
• asks for jobs
• if there are no jobs, sends the command 'pre_sleep' to all gearmand servers and sleeps
Client
• connects to gearmand
• submits a job for a particular job name
Gearmand
• acks the job
• finds all sleeping workers related to the job
• sends them all a 'noop' command to wake them up
Gearman: an open source high performance job server (4)
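The handshake above can be sketched as a toy in-memory model. This is only an illustration of the sleep and wake-up flow, not gearmand's implementation: the class `ToyJobServer` and its method names are assumptions made up for this sketch, loosely named after the steps on the slide.

```php
<?php
// Toy in-memory model of the worker/client/gearmand handshake.
// Hypothetical class for illustration; it does no networking at all.
final class ToyJobServer
{
    private array $abilities = []; // worker id => list of function names
    private array $sleeping = [];  // worker id => true while asleep
    private array $queue = [];     // function name => FIFO list of payloads

    // Worker registers a function it supports.
    public function canDo(string $workerId, string $function): void
    {
        $this->abilities[$workerId][] = $function;
    }

    // Worker asks for a job; null means there is nothing to do.
    public function grabJob(string $workerId): ?array
    {
        foreach ($this->abilities[$workerId] ?? [] as $function) {
            if (!empty($this->queue[$function])) {
                return [$function, array_shift($this->queue[$function])];
            }
        }
        return null;
    }

    // Worker announces it is going to sleep ('pre_sleep' on the slide).
    public function preSleep(string $workerId): void
    {
        $this->sleeping[$workerId] = true;
    }

    // Client submits a job. Returns the ids of sleeping workers that
    // support the function and would therefore be sent a 'noop'.
    public function submit(string $function, string $payload): array
    {
        $this->queue[$function][] = $payload;
        $woken = [];
        foreach (array_keys($this->sleeping) as $workerId) {
            if (in_array($function, $this->abilities[$workerId] ?? [], true)) {
                $woken[] = $workerId;
                unset($this->sleeping[$workerId]);
            }
        }
        return $woken;
    }
}
```

In this model a worker that gets `null` from `grabJob()` calls `preSleep()`, and a later `submit()` for a function it registered returns its id, after which `grabJob()` hands it the queued payload.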
17
Use cases
Long running processes: thumbnail generation, image resizing, order processing in e-commerce …
High CPU or memory requirements: high volume data processing, MapReduce, log aggregation, video encoding
Distributed and parallel processing
Timed processing: incremental updates, data replication
Rate-limited FIFO processing
Separation of concerns or security issues
Priority-aware system monitoring tasks: WonderProxy
Gearman: an open source high performance job server (5)
18
PHP interface library to Gearman server
PECL gearman: http://pecl.php.net/package/gearman or https://github.com/php/pecl-gearman
PEAR's Net_Gearman: http://pear.php.net/package/Net_Gearman
Develop PHP clients and workers (1)
19
PHP Client = Job Sender
Develop PHP clients and workers (2)
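A minimal sketch of a job sender, assuming the PECL gearman extension and a gearmand server on the default port 4730. The job name `send_mail`, the JSON payload shape and both helper functions are hypothetical names chosen for illustration, not code from the talk.

```php
<?php
// Hypothetical helper: builds the payload for a made-up "send_mail" job.
function buildMailJob(string $to, string $subject): string
{
    return json_encode(['to' => $to, 'subject' => $subject]);
}

// Submits the payload as a background (asynchronous) job.
// Returns the job handle, or null if the extension is missing
// or the submission did not succeed.
function submitMailJob(string $payload, string $host = '127.0.0.1', int $port = 4730): ?string
{
    if (!class_exists('GearmanClient')) {
        return null; // PECL gearman is not loaded
    }
    $client = new GearmanClient();
    $client->addServer($host, $port);
    // doBackground() queues the job and returns without waiting for a result.
    $handle = $client->doBackground('send_mail', $payload);
    if ($client->returnCode() !== GEARMAN_SUCCESS) {
        return null;
    }
    return $handle;
}
```

A web request would call something like `submitMailJob(buildMailJob('user@example.com', 'Welcome'))` and respond to the user right away. For a synchronous call that waits for the worker's return value, the extension also offers `GearmanClient::doNormal()`.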
20
PHP Worker = Job Executor
Develop PHP clients and workers (3)
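A minimal sketch of a job executor, again assuming the PECL gearman extension and a gearmand server on port 4730. The function name `send_mail`, the JSON workload format and the helpers `handleSendMail()` and `runWorker()` are assumptions for illustration only.

```php
<?php
// Hypothetical job logic, kept free of Gearman types so it can be tested alone.
function handleSendMail(string $workload): string
{
    $data = json_decode($workload, true);
    if (!is_array($data) || empty($data['to'])) {
        return 'rejected'; // malformed workload
    }
    // A real worker would call mail() or an SMTP library here.
    return 'sent:' . $data['to'];
}

// Connects to gearmand, registers the function and processes jobs in a loop.
function runWorker(string $host = '127.0.0.1', int $port = 4730): void
{
    $worker = new GearmanWorker();
    $worker->addServer($host, $port);
    // Register the function name that clients submit jobs under.
    $worker->addFunction('send_mail', function (GearmanJob $job) {
        return handleSendMail($job->workload());
    });
    // work() blocks until a job has been handled, then returns.
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break; // stop on the first error
        }
    }
}
```

Such a script is meant to be started from the command line (for example by calling `runWorker()` at the bottom of the file) and left running. Several copies can run in parallel against the same job server.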
21
Ease of use
How to manage multiple worker processes for a single job: launch, reload, stop, add process ...
Monitoring
Centralized management over set of servers
Web API (Restful)
Challenges in managing workers – a case study of Gearman Agent Manager
Questions?
@skunkworksvn, @pcdinh #barcampsaigon #teamskunkworks