Rhebok, High Performance Rack Handler / Rubykaigi 2015

Rhebok, High performance Rack Handler Masahiro Nagano @kazeburo RubyKaigi 2015


Me

•Masahiro Nagano

•@kazeburo

• Principal Site Reliability Engineer at Mercari, Inc.

Mercari

• Downloads: 27M (JP+US)

• GMV: several billion per month

• Items: several hundred thousand or more new items per day

• Backend languages: PHP, Go, Lua, etc.

Agenda

• Rhebok Overview and Benchmark

•How to create a High Performance Rack Handler & Rhebok internals

Rhebok Overview and Benchmark

Rhebok

• Rack Handler/Web Server

• 1.5x-2x the performance of Unicorn

• Prefork architecture, the same as Unicorn

• Rhebok is suitable for running HTTP application servers behind a reverse proxy like nginx

• Ruby port of Perl’s Gazelle

What’s Gazelle?

• High Performance Plack Handler

• Plack is Perl’s Rack

• 2x-3x faster than commonly used servers such as Starman and Starlet

• Production ready

• Installed on dozens of servers and shown to reduce their CPU usage by 1-3%

Who should use Rhebok?

• Highly optimized, high-traffic websites

• Gaming, ad tech, recipe sites, media, or massive-scale SNS

• On such sites, Rhebok can improve response speed even further

• Can be applied to any website

[Figure: share of response time (%) by layer (SQL, cache, Ruby, WAF, Rack handler), for a general website and for an optimized website]

Who should not use Rhebok?

• Those who want to use WebSocket or streaming

• Those who cannot set up a reverse proxy in front of Rhebok

Rhebok Spec

• HTTP/1.1 web server

• Supports all HTTP/1.1 features except keep-alive

• Supports TCP and Unix domain sockets

• Hot deployment using start_server

• OobGC

Usage

$ rackup -s Rhebok \
    --port 8080 \
    -E production \
    -O MaxWorkers=20 \
    -O MaxRequestPerChild=1000 \
    -O OobGC=yes \
    config.ru

Recommended configuration

[Diagram: client → HTTP/2 → reverse proxy (nginx, h2o) → HTTP/1.1 over TCP or Unix domain socket → Rhebok]

http {
  upstream app {
    server unix:/path/to/app.sock;
  }
  server {
    listen 443 ssl http2;
    location / {
      proxy_pass http://app;
    }
    location ~ ^/assets/ {
      root /path/to/webapp/assets;
    }
  }
}

Hot Deploy

$ start_server --port 8080 \
    -- rackup -s Rhebok \
    -E production \
    -O MaxWorkers=20 \
    -O MaxRequestPerChild=1000 \
    -O OobGC=yes \
    config.ru

start_server

• Perl: https://metacpan.org/release/Server-Starter

• Go: https://github.com/lestrrat/go-server-starter

How start_server works

$ start_server --port 8080 -- rackup

[Diagram: start_server opens the listening socket and forks Rhebok. Rhebok picks up the socket with Socket.for_fd(ENV["SERVER_STARTER_PORT"]) and runs its workers.]

How start_server works: SIGHUP

[Diagram: when start_server receives SIGHUP, it keeps the socket open and forks a new Rhebok next to the running one. The new Rhebok picks up the same socket with Socket.for_fd(ENV["SERVER_STARTER_PORT"]) and starts its own workers.]

How start_server works: SIGTERM

[Diagram: once the new Rhebok and its workers are running, start_server sends SIGTERM to the old Rhebok, whose workers then shut down.]
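
The slides abbreviate the socket lookup to Socket.for_fd(ENV["SERVER_STARTER_PORT"]). Server::Starter documents that variable as a semicolon-separated list of name=fd pairs (for example 8080=3), so a handler has to parse it before calling for_fd. The following is a minimal sketch of that step, assuming a TCP listener. The helper names server_starter_fds and inherited_listener are made up for illustration and are not taken from Rhebok's source.

```ruby
require "socket"

# Parse a SERVER_STARTER_PORT value such as "8080=3;127.0.0.1:8081=4"
# into a Hash of { "8080" => 3, "127.0.0.1:8081" => 4 }.
# (Hypothetical helper, not part of Rhebok or Server::Starter.)
def server_starter_fds(value)
  value.to_s.split(";").each_with_object({}) do |pair, fds|
    name, fd = pair.split("=", 2)
    fds[name] = Integer(fd)
  end
end

# Wrap the first inherited descriptor in a TCPServer, or return nil when
# the process was not launched by start_server.
# Assumption: the descriptor is a TCP listening socket.
def inherited_listener(env = ENV)
  fds = server_starter_fds(env["SERVER_STARTER_PORT"])
  return nil if fds.empty?
  TCPServer.for_fd(fds.values.first)
end
```

When the process is not started under start_server, inherited_listener returns nil and the handler can fall back to binding the port itself.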

Benchmark

Benchmark environment

• Amazon EC2 c3.8xlarge

• 32 vCPU

• Amazon Linux

• Ruby 2.2.3

• Unicorn 5.0.0 / Rhebok 0.9.0

• patched wrk that supports Unix domain sockets

• https://github.com/kazeburo/wrk/tree/unixdomain2

[Chart: requests per second for HelloWorld, Sinatra, and Rails, Rhebok vs. Unicorn. Values as extracted, with digits fused in the transcript: 557730788, 248094, 615134557, 398898]

ISUCON benchmark

• ISUCON

• A web application tuning contest

• Contestants compete on the score given by a benchmark that the organizers create

• The web application used as the contest theme is close to a real-world service

[Chart: ISUCON 4 Qualifier score, Unicorn vs. Rhebok. Values as extracted: 43560, 41175]

How to create a high performance Rack Handler and Rhebok internals

Basics of Rack and Rack Handler

Rack

• Rack is a specification

• An interface between web servers that support Ruby and Ruby web frameworks

• Rack is also an implementation

• e.g. Rack::Request, Rack::Response, and middlewares

[Diagram: web servers (unicorn, thin, puma) on one side, web frameworks (Rails, sinatra, Padrino) on the other, connected through the Rack web server interface]

Rack Application

app = Proc.new do |env|
  [
    '200',
    {'Content-Type' => 'text/html'},
    ['Hello']
  ]
end

Rack env hash

• A Hash object that contains the request data

• CGI keys

• REQUEST_METHOD, SCRIPT_NAME, PATH_INFO, QUERY_STRING, HTTP_* variables

• Rack-specific keys

• rack.version, rack.url_scheme, rack.input, rack.errors, rack.multithread, rack.multiprocess, rack.run_once, rack.hijack?

Response Array

[
  '200',
  {
    'Content-Type' => 'text/html',
    'X-Content-Type-Options' => 'nosniff',
    'X-Frame-Options' => 'SAMEORIGIN',
    'X-XSS-Protection' => '1; mode=block'
  },
  ['Hello', 'world']
]

Response body

• Response body must respond to each

• Array of strings

• Application instance

• File like object
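
Because the only requirement on the body is each, it does not have to be an Array. The sketch below uses a hypothetical CountdownBody class (not part of Rack) that yields its chunks one at a time, the same way a handler would consume them.

```ruby
# Hypothetical body object: anything that responds to #each and yields
# Strings can serve as a Rack response body.
class CountdownBody
  def initialize(from)
    @from = from
  end

  # The handler calls each and writes every yielded chunk to the client.
  def each
    @from.downto(1) { |n| yield "#{n}\n" }
    yield "liftoff\n"
  end
end

app = proc do |env|
  [200, { "Content-Type" => "text/plain" }, CountdownBody.new(3)]
end

# What a handler does with the body:
status, headers, body = app.call({})
chunks = []
body.each { |chunk| chunks << chunk }
```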

Role of Rack Handler

• Create env from an HTTP request sent from a client

• Call an application

• Create an HTTP response from array and send back to the client

[Diagram: HTTP req → env → app → array → HTTP res]

Create a Rack Handler

require 'socket'
require 'stringio'
require 'rack'

module Rack
  module Handler
    class Shika
      def self.run(app, options)
        slf = new()
        slf.run_server(app)
      end

      def run_server(app)
        # create socket
        server = TCPServer.new('0.0.0.0', 8080)
        while true
          conn = server.accept

          # read request (header only; request bodies are not handled)
          buf = ""
          while true
            buf << conn.sysread(4096)
            break if buf[-4,4] == "\r\n\r\n"
          end
          reqs = buf.split("\r\n")
          req = reqs.shift.split
          path, query = req[1].split('?', 2)

          # create env
          env = {
            'REQUEST_METHOD' => req[0],
            'SCRIPT_NAME' => '',
            'PATH_INFO' => path,
            'QUERY_STRING' => query.to_s,
            'SERVER_NAME' => '0.0.0.0',
            'SERVER_PORT' => '8080',
            'rack.version' => [0,1],
            'rack.input' => StringIO.new('').set_encoding('BINARY'),
            'rack.errors' => STDERR,
            'rack.multithread' => false,
            'rack.multiprocess' => false,
            'rack.run_once' => false,
            'rack.url_scheme' => 'http'
          }
          reqs.each do |header|
            name, value = header.split(": ", 2)
            env["HTTP_"+name.upcase.gsub('-','_')] = value
          end

          # run app
          status, headers, body = app.call(env)

          # create response
          res_header = "HTTP/1.0 "+status.to_s+" "+Rack::Utils::HTTP_STATUS_CODES[status.to_i].to_s+"\r\n"
          headers.each do |k, v|
            res_header << "#{k}: #{v}\r\n"
          end
          res_header << "Connection: close\r\n\r\n"
          conn.write(res_header)
          body.each do |chunk|
            conn.write(chunk)
          end
          conn.close
        end
      end
    end
  end
end

Steps in the handler above:

• create socket, accept

• read request & create env

• run app

• create response

Run server

$ rackup -r ./shika.rb -s Shika -E production config.ru

This Rack handler has some problems

• Performance problem

• Handles only one request at a time

• Stops the whole world when one request lags

• No timeout

• No HTTP request parser that supports the HTTP/1.1 spec

Increase concurrency

• Multi process

• simple and easy to scale

• Multi thread

• lightweight context switch compared to processes

• IO multiplexing

• event driven, can handle many connections

Concurrency strategy

• Unicorn

• -> multi process

• Puma

• -> multi thread + limited event model (+ multi process)

• Thin

• -> event model (+ multi process)

Prefork Architecture

[Diagram, built up over four slides: the manager process binds and listens on the socket, forks four worker processes, and each worker calls accept on the shared socket to serve clients.]
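
The prefork pattern can be sketched in plain Ruby with fork. The parent creates the listening socket once, and each forked child inherits it and calls accept on its own. This is only an illustration of the pattern, not Rhebok's or prefork_engine's code; the start_prefork helper and its parameters are hypothetical, and the workers answer with their PID instead of speaking HTTP.

```ruby
require "socket"

# Hypothetical helper: fork `workers` children that all accept on the
# listening socket created by the parent. Returns the child PIDs.
def start_prefork(server, workers:, requests_per_worker:)
  Array.new(workers) do
    fork do
      # Child: the listening socket is inherited from the manager, so
      # every worker can call accept on it independently.
      requests_per_worker.times do
        conn = server.accept
        conn.write("pid=#{Process.pid}\n")
        conn.close
      end
      exit!(0) # leave without running the parent's at_exit handlers
    end
  end
end
```

A manager would bind once, for example with TCPServer.new("127.0.0.1", 0), pass the server to start_prefork, and then wait for the returned PIDs with Process.wait. Limiting the number of requests per worker mirrors the idea behind the MaxRequestPerChild option shown earlier.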

prefork_engine

• https://github.com/kazeburo/prefork_engine

• Ruby port of Perl’s Parallel::Prefork

• a simple prefork server framework

prefork_engine

require 'socket'
require 'prefork_engine'

server = TCPServer.new('0.0.0.0', 8080)
pe = PreforkEngine.new({
  "max_workers" => 5,
  "trap_signals" => {
    "TERM" => 'TERM',
    "HUP" => 'TERM',
  },
})
while !pe.signal_received.match(/^TERM$/)
  pe.start {
    # child
    while true
      conn = server.accept
      ....
    end
  }
end
pe.wait_all_children

IO timeout

• Unicorn does not have an IO timeout

• Instead, it sends SIGKILL to a long-running worker process

• default timeout 60 sec

E, [2015-12-08T03:13:24.863287 #90217] ERROR -- : worker=0 PID:90243 timeout (61s > 60s), killing
E, [2015-12-08T03:13:24.865764 #90217] ERROR -- : reaped #<Process::Status: pid 90243 SIGKILL (signal 9)> worker=0
I, [2015-12-08T03:13:24.866176 #90217] INFO -- : worker=0 spawning...

Using select(2)

while true
  connection = @server.accept
  buf = self.read_timeout(connection)
  if buf == nil
    connection.close
    next
  end
  parse_http_header(…)
end

--

def read_timeout(conn)
  # IO.select returns nil when nothing becomes readable within READ_TIMEOUT
  if !IO.select([conn], nil, nil, READ_TIMEOUT)
    return nil
  end
  return conn.sysread(4096)
end

Rhebok supports IO timeout

• Implements read_timeout in C

• avoids the strange behavior of nonblock + sysread

• uses poll(2) instead of select(2)

$ rackup -s Rhebok -O Timeout=60 config.ru

Parse HTTP request

HTTP parser

• An HTTP parser can easily cause security issues. It's safer to choose an existing one that is widely used

• There are several fast implementations

• Mongrel based - Unicorn, Puma

• Node.js based - Passenger 5

• PicoHTTPParser - Rhebok, h2o

• pico_http_parser in rubygems

• Ruby binding of PicoHTTPParser

pico_http_parser benchmark

[Chart: picohttpparser vs. unicorn by number of headers (0, 1, 2, 4, 10). Values as extracted, with digits fused in the transcript: 80814118499, 140395153002167823, 109602, 166615203201, 231919, 455188]

PicoHTTPParser in Rhebok

• uses PicoHTTPParser directly

• does not use pico_http_parser.gem

• performs both reading and parsing of the HTTP header in a single C function

• reduces the overhead of creating a Ruby string that contains the HTTP header

TCP optimization

TCP_NODELAY

• When data is written, TCP does not send packets immediately; there is some delay

• By default, TCP uses Nagle’s algorithm to collect small packets and send them all at once

• TCP_NODELAY disables it

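[Diagram, Nagle’s algorithm: the application calls write("foo") and write("bar"); the OS/kernel buffers them and, after a delay, sends "foobar" to the client.]

[Diagram, TCP_NODELAY: the application calls write("foo") and write("bar"); the OS/kernel sends "foo" and "bar" to the client as separate packets.]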
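
From Ruby, TCP_NODELAY is set per socket with setsockopt. A minimal sketch follows; the helper names disable_nagle and nodelay? are hypothetical, and Rhebok itself sets the option in its own code rather than through these helpers.

```ruby
require "socket"

# Turn off Nagle's algorithm on a connected TCP socket.
def disable_nagle(sock)
  sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)
  sock
end

# Read the option back to confirm it took effect.
def nodelay?(sock)
  sock.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).int != 0
end
```

A handler would typically call disable_nagle(conn) right after server.accept, before writing the response.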

Problem of TCP_NODELAY

• When TCP_NODELAY is enabled, take care to avoid excessive fragmentation of TCP packets

• fragmentation increases network latency

• To prevent fragmentation

• concatenate the data in the application

• use writev(2)

writev(2)

w/o writev(2)

char *buf1 = "Hello ";
char *buf2 = "RubyKaigi";
char *buf3 = "\r\n";

write(fd, buf1, strlen(buf1));
write(fd, buf2, strlen(buf2));
write(fd, buf3, strlen(buf3));

[Diagram: "Hello ", "RubyKaigi", and "\r\n" cross into the kernel one by one: many syscalls]

w/o writev(2)

char *buf1 = "Hello ";
char *buf2 = "RubyKaigi";
char *buf3 = "\r\n";
char *buf;

buf = (char *)malloc(100);
buf[0] = '\0';  /* strcat needs a terminated string to append to */

strcat(buf, buf1);
strcat(buf, buf2);
strcat(buf, buf3);

write(fd, buf, strlen(buf));
free(buf);

[Diagram: "Hello RubyKaigi\r\n" crosses into the kernel in one syscall, but the application has to allocate memory]

writev(2)

#include <string.h>
#include <sys/uio.h>

ssize_t rv;
char *buf1 = "Hello ";
char *buf2 = "RubyKaigi";
char *buf3 = "\r\n";
struct iovec v[3];

v[0].iov_base = buf1;
v[0].iov_len = strlen(buf1);
...
v[2].iov_base = buf3;
v[2].iov_len = strlen(buf3);

rv = writev(fd, v, 3);

[Diagram: the kernel gathers the buffers and writes "Hello RubyKaigi\r\n" in one syscall]
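
A similar gathering write is available from Ruby without a C extension: since Ruby 2.5, IO#write accepts several strings in one call and can hand them to writev(2) where the platform provides it. Whether a single system call is actually issued depends on the Ruby version, the platform, and the IO's buffering mode, so this is a sketch of the idea rather than a description of Rhebok, which does its writes in C. The helper name write_response is hypothetical.

```ruby
# Hypothetical helper: send status line, headers, and body chunks with a
# single IO#write call instead of one write per piece.
def write_response(io, status_line, headers, body_chunks)
  parts = [status_line]
  headers.each { |k, v| parts << "#{k}: #{v}\r\n" }
  parts << "\r\n"
  parts.concat(body_chunks)
  # With multiple arguments, IO#write returns the total bytes written.
  io.write(*parts)
end
```

Compared with joining the strings first, this avoids building one large intermediate String in the application, which is the same trade-off the C examples above illustrate.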

Rhebok internals

• Prefork Architecture

• Efficient network IO

• Ultra Fast HTTP parser

• TCP Optimization

• Implemented in C

Conclusion

• Rhebok is a High Performance Rack Handler

• Rhebok is built on many modern technologies

• Please try Rhebok and send me your feedback

end