Optimizing Game Services on AWS :: 박선용 :: AWS Summit Seoul 2016

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 박선용, Solutions Architect, 2016/05/17. Optimizing Game Services on AWS


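Today's agenda

1. Performance overview
2. Single-region global game service
3. System optimization on AWS
4. Summary

1. Performance overview

The performance problem

• How do you design a system that performs well?
• Two key problems stand in the way:

1. Defining performance
2. Measuring performance correctly

Defining performance: a matter of perspective

• What performance means depends on your point of view:
– Response time
– Throughput
– Consistency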

Application

System libraries

System calls

Kernel

Devices

Workload

Performance factors

Resource | Performance factors | Key indicators
CPU | Sockets, number of cores, clock frequency, bursting capability | CPU utilization, run queue length
Memory | Memory capacity | Free memory, anonymous paging, thread swapping
Network interface | Max bandwidth, packet rate | Receive throughput, transmit throughput over max bandwidth
Disk | IOPS (input/output operations per second), throughput | Wait queue length, device utilization, device errors

Resource utilization

• For a given performance, how efficiently are resources being used
• Something at 100% utilization can't accept any more work
• Low utilization can indicate more resource is being purchased than needed
• Bottleneck == the component with the highest utilization in the system

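Performance games

Performance games are technical tricks for making the same numbers look like better performance. There are dozens of well-known techniques.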

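Ratio game

Measured values

System | Workload 1 | Workload 2
A | 10 | 20
B | 20 | 10

Comparison by average

System | Workload 1 | Workload 2 | Average
A | 10 | 20 | 15
B | 20 | 10 | 15

Throughput comparison normalized to B

System | Workload 1 | Workload 2 | Average
A | 0.5 | 2 | 1.25
B | 1 | 1 | 1

Throughput comparison normalized to A

System | Workload 1 | Workload 2 | Average
A | 1 | 1 | 1
B | 2 | 0.5 | 1.25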
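The ratio game can be reproduced in a few lines. The sketch below uses the measured values from the slide; the function name `normalized_means` is made up for illustration. It shows that averaging ratios makes whichever system is not the baseline look 25% better, even though the raw averages are identical.

```python
from statistics import mean

# Measured values from the slide: throughput of each system on two workloads.
measured = {"A": [10, 20], "B": [20, 10]}

def normalized_means(measured, base):
    """Average of per-workload ratios, using `base` as the reference system."""
    base_values = measured[base]
    return {
        system: mean([v / b for v, b in zip(values, base_values)])
        for system, values in measured.items()
    }

if __name__ == "__main__":
    print("raw averages     :", {s: mean(v) for s, v in measured.items()})
    print("normalized to B  :", normalized_means(measured, "B"))
    print("normalized to A  :", normalized_means(measured, "A"))
```

Reading the three printed rows side by side makes the trick visible: the choice of baseline alone decides the "winner".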

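Performance measurement

1. Choose the perspective: select the metric
- IOPS, latency (ms), throughput (MB/s), utilization (%)

2. Choose the target: prepare the system under test correctly
- A configuration free of distortion caused by a specific bottleneck (100% utilization)

3. Choose the driver: select a suitable workload
- Select a load generator that can apply load accurately
- Design the load so that enough of it can be applied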

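Common measurement mistakes and remedies

"Wise men learn by other men's mistakes, fools by their own." – H. G. Wells

1. No goal, or a biased goal
2. Unsystematic approach
3. Inaccurate metrics
4. Unrepresentative workload
5. Wrong calculation technique
6. Unsuitable target (system) setup
7. Poor choice of workload generator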

A word of caution

All figures shown here are illustrative.

Performance measurements are only valuable when you run them yourself.

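2. Single-region global game service

FAQ

1. Can game servers in one region serve the whole world?

2. Can users on different continents play head-to-head matches against each other?

3. Is two-way DB replication between regions possible?

Global network performance issues

Average network latency by region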

http://bit.ly/superdata-latency, See http://bit.ly/verizon-latency

• N. America: 41.7 ms
• Europe to Asia: 137.9 ms
• Asia Pacific: 97.9 ms
• Trans-Pacific: 103.8 ms
• Trans-Atlantic: 79.6 ms
• Latin America: 133.2 ms
• Europe: 11.6 ms
• Japan: 16.8 ms

Recommended best practice: do not serve the whole world through a global mesh topology; place game servers in each region.

Question: is global service from a single region impossible?

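Network performance

Key performance perspectives
• Response time
• Consistency

Considerations
• Routing path
• Packet loss rate = packet retransmissions

Measurement tools
• ping
• traceroute
• mtr
• iperf
• scp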

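The importance of latency - TCP connections

Use Keep-Alive.

[Figure: TCP sequence between a client and the Region. First request: SYN, SYN-ACK, ACK, GET /index.jsp. 2nd request: SYN, SYN-ACK, ACK, GET /index.jsp. Timing labels: 360ms, 360ms, 90ms.]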
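Why Keep-Alive matters can be shown with a rough model. The sketch below assumes one round trip for the TCP handshake and one round trip per request/response, and it ignores TLS, server processing time, and slow start. The 100 ms round-trip time is a hypothetical value, not a figure from the slide.

```python
def total_latency_ms(rtt_ms, requests, keep_alive):
    """Rough time to complete `requests` sequential HTTP requests.

    Assumption: each new TCP connection costs one round trip (SYN / SYN-ACK)
    before the request can be sent, and each request/response costs one more.
    """
    connections = 1 if keep_alive else requests
    return connections * rtt_ms + requests * rtt_ms

if __name__ == "__main__":
    rtt = 100  # hypothetical cross-region round-trip time in ms
    for n in (1, 2, 10):
        print(n, "requests:",
              total_latency_ms(rtt, n, keep_alive=False), "ms without keep-alive,",
              total_latency_ms(rtt, n, keep_alive=True), "ms with keep-alive")
```

Under these assumptions the saving approaches 50% as the number of requests per connection grows, which is why long-distance clients benefit most from connection reuse.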

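Performance test

§ Comparison tests
• Seoul (network operator A) -> AWS China (EC2 instance in the Beijing region)
• AWS Seoul -> AWS China (EC2 instance in the Beijing region)

§ Test types
1. Traceroute: check the routing path
2. MTR: check the consistency of the routing path
3. Ping: check the latency and its range
4. Data transfer: measure the transfer speed and the number of packet retransmissions caused by packet loss, using the transfer size the workload needs (100 KB here)

§ Communication with China (general)
• Traffic is known to be affected unpredictably by the stability of the Internet along routes inside China and by interference from the Great Firewall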

Traceroute comparison

Network A -> AWS China (example)

traceroute to 54.222.13x.xx (54.222.13x.xx), 128 hops max, 52 byte packets
 1 [AS56220] 192.168.0.1 (192.168.0.1) 5.307 ms 1.238 ms 0.968 ms
 6 [AS3786] 1.213.28.145 (1.213.28.145) 2.242 ms
   [AS3786] 1.213.28.49 (1.213.28.49) 2.174 ms 2.081 ms
 7 [AS3786] 1.208.150.37 (1.208.150.37) 2.700 ms
   [AS3786] 210.120.248.237 (210.120.248.237) 3.014 ms
   [AS3786] 1.208.150.229 (1.208.150.229) 3.113 ms
 8 [AS3786] 1.213.150.165 (1.213.150.165) 14.654 ms
   [AS55831] 1.208.104.149 (1.208.104.149) 6.444 ms
   [AS3786] 1.213.150.165 (1.213.150.165) 4.945 ms
10 [AS3786] 1.213.106.9 (1.213.106.9) 4.124 ms
   [AS3786] 1.208.148.105 (1.208.148.105) 3.917 ms
   [AS3786] 1.213.148.57 (1.213.148.57) 3.955 ms
13 [AS4134] 202.97.58.106 (202.97.58.106) 52.624 ms 51.554 ms
   LG DACOM [AS3786] 211.40.6.110 (211.40.6.110) 50.114 ms
14 [AS4134] 202.97.58.117 (202.97.58.117) 50.871 ms 54.061 ms 51.665 ms
15 [AS4134] 202.97.53.93 (202.97.53.93) 53.085 ms
   [AS4134] 202.97.58.117 (202.97.58.117) 52.425 ms
   [AS4134] 202.97.53.93 (202.97.53.93) 56.021 ms
16 China Telecom Backbone [AS4134] 202.97.53.93 (202.97.53.93) 54.102 ms 51.934 ms *
17 China Networks Inter-Exchange [AS4847] bj141-130-93.bjtelecom.net (219.141.130.93) 52.357 ms * *

AWS Seoul -> AWS China (example)

traceroute to 54.222.13x.xx (54.222.13x.xx), 30 hops max, 60 byte packets
 1 ec2-52-79-0-0.ap-northeast-2.compute.amazonaws.com (52.79.0.0) [AS16509] 20.145 ms 20.289 ms 20.346 ms
 2 Cox Communication Inc. 100.64.1.8 (100.64.1.8) [AS22773] 16.820 ms 100.64.2.8 (100.64.2.8) [AS22773] 20.275 ms 100.64.2.136 (100.64.2.136) [AS22773] 13.769 ms
 3 100.64.3.135 (100.64.3.135) [AS22773] 13.877 ms 100.64.2.195 (100.64.2.195) [AS22773] 15.923 ms 100.64.3.3 (100.64.3.3) [AS22773] 15.944 ms
14 otejbb205.int-gw.kddi.ne.jp (203.181.99.69) [AS2516] 38.586 ms otejbb206.int-gw.kddi.ne.jp (59.128.4.249) [AS2516] 38.112 ms otejbb205.int-gw.kddi.ne.jp (59.128.4.101) [AS2516] 45.558 ms
15 tr-ote124.int-gw.kddi.ne.jp (106.187.6.198) [AS2516] 39.897 ms tr-ote124.int-gw.kddi.ne.jp (106.187.6.190) [AS2516] 39.156 ms tr-ote124.int-gw.kddi.ne.jp (106.187.6.194) [AS2516] 40.067 ms
16 203.181.102.42 (203.181.102.42) [AS2516] 151.869 ms * 152.305 ms
17 202.97.33.49 (202.97.33.49) [AS4134] 153.793 ms 202.97.35.237 (202.97.35.237) [AS4134] 152.355 ms 151.471 ms
18 * 202.97.33.125 (202.97.33.125) [AS4134] 153.458 ms
19 202.97.50.229 (202.97.50.229) [AS4134] 157.713 ms 202.97.35.153 (202.97.35.153) [AS4134] 209.974 ms 202.97.39.230 (202.97.39.230) [AS4134] 241.158 ms
20 202.97.34.189 (202.97.34.189) [AS4134] 184.304 ms * 202.97.34.133 (202.97.34.133) [AS4134] 227.440 ms
   98.31.110.36.static.bjtelecom.net (36.110.31.98) [AS4847]

Mtr comparison

Network A -> AWS China
What matters is that the loss rate does not fluctuate.

mtr --tcp -P 22 54.222.13x.xx --report -c 100
Start: Mon May 2 00:06:25 2016
HOST: 80e627e.ant                 Loss%  Snt  Last   Avg  Best  Wrst StDev
 1.|-- 172.20.nate.com             0.0%  100   1.8   3.0   1.7  27.3   2.7
 2.|-- ???                        100.0  100   0.0   0.0   0.0   0.0   0.0
 3.|-- 172.28.nate.com             0.0%  100  65.6  48.3  24.7 522.9  54.7
 4.|-- ???                        100.0  100   0.0   0.0   0.0   0.0   0.0
 5.|-- ???                        100.0  100   0.0   0.0   0.0   0.0   0.0
 6.|-- 113.217.252.149            76.0%  100  30.4  57.1  29.0 354.1  68.2
 7.|-- ???                        100.0  100   0.0   0.0   0.0   0.0   0.0
 8.|-- 1.255.25.129                0.0%  100  48.4  51.7  27.9 374.7  44.8
 9.|-- 39.115.132.141              0.0%  100  52.2  49.9  28.3 236.6  29.8
10.|-- 58.229.15.206              23.0%  100  71.3  84.1  62.2 262.1  28.5
11.|-- 202.97.121.169              0.0%  100  80.2  84.1  63.8 221.6  20.6
 ...                                                                  67.9
15.|-- ???                        100.0   99   0.0   0.0   0.0   0.0   0.0
16.|-- 98.31.110.36.static.bjtel  11.2%   98 146.0 147.5 123.6 571.6  48.6
17.|-- 54.222.1.11                 2.1%   97 110.9 123.2 105.2 252.5  16.2
18.|-- 54.222.1.35                 3.1%   97 116.7 120.5 101.8 198.0  12.9

AWS Seoul -> AWS China

mtr --tcp -P 22 54.222.137.37 --report -c 100
HOST: ip-172-31-10-43             Loss%  Snt  Last   Avg  Best  Wrst StDev
 1.|-- ???                        100.0  100   0.0   0.0   0.0   0.0   0.0
 4.|-- 100.64.17.225               0.0%  100   0.4   6.6   0.3 616.2  61.6
 5.|-- 54.239.122.7                0.0%  100   1.1   6.7   0.7 577.6  57.7
10.|-- 54.239.53.11                0.0%  100  38.5  45.9  37.2 429.8  39.2
12.|-- 106.187.29.153              0.0%  100  39.8  44.2  37.8 342.3  31.2
13.|-- o205.int-gw.kddi.ne.        0.0%  100  42.8  48.4  37.3 458.9  49.9
14.|-- otejb.int-gw.kddi.ne.       0.0%  100  46.9  46.6  37.1 420.4  44.2
15.|-- 203.181.102.42              0.0%  100  41.2  58.0  37.0 381.8  52.1
16.|-- 203.181.102.42             24.0%  100 151.4 162.0 148.8 495.3  43.8
18.|-- 202.97.33.161              35.0%  100 161.5 210.6 149.8 3159. 373.3
19.|-- 202.97.33.25                3.0%  100 154.7 212.1 151.1 1222. 146.6

Ping & Scp comparison

Network A -> AWS China (ping)

ping -c 20 54.222.13x.xx
PING 54.222.137.37 (54.222.137.37): 56 data bytes
Request timeout for icmp_seq 0
64 bytes from 54.222.137.37: icmp_seq=1 ttl=46 time=151.090 ms
Request timeout for icmp_seq 2
64 bytes from 54.222.137.37: icmp_seq=3 ttl=46 time=152.218 ms
64 bytes from 54.222.137.37: icmp_seq=4 ttl=46 time=151.107 ms
64 bytes from 54.222.137.37: icmp_seq=5 ttl=46 time=150.753 ms
64 bytes from 54.222.137.37: icmp_seq=6 ttl=46 time=152.004 ms
Request timeout for icmp_seq 7
64 bytes from 54.222.137.37: icmp_seq=8 ttl=46 time=154.614 ms
64 bytes from 54.222.137.37: icmp_seq=9 ttl=46 time=154.078 ms
Request timeout for icmp_seq 10
Request timeout for icmp_seq 11
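Ping output like the excerpt above can be summarized in a few lines, which makes runs from different networks easier to compare. The following is a minimal sketch: it assumes the BSD/macOS-style output format shown in the slide, the function name `summarize_ping` is made up for illustration, and only the twelve probes visible in the excerpt are included.

```python
import re
from statistics import mean, pstdev

# The probes visible in the slide excerpt (the full run used -c 20).
PING_OUTPUT = """\
Request timeout for icmp_seq 0
64 bytes from 54.222.137.37: icmp_seq=1 ttl=46 time=151.090 ms
Request timeout for icmp_seq 2
64 bytes from 54.222.137.37: icmp_seq=3 ttl=46 time=152.218 ms
64 bytes from 54.222.137.37: icmp_seq=4 ttl=46 time=151.107 ms
64 bytes from 54.222.137.37: icmp_seq=5 ttl=46 time=150.753 ms
64 bytes from 54.222.137.37: icmp_seq=6 ttl=46 time=152.004 ms
Request timeout for icmp_seq 7
64 bytes from 54.222.137.37: icmp_seq=8 ttl=46 time=154.614 ms
64 bytes from 54.222.137.37: icmp_seq=9 ttl=46 time=154.078 ms
Request timeout for icmp_seq 10
Request timeout for icmp_seq 11
"""

def summarize_ping(output):
    """Return loss and latency-spread figures from ping output text."""
    times = [float(t) for t in re.findall(r"time=([\d.]+) ms", output)]
    timeouts = len(re.findall(r"Request timeout", output))
    sent = len(times) + timeouts
    return {
        "sent": sent,
        "received": len(times),
        "loss_pct": 100.0 * timeouts / sent,
        "min_ms": min(times),
        "avg_ms": mean(times),
        "max_ms": max(times),
        "stdev_ms": pstdev(times),  # spread of latency, i.e. consistency
    }

if __name__ == "__main__":
    for key, value in summarize_ping(PING_OUTPUT).items():
        print(f"{key}: {value}")
```

Loss percentage and standard deviation map directly onto the two perspectives named earlier in the talk: response time and consistency.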

AWS Seoul -> AWS China (100KB, tcpdump)

Transferred: sent 105672, received 2232 bytes, in 1.4 seconds
Bytes per second: sent 76756.9, received 1621.3
tcp.analysis.retransmission : 1

Network A -> AWS China (100KB, tcpdump)

Transferred: sent 105672, received 2232 bytes, in 7.0 seconds
Bytes per second: sent 15009.0, received 317.0
tcp.analysis.retransmission : 24

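Comparison results

§ Traceroute results
• The routing path between the Seoul region and the Beijing region is far more consistent

§ Ping results
• Network A -> AWS China: 53 ~ 173 ms (ping response varies widely with the routing path)
• Seoul region -> AWS China: 107 ~ 108 ms (small variance)

§ Data transfer (100KB) results
• Network A -> AWS China: 7.0 seconds / 24 TCP retransmissions
• Seoul region -> AWS China: 1.4 seconds / 1 TCP retransmission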

Region

Edge Location

12 Regions, 33 Availability Zones, 54 Edge Locations

AWS optimized network

Internet

What is different

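S3 Transfer Acceleration (New)

Amazon S3 Transfer Acceleration: embedded WAN acceleration

[Diagram: Uploader -> AWS Edge Location -> S3 Bucket. Optimized Throughput!]

• For moving data to geographically distant locations
• Up to 300% (6x) faster
• No firewall mods, no client software
• 54 global edge locations
• Change the endpoint; there is no need to change code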
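"Change the endpoint, not the code" can be illustrated by how the object URL differs. The sketch below builds the standard virtual-hosted URL and the Transfer Acceleration URL for the same object; the bucket and key names are hypothetical, and acceleration must already be enabled on the bucket for the second endpoint to work.

```python
from urllib.parse import quote

def s3_object_url(bucket, key, accelerate=False):
    """Build a virtual-hosted-style S3 URL, optionally via the accelerate endpoint."""
    if accelerate and "." in bucket:
        # Transfer Acceleration requires a DNS-compliant bucket name without periods.
        raise ValueError("bucket names containing '.' cannot use Transfer Acceleration")
    host = "s3-accelerate.amazonaws.com" if accelerate else "s3.amazonaws.com"
    return f"https://{bucket}.{host}/{quote(key)}"

if __name__ == "__main__":
    # Hypothetical bucket and key, for illustration only.
    print(s3_object_url("my-game-assets", "patch/v1.bin"))
    print(s3_object_url("my-game-assets", "patch/v1.bin", accelerate=True))
```

Only the host name changes between the two URLs, so an uploader can switch by configuration alone.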

Region

Edge Location

12 Regions, 32 Availability Zones, 54 Edge Locations

Need to update

We're here :)

CloudFront dynamic content: global infrastructure

Elastic Load Balancing

Dynamic content

Amazon EC2

Static content

Amazon S3

* (default)

/error/*
/assets/*

Amazon CloudFront: example.com

CloudFront dynamic content: configuring multiple origins

CloudFront

Customer Location: www.mysite.com

Path Pattern Matching: /*.jpg; /*.php etc.

GET http://mysite.com/images/1.jpg to ORIGIN A
GET http://mysite.com/index.php to ORIGIN B
GET http://mysite.com/web/home.css to ORIGIN C
GET http://mysite.com/* (DEFAULT) to ORIGIN D

Origin A: S3 bucket

Origin B: www.mysite.com

Origin C: S3 Bucket

Origin D: www.mysite.com

Path Pattern Matching

/*.php

/images/*.jpg

/web/*.css

/*.* (DEFAULT)
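The routing above can be sketched as an ordered list of path patterns where the first match wins and everything else falls through to the default origin. This is an illustration only: it uses Python's `fnmatchcase` as a stand-in for CloudFront's own wildcard matching, and the names `BEHAVIORS` and `pick_origin` are made up for the example.

```python
from fnmatch import fnmatchcase

# Ordered (path pattern, origin) pairs taken from the slide; first match wins.
BEHAVIORS = [
    ("/*.php", "ORIGIN B"),
    ("/images/*.jpg", "ORIGIN A"),
    ("/web/*.css", "ORIGIN C"),
]
DEFAULT_ORIGIN = "ORIGIN D"  # the (DEFAULT) behavior

def pick_origin(path):
    """Return the origin that would serve `path` under the patterns above."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN

if __name__ == "__main__":
    for p in ("/images/1.jpg", "/index.php", "/web/home.css", "/api/login"):
        print(p, "->", pick_origin(p))
```

Because evaluation is ordered, more specific patterns should be listed before broader ones.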

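CloudFront dynamic content: CloudFront behavior

DEMO!

Performance comparison for a web server in the Virginia region

Network B -> us-east-1b, c4.large; unit: ms

Size | Direct: min | Direct: mean | Direct: median | Via CF: min | Via CF: mean | Via CF: median
1k | 529 | 949 | 926 | 468 | 693 | 645
2k | 588 | 1090 | 1173 | 481 | 701 | 699
4k | 879 | 1158 | 1129 | 492 | 809 | 750

Look only at the relative values. This measures only how fast packets are delivered, and the values change with the network and location they are measured from.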

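HTTP(s) games

[Diagram: Client -> AWS Edge Location -> Optimized Path -> Game Server, US region]

• HTTP(s)-based games use CloudFront's dynamic content support.
• Each global edge connects to the game service region over an optimized network path.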

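TCP games

[Diagram: Client -> Proxy in the local AWS Region -> Optimized Path -> US region]

• TCP-based games build a Proxy server in the AWS region where the clients are.
• Each Proxy server connects to the game service region over an optimized network path.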
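The per-region Proxy for TCP games can be as simple as a byte relay. The following is a minimal sketch using Python's `asyncio`, not the implementation used in the talk. The function names and any host/port values are hypothetical, and a production proxy would add timeouts, connection limits, and logging.

```python
import asyncio

async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def start_proxy(listen_host, listen_port, upstream_host, upstream_port):
    """Listen near the players and relay each connection to the game region."""
    async def handle(client_reader, client_writer):
        up_reader, up_writer = await asyncio.open_connection(upstream_host, upstream_port)
        # Relay both directions concurrently; one side closing ends the session.
        await asyncio.gather(
            pipe(client_reader, up_writer),
            pipe(up_reader, client_writer),
            return_exceptions=True,
        )
    return await asyncio.start_server(handle, listen_host, listen_port)

async def main():
    # Hypothetical endpoints: replace with the real game server address.
    server = await start_proxy("0.0.0.0", 7000, "game.example.com", 7000)
    async with server:
        await server.serve_forever()

# To run the relay: asyncio.run(main())
```

The client's TCP handshake then terminates at the nearby proxy, while the long-haul leg runs between AWS regions.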

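A global one-region game service?

§ What was verified for performance
• Routing path
• Latency
• Latency variance
• Transfer of a fixed-size file -> packet loss and retransmissions

§ Conclusions from the verification
• An optimized Internet path
• Keep-Alive

§ AWS services
• HTTP(s): Amazon CloudFront dynamic content support
• TCP: Proxy server spots implemented per region, connected to each AWS region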

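Consider the following when designing a global single-region game

§ Accept latency
• Allow for 500 ms ~ 1 s of latency from anywhere in the world
• Reflect the game type in the design: asynchronous processing, buffering, eventual consistency (UDP)

§ Choose infrastructure that can support it
• AWS: the individual regions, CloudFront, per-region Proxy servers

§ Define the service coverage and verify it
• Choice of regions: Asia, Europe, worldwide?
• Consider and test the conditions of each local network
• Be decisive about dropping areas where latency is severe

§ Be bold about changing the server architecture
• Minimize server processing, so that the network accounts for most of the latency
• Boldly restructure the design around caches, in-memory DBs, NoSQL, and so on
• Remove performance bottlenecks inside the game server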

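3. System optimization on AWS

What does compute performance mean on AWS?

• Choosing an instance type has the same effect as tuning resource performance
• The key is finding the ideal instance type for the workload

Instance selection = performance tuning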

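Example: system calls in an HTTP game application

Tip 1: Minimize system calls when designing the game server

§ Single thread vs. multiple threads
• Threads contend for the same resources, e.g. files, sockets, specific memory regions

§ Minimize O/S locks
• The more contending access requests there are, the more extremely expensive locks become.

§ Limit the use of queues
• Input and output on the event queues between processing stages account for most of the total cost, e.g. when a queue is empty or full, contention or cache coherence traffic makes it expensive
• Use structures that need no O/S lock, such as a ring buffer

§ Basic rule
• Use async calls for throughput; use caches, memory, and NoSQL to cut latency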
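The ring buffer mentioned above can be sketched as a single-producer, single-consumer structure in which each index is written by only one side, so no O/S lock is taken. This Python version only illustrates the data structure; the class name is made up, and a real game server would typically implement it in C or C++ with atomic loads and stores.

```python
class SpscRingBuffer:
    """Fixed-size ring buffer for one producer thread and one consumer thread."""

    def __init__(self, capacity):
        # One slot is kept empty so that "full" and "empty" are distinguishable.
        self._slots = [None] * (capacity + 1)
        self._head = 0  # next slot to read; advanced only by the consumer
        self._tail = 0  # next slot to write; advanced only by the producer

    def try_push(self, item):
        """Producer side: return False instead of blocking when the buffer is full."""
        nxt = (self._tail + 1) % len(self._slots)
        if nxt == self._head:
            return False
        self._slots[self._tail] = item
        self._tail = nxt
        return True

    def try_pop(self):
        """Consumer side: return (True, item), or (False, None) when empty."""
        if self._head == self._tail:
            return False, None
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        return True, item

if __name__ == "__main__":
    events = SpscRingBuffer(capacity=2)
    print(events.try_push("move"), events.try_push("fire"), events.try_push("chat"))
    print(events.try_pop(), events.try_pop(), events.try_pop())
```

Both operations return immediately rather than blocking, which keeps the full and empty cases from turning into lock contention.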

x86 CPU virtualization: prior to Intel VT-x

• Binary translation for privileged instructions
• Para-virtualization (PV)
• PV has to go through the VMM, which adds latency
• Applications that make system calls are affected even more

VMM

Application

Kernel

PV

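x86 CPU virtualization: after Intel VT-x

• x86 processors did not meet the Popek and Goldberg virtualization requirements -> it was hard to add a general-purpose virtual machine.
• Hardware assisted virtualization (HVM)
• PV-HVM selectively uses PV drivers where instructions are slow to emulate, e.g. network and block I/O

Kernel

Application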

VMM

PV-HVM

Tip 2: Use PV-HVM AMIs with EBS

Review: C4 Instances

• Custom Intel E5-2666 v3 at 2.9 GHz
• P-state and C-state controls

Model | vCPU | Memory (GiB) | EBS (Mbps)
c4.large | 2 | 3.75 | 500
c4.xlarge | 4 | 7.5 | 750
c4.2xlarge | 8 | 15 | 1,000
c4.4xlarge | 16 | 30 | 2,000
c4.8xlarge | 36 | 60 | 4,000

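Tip 3: Consider P-state control for Advanced Vector Extensions (AVX2)

• If an application demands heavy AVX2 operations on all cores, the processor tries to draw more power than it has available.
• The processor transparently lowers its frequency
• The change in CPU frequency can slow the application down
• Visual Studio 2012 introduced AVX2 support; the /arch:AVX2 switch is supported from VS 2012 Update 2.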

I/O : Granting in pre-3.8.0 Kernels

• Requires “grant mapping” prior to 3.8.0• Grant mappings are expensive operations due to TLB flushes

read(fd, buffer,…)

I/O : Granting in 3.8.0+ Kernels, Persistent and Indirect

• Grant mappings are setup in a pool once• Data is copied in and out of the grant pool

read(fd, buffer…)

Copy to and from grant pool

Tip 4: Use a 3.8+ kernel

• Amazon Linux 13.09 or later
• Ubuntu 14.04 or later
• RHEL7 or later
• Etc.
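A quick way to check Tip 4 on a running instance is to compare the kernel release string against 3.8. The sketch below is one possible check; the function names are made up for illustration, and it assumes release strings that start with a `major.minor` version, as Linux ones do.

```python
import platform
import re

def kernel_version(release):
    """Extract (major, minor) from a release string such as '3.13.0-24-generic'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2) or 0)

def meets_kernel_3_8(release):
    """True when the kernel is 3.8 or newer."""
    return kernel_version(release) >= (3, 8)

if __name__ == "__main__":
    release = platform.release()
    print(release, "-> 3.8+ kernel:", meets_kernel_3_8(release))
```

The same function can be applied to release strings collected from a fleet to find instances still running older kernels.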

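Device requests: Enhanced Networking

• SR-IOV removes the driver domain step
• The physical network device exposes virtual functions to the instance
• A special driver is required:
• The instance OS must recognize it
• EC2 must be told that the instance can use it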

[Diagram: After Enhanced Networking. Labels: Hardware; Driver Domain, Guest Domain, Guest Domain; VMM; Frontend driver, NIC Driver, Backend driver, Device Driver; Physical CPU, Physical Memory, SR-IOV Network Device; Virtual CPU, Virtual Memory; CPU Scheduling; Sockets; Application]


Tip 5: Use Enhanced Networking

• Very high packets/second
• Very low variance in latency
• The instance OS must support it
• Check the SR-IOV property of the instance or image
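Checking the SR-IOV property can be scripted. The sketch below interprets the response of the EC2 `DescribeInstanceAttribute` call for the `sriovNetSupport` attribute. It assumes `boto3` is installed and credentials are configured for the part that actually calls AWS; the helper names and the sample instance ID are hypothetical.

```python
def sriov_enabled(attribute_response):
    """True when the response reports sriovNetSupport = 'simple'."""
    return attribute_response.get("SriovNetSupport", {}).get("Value") == "simple"

def fetch_sriov_attribute(instance_id):
    """Query EC2 for the attribute (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without it
    ec2 = boto3.client("ec2")
    return ec2.describe_instance_attribute(
        InstanceId=instance_id, Attribute="sriovNetSupport"
    )

if __name__ == "__main__":
    # Hand-written sample responses, for illustration only.
    print(sriov_enabled({"InstanceId": "i-0123456789abcdef0",
                         "SriovNetSupport": {"Value": "simple"}}))
    print(sriov_enabled({"InstanceId": "i-0123456789abcdef0",
                         "SriovNetSupport": {}}))
```

An empty `SriovNetSupport` value means the attribute is not set, so Enhanced Networking is not in effect for that instance.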

5. EBS

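What is EBS?

• Network block storage as a service
• An EBS volume can be attached to any EC2 instance in the same Availability Zone
• Designed for 99.999% (five nines) availability
• 2 million volumes are created every day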

EBS volume types

• Magnetic: st1, sc1
• General purpose (SSD): gp2
• Provisioned IOPS (SSD): io1

EBS volume types: Solid-State Drives (SSD) and Hard Disk Drives (HDD)

Volume type | General Purpose SSD (gp2) | Provisioned IOPS SSD (io1) | Throughput Optimized HDD (st1) | Cold HDD (sc1)
Volume size | 1 GiB - 16 TiB | 4 GiB - 16 TiB | 500 GiB - 16 TiB | 500 GiB - 16 TiB
Max IOPS/volume | 10,000 | 20,000 | 500 | 250
Max throughput/volume | 160 MiB/s | 320 MiB/s | 500 MiB/s | 250 MiB/s

Four key components for improving performance

• EC2 instance
• I/O
• EBS
• Network link

Reference table: example workloads on AWS

Workload/software | Typical block size | Random/Seq? | Max EBS @ 500 Mbps instances | Max EBS @ 1 Gbps instances | Max EBS @ 10 Gbps instances
Oracle DB | Configurable: 2 KB–16 KB, default 8 KB | random | ~7,800 IOPS | ~15,600 IOPS | ~48,000 IOPS
Microsoft SQL Server | 8 KB w/ 64 KB extents | random | ~7,800 IOPS | ~15,600 IOPS | ~48,000 IOPS
MySQL | 16 KB | random | ~4,000 IOPS | ~7,800 IOPS | ~48,000 IOPS
PostgreSQL | 8 KB | random | ~7,800 IOPS | ~15,600 IOPS | ~48,000 IOPS
MongoDB | 4 KB | sequential | ~15,600 IOPS | ~31,000 IOPS | ~48,000 IOPS
Apache Cassandra | 4 KB | random | ~15,600 IOPS | ~31,000 IOPS | ~48,000 IOPS
GlusterFS | 128 KB | sequential | ~500 IOPS | ~1,000 IOPS | ~6,000 IOPS
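The IOPS figures in the reference table follow from dividing the instance's EBS link bandwidth by the block size. The sketch below reproduces that arithmetic, assuming the bandwidth column headings are in bits per second (500 Mbps, 1 Gbps) and that 1 KB is 1,000 bytes. The function name is made up for illustration.

```python
def link_bound_iops(link_mbps, block_kb):
    """Upper bound on IOPS when the EBS network link is the bottleneck.

    Assumptions: link_mbps is megabits per second, block_kb is kilobytes
    of 1,000 bytes, and protocol overhead is ignored.
    """
    bytes_per_second = link_mbps * 1_000_000 / 8
    return bytes_per_second / (block_kb * 1_000)

if __name__ == "__main__":
    for name, block_kb in (("MySQL", 16), ("PostgreSQL", 8), ("MongoDB", 4), ("GlusterFS", 128)):
        print(f"{name:10s} {block_kb:4d} KB:",
              round(link_bound_iops(500, block_kb)), "IOPS @ 500 Mbps,",
              round(link_bound_iops(1000, block_kb)), "IOPS @ 1 Gbps")
```

The results land close to the table's rounded values for the two slower links; the fastest column is capped by the per-node IOPS limit instead of the link.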

Example workload

• Transactional (OLTP)
• Examples: store website, item trading, game metadata storage
• Benchmark: MySQL + sysbench

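Identifying the components to improve and the bottlenecks

Initial test configuration

• Availability Zone: US West (Oregon)
• Instance type: m2.4xlarge
• vCPU: 8
• Memory: 68.4 GiB
• EBS-optimized
• Data volume: 500 GiB EBS magnetic
• OS: Amazon Linux 2015.03.1
• CPU: Intel Xeon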

Parallel scalability test: increasing the number of threads

[Chart: Transactions (n) versus MySQL threads (2 to n); series: Baseline]

EBS-optimized instances

• Most instance families support the EBS-optimized flag
• EBS-optimized instances now support up to 4 Gb/s
• Drive 32,000 16K IOPS or 500 MB/s
• Available by default on newer instance types
• EC2 *.8xlarge instances support 10 Gb/s networking
• The maximum IOPS supported per node is ~48,000 IOPS @ 16K I/O

Changing the instance

• Availability Zone: US West (Oregon)
• Instance type: r3.2xlarge
• vCPU: 8
• Memory: 61 GiB
• EBS-optimized
• EBS volume: 500 GiB magnetic
• OS: Amazon Linux 2015.03.1
• CPU: Intel Xeon E5-2670 v2

EBS-optimized, latest-generation instance

[Chart: Transactions (n) versus MySQL threads (2 to n); series: Baseline, r3.2xlarge; annotated gain: 25%]

Choosing the volume type

EBS magnetic latency:
• Read: 10-40 ms
• Write: 2-10 ms

SSD-backed latency:
• Read/Write: single-digit ms

Pre-warming

Changing the instance: EBS volumes

• Availability Zone: US West (Oregon)
• Instance type: r3.2xlarge
• vCPU: 8
• Memory: 61 GiB
• EBS-optimized
• Boot volume: 8 GiB – EBS general purpose
• Data volume: 500 GiB – EBS general purpose
• OS: Amazon Linux 2015.03.1

Optimization: Volume selection

[Chart: Transactions (n) versus MySQL threads (2 to n); series: Baseline, r3.2xlarge, r3.2xlarge gp2; annotated gains: 19%, 50%]

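Tip 6: Use the latest instances and options to improve EBS performance

• Choose the instance best suited to the workload
• Use latest-generation instances wherever possible
• If volume performance is the problem, choose an SSD volume type
• If a small volume needs high IOPS, choose the io1 type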
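The last rule, small volume plus high IOPS means io1, can be expressed as a rough decision helper. The sketch below takes the per-volume IOPS caps from the volume-type table in this deck. The gp2 baseline of 3 IOPS per GiB with a floor of 100 IOPS is an assumption not stated in the slides, and the function names are made up for illustration.

```python
GP2_IOPS_PER_GIB = 3     # assumption: gp2 baseline scaling, not stated in the slides
GP2_MIN_IOPS = 100       # assumption: gp2 baseline floor, not stated in the slides
GP2_MAX_IOPS = 10_000    # per-volume cap from the volume-type table
IO1_MAX_IOPS = 20_000    # per-volume cap from the volume-type table

def gp2_baseline_iops(size_gib):
    """Baseline IOPS a gp2 volume of this size would get under the assumptions above."""
    return min(max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib), GP2_MAX_IOPS)

def suggest_ssd_volume_type(size_gib, required_iops):
    """Suggest gp2 when its baseline suffices, io1 otherwise, striping beyond one volume."""
    if required_iops <= gp2_baseline_iops(size_gib):
        return "gp2"
    if required_iops <= IO1_MAX_IOPS:
        return "io1"
    return "striped io1"

if __name__ == "__main__":
    for size, iops in ((100, 200), (100, 5_000), (4_000, 5_000), (500, 30_000)):
        print(f"{size} GiB needing {iops} IOPS ->", suggest_ssd_volume_type(size, iops))
```

Treat the output as a starting point for a benchmark rather than a sizing decision, in line with the talk's advice to measure everything yourself.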

EBS IOPS vs. Throughput

20,000 IOPS PIOPS volume

20,000 IOPS

320 MB/s throughput

You can achieve 20,000 IOPS when driving smaller I/O operations

You can achieve up to 320 MB/s when driving larger I/O operations
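The two limits above interact through the I/O size: small operations hit the IOPS cap first, large ones hit the throughput cap. A minimal sketch of that relationship follows, using the slide's 20,000 IOPS and 320 MB/s figures and treating 1 MB as 1,000 KB. The function name is made up for illustration.

```python
def effective_limits(io_size_kb, max_iops=20_000, max_throughput_mb_s=320):
    """Return (achievable IOPS, achievable MB/s) for a given I/O size."""
    throughput_bound_iops = max_throughput_mb_s * 1000 / io_size_kb
    iops = min(max_iops, throughput_bound_iops)
    return iops, iops * io_size_kb / 1000

if __name__ == "__main__":
    for size_kb in (4, 16, 64, 256):
        iops, mb_s = effective_limits(size_kb)
        print(f"{size_kb:3d} KB I/O: {iops:8.0f} IOPS, {mb_s:6.1f} MB/s")
```

With these figures the crossover sits at 16 KB: below it the volume is IOPS-bound, above it throughput-bound.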

Striping

Increases performance, or capacity, or both

Don’t mix volume types

Typically RAID 0 or LVM stripe

Avoid RAID for redundancy

EBS

EC2

EBS-optimized instance

Four key components: balanced = even utilization

EC2

A “boatload” of I/O

Right-sized EBS

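Summary of optimization tips

1. Minimize system calls when designing the game server
2. Use PV-HVM AMIs
3. Consider P-state control for Advanced Vector Extensions (AVX2)
4. Use a 3.8+ kernel
5. Use Enhanced Networking
6. Use the latest instances and options to improve EBS performance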