Beating the Virtualization Tax for NoSQL Workloads with OSv

Post on 21-May-2015

NoSQL data stores such as Cassandra and Redis are memory- and CPU-intensive workloads that can perform unacceptably when run under virtualization in production environments. Over the past two years, the OSv project has profiled some of the important bottlenecks for NoSQL applications and designed a new open-source guest environment to improve performance in key areas such as JVM memory allocation and network throughput. A side effect of this performance work has been to move some difficult tuning tasks off the administrator's to-do list and into the guest OS, where they can be handled automatically and with better responsiveness.

Beating the virtualization tax for NoSQL workloads with OSv

Don Marti and Glauber Costa, Cloudius Systems

What is OSv?

[Diagram, before: Hardware > Hypervisor > guest OS > JVM > App Server > Java App, alongside guest OS > C++ App]

[Diagram, after: Hardware > Hypervisor > OSv VM > App Server > Your App, alongside OSv VM > Your App]

Performance and Manageability

■ Performance
  ■ Single address space
  ■ Simplified networking

■ Manageability
  ■ No local users
  ■ No local config files

■ Simplicity: less than 1 second boot time

Trust is not free

          Colocated (same CPU)    Apart (different CPU)
Linux     905 ns                  13148 ns
OSv       328 ns                  1402 ns

■ Context switch microbenchmark
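The ratios implied by the microbenchmark table can be worked out directly. The short sketch below only restates the four numbers from the slide and divides them; the dictionary layout and function name are illustrative choices, not part of any OSv tooling.

```python
# Context switch latencies in nanoseconds, copied from the slide.
latency_ns = {
    "Linux": {"colocated": 905, "apart": 13148},
    "OSv": {"colocated": 328, "apart": 1402},
}


def speedup(case):
    """Return how many times faster OSv is than Linux for one case."""
    return latency_ns["Linux"][case] / latency_ns["OSv"][case]


for case in ("colocated", "apart"):
    print(f"{case}: {speedup(case):.1f}x")
# colocated: 2.8x
# apart: 9.4x
```

On these figures the gap is widest when the two threads run on different CPUs.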

No config files

■ REST API for all configurable items

■ AJAX GUI for config and management (optional)
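As a rough illustration of what "REST API for all configurable items" means for an administrator, the sketch below reads one item over HTTP instead of opening a config file inside the guest. The host address, port 8000, and the `/os/version` path are assumptions made for this example and should be checked against the OSv API documentation; `api_url` and `get_item` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def api_url(host, path, port=8000):
    """Build the URL for one item. Port 8000 is an assumed default."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def get_item(host, path, port=8000, timeout=5.0):
    """Fetch one item and decode it, assuming the server replies with JSON."""
    with urlopen(api_url(host, path, port), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage against a running guest (address and path are assumed):
# print(get_item("192.0.2.10", "/os/version"))
```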

VM build and deploy

■ Capstan: rapid VM build (and run)
  ■ Docker-like config file
  ■ Takes about 3 seconds to build
  ■ Runs on any OS (written in Go)

1. Network performance

Traditional stack Network Channels

OSv-specific networking

■ Bypass socket layer
■ Full zero-copy achievable

Memcached benchmark

Requests/s (higher is better)

OSv networking: inbound

■ Handle inbound data without the overhead of the socket layer.

■ Called for every incoming packet.
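One way to picture the inbound path is a classifier that runs for every incoming packet and hands the payload straight to a per-connection queue, so that the consuming application thread picks it up without a trip through a socket layer. The toy model below is a sketch of that idea only. `NetChannel`, `Classifier`, and the flow tuple are hypothetical names, and none of this is OSv's actual interface.

```python
from collections import deque


class NetChannel:
    """Per-connection queue that the consuming thread drains itself."""

    def __init__(self):
        self.queue = deque()

    def push(self, payload):
        self.queue.append(payload)

    def drain(self):
        """Return all queued payloads in arrival order and empty the queue."""
        items = list(self.queue)
        self.queue.clear()
        return items


class Classifier:
    """Maps a flow (src ip, src port, dst ip, dst port) to its channel."""

    def __init__(self):
        self.channels = {}
        self.slow_path = []  # packets that matched no registered flow

    def register(self, flow):
        channel = NetChannel()
        self.channels[flow] = channel
        return channel

    def on_packet(self, flow, payload):
        """Called for every incoming packet. Returns True on the fast path."""
        channel = self.channels.get(flow)
        if channel is None:
            self.slow_path.append((flow, payload))
            return False
        channel.push(payload)
        return True


classifier = Classifier()
flow = ("198.51.100.7", 40000, "192.0.2.10", 11211)
channel = classifier.register(flow)
classifier.on_packet(flow, b"get key1\r\n")
print(channel.drain())
```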

OSv networking: outbound

■ Send the frame directly to the ifnet device.

■ PF hook is called with relevant locks held.

2. Memory management

Manual memory tuning

■ 240,000 Google hits for [jvm heap size]
■ This is a job for a computer, not a human being.

JVM Memory System memory

Lend memory

Memory ballooning

Shrinker API

■ Register a callback
■ OSv calls your shrinker
■ No manual tuning of cache size

■ Used in osv-memcached
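The register-a-callback pattern described above can be sketched in a few lines. This is a toy model rather than the OSv shrinker interface. `MemoryManager`, `register_shrinker`, `reclaim`, and `Cache` are hypothetical names, and the sketch assumes the callback receives a target number of bytes and reports how many it actually freed.

```python
class MemoryManager:
    """Stand-in for the OS side: asks registered shrinkers to free memory."""

    def __init__(self):
        self._shrinkers = []

    def register_shrinker(self, callback):
        self._shrinkers.append(callback)

    def reclaim(self, target_bytes):
        """Call shrinkers in order until the target is met or none remain."""
        freed = 0
        for shrink in self._shrinkers:
            if freed >= target_bytes:
                break
            freed += shrink(target_bytes - freed)
        return freed


class Cache:
    """Application cache with no fixed size limit of its own."""

    def __init__(self):
        self._items = {}  # dicts keep insertion order, oldest entry first

    def put(self, key, value):
        self._items[key] = value

    def size(self):
        return sum(len(value) for value in self._items.values())

    def shrink(self, target_bytes):
        """Evict oldest entries until at least target_bytes are released."""
        freed = 0
        while self._items and freed < target_bytes:
            oldest = next(iter(self._items))
            freed += len(self._items.pop(oldest))
        return freed


manager = MemoryManager()
cache = Cache()
manager.register_shrinker(cache.shrink)
for i in range(4):
    cache.put(f"key{i}", b"x" * 100)
print(manager.reclaim(250), cache.size())  # 300 100
```

The cache never needs a configured maximum size. It grows freely and gives memory back only when asked, which is the point of the "no manual tuning" bullet.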

Redis on OSv

“porting” Redis

make CFLAGS="-fPIC" LDFLAGS="-shared" MALLOC=libc

■ -fPIC option
■ -shared option
■ MALLOC=libc (not required)

■ OSv malloc is integrated with kernel, optimized for small objects.
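Building with `-fPIC` and `-shared` yields a shared object rather than a conventional executable. A quick way to confirm that a build produced the intended file type is to read the `e_type` field of the ELF header, as in the sketch below. The helper names are illustrative and the output file name in the usage comment is an assumption. Note that position-independent executables also report `ET_DYN`, so this check cannot distinguish the two.

```python
import struct

ET_EXEC = 2  # conventional, non-relocatable executable
ET_DYN = 3   # shared object (also used by position-independent executables)


def elf_type(header):
    """Return e_type from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] is EI_DATA: 1 means little-endian, 2 means big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type


def is_shared_object(path):
    """Check whether the file at path has ELF type ET_DYN."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN


# Hypothetical usage after the make command above (file name is assumed):
# print(is_shared_object("src/redis-server"))
```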

Details on Redis test

■ Ubuntu 14.04
  ■ Affinity turned on in the Ubuntu guest

■ Both systems: Amazon c3.8xlarge
  ■ 32 vCPUs
  ■ 60GB RAM
  ■ Intel Xeon E5-2680 v2 @ 2.8GHz
  ■ 10 Gigabit networking

Redis details

■ Lua scripting, PUB/SUB, etc.: it all works
■ Redis uses fork for the “BGSAVE” command.
  ■ This is the only functionality not supported.
  ■ Single address space, no copy on write.
■ Less than 20 MB Redis image, now bootable everywhere.
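The BGSAVE limitation follows from what fork provides on a conventional Unix system: the child process gets its own copy-on-write view of memory frozen at the moment of the fork, so it can write out a consistent snapshot while the parent keeps changing data. The sketch below demonstrates that behavior with `os.fork` on a Unix-like host. It illustrates the mechanism only and is not Redis code; `snapshot_via_fork` is a hypothetical name.

```python
import json
import os


def snapshot_via_fork(store):
    """Fork, mutate the store in the parent, and return what the child saw."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a private view taken at fork time.
        os.close(read_fd)
        try:
            with os.fdopen(write_fd, "w") as out:
                json.dump(store, out)
        finally:
            os._exit(0)
    # Parent: keep modifying the data while the child writes its snapshot.
    os.close(write_fd)
    store["counter"] = store.get("counter", 0) + 1
    with os.fdopen(read_fd) as inp:
        snapshot = json.load(inp)
    os.waitpid(pid, 0)
    return snapshot


store = {"counter": 1}
print(snapshot_via_fork(store), store)  # {'counter': 1} {'counter': 2}
```

With a single address space and no copy on write, there is no second view of memory for a child to hold, which is why this one command is unsupported.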

Conclusions

■ Memory and CPU intensive loads can have unacceptable performance under virtualization.

■ The OSv project has profiled important bottlenecks for NoSQL applications.

■ Virtual appliances available: Cassandra, Tomcat, Redis

■ Work in progress. Please join us.

Thank you

Download or try it in the cloud at: http://osv.io/

Follow @CloudiusSystems on Twitter

Begin backup slides

Run most workloads on any cloud

OSv

Rapid VM build and deploy

■ Build an OSv VM in ~3 seconds
■ Small VMs (as small as 10MB) for fast deploy
■ Boot in ~1 second for resilience and scaling
■ Enable continuous delivery

Performance

■ System calls are free
■ Context switches are cheap (4x speed of recent Linux)

■ Network performance significantly faster
  ■ around 20% with netperf over recent Linux
  ■ more than 50% for some UDP workloads

■ SpecJVM between 3 and 5% faster
■ Boots in < 1 second
  ■ As little as 100-200ms without local filesystem

Virtualization Oriented

■ No spinlocks in the kernel
  ■ Simplest solution to lock holder preemption

■ No complicated hardware model
■ Fair scheduler
■ Huge pages
■ Single address space: one set of page tables

So what about containers?

■ Docker is awesome! Low overhead!
  ■ Small artifacts
  ■ Fast deployment
  ■ Fast startup (container start, not OS boot)

■ Docker is awesome! Easy builds!
  ■ Simple configuration file
  ■ Minimal additional work on top of application build

Docker is awesome

Hypervisors are awesome, too

■ Live migration (load balancing, hardware maintenance)
■ Multiple kernel versions available
■ Running VMs everywhere is awesome!
  ■ Public cloud, private cloud, existing hypervisor...
■ Security is awesome!
  ■ Container attack surface is the full kernel interface.
  ■ Hypervisor attack surface is small.

■ Fast boot, fast provisioning
■ High performance
■ Tiny footprint
■ Virtualized

■ Live migration
■ Elasticity
■ MMU access

Everything is awesome

Different customization levels

● JVM-based application
  ○ No effort

● Traditional C/C++ application
  ○ More effort

● Virtio-app
  ○ NFV - highest gain

C++ simplifies flow of control

Run existing applications

■ Open source, POSIX-compatible
■ Run native Linux applications
■ Run Java applications

Network channels: before

Network channels: after