A Storage Story #ChefConf2013

Post on 20-Jun-2015

Kyle Bader

kyle.bader@dreamhost.com

@mmgaggle

DreamHost.com

A Storage Story

About Me

Dad, husband, technologist.

Sr. Systems Engineer

@DreamHost

free software, Linux internals, storage,

networking, security, monitoring,

distributed systems, automation

DreamHost

Outline

DreamHost Storage History

Anatomy of Ceph

Automating Storage

Destro

- DreamHost's first web server

- Pentium 100

- SCSI storage

- Shared T1 line

DH2

NetApp

NetApp Fabric Attached Storage

- 15k RPM Fibre Channel drives

- Filer heads serve NFS

- Fast failover

- Large failure domains

- Expensive

- Low density

Coraid

Coraid SAN

- Shelves carry SATA devices, provide AoE volumes

- Head units mount AoE volumes, XFS, NFS shares

- Linux!

- Fast failover

- Large failure domains (single L2 segment)

Thumper

Sun Fire X4500

- 45 Drives in a 4U chassis

- Legendary hardware

- Fast failover

- High density

- Large failure domains

- SATA

- Heavy

Solution: Hybrid

BlueArc "Titanic"

- Switched Fibre Channel

- Head units serve NFS

- Tiered storage, FC/SATA

- Fast failover

- Larger failure domain (than NetApp)

- Software bugs :(

- Tiering: find -atime

BlueArc

UNLIMITED

Thoughting..

Mixed Strategy

- Separate email and web storage

- Email I/O is heavily random, with lots of small files

- Web storage needs to be dense

- FC NAS for email

- SATA RAID for web storage

- SATA RAID-Z for backups

Local RAID

Local RAID

- RAID6, RAID10

- SATA, SAS disks

- ext3, XFS

- Shrink failure domain

- Great density

- Slower failover

- RAID Controllers..

Sigh

Ceph

Ceph

- Open source

- Built with COTS hardware

- Distributed and replicated

- No single point of failure

- Consistent

- Self healing and self managing

Building Blocks

Monitors:

- Maintain cluster map

- Provide consensus for distributed decision making

- Must have an odd number

- These do not serve stored objects to clients
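The odd-number guidance for monitors follows from majority arithmetic. A minimal sketch, assuming decisions need a strict majority of monitors; the function names are illustrative and are not Ceph APIs:

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of a monitor set."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be lost while a majority still remains."""
    return monitors - quorum_size(monitors)


# Compare odd and even monitor counts.
for count in (3, 4, 5):
    print(count, quorum_size(count), tolerated_failures(count))
```

Under this assumption, going from three monitors to four raises the quorum size without raising the number of failures the cluster can absorb, so the extra monitor buys nothing.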

OSDs:

- One per disk (recommended)

- Serve stored objects to clients

- Intelligently peer to perform replication tasks

- Support object classes

Building Blocks

OSD States

Up: available and ready

Down: not available

In: current member of cluster

Out: not member of cluster

Cephstore

XFS, BTRFS, EXT4

Cluster

Ceph Consumers

Creating a Map

CRUSH

CRUSH

- Pseudo-random placement algorithm

- Ensures statistically even distribution

- Repeatable, deterministic

- Rule based configuration

- Replica count

- Infrastructure Topology

- Weighting
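The properties listed above (deterministic, pseudo-random, weighted, topology-aware, replica count) can be illustrated with a small sketch. This is not the CRUSH algorithm itself; it swaps in weighted rendezvous hashing as a simpler stand-in, and the OSD names, hosts, and weights are made up for illustration:

```python
import hashlib
import math

# Hypothetical cluster: OSD name -> (host, weight). Not from the talk.
osds = {
    "osd.0": ("host-a", 1.0),
    "osd.1": ("host-a", 1.0),
    "osd.2": ("host-b", 1.0),
    "osd.3": ("host-b", 2.0),
    "osd.4": ("host-c", 1.0),
    "osd.5": ("host-d", 1.0),
}


def _score(obj: str, osd: str, weight: float) -> float:
    """Deterministic pseudo-random score; larger weights tend to score higher."""
    digest = hashlib.sha256(f"{obj}:{osd}".encode()).digest()
    # Map 52 bits of the hash to a float strictly between 0 and 1.
    u = ((int.from_bytes(digest[:8], "big") >> 12) + 0.5) / 2**52
    return -weight / math.log(u)


def place(obj: str, cluster: dict, replicas: int = 3) -> list:
    """Pick `replicas` OSDs for an object, at most one per host."""
    ranked = sorted(
        cluster,
        key=lambda name: _score(obj, name, cluster[name][1]),
        reverse=True,
    )
    chosen, used_hosts = [], set()
    for name in ranked:
        host = cluster[name][0]
        if host in used_hosts:
            continue  # keep replicas in separate failure domains
        chosen.append(name)
        used_hosts.add(host)
        if len(chosen) == replicas:
            break
    return chosen


print(place("bucket/photo.jpg", osds))
```

Because every client computes the same scores from the same map, no lookup table is needed. Removing an OSD from the dictionary changes placement only for objects that had a replica on it, which is the behaviour the remap and backfill slides describe at a high level.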

CRUSH

OSD DOWN!

Remap and Backfill

Ceph Anatomy

RESTful Storage Service

DreamObjects

DreamObjects:

- Ceph Storage Cluster

- Ubuntu Linux (12.04)

- Managed by Opscode Chef

- S3 and Swift RESTful interfaces

- Highly durable (8 nines)

- 2+ PB raw capacity

DreamCompute

DreamCompute:

- Ceph Storage Cluster (RBD)

- Ubuntu Linux (12.04)

- Managed by Opscode Chef

- OpenStack

- Virtualized L2 and L3 networking

- Highly durable (8 nines)

- 3+ PB raw capacity

Automate

- Bootstrap cluster

- Packages and configuration

- Creates, destroys, and encrypts OSDs

- Roles map to pdsh genders

- User and SSH key management

- Push monitoring configurations
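The "roles map to pdsh genders" item can be pictured as generating a genders file from each node's role list. A rough sketch, assuming the usual one-line-per-node genders layout (node name, a space, then comma-separated attributes); the node and role names are hypothetical and the talk does not show how the mapping was actually built:

```python
def genders_lines(node_roles: dict) -> list:
    """Turn {node: [roles]} into genders-style lines, sorted for stable output."""
    lines = []
    for node in sorted(node_roles):
        attrs = ",".join(sorted(set(node_roles[node])))
        if attrs:  # skip nodes that have no roles
            lines.append(f"{node} {attrs}")
    return lines


# Hypothetical nodes and roles, for illustration only.
nodes = {
    "store01": ["ceph-osd"],
    "store02": ["ceph-osd"],
    "mon01": ["ceph-mon", "ceph-osd"],
}
print("\n".join(genders_lines(nodes)))
```

With a file like this in place, a command such as `pdsh -g ceph-osd uptime` would target every node carrying that role.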

Hard Stuff

Key management

Leader election

What we use

- Attributes

- Environments

- Search

- No databags

Environments

- Ceph package versions

- VIPs for API endpoints

- Package repository URI

- Ceph configuration data driven by attributes
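One way to picture attribute-driven configuration is a per-environment attribute tree that is rendered into an INI-style file. The sketch below is written in Python rather than Chef's Ruby DSL, and every attribute name and value in it is a placeholder, not DreamHost's actual data:

```python
import configparser
import io

# Hypothetical per-environment attributes; all values are placeholders.
ENVIRONMENTS = {
    "staging": {
        "ceph": {
            "version": "0.0.1-example",
            "repo_uri": "http://apt.example.com/ceph-staging",
            "api_vip": "192.0.2.10",
            "config": {"global": {"osd pool default size": 2}},
        }
    },
    "production": {
        "ceph": {
            "version": "0.0.1-example",
            "repo_uri": "http://apt.example.com/ceph",
            "api_vip": "192.0.2.20",
            "config": {"global": {"osd pool default size": 3}},
        }
    },
}


def render_ceph_conf(attrs: dict) -> str:
    """Render the 'config' subtree of an environment as INI-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    for section, options in attrs["ceph"]["config"].items():
        parser[section] = {key: str(value) for key, value in options.items()}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


print(render_ceph_conf(ENVIRONMENTS["production"]))
```

The same rendering code runs in every environment; only the attribute data differs, which is what lets a change be exercised in one environment before it reaches the next.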

Gated Environments

Development

Staging

Production

Operational Feedback

- Continuous functional testing

- Metrics, metrics, metrics

- Dashboards

Chef Infrastructure

- Chef cluster per datacenter

- Private Chef from Opscode

- Erchef is awesome

- Migrating legacy automation

Chef for Ceph

- Prototyped Ceph cluster automation with Chef

- Adapted Ceph to ease configuration

- Pushed some automation down into Ceph

- Move towards being configuration management (CM) agnostic

- Simplify Chef recipes

Chef the network

- DreamCompute utilizes ODM switches

- Cumulus Networks provides Linux based OS

- Custom Chef omnibus builds for PPC

- Ohai networking!

Resiliency Engineering

- Amazon (GameDay)

- Etsy (GameDay)

- Google (DiRT)

Infrastructure as Code

- Bare metal servers configured by code

- Network devices configured by code

- Block storage configured by code

- Virtual networking configured by code

Thanks

DreamHost

Sage Weil and Inktank

OpenStack Developers

Opscode

DreamObjects

Questions?

kyle.bader@dreamhost.com

@mmgaggle