Here There Be Turtles: Platform Ops in Public Cloud

Bridget Kromhout @bridgetkromhout

description

When I joined a startup already in progress as their first ops hire, I got a crash course in cloud operations. Running databases in EC2 without being on bare metal presents its own challenges; we also began using Hadoop and HBase on EMR, with tragicomic results. What monitoring existed was a twisty maze of half-measures, so improving our Mean Time To Lost Sleep required trying new tools and alerting strategies. And scaling performance meant relying on best practices and gut-feeling hunches. This talk will have appeal for those curious about AWS, about using MapReduce in the cloud, and about whether MongoDB is really “web scale”. (Spoiler alert: lolol.) Come for the EC2 trivia; stay for the table-flipping.

Transcript of Here There Be Turtles: Platform Ops in Public Cloud

Page 1: Here There Be Turtles: Platform Ops in Public Cloud

Here There Be Turtles: Platform Ops in Public Cloud

Bridget Kromhout

@bridgetkromhout

Page 2: Here There Be Turtles: Platform Ops in Public Cloud

Page 3: Here There Be Turtles: Platform Ops in Public Cloud

We are the largest online video distributor of international televised content, streaming the world's best movies, documentaries and TV shows on demand with professional subtitles.

Page 4: Here There Be Turtles: Platform Ops in Public Cloud

Page 5: Here There Be Turtles: Platform Ops in Public Cloud

Platform ops in public cloud?

Do you mean Platform as a Service?

How is this different from Infrastructure as a Service?

Page 6: Here There Be Turtles: Platform Ops in Public Cloud

Page 7: Here There Be Turtles: Platform Ops in Public Cloud

(previous gig) SaaS Life

normal traffic

decision to turn off

decision to turn back on

accidental removal

Page 8: Here There Be Turtles: Platform Ops in Public Cloud

Platform?

Page 9: Here There Be Turtles: Platform Ops in Public Cloud

Page 10: Here There Be Turtles: Platform Ops in Public Cloud

Page 11: Here There Be Turtles: Platform Ops in Public Cloud

AWS Regions* (containing availability zones)

* for some values of regions: Beijing & Sydney too

Page 12: Here There Be Turtles: Platform Ops in Public Cloud

Page 13: Here There Be Turtles: Platform Ops in Public Cloud

Storage

No procurement delays! All the IOPS! No waiting!

Yes, cloud storage is better than the bad old days in some ways, but with caveats.

Page 14: Here There Be Turtles: Platform Ops in Public Cloud

Alphabet Soup: EBS, SSDs, pIOPS

Go with SSDs for your Elastic Block Store.

EBS-optimized instances = faster network

Provisioned IOPS: guaranteed, but prevents bursting
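
As an illustration that is not from the slides, here is a minimal sketch of what asking for a Provisioned IOPS volume might look like. It only builds the keyword arguments that boto3's EC2 `create_volume` call accepts (`AvailabilityZone`, `Size`, `VolumeType`, `Iops`). The helper name, the zone, the size and the IOPS figure are all hypothetical values chosen for the example.

```python
def provisioned_iops_volume_params(availability_zone, size_gib, iops):
    """Build keyword arguments for an EBS Provisioned IOPS (io1) volume.

    Hypothetical helper for illustration only: it validates the inputs and
    returns a dict, and never talks to AWS itself.
    """
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    if iops <= 0:
        raise ValueError("iops must be positive")
    return {
        "AvailabilityZone": availability_zone,
        "Size": size_gib,      # volume size in GiB
        "VolumeType": "io1",   # Provisioned IOPS SSD
        "Iops": iops,          # the rate being paid for up front
    }


# Example values are assumptions, not recommendations from the talk.
params = provisioned_iops_volume_params("us-east-1a", 500, 4000)
print(params)

# With boto3 installed and credentials configured, the dict could be used as:
#   import boto3
#   boto3.client("ec2").create_volume(**params)
```

Keeping the request as plain data makes it easy to review what is being provisioned before anything is billed.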

Page 15: Here There Be Turtles: Platform Ops in Public Cloud

Page 16: Here There Be Turtles: Platform Ops in Public Cloud

Story Time!

Data stores and sadness (as a service)

Page 17: Here There Be Turtles: Platform Ops in Public Cloud

wow. such nosql. very webscale.

Page 18: Here There Be Turtles: Platform Ops in Public Cloud

Page 19: Here There Be Turtles: Platform Ops in Public Cloud

“a single write operation holds the lock exclusively, and no other read or write operations may share the lock.”
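
To make the quoted behavior concrete, here is a toy readers-writer lock. It is a sketch of the semantics the quote describes (readers may share, a writer excludes everyone), not a representation of how any database implements its locking. The class and method names are made up for the example.

```python
import threading


class ReadWriteLock:
    """Toy lock: readers may share it; a single writer holds it exclusively."""

    def __init__(self):
        self._mutex = threading.Lock()  # protects the two fields below
        self._readers = 0
        self._writer = False

    def try_read(self):
        """Take a shared hold unless a writer is active."""
        with self._mutex:
            if self._writer:
                return False
            self._readers += 1
            return True

    def end_read(self):
        with self._mutex:
            self._readers -= 1

    def try_write(self):
        """Take the exclusive hold only if nobody else holds the lock."""
        with self._mutex:
            if self._writer or self._readers > 0:
                return False
            self._writer = True
            return True

    def end_write(self):
        with self._mutex:
            self._writer = False


lock = ReadWriteLock()
print(lock.try_read(), lock.try_read())  # True True: reads share the lock
lock.end_read()
lock.end_read()
print(lock.try_write())                  # True: the writer gets in
print(lock.try_read(), lock.try_write()) # False False: everything else waits
lock.end_write()
```

While one write is in flight, every other read and write is turned away, which is why a write-heavy workload behind a single such lock queues up so quickly.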

Page 20: Here There Be Turtles: Platform Ops in Public Cloud

It’s 4am. Do you know what your EMR cluster is doing?

Page 21: Here There Be Turtles: Platform Ops in Public Cloud

StatsD

monitoring != alerting

Page 22: Here There Be Turtles: Platform Ops in Public Cloud

Page 23: Here There Be Turtles: Platform Ops in Public Cloud

If it moves, we track it. Sometimes we’ll draw a graph of something that isn’t moving yet, just in case it decides to make a run for it. -- Ian Malpass, Etsy

measure all the things
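
For a sense of how cheap "measure all the things" can be, here is a minimal sketch of emitting StatsD metrics by hand. It relies on the StatsD plain-text line format (`name:value|type`, with an optional `|@rate` sample rate) sent over UDP to the default port 8125. The metric names and the helper functions are hypothetical, and a real deployment would more likely use an existing client library.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Build one StatsD line, such as 'videos.played:1|c'.

    metric_type is a StatsD type code: 'c' counter, 'ms' timer, 'g' gauge.
    """
    line = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget one metric line over UDP (no reply is expected)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric names, for illustration only.
print(format_metric("videos.played", 1, "c"))
print(format_metric("emr.job.duration", 320, "ms"))
print(format_metric("api.requests", 1, "c", sample_rate=0.1))
```

Because the transport is UDP, instrumented code does not block or fail when the collector is down, which is part of what makes tracking everything practical.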

Page 24: Here There Be Turtles: Platform Ops in Public Cloud

So, back to this platform stuff...

...how exactly do you build and deploy it?

Page 25: Here There Be Turtles: Platform Ops in Public Cloud

Page 26: Here There Be Turtles: Platform Ops in Public Cloud

Page 27: Here There Be Turtles: Platform Ops in Public Cloud

orchestration & config management

Current:

Future possibilities:

Page 28: Here There Be Turtles: Platform Ops in Public Cloud

kitten, not unicorn

Page 29: Here There Be Turtles: Platform Ops in Public Cloud

“The single strongest signal that you have something to learn...

...is that a difference exists.” -- Aneel Lakhani

Page 30: Here There Be Turtles: Platform Ops in Public Cloud

“The game has changed.” -- Andrew Clay Shafer

Page 31: Here There Be Turtles: Platform Ops in Public Cloud

Page 32: Here There Be Turtles: Platform Ops in Public Cloud
