GOTO Berlin 2016

GOTO Berlin | 15.11.2016 | Christian Deger | @cdeger
Highway to heaven: Building microservices in the cloud

Transcript of GOTO Berlin 2016



16.11.16

Christian Deger
Chief [email protected]
@cdeger

AutoScout24 in Munich, 5 years
Software Engineer, Team Lead, Architect


2.4 Million Vehicles

AutoScout24 is the largest online car marketplace Europe-wide.
Like Autotrader.
Our goal is to make the entire experience simple, efficient and stress-free.

Microservices in the cloud adoption?

Who is doing CI?
Who is doing CD?
Who is doing Infrastructure as Code?
Who is in the cloud?
Who plans on moving to the cloud?
Who is doing microservices?

2000 Servers
2 Data Centers

MTBF optimized

Proven delivery engine.
Highly optimized, but of the last decade.
IT platform supported growth for >6 years.
Availability = MTBF / (MTBF + MTTR)
Proven agile and lean principles.
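The availability formula on this slide can be made concrete with a small sketch. The function name and all numbers below are made up for illustration; they are not AutoScout24 figures. The point is only that availability depends on recovery time as much as on failure frequency.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system A: fails rarely, but takes long to recover.
rare_but_slow = availability(mtbf_hours=1000, mttr_hours=10)

# Hypothetical system B: fails ten times as often, but recovers in 30 minutes.
frequent_but_fast = availability(mtbf_hours=100, mttr_hours=0.5)

print(f"rare but slow:     {rare_but_slow:.4%}")
print(f"frequent but fast: {frequent_but_fast:.4%}")
```

With these assumed numbers, the system that fails more often but recovers quickly ends up with the higher availability.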

Dev and Ops Silos

Development: Change
Operations: Stability

The development department was driven by change, while the Ops department was driven by the need for stability.
Software releases were thrown over the wall.
New VMs required a ticket and some time.
This is not the only handover: Dev/QA, Product/Dev, etc.


Ops are on call for services written by devs.
Performance degradation: Just buy more servers.
Devs unaware of response time and availability.

There was resistance against change.
"We are not like Netflix."
We have built a private cloud, which I would just call virtualization.


New CEO

Scout24 was sold at the end of 2013.
New CEO Greg Ellis at the beginning of 2014.
Are you ready for the future?

Do you attract talent?

Windows and .NET in the data center did not attract the talent we needed in the internet business.
Most of the innovation that matters to us happens in different ecosystems. JVM for example.
Scala was also picked because it attracts talent.

What does a 21st century tech company look like?


We want to learn from and participate in the advances the unicorns in our industry make.
Visited Netflix, LinkedIn, Airbnb, etc.

Innovation for us no longer comes from previous enterprise suppliers:
Oracle, IBM, Microsoft


Great Design
Universally Connected
Mobile First
Instant Business Value
Massive Data Insight
Highly Available

These are some of the things that we believe we want to become.

Hmm, we are good, but not great

We are not bad, but we are not great.
The new questions triggered something...

Reboot everything

So we decided to do a big transformation...

Project Tatsu

Fly at the speed of fear - Disruptive
Japanese dragon: Flying beast

Started Nov. 2014 with one team.

.NET / Windows to JVM / Linux
Monolith to Microservices
Data center to AWS
Devs + Ops to Collaboration culture
Involve product people

Why everything at once?
We believed we could not leave any one of the changes out.

Death Star Diagrams

Amazon 2008
Twitter 2013

These are not documentation of their architecture, but a visualisation of their service dependencies, typically derived from request tracing.
https://blog.twitter.com/2013/observability-at-twitter
https://twitter.com/Werner/status/741673514567143424

Self-Contained Systems = Microservices Flavor
http://scs-architecture.org/

Team 1, Team 2, Team 3

One business capability is owned, built and run as an SCS by one team.
Self-Contained Systems are vertical slices integrated at the UI.

Netflix style vs. SCS:
We are not that experienced in distributed systems.
No API Gateway monolith.
No UI monolith.


Migration strategy: Vertical slices

same direction

We wanted to change a lot at the same time.
Started with one team and treated the setup and the way of working as an experiment (e.g. rotating Ops team members, due to fear of not getting buy-in from Ops).
We learned and ramped up to more teams working in the new world.
From mostly harmless to becoming a role model.

Wanted to do the right thing, they were prepared.

STRATEGIC GOALS: Goals of the business side
ARCHITECTURAL PRINCIPLES: High-level principles
DESIGN AND DELIVERY PRINCIPLES: Tactical measures

REDUCE TIME TO MARKET: Establish fast feedback loops to learn, validate and improve. Remove friction, hand-offs and undifferentiated work.
MOBILE FIRST: Start small and use device capabilities.
SUPPORT DATA-DRIVEN DECISIONS: Provide relevant metrics and data for user and market insights. Validate hypotheses for problems worth solving.

YOU BUILD IT, YOU RUN IT: The team is responsible for shaping, building, running and maintaining its products. Fast feedback from live and customers helps us to continuously improve.
ORGANIZED AROUND BUSINESS CAPABILITIES: Build teams around products not projects. Follow the domain and respect bounded contexts. Make boundaries explicit. Inverse Conway Maneuver.
LOOSELY COUPLED: By default avoid sharing and tight coupling. No integration database. Don't create the next monolith.
MACRO AND MICRO ARCHITECTURE: Clear separation. Autonomous micro services within the rules and constraints of the macro architecture.
AWS FIRST: Favor AWS platform service over managed service, over self-hosted OSS, over self-built solutions.
DATA-DRIVEN / METRIC-DRIVEN: Collect business and operational metrics. Analyze, alert and act on them.
ELIMINATE ACCIDENTAL COMPLEXITY: Strive to keep it simple. Don't over-engineer. Focus on necessary domain complexity.
AUTONOMOUS TEAMS: Make fast local decisions. Be responsible. Know your boundaries. Share findings.
INFRASTRUCTURE AS CODE: Automate everything: Reproducible, traceable, auditable and tested. Immutable servers.
CROSS-FUNCTIONAL TEAMS: Engineers from all backgrounds work together in collaborative teams as engineers and share responsibilities. No silos.
BE BOLD: Go into production early. Value monitoring over tests. Fail fast, recover and learn. Optimize for MTTR not MTBF.
SECURITY, COMPLIANCE AND DATA PRIVACY: Build with least privilege and data privacy in mind. Know your threat model. Limit blast radius.
COST EFFICIENCY: Run your segment in the right balance of cost and value.

ONE SCOUT IT: Foster collaboration. Harmonize and standardize tools. Pull common capabilities into decoupled platform services.

Version 2.0
Icons made by Freepik from www.flaticon.com are licensed under CC BY 3.0

BEST TALENT: Autonomy, Purpose and Mastery: We know why we do things, we decide how to approach them and deliberately practice our skills.

Principles not dogmas. Not top down.
Helps align and sharpen the culture.
Build your own: The discussions and feedback are valuable.
Be aware of the gap between the poster and reality.

Build, Measure, Learn

Time to market: the core goal, fast feedback.
Lean startup.


Conway's Law

"organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations"

Inverse Conway Maneuver.

Autonomous teams organized around business capabilities

Have empowered, self-organizing cells, to get teams that build decoupled services.
Products not projects.
Respect the domain and bounded contexts.
Make fast local decisions.
Be responsible.
Scale the organisation.

Collaboration culture

DevOps for us is not a team, a role or a tool; it is collaboration.
No silos, no handovers, no finger-pointing.
No software that is hard to operate.
No infrastructure nobody needs.
But also using tools like Slack and GitHub.


You build it, you run it.

Freedom and Responsibility.
The team is responsible for shaping, building, running and maintaining its products.
Fast feedback from live and customers helps us to continuously improve.
The one who feels the pain of being woken up at night is the one who is able to make the changes.
-> real resilient services
-> Customer first

We are all engineers!

No more devs and ops -> engineers
Put people together in one stable team. All other experiments failed.
Do not fall back into old behaviours:
Zoom in and you find 4 devs on one side and 2 ops on the other -> rotate pairs
Not all T-shapes are the same.


Monitoring is the new testing

Tests only run on delivery. Monitoring runs all the time.

Performance.
CD Pipelines.
Open OpsGenie Alerts.
Costs per day.
Page Speed.

Next step: Monitor business KPIs.

Follow the trail

Convenience offerings.
People will stick with decisions when they are good enough for them.
They will create different solutions when it is important for them.
CD Tool, Language decision, Service Template.

Templates

Faster bootstrapping
Copied not inherited
Collect and share best practices

Example of "Follow the trail".
Copied not inherited:
Typically not required to retrofit everything from the template.
Loose coupling.
Disadvantage: No
Collect and share best practices: Survival of the fittest.

Guilds
Self-organizing; common interests; across teams
Macro Architecture, Infrastructure, Frontend, QA...
Beware of Mandelbrot teams

Needed because market segments and teams already have high-bandwidth communication internally, whereas guilds are cross-segment expert groups.

Mandelbrot Teams
In some teams, this leads to the same separation within the team: former ops engineers working on infrastructure stories and software engineers working on product stories.
This leads to the usual defects: Infrastructure products can have a very high bus factor and bottlenecks, because not the whole team feels responsible, but is waiting for the Ops guy to be available.
This is not the default, but observable.

Continuous Delivery

Fast feedback from real users. Build, Measure, Learn.
Fast feedback on quality of service.
Creating value for the user is daily business.

We started with one release per month...
...to many releases per day.

But only code changes were delivered.

DevOps Survey

Who does continuous delivery?
Reasons not to do continuous delivery?

DevOps Science
Forsgren, N., J. Humble (2016). "The Role of Continuous Delivery in IT and Organizational Performance." In the Proceedings of the Western Decision Sciences Institute (WDSI) 2016, Las Vegas, NV. Available at SSRN: http://ssrn.com/abstract=2681909 or http://dx.doi.org/10.2139/ssrn.2681909

Jez Humble: What We Learned from Three Years Sciencing the Crap out of DevOps


Application code in one repository per service.

CI: Deployment package as artifact.
CD: Deliver package to servers.

Delivery Pipeline Data Center

Commit stage: Unit tests etc.
Additional database migration scripts.
Blue/green delivery on the instance.


Application code and infrastructure specification in one repository per service.

CI: Deployment package and infrastructure declaration as artifact.
CD:
1. Create or update service infrastructure.
2. New instances pull down package and start application.

Delivery Pipeline AWS

Every change goes through the delivery pipeline.
High traceability.
Delivery: CloudFormation + ASG
Dependencies: Global stack and Base AMI.


Traceable
Repeatable
Reliable
(Faster)


Cattle, not pets

Phoenix servers.
No configuration drift.
Security: Alerting on instances that are too old.

Tradeoff: cycle time.
Current cycle time from commit to production is 20-30 minutes: too slow!


Separate code deployment from feature release

Who is using feature branches and does CI?
No merging of branches anymore.
You are not doing CI when you are branching.
Short-lived branches (< 1 day) are OK.
Dynamic feature toggles:
Canary releases.
Product can switch on a feature for acceptance.
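A dynamic feature toggle that supports both canary releases and switching a feature on for acceptance could look roughly like the sketch below. It is an illustration only, not the toggle system used at AutoScout24: the class, method names and the hash-based bucketing are all assumptions.

```python
import hashlib


def bucket(user_id: str, feature: str) -> int:
    """Map a user deterministically to a bucket from 0 to 99 for one feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


class FeatureToggles:
    """Hypothetical in-memory toggle store; a real one would be updatable at runtime."""

    def __init__(self) -> None:
        self._rollout_percent: dict[str, int] = {}
        self._allowed_users: dict[str, set[str]] = {}

    def set_rollout(self, feature: str, percent: int) -> None:
        """Canary release: expose the feature to roughly `percent` of users."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout_percent[feature] = percent

    def allow_user(self, feature: str, user_id: str) -> None:
        """Acceptance: switch the feature on for one user, e.g. a product owner."""
        self._allowed_users.setdefault(feature, set()).add(user_id)

    def is_enabled(self, feature: str, user_id: str) -> bool:
        if user_id in self._allowed_users.get(feature, set()):
            return True
        return bucket(user_id, feature) < self._rollout_percent.get(feature, 0)


toggles = FeatureToggles()
toggles.set_rollout("new-search", 5)                # deployed, released to ~5 %
toggles.allow_user("new-search", "product-owner")   # always on for acceptance
```

The code for `new-search` is already deployed for everyone; only the release is controlled, by raising the percentage step by step.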


No staging environment

We only maintain one environment where all services integrate: Production.
Be bold!
MTTR over MTBF.


Consumer-driven contracts
Canary releases
Shadow traffic
Semantic monitoring

Integrate in production

Shadow traffic: Best in combination with flagging test data.
Semantic monitoring: For example, continuous user journey tests.
Don't monitor the cell, listen to the heartbeat.
Monitor for invariants in your system, e.g. the ratio Leads : Homepage visits.
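Monitoring an invariant such as the ratio of leads to homepage visits can be sketched as a simple check. The function name, the expected ratio and the tolerance below are assumptions for illustration; real values would come from historical data.

```python
def leads_ratio_ok(
    leads: int,
    homepage_visits: int,
    expected_ratio: float,
    tolerance: float,
) -> bool:
    """Return True while leads per homepage visit stays near the expected ratio.

    `tolerance` is relative: 0.2 allows a deviation of 20 % in either direction.
    """
    if homepage_visits == 0:
        # No traffic at all is itself a reason to raise an alarm.
        return False
    ratio = leads / homepage_visits
    return abs(ratio - expected_ratio) <= tolerance * expected_ratio


# Hypothetical numbers for one monitoring window.
healthy = leads_ratio_ok(leads=50, homepage_visits=1000, expected_ratio=0.05, tolerance=0.2)
broken = leads_ratio_ok(leads=10, homepage_visits=1000, expected_ratio=0.05, tolerance=0.2)
```

A check like this notices when something on the path from homepage to lead is broken, even if every individual service still reports itself as healthy.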


Unlimited Infrastructure with APIs

Automated conformity and security:
Alarms for unexpected production changes, i.e. changes that did not go through the pipeline.
Alarms for old base AMIs, guarding against missing security patches.
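The alarm for old base AMIs comes down to comparing the age of each instance's image with a maximum age. The sketch below shows only that comparison on plain data; the `Instance` type, the 30-day threshold and the instance IDs are assumptions, and fetching the data from AWS is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Instance:
    """Hypothetical view of a running instance and the age of its base AMI."""

    instance_id: str
    ami_created_at: datetime


def stale_instances(
    instances: list[Instance],
    now: datetime,
    max_age: timedelta = timedelta(days=30),  # assumed threshold
) -> list[str]:
    """Return the IDs of instances whose base AMI is older than `max_age`."""
    return [i.instance_id for i in instances if now - i.ami_created_at > max_age]


now = datetime(2016, 11, 15, tzinfo=timezone.utc)
fleet = [
    Instance("i-fresh", datetime(2016, 11, 1, tzinfo=timezone.utc)),
    Instance("i-old", datetime(2016, 9, 1, tzinfo=timezone.utc)),
]
to_alarm = stale_instances(fleet, now)
```

With immutable servers the remedy is to rebuild on a current base AMI and roll the instances, not to patch them in place.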


Frontend integration
Loosely coupled
Autonomous teams
High optimization

Loosely coupled:
No shared asset pipeline.
No origin-specific behaviour.
Autonomous teams:
Microservices imply frontend integration.
Self-contained, generic MS over UI monolith.
High optimisation:
Google PageSpeed.
Caching.

Allows hybrid approach: Routing back traffic to DC

[Diagram: ngx_pagespeed (PageSpeed Module) with css (page), js (page), css (fragment) and js (fragment) on one side and css (page+fragment), js (page+fragment) on the other.]

No shared asset pipeline.

Pages:
are accessible via (localised) URL
are owned by one team
could be cacheable
Fragments:
are parts of a page
don't know the original request
should send cache headers
Assets:
should be combined and minified
Caching:
CloudFront Caching: Caching on edge locations. Respects cache headers from Jigsaw.
PageSpeed Caching: Caches combined assets.
Backend Caching: Respects cache headers from microservices.

Hamburgers, not cattle

We are no longer interested in cattle.
Next convenience abstraction: We are actually interested in the meat only.
Meaning Lambda functions give us scalable compute without worrying about infrastructure.


Event Streaming

Microservices challenge: Collect data for BI from all services.
Application events are written directly to Kinesis as JSON.
Ingesting 1.4 TB per day.
Intended to be used for real-time processing later.
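Writing an application event to Kinesis as JSON could look roughly like this sketch. The event envelope (`type`, `timestamp`, `payload`), the stream name and the helper names are assumptions, not the actual AutoScout24 format. The `client` argument is expected to behave like a boto3 Kinesis client, whose `put_record` call takes `StreamName`, `Data` and `PartitionKey`.

```python
import json
from datetime import datetime, timezone
from typing import Any


def build_record(
    stream_name: str,
    event_type: str,
    payload: dict[str, Any],
    partition_key: str,
) -> dict[str, Any]:
    """Build the keyword arguments for a Kinesis put_record call."""
    body = {
        "type": event_type,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }
    return {
        "StreamName": stream_name,
        "Data": json.dumps(body).encode("utf-8"),
        "PartitionKey": partition_key,
    }


def publish(client: Any, record: dict[str, Any]) -> Any:
    """Send one record; `client` could be boto3.client("kinesis")."""
    return client.put_record(**record)


# Hypothetical event from a hypothetical service.
record = build_record(
    stream_name="application-events",
    event_type="listing-viewed",
    payload={"listingId": "abc-123"},
    partition_key="abc-123",
)
```

Each service publishes its own events this way, so BI can read from the stream without reaching into any service's database.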

?

Picture Credits
"HotWheels - '69 Ford Torino Talladega" by Leap Kye, licensed under CC BY-ND 2.0
"Enterprise IT Adoption Cycle" by Simon Wardley under CC BY-SA 3.0
"And the future is private" by Simon Wardley under CC BY-SA 3.0
"Leosvel et Diosmani" by Ludovic Péron under CC BY-SA 3.0
"Wandergeselle" by Sigismund von Dobschütz under CC BY-SA 3.0
"Puzzling" by Bernd Gessler (Own work) CC BY-SA 3.0
