The 5 Principles of Google's Cloud

ENTERPRISE ARCHITECTURE: THE 5 PRINCIPLES OF GOOGLE'S "CLOUD"

Patrik Svensson, 2011, [email protected]

Thursday, 12 May 2011

description

I made this deck after being inspired by Google I/O 2011. The animations and videos don't show :-)

Transcript of The 5 Principles of Google's Cloud

Page 1

THE 5 PRINCIPLES OF GOOGLE'S "CLOUD"

Patrik Svensson, 2011, [email protected]

Page 2

THE VISION OF GOOGLE

Page 3

THE 5 PRINCIPLES

• Everything is a service (or an application in Android)

• Relentless technical focus (thinking at nanoscale)

• Data centers are the foundation

• Code is king, data is King Kong

• Identify and keep track of your users


Page 5

#1 EVERYTHING IS A SERVICE (OR AN APPLICATION)

Page 6

#2 RELENTLESS TECHNICAL FOCUS

• Jedis build their own lightsabres

• Parallelize, Distribute, Cache, Compress, Redundantize everything

• Latency is VERY evil

Source: http://www.flickr.com/photos/60994749@N07/5557591956/
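As a small illustration of the "cache everything" point, here is a minimal sketch of putting an in-memory cache in front of a slow lookup so that repeated requests skip the expensive call. The names `slow_backend_lookup` and `cached_lookup` are hypothetical, and the "backend" is only a stand-in for a remote service. It says nothing about how Google implements caching.

```python
from functools import lru_cache

# Counts how often the (pretend) expensive backend is actually hit.
calls = {"backend": 0}


def slow_backend_lookup(key: str) -> str:
    """Hypothetical stand-in for an expensive remote call."""
    calls["backend"] += 1
    return key.upper()


@lru_cache(maxsize=1024)
def cached_lookup(key: str) -> str:
    """Serve repeated keys from memory; only cache misses reach the backend."""
    return slow_backend_lookup(key)


if __name__ == "__main__":
    for _ in range(3):
        cached_lookup("jedi")
    # Only the first of the three calls reached the backend.
    print("backend calls:", calls["backend"])
```

The trade-off is the usual one: the cache buys latency at the cost of memory and of possibly serving stale values, which is why `maxsize` is bounded here.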

Page 7

EXAMPLE: "NUMBERS EVERYONE SHOULD KNOW"

1,000,000 ns = 1 ms

1,000,000,000 ns = 1 s

Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"
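The two identities above are enough to turn any raw nanosecond figure into something readable. The helper below is a minimal sketch of that arithmetic; the function name `format_ns` and the sample values are assumptions for illustration, not figures from the talk.

```python
NS_PER_MS = 1_000_000        # 1,000,000 ns = 1 ms
NS_PER_S = 1_000_000_000     # 1,000,000,000 ns = 1 s


def format_ns(ns: int) -> str:
    """Render a nanosecond count in the largest unit that fits (ns, ms or s)."""
    if ns >= NS_PER_S:
        return f"{ns / NS_PER_S:g} s"
    if ns >= NS_PER_MS:
        return f"{ns / NS_PER_MS:g} ms"
    return f"{ns} ns"


if __name__ == "__main__":
    # Sample values chosen only to exercise each branch.
    for sample in (500, 1_000_000, 2_500_000, 1_000_000_000):
        print(sample, "->", format_ns(sample))
```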

Page 8

#3 DATA CENTERS ARE THE FOUNDATION

Page 9

ECONOMIES OF SCALE

• ~40 data centers in 2009, 1,000,000 machines

Source: http://techcrunch.com/2008/04/11/where-are-all-the-google-data-centers/


Page 11

#4 CODE IS KING, DATA IS KING KONG

Diagram: three architecture layers (Enterprise Architecture; Technical Architecture, i.e. which technologies do we use; Implementation Architecture, i.e. how do we implement the technologies) across five columns: DATA CENTERS, DATA, CODE, CONTROL, USERS.

• DATA CENTERS - "We need: cooling, power, perimeter networks, containers, racks, switches & hardware at low cost that scale". Google container-based data centers.

• DATA - "We need: one distributed file system, one distributed shared memory & common data formats to get scale and low cost". GFS, BigTable, Protocol Buffers.

• CODE - "We need to build applications and services, application-, integration- & data platforms, parallel computing platforms & use an open source OS, upon our data center/data platform". Linux; Protocol Buffers, JSON; Python, Java, C++; Sawzall, Dremel, Percolator; MapReduce; App Engine, Gmail, Search, Index.

• CONTROL - "We need scheduling, synchronization, lock services, i.e. various forms of control mechanisms for data and code". GFS master, Google Work Queue, Chubby, NetScaler, Google HTTP Server, (Spanner).

• USERS - "We need to identify our users to be able to interact, differentiate and customize the user experience". OpenID, OAuth, Google Accounts available for most services; Android, Chrome.

Page 12

ABOUT DATA

Chart: data volume by type (axis 0 to 200). Categories: Structured, Numerical; Unstructured, Textual; Communication, Traffic. Labels: +20 petabytes/day, ~10 terabytes/day, ~2.5 terabytes.

"Google's mission is to organize the world's information and make it available to all"

Page 13

DATA CENTER "ENTRY"

• The same entry to each Data Center

• ~50 caching (using Squid)

• Built their own HTTP servers/farms

Source: Ed Austin, "The Anatomy of the Google Architecture"

Page 14

INSIDE THE CONTAINERS

• Customized commodity servers, in customized racks in containers (1,000+ servers), organized into clusters

• All containers are "cloned" and look the same

Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

Page 15

THE SAME HW, OS AND FILESYSTEM EVERYWHERE

Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

Page 16

BIGDATA AS DATABASE

Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

Page 17

BIGDATA IS COLUMN-BASED

Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"
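To make "column-based" a little more concrete, here is a toy in-memory sketch, assuming the slide refers to a Bigtable-style store (BigTable is listed on page 11), where each value is addressed by row key, column name and timestamp. The class `TinyColumnStore`, its methods and the sample keys are all hypothetical; this is not Google's API, only an illustration of the addressing idea.

```python
from collections import defaultdict


class TinyColumnStore:
    """Toy sparse store: row key -> column name -> list of (timestamp, value)."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, timestamp, value):
        """Add one timestamped version of a cell."""
        self._rows[row][column].append((timestamp, value))

    def get(self, row, column):
        """Return the value with the highest timestamp, or None if the cell is absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda tv: tv[0])[1]

    def scan_column(self, column):
        """Yield (row, latest value) for every row that has the column, in row-key order."""
        for row in sorted(self._rows):
            if column in self._rows[row]:
                yield row, self.get(row, column)


if __name__ == "__main__":
    store = TinyColumnStore()
    store.put("com.example/index", "contents:html", 1, "<html>v1</html>")
    store.put("com.example/index", "contents:html", 2, "<html>v2</html>")
    store.put("org.sample/home", "anchor:text", 1, "Sample")
    print(store.get("com.example/index", "contents:html"))
    print(list(store.scan_column("anchor:text")))
```

Rows are sparse: a row only stores the columns it actually has, so scanning one column touches only the rows that contain it.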

Page 18

BIGDATA NEEDS GFS

• Use GFS to store data and logs

Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

Page 19

MAPREDUCE - A PARALLEL COMPUTING PLATFORM

Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"
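The MapReduce programming model can be sketched in a few lines: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step folds each group into a result. The word count below runs in a single process and all function names are hypothetical; the real platform distributes these phases across many machines, which this sketch does not attempt.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents."""
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["code is king", "data is king kong"]))
```

Because each map call looks at one document and each reduce call at one key, both phases could in principle run in parallel, which is the property the platform exploits.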

Page 20

ABOUT CODING AT GOOGLE

• Linux as the operating system everywhere - open source and highly customized for this (Android is also a highly customized version of Linux)

• Serialization/Integration - Protocol Buffers (RPC) run at "nano speed" and are used internally for "everything"; JSON and RESTful interfaces are used for external APIs

• Application-oriented programming languages - mainly Python, Java and C++

• Data-oriented programming languages - Percolator, Sawzall, Dremel for various data processing tasks (so specialised tools for data!)

• The business applications - Gmail, Search, App Engine etc. - built upon the data center infrastructure, the data platform and above
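One ingredient of the compact Protocol Buffers wire format is the base-128 "varint", which stores small integers in fewer bytes. The sketch below hand-rolls only that encoding to show the idea; `encode_varint` and `decode_varint` are hypothetical helper names, not functions from the protobuf library, and negative numbers are deliberately left out.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 data bits per byte)."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


if __name__ == "__main__":
    encoded = encode_varint(300)
    print(encoded.hex(), decode_varint(encoded))
```

Values below 128 fit in one byte, so the common case of small field numbers and counters stays cheap on the wire.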

Page 21

#5 IDENTIFY AND KEEP TRACK OF YOUR USERS

• You need a Google account to start Android properly

• OpenSocial is a collaborative effort to compete against Facebook

• OpenID is an identity standard and OAuth is a standard for authorizing services

• Google is identifying and tracking every step you take within their domains