Building reliable systems from unreliable components
![Page 1: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/1.jpg)
Building reliable systems from unreliable components
Arnon Rotem-Gal-Oz
![Page 2: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/2.jpg)
Arnon Rotem-Gal-Oz
![Page 3: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/3.jpg)
What’s in a 9
![Page 4: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/4.jpg)
0.99 reliability
We have a nice little legacy business component
![Page 5: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/5.jpg)
[Diagram: nine services, each with 0.99 reliability]
And we move it to SOA
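A back-of-the-envelope calculation shows why the move matters. Assuming (the slide does not state this) that a request needs all nine services to succeed and that the services fail independently, the end-to-end reliability is the product of the individual figures:

```latex
R_{\text{system}} = \prod_{i=1}^{9} R_i = 0.99^{9} \approx 0.914
```

Under those assumptions roughly 8.6% of requests would fail, compared with 1% for the single legacy component.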
![Page 6: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/6.jpg)
Failsafe hardware
Stratus Technologies ftServer
![Page 7: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/7.jpg)
Or try to detect failure, handle it, and minimize its effect on the business service
© Rosendahl
![Page 8: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/8.jpg)
BEFORE WE BEGIN
![Page 9: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/9.jpg)
SOA
[Diagram: the components of SOA (Service, End Point, Messages, Contracts, Policy, Service Consumer) and the relations between them (exposes, describes, implements, binds to, sends/receives, governed by, adheres to, understands, serves)]
![Page 10: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/10.jpg)
SOA is derived from other styles
Pipes and Filters
Client Server
Distributed Agents
Layered System
Stateless Comm.
SOA
![Page 11: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/11.jpg)
SOA vs. REST
Pipes and Filters
Client Server
Uniform Interface
Virtual Machine
Distributed Agents
Layered System
Replicated Repository
Code On Demand
Stateless Comm.
Cacheable
REST SOA
![Page 12: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/12.jpg)
![Page 13: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/13.jpg)
[Diagram: system components. Labels: 3G Video Calls, MMS, Dedicated Client, Mobile Integration, Applications, Acquisition, Interactions, Branding, System, Ad Management, Reference Data, Links, Resources, Monitoring, Targeted Advertising, Campaign Mgmt., Usage Datamart, Reporting, Billing, Data Mining & Statistics, Reports, Link Management, Publishing Tools, Integration, Interaction Designer, Web Front-end, Data Interfaces, 3rd Parties]
![Page 14: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/14.jpg)
[Diagram: deployment across DMZ, Operational, and Backend zones separated by firewalls. Labels: load balancers, web servers (IIS/Apache), app servers, MMS gateway, 3G gateways, sync server, registration, databases (Links, References, usage Datamart), BI & reporting, admin console, NMS paper editor, advertising clients, smart phones, camera phones]
![Page 15: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/15.jpg)
CHALLENGE – SERVICE AVAILABILITY
![Page 16: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/16.jpg)
What’s the effect of a failure - Server
E1 line = 30 concurrent video calls
Call Flow Service
![Page 17: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/17.jpg)
[Diagram: Edge component. Labels: request, reaction, Edge, Dispatcher, Distribute, End point, Service business logic, Service Instance]
![Page 18: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/18.jpg)
What’s the effect of a failure - Channel
E1 line = 30 concurrent video calls
[Diagram: many separate Call Flow Channel instances]
![Page 19: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/19.jpg)
Service Instance with NLB
[Diagram: two cluster hosts (real IPs 1.1.1.2 and 1.1.1.3) share the virtual IP 1.1.1.1; each stacks a Service Instance, NLB driver, NIC driver, and TCP/IP in the Windows kernel on top of a NIC. A separate Windows host (real IP 1.1.1.4) runs the Service Instance Edge with the same stack minus the NLB driver]
![Page 20: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/20.jpg)
Virtual Endpoint
![Page 21: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/21.jpg)
Request/Reply
[Diagram: 1. the Service Consumer sends a Request to the Service's EndPoint; 2. synchronous processing; 3. Reply]
![Page 22: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/22.jpg)
Things look Cool & Simple ™
[Diagram: 3G Call, Session negotiation, Image extraction, Identification, Translation to links, Render results]
![Page 23: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/23.jpg)
Turn out Complicated & Ugly ™
[Diagram: IVP (RV), 3G GW (RV), 3G VAS (Cestel), WS, Resource Manager, SIP Listener, RTP Image Extractor, Alg. Engine, Dispatcher, WebConnector, 3G Builder (Cestel), WebRenderer]
![Page 24: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/24.jpg)
Parallel Pipelines
[Diagram: Request 1 and Request 2 arrive at the service's Edge EndPoint and pass through a Queue; several pipelines, each with its own EndPoint, perform the task and produce the Reaction. Key: SOA component, pattern component, concern/attribute, relation]
![Page 25: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/25.jpg)
Inversion of Communications
![Page 26: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/26.jpg)
Consumer view
var sendMmsEvent = new SendMmsEvent()
{
    FromNumber = simpleMessageDetails.DialedNumber,
    Subject = mmsContents.Subject,
    ToNumber = simpleMessageDetails.Sender,
    ImageExtension = mmsContents.ImageExtension,
    ImageAsByteArray = mmsContents.Image,
    TextAsByteArray = mmsContents.Text
};
eventBroker.RaiseEvent(sendMmsEvent);
http://www.flickr.com/photos/crimson_wolf/2851737125/sizes/l/
![Page 27: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/27.jpg)
Service view
public interface ImPostOffice : ImContract, IHandleSendCoupon,
    IHandleSendSms, IHandleStatus, IHandleAdminStatus, IHandleWapLink,
    IHandleSendMms
{
}

[ServiceContract]
public interface IHandleSendMms
{
    [OperationContract]
    int SendMms(SendMmsEvent eventOccured);
}

[DataContract]
public class SendMmsEvent : ImEvent
{
    /// <summary>
    /// End user's number. Should be in international format: +[country-code]number. Example: +491737692260
    /// </summary>
    [DataMember]
    public string ToNumber { get; set; }

    /// <summary>
    /// Service's number, usually a short-code. Example: 84343
    /// </summary>
    [DataMember]
    public string FromNumber { get; set; }

    /// <summary>
    /// Text, as byte array. Use Encoding classes to do it.
    /// </summary>
    [DataMember]
    public byte[] TextAsByteArray { get; set; }

    /// <summary>
    /// Image, as byte array. Can be: jpg, gif, png, bmp. (jpg rulez!!)
    /// </summary>
    [DataMember]
    public byte[] ImageAsByteArray { get; set; }

    /// <summary>
    /// Remember <c>ImageAsByteArray</c>? - This is where you manually tell us what's the extension. Yes, we can inspect the signature, but why?
    /// </summary>
    [DataMember]
    public string ImageExtension { get; set; }

    /// <summary>
    /// The mms message should have a subject. Just put something there.
    /// </summary>
    [DataMember]
    public string Subject { get; set; }
}
![Page 28: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/28.jpg)
Edge translates external structures to internal ones
public int SendMms(SendMmsEvent eventOccured)
{
    var eventContext = eventOccured.ToString();
    if (log.IsDebugEnabled)
        log.Debug("inside 'SendMms', event context = [" + eventContext + "]");

    var fromNumber = eventOccured.FromNumber;
    var sender = mmsSenderFactory.Get(fromNumber);
    if (null == sender)
    {
        if (log.IsWarnEnabled)
            log.Warn("cannot get mms sender derived from '" + (fromNumber ?? "null") + "'");
        return 0;
    }

    IMmsSubmitResponse response;
    try
    {
        var mmsMessageDetails = new MmsMessageDetails(eventOccured.ToNumber,
            eventOccured.TextAsByteArray,
            eventOccured.ImageAsByteArray,
            eventOccured.ImageExtension,
            eventOccured.Subject);
        response = sender.Submit(mmsMessageDetails);
    }
    catch (Exception ex)
    {
        log.Error("cannot send mms message, context = [" + eventContext + "]", ex);
        return 0;
    }

    if (log.IsInfoEnabled)
    {
        var responseMessage = (null == response) ? "null" : response.ToString();
        log.Info("sent mms with event context = [" + eventContext + "], response = [" +
            responseMessage + "]");
    }
![Page 29: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/29.jpg)
Sagas tie instances together for conversations
![Page 30: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/30.jpg)
[Diagram: SIP, Image Extractor, Identification, Call Recovery]
![Page 31: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/31.jpg)
Alternative: Orchestration
[Diagram: orchestration platform. Labels: request, reaction, Workflow Engine, Workflow instance, Manage Process, route request, Host Workflows, Schedule, Service, Coordinator, Protocol, Auxiliary tools (offline designer, monitor)]
![Page 32: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/32.jpg)
Be Wary of Nano-Services
![Page 33: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/33.jpg)
CHALLENGE - MANAGEMENT
![Page 34: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/34.jpg)
Blogjecting Watchdog
[Diagram: Labels: Edge, Watchdog Edge, EndPoint, Service, Watchdog Agent. Relations: Request, Report, Reports, Monitor, Heal, Log]
![Page 35: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/35.jpg)
The Blogjects concept is about collaborating objects
![Page 36: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/36.jpg)
WatchDog
Service A Service B WDWatcher
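The watchdog idea on this slide (a WatchDog observing Service A and Service B, itself observed by a WDWatcher) can be illustrated with a heartbeat check. This is only a minimal sketch under assumptions: the `Watchdog` class, its `Heartbeat` and `Check` methods, and the timeout-based liveness rule are hypothetical and are not taken from the talk.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: none of these names come from the talk.
public class Watchdog
{
    private readonly TimeSpan timeout;
    private readonly Action<string> heal;
    private readonly Dictionary<string, DateTime> lastSeen =
        new Dictionary<string, DateTime>();

    public Watchdog(TimeSpan timeout, Action<string> heal)
    {
        this.timeout = timeout;
        this.heal = heal;
    }

    // Called by (or on behalf of) a service to report that it is alive.
    public void Heartbeat(string serviceName, DateTime now)
    {
        lastSeen[serviceName] = now;
    }

    // Finds services whose last heartbeat is older than the timeout,
    // invokes the heal action for each, and returns their names.
    public List<string> Check(DateTime now)
    {
        var dead = new List<string>();
        foreach (var entry in lastSeen)
        {
            if (now - entry.Value > timeout)
                dead.Add(entry.Key);
        }
        foreach (var name in dead)
        {
            heal(name);
            lastSeen.Remove(name); // wait for a fresh heartbeat after healing
        }
        return dead;
    }
}

public static class WatchdogDemo
{
    public static void Main()
    {
        var healed = new List<string>();
        var watchdog = new Watchdog(TimeSpan.FromSeconds(5), healed.Add);
        var t0 = new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc);

        watchdog.Heartbeat("ServiceA", t0);
        watchdog.Heartbeat("ServiceB", t0);
        watchdog.Heartbeat("ServiceA", t0.AddSeconds(8)); // only A keeps reporting

        watchdog.Check(t0.AddSeconds(10));
        Console.WriteLine(string.Join(",", healed)); // ServiceB
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic. The WDWatcher role could reuse the same class, with the watchdog itself sending heartbeats to a second instance.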
![Page 37: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/37.jpg)
[Diagram: SIP, Image Extractor, Identification, Call Recovery, WatchDog]
![Page 38: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/38.jpg)
![Page 39: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/39.jpg)
Service Monitor
[Diagram: the Service Monitor and the Edge/Service exchange Status and Commands. Actions: Monitor, Collect, Notify, Act, Control. Concerns: metrics collection, policy governance, security monitoring, fault monitoring, reporting & dashboarding]
![Page 40: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/40.jpg)
RESTful resource management
[Diagram: resource tree. Root: /; level 1: Sessions/, Resources/, Dispatchers/; lower levels: Abcde/, Efgh/, Xyz/]
![Page 41: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/41.jpg)
http://devrig:52141/RM/Sessions/abc/
• AtomPub
  – Session details
    • URI (ID)
    • State (start/end/status etc.)
    • Resources
      – Knows status
      – URI for the Resource representation on the RM
      – URI for the Resource itself
![Page 42: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/42.jpg)
Keep the BIT
http://www.flickr.com/photos/37643027@N00/2050024263/sizes/o/
![Page 43: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/43.jpg)
[Diagram: 3G Call, SIP, Image Extractor, Identification (marked X, failed), Call Recovery, two WatchDogs, Liveliness Monitor]
![Page 44: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/44.jpg)
CHALLENGE – MULTI-TENANCY (LIES, I TELL YOU, ALL LIES)
![Page 45: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/45.jpg)
Same event, different subscribers
[Diagram: the Play Movie Event goes to the Player (interaction renderer) in one call flow and to the Bridge to 3rd Party in another]
![Page 46: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/46.jpg)
Routing
[ServiceContract]
[Participate("3G")]
public interface ImPlayer : ImContract, IHandleCallStarted, IHandleCallEnded,
    IHandlePlayMovie, IHandleCallAborted
{
}

[ServiceContract]
[Participate("3GPartner")]
public interface ImXsightsGateWay : ImContract,
    IHandleCallAborted,
    IHandlePlayMovie,
    IHandleReadyForSearch,
    IHandleSearchStarted,
    IHandleJoinThirdParty
{
}
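One plausible reading of the `[Participate(...)]` attribute is that the event broker uses its value to decide which contracts receive an event in a given kind of conversation. The sketch below shows how such a lookup could be done with reflection. It is an illustration under assumptions: `ParticipateAttribute` is redefined here as a stand-in, the contract interfaces are trimmed to two handlers, and `ContractRouter` is a hypothetical helper that does not appear in the talk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the attribute shown on the slide.
[AttributeUsage(AttributeTargets.Interface, AllowMultiple = true)]
public sealed class ParticipateAttribute : Attribute
{
    public string Name { get; private set; }
    public ParticipateAttribute(string name) { Name = name; }
}

// Trimmed-down versions of the handler and contract interfaces.
public interface IHandleCallStarted { }
public interface IHandlePlayMovie { }

[Participate("3G")]
public interface ImPlayer : IHandleCallStarted, IHandlePlayMovie { }

[Participate("3GPartner")]
public interface ImXsightsGateWay : IHandlePlayMovie { }

public static class ContractRouter
{
    // Returns the contracts that are tagged with the given participation
    // name and that also extend the given handler interface.
    public static List<Type> Subscribers(
        IEnumerable<Type> contracts, string participation, Type handler)
    {
        return contracts
            .Where(c => c.GetCustomAttributes(typeof(ParticipateAttribute), false)
                         .Cast<ParticipateAttribute>()
                         .Any(a => a.Name == participation))
            .Where(c => handler.IsAssignableFrom(c))
            .ToList();
    }
}

public static class RoutingDemo
{
    public static void Main()
    {
        var contracts = new[] { typeof(ImPlayer), typeof(ImXsightsGateWay) };

        // The same Play Movie event reaches a different subscriber
        // depending on the participation name.
        foreach (var name in new[] { "3G", "3GPartner" })
        {
            var targets = ContractRouter.Subscribers(
                contracts, name, typeof(IHandlePlayMovie));
            Console.WriteLine(name + " -> " +
                string.Join(",", targets.Select(t => t.Name)));
        }
    }
}
```

With this reading, both contracts can handle the Play Movie event, and the participation name selects which of them is the subscriber for a particular call.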
![Page 47: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/47.jpg)
[Diagram: Initiator A and Initiator B each raise a saga-initiating event; Participant A has capacity 2, Participant B has capacity 1]
Now what?!
![Page 48: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/48.jpg)
Reservation Pattern
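A reservation can be pictured as a time-limited claim on a participant's capacity: an initiator reserves a slot before starting a saga, confirms it once the saga is under way, and loses it if it never confirms. The following is a minimal sketch under assumptions. The `ResourceAllocator` class name echoes a label on the slides, but its members, the expiry rule, and the single-threaded design are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not thread-safe, and not the talk's implementation.
public class ResourceAllocator
{
    private readonly int capacity;
    private readonly TimeSpan reservationTtl;
    // Pending reservations and the time at which each one expires.
    private readonly Dictionary<Guid, DateTime> pending =
        new Dictionary<Guid, DateTime>();
    // Confirmed reservations hold their slot until released.
    private readonly HashSet<Guid> confirmed = new HashSet<Guid>();

    public ResourceAllocator(int capacity, TimeSpan reservationTtl)
    {
        this.capacity = capacity;
        this.reservationTtl = reservationTtl;
    }

    // Returns a reservation id, or null when no capacity is left (fail fast).
    public Guid? TryReserve(DateTime now)
    {
        Expire(now);
        if (pending.Count + confirmed.Count >= capacity)
            return null;
        var id = Guid.NewGuid();
        pending[id] = now + reservationTtl;
        return id;
    }

    // Turns a still-valid reservation into a firm allocation.
    public bool Confirm(Guid id, DateTime now)
    {
        Expire(now);
        if (!pending.Remove(id))
            return false;
        confirmed.Add(id);
        return true;
    }

    // Frees the slot, whether it was pending or confirmed.
    public void Release(Guid id)
    {
        pending.Remove(id);
        confirmed.Remove(id);
    }

    private void Expire(DateTime now)
    {
        var expired = new List<Guid>();
        foreach (var entry in pending)
        {
            if (entry.Value <= now)
                expired.Add(entry.Key);
        }
        foreach (var id in expired)
            pending.Remove(id);
    }
}

public static class ReservationDemo
{
    public static void Main()
    {
        var t0 = new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        // A participant with capacity 1, as on the previous slide.
        var participant = new ResourceAllocator(1, TimeSpan.FromSeconds(30));

        var a = participant.TryReserve(t0);                      // initiator A
        var b = participant.TryReserve(t0);                      // initiator B
        var bRetry = participant.TryReserve(t0.AddSeconds(31));  // A never confirmed

        Console.WriteLine("{0} {1} {2}", a.HasValue, b.HasValue, bRetry.HasValue);
        // prints: True False True
    }
}
```

The expiry is what lets the participant recover capacity when an initiator fails after reserving, so nobody has to clean up on its behalf.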
![Page 49: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/49.jpg)
[Diagram: several service instances (#1, #2, ...), each made up of a Service Host, Business Logic, Event Broker, Resource Allocator, and Control Edge; labels also include Reservation]
![Page 50: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/50.jpg)
Good old 2PC to the rescue
![Page 51: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/51.jpg)
Takeaways
• Things break
• Decouple
• Fail Fast
• Monitor & Detect
• Compensate
• Throw state unto others
![Page 52: Building reliable systems from unreliable components](https://reader035.fdocuments.net/reader035/viewer/2022081400/554f36ecb4c905cd048b4dc6/html5/thumbnails/52.jpg)
Arnon Rotem-Gal-Oz
@arnonrgo http://arnon.me
http://www.cloudvalue.com