2017©Excelfore
Using Redundant Data Paths and Clock Domains in Ethernet TSN
for Mission-Critical Network Reliability
Presented by: Shrikant Acharya, Chief Technology Officer, Excelfore Corp.
Contributing Authors: Anoop Balakrishnan, Excelfore Corp.; Shiro Ninomiya, Excelfore Corp.
Mission-Critical Automotive Networking
[Figure: Sensors, ADAS Controller, Actuators, and Driver Reaction. Captions: Everything Working Together / Enhanced Safety; Network Failure / Problems!]
Representative Approach to Next-Gen Vehicle Network (Physical Domains / No Redundancy)
[Figure: network diagram. Labels: Cloud Server; eSync; Vehicle Gateway; Ethernet or OBD Diagnostic Port; domains ADAS, Infotainment, Body/Chassis, and Powertrain; IVI Head Unit/Gateway; ADAS Controller/Gateway; Body Controller/Gateway; Powertrain Controller/Gateway; ASIL D and ASIL B ratings; High-Speed Ethernet links between gateways; in-domain buses CAN, LIN, Ethernet, Ethernet TSN, Ethernet AVB, and LVDS.]
Ethernet-Centric Next-Gen Vehicle Network (Logical Domains / No Redundancy)
[Figure: network diagram. Labels: Cloud Server; eSync; Vehicle Gateway; Ethernet or OBD Diagnostic Port; Gateway/Switches joined by High-Speed Ethernet TSN links; Powertrain Controller; IVI Head Unit; Body Controller; ADAS Controller; attached buses CAN, LIN, Ethernet, and Ethernet TSN. Two failure points are marked (X).]
VLANs Create the Domains
Redundancy to Address:
• Failure of a Device on the Network
• Failure of a Network Link
Mission-Critical Network Redundancy
Three Levels of Hardware Redundancy:
1. Redundant Links Between Network Gateway/Switches
2. Daisy Chaining End Devices to a Network Gateway/Switch
3. Daisy Chaining End Devices to Redundant Network Gateway/Switches
Key Software Concepts for Redundancy in TSN Networking:
A. Redundant Data Paths: IEEE 802.1CB
B. Timing and Synchronization: IEEE 802.1AS / 802.1BA
C. Redundant Clock Domains: IEEE 802.1AS-Rev
Redundant Links Between Switches
[Figure: two Switch/Gateways joined by redundant backbone links, one marked failed (X); switch hops numbered 1 and 2. End Devices A, B, C, and D are the end points of one Switch/Gateway; End Devices W, X, Y, and Z are the end points of the other.]
Positive Attributes:
• Protection from Failure of a Network Link on the High-Speed Backbone
• Maximum of 2 Switch Hops Retains TSN Guaranteed Latency (<2 ms on 100 Mbps Ethernet)
Shortcomings:
• No Protection from Failure of a Network Link to End Devices
• No Protection from Gateway/Switch Device Failure
Dual Ethernet Nodes: Key Hardware Feature
• Limitation of Single-Node End Points
  - Redundant Paths Only at Switch Nodes, Not at End Points
  - Frame Replication at the Switch
  - No Frame Replication at the End Point
• Enhanced Redundancy with Dual-Node End Points
  - End Points Can Replicate Frames from a Talker
  - Daisy Chaining of End Points Improves Redundancy
  - Daisy Chaining of End Points Improves Utilization of Switch Ports
• Automotive Processors Support Dual Ethernet Nodes:
  - NXP i.MX6 Family
  - TI Jacinto J6 Family
Daisy Chaining Dual-Node End Devices
• One Link Failure Does Not Disconnect Devices
• One Device Failure Does Not Disconnect Other Devices
• Careful Analysis of Switch Hops Required to Ensure Guaranteed Latency
[Figure: Switch (Relay) A with daisy-chained End Devices A, B, C, and D; two failure points are marked (X).]
An End Device with 2 Ports May Have a 3-Port Switch:
• 2 External Ports
• 1 Internal Port
Redundant Links Between Switches / Dual-Node End Points
[Figure: two Switch/Gateways joined by redundant links; End Devices A, B, C, and D daisy-chained as the end points of one, End Devices W, X, Y, and Z as the end points of the other; two failure points marked (X); switch hops numbered 1 to 6: 3 hops from end point to backbone, 2 hops in the backbone, 1 hop from backbone to end point.]
Positive Attributes:
• Protection from Failure of Any One Network Link
• Network Is Still Protected from Edge Device Failure
• Better Node Utilization at the Switch
• Maximum of 6 Switch Hops (3+2+1 or 1+2+3) Retains TSN Guaranteed Latency with Any One Failure
Shortcomings:
• No Protection from Gateway/Switch Device Failure
Redundancy Impact
• Hardware Costs
  - End Points Need Two External Ethernet Nodes
• Software Performance (higher impact with higher payloads; utilization doubles)
  - Overhead of Replication on the End Point
    - All packets: processing is doubled
    - If overhead for packet transmission = 10%, then with replication = 20%
  - Overhead of Replication on the Switch (Utilization Is Doubled)
    - Depends on how many packets need to be replicated to the various ports
    - Also affected by how many deletions (eliminations) are happening
• Network Bandwidth
  - Aggregates Bandwidth Load of Daisy-Chained End Points
  - Overall Network Traffic on Some Links May Increase by a Multiple (discussed later)
• Daisy Chaining Mitigates the Port Utilization at the Switch
  - End Points with Single Nodes Cannot Daisy Chain
  - May Be Appropriate for Non-Mission-Critical Tasks
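The two cost effects on this slide can be put into numbers with a minimal sketch. The function names, the single-uplink chain model, and the example rates are assumptions made for illustration; they are not taken from the presentation.

```python
from __future__ import annotations


def replicated_overhead(base_overhead: float, copies: int = 2) -> float:
    """Transmit overhead when every packet is sent `copies` times."""
    return base_overhead * copies


def chain_link_loads(rates_mbps: list[float]) -> list[float]:
    """Load on each daisy-chain link, listed from the switch outward.

    Assumption: the chain has a single uplink at its head, so the link on
    the switch side of device i carries device i's traffic plus the traffic
    of every device further down the chain.
    """
    loads = []
    running = 0.0
    for rate in reversed(rates_mbps):
        running += rate
        loads.append(running)
    return list(reversed(loads))


if __name__ == "__main__":
    print(replicated_overhead(0.10))          # slide example: 10% becomes 20%
    print(chain_link_loads([10, 20, 5, 5]))   # hypothetical talker rates in Mbps
```

The link closest to the switch carries the sum of all talker rates, which is why the link budget of a daisy chain has to be checked against its most heavily loaded segment.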
Full Redundancy in End-to-End Network Connections
[Figure: Switches (Relays) E, F, G, and H; End Devices A, B, C, and D as end points on one side and End Devices W, X, Y, and Z on the other. Labeled links: Link A-E, Link D-G, Link E-F, Link E-G, Link F-H, Link G-H, Link F-W, Link H-Z. Two failure points are marked (X).]
• Loss of Any Single Network Link, or Any Network Switch, Is Recoverable
• Loss of Any End Point Does Not Affect Connectivity of Other End Points
Control Latency: Analyze the Number of Hops
[Figure: Switches (Relays) E, F, G, and H; End Devices A, B, C, and D and End Devices W, X, Y, and Z as end points. Labeled links: Link A-E, Link D-G, Link E-F, Link E-G, Link F-H, Link G-H, Link F-W, Link H-Z. Two failure points are marked (X); switch hops numbered 1 to 7: 3 hops from end point to backbone, 3 hops in the backbone, 1 hop from backbone to end point.]
• 2 ms End-to-End Latency Guaranteed on 100 Mbit Network
• For Any One Failure, No More than 7 Switch Hops
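A hop analysis of this kind can be automated. The sketch below removes each link in turn and measures the shortest remaining path with a breadth-first search. The link list is modeled on the labels in the figure, but the daisy-chain links (A-B, B-C, and so on) are an assumption, and the code counts links traversed, which is not necessarily the same accounting as the slide's switch hops.

```python
from collections import deque

# Hypothetical topology: switch links follow the figure labels; the
# daisy-chain links between end devices are assumed.
LINKS = [
    ("A", "E"), ("D", "G"), ("E", "F"), ("E", "G"),
    ("F", "H"), ("G", "H"), ("F", "W"), ("H", "Z"),
    ("A", "B"), ("B", "C"), ("C", "D"),
    ("W", "X"), ("X", "Y"), ("Y", "Z"),
]


def shortest_links(links, src, dst):
    """Return the fewest links between src and dst, or None if unreachable."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def worst_case_after_one_link_failure(links, src, dst):
    """Longest shortest path over all single-link failures.

    Returns None if some single failure disconnects src from dst.
    """
    worst = 0
    for failed in links:
        remaining = [link for link in links if link != failed]
        dist = shortest_links(remaining, src, dst)
        if dist is None:
            return None
        worst = max(worst, dist)
    return worst


if __name__ == "__main__":
    print(shortest_links(LINKS, "A", "W"))
    print(worst_case_after_one_link_failure(LINKS, "A", "W"))
```

Running the same loop over every talker/listener pair gives the worst-case path length that the latency budget has to cover.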
Reminder: Ethernet-Centric Next-Gen Vehicle Network (Logical Domains / No Redundancy)
[Figure: network diagram. Labels: Cloud Server; eSync; Vehicle Gateway; Ethernet or OBD Diagnostic Port; Gateway/Switches joined by High-Speed Ethernet TSN links; Powertrain Controller; IVI Head Unit; Body Controller; ADAS Controller; attached buses CAN, LIN, Ethernet, and Ethernet TSN.]
VLANs Create the Domains
Redundancy to Address:
• Failure of a Device on the Network
• Failure of a Network Link
Full Redundancy in End-to-End Network Connections
[Figure: four Gateway/Switches; four labels reading ASIL D.]
• Loss of Any Single Network Link, or Any Network Switch, Is Recoverable
• Loss of Any Single Network Link or Switch Preserves Guaranteed Latency
• Loss of Any End Point Does Not Affect Connectivity or Latency of Other End Points
VLANs Create the Domains
Software Implications of Redundant Network Paths
Frame Replication and Elimination for Reliability: IEEE 802.1CB
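Simple End-to-End Network Connections (No Redundancy)
[Figure: End Point A, Switch E, Switch F, and End Point Z connected in series by Link A-E, Link E-F, and Link F-Z; two failure points are marked (X).]
Failure in Any One of the Following Makes the Connection Fail:
• Link A-E
• Link E-F
• Link F-Z
• Switch E
• Switch F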
FRER (Frame Replication and Elimination for Reliability)
Replication at a Switch (Relay):
• 1x Incoming "Packet A"
• "Packet A" Is Replicated
• 2x "Packet A" Sent Out (Packet A and Packet A')
Elimination at a Switch (Relay):
• 2x Incoming "Packet A" (Packet A and Packet A')
• 1x "Packet A" Is Eliminated
• 1x "Packet A" Sent Out
Identifying "Packet A"
Ethernet Header: Destination Address | Source Address | R-TAG | VLAN-TAG
R-TAG: F1C1 | Reserved | Seq Num (optionally 'HSR-TAG' or 'PRP-TAG')
• Destination Address + Source Address + VLAN ID + Sequence Number Can Identify the Packet
• This Packet Identification Is Sufficient for Replication and Elimination by a Relay System (Switch)
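The tag layout and the identification tuple can be sketched in a few lines. The 16-bit widths used for the reserved and sequence-number fields and the names `PacketId`, `pack_rtag`, and `unpack_rtag` are assumptions for illustration; only the F1C1 value and the field order come from the slide.

```python
import struct
from typing import NamedTuple

RTAG_ETHERTYPE = 0xF1C1  # value shown on the slide


class PacketId(NamedTuple):
    """The four fields the slide lists as sufficient to identify a packet."""
    dst: bytes
    src: bytes
    vlan_id: int
    seq: int


def pack_rtag(seq: int) -> bytes:
    """Build an R-TAG: EtherType, reserved field, sequence number.

    Assumes three 16-bit fields in network byte order.
    """
    return struct.pack("!HHH", RTAG_ETHERTYPE, 0, seq & 0xFFFF)


def unpack_rtag(tag: bytes) -> int:
    """Return the sequence number carried in a 6-byte R-TAG."""
    ethertype, _reserved, seq = struct.unpack("!HHH", tag)
    if ethertype != RTAG_ETHERTYPE:
        raise ValueError("not an R-TAG")
    return seq


if __name__ == "__main__":
    tag = pack_rtag(5)
    pid = PacketId(b"\x01" * 6, b"\x02" * 6, vlan_id=10, seq=unpack_rtag(tag))
    print(tag.hex(), pid)
```

Because `PacketId` is a tuple, it can be used directly as a key in a dictionary or set, which is the natural shape for the look-up table a relay needs.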
Frame Elimination and Replication Explained
[Figure: Packet A travels from End Point A to End Point Z through Switches (Relays) E, F, and H, with Link E-F and Link F-H labeled; replicated copies are labeled Packet A'' and Packet A'''. Bandwidth annotations: BW in = 1, BW out = 2 (only replication); BW in = 2, BW out = 2 (one replication, one elimination), shown twice; BW in = 2 (one elimination); BW out = 1.]
1. Many Redundant Paths
2. Bandwidth (BW) Utilization Is Doubled
3. Switch E Is Simple Replication (~5% overhead)
4. Computation Complexity Is Increased on Switches F and H (~30% overhead)
Software Implementation (Replication)
[Figure: switch with five ports, PHY1/MAC1 through PHY5/MAC5.]
• Check for an R-TAG in the Incoming Packets from MAC1; If It Does Not Exist, Then Insert an R-TAG
• Keep Track of the PACKET ID in the Internal Table
• Replicate and Send to Requested Ports (MAC4, MAC5)
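The three replication steps can be sketched as follows. Packets are modeled as plain dictionaries, and the class name `Replicator`, the `rtag_seq` key, and the per-relay sequence counter are hypothetical choices for illustration, not the authors' implementation.

```python
from itertools import count


class Replicator:
    """Tag untagged packets, record their ID, and copy them to each output port."""

    def __init__(self, out_ports):
        self.out_ports = list(out_ports)
        self._next_seq = count()   # assumed: one running counter per relay
        self.table = {}            # packet ID -> ports the copies were sent to

    def handle(self, packet: dict) -> list:
        """Return the (port, packet) pairs to transmit for one incoming packet."""
        # Step 1: insert an R-TAG sequence number only if none is present.
        if packet.get("rtag_seq") is None:
            packet = {**packet, "rtag_seq": next(self._next_seq) & 0xFFFF}
        # Step 2: track the packet ID in the internal table.
        packet_id = (packet["dst"], packet["src"],
                     packet["vlan_id"], packet["rtag_seq"])
        self.table[packet_id] = tuple(self.out_ports)
        # Step 3: one independent copy per requested port.
        return [(port, dict(packet)) for port in self.out_ports]


if __name__ == "__main__":
    relay = Replicator(["MAC4", "MAC5"])
    frame = {"dst": "Z", "src": "A", "vlan_id": 10, "payload": b"video"}
    for port, copy in relay.handle(frame):
        print(port, copy["rtag_seq"])
```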
Software Implementation (Elimination)
[Figure: switch with five ports, PHY1/MAC1 through PHY5/MAC5.]
• Check the R-TAG in the Incoming Packets from MAC1 and MAC2
• Keep Track of the PACKET ID in the Internal Table
• Eliminate Replicated Packets and Send to Requested Ports (MAC4)
• If MAC4 Does Not Request the R-TAG, Remove It
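A minimal sketch of the elimination side, under the same kind of assumptions: packets are dictionaries carrying an `rtag_seq` key, and the class name `Eliminator` and the bounded history size are hypothetical. A real relay would also have to handle sequence-number wraparound, which this sketch only approximates by forgetting old entries.

```python
from collections import OrderedDict


class Eliminator:
    """Forward the first copy of each packet ID and drop later duplicates."""

    def __init__(self, history: int = 1024, keep_rtag: bool = True):
        self.seen = OrderedDict()   # packet IDs in arrival order
        self.history = history      # assumed bound on the internal table
        self.keep_rtag = keep_rtag  # False if the output port does not request the tag

    def handle(self, packet: dict):
        """Return the packet to forward, or None if it is a duplicate."""
        packet_id = (packet["dst"], packet["src"],
                     packet["vlan_id"], packet["rtag_seq"])
        if packet_id in self.seen:
            return None
        self.seen[packet_id] = True
        if len(self.seen) > self.history:
            self.seen.popitem(last=False)   # forget the oldest entry
        out = dict(packet)
        if not self.keep_rtag:
            out.pop("rtag_seq")
        return out


if __name__ == "__main__":
    relay = Eliminator(keep_rtag=False)
    frame = {"dst": "Z", "src": "A", "vlan_id": 10, "rtag_seq": 7}
    print(relay.handle(frame))   # first copy, e.g. arriving on MAC1
    print(relay.handle(frame))   # duplicate, e.g. arriving on MAC2
```

The per-packet table look-up in `handle` is the step the following slide proposes moving into hardware.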
Design Implication for Replication/Elimination
• Software Solution
  - Layer 2 Software Can Implement This Logic; Requires an ID Check on Each Packet
  - This Impacts Latency from Additional Processing
  - Processor Utilization May Exceed Capacity Under Heavy Traffic (~40 Mbits/second of Video Data)
• Suggested Hardware Acceleration
  - R-Tag Insertion or Elimination
  - PACKET ID Look-Up Table (e.g. MAC Addr, VLAN ID, Sequence No.)
Redundancy of Grand Master Clock
No Disruption of Network Devices by GM Failure: IEEE 802.1AS-Rev
Current Diagram for Clock Sync
[Figure: End Points A, B, C, Y, and Z with Clocks A, B, C, Y, and Z, connected through Switch (Relay) F and Switch (Relay) G. Labels: Primary GM, Secondary GM; one failure point is marked (X).]
Current Procedure for Clock Sync Implementation
• End Point A Fails; the GM Clock (Clock A) Is Lost on the Network
• Network Starts BMCA (IEEE 1588 Best Master Clock Algorithm), Which Chooses One from Among Clock B to Clock Z as the New GM Clock
• Clock Y Becomes the New GM Clock
• Switching GM from Clock A to Clock Y
  - Procedure Requires Multiple Seconds
  - All Devices Lose Synchronization During the Procedure
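The re-election step can be illustrated with a much-simplified sketch of the clock comparison. It orders candidates by the attribute sequence used in the IEEE 1588 data set comparison (priority1, clock class, accuracy, variance, priority2, identity; lower wins) and ignores the topology-dependent part of the algorithm and the Announce message exchange. The attribute values are hypothetical, chosen so that Clock Y wins once Clock A is gone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockInfo:
    name: str
    priority1: int
    clock_class: int
    clock_accuracy: int
    variance: int
    priority2: int
    identity: int

    def rank(self):
        """Comparison key: the candidate with the smallest tuple is the best clock."""
        return (self.priority1, self.clock_class, self.clock_accuracy,
                self.variance, self.priority2, self.identity)


def elect_grandmaster(clocks):
    """Pick the best clock among the candidates that are still present."""
    return min(clocks, key=ClockInfo.rank)


# Hypothetical attribute values for the clocks named on the slide.
CLOCKS = [
    ClockInfo("A", 100, 248, 0xFE, 0xFFFF, 128, 0x0A),
    ClockInfo("B", 248, 248, 0xFE, 0xFFFF, 128, 0x0B),
    ClockInfo("C", 248, 248, 0xFE, 0xFFFF, 128, 0x0C),
    ClockInfo("Y", 110, 248, 0xFE, 0xFFFF, 128, 0x19),
    ClockInfo("Z", 248, 248, 0xFE, 0xFFFF, 128, 0x1A),
]

if __name__ == "__main__":
    print(elect_grandmaster(CLOCKS).name)
    survivors = [c for c in CLOCKS if c.name != "A"]
    print(elect_grandmaster(survivors).name)
```

The comparison itself is cheap; the multi-second gap on the slide comes from detecting the loss and letting the election converge across the network.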
Diagram for Redundant GM Clock Sync Implementation
[Figure: Endpoints A, B, C, Y, and Z with Clocks A, B, C, Y, and Z, connected through Bridge F and Bridge G. Labels: Primary GM, Hot-Standby GM; one failure point is marked (X).]
Procedure for Redundant GM Clock Sync Implementation
• Primary GM Is Clock A; Secondary GM Is Clock Y
• Two Domains of 802.1AS Clock Are Running Separately
• Normal Circumstance: GM in the Secondary Domain Is Not Operational
• Upon Failure of Primary GM: Network Seamlessly Switches to Secondary GM
• No Devices Lose Their System Synchronization
Note: Management of Multiple Domains of PTP Messages Is Currently Being Defined in 802.1AS-Rev
Implementation of Redundant GM (Updating the gPTP Kernel)
The Following Functions Must Be in the Updated gPTP:
• Handling of Multiple Domains of SYNC Messages
  - Our Example Is Two Domains; Could Be More
• Manage Clocks of Multiple Domains
  - Keep Track of Primary GM and Secondary Stand-by GM
  - Secondary GM Must Be Synchronized to the Primary GM (Required for Seamless Switching)
• If Primary GM Fails
  - Each gPTP End Device Switches to Secondary GM
  - No Impact from Clock Discontinuity on Any gPTP End Device
  - Switching from Primary GM to Secondary GM Is Seamless
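One way an end device could track two domains and fall back to the standby is sketched below. The class name `GmSelector`, the domain numbers, and the sync timeout are assumptions for illustration; the presentation does not specify a failure-detection rule, and 802.1AS-Rev was still being defined at the time.

```python
class GmSelector:
    """Track Sync reception per domain and choose which domain to follow."""

    def __init__(self, primary_domain=0, secondary_domain=1, timeout_s=0.5):
        self.order = (primary_domain, secondary_domain)
        self.timeout_s = timeout_s   # assumed: silence longer than this means failure
        self.last_sync = {}          # domain -> (local receive time, GM time)

    def on_sync(self, domain, now_s, gm_time_s):
        """Record a Sync message for one domain."""
        self.last_sync[domain] = (now_s, gm_time_s)

    def active_domain(self, now_s):
        """Prefer the primary domain; use the secondary if the primary is silent."""
        for domain in self.order:
            entry = self.last_sync.get(domain)
            if entry is not None and now_s - entry[0] <= self.timeout_s:
                return domain
        return None


if __name__ == "__main__":
    selector = GmSelector()
    selector.on_sync(0, now_s=0.0, gm_time_s=100.0)
    selector.on_sync(1, now_s=0.0, gm_time_s=100.0)
    print(selector.active_domain(0.1))   # primary still fresh
    selector.on_sync(1, now_s=1.0, gm_time_s=101.0)
    print(selector.active_domain(1.1))   # primary silent, standby takes over
```

The switch is only seamless if both domains report the same time at the moment of failover, which is why the slide requires the secondary GM to be synchronized to the primary.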
Replacement of Malfunctioning GM: a Proposal (Updating the gPTP Kernel)
Case of a Malfunctioning GM (Clock is degraded, but not lost)
• Two GMs Inadequate for Redundant Clock Domains with Hot Standby
  - Which GM Is Correct in a Dispute?
  - Requires Third GM to Audit Clock Behavior
• Implementation of the Auditor GM
  - One GM Contests That the Other GM Is Malfunctioning
  - Auditor Checks Status of Both GMs
  - Auditor Renders Decision and Notifies All GMs
  - Auditor Sends Malfunction Notification to the Malfunctioning GM; It Surrenders and Ceases to Be GM
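The presentation does not say how the auditor checks the two GMs. One plausible rule, shown here purely as an assumption, is to compare each GM's reported time against the auditor's own clock and flag any GM whose offset exceeds a tolerance. The function name `audit` and the tolerance value are hypothetical.

```python
def audit(primary_offset_ns, standby_offset_ns, tolerance_ns=1000):
    """Return the GMs judged to be malfunctioning.

    Each offset is that GM's reported time minus the auditor's own time,
    in nanoseconds. The threshold rule is an assumed decision criterion.
    """
    faulty = []
    if abs(primary_offset_ns) > tolerance_ns:
        faulty.append("primary")
    if abs(standby_offset_ns) > tolerance_ns:
        faulty.append("standby")
    return faulty


if __name__ == "__main__":
    # Hypothetical dispute: the standby has drifted 5 microseconds.
    print(audit(primary_offset_ns=50, standby_offset_ns=5000))
```

A GM named in the result would receive the malfunction notification and surrender its role. If both GMs are flagged, the auditor's own clock is also suspect, a case this sketch does not resolve.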
Diagram for Redundant GM Clock Sync Implementation
[Figure: Endpoints A, B, C, Y, and Z with Clocks A, B, C, Y, and Z, connected through Bridge F and Bridge G. Labels: Primary GM, Hot-Standby GM, Auditor.]
Performance Impact of GM Clock Redundancy
• Network Traffic
  - Additional ~1% Overhead in Redundant Sync Messages at 40 Mbits/second
• Software Solution on Each gPTP Node
  - GMAC Software Complexity Will Increase
  - Each PHY/GMAC Receives 2x the Number of Sync Messages
  - Validate and Process the Secondary Sync Messages
  - Input Processing Requires More Performance in PHY/GMAC
• Suggested Hardware Acceleration
  - Detection of Clock Domain ID
  - Keeping Track of Separate Sync Messages and Time Stamps
802.1AS-Rev Spec vs. Implementation
• Standard Only Describes How a Hot-Plug GM Setup Is Envisaged
  - How to manage multiple different domains of PTP messages is still under definition
• Detection of Malfunctioning GM Is Not Part of the Standard
  - Left to Individual Implementation
  - Minimum: Third GM for Monitoring
    - Monitoring and Regular Review
    - Implications for Startup Time
    - Added Cost to Implement
    - Input Processing Requires More Performance in PHY/GMAC
• Cost Implication for Third GM
  - Complexity Left to System/Network Implementer
Summary of Opportunities for Hardware Acceleration
For Frame Replication and Elimination for Reliability:
• R-Tag Insertion or Elimination
• PACKET ID Look-Up Table (e.g. MAC Addr, VLAN ID, Sequence No.)
For Redundancy of Grand Master Clock:
• Detection of Clock Domain ID
• Keeping Track of Separate Sync Messages and Time Stamps
Summary and Conclusion
• Automotive Networking Must Address Mission-Critical Requirements
• Ethernet TSN Has Structures for Redundant Links to Mission-Critical End Devices
• Redundant Data Paths Ensure Mission-Critical Network Links
• Careful Analysis of Network Hops Ensures Guaranteed Latency
• Redundant Clock Domains Could Ensure Seamless Continuity of Mission-Critical Network Operation