ONOS: Open Network Operating System
An Experimental Open-Source Distributed SDN OS
Software Defined Networking
[Diagram (Scott Shenker, ONS '11): control applications such as TE, Routing, Mobility, and Network Virtualization run on top of a Network OS; the Network OS exposes a Global Network View through an Abstract Network Model and uses OpenFlow to program the Packet Forwarding elements below.]
Logically Centralized NOS – Key Questions
[Same SDN layering diagram as above, annotated with the key questions:]
Fault tolerance?
Scale-out?
How to realize a Global Network View?
Related Work
ONIX: a distributed control platform for large-scale networks
- Focus on reliability, scalability, and generality
- State distribution primitives, global network view, ONIX API
Other work
- Distributed control planes: Helios, HyperFlow, Maestro, Kandoo
- Controllers: NOX, POX, Beacon, Floodlight, Trema
Motivation
➢ Build an open-source distributed NOS
➢ Learn and share with the community
➢ Target WAN use cases

Phase 1 Goals (December 2012 – April 2013)
➢ Demo key functionality
  - Fault tolerance: a highly available control plane
  - Scale-out: using a distributed architecture
  - Global network view: the Network Graph abstraction
➢ Non-goals
  - Performance optimization
  - Support for reactive flows
  - Stress testing
ONOS High-Level Architecture
[Diagram: three ONOS instances, each an ONOS core running on Floodlight, control the switches and hosts below. The instances share two stores: the Network Graph (Titan graph DB on a Cassandra in-memory DHT, eventually consistent) and the Distributed Registry (ZooKeeper, strongly consistent).]
How does ONOS scale out?

ONOS: Scale-Out Using Control Isolation
[Diagram: the switches are partitioned among instances 1–3 of the distributed Network OS, which share the Network Graph.]
Simple scale-out design:
➢ Each instance is responsible for building and maintaining a part of the network graph
➢ Control capacity can grow with network size

Distributed Registry for Control Isolation
ONOS: Master Election
[Diagram: switches A–F attach to instances 1–3 of the distributed Network OS; mastership is tracked in the Distributed Registry. Election sequence:]
1. Initially: Master for switch A = NONE.
2. All instances register: Master = NONE; Candidates = ONOS 1, ONOS 2, ONOS 3.
3. The registry elects a master: Master for switch A = ONOS 1; Candidates = ONOS 2, ONOS 3.
Fault Tolerance Using the Distributed Registry
[All instances agree: Master for switch A = ONOS 1; Candidates = ONOS 2, ONOS 3.]
ONOS: Control Plane Failover
[Diagram: instance 1 fails. Failover sequence:]
1. Mastership for switch A reverts: Master = NONE; Candidates = ONOS 2, ONOS 3.
2. The registry elects a new master: Master for switch A = ONOS 2; Candidates = ONOS 3.
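The election and failover sequences above map naturally onto a ZooKeeper leader-election recipe. As a hedged sketch (not the actual ONOS registry code), here is how per-switch mastership could be contested with Apache Curator's LeaderLatch; the connection string, znode path layout, and instance id are assumptions:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SwitchMastership {
    public static void main(String[] args) throws Exception {
        // Connect to the ZooKeeper ensemble backing the Distributed Registry.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181",        // assumed ensemble address
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // One latch per switch: every instance contends on the same znode
        // path, and ZooKeeper grants leadership (mastership) to exactly one.
        String dpid = "00:00:00:00:00:00:00:0a";      // hypothetical DPID for switch A
        LeaderLatch latch = new LeaderLatch(client, "/switches/" + dpid, "ONOS-1");
        latch.start();

        // Blocks until this instance becomes master. If the current master
        // dies, its ephemeral znode vanishes and the next candidate's
        // await() returns -- exactly the failover sequence shown above.
        latch.await();
        System.out.println("Now master for switch " + dpid);

        // ... control the switch; release mastership on shutdown.
        latch.close();
        client.close();
    }
}
```

Because the latch is backed by an ephemeral znode, a crashed instance releases mastership automatically, which is what makes the control plane highly available.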
Network Graph to Realize the Global Network View
[Diagram: the Network Graph is exposed through the Titan graph DB; its vertices and edges (e.g., switch vertices with ids 1–3 and port vertices with ids 101–106) are persisted in the Cassandra in-memory DHT.]
ONOS Network Graph Abstraction
[Diagram: the graph schema. A switch vertex connects to its port vertices via "on" edges; "link" edges join ports on different switches; host (device) vertices attach to the ports they are on.]
➢ Network state is naturally represented as a graph
➢ The graph holds basic network objects: switches, ports, devices, and links
➢ Applications write to this graph and program the data plane
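Titan implements the TinkerPop Blueprints API, so the schema above can be sketched directly against Blueprints. This is a minimal illustration of how switch, port, and host objects might be expressed as vertices and edges; the property names and edge labels follow the diagram but are assumptions, not the actual ONOS schema:

```java
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.impls.tg.TinkerGraph;

public class NetworkGraphSketch {
    public static void main(String[] args) {
        // TinkerGraph stands in for Titan here; Titan exposes the same API.
        Graph g = new TinkerGraph();

        // Two switches, each with one port, joined to it by an "on" edge.
        Vertex swA = g.addVertex(null);
        swA.setProperty("type", "switch");
        swA.setProperty("dpid", "00:0a");
        Vertex portA1 = g.addVertex(null);
        portA1.setProperty("type", "port");
        portA1.setProperty("number", 1);
        g.addEdge(null, swA, portA1, "on");

        Vertex swB = g.addVertex(null);
        swB.setProperty("type", "switch");
        swB.setProperty("dpid", "00:0b");
        Vertex portB1 = g.addVertex(null);
        portB1.setProperty("type", "port");
        portB1.setProperty("number", 1);
        g.addEdge(null, swB, portB1, "on");

        // A "link" edge joins ports on different switches.
        g.addEdge(null, portA1, portB1, "link");

        // A host (device) vertex attaches to the port it was seen on.
        Vertex host = g.addVertex(null);
        host.setProperty("type", "device");
        host.setProperty("mac", "00:00:00:00:00:01");
        g.addEdge(null, host, portA1, "host");
    }
}
```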
Example: Path Computation App on the Network Graph
[Diagram: the same schema extended with flow objects. A flow-path vertex groups the flow-entry vertices of one flow; each flow entry references its switch and its inport and outport.]
• The application computes a path by traversing the links from source to destination
• The application writes each flow entry for the path
Thus the path computation app does not need to worry about topology maintenance.
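A hedged sketch of the traversal step over the same Blueprints schema: a breadth-first search from a source switch vertex to a destination switch vertex, hopping switch → port → link → port → switch. The real ONOS path computation module is not shown in the slides; this only illustrates the idea:

```java
import com.tinkerpop.blueprints.Direction;
import com.tinkerpop.blueprints.Vertex;

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PathComputation {
    /** BFS over switch vertices; returns the switch hops from src to dst, or null. */
    public static List<Vertex> shortestPath(Vertex src, Vertex dst) {
        Map<Vertex, Vertex> parent = new HashMap<>();
        Deque<Vertex> queue = new ArrayDeque<>();
        parent.put(src, src);
        queue.add(src);
        while (!queue.isEmpty()) {
            Vertex sw = queue.remove();
            if (sw.equals(dst)) break;
            for (Vertex port : sw.getVertices(Direction.OUT, "on")) {          // switch -> its ports
                for (Vertex peer : port.getVertices(Direction.BOTH, "link")) { // port -> linked port
                    for (Vertex next : peer.getVertices(Direction.IN, "on")) { // linked port -> its switch
                        if (!parent.containsKey(next)) {
                            parent.put(next, sw);
                            queue.add(next);
                        }
                    }
                }
            }
        }
        if (!parent.containsKey(dst)) return null;   // no path found
        List<Vertex> path = new ArrayList<>();
        for (Vertex v = dst; !v.equals(src); v = parent.get(v)) path.add(0, v);
        path.add(0, src);
        return path;
    }
}
```

With the hops in hand, the app would write one flow-entry vertex per switch into the graph, leaving installation to the Flow Manager described later.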
Example: A Simpler Abstraction on the Network Graph?
Logical Crossbar
[Diagram: a logical crossbar made of virtual network objects (edge ports) is layered over the real network objects of the physical graph (switches, ports, links, hosts).]
• An app or service on top of ONOS
• Maintains the mapping from the simpler virtual objects to the complex physical ones
Thus it makes applications even simpler and enables new abstractions.
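One hedged way to picture that mapping layer: a thin service that resolves the crossbar's virtual edge ports to the physical (switch, port) pairs they stand for. Everything here, including the class and method names, is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the logical-crossbar mapping service. */
public class LogicalCrossbar {
    /** A real network object: a port on a physical switch. */
    record PhysicalPort(String dpid, int portNumber) {}

    private final Map<String, PhysicalPort> virtualToPhysical = new HashMap<>();

    /** Expose a physical port as a named edge port of the crossbar. */
    public void addEdgePort(String edgePortName, String dpid, int portNumber) {
        virtualToPhysical.put(edgePortName, new PhysicalPort(dpid, portNumber));
    }

    /** An app connects two edge ports; the service resolves the physical
     *  endpoints and would then hand them to path computation. */
    public void connect(String srcEdgePort, String dstEdgePort) {
        PhysicalPort src = virtualToPhysical.get(srcEdgePort);
        PhysicalPort dst = virtualToPhysical.get(dstEdgePort);
        System.out.printf("wire %s:%d -> %s:%d%n",
                src.dpid(), src.portNumber(), dst.dpid(), dst.portNumber());
    }
}
```

The app sees only edge ports; the crossbar service hides topology entirely, which is the "even simpler" abstraction the slide refers to.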
Phase 1 Goals (December 2012 – April 2013)
➢ Demo key functionality
  ✓ Fault tolerance: a highly available control plane
  ✓ Scale-out: using a distributed architecture
  ✓ Global network view: the Network Graph abstraction
How is the Network Graph built and maintained by ONOS?
Network Graph and Switches
[Diagram: a Switch Manager (SM) on each instance speaks OpenFlow to its switches and publishes switch and port objects into the shared Network Graph.]

Network Graph and Link Discovery
[Diagram: a Link Discovery (LD) module on each instance sends and receives LLDP probes on its switches and publishes link objects between switch ports into the Network Graph.]

Devices and Network Graph
[Diagram: a Device Manager (DM) on each instance learns hosts from PACKET_IN events and publishes device (host) objects into the Network Graph.]
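A hedged sketch of the link-discovery step just described: a probe is sent out of every port an instance masters, and when it reappears as a PACKET_IN on another switch, the pair of endpoints identifies one link. All types here are invented stand-ins, not the Floodlight or ONOS classes:

```java
/** Hypothetical sketch of LLDP-based link discovery. */
public class LinkDiscovery {
    /** A (switch, port) endpoint. */
    record Endpoint(String dpid, int port) {}
    /** An LLDP probe carries the endpoint it was sent from. */
    record LldpProbe(Endpoint sentFrom) {}

    /** Minimal interface to the shared Network Graph. */
    interface NetworkGraph {
        void addLink(Endpoint src, Endpoint dst);
    }

    private final NetworkGraph graph;

    public LinkDiscovery(NetworkGraph graph) {
        this.graph = graph;
    }

    /** Called when an LLDP probe arrives as a PACKET_IN on (dpid, inPort). */
    public void onLldpPacketIn(String dpid, int inPort, LldpProbe probe) {
        // The probe names the port it left; the PACKET_IN names the port it
        // arrived on -- together they identify one unidirectional link.
        graph.addLink(probe.sentFrom(), new Endpoint(dpid, inPort));
    }
}
```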
How Do Applications Use the Network Graph?
Path Computation with the Network Graph
[Diagram: a Path Computation (PC) module on each instance traverses the Network Graph and writes flow paths (Flow 1–8), each expanded into per-switch flow entries, back into the graph.]
Network Graph and Flow Manager
[Diagram: a Flow Manager (FM) on each instance reads from the Network Graph the flow entries destined for the switches it masters and pushes them to the data plane as OpenFlow FLOWMOD messages.]
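The slides build on Floodlight, so the FLOWMOD push can be sketched against the pre-Loxigen Floodlight/openflowj API. This is a hedged illustration, not the ONOS Flow Manager itself, and exact signatures vary across Floodlight versions:

```java
import java.util.Collections;

import net.floodlightcontroller.core.IFloodlightProviderService;
import net.floodlightcontroller.core.IOFSwitch;
import org.openflow.protocol.OFFlowMod;
import org.openflow.protocol.OFMatch;
import org.openflow.protocol.OFPacketOut;
import org.openflow.protocol.OFPort;
import org.openflow.protocol.OFType;
import org.openflow.protocol.action.OFAction;
import org.openflow.protocol.action.OFActionOutput;

public class FlowPusher {
    private final IFloodlightProviderService floodlightProvider;

    public FlowPusher(IFloodlightProviderService floodlightProvider) {
        this.floodlightProvider = floodlightProvider;
    }

    /** Push one flow entry (match on inPort, forward to outPort) to a switch. */
    public void pushEntry(IOFSwitch sw, short inPort, short outPort) throws Exception {
        OFFlowMod fm = (OFFlowMod) floodlightProvider
                .getOFMessageFactory().getMessage(OFType.FLOW_MOD);

        // Match only on the ingress port; wildcard everything else.
        OFMatch match = new OFMatch();
        match.setInputPort(inPort);
        match.setWildcards(OFMatch.OFPFW_ALL & ~OFMatch.OFPFW_IN_PORT);

        OFActionOutput output = new OFActionOutput();
        output.setPort(outPort);
        output.setMaxLength((short) 0xffff);

        fm.setMatch(match);
        fm.setCommand(OFFlowMod.OFPFC_ADD);
        fm.setIdleTimeout((short) 0);
        fm.setHardTimeout((short) 0);
        fm.setBufferId(OFPacketOut.BUFFER_ID_NONE);
        fm.setOutPort(OFPort.OFPP_NONE.getValue());
        fm.setActions(Collections.<OFAction>singletonList(output));
        fm.setLengthU(OFFlowMod.MINIMUM_LENGTH + OFActionOutput.MINIMUM_LENGTH);

        sw.write(fm, null);   // second argument: FloodlightContext (none here)
        sw.flush();
    }
}
```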
ONOS uses two different consistency models. Why?
ONOS High-Level Architecture (recap)
[Same architecture diagram: the three instances share the eventually consistent Network Graph (Titan on the Cassandra in-memory DHT) and the strongly consistent Distributed Registry (ZooKeeper).]
Consistency Definitions
➢ Strong consistency: after an update to the network state by one instance, all subsequent reads by any instance return the last updated value.
➢ Strong consistency adds complexity and latency to distributed data management.
➢ Eventual consistency is a slight relaxation that allows readers to be behind for a short period of time.
Strong Consistency Using the Registry
[Timeline: initially, all instances see Switch A Master = NONE. When a master is elected for switch A, the registry switches over atomically: instances 1, 2, and 3 all see Switch A Master = ONOS 1, and no instance ever reads a stale value.]
Cost of Locking
[Timeline: while the election lock is held, all instances still read Switch A Master = NONE; the update becomes visible only once the election completes.]
Why Strong Consistency Is Needed for Master Election
➢ Weaker consistency might mean a master election on instance 1 is not yet visible on the other instances.
➢ That can lead to multiple masters for a switch.
➢ Multiple masters would break our semantics of control isolation.
➢ Strong locking semantics are therefore needed for master election.
Eventual Consistency in the Network Graph
[Timeline: switch A starts as STATE = INACTIVE on all instances. When the switch connects to ONOS, instance 1 writes Switch A STATE = ACTIVE to the DHT. For a short window instance 1 reads ACTIVE while instances 2 and 3 still read INACTIVE; eventually all instances converge on ACTIVE.]
Consistency Cost

Cost of Eventual Consistency
➢ The short delay means switch A's state is not yet ACTIVE on some ONOS instances in the previous example.
➢ Applications on one instance will compute flows through switch A while the other instances will not yet use switch A for path computation.
➢ Eventual consistency becomes more visible during control plane network congestion.
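To make the trade-off concrete, Cassandra lets every read and write name its own consistency level. A hedged sketch with the DataStax Java driver; this is illustrative only (the prototype goes through Titan rather than raw CQL, and the keyspace, table, and column names are invented):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class SwitchStateStore {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("onos");   // hypothetical keyspace

        // Fast, eventually consistent write: one replica acknowledges and the
        // rest converge later -- other instances may briefly still read INACTIVE.
        session.execute(new SimpleStatement(
                "UPDATE switch_state SET state = 'ACTIVE' WHERE dpid = '00:0a'")
                .setConsistencyLevel(ConsistencyLevel.ONE));

        // Stronger, slower read: a quorum of replicas must agree -- the extra
        // latency is the cost the slides attribute to strong consistency.
        ResultSet rs = session.execute(new SimpleStatement(
                "SELECT state FROM switch_state WHERE dpid = '00:0a'")
                .setConsistencyLevel(ConsistencyLevel.QUORUM));
        System.out.println(rs.one().getString("state"));

        cluster.close();
    }
}
```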
Why Is Eventual Consistency Good Enough for Network State?
➢ Physical network state changes asynchronously
➢ Strong consistency across the data and control planes is too hard
➢ Control apps know how to deal with eventual consistency
➢ In the current distributed control plane, each router makes its own decisions based on old information from other parts of the network, and it works fine
➢ Strong consistency is more likely to lead to inaccuracy of network state, as network congestion is real
ONOS High-Level Architecture (recap)
[Same architecture diagram as before: three ONOS instances (ONOS core over Floodlight) sharing the eventually consistent Network Graph and the strongly consistent Distributed Registry.]
ONOS – Questions
➢ Is a graph the right abstraction?
  - Can it scale for reactive flows?
  - What is the concurrency requirement on the graph?
  - Is it large enough to need a NoSQL backend?
  - What about notifications or publish/subscribe on the graph?
➢ Are we using the right technologies for the graph?
  - Is a DHT the right choice?
  - Cassandra has latencies on the order of milliseconds; is that OK?
  - What throughput do we need?
  - Titan is good for rapid prototyping; is it good enough for production?
➢ Have we got our consistency and partition tolerance right?
  - What is the latency impact?
  - Should we pick availability over consistency?
What Is Next for ONOS

ONOS Core
- Performance benchmarks and improvements
- Reactive flows and low-latency forwarding
- Events, callbacks, and a publish/subscribe API
- Expanded graph abstraction for more types of network state
- ONOS Northbound API

ONOS Apps
- Service chaining
- Network monitoring, analytics, and debugging framework

Community
- Release as open source and/or contribute to OpenDaylight
- Build and assist a developer community outside ON.LAB
- Support deployments in R&E networks
www.onlab.us