In the words of the Open Networking Foundation, one of the key concepts to understanding SDN is the separation of the control plane and the data plane. A network is typically composed of many routers and switches, each exchanging table information to build topologies. Each of these network devices has its own individualized control plane for brain-like functions such as route or MAC learning, and its own data plane for forwarding packets. The challenge is that each device has its own perspective of the network, and the only way to view that perspective is to connect to the device via a CLI and issue commands or configurations. The same applies to other devices such as firewalls and load balancers, not just routers and switches.
A control plane is the brain of the operations. It typically runs in software and builds the necessary tables for forwarding, such as the RIB or MAC tables. This table is typically sent as a copy down to the forwarding plane and installed in hardware to allow for high throughput forwarding of traffic. These two planes (and usually an additional management plane for things like SSH, SNMP, etc.) are traditionally running on every single network device in a network.
OpenFlow is an open standard for a communications protocol that enables the control plane to break off and interact with the forwarding plane of multiple devices from some central point, decoupling roles for higher functionality and programmability.
Before I go further, it must be noted that:
- SDN is not OpenFlow.
- SDN is much more than a split control and data plane.
- SDN is cool, as is OpenFlow, but they are not the same.
- You’re not alone if you find SDN a vague concept. Just look at Wikipedia.
A need for network abstraction
Application developers typically have no need to worry about the underlying hardware when writing applications; the hardware has been abstracted by the operating system. Often, even the operating system itself is abstracted from the hardware via hypervisors or containerization. This layer of abstraction is a relatively new concept in the networking industry, with OpenFlow acting as a freedom fighter of sorts, creating an open interface for network abstraction layers.
This abstraction capability could be done with a controller layer. You can manipulate flow tables and flow entries on network devices without directly connecting to the network devices. The application developer can use an API to communicate to the controller, and the controller takes care of the details needed to update the network devices flow tables. The beauty of SDN is in the Application layer. OpenFlow is one (of many) possible means to achieve the abstraction needed for SDN.
It’s important to note here that OpenFlow does not update configurations on network devices; OpenFlow updates the flow tables of network devices. If traffic from host A needs to reach host B, OpenFlow logic makes it so. If you need to configure NTP on the network devices, that is not a job for OpenFlow. You will still need to configure device settings using some other protocol or interface, such as SNMP, NETCONF, or OVSDB.
Components of an OpenFlow Switch
Directly from the OpenFlow Switch Specification document:
“An OpenFlow Logical Switch consists of one or more flow tables and a group table, which perform packet lookups and forwarding, and one or more OpenFlow channels to an external controller (Figure 1). The switch communicates with the controller and the controller manages the switch via the OpenFlow switch protocol.”
Using the OpenFlow switch protocol, the controller can add, update, and delete flow entries in flow tables, both reactively (in response to packets) and proactively.
Reactive flow entries are created when the controller dynamically learns where devices are in the topology and must update the flow tables on those devices to build end-to-end connectivity. Since the switches in a pure OpenFlow environment are simply forwarders of traffic, all forwarding logic must first be dictated and programmed by the controller. So, if a host on switch A needs to talk to a host on switch B, messages are sent to the controller to find out how to reach that host. The controller learns the hosts’ MAC addresses and how the switches connect, then programs the logic into the flow tables of each switch. This is a reactive flow entry.
Proactive Flow Entries are programmed before traffic arrives. If it’s already known that two devices should or should not communicate, the controller can program these flow entries on the OpenFlow endpoints ahead of time.
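The reactive case above can be sketched in a few lines of Python. This is a toy simulation, not a real controller API; the class and method names are invented purely for illustration:

```python
# Toy simulation of reactive flow programming: the controller learns host
# locations from packet-in events and installs matching flow entries so
# subsequent packets never need to visit the controller.

class ToyController:
    def __init__(self):
        self.mac_table = {}   # (switch, mac) -> port where that MAC was seen
        self.flows = {}       # switch -> list of installed flow entries

    def packet_in(self, switch, in_port, src_mac, dst_mac):
        """Called when a switch punts an unknown packet to the controller."""
        # Learn where the source lives (reactive learning).
        self.mac_table[(switch, src_mac)] = in_port
        out_port = self.mac_table.get((switch, dst_mac))
        if out_port is None:
            return "flood"    # destination unknown: flood the packet
        # Program a reactive flow entry so future packets skip the controller.
        self.flows.setdefault(switch, []).append(
            {"match": {"dl_dst": dst_mac}, "action": f"output:{out_port}"})
        return f"output:{out_port}"

ctl = ToyController()
ctl.packet_in("s1", 1, "00:00:00:00:00:01", "00:00:00:00:00:02")  # h2 unknown: flood
result = ctl.packet_in("s1", 2, "00:00:00:00:00:02", "00:00:00:00:00:01")
print(result)  # output:1 -- and a flow entry toward h1 is now installed
```

In a real deployment the controller would also send a PACKET_OUT carrying the buffered packet; this sketch only models the learning and flow-installation side.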
Traffic Matching, Pipeline Processing, and Flow Table Navigation
In an OpenFlow network, each OpenFlow switch contains at least 1 flow table and a set of flow entries within that table. These flow entries contain match fields, counters and instructions to apply to matched packets.
Typically, you’ll have more than a single flow table, so it’s important to note that matching starts at the first flow table and may continue to additional flow tables in the pipeline. The packet first starts in table 0, where entries are checked in priority order: the highest priority matches first (e.g. 200, then 100, then 1). If the flow needs to continue to another table, a Goto-Table instruction sends the packet to the table specified in the matched entry’s instructions.
This pipeline processing will happen in two stages, ingress processing and egress processing.
If a matching entry is found, the instructions associated with the specific flow entry are executed. If no match is found in a flow table, the outcome depends on the configuration of the Table-miss flow entry.
Table-miss flow entry
The Table-miss flow entry is the last in the table, has a priority of 0 and a match of anything. It’s like a catch-all, and the actions to be taken depend on how you configure it. You can forward the packet to the controller over the OpenFlow Channel, or you could drop the packet, or continue with the next flow table.
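Priority-ordered matching, Goto-Table instructions, and the table-miss entry can be sketched together. Everything here (the field names and instruction strings) is illustrative, not the actual OpenFlow wire format:

```python
# Sketch of flow-table pipeline lookup: entries are checked highest priority
# first; a "goto" instruction sends the packet to a later table, and the
# priority-0 table-miss entry catches anything unmatched.

def matches(entry, packet):
    return all(packet.get(k) == v for k, v in entry["match"].items())

def pipeline_lookup(tables, packet, table_id=0):
    # Evaluate entries from highest to lowest priority.
    for entry in sorted(tables[table_id], key=lambda e: -e["priority"]):
        if matches(entry, packet):
            instr = entry["instruction"]
            if instr.startswith("goto:"):
                return pipeline_lookup(tables, packet, int(instr[5:]))
            return instr
    return "drop"  # table with no entries at all

tables = {
    0: [
        {"priority": 200, "match": {"in_port": 1}, "instruction": "goto:1"},
        {"priority": 0,   "match": {},             "instruction": "to-controller"},  # table-miss
    ],
    1: [
        {"priority": 100, "match": {"dl_dst": "00:00:00:00:00:02"}, "instruction": "output:2"},
        {"priority": 0,   "match": {},                              "instruction": "drop"},  # table-miss
    ],
}

print(pipeline_lookup(tables, {"in_port": 1, "dl_dst": "00:00:00:00:00:02"}))  # output:2
print(pipeline_lookup(tables, {"in_port": 3}))  # to-controller (table-miss in table 0)
```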
OpenFlow Ports
Directly from the OpenFlow Switch Specification document:
“OpenFlow ports are the network interfaces for passing packets between OpenFlow processing and the rest of the network. OpenFlow switches connect logically to each other via their OpenFlow ports…”
There are three types of ports that an OpenFlow switch must support: physical ports, logical ports, and reserved ports.
Physical ports are switch-defined ports that correspond to a hardware interface on the switch. This could mean a one-to-one mapping of OpenFlow physical ports to hardware-defined Ethernet interfaces on the switch, but it doesn’t necessarily have to be one-to-one. OpenFlow switches can have physical ports that are actually virtual and map to some virtual representation of a physical port, similar to the way hardware network interfaces are virtualized in compute environments.
Logical ports are switch-defined ports that do not correspond directly to hardware interfaces on the switch. Examples include LAGs, tunnels, and loopback interfaces. The only differences between physical ports and logical ports are that a packet associated with a logical port may have an extra pipeline field called Tunnel-ID, and that when packets received on logical ports require communication to the controller, both the logical port and the underlying physical port are reported to the controller.
The OpenFlow reserved ports specify generic forwarding actions such as sending to the controller, flooding, or forwarding using non-OpenFlow methods, such as “normal” switch processing.
There are several required reserved ports. For example, the CONTROLLER port represents the OpenFlow Channel used for communication between the switch and the controller. In hybrid environments, you’ll also see the NORMAL and FLOOD ports, which allow interaction between the OpenFlow pipeline and the traditional processing pipeline of the switch.
OpenFlow-only Switches vs. OpenFlow-hybrid switches
There are two types of OpenFlow switches: OpenFlow-only, and OpenFlow-hybrid.
OpenFlow-only switches are “dumb switches” that have only a data/forwarding plane and no way of making local decisions. All packets are processed by the OpenFlow pipeline and cannot be processed otherwise.
OpenFlow-hybrid switches support both OpenFlow operation and normal Ethernet switching operation. This means you can use traditional L2 Ethernet switching, VLAN isolation, L3 routing, ACLs and QoS processing via the switch’s local control plane while interacting with the OpenFlow pipeline using various classification mechanisms.
You could have a switch with half of its ports using traditional routing and switching, while the other half is configured for OpenFlow. The OpenFlow half would be managed by an OpenFlow controller, and the other half by the local switch control plane. Passing traffic between these pipelines would require the use of a NORMAL or FLOOD reserved port.
The OpenFlow protocol supports three message types, each with its own set of sub-types:
Controller-to-switch messages are initiated by the controller and used to directly manage or inspect the switch. These messages include:
- Features – request the identity and basic capabilities of the switch
- Configuration – set and query configuration parameters
- Modify-State – also called ‘flow mods’, used to add, delete, and modify flow/group entries
- Read-State – collect statistics and state information
- Packet-out – send a packet out of a specified switch port; carries either a full packet or a buffer ID
- Barrier – request/reply messages used by the controller to ensure message dependencies have been met and to receive notification of completed operations
- Role-Request – set or query the role of the controller’s OpenFlow channel
- Asynchronous-Configuration – set a filter on the asynchronous messages the controller wants to receive on its OpenFlow channel
Asynchronous messages are initiated by the switch and used to update the controller of network events and changes to the switch state. These messages include:
- Packet-in – transfer control of a packet to the controller
- Flow-Removed – inform the controller that a flow entry has been removed
- Port-status – inform the controller of a change on a port, such as a link going down
- Error – notify the controller of problems
Symmetric messages are initiated by either the switch or the controller and sent without solicitation. These messages include:
- Hello – introduction or keep-alive messages exchanged between switch and controller
- Echo – sent by either the switch or the controller, these verify liveness of the connection and can be used to measure latency or bandwidth
- Experimenter – a standard way for OpenFlow switches to offer additional functionality within the OpenFlow message type space
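Whatever the type, every OpenFlow message begins with the same fixed 8-byte header: version, type, length, and transaction ID (xid). A minimal sketch of packing one with Python’s standard library, using constants from the 1.3 spec:

```python
import struct

OFP_VERSION_1_3 = 0x04      # OpenFlow 1.3 wire version
OFPT_HELLO = 0              # symmetric: sent by both sides on connect
OFPT_FEATURES_REQUEST = 5   # controller-to-switch

def ofp_message(msg_type, payload=b"", xid=1):
    # Header layout per the spec: uint8 version, uint8 type,
    # uint16 length (header + payload), uint32 xid, network byte order.
    header = struct.pack("!BBHI", OFP_VERSION_1_3, msg_type,
                         8 + len(payload), xid)
    return header + payload

hello = ofp_message(OFPT_HELLO)
print(hello.hex())  # 0400000800000001
```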
OpenFlow Connection Sequence
- The switch initiates the connection to the controller’s IP address and default transport port (TCP 6633 prior to OpenFlow 1.3.2, TCP 6653 after), or a user-specified port.
- The controller can also initiate the connection request, but this isn’t common.
- The TCP or TLS connection is established.
- Both sides send an OFPT_HELLO with a populated version field.
- Both calculate the negotiated version to be used. If a version cannot be agreed upon, an OFPT_ERROR message is sent.
- If both support the version, the controller sends an OFPT_FEATURES_REQUEST to gather the Datapath ID of the switch, along with the switch’s capabilities.
We’ll take a look at this process in the section below.
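The HELLO version negotiation can be sketched as follows. Absent the optional version-bitmap hello element, the negotiated version is simply the lower of the two advertised versions; this function is a simplified model, not spec-complete:

```python
# Map of OpenFlow wire versions to human-readable names.
OFP_VERSIONS = {0x01: "1.0", 0x04: "1.3", 0x05: "1.4", 0x06: "1.5"}

def negotiate(switch_ver, controller_ver):
    """Return the negotiated OpenFlow version name, or None on failure."""
    agreed = min(switch_ver, controller_ver)
    # If the lower of the two isn't a version both sides recognize, the spec
    # calls for an OFPT_ERROR (hello failed) and the connection is terminated.
    return OFP_VERSIONS.get(agreed)

print(negotiate(0x04, 0x05))  # switch tops out at 1.3, controller at 1.4 -> 1.3
print(negotiate(0x04, 0x04))  # both advertise 1.3 -> 1.3
```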
Observing OpenFlow Messages in Wireshark
- Mininet is at 192.168.0.11
- ODL is at 192.168.0.12
First, I start Mininet, specifying the use of OpenFlow 1.3, and initiate a ping between hosts h1 and h2.
mininet@mininet-vm:~$ sudo mn --controller=remote,192.168.0.12 --mac --topo=single,2 --switch=ovsk,protocols=OpenFlow13
*** Creating network
*** Adding controller
*** Adding hosts:
h1 h2
*** Adding switches:
s1
*** Adding links:
(h1, s1) (h2, s1)
*** Configuring hosts
h1 h2
*** Starting controller
c0
*** Starting 1 switches
s1 ...
*** Starting CLI:
mininet> h1 ping h2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.746 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.365 ms
^C
--- 10.0.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.365/0.555/0.746/0.191 ms
The topology generated in mininet looks like this on the OpenDaylight controller:
We can see that host 1 is connected to s1-eth1 and host 2 is connected to s1-eth2. Notice there is a LOCAL management interface, which is the local management plane of the switch used to send traffic directly to the control plane.
The mininet topology can be viewed by looking at the Open vSwitch (OVS) database:
mininet@mininet-vm:~$ sudo ovs-vsctl show
0b8ed0aa-67ac-4405-af13-70249a7e8a96
    Bridge "s1"
        Controller "ptcp:6634"
        Controller "tcp:192.168.0.12:6633"
            is_connected: true
        fail_mode: secure
        Port "s1"
            Interface "s1"
                type: internal
        Port "s1-eth2"
            Interface "s1-eth2"
        Port "s1-eth1"
            Interface "s1-eth1"
    ovs_version: "2.0.2"
We can see the flows created by dumping flows in mininet:
mininet@mininet-vm:~$ sudo ovs-ofctl -O OpenFlow13 dump-flows s1
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x2b00000000000011, duration=401.344s, table=0, n_packets=7, n_bytes=518, priority=2,in_port=1 actions=output:2,CONTROLLER:65535
 cookie=0x2b00000000000010, duration=401.344s, table=0, n_packets=6, n_bytes=476, priority=2,in_port=2 actions=output:1,CONTROLLER:65535
 cookie=0x2b0000000000000b, duration=405.347s, table=0, n_packets=0, n_bytes=0, priority=100,dl_type=0x88cc actions=CONTROLLER:65535
 cookie=0x2b0000000000000b, duration=405.34s, table=0, n_packets=0, n_bytes=0, priority=0 actions=drop
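Since each line of the dump-flows output follows a regular key=value shape, it’s straightforward to pull apart programmatically. Here is a small illustrative parser; the function and its return format are my own, not part of any OVS tooling:

```python
import re

def parse_flow(line):
    """Parse one ovs-ofctl dump-flows line into a dict of fields."""
    # Split the match/stats portion from the action list.
    match_part, actions = line.split(" actions=")
    flow = dict(re.findall(r"(\w+)=([^,\s]+)", match_part))
    flow["actions"] = actions.split(",")
    return flow

line = ("cookie=0x2b00000000000011, duration=401.344s, table=0, n_packets=7, "
        "n_bytes=518, priority=2,in_port=1 actions=output:2,CONTROLLER:65535")
flow = parse_flow(line)
print(flow["priority"], flow["in_port"], flow["actions"])
# 2 1 ['output:2', 'CONTROLLER:65535']
```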
Using a Wireshark display filter for openflow_v4 (which means OpenFlow 1.3), we can view the packets caught:
Notice that the first packet caught is a HELLO message sent from the switch (192.168.0.11) to controller (192.168.0.12). This is exactly what is expected per the OpenFlow Specification document. This is the initialization of the OpenFlow Channel, which is the datapath for the OpenFlow protocol between the controller and device, which in this case is a mininet switch.
Back to this drawing, we’re seeing the communication between the OpenFlow switch’s OpenFlow Channel and the controller. Also notice this is using the Control Channel, meaning this traffic is using the LOCAL management port of the switch, and not going through the pipeline of the switch itself.
The next message type we see is an OFPT_FEATURES_REQUEST from the controller to the switch.
Among other capability attributes, this message is used to find the Datapath Identifier (DPID) of the switch. The DPID uniquely identifies a datapath in the OpenFlow topology. It is typically formed by placing the device MAC address in the lower 48 bits, prepended with a 16-bit value determined by the implementer.
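A DPID laid out this way is easy to compose and decompose; a quick sketch (the helper names here are mine, purely for illustration):

```python
def make_dpid(mac, upper16=0x0000):
    """Build a 64-bit DPID: implementer-defined upper 16 bits + 48-bit MAC."""
    mac_int = int(mac.replace(":", ""), 16)   # MAC string -> 48-bit integer
    return (upper16 << 48) | mac_int

def dpid_mac(dpid):
    """Recover the MAC address from the lower 48 bits of a DPID."""
    return ":".join(f"{(dpid >> s) & 0xFF:02x}" for s in range(40, -8, -8))

dpid = make_dpid("00:00:00:00:00:01")
print(f"{dpid:016x}")   # 0000000000000001
print(dpid_mac(dpid))   # 00:00:00:00:00:01
```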
The switch replies with an OFPT_FEATURES_REPLY.
The OpenFlow Specification states that the n_tables field describes the number of tables supported by the switch, each of which can have a different set of supported match fields, actions and number of entries. This is included in the features reply message, as seen here:
If the controller needs to understand the size, types, and order in which the tables are consulted, it sends an OFPMP_TABLE_FEATURES multipart message. The switch must then return descriptions of the tables in the order in which packets traverse them.
Multipart messages are used to encode requests or replies that could potentially carry large amounts of data that may not fit into a single message; normal messages are limited to 64 KB. Multipart messages are encoded and sent in sequence. These message types are primarily used to request statistics or state information.
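The chunking idea can be sketched like this. The flag mirrors OFPMPF_REPLY_MORE, which marks every part except the last, though the exact size arithmetic here is illustrative rather than spec-exact:

```python
# Split a large reply body into multipart chunks. A real implementation
# would also attach the 16-byte OpenFlow + multipart headers to each chunk.
MAX_BODY = 65535 - 16   # message-size cap minus the headers (illustrative)

def to_multipart(body, max_body=MAX_BODY):
    chunks = [body[i:i + max_body] for i in range(0, len(body), max_body)] or [b""]
    # Each element: (flags, payload); flags=1 means "more chunks follow".
    return [(1 if i < len(chunks) - 1 else 0, c) for i, c in enumerate(chunks)]

parts = to_multipart(b"x" * 150000, max_body=65519)
print([(flags, len(c)) for flags, c in parts])
```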
In our packet capture we can see this exchange of information. Looking at the OFPT_MULTIPART_REPLY, OFPMP_DESC packet, we can see attributes such as the manufacturer and the hardware/software versions.
In the OFPT_MULTIPART_REPLY, OFPMP_PORT_DESC message, you’ll see information about the switch ports sent to the controller. Everything from type to status to speed is included in this message. In this case, the switch port is still down.
PACKET-IN and PACKET-OUT Messages
PACKET_IN messages are sent from the switch to the controller. I initiated a ping from h1 to h2 in mininet.
mininet> h1 ping h2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.658 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.446 ms
^C
--- 10.0.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.446/0.552/0.658/0.106 ms
In the capture, the resulting PACKET_IN messages can be seen.
We first see a PACKET_IN message from the switch to the controller because h1 doesn’t yet know how to get to h2. This is an ARP message, encapsulated by OpenFlow over TCP.
The next packet in succession is the ARP reply from h2:
Followed by the ICMP echo
And echo reply
These flows can be seen by dumping the flows on the switch:
mininet@mininet-vm:~$ sudo ovs-ofctl -O OpenFlow13 dump-flows s1
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x2a00000000000023, duration=394.662s, table=0, n_packets=18, n_bytes=2321, idle_timeout=300, hard_timeout=600, priority=10,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 actions=output:2
 cookie=0x2a00000000000022, duration=394.662s, table=0, n_packets=19, n_bytes=1424, idle_timeout=300, hard_timeout=600, priority=10,dl_src=00:00:00:00:00:02,dl_dst=00:00:00:00:00:01 actions=output:1
 cookie=0x2b00000000000011, duration=401.344s, table=0, n_packets=7, n_bytes=518, priority=2,in_port=1 actions=output:2,CONTROLLER:65535
 cookie=0x2b00000000000010, duration=401.344s, table=0, n_packets=6, n_bytes=476, priority=2,in_port=2 actions=output:1,CONTROLLER:65535
 cookie=0x2b0000000000000b, duration=405.347s, table=0, n_packets=0, n_bytes=0, priority=100,dl_type=0x88cc actions=CONTROLLER:65535
 cookie=0x2b0000000000000b, duration=405.34s, table=0, n_packets=0, n_bytes=0, priority=0 actions=drop
The idle timeout expires when there are no hits on the flow entry for the configured interval, and the flow is then removed. If I were to fire up another connection between these hosts before that happens, I would not capture the packets at the controller, since the flows are already programmed on the switch.
This is true if I have a single switch in the topology or 50 switches in the topology.
mininet@mininet-vm:~$ sudo mn --controller=remote,192.168.99.104 --mac --topo=linear,50
*** Creating network
*** Adding controller
*** Adding hosts:
h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20 h21 h22 h23 h24 h25 h26 h27 h28 h29 h30 h31 h32 h33 h34 h35 h36 h37 h38 h39 h40 h41 h42 h43 h44 h45 h46 h47 h48 h49 h50
*** Adding switches:
s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19 s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39 s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50
*** Adding links:
(h1, s1) (h2, s2) (h3, s3) (h4, s4) (h5, s5) (h6, s6) (h7, s7) (h8, s8) (h9, s9) (h10, s10) (h11, s11) (h12, s12) (h13, s13) (h14, s14) (h15, s15) (h16, s16) (h17, s17) (h18, s18) (h19, s19) (h20, s20) (h21, s21) (h22, s22) (h23, s23) (h24, s24) (h25, s25) (h26, s26) (h27, s27) (h28, s28) (h29, s29) (h30, s30) (h31, s31) (h32, s32) (h33, s33) (h34, s34) (h35, s35) (h36, s36) (h37, s37) (h38, s38) (h39, s39) (h40, s40) (h41, s41) (h42, s42) (h43, s43) (h44, s44) (h45, s45) (h46, s46) (h47, s47) (h48, s48) (h49, s49) (h50, s50) (s2, s1) (s3, s2) (s4, s3) (s5, s4) (s6, s5) (s7, s6) (s8, s7) (s9, s8) (s10, s9) (s11, s10) (s12, s11) (s13, s12) (s14, s13) (s15, s14) (s16, s15) (s17, s16) (s18, s17) (s19, s18) (s20, s19) (s21, s20) (s22, s21) (s23, s22) (s24, s23) (s25, s24) (s26, s25) (s27, s26) (s28, s27) (s29, s28) (s30, s29) (s31, s30) (s32, s31) (s33, s32) (s34, s33) (s35, s34) (s36, s35) (s37, s36) (s38, s37) (s39, s38) (s40, s39) (s41, s40) (s42, s41) (s43, s42) (s44, s43) (s45, s44) (s46, s45) (s47, s46) (s48, s47) (s49, s48) (s50, s49)
*** Configuring hosts
h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20 h21 h22 h23 h24 h25 h26 h27 h28 h29 h30 h31 h32 h33 h34 h35 h36 h37 h38 h39 h40 h41 h42 h43 h44 h45 h46 h47 h48 h49 h50
*** Starting controller
c0
*** Starting 50 switches
s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19 s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39 s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50 ...
*** Starting CLI:
mininet> h1 ping h50
PING 10.0.0.50 (10.0.0.50) 56(84) bytes of data.
64 bytes from 10.0.0.50: icmp_seq=1 ttl=64 time=6.16 ms
64 bytes from 10.0.0.50: icmp_seq=2 ttl=64 time=0.370 ms
^C
--- 10.0.0.50 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.370/3.268/6.167/2.899 ms
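The idle and hard timeouts seen in the earlier flow dump can be modeled in a few lines. This is a simplified sketch, not OVS’s actual expiry logic; the timeout values match the idle_timeout=300 and hard_timeout=600 from the dump, and all times are in seconds:

```python
def flow_expired(now, installed_at, last_hit, idle_timeout=300, hard_timeout=600):
    """Return True if a flow entry should be removed at time `now`."""
    if hard_timeout and now - installed_at >= hard_timeout:
        return True   # hard timeout: absolute lifetime exceeded, traffic or not
    if idle_timeout and now - last_hit >= idle_timeout:
        return True   # idle timeout: no matching packets recently
    return False

print(flow_expired(now=250, installed_at=0, last_hit=200))  # False: active and young
print(flow_expired(now=550, installed_at=0, last_hit=200))  # True: idle for 350s
print(flow_expired(now=650, installed_at=0, last_hit=640))  # True: hard timeout hit
```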
I hope this gives you an idea of some of the basic operations of OpenFlow.
Overlaid & OpenFlow
I was inspired to learn more about OpenFlow after attending Networking Field Day 13. NEC presented a couple of super interesting use cases showcasing SDN with OpenFlow for Unified Communications optimization and security threat control in enterprise networks. The NEC ProgrammableFlow Controller is wildly impressive, a perfect example of what dreams may come with OpenFlow. Check out their presentations here:
This article is a small compilation of some notes I took on OpenFlow while cruising through the GNS3 SDN & OpenFlow course by David Bombal. While this is barely scratching the surface of OpenFlow, and especially SDN, I find it valuable to go through the motions and understand what is happening at some fundamental level when discussing technologies like this.
I’ve been playing around with several OpenFlow controllers: OpenDaylight, ONOS, HP VAN, and Floodlight, but the real beauty comes out north of this, at the application layer. For now, I’m still in the weeds and having a blast with Mininet and OpenFlow. OpenFlow isn’t the only protocol out there, and many solutions are starting to trend toward BGP as the protocol for software-defined networks. That said, don’t put all your study time in one basket. Exciting times are ahead, so stay sharp and keep relevant!