First field demonstration of cloud datacenter workflow automation employing dynamic optical transport network resources under OpenStack and OpenFlow orchestration


Abstract

For the first time, we demonstrate the orchestration of elastic datacenter and inter-datacenter transport network resources using a combination of OpenStack and OpenFlow. Programmatic control allows a datacenter operator to dynamically request optical lightpaths from a transport network operator to accommodate rapid changes of inter-datacenter workflows.

© 2014 Optical Society of America

1. Introduction

Optical transport networks are currently statically configured and provide customers with dedicated, fixed-bandwidth connections for extended durations (years). Datacenter (DC) operators are well-versed in exploiting dynamic compute and storage resources through virtualization, and they anticipate the same flexibility from their inter-datacenter network.

Network operators, too, see an opportunity to harness automation and apply virtualization techniques to provide customers with dedicated, fixed-bandwidth connections over shorter timescales. This is contingent upon a dynamic, re-configurable optical transport network that permits dormant, statically configured bandwidth to be released (through appropriate incentives) so as to redistribute bandwidth demand, flatten peak network load and increase network utilization. Clearly, a mechanism to orchestrate cloud resources within, and optical transport resources between, the datacenters is required [1–3].

OpenStack (http://www.openstack.org/) is the leading open source datacenter orchestration platform used within datacenters operated by cloud service providers and large enterprises alike. Development is supported by a broad base of developers and a growing cohort of commercial software and hardware vendors. OpenStack comprises an expanding collection of independent service modules that includes the management and dynamic orchestration of virtualized compute (‘nova’), storage (‘cinder’ & ‘swift’) and networking (‘neutron’) resources hosted on the hardware servers within datacenters. OpenStack neutron was conceived to automate virtualized Layer 2 and Layer 3 connectivity services between virtual machines (VMs) and physical servers within a datacenter. An SDN (Software-Defined Networking) controller can be run as a network backend for OpenStack using a neutron plugin. OpenFlow is an open SDN protocol receiving much attention, even though extensions are necessary for the optical domain [4,5]. Indeed, recent publications have reported experiments using OpenStack with OpenFlow-enabled Layer 2 switches in which the WAN (Wide Area Network) between datacenter locations consisted of fixed optical connections [6], or have reported the orchestration of datacenter and optical transport resources based on proprietary software and confined to a laboratory environment [7].

In an ECOC post-deadline paper [8], we reported and demonstrated, for the first time to our knowledge, the orchestration of elastic datacenter and inter-datacenter optical transport network resources using vendor-independent and exclusively open source OpenStack and OpenFlow frameworks over field-installed fiber. The programmatic control that was outlined could allow a datacenter operator to dynamically request optical lightpaths from a transport network operator to support transient, bandwidth-intensive, inter-datacenter workflows such as the movement of virtual machine clusters or large storage workloads.

2. Motivation and benefit

End-to-end workflow automation offers many benefits. These include reducing the lead-time for tenants and datacenter operators to access innovative federated services across a geographically distributed pool of compute, storage and network resources, and removing the barriers-to-entry imposed by statically configured optical transport, so that compute, storage and optical transport network usage more closely matches tenant and datacenter revenue streams. A datacenter operator could, in practice, accommodate variable bandwidth demands within workflows by closely tracking the aggregated requirements of its tenants using bandwidth flexing, or it could be provided with access to a fixed quantity of bandwidth that can be mobilized, based on workload, between the network operator's demarcation points, which we term bandwidth steering. It then remains for the network operator to decide on the ratio of proactive service configuration, which relies on forward-planning of resource consumption, to reactive service configuration, which is triggered when a threshold particular to its optical transport network policy is crossed, and on the degree of flexing offered to accommodate transient increases in traffic flow. We envisage that a DC operator would use an SDN controller to apply and establish the packet forwarding state of elements under its control within its network domain, while the network operator would administer an optical network controller (ONC), acting effectively as an “optical hypervisor”, to establish lightpaths between datacenters under its control and across its network domain. The ONC advertises several isolated subsets (or 'virtual slices') of the network operator's optical transport resource that it makes available to DC operators, and delegates an appropriate level of control of each slice to a DC operator northbound through an OpenFlow Agent.

3. Use cases

Typical workflows inside a datacenter include: a) storage migration; b) virtual machine migration; c) active-active storage replication; and d) distributed applications [9]. Storage migration involves either the back-up or the transfer of data for future use, for example to reduce access time. Virtual machines with live workloads are typically migrated when their need for additional computational resources can be met elsewhere. Active-active storage replication ensures the synchronization and coherence of replicated data across different locations. Finally, distributed applications require communication between virtual machines hosted on physically separate, dispersed hardware servers.

The exemplar workflows listed above highlight how critical it is to have reliable and elastic connectivity between datacenters deployed across two or more distinct geographical locations. This connectivity is most often provided by the transport network operator. To maximize the utilization of the network, the datacenter operator might offer virtual slices to its tenants, who can order additional bandwidth (“bandwidth flexing”) or redirect the communication channel to another datacenter (“bandwidth steering”). These two use cases are represented in Fig. 1.

Fig. 1 Bandwidth flexing and steering.

In the first use case (“bandwidth flexing”) the datacenter operator contracts a network ‘slice’ from the transport network operator, which is then virtualized by the datacenter operator to form a geographically extended datacenter. This, in turn, can be allocated amongst its tenants. Datacenter tenants can easily request connectivity between the locations of the distributed datacenter. A basic connectivity product might be extended by servicing requests for additional bandwidth of limited duration (e.g. hours, days). By offering this bandwidth flexing, the datacenter operator can increase its utilization of the leased inter-datacenter transport network and increase revenue. This flexibility can also be used to switch or steer the extended bandwidth to a different datacenter. Here a tenant might order a bandwidth product whose termination points can be changed on demand, allowing a different set of servers to access higher bandwidth and so exchange data more quickly. This can be a necessity for certain storage or virtual machine migration applications. The measurements we now report were based on the scenarios outlined above.

4. Demonstrator hardware and software components

Figure 2 outlines the main components of the demonstrator. Three discrete datacenters (Datacenter A, Datacenter B & Datacenter C) were emulated with HP DL380p G8 hardware servers. The ‘Datacenter A’ server was nominated as the primary server and also hosted the OpenStack cloud management application and the Floodlight OpenFlow controller. All three servers ran Open vSwitch (version 1.4.0) and the nova compute component, which provided the ability to instantiate multiple virtual machines.

Fig. 2 Demonstrator components.

The inter-datacenter network consisted of three ADVA FSP3000 colorless ROADM (Reconfigurable Optical Add-Drop Multiplexer) nodes interconnected to form a ring using field-installed, ducted optical fiber within BT's network. A 10GbE client-side port of each ROADM was connected to a GbE Network Interface Card (NIC) in the datacenter server via an ADVA XG-210 demarcation/aggregation device. Two of the ROADMs were directionless (Node A and Node C) and one was directional (Node B). The routes shown in Table 1 were chosen by the control plane for this node configuration.

Table 1. Optical Paths between the Three Datacenters Chosen by the Control Plane

The Ubuntu Server version was changed from 13.04, which was used and reported previously [8], to 12.04 LTS (3.5.0-41-generic kernel) because of its broader compatibility and its more stable and extensive driver support. The OpenStack version was updated to 2013.1.3 (‘Grizzly’) to include bug fixes released through the Ubuntu Cloud Archive. QEMU/KVM was used on each server as the virtualization environment, together with Open vSwitch v1.4.0, enabling the creation and interconnection of the VMs created in OpenStack. We further optimized the software interworking and timing performance by recompiling the ONC to solve some issues with the 32-bit software running on a 64-bit Linux system. A DHCP relay was installed in Datacenter B and Datacenter C to provide IP addresses via the out-of-band management network during times when no optical path was established across the data network. The out-of-band management network was used for management of the OpenStack cloud.

5. Logical components

Figure 3 presents four layers containing the main logical components used in the demonstrator. The Application Layer, which included the OpenStack Horizon dashboard, used a RESTful (Representational State Transfer) API (Application Programming Interface) to invoke services from the OpenStack Orchestration Layer. This layer provided direct access to storage and compute pools on the HP hardware servers via OpenStack cinder and nova respectively. The Control Layer comprised a Floodlight controller with additional modules that provided the network backend to the OpenStack neutron plug-in via the Virtual Network Filter through a REST API. The additional modules incorporated into the Floodlight controller accessed the ADVA Optical Network Controller, acting as an OpenFlow virtual switch agent, which in turn communicated with each FSP3000 node via SNMP (Simple Network Management Protocol). The Resource Layer comprised four main resource pools: compute, storage, networking (emulated by virtual NICs and vSwitches) and optical transport. Automated device discovery and inventory by the Floodlight controller was performed via the Link Layer Discovery Protocol (LLDP) for the Open vSwitches within the servers and the attached VM instances contained within each HP hardware server. The forwarding-plane connectivity of the FSP3000 nodes was advertised via GMPLS (Generalized Multi-Protocol Label Switching) and polled (using SNMP) by the ADVA Optical Network Controller. Because the FSP3000 optical nodes do not support in-band LLDP-based topology discovery, a Link Inserter application provided, via a RESTful API or a configuration file, the static connectivity mapping (patch cabling) between the FSP3000 optical nodes (which ordinarily appear as a virtual Layer 2 switch to the Floodlight controller) and the Layer 2 network elements. Using the Link Inserter, the connectivity between the ONC and the Open vSwitches in the datacenters was committed to Floodlight, as illustrated below. The compute and storage inventory pools were accessible via the standard OpenStack Horizon dashboard.
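As an illustration only, a static link of this kind might be committed to the controller with a single REST call along the following lines. Because the Link Inserter's interface is not detailed here, the endpoint path, JSON field names, switch DPIDs and port numbers below are assumptions, not the module's actual schema.

```python
# Hypothetical sketch of registering the static patch-cable connectivity
# between an Open vSwitch and the virtual switch exposed by the ONC.
# The endpoint path, field names, DPIDs and port numbers are assumed
# values for illustration; they are not the testbed's actual interface.
import requests

FLOODLIGHT = "http://127.0.0.1:8080"                      # Floodlight REST API (default port)
LINK_INSERTER_URL = FLOODLIGHT + "/wm/linkinserter/json"  # assumed module path

static_link = {
    "src-switch": "00:00:00:00:00:00:00:01",  # Open vSwitch in Datacenter A
    "src-port": 3,
    "dst-switch": "00:00:00:00:00:00:00:0a",  # virtual switch presented by the ONC
    "dst-port": 1,
}

resp = requests.post(LINK_INSERTER_URL, json=static_link)
resp.raise_for_status()
```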

Fig. 3 Logical components.

6. Test procedure

The pusherAgent application was a simple graphical user interface (GUI) that enabled on-demand lightpath creation through a web interface. It sent requests for a flow (lightpath) to the Floodlight OpenFlow controller, which translated each request into a flow modification (flow mod) that was in turn forwarded to the optical network controller. The ONC then triggered the setup of the requested lightpath.

The ONC presented the optical ring topology, consisting of three ROADMs, as one (virtual) switch to the Floodlight OpenFlow controller. By creating port-based flows on this virtual switch, a lightpath setup was triggered on the optical hardware. This enabled the OpenFlow controller, and in turn OpenStack, to create lightpaths on demand; the detailed steps involved in setting up a lightpath were handled by the ONC.
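A minimal sketch of such a port-based flow request is shown below. It assumes the JSON schema of Floodlight's legacy staticflowentrypusher module (referred to again in Section 7.3); the controller address, the DPID of the virtual optical switch and the port numbers are illustrative placeholders rather than testbed values.

```python
# Minimal sketch of how the pusherAgent could request a port-based flow
# (lightpath) on the virtual optical switch via Floodlight's
# staticflowentrypusher REST API. The controller address, the DPID of the
# virtual switch and the port numbers are illustrative placeholders.
import requests

FLOODLIGHT = "http://127.0.0.1:8080"                       # Floodlight REST endpoint
PUSH_URL = FLOODLIGHT + "/wm/staticflowentrypusher/json"   # legacy static flow pusher path
VIRTUAL_SWITCH = "00:00:00:00:00:00:00:0a"                 # ONC presented as one virtual switch

def push_lightpath(name, in_port, out_port):
    """Install one unidirectional port-based flow on the virtual optical switch."""
    flow = {
        "switch": VIRTUAL_SWITCH,
        "name": name,
        "priority": "32768",
        "ingress-port": str(in_port),        # legacy field name for the match port
        "active": "true",
        "actions": "output=%d" % out_port,
    }
    requests.post(PUSH_URL, json=flow).raise_for_status()

# A lightpath corresponds to flows in both directions (cf. Section 7.3).
push_lightpath("dcA-to-dcB", in_port=1, out_port=2)
push_lightpath("dcB-to-dcA", in_port=2, out_port=1)
```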

Automated measurements were recorded using a combination of small shell scripts and Python programs. The bandwidth flexing measurements recorded the time elapsed during the setup of a lightpath. Before each measurement commenced, all pre-existing connectivity was torn down. The measurement then began with the creation of a flow using the pusherAgent. A traffic source sent 100 UDP packets per second; as soon as the first packet arrived at the receiver, the measurement concluded and the elapsed time was noted. The reception of the first UDP packet corresponded to the instantiation of the lightpath and indicated that connectivity had been established at the transport layer. This test was repeated one hundred times to obtain the average setup time.
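The timing logic of one flexing run can be sketched as follows. The sketch reuses the push_lightpath() helper from the earlier example; the probe port, the flow names and the tear_down_all_flows() helper are illustrative placeholders.

```python
# Sketch of the flexing measurement: tear down connectivity, request a
# lightpath, and time until the first UDP probe packet arrives. Reuses
# push_lightpath() from the earlier sketch; tear_down_all_flows() and the
# probe port are illustrative placeholders.
import socket
import time

PROBE_PORT = 5001                       # UDP port used by the 100 packet/s probe

def wait_for_first_packet(timeout=120.0):
    """Block until the first UDP probe packet arrives over the new lightpath."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("", PROBE_PORT))
        sock.settimeout(timeout)
        sock.recvfrom(2048)             # first packet == transport-layer connectivity
        return time.time()
    finally:
        sock.close()

setup_times = []
for run in range(100):                  # repeated one hundred times for the average
    tear_down_all_flows()               # placeholder: delete any pre-existing flows
    t0 = time.time()
    push_lightpath("dcA-to-dcB", in_port=1, out_port=2)
    push_lightpath("dcB-to-dcA", in_port=2, out_port=1)
    setup_times.append(wait_for_first_packet() - t0)

print("mean setup time: %.1f s" % (sum(setup_times) / len(setup_times)))
```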

To measure bandwidth steering, which corresponds to switching over an existing lightpath, we chose one lightpath endpoint and two remote endpoints. Keeping the roles from the previous measurement, two datacenters were nominated as traffic sources and one as the receiver; the receiver triggered the redirection of the lightpath. Each measurement comprised three components: the deletion of any pre-existing connection; the creation of a lightpath between the source and the other destination datacenter; and finally waiting for the first UDP packet to be received, confirming the flow of traffic along the datapath. This was followed by a measurement of the bandwidth steering in the opposite direction. The senders had the same functionality and packet creation rate as above, and once again each measurement was repeated one hundred times.
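One steering toggle can then be sketched as below, again reusing the helpers above; the delete-by-name request follows the legacy staticflowentrypusher convention, and the flow names and port numbers are illustrative.

```python
# Sketch of one bandwidth-steering toggle: delete the existing flows,
# create the lightpath to the other destination datacenter, then wait for
# the first UDP packet. Reuses requests, PUSH_URL, push_lightpath() and
# wait_for_first_packet() from the earlier sketches; names and ports are
# illustrative placeholders.
import time

def delete_flow(name):
    """Remove a named static flow (legacy delete-by-name convention)."""
    requests.delete(PUSH_URL, json={"name": name}).raise_for_status()

def steer(old_flow_names, new_prefix, in_port, out_port):
    t0 = time.time()
    for name in old_flow_names:          # 1) delete the pre-existing connection
        delete_flow(name)
    # 2) create the lightpath to the other destination datacenter
    push_lightpath(new_prefix + "-fwd", in_port=in_port, out_port=out_port)
    push_lightpath(new_prefix + "-rev", in_port=out_port, out_port=in_port)
    wait_for_first_packet()              # 3) first UDP packet confirms the datapath
    return time.time() - t0

# Toggle the lightpath from one remote datacenter to the other and back.
t_to_b = steer(["to-dcA-fwd", "to-dcA-rev"], "to-dcB", in_port=3, out_port=2)
t_to_a = steer(["to-dcB-fwd", "to-dcB-rev"], "to-dcA", in_port=3, out_port=1)
```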

7. Measurements

The measurements we report are divided into three sections: 1) bandwidth flexing; 2) bandwidth steering; and 3) storage migration. First, the isolated results for the two use cases, bandwidth flexing and bandwidth steering, are presented. A large data transfer over a requested lightpath is then evaluated before the results are assessed.

7.1 Bandwidth flexing

The first set of measurements considered the time elapsed for bandwidth flexing. Due to the current limitations of OpenStack and the SDN controller, flexing was modeled as simple lightpath activation. Measurements commenced with a request issued by the user for a new connection between two datacenters, e.g. Datacenter A ↔ Datacenter B. The optical connection was established following this request, and timing stopped on reception of the first packet at the receiver. The setup time depended on the number of hops on the route chosen for the lightpath and included all the intermediate steps following the request. The distribution of recorded values is shown on the left-hand side of Fig. 4. The optical path assigned between Datacenter A ↔ Datacenter B was direct and its setup was measured as 15.8 s on average. The optical paths assigned between Datacenter A ↔ Datacenter C and Datacenter B ↔ Datacenter C included an additional optical hop, increasing the mean setup times to 24.3 s and 25.2 s respectively. The additional hop thus imposed a penalty of between eight and ten seconds. The differences between the two-hop cases can be traced back to the different fiber and hardware properties on the two routes.

Fig. 4 Frequency distribution of the measurement results for bandwidth flexing (left) and bandwidth steering (right).

7.2 Bandwidth steering

The second set of measurements considered the steering of connections in order to release and re-use resources and so improve network utilization. In this case the connectivity between one source and two destination DCs was toggled back and forth. Each measurement commenced with a request to tear down the existing connection, immediately followed by a request to establish a lightpath to the other datacenter, and terminated with the reception of the first incoming packet from the remote datacenter.

Figure 4 (right-hand side) illustrates that the results can be grouped into three regions, each containing two peaked distributions. Region 1 was centered at ~44 s and showed the fastest toggle times, which we attribute to the one-hop optical path being set up more rapidly than a two-hop optical path. Region 2 was centered at ~48 s; the crucial difference here was that a one-hop optical path had to be torn down and a new two-hop optical path created, leading to slightly slower setup and an additional ~4 s delay. Region 3 corresponded to a two-hop teardown and a two-hop setup; it was consistently the slowest, with a group mean of ~52 s.

7.3 Storage migration

Data transfer (storage migration) between two datacenters was then investigated. The steps identified for simulating this transfer are depicted by the flowchart on the left-hand side of Fig. 5. With the source VM running in Datacenter A, the workflow started with the creation of a second VM in Datacenter B, followed by a request for a lightpath. 100 GiB of data were then transferred using the shell tool nc. The lightpath was released upon conclusion of the data transfer and the source VM was terminated. A Wireshark representation of the OpenFlow control messages and related communications, captured with tcpdump as the measurements progressed, is shown on the right-hand side of Fig. 5 and is explained next.
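A compact sketch of this automated sequence is given below, reusing push_lightpath() and delete_flow() from the sketches in Section 6; the nova CLI arguments, hostnames, file path and nc invocation are illustrative placeholders rather than the exact testbed commands.

```python
# Sketch of the storage-migration workflow of Fig. 5, reusing
# push_lightpath() and delete_flow() from the sketches in Section 6.
# Image and flavor names, hostnames, the file path and the nc invocation
# are illustrative placeholders only.
import subprocess

def run(cmd):
    print("+ " + cmd)
    subprocess.check_call(cmd, shell=True)

# 1) create the destination VM in Datacenter B (Grizzly-era nova CLI)
run("nova boot --flavor m1.medium --image ubuntu-server vm-dcB")

# 2) request the lightpath between Datacenter A and Datacenter B
push_lightpath("dcA-to-dcB", in_port=1, out_port=2)
push_lightpath("dcB-to-dcA", in_port=2, out_port=1)

# 3) transfer the data set with nc; the receiver is started beforehand on
#    vm-dcB, e.g. "nc -l 5002 > volume.img" (listen syntax varies by nc variant)
run("nc vm-dcB 5002 < /data/volume.img")

# 4) release the lightpath and terminate the source VM
delete_flow("dcA-to-dcB")
delete_flow("dcB-to-dcA")
run("nova delete vm-dcA")
```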

Fig. 5 Data transfer flowchart and trace.

The first box depicts the setup of the lightpath. The pusherAgent is called via an HTTP (Hypertext Transfer Protocol) request, whose timestamp is set as the reference point for the following packets. It triggers Floodlight's staticflowentrypusher to send the flow mod to the optical virtual switch, which is confirmed in the fourth line. This process is repeated because a lightpath corresponds to flows in both directions (line seven). The GUI updates its information in the next lines until the initial request is confirmed by the last HTTP message.

The second box shows the start of the data transmission. It begins with the TCP [SYN] request to initialize the transfer. This leads to a packet-in at the first Open vSwitch, which is answered by the controller with a packet-out; additionally, a flow mod is sent to the Open vSwitch on the second node. The packet reaches the second node in the next lines and triggers another packet-in and packet-out. The TCP connection is then confirmed by the other side with a [SYN,ACK], and with the following [ACK] message the transmission starts. The end of the transmission is visualized in the third box. The transmission in this exemplary recording took 921 seconds, and the TCP connection is torn down at the end by a [FIN,ACK] exchange.

After this, the pusherAgent is used once again to delete the flows and thereby tear down the lightpath. The process is shown in the fourth box. There are two calls to the pusherAgent, one for each direction. Each sends a delete request to Floodlight's staticflowentrypusher, which results in a flow mod deleting the flow. The first HTTP request is finally confirmed in the last line.

The measurements revealed that ~16 min 30 s elapsed to complete the data migration. As expected, a high proportion of the elapsed time (~92%, or 15 min 15 s) was spent exclusively on the data transfer itself, limited by the GbE NICs used in the experiments.
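A simple back-of-the-envelope check, using the figures reported above, confirms that the transfer was NIC-limited:

\[ \frac{100\,\mathrm{GiB} \times 8\,\mathrm{bit/byte}}{915\,\mathrm{s}} \approx \frac{8.6 \times 10^{11}\,\mathrm{bit}}{915\,\mathrm{s}} \approx 0.94\,\mathrm{Gbit/s}, \]

which is close to the GbE line rate once protocol overhead is taken into account.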

7.4 Assessment

The simulated data migration shows that, for transferring large amounts of data, it can be very attractive to acquire additional bandwidth for a limited time. The improvement will be more pronounced if higher-performance NICs with greater data throughput are used. In the case of OpenStack, a solution for incorporating a flexible network for VM migration is essential; only then can fast and flexible lightpath setup deliver the benefits of dynamic use of the available bandwidth capacity. With high-bandwidth on-demand services, new revenue streams become possible for datacenter owners, with concomitant benefits. With the possibility of steering connections, a reduced amount of equipment can be used more efficiently and the utilization of underused fiber resources can be increased.

8. Conclusions

We have shown the first fully open-source, SDN-based orchestration of datacenter and optical network resources over field-installed optical fiber. The use of OpenStack and OpenFlow to control commercially available equipment paves the way towards a new era of dynamic application and network automation, with provisioning times reduced to tens of seconds, a major improvement over typical provisioning times of days to months.

Acknowledgments

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 317999 in the project IDEALIST and under grant agreement no. 608528 in the project STRAUSS. We greatly appreciate useful discussions with Spyros Christou from Openreach, and with Eli Karpilovski and Marina Varshaver from Mellanox.

References and links

1. J. Elbers and A. Autenrieth, “Extending network virtualization into the optical domain,” in Optical Fiber Communication Conference/National Fiber Optic Engineers Conference 2013, OSA Technical Digest (online) (Optical Society of America, 2013), paper OM3E.3. [CrossRef]  

2. S. Gringeri, N. Bitar, and T. J. Xia, “Extending software defined network principles to include optical transport,” IEEE Commun. Mag. 51(3), 32–40 (2013). [CrossRef]  

3. D. Mcdysan, “Software defined networking opportunities for transport,” IEEE Commun. Mag. 51(3), 28–31 (2013). [CrossRef]  

4. N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “OpenFlow: enabling innovation in campus networks,” SIGCOMM Comput. Commun. Rev. 38(2), 69–74 (2008). [CrossRef]  

5. Open Networking Foundation, “Software-defined networking: the new norm for networks,” ONF White Paper (2012), https://www.opennetworking.org/images/stories/downloads/sdn-resources/white-papers/wp-sdn-newnorm.pdf.

6. S. Mizuno, H. Sakai, D. Yokozeki, K. Iida, and T. Koyama, “IaaS platform using OpenStack and OpenFlow overlay technology,” NTT Tech. Rev. 10(12), 1–7 (2012).

7. J. Zhang, Y. Zhao, H. Yang, Y. Ji, H. Li, Y. Lin, G. Li, J. Han, Y. Lee, and T. Ma, “First demonstration of enhanced software defined networking (eSDN) over elastic grid (eGrid) optical networks for data center service migration,” in Optical Fiber Communication Conference/National Fiber Optic Engineers Conference 2013, OSA Technical Digest (online) (Optical Society of America, 2013), paper PDP5B.1. [CrossRef]  

8. T. Szyrkowiec, A. Autenrieth, P. Gunning, P. Wright, A. Lord, J. Elbers, and A. Lumb, “First field demonstration of cloud datacenter workflow automation employing dynamic optical transport network resources under OpenStack & OpenFlow orchestration,” presented at the 39th European Conference and Exhibition on Optical Communication (ECOC 2013), 22–26 Sept. 2013. [CrossRef]  

9. M. Auster, N. Damouny, and J. Harcourt, “OpenFlow™-enabled hybrid cloud services connect enterprise and service provider data centers,” ONF Solution Brief (2012), https://www.opennetworking.org/images/stories/downloads/sdn-resources/solution-briefs/sb-hybrid-cloud-services.pdf.
