Supporting Virtualization Standard for Network Devices in RTEMS Real-Time Operating System

Jin-Hyun Kim
Department of Smart ICT Convergence
Konkuk University
Seoul 143-701, Korea
[email protected]

Sang-Hun Lee
Department of Computer Science and Engineering
Konkuk University
Seoul 143-701, Korea
[email protected]

Hyun-Wook Jin
Department of Computer Science and Engineering
Konkuk University
Seoul 143-701, Korea
[email protected]

ABSTRACT
Virtualization technology is attractive for modern embedded systems in that it not only provides ideal resource partitioning but also offers transparent software development environments. Although the hardware emulation overheads of virtualization have been reduced significantly, network I/O performance in virtual machines is still not satisfactory. Minimizing virtualization overheads is critical, especially in real-time embedded systems, because the overheads can change the timing behavior of real-time applications. To resolve this issue, we design and implement an RTEMS device driver for the standardized virtual network device called virtio. Our virtio device driver is portable across different Virtual Machine Monitors (VMMs) because our implementation is compliant with the standard. The measurement results clearly show that our virtio can achieve performance comparable to the virtio implemented in Linux while reducing memory consumption for network buffers.

Categories and Subject Descriptors
D.4.7 [Operating Systems]: Organization and Design—real-time systems and embedded systems

General Terms
Design, Performance

Keywords
Network virtualization, RTEMS, Real-time operating system, virtio, Virtualization

1. INTRODUCTION
Virtualization technology provides multiple virtual machines on a single device, each of which can run its own operating system and applications over emulated hardware in an isolated manner [19]. Virtualization has been applied to large-scale server systems to securely consolidate different

EWiLi’15, October 8th, 2015, Amsterdam, The Netherlands.Copyright retained by the authors.

services with high system utilization and low power consumption. As modern complex embedded systems also face size, weight, and power (SWaP) issues, researchers are trying to utilize virtualization technology for temporal and spatial partitioning [5, 21, 12]. In partitioned systems, a partition provides an isolated run-time environment with respect to processor and memory resources; thus, virtual machines can be exploited to efficiently implement partitions. Moreover, virtualization can provide a transparent and efficient development environment for embedded software [10]. For example, if the number of target hardware platforms is smaller than the number of software developers, developers can work with virtual machines that emulate the target hardware system.

A drawback of virtualization, however, is the overhead of hardware emulation, which increases software execution time. Although the emulation performance of instruction sets has been significantly improved, network I/O performance in virtual machines is still far from ideal [13]. Minimizing virtualization overheads is critical, especially in real-time embedded systems, because the overheads can increase worst-case execution time and jitter, thus changing the timing behavior of real-time applications. A few approaches to improve the performance of network I/O virtualization in the context of embedded systems have been suggested, but these are either proprietary or hardware-dependent [7, 6].

To improve network I/O performance, a paravirtualized abstraction layer is usually exposed to the device driver running in the virtual machine. The device driver then explicitly uses this abstraction layer instead of accessing the original I/O space. This sacrifices transparency, because the software becomes aware that it runs in a virtual machine rather than on a real machine, but it can improve network I/O performance by avoiding hardware emulation. It is desirable to use a standardized abstraction layer to guarantee portability and reliability; otherwise, we would have to modify or newly implement the device driver for different Virtual Machine Monitors (VMMs) and manage different versions of the device driver.

In this paper, we aim to design and implement the virtio driver for RTEMS [2], a Real-Time Operating System (RTOS) used in spacecraft and satellites. virtio [17] is the standardized abstraction layer for paravirtualized I/O devices and is supported by several well-known VMMs, such as KVM [8] and VirtualBox [1]. To the best of our knowledge, this is the first study that presents detailed design issues of the virtio front-end driver for an RTOS. Thus, our study can provide insight into design choices of virtio for RTOS. The measurement results clearly show that our virtio can achieve performance comparable to the virtio implemented in Linux. We also demonstrate that our implementation can reduce memory consumption without sacrificing network bandwidth.

The rest of the paper is organized as follows: In Section 2, we give an overview of virtualization and virtio, and discuss related work. In Section 3, we describe our design and implementation of virtio for RTEMS. The performance evaluation is presented in Section 4. Finally, we conclude the paper in Section 5.

2. BACKGROUND
In this section, we give an overview of virtualization and describe virtio, the virtualization standard for I/O devices. In addition, we discuss the state of the art for network I/O virtualization.

2.1 Overview of Virtualization and virtio
The software that creates and runs virtual machines is called a VMM or hypervisor. Virtualization technology is generally classified into full-virtualization and paravirtualization. Full-virtualization allows a legacy operating system to run in a virtual machine without any modifications. To do this, VMMs for full-virtualization usually perform binary translation and emulate every detail of the physical hardware platform. KVM and VirtualBox are examples of full-virtualization VMMs. On the other hand, VMMs for paravirtualization provide guest operating systems with programming interfaces that are similar to the interfaces provided by hardware platforms but much simpler and lighter. Thus, paravirtualization requires modifications of guest operating systems and can deliver better performance than full-virtualization. Xen [3] and XtratuM [12] are examples of paravirtualization VMMs.

virtio is the standard for virtual I/O devices. It was initially suggested by IBM [17] and recently became an OASIS standard [18]. The virtio standard defines paravirtualized interfaces between front-end and back-end drivers as shown in Fig. 1. The paravirtualized interfaces include two virtqueues that store send and receive descriptors. Because the virtqueues are located in memory shared between the front-end and back-end drivers, the guest operating system and the VMM can communicate with each other directly without hardware emulation. Many VMMs, such as KVM, VirtualBox, and XtratuM, support virtio or a modified version of it. General-purpose operating systems, such as Linux and Windows, implement the virtio front-end driver.
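For reference, the ring layout that the virtio specification defines for these shared virtqueues can be sketched in C roughly as follows. The structure and field names follow the specification's vring layout; the sketch is illustrative rather than a complete driver header.

```c
#include <assert.h>
#include <stdint.h>

/* Descriptor table entry: describes one guest-physical buffer. */
struct vring_desc {
    uint64_t addr;   /* guest-physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* e.g. NEXT (chained), WRITE (device-writable) */
    uint16_t next;   /* index of the chained descriptor, if NEXT is set */
};

/* Ring of descriptor indices the driver offers to the device. */
struct vring_avail {
    uint16_t flags;
    uint16_t idx;    /* free-running count of offered entries */
    uint16_t ring[]; /* queue-size entries */
};

/* Entry the device writes back once it has consumed a buffer. */
struct vring_used_elem {
    uint32_t id;     /* head index of the consumed descriptor chain */
    uint32_t len;    /* number of bytes written into the buffer */
};

struct vring_used {
    uint16_t flags;
    uint16_t idx;    /* free-running count of consumed entries */
    struct vring_used_elem ring[];
};
```

Both drivers agree on this layout, which is why a plain memory reference on either side is enough to exchange descriptors.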

2.2 Related Work
There has been significant research on network I/O virtualization. Most existing investigations, however, focus on performance optimization for general-purpose operating systems [20, 15, 22, 16, 4]. In particular, the approaches that require support from network devices are not

Figure 1: virtio.

suitable for embedded systems, because their network controllers are not equipped with sufficient hardware resources to implement multiple virtual network devices. Although there is architectural research on efficient network I/O virtualization in the context of embedded systems [6], it also depends heavily on assistance from the network controller. The software-based approach for embedded systems has been studied only in a very limited scope that does not consider standardized interfaces for network I/O virtualization [7].

Compared to existing research, virtio is differentiated in that it does not require hardware support and can be more portable [17, 18]. Previous studies of virtio have mainly dealt with the back-end driver [11, 14]. However, there are several additional issues for the front-end driver on an RTOS due to the inherent structural characteristics of RTOSs and the resource constraints of embedded systems. In this paper, we focus on the design and implementation issues of the virtio front-end driver for RTOS.

3. VIRTIO FOR RTEMS
In this section, we present the design of the virtio front-end driver for RTEMS. Our design can efficiently handle hardware events generated by the back-end driver and reduce memory consumption for network buffers. We have implemented the suggested design on an experimental system that runs RTEMS (version 4.10.2) over the KVM hypervisor as described in Section 4.1, but it is general enough to apply to other system setups.

3.1 Initialization
The virtio network device is implemented as a PCI device. Thus, the front-end driver obtains the information of the virtual network device through the PCI configuration space. Once the registers of the virtio device are found in the configuration space, the driver can access the I/O memory of the virtio device by using the Base Address Register (BAR). The virtio header, which has the layout shown in Fig. 2, is located

Figure 2: virtio header in I/O memory.

in that I/O memory region and is used for initialization.

Our front-end driver initializes the virtio device through the virtio header as specified in the standard. For example, the driver determines the size of the virtqueues by reading the value in the Queue Size region. Then the driver allocates the virtqueues in the guest memory area and lets the back-end driver know the base addresses of the virtqueues by writing them to the Queue Address region. Thus, both front-end and back-end drivers can directly access the virtqueues by means of memory referencing without expensive hardware emulation.
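As an illustration of this handshake, the amount of guest memory the driver must allocate for one virtqueue follows directly from the value read from the Queue Size region. A hedged sketch of the legacy layout's size computation (descriptor table plus available ring, padded to the ring alignment, followed by the used ring) might look like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Total bytes needed for one virtqueue of `num` entries in the legacy
 * layout: descriptor table and available ring (flags, idx, ring entries,
 * used_event), padded to `align`, then the used ring (flags, idx,
 * elements, avail_event). */
static size_t vring_size(unsigned num, unsigned long align)
{
    size_t desc_avail = 16u * num                      /* descriptor table */
                      + sizeof(uint16_t) * (3 + num);  /* available ring */
    return ALIGN_UP(desc_avail, align)
         + sizeof(uint16_t) * 3                        /* used ring header */
         + 8u * num;                                   /* used ring elements */
}
```

In the legacy PCI interface, the address written to the Queue Address region is the guest-physical page frame number of this contiguous area (i.e., the base address shifted right by 12), which is how the back-end driver learns where the rings live.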

The front-end driver also initializes the function pointers of the general network driver layer of RTEMS with the actual network I/O functions implemented by the front-end driver. For example, the if_start pointer is initialized with the function that transmits a message through the virtio device. This function adds a send descriptor to the TX virtqueue and notifies the back-end driver. If the TX virtqueue is full, this function temporarily queues the descriptor in the interface queue described in Section 3.2.

3.2 Event Handling
The interrupt handler is responsible for hardware events. However, since the interrupt handler is expected to finish quickly, relinquishing CPU resources as soon as possible, the actual processing of hardware events usually takes place later. In general-purpose operating systems, such delayed event handling is performed by the bottom half, which executes in interrupt context with a lower priority than the interrupt handler. With regard to network I/O, demultiplexing of incoming messages and handling of acknowledgment packets are examples of work the bottom half performs. However, an RTOS usually does not implement a framework for the bottom half; thus, we have to use a high-priority thread as a bottom half. The interrupt handler sends a signal to this thread to request the actual event handling, where there is a tradeoff between the signaling overhead and the size of the interrupt handler. If the bottom-half thread handles every hardware event, aiming for a small interrupt handler, the signaling overhead can increase in proportion to the number of interrupts. For example, it takes more than 70 µs per interrupt in RTEMS for signaling and scheduling between a thread and an interrupt handler on our experimental system

Figure 3: Hardware event handling.

described in Section 4.1. On the other hand, if the interrupt handler takes care of most events to reduce the signaling overhead, the system throughput can be degraded, because interrupt handlers usually disable interrupts during their execution.
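The pattern described above, a dedicated high-priority thread that sleeps until the interrupt handler signals it, can be modeled with POSIX primitives as below. The actual driver would use RTEMS directives rather than POSIX semaphores; this is only a portable sketch of the structure, and the names are ours.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <stdatomic.h>

static sem_t      bh_wakeup;          /* posted by the interrupt handler */
static atomic_int events_handled;     /* deferred work completed so far */
static atomic_int shutting_down;      /* set only to stop the sketch */

/* High-priority thread standing in for a bottom-half framework: it
 * sleeps on a semaphore and performs the deferred processing when woken. */
static void *bottom_half(void *arg)
{
    (void)arg;
    for (;;) {
        sem_wait(&bh_wakeup);
        if (atomic_load(&shutting_down))
            break;
        atomic_fetch_add(&events_handled, 1); /* stand-in for protocol work */
    }
    return NULL;
}

/* What the interrupt handler would do after queueing descriptors:
 * a single post, regardless of how many descriptors were moved. */
static void fake_interrupt(void)
{
    sem_post(&bh_wakeup);
}
```

Each post costs one signaling-and-scheduling round trip, which is exactly the per-interrupt overhead (over 70 µs on the experimental system) that motivates batching work per signal.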

In our design, the interrupt handler is only responsible formoving the send/receive descriptors between interface queuesand virtqueues when the state of virtqueues changes. Fig. 3shows the sequence of event handling, where the interfacequeues are provided by RTEMS and used to pass networkmessages between the device driver and upper-layer proto-cols. When a hardware interrupt is triggered by the back-end driver, the interrupt handler first checks if the TX virtqu-eue has available slots for more requests, and moves thesend descriptor that is stored in the interface queue waitingfor the TX virtqueue to become available (steps 1 and 2 inFig. 3). Then the interrupt handler sees whether the RXvirtqueue has used descriptors for incoming messages, andmoves these to the interface queue (steps 3 and 4 in Fig. 3).Finally, the interrupt handler sends a signal to the bottomhalf thread (step 5 in Fig. 3) so that the actual processing forreceived messages can be processed later (step 6 in Fig. 3).It is noteworthy that the interrupt handler handles multipledescriptors at a time to reduce the number of signals. Inaddition, we suppress the interrupts with the aid from theback-end driver.
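The per-interrupt work just described can be summarized as a small state-transition sketch. The structure below is a simplified model of Fig. 3, not driver code: one invocation drains whatever both rings allow and then sends exactly one signal, however many descriptors were moved.

```c
#include <assert.h>

struct ev_state {
    int tx_ifq_pending; /* send descriptors waiting in the interface queue */
    int tx_vq_free;     /* free slots in the TX virtqueue */
    int rx_vq_used;     /* used descriptors in the RX virtqueue */
    int rx_ifq_ready;   /* received messages handed to the interface queue */
    int signals_sent;   /* wake-ups delivered to the bottom-half thread */
};

/* Model of one interrupt-handler invocation (steps 1-5 in Fig. 3). */
static void on_interrupt(struct ev_state *s)
{
    /* Steps 1-2: refill the TX virtqueue from the interface queue. */
    int moved = s->tx_ifq_pending < s->tx_vq_free ? s->tx_ifq_pending
                                                  : s->tx_vq_free;
    s->tx_ifq_pending -= moved;
    s->tx_vq_free     -= moved;

    /* Steps 3-4: hand all used RX descriptors to the interface queue. */
    s->rx_ifq_ready += s->rx_vq_used;
    s->rx_vq_used    = 0;

    /* Step 5: one signal per interrupt, regardless of batch size. */
    s->signals_sent++;
}
```

Batching all pending descriptors into a single signal is what keeps the per-interrupt signaling cost from scaling with traffic.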

3.3 Network Buffer Allocation
On the sender side, network messages are buffered in the kernel due to TCP congestion and flow control. As the TCP window moves, as many buffered messages are sent as the TCP window allows. Thus, a larger number of buffered messages can easily fill the window and achieve higher bandwidth. On the receiver side, received messages are also buffered in the kernel until the destination task becomes ready. A larger memory space for received messages can also enhance the bandwidth, because it increases the advertised window size in flow control. Although a larger TCP buffer size is beneficial for network bandwidth, the operating system limits the total size of messages buffered in the kernel to prevent them from exhausting memory resources. However, we have observed that the default TCP buffer size of 16 KByte in RTEMS is not sufficient to fully utilize the bandwidth provided by Gigabit Ethernet. Therefore, in Section 4.2,

Figure 4: Controlling the number of preallocated receive buffers.

we heuristically search for the optimal TCP buffer size that promises high bandwidth without excessively wasting memory resources.
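For context, a per-socket buffer enlargement like the one evaluated in Section 4.2 can be requested through the standard BSD sockets API, which the RTEMS network stack inherits. The helper below is only a sketch (the function name is ours, and the stack must also be configured to allow buffers of this size):

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Request `bytes` of send and receive buffering on a TCP socket and
 * return the send-buffer size the stack actually granted (0 on error). */
static int set_tcp_buffers(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
        return 0;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return 0;
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) < 0)
        return 0;
    return granted;
}
```

Reading the value back matters because the stack may clamp (or, on some systems, inflate) the requested size rather than reject it.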

Moreover, we control the number of preallocated receive buffers (i.e., mbufs). The virtio front-end driver is supposed to preallocate a number of receive buffers that matches the RX virtqueue, each of which occupies 2 KByte of memory. The descriptors of the preallocated buffers are enqueued at the initialization phase so that the back-end driver can directly place incoming messages in those buffers. This can improve the bandwidth by reducing the number of interrupts, but the reserved memory areas can waste memory resources. Therefore, it is desirable to size the RX virtqueue based on the throughput of the front-end and back-end drivers. If the front-end driver can process more messages than the back-end driver, we do not need a large number of preallocated receive buffers. However, we need a sufficient number of buffers if the front-end driver is slower than the back-end driver. The back-end driver of KVM requires 256 preallocated receive buffers, but we have discovered that 256 buffers are excessive for Gigabit Ethernet, as discussed in Section 4.2.

Fig. 4 shows how we control the number of preallocated receive buffers. The typical RX virtqueue has the used ring and available ring areas, which store descriptors for used and unused preallocated buffers, respectively. Our front-end driver introduces an empty ring area in the RX virtqueue in order to limit the number of preallocated buffers. At the initialization phase, we fill the descriptors with preallocated buffers until desc_head_idx reaches the threshold defined as sizeof(virtqueue) − sizeof(empty ring). Then, whenever the interrupt handler is invoked, it enqueues as many descriptors of new preallocated buffers as vring_used.idx − used_cons_idx (i.e., the size of the used ring). The descriptors in the used ring are retrieved by the interrupt handler as mentioned in Section 3.2. Thus, the size of the empty ring is constant. We present a detailed analysis of the tradeoff between the RX virtqueue size and network bandwidth in Section 4.2.
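Because indices such as vring_used.idx and used_cons_idx are free-running 16-bit counters, the refill count must be computed in modular arithmetic so that it stays correct across wraparound. A minimal sketch of the two quantities defined above (function names are ours):

```c
#include <assert.h>
#include <stdint.h>

/* Number of descriptors to re-enqueue on each interrupt: the size of
 * the used ring, i.e. vring_used.idx - used_cons_idx, taken modulo
 * 2^16 via the uint16_t cast so that wraparound is handled. */
static uint16_t refill_count(uint16_t vring_used_idx, uint16_t used_cons_idx)
{
    return (uint16_t)(vring_used_idx - used_cons_idx);
}

/* Initial fill threshold: leave `empty_ring` slots of the RX virtqueue
 * without preallocated buffers. */
static unsigned fill_threshold(unsigned vq_size, unsigned empty_ring)
{
    return vq_size - empty_ring;
}
```

With a 256-entry RX virtqueue and 8 preallocated buffers as in Section 4.2, the empty ring would cover the remaining 248 slots, keeping the number of in-flight receive buffers constant at 8.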

4. PERFORMANCE EVALUATION
In this section, we analyze the performance of the design suggested in the previous section. First, we measure the impact of network buffer size on network bandwidth. Then, we compare the bandwidth and latency of our implementation with those of Linux.

Figure 5: Experimental System Setup.

4.1 Experimental System Setup
We implemented the suggested design in RTEMS version 4.10.2. The mbuf space was configured in huge mode so that it is capable of preallocating 256 mbufs for the RX virtqueue. We measured the performance of the virtio on two nodes that were equipped with Intel i5 and i3 processors, respectively, as shown in Figure 5. Linux version 3.13.0 was installed on the former, and Linux version 3.16.6 on the other node. The two nodes were connected directly through Gigabit Ethernet.

We ran the ttcp benchmarking tool to measure the network bandwidth with 1448 Byte messages, which is the maximum user message size that can fit into one TCP segment over Ethernet. We separately measured the send bandwidth and receive bandwidth of virtio by running a virtual machine only on the i5 node with KVM. We reported the bandwidth on the i3 node, which ran Linux without virtualization. The intention behind such an experimental setup is to measure the performance with the real timer, because the virtual timer in virtual machines is not accurate [9].

4.2 Impact of Network Buffer Size
As mentioned in Section 3.3, we analyzed the impact of network buffer size on bandwidth. Fig. 6 shows the variation of send bandwidth with different sizes of the TCP buffer. We can observe that the bandwidth increases as the kernel buffer size increases. However, the bandwidth does not increase beyond a TCP buffer size of 68 KByte, because 68 KByte is sufficient to fill the network pipe of Gigabit Ethernet. Fig. 7 shows the experimental results for receive bandwidth, which also suggest 68 KByte as the minimum buffer size for the maximum receive bandwidth. Thus, we increased the TCP buffer size from 16 KByte to 68 KByte for our virtio.

We also measured the network bandwidth while varying the size of the RX virtqueue, as shown in Fig. 8. The figure shows that the network bandwidth increases only until the virtqueue size reaches 8. Thus, we do not need more than 8 preallocated buffers for Gigabit Ethernet even though the default virtqueue size is 256.

In summary, we increased the in-kernel send and receive TCP buffer sizes from 16 KByte to 68 KByte, which requires an additional 104 KByte (= (68 − 16) × 2) of memory for higher network bandwidth. However, we reduced the number of preallocated receive buffers from 256 to 8 without sacrificing network bandwidth, which saved 496 KByte (= 256 × 2 − 8 × 2) of memory, where the size of an mbuf is 2 KByte as mentioned in Section 3.3. Thus, we saved 392

Figure 6: Tradeoff between TCP buffer size and send bandwidth.

Figure 7: Tradeoff between TCP buffer size and receive bandwidth.

KByte (= 496 − 104) in total while achieving the maximum available bandwidth over Gigabit Ethernet.
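The memory accounting above can be reproduced directly, with all sizes in KByte and 2 KByte per mbuf as stated in Section 3.3:

```c
#include <assert.h>

/* Memory accounting from Section 4.2, all values in KByte. */
static int extra_tcp_buffer_kb(void)
{
    return (68 - 16) * 2;      /* send + receive buffers grown from 16 to 68 */
}

static int saved_mbuf_kb(void)
{
    return (256 - 8) * 2;      /* 2 KByte mbufs cut from 256 down to 8 */
}

static int net_saving_kb(void)
{
    return saved_mbuf_kb() - extra_tcp_buffer_kb();
}
```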

4.3 Comparison with Linux
We compared the performance of RTEMS-virtio with that of Linux-virtio to see whether our virtio can achieve performance comparable to a version optimized for a general-purpose operating system. As shown in Fig. 9, the unidirectional bandwidth of RTEMS-virtio is almost the same as that of Linux-virtio, which is near the maximum bandwidth the physical hardware can provide in one direction. Thus, these results show that our implementation provides quite reasonable performance with respect to bandwidth.

We also measured the round-trip latency by having two nodes send and receive the same-size message in a ping-pong manner repeatedly for a given number of iterations. We report the average, maximum, and minimum latencies for 10,000 iterations. Fig. 10 shows the measurement results for 1 Byte and 1448 Byte messages. As we can see in the figure, the average and minimum latencies of RTEMS-virtio are comparable to those of Linux-virtio. However, the maximum latency of RTEMS-virtio is significantly smaller than that of Linux-virtio (the y-axis is log scale), meaning RTEMS-virtio has lower jitter. We always observed the

Figure 8: Tradeoff between RX virtqueue size and receive bandwidth.

Figure 9: Comparison of bandwidth.

maximum latency in the first iteration of every measurement for both RTEMS and Linux. Thus, we presume that the lower maximum latency of RTEMS is due to its smaller working set.

5. CONCLUSIONS AND FUTURE WORK
In this paper, we have presented the design of the virtio front-end driver for RTEMS. The suggested device driver is portable across different Virtual Machine Monitors (VMMs) because our implementation is compliant with the virtio standard. The suggested design can efficiently handle hardware events generated by the back-end driver and can reduce memory consumption for network buffers, while achieving the maximum available network bandwidth over Gigabit Ethernet. The measurement results have shown that our implementation can save 392 KByte of memory and can achieve performance comparable to the virtio implemented in Linux. Our implementation also shows lower latency jitter than Linux thanks to the smaller working set of RTEMS. In conclusion, this study can provide insights into virtio from the viewpoint of an RTOS. Furthermore, the discussions in this paper can be extended to other RTOSs running in virtual machines to improve network performance and portability.

Figure 10: Comparison of latency.

As future work, we plan to measure the performance of our virtio on a different VMM, such as VirtualBox, to show that our implementation is portable across different VMMs. Then we will release the source code. In addition, we intend to extend our design for dynamic network buffer sizing and to measure the performance on real-time Ethernet, such as AVB.

6. ACKNOWLEDGMENTS
This research was partly supported by the National Space Lab Program (#2011-0020905) funded by the National Research Foundation (NRF), Korea, and the Education Program for Creative and Industrial Convergence (#N0000717) funded by the Ministry of Trade, Industry and Energy (MOTIE), Korea.

7. REFERENCES
[1] Oracle VM VirtualBox. http://www.virtualbox.org.

[2] RTEMS Real Time Operating System (RTOS). http://www.rtems.org.

[3] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the art of virtualization. ACM SIGOPS Operating Systems Review, 37(5):164–177, 2003.

[4] Y. Dong, X. Yang, J. Li, G. Liao, K. Tian, and H. Guan. High performance network virtualization with SR-IOV. Journal of Parallel and Distributed Computing, 72(11):1471–1480, 2012.

[5] S. Han and H.-W. Jin. Resource partitioning for integrated modular avionics: comparative study of implementation alternatives. Software: Practice and Experience, 44(12):1441–1466, 2014.

[6] C. Herber, A. Richter, T. Wild, and A. Herkersdorf. A network virtualization approach for performance isolation in controller area network (CAN). In 20th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2014.

[7] J.-S. Kim, S.-H. Lee, and H.-W. Jin. Fieldbus virtualization for integrated modular avionics. In 16th IEEE Conference on Emerging Technologies & Factory Automation (ETFA), 2011.

[8] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori. kvm: the Linux virtual machine monitor. In Linux Symposium, pages 225–230, 2007.

[9] S.-H. Lee, J.-S. Seok, and H.-W. Jin. Barriers to real-time network I/O virtualization: Observations on a legacy hypervisor. In International Symposium on Embedded Technology (ISET), 2014.

[10] P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner. Simics: A full system simulation platform. IEEE Computer, 35(2):50–58, 2002.

[11] M. Masmano, S. Peiro, J. Sanchez, J. Simo, and A. Crespo. I/O virtualisation in a partitioned system. In 6th Embedded Real Time Software and Systems Congress, 2012.

[12] M. Masmano, I. Ripoll, A. Crespo, and J. Metge. XtratuM: a hypervisor for safety critical embedded systems. In 11th Real-Time Linux Workshop, pages 263–272, 2009.

[13] A. Menon, A. L. Cox, and W. Zwaenepoel. Optimizing network virtualization in Xen. In USENIX Annual Technical Conference, pages 15–28, 2006.

[14] G. Motika and S. Weiss. Virtio network paravirtualization driver: Implementation and performance of a de-facto standard. Computer Standards & Interfaces, 34(1):36–47, 2012.

[15] H. Raj and K. Schwan. High performance and scalable I/O virtualization via self-virtualized devices. In ACM Symposium on High-Performance Parallel and Distributed Computing, June 2007.

[16] K. K. Ram, J. R. Santos, Y. Turner, A. L. Cox, and S. Rixner. Achieving 10 Gbps using safe and transparent network interface virtualization. In ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), March 2009.

[17] R. Russell. virtio: towards a de-facto standard for virtual I/O devices. ACM SIGOPS Operating Systems Review, 42(5):95–103, 2008.

[18] R. Russell, M. S. Tsirkin, C. Huck, and P. Moll. Virtual I/O Device (VIRTIO) Version 1.0. OASIS, 2015.

[19] L. H. Seawright and R. A. MacKinnon. VM/370: a study of multiplicity and usefulness. IBM Systems Journal, 18(1):4–17, 1979.

[20] J. Sugerman, G. Venkitachalam, and B.-H. Lim. Virtualizing I/O devices on VMware Workstation's hosted virtual machine monitor. In USENIX Annual Technical Conference, June 2001.

[21] S. H. VanderLeest. ARINC 653 hypervisor. In 29th IEEE/AIAA Digital Avionics Systems Conference (DASC), 2010.

[22] L. Xia, J. Lange, and P. Dinda. Towards virtual passthrough I/O on commodity devices. In Workshop on I/O Virtualization (WIOV), December 2008.