|
Modern commodity servers are able to saturate 10 Gb/s Ethernet networks. However, VM guests incur significant additional overheads due to context-switching, data movement and passing packets between the host domain and guests. The Xen virtual machine architecture imposes relatively low overheads on networking compared with other virtualization technologies, yet its performance is well below that of native operating systems. The privileged host VM quickly becomes a bottleneck as all network traffic must pass through it. This I/O bottleneck makes virtualization effectively impractical for certain classes of application, including many HPC applications and high performance servers. As 10 Gb/s Ethernet moves more into the mainstream, the set of applications for which this is a barrier to adoption of virtualization will likely increase.

This paper shows how allowing guest operating systems direct access to the I/O hardware eliminates the bottleneck and software overheads usually associated with virtualized I/O, and allows a virtualized guest to achieve performance comparable with a system running natively. Doing so requires additional support from the hardware in order to multiplex the device between multiple guests that may access it concurrently and also to enforce isolation so that guests cannot gain privileges or compromise system integrity.

The rest of this paper is structured as follows. Section 2 outlines the architecture for accelerated networking that we have added to Xen, while Section 3 expands to describe our implementation and the results we have obtained with a virtualization-aware network adapter. Section 4 shows how the same techniques can be applied to user-level applications. Finally, Section 5 concludes.
Architecture

Xen Paravirtualized Network I/O

Paravirtualized network I/O in Xen is achieved through a pair of interlinked drivers: netfront, the "frontend driver" in the guest, and netback, the "backend driver" in the host domain. The frontend and backend communicate through a region of shared memory and send each other virtual interrupts using event channels. Together these form a channel that supports the transfer of packets between the host domain and the guest.

The upper edge of the frontend driver presents the interface of a standard network device driver, allowing it to interface to the bottom of the guest's network stack. The backend likewise appears as a standard network device and is usually configured to connect to a software bridge in the host OS. This allows it to communicate with the host's network stack, other virtual machines' backend drivers, and physical network interfaces, and so with the network beyond.

Packets that arrive from a physical network interface are routed by the bridge to the appropriate backend drivers, which in turn forward them to the corresponding guests' frontend drivers. These then pass them on to the network stack as if they had arrived directly at the guests. Packets sent by guests follow the same path in reverse. Packets sent from one guest to another are routed between the corresponding backend drivers by the bridge.
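The packet flow described above can be illustrated with a small model. This is only a sketch under stated assumptions, not Xen code: the class names (`Frontend`, `Backend`, `Bridge`, `PhysicalNic`), the string addresses, and the static forwarding table are all hypothetical, and the shared-memory ring and event channels are reduced to plain method calls. A real software bridge learns addresses dynamically rather than using a fixed table.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    src: str            # hypothetical string address of the sender
    dst: str            # hypothetical string address of the receiver
    payload: bytes = b""


@dataclass
class Frontend:
    """Stands in for netfront: hands received packets to the guest stack."""
    guest_stack: list = field(default_factory=list)

    def receive(self, pkt):
        self.guest_stack.append(pkt)


class PhysicalNic:
    """Stands in for a physical interface; 'wire' records what was sent out."""
    def __init__(self):
        self.wire = []

    def deliver(self, pkt):
        self.wire.append(pkt)


class Bridge:
    """Stands in for the host's software bridge (static table, an assumption)."""
    def __init__(self, uplink):
        self.ports = {}
        self.uplink = uplink

    def attach(self, addr, port):
        self.ports[addr] = port

    def forward(self, pkt):
        # Known guest address: hand to that guest's backend.
        # Anything else: send out through the physical interface.
        self.ports.get(pkt.dst, self.uplink).deliver(pkt)


class Backend:
    """Stands in for netback: one per guest, attached to the bridge."""
    def __init__(self, addr, frontend, bridge):
        self.frontend = frontend
        self.bridge = bridge
        bridge.attach(addr, self)

    def deliver(self, pkt):
        # Receive direction: bridge -> backend -> frontend -> guest stack.
        self.frontend.receive(pkt)

    def transmit(self, pkt):
        # Transmit direction: guest -> frontend -> backend -> bridge.
        self.bridge.forward(pkt)
```

In this model a guest-to-guest packet never reaches `PhysicalNic`, while every packet, in either direction, passes through `Bridge.forward` in the host domain, which is the bottleneck the text describes.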
Acceleration Architecture

We have extended the Xen netfront/netback architecture with a plugin interface that provides an opportunity to accelerate network performance. We have preserved the existing netfront/netback channel and added an optional "fast path" implemented by the plugins. To accommodate as many different kinds of hardware as possible (existing and future designs), the plugin interface makes as few assumptions as possible. Preserving the existing netfront/netback mechanism makes it possible to support migration, and also to support hardware that accelerates only a subset of traffic, with other traffic taking the "slow path".

Various acceleration techniques are possible, but the main model anticipated is for the plugin driver in the guest to access the network adapter hardware directly to send and receive packets, bypassing the host domain and associated overheads. This is illustrated in Figure 1.

The netfront and netback drivers each accept an "accelerator plugin." The frontend accelerator communicates with netfront and implements the data path, whereas the backend accelerator communicates with netback and handles control functions. See Section 3 for details of our implementation of accelerators for the Solarstorm SFC4000 controller.

Accelerated Transmit Path. Packets to be transmitted are passed from the guest kernel to the netfront driver. If an accelerator plugin is present, it is given the opportunity to send the packet via the fast path, and it indicates whether or not it did so.
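The transmit decision can be sketched as follows. This is a minimal illustration, not the actual plugin interface: the names `Netfront`, `SubsetAccelerator`, `try_send`, and `accelerated_dsts` are hypothetical, packets are plain dictionaries, and the fallback to the slow path when the plugin declines a packet is an assumption drawn from the statement that non-accelerated traffic takes the slow path.

```python
class SubsetAccelerator:
    """Hypothetical frontend plugin that accelerates only some destinations."""

    def __init__(self, accelerated_dsts):
        self.accelerated_dsts = set(accelerated_dsts)
        self.sent = []  # packets "written to the hardware" in this sketch

    def try_send(self, packet):
        # Return True only if the packet was sent via the fast path.
        if packet["dst"] not in self.accelerated_dsts:
            return False
        self.sent.append(packet)
        return True


class Netfront:
    """Stands in for the frontend driver's transmit entry point."""

    def __init__(self, slow_path_send):
        self._slow_path_send = slow_path_send  # existing netfront/netback channel
        self.accelerator = None                # optional plugin, may be absent

    def transmit(self, packet):
        plugin = self.accelerator
        # Offer the packet to the plugin first; it reports whether it sent it.
        if plugin is not None and plugin.try_send(packet):
            return "fast"
        # No plugin, or the plugin declined (assumed fallback to the slow path).
        self._slow_path_send(packet)
        return "slow"
```

A short usage example, with made-up addresses:

```python
slow_queue = []
nf = Netfront(slow_queue.append)
nf.accelerator = SubsetAccelerator({"10.0.0.2"})
nf.transmit({"dst": "10.0.0.2"})  # handled by the plugin
nf.transmit({"dst": "10.0.0.9"})  # falls back to the netfront/netback channel
```

Keeping the plugin optional and letting it decline individual packets is what allows the slow path to remain available for migration and for hardware that accelerates only part of the traffic.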
|