|
Microsoft SQL Server has enjoyed phenomenal success as a database server. Its relatively low cost, steadily increasing functionality,1 and ease of deployment have combined to accelerate its growth. However, that same growth has led to a phenomenon that IT administrators commonly call SQL Server “sprawl”: the rampant, often uncontrolled proliferation of SQL Server databases, whether under the purview of centralized IT administration or tucked away under desks and in closets where departmental users have placed them independently of their IT departments. This sprawl is costly because it can result in inefficient use of hardware, software, and administrative resources. Hardware and maintenance costs are magnified by this kind of inefficiency, and poor utilization can also rapidly increase energy consumption. Not least, server sprawl can rob SQL Server application users of resource availability and business productivity.
HP’s PolyServe Software for Microsoft SQL Server is a clustered file system configuration that aims to combat SQL Server sprawl, consolidate SQL Server environments, and improve resource availability and response time for application users. According to users we recently interviewed, it accomplishes these goals and has advantages over better-known server virtualization approaches. In this paper, we discuss HP PolyServe’s inner workings, some of the relevant user experiences gleaned from our discussions, and how HP PolyServe compares to alternative approaches.
Clustering the File System
Speaking generically, HP PolyServe is a clustered file system (CFS). A CFS enables sharing of volumes, file systems, and even individual files among applications on multiple servers as though they were running on a single system. A typical single-host file system coordinates access to data among multiple applications running concurrently, and maintains “metadata” to keep track of where directories and files are physically located; their ownership and access rights; when they were created, modified, and accessed; and other housekeeping information. Most file systems store their metadata on the same physical disks that contain the files, using a formally defined “on-disk file structure.” A cluster file system extends that same functionality across multiple servers, while keeping application programming interfaces (APIs) largely unchanged.
The best performers allow each participating server direct I/O access to files on shared disks. To accomplish this, a server’s CFS manipulates on-disk file structures the same way a single-host file system does, but coordinates its activities with other cluster nodes, typically through some form of “lock manager.” In the case of HP PolyServe, a Distributed Lock Manager (DLM) allows concurrent read/write access to the underlying data, and can even provide locks at the byte level. HP’s PolyServe team worked directly with Microsoft, licensing its Installable File System (IFS) Kit to develop a Windows-compatible file system.
A CFS carries some overhead relative to the non-clustered case of independent servers with independent file systems,2 because two nodes have to be prevented from working on the same chunk of data simultaneously (using the aforementioned locks). However, giving each node direct access to a common set of data can be a big performance win relative to approaches that interpose a “heavyweight” intermediate abstraction layer.
Early CFSs mostly saw duty as failover mechanisms for Unix servers. If one server failed, another could quickly take over because it had direct access to all of the failed server’s storage. CFSs also gained considerable popularity in high performance computing (HPC), where a large number of nodes often access a large quantity of common data. (Furthermore, HPC data tends to be “read-mostly,” that is, read more often than it is written, which makes coordinating access easier.)
Although it, too, made inroads into smaller-scale HPC clusters, PolyServe actually started out in 1999 with a focus on commercial applications. Its initial product, UnderStudy, was a classic HA clustering product for Linux, providing application and hardware monitoring and failover for two servers. Over time, PolyServe went on to establish itself in a diverse set of applications including SQL Server, BizTalk, SAP, Windows Media Server, Citrix, SharePoint, Dynamics, Tibco, and others. In February of 2007, HP acquired PolyServe.
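To make the lock-manager idea from this section concrete, the following is a minimal sketch of byte-range locking: two nodes may work on the same file at once as long as their byte ranges do not overlap, or as long as both are only reading. It is a toy, single-process model written for illustration, not HP PolyServe's DLM; the class names, method names, node names, and file name are all hypothetical, and a real distributed lock manager must also handle messaging between nodes, node failure, and lock recovery.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RangeLock:
    node: str        # hypothetical cluster node name
    start: int       # first byte covered (inclusive)
    end: int         # end of the range (exclusive)
    exclusive: bool  # True for a write lock, False for a shared read lock


@dataclass
class ByteRangeLockManager:
    """Toy, single-process stand-in for a distributed lock manager."""

    held: dict = field(default_factory=dict)  # file path -> list of RangeLock

    def try_lock(self, node, path, start, end, exclusive):
        """Grant the lock and return True, or return False on a conflict."""
        if start >= end:
            raise ValueError("empty or inverted byte range")
        for lock in self.held.get(path, []):
            overlaps = start < lock.end and lock.start < end
            # Two shared (read) locks never conflict with each other.
            conflicting = exclusive or lock.exclusive
            if overlaps and conflicting and lock.node != node:
                return False
        self.held.setdefault(path, []).append(
            RangeLock(node, start, end, exclusive)
        )
        return True

    def unlock(self, node, path, start, end):
        """Release every lock this node holds on exactly this byte range."""
        self.held[path] = [
            lock
            for lock in self.held.get(path, [])
            if not (lock.node == node and lock.start == start and lock.end == end)
        ]
```

In this sketch, a write lock held by one node on bytes 0 to 4096 of a file does not block another node from write-locking bytes 4096 to 8192 of the same file, but it does block that node from reading bytes 1024 to 2048 until the first lock is released. Finer lock granularity is what lets more work proceed concurrently on shared data.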
HP PolyServe Basics
Among virtualized computing solutions, HP PolyServe sits between the operating system and the storage subsystem. It is neither server virtualization (in the sense of virtual machines) nor storage virtualization (in the sense of abstracted storage devices). HP PolyServe essentially creates a pool of server and storage hardware resources that are dedicated to running a collection of SQL Server databases. All SQL Server files are stored in a single repository, depicted here as sharable SAN storage. This architecture can simplify management of SQL Server by pooling database instances and allowing administrators to move them among the pooled servers. Storage is likewise all part of a common pool.
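The pooling idea can be illustrated with a small model. Because every server in the pool can reach the same files, moving a database instance (or recovering it after a server failure) amounts to changing which server hosts it; no data has to be copied. This is a hedged sketch of that bookkeeping only, not HP PolyServe's management interface: the class, its methods, and the server and instance names are hypothetical.

```python
class SharedStoragePool:
    """Toy model: servers host instances; database files stay on shared storage."""

    def __init__(self, servers):
        self.servers = set(servers)  # servers currently in the pool
        self.placement = {}          # instance name -> hosting server

    def place(self, instance, server):
        """Host an instance on a server that belongs to the pool."""
        if server not in self.servers:
            raise KeyError(f"unknown server: {server}")
        self.placement[instance] = server

    def move(self, instance, target):
        """Rehost an existing instance; only the placement record changes."""
        if instance not in self.placement:
            raise KeyError(f"unknown instance: {instance}")
        self.place(instance, target)

    def fail(self, server, standby):
        """Drop a failed server and rehost its instances on a standby."""
        self.servers.discard(server)
        moved = sorted(i for i, s in self.placement.items() if s == server)
        for instance in moved:
            self.place(instance, standby)
        return moved
```

For example, with servers `srv1`, `srv2`, and `srv3` in the pool, calling `fail("srv1", "srv3")` reassigns every instance that was on `srv1` to `srv3` and returns their names.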
|