Replication Methodologies and Applications
Author: Steve Duplessie
Title: Senior Analyst
Abstract: Replication has become a catchall phrase that, while gaining in allure, is also gaining in confusion, especially in the mid-tier, where data is just as important as at the high end but IT staffing and budgets are far more limited. IT people and vendors alike tend to lump all data movement functions together as replication, regardless of the method or the reason. The goal of this paper is to explain the differences among current replication methods, as well as the application, or reason, for implementing a given replication schema.

Replication Applications

The most common use of replication technologies in the storage sector has been, and will continue to be, Disaster Recovery (DR). Traditional Content Distribution applications peaked in the late 1990s during the Internet bubble, but are now experiencing a comeback in the global enterprise, predominantly for Intranet applications. Replication is also widely used for local (non-remote) purposes such as application testing. Remote office backup and remote office IT consolidation (or elimination) are this year's hot applications for replication.

Disaster Recovery (DR)

DR requirements, whether government-imposed or simply good business sense, have dictated that IT operations replicate primary data to a secondary remote site. In the event that the primary data or data center becomes inaccessible, the data at the second site can be brought up with a mirror application instance, and users are redirected to the new site.

Why replicate? Simply, because of time to recovery. Having FedEx ship a billion backup tapes to a DR site and then starting the recovery process is no longer a practical method for most enterprises. Having data that is no more than a day old and effectively instantly accessible is now the standard at shops of all sizes.
Regulated shops may even have "synchronous" copies of the data, which are "lock-step" accurate but come with serious latency and application impact, not to mention costs in the stratosphere.

To date, most DR/replication has been done by the IT elite: the big guys with the big staff and the big bucks. Historically, this has been because the cost and complexity of these solutions were so great that the mid-market IT operation simply couldn't play. Today, that isn't the case.

Enterprise Strategy Group, 20 Asylum Street, Milford, MA 01757, 508-482-0188, www.enterprisestrategygroup.com

DR, like many IT terms, has a wide range of meanings. When users consider replication technologies for disaster recovery purposes, they first need to identify their specific objectives. Not all methods are the same, nor are the costs. Users need to take simple steps and understand their short- and long-term goals.

DR Questions To Be Asked:

1. What is my ultimate objective?
a. If the answer is "doomsday insurance," where as a last resort we need to protect our information assets against catastrophic failure of our primary site, then we will most likely have different recovery time objectives (RTOs, i.e., how long it takes to get back up after all hell breaks loose) and recovery point objectives (RPOs, i.e., the specific point in time to which we can recover) than if our ultimate objective is to "follow the sun" and make sure our data is up to date and usable locally at our remote offices as each work day begins. Having to guarantee transactional integrity for stock market orders to comply with government regulations brings a different set of criteria than most sites face when they simply want to avoid the days often required to recover from tape.
2. Can I be out of sync, and if so, by how much?
a. When the mandate is "thou shalt have absolute guaranteed lock-step integrity on both sites," the requirement is pretty clear.
You will run synchronously (where the local site does not acknowledge a write until it is committed on the remote site, just as with local mirroring). If that is the case, you will need to make sure you have all the infrastructure in place to support peak demand, including big fat pipes sized for the worst case. This infrastructure is expensive and will probably sit largely idle most of the time. While easier to architect because the rules are cut-and-dried, this is obviously the most ex...
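
The acknowledgement rule described above can be illustrated with a small sketch. This is a toy model, not any vendor's implementation: the class name ReplicatedVolume, the 1 ms local commit time, the 40 ms link round trip, and the ten-write burst are all assumptions chosen for illustration. It contrasts a synchronous volume, where every write waits for the remote commit, with an asynchronous one, where writes are acknowledged at once and shipped later.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class ReplicatedVolume:
    """Toy model of one primary volume with a single remote replica."""
    synchronous: bool
    local_commit_ms: float = 1.0   # assumed local disk commit time
    link_rtt_ms: float = 40.0      # assumed round trip to the remote site
    local: list = field(default_factory=list)
    remote: list = field(default_factory=list)
    pending: deque = field(default_factory=deque)  # writes not yet at the remote site

    def write(self, block: str) -> float:
        """Apply one write and return the latency the application sees (ms)."""
        self.local.append(block)
        if self.synchronous:
            # Acknowledge only after the remote site has committed the write.
            self.remote.append(block)
            return self.local_commit_ms + self.link_rtt_ms
        # Asynchronous: acknowledge at once and queue the write for shipping.
        self.pending.append(block)
        return self.local_commit_ms

    def drain(self, max_blocks: int) -> None:
        """Ship up to max_blocks queued writes to the remote site."""
        for _ in range(min(max_blocks, len(self.pending))):
            self.remote.append(self.pending.popleft())

    def blocks_at_risk(self) -> int:
        """Writes that would be lost if the primary site failed right now."""
        return len(self.local) - len(self.remote)


def run(synchronous: bool) -> tuple[float, int]:
    """Issue a burst of ten writes, then let the link ship at most six."""
    vol = ReplicatedVolume(synchronous=synchronous)
    latencies = [vol.write(f"block-{i}") for i in range(10)]
    vol.drain(max_blocks=6)  # the link keeps up with only part of the burst
    return sum(latencies) / len(latencies), vol.blocks_at_risk()


if __name__ == "__main__":
    for mode in (True, False):
        avg_ms, at_risk = run(mode)
        label = "synchronous" if mode else "asynchronous"
        print(f"{label}: avg write latency {avg_ms:.1f} ms, blocks at risk {at_risk}")
```

Under these assumed numbers, the synchronous volume pays the link round trip on every write and never has data at risk, while the asynchronous volume answers at local speed but leaves four writes unprotected after the burst. That gap is what the "how much can I be out of sync" question is asking about.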