TIME WAS RUNNING out for Oregon’s Department of Consumer and Business Services. The old data storage system wasn’t fast enough to perform full backups on all the data the department was creating, even when DCBS scheduled the task to happen over weekends.
“It was taking us 24 hours to perform one full set of backups, but we needed two sets so we could send one offsite,” says Ellen Murphy, backup administrator for DCBS, which includes Workers’ Compensation, Insurance, Oregon Occupational Safety and Health Administration (OR-OSHA), Building Codes and other state offices and divisions.
From 1999 to 2003, the department had seen its data volumes rise from 200 gigabytes to 600GB, and Murphy expected that trend would only accelerate. In fact, data volumes have jumped to 3 terabytes (TB) over the past three years. “We had to resolve the issue of not having enough time to do full backups,” she recalls.
The Oregon DCBS’s answer to the data explosion and the time needed to back it up involved the wholesale replacement of its storage hardware and software with a new system that added an extra layer of hard-disk arrays to the infrastructure. Known as disk-to-disk-to-tape (D2D2T) storage architecture, this alternative now sends data from 80 DCBS departmental servers to the mid-layer disks at a rate of about 25 megabytes per second (MBps). The procedure takes a couple of hours to complete.
Similarly, each week, the department performs a full backup of all 3 TB of data, which takes about five hours. It’s more than enough time to finish before everyone begins arriving on Monday morning, and it allows Murphy to stop spending her weekends worrying about backups.
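As a rough sanity check on those figures (an illustration, not something from the department's own tooling), the throughput implied by a 3TB full backup in a five-hour window can be worked out in a few lines:

```python
# Back-of-the-envelope check on the figures above: a 3 TB weekly full
# backup finishing in about five hours, with staging transfers running at
# roughly 25 MB/s per stream. Decimal units, as storage vendors count them.

full_backup_mb = 3 * 1_000_000   # 3 TB expressed in megabytes
window_hours = 5                 # observed length of the weekly full backup

# Aggregate throughput needed to finish inside the window
required_mbps = full_backup_mb / (window_hours * 3600)
print(f"aggregate throughput: {required_mbps:.0f} MB/s")

# At ~25 MB/s per staging stream, that is the work of several streams at once
print(f"equivalent 25 MB/s streams: {required_mbps / 25:.1f}")
```

The arithmetic works out to roughly 167 MB/s in aggregate, or about seven of the 25 MBps staging streams running in parallel.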
D2D2T storage architectures merge the performance advantages of hard drives with the low-cost backup and archival capabilities of tape. They do this by using the additional hard-drive storage layer as a data staging area that frees the production systems from the time-consuming task of communicating directly with the tape resources for data downloads.
The architecture offers both speed and manageability advantages over traditional two-tier approaches. Launching a D2D2T system does present technical and managerial challenges. But with the right technology choices and overall backup strategy, public sector CIOs can see quick benefits from D2D2T.
D2D2T’s first advantage stems from the fact that hard drives run faster than tape drives: production servers can offload new data to the second hard-drive layer far more quickly than they could write it directly to slower tape.
Once the information is on the intermediate layer, the middle disk array can send data to the backup tape units whenever it’s convenient for IT people. They don’t have to watch the clock to make sure everything happens before employees begin to file in the next morning.
A second D2D2T advantage is faster data recovery times. If someone accidentally overwrites a recent file, the file may still reside on the second-layer hard drive. If so, the file may be retrieved in seconds rather than the hour or more it might take someone from IT to locate the file on a tape cartridge and call it up.
“This gives us hot-recovery capabilities instead of having to load a tape and find the file,” says Ruth Schall, MIS director for the city of Kenosha, Wis., which launched D2D2T two years ago. “In the past, if someone asked for a file recovery, I’d have to track down the tape and then locate the individual file before I could do the restore.” That could take from half an hour up to three hours.
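The recovery logic Schall describes can be sketched in a few lines. This is a hypothetical illustration, assuming a `/staging` mount point and a `restore()` helper that are not part of any product mentioned here:

```python
from pathlib import Path

# Hedged sketch of the hot-recovery decision described above: look for the
# requested file on the mid-tier staging array first, and fall back to a
# (much slower) tape restore only if the file is no longer staged. The
# paths and function names here are hypothetical, not any vendor's API.

STAGING_ROOT = Path("/staging")  # assumed mount point of the mid-tier array

def restore(relative_path, staging_root=STAGING_ROOT):
    staged = Path(staging_root) / relative_path
    if staged.exists():
        # Hot recovery: the file is still on disk and comes back in seconds
        return f"restored from staging disk: {staged}"
    # Cold path: an operator must locate the cartridge and restore from tape
    return f"tape restore required for: {relative_path}"
```

How long files remain on the staging array before being aged off determines how often the fast path is taken, which is a policy decision covered later in the article.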
An ancillary benefit is easier management of backups. Hardware glitches, improper software settings and human error all can contribute to stalled overnight backups that may not be discovered until the next morning.
“If a hiccup occurred [over a weekend], the backup process just stopped, and we didn’t find out about it until Monday morning,” Murphy recalls. “I’d end up spending the whole week trying to catch up because my production servers were in use,” and it was difficult to steal time to complete the backups, she says.
Now, because transfers from the staging disk array to the tape target can occur during regular hours, IT staff is on hand to quickly discover and fix any problems.
Finally, this approach not only brings faster hard-drive performance to backup operations, it also makes good use of an inherent tape advantage, namely its portability. Organizations can save full backups to tape cartridges that can be protected in onsite fireproof safes or sent offsite.
The building blocks of D2D2T storage consist of the middle-tier hardware and data-management software that creates a seamless data flow from production servers to tape drives.
A number of storage vendors, such as Exabyte, now sell preconfigured storage arrays designed with D2D2T in mind. The arrays consist of Small Computer System Interface (SCSI) or serial advanced technology attachment (serial ATA or SATA) hard drives that can offer total capacities of hundreds of gigabytes or several terabytes of data, depending on the size and number of drives. The arrays typically connect to production servers and tape servers via Internet SCSI (iSCSI), Fibre Channel networking or parallel SCSI direct connect.
“The key is to architect the system to serve as a staging area for the tape backup and a holding area that is available for hot recoveries,” says Kerry Brock, Exabyte’s vice president of marketing. “With this setup, the production disk is backed up to its backup as quickly as possible.”
The “glue” to this architecture is the backup software used to keep data flowing smoothly. If an organization plans to use its existing backup software, the program must be able to accept a disk-based middle tier without forcing the organization to significantly change its backup procedures.
“If it’s not seamless, then you have to institute new backup procedures, which means retraining your people,” says Greg Schulz, senior analyst with the storage analysis firm Evaluator Group, which is headquartered in Greenwood Village, Colo., and author of Resilient Storage Networks: Designing Flexible Scalable Data Infrastructures.
To gauge the applicability of middle-tier disk arrays, Schulz advises agencies to ask some fundamental performance questions before proceeding. Is it compatible with my current backup software? What tape drives does it support? How many can it emulate? What tape libraries and which versions of them does it support? What additional modules do I need, and are these free or ones that I’ll have to pay for?
The backup software should also automate the data movement process as much as possible, says Phil Johnson, network administrator for Washington state’s Department of Transportation (DOT), South Central Region. Each day at 8 p.m., his backup software automatically asks the production servers to download any changes made to files throughout the day to the middle disk array.
The remaining component in the D2D2T architecture is the tape drive, where carousels now automate the old process of manually loading fresh tape cartridges as each one fills with data.
Kenosha moved to D2D2T to ease demands on its production servers. Each night, the front-line servers copy user files to the intermediate hard-drive server rather than having tape drives back up the production servers directly. During the day, the intermediate storage server sends the data to a 10-tape carousel, Exabyte’s VXA-2 PacketLoader. Each VXA-2 can hold 1.6TB of compressed data and can transfer data at rates of up to 43.2GB per hour.
Before the city installed the new VXA-2 units, the IT staff had to manually change tapes when they became full of data. The new system changed that. “If it needs to change a tape, it can do that itself and not wait for someone to attend to it,” says Schall.
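The PacketLoader figures quoted above support some quick arithmetic on how long daytime tape transfers take (the 100GB nightly delta is an assumed example, not a figure from Kenosha):

```python
# Rough arithmetic on the VXA-2 PacketLoader specs quoted above:
# 1.6 TB of compressed capacity across the 10-tape carousel, written at
# up to 43.2 GB per hour.

CAROUSEL_CAPACITY_GB = 1600     # 1.6 TB across ten tapes
WRITE_RATE_GB_PER_HR = 43.2     # maximum sustained transfer rate

# Hours of continuous writing to fill the whole carousel
print(f"time to fill carousel: {CAROUSEL_CAPACITY_GB / WRITE_RATE_GB_PER_HR:.0f} hours")

# Time to offload a hypothetical 100 GB nightly delta from the staging array
print(f"100 GB delta offload: {100 / WRITE_RATE_GB_PER_HR:.1f} hours")
```

A hypothetical 100GB delta takes a little over two hours to reach tape, which is why a transfer that starts at 8:30 a.m. comfortably finishes while staff are still on hand.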
Kenosha’s D2D2T scheme carries out a succession of nightly, monthly and bimonthly backups. Data moves from the first array to the second hard-drive tier as a batch job managed by the Linux tool Rsync.
“The batch runs around 2 a.m., and everything that’s created the previous day is sent to the second disk array,” Schall explains. The secondary array then begins transferring the updated data to the tape carousel at 8:30 a.m. “It’s backing up while we’re here in case there are any problems,” she says.
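A minimal Python sketch of what that nightly batch accomplishes: copy to the staging tree only the files whose size or modification time has changed, which is rsync's default "quick check." This illustrates the idea; in practice the real rsync, typically fired by cron at 2 a.m., is the better tool, since it also does delta transfers and optional checksums:

```python
import shutil
from pathlib import Path

# Hedged sketch of the nightly batch job described above: walk the
# production tree and copy into the staging tree only the files whose size
# or modification time differs, mimicking rsync's default quick check.
# This is an illustration of the idea, not the city's actual script.

def sync_changed(src_root, dst_root):
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for src in src_root.rglob("*"):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        s = src.stat()
        if dst.exists():
            d = dst.stat()
            if d.st_size == s.st_size and int(d.st_mtime) == int(s.st_mtime):
                continue  # unchanged since the last run; skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(str(src.relative_to(src_root)))
    return copied
```

Because `copy2` preserves timestamps, a second run over an unchanged tree copies nothing, which is what keeps the nightly batch short.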
One question IT departments face when considering D2D2T projects is whether to take a clean-slate approach, replacing existing resources with new hardware and software purchased with D2D2T in mind, or to build on the current setup and add minimal new hardware and software.
The answer basically comes down to what tactical and strategic goals the organization hopes to achieve with D2D2T. If the move is designed to address a specific “pain point,” such as faster recoveries, adding a mid-level disk array to the existing backup scheme may result in minimal cost and organizational disruption.
However, agencies often choose a more ambitious approach if D2D2T becomes a component in a larger data management strategy that also seeks to address issues like data retention and information lifecycle management policies.
Oregon’s DCBS chose the more ambitious track when it went with D2D2T in 2003. But there was a catch: Like state agencies everywhere, DCBS had a tight budget and had to perform its storage makeover on a shoestring.
Fortunately, the department could tap IT expertise from its in-house staff. It chose two preconfigured D2D2T arrays that consist of eight 250GB hard drives and data management software.
The total cost was about $60,000. DCBS saved about $20,000 by choosing D2D2T units that used iSCSI rather than Fibre Channel as the networking technology that sent data through the D2D2T infrastructure, Oregon’s Murphy adds.
Hardware was the easy part. Murphy says the disk and tape systems were running in a day. Thornier issues arose around implementing new backup procedures.
“Did we want to run synthetic backups [incremental backups merged with an existing full backup]?” she says. “Could we back up field office locations to a central site? What was the best disaster recovery plan we could put together? We ran through quite a few tests to answer these questions. It took about three months before we had everything figured out.”
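The "synthetic backup" idea Murphy mentions can be modeled simply: instead of re-reading every file from production, a new full backup is synthesized by merging the last full backup with the incrementals taken since it, keeping the newest copy of each file. Real products do this at the backup-catalog level; the dictionary model below is only an illustration, and it ignores deletions:

```python
# Toy model of a synthetic full backup: merge the last full backup with the
# incrementals taken since, letting later incrementals win. Each backup is
# modeled as {path: (version, data)}. Illustrative only -- real backup
# software performs this merge against its catalog and tape/disk media.

def synthesize_full(last_full, incrementals):
    """incrementals is a list of backup dicts, oldest first."""
    synthetic = dict(last_full)
    for inc in incrementals:          # later incrementals overwrite earlier
        synthetic.update(inc)
    return synthetic

full = {"a.txt": ("v1", "..."), "b.txt": ("v1", "...")}
mon = {"a.txt": ("v2", "...")}        # a.txt changed on Monday
tue = {"c.txt": ("v1", "...")}        # c.txt created on Tuesday
new_full = synthesize_full(full, [mon, tue])
print(sorted(new_full))               # → ['a.txt', 'b.txt', 'c.txt']
```

The payoff is that production servers only ever supply incrementals; the expensive full-backup read happens once, and later "fulls" are assembled on the backup side.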
She admits to having second thoughts about using the “forklift approach” of installing a new storage system, which her department ran in parallel while the technical questions were hashed out. “Would I do that differently?” she wonders. She concludes that “trying to patch together an old system that wasn’t capable of handling our needs” would ultimately have been more difficult—and perhaps unsuccessful.
In nearby Washington state, the DOT’s South Central Region faced a slightly different challenge. In addition to improving how it managed backups, the agency also wanted the revision to include disaster recovery capabilities, Johnson says. He settled on a $40,000 disk-to-disk unit that included software for all of his goals and could also plug into the tape library already running at the department.
The department’s total data pool now stands at about 700GB, of which about 15 percent to 20 percent is new or revised each day. Incremental backups of the six departmental servers now take about three hours. In the past, because of time constraints, Johnson ran incremental backups daily and performed full backups, which could span 15 hours or more, on weekends.
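Those figures imply the nightly change volume and the sustained rate the three-hour incremental window requires:

```python
# Rough arithmetic on the figures above: a 700 GB pool with 15-20 percent
# daily change implies the nightly incremental volume, and the three-hour
# window implies the throughput the staging tier must sustain.

pool_gb = 700
for change_rate in (0.15, 0.20):
    delta_gb = pool_gb * change_rate
    rate_gb_per_hr = delta_gb / 3     # three-hour incremental window
    print(f"{change_rate:.0%} change: {delta_gb:.0f} GB/night, "
          f"{rate_gb_per_hr:.0f} GB/h sustained")
```

That works out to roughly 105GB to 140GB per night, a volume that would have been painful to push straight to tape but is routine for a disk staging tier.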
Johnson’s biggest challenge with his D2D2T installation is becoming comfortable with the “synthetic backup” scheme used by his management software. “For me it’s a different paradigm from what I was used to,” he says.
The first day the appliance went live, Johnson ran a full backup, and since then he has run differential backups to refresh the data. The department no longer performs full backups. “This required a whole shift in my thinking, as well as training from the vendor,” Johnson says. “I wouldn’t buy the equipment without a commitment from the vendor for training.”
If he did the project again, Johnson would rethink his decision to stick with his existing tape library. “I’d get a bigger tape library,” he says.
Larger tape capacity would mean less human intervention to switch tapes when they get full. Johnson estimates that the department staff spends about an hour a day managing tape cartridges.
Based on this experience, Johnson says the first step for agencies contemplating a move to D2D2T is to know how much data they’ll be backing up initially and to estimate how much growth they expect in the immediate future.
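That first sizing step can be sketched as a compound-growth calculation. The historical figures below are the Oregon DCBS numbers quoted earlier in the article; the projection itself is only illustrative:

```python
# Hedged sketch of the capacity-planning step: derive a compound annual
# growth rate from recent history and project near-term needs from it.
# Historical figures are the Oregon DCBS numbers cited in the article.

def annual_growth(start_gb, end_gb, years):
    return (end_gb / start_gb) ** (1 / years) - 1

def project(current_gb, growth, years):
    return current_gb * (1 + growth) ** years

# 1999-2003: 200 GB grew to 600 GB
print(f"1999-2003 growth: {annual_growth(200, 600, 4):.0%} per year")

# The following three years: 600 GB grew to 3 TB
recent = annual_growth(600, 3000, 3)
print(f"recent growth: {recent:.0%} per year")

# If the recent trend held, capacity needed three more years out:
print(f"projected pool: {project(3000, recent, 3):.0f} GB")
```

Applied to the DCBS history, the early period works out to about 32 percent annual growth and the recent period to about 71 percent, which is exactly the acceleration Murphy anticipated, and which determines how much disk and tape capacity to buy.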
1. Agencies must first determine the current amount of data they’re storing and use recent growth trends to estimate their near-term needs. The results will help them decide on the right capacities for the disk arrays and tape carousels that go into a D2D2T implementation.
2. Next, determine how the backup architecture fits into larger tactical and strategic considerations. For example, will it be efficient enough to complete incremental and full backups within the off-hours windows when production servers aren’t in use?
Is providing for “hot” recoveries—the ability to pull recent files from mid-tier disk arrays rather than hunting through tape cartridges if a file becomes damaged or accidentally overwritten—an additional goal? If so, agencies need to decide how long data resides on intermediary arrays before being removed.
3. Analyze the financial impact of adding a mid-tier array to existing disk and tape resources. Some projects will see a minimum of expense and operational disruption by upgrading existing operations. However, if the current hardware and software are old and underpowered, a clean-slate approach may ultimately prove a better investment.
4. Evaluate preconfigured storage arrays designed with D2D2T in mind and weigh the price/performance trade-offs of units that use Internet Small Computer System Interface (iSCSI) or Fibre Channel to transfer data through the architecture.
5. Judge data management software with an eye on how well it supports hardware choices. Also consider the scheduling and management tools for automating the downloading of data from production servers to the mid-tier arrays and onto the tape carousels.
6. Choose tape carousels able to hold enough cartridges to minimize manual intervention.
• Faster incremental and full backups
• Increased availability of production servers
• Faster-than-tape disk-based file recoveries
• Scheduled backups during normal business hours for easier monitoring
• Better ability to capitalize on the inherent advantages of disk and tape technologies