Recovery Efforts
When Phil Lowder came to Linn County, Iowa, as IT director in 2008, he expected to ease into the job during the first two weeks: He'd meet with county administrators and introduce himself to his IT staff. But a devastating flood that damaged county buildings and crippled the IT infrastructure changed those plans.
In the ensuing weeks, he worked tirelessly to get critical infrastructure back up. And over the past two years, he's developed a continuity of operations strategy and deployed new technology to make sure such downtime never happens again. His investments include HP blade servers, EMC storage area networks, VMware virtualization and Symantec NetBackup PureDisk backup software.
"There was a lot of work getting things up and running, but a lot of positives have come from this," Lowder says. "We're getting a new building, we have replaced our servers and network in one fell swoop, and we now have a secondary disaster recovery site."
The threat of downtime -- from power outages to disasters such as hurricanes, tornados or the catastrophic flooding that Linn County experienced -- is prompting city and county governments to make smart IT purchases in backup software and storage gear to shore up their disaster recovery stance.
The amount each invests is different, and government leaders must weigh the level of risk and the length of recovery time they are willing to accept versus the amount they are willing to spend, says Jonathan Eunice, principal IT adviser at Illuminata.
"It's like insurance: You buy what you can afford," Eunice says. "You can't insure against everything, but you try to insure for the worst-case scenarios, and if you can afford a bit more, you get better protection."
#2
Upgrading continuity of operations and disaster recovery capabilities is the second-highest priority among enterprises, behind consolidating IT infrastructure
Source: Forrester Research, "Business Continuity and Disaster Recovery Are Top IT Priorities for 2010 and 2011"
The Rebuilding Process
Iowa's 2008 flood caused an estimated $7 billion in damage. The state had already endured a wet winter and spring, so when more storms arrived that June, rivers overflowed their banks.
When county workers received evacuation orders, Lowder's key IT staffers worked late into the night, completing the nightly backup of critical applications. Even though IT administrators were told their building was safe, they moved backup tapes and critical servers to the top floors just in case. The mainframe, however, stayed on the bottom floor because of its size.
The flood submerged 10 square miles of Cedar Rapids. Ten out of 14 county buildings were damaged, including the county administrative building that housed the IT department. The basement data center was submerged under six feet of water, destroying servers and the mainframe. The phone system, e-mail and business applications were down, and about 200 of the county's 500 computers were destroyed.
The county's critical servers that were moved to higher ground stayed dry, but the building itself had extremely high humidity, which damaged the servers, Lowder says. "We were able to get file servers running, but the stuff we saved had a high failure rate."
He needed to build a makeshift data center at nearby Kirkwood Community College, so he ordered new servers, switches and firewalls. Over the next week, the IT staff set up e-mail, a new Voice over IP phone system and other essential applications. The county purchased a new mainframe and had its tax collection system back online six weeks later.
In early 2009, with the makeshift infrastructure in place, Lowder stepped back to reassess where the IT department was and developed a three-phase continuity of operations strategy.
In the first phase, in late 2009, the county built a new data center featuring an HP c7000 blade enclosure, five HP blade servers, a 15 terabyte EMC CLARiiON AX4 SAN and VMware virtualization software. Through virtualization, the county reduced its server count from 30-plus standalone servers to five. Three blades run virtual machines, the fourth blade runs the domain controller, and the fifth is a cold spare that will take over if the other blades go down, Lowder says.
In the second phase, Lowder and his team built a secondary data center. Last March, the county's Emergency Management Agency allowed them to house backup servers there, along with a second SAN that replicates data in real time. If the main data center goes down, the IT staff can fire up the backup servers and get applications running again within 24 hours.
Last summer the IT department installed Symantec NetBackup PureDisk software to back up and deduplicate data. Before the flood, each server had its own tape backup drive. Now the data is backed up to disk on the SAN and replicated to the secondary site.
But Lowder isn't done. For the third phase, he and his team will implement VMware's vCenter Site Recovery Manager, which will turn the secondary data center from a cold site into a hot site. If the main data center goes offline, Site Recovery Manager will automatically fail over to the secondary data center. That will cut the recovery time objective from 24 hours to one hour.
32%
The percentage of enterprises that plan to increase continuity of operations spending in the next year, according to a Forrester Research study
Next fall, the IT department will move the main data center to its permanent home in the new Linn County Community Services building. And within the next year or two, Lowder plans to create a third data center for even more protection.
"The flood wiped out buildings, so it made me think: We have a lot of buildings that are geographically dispersed in opposite ends of the county," he says. "I see an opportunity to extend things further and replicate to a third disaster recovery site miles away."
Phased Approach
Florence County, S.C., is fortunate that it hasn't suffered from any disasters since Hurricane Hugo in 1989. But county IT Director Robert Franks doesn't want to press his luck.
When Franks joined the county five years ago, it had one data center and used only tape backup. Upon his arrival, he embarked on a multiyear project to improve continuity of operations and disaster recovery planning -- and storage played a big role.
Today, Florence County has three data centers with separate EMC SANs in each. To protect data, Franks relies on disk-to-disk-to-tape backup. If one data center goes offline, he can relaunch services at the other data centers.
Because of budget constraints, Franks' strategy was to take a methodical, phased approach. During his first year, because network downtime was a problem, he invested in new infrastructure. Then he focused on building additional data centers.
In 2007, the Law Enforcement Center had a fireproof room available, so he built a second data center to house applications for law enforcement and other departments that had nearby offices. In 2008, he built a third data center at the county's planning building. The extra data centers serve two purposes: disaster recovery and faster IT services because department-specific applications and data are stored where users need access to them.
During this period, Franks purchased three EMC CLARiiON CX4 SANs, one for each location. New or frequently accessed data is stored on Fibre Channel drives, while older or less critical data is stored on lower-cost SATA drives. With EMC's NetWorker software, the county performs incremental daily backups and full weekend backups to disk. And when the SATA drives reach 90 percent capacity, the data is moved to tape.
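The backup routine described above can be sketched as a simple policy. This is a minimal illustration, not Florence County's actual configuration: the function names are hypothetical, and it assumes "weekend" means Saturday or Sunday, which the article does not specify. Only the 90 percent threshold and the incremental/full split come from the text.

```python
from datetime import date

# From the article: data moves to tape when SATA drives reach 90 percent capacity.
SATA_TAPE_THRESHOLD = 0.90


def backup_type(day: date) -> str:
    """Return 'full' on weekends and 'incremental' on weekdays.

    Assumption: Saturday (5) and Sunday (6) both count as the weekend.
    """
    return "full" if day.weekday() >= 5 else "incremental"


def should_move_to_tape(used_tb: float, capacity_tb: float) -> bool:
    """Return True once the SATA tier is at or above the 90 percent threshold."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return used_tb / capacity_tb >= SATA_TAPE_THRESHOLD
```

For example, a Wednesday yields an incremental backup and a Saturday a full one, and a hypothetical 10TB SATA tier holding 9.5TB would be due for a move to tape.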
Over the past two years, the IT department has virtualized its servers, which further improves continuity of operations. Today, the county has seven physical servers running 45 virtual servers across its three sites. Snapshots of the virtual servers are saved onto tape. If a disaster strikes, IT administrators will use virtualization management software to move snapshots of critical virtual servers to other data centers.
Florence County doesn't have fast enough bandwidth or the budget to buy the additional SAN storage needed to replicate data between SANs in real time, but it's something Franks hopes to do.
Franks' motto is hope for the best, but plan for the worst, and that drives him to constantly improve the county's disaster readiness. "You have to find all the single points of failure and eliminate them," he says.
The Starting Point
While some IT departments need extensive solutions to protect their data centers, others with tighter budgets have to seek more affordable options for continuity of operations.
Two years ago, Robert James, IT director of Woodbury, Minn., purchased a $10,000 hardware appliance that backs up data and uses virtualization to make copies of important server images. The appliance stores 2TB of data and takes snapshots of data from the city's most critical applications every 15 minutes.
The appliance doesn't have the processing power and memory to run all seven critical applications at once, but if a disaster hits, IT staffers can bring up the city's three most important applications -- Active Directory, e-mail and financial software -- within an hour and run them as virtual servers on the appliance.
While the appliance doesn't provide immediate failover for all applications, it is good enough, James says. "This appliance was a low-cost alternative that would get us through an emergency."
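The trade-off James describes, recovering the most important applications first on an appliance that cannot run all seven, can be sketched as a priority-ordered selection under a resource limit. This is a hedged illustration only: the memory figures, the appliance capacity, and the four placeholder application names are hypothetical, since the article names just the top three applications.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    priority: int   # 1 = most important
    memory_gb: int  # hypothetical memory footprint, not from the article


def apps_to_recover(apps, appliance_memory_gb):
    """Pick apps in priority order, stopping at the first one that does not fit."""
    chosen = []
    free = appliance_memory_gb
    for app in sorted(apps, key=lambda a: a.priority):
        if app.memory_gb > free:
            break  # stop here so a lower-priority app never jumps the queue
        chosen.append(app.name)
        free -= app.memory_gb
    return chosen


# Top three names come from the article; the rest are placeholders.
EXAMPLE_APPS = [
    App("Active Directory", 1, 4),
    App("e-mail", 2, 8),
    App("financial software", 3, 8),
    App("placeholder app 4", 4, 8),
    App("placeholder app 5", 5, 8),
    App("placeholder app 6", 6, 8),
    App("placeholder app 7", 7, 8),
]
```

With these made-up numbers, `apps_to_recover(EXAMPLE_APPS, 24)` selects only Active Directory, e-mail and financial software. Stopping at the first application that does not fit, rather than skipping ahead to a smaller one, keeps the recovery order strictly by priority.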
Go to statetechmag.com/DR111 to read about the disaster recovery initiatives implemented by the city of Batavia, Ill.
City officials at Big Bear Lake, Calif., learned their lesson five years ago when only one of 10 servers was backed up. When the document management system's hard drive died, the city was forced to spend $25,000 to hire a data recovery company, recalls Ken Watts, who was then a newly hired IT consultant but is now the city's information systems manager.
Afterward, city leaders allowed Watts to invest in new infrastructure, including a tape drive and Symantec's Backup Exec software for data backup. When Big Bear Lake can afford it, he hopes to buy a SAN and partner with a nearby city to use its data center as a backup site. For now, he makes do with tape backups.
"My project to improve disaster recovery may be too expensive now, but you have to start putting together plans and work toward that long-term direction," he says.
Rescue Equipment
Phil Lowder, IT director of Linn County, Iowa, shares this advice for saving data center gear during a disaster:
1. Keep mobility in mind.
During a disaster, you have to get the equipment out fast. By going with blade servers and virtualization, Linn County shrank its footprint, so it was easier to move the equipment. In addition, the data center must be designed so equipment can be moved easily and quickly. Include a loading dock and eliminate obstacles such as thresholds that make it hard to move heavy equipment.
2. Position critical racks by the door.
Linn County keeps all mission-critical technology in one or two racks near the door, so IT staff can grab those first. If time permits, they will go back for other equipment. Document and practice your evacuation drill.
3. Keep the IT department out of the basement.
The first floor is preferable because it provides faster access to exits.