State and Local Governments Need Disaster Recovery Appropriate for Their Enterprise

State and local agencies should consider the benefits of recovery time objective, hypervisor-based replication and Volume Shadow Copy Service.

Randy Barrett

Randy Barrett is a freelance writer and editor based in Washington, D.C. A large part of his portfolio career involves teaching banjo and fiddle and performing professionally.

When disaster strikes and computer systems go down, everything depends on how critical data is backed up. Various technologies offer different capabilities. For state and local government agencies, choosing the right one can make the difference between a prolonged crash and recovery in a matter of seconds.

Eighty percent of data center operators experienced an outage in the past three years, according to a recent survey by Uptime Institute. About 60 percent of those failures led to losses of more than $100,000. Outages are not always the result of cyberattacks. A sobering 40 percent of organizations experienced an outage caused by human error, often a failure to follow procedures properly or a flaw in the IT architecture itself.

Evan Davis, IT security manager for Grey County in Ontario, Canada, says state and local governments must start with a recovery plan. “Figure out what you need to have,” he says. “These technologies are expensive. Know what you need to do and get stakeholders involved to support you.”

What Is Recovery Time Objective (RTO)?

Recovery time objective is the gold standard of measurement in disaster recovery. According to Rubrik, “RTO is the goal your organization sets for the maximum length of time it should take to restore normal operations following an outage or data loss.” RTO answers a simple question: How quickly must your operation be back up and running?

If the answer is just a few seconds, the solution will cost more. Accurately calculating RTO requires an inventory of all systems, critical applications, virtual machines and data. Then, it’s a matter of building a hierarchy and deciding how long an application can be down before your agency’s business comes to a halt and starts losing money.
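The inventory-and-hierarchy step can be sketched in a few lines of Python. This is a minimal illustration, not a planning tool: the system names, RTO values and restore-time estimates are all hypothetical, and a real inventory would also cover applications, virtual machines and data.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    rto_seconds: int  # maximum tolerable downtime (hypothetical value)


def recovery_order(systems):
    """Return systems sorted so the tightest RTO is restored first."""
    return sorted(systems, key=lambda s: s.rto_seconds)


def missed_rto(systems, restore_seconds):
    """Names of systems whose estimated restore time exceeds their RTO.

    A system with no estimate is treated as missing its RTO.
    """
    return [
        s.name
        for s in systems
        if restore_seconds.get(s.name, float("inf")) > s.rto_seconds
    ]


# Hypothetical inventory for an imaginary agency.
inventory = [
    System("public website", 4 * 3600),
    System("911 dispatch", 30),
    System("payroll", 24 * 3600),
]

# Hypothetical restore-time estimates for the current backup method.
estimates = {"public website": 3600, "911 dispatch": 120, "payroll": 8 * 3600}

order = [s.name for s in recovery_order(inventory)]
gaps = missed_rto(inventory, estimates)
print(order)
print(gaps)
```

Any system that shows up in `gaps` is one where the current backup approach cannot meet the stated RTO, which is where a faster (and costlier) technology may be justified.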

What Is Continuous Data Protection (CDP)?

Data backup has evolved from punch cards and magnetic tape to floppy disks and hard drives, and now to cloud-based solutions that eliminate the need to keep large machines humming in an on-premises data center. Traditional backups were usually done at night to minimize impact on the production IT system, creating a lag of up to 24 hours.

Continuous data protection is a system that backs up data every time a change is made. CDP keeps an ongoing record of data alterations and makes it possible to restore a system to any previous point in time. It effectively removes the traditional interval between two scheduled backups.

“Continuous data protection copies … any changes to your data, from source to target,” notes Cloudian, a hybrid storage provider. “True continuous data protection systems record every write and store it in a changelog on the CDP system. CDP keeps all changes until the last write before failure, allowing you to restore to that point or any previous point before the data was corrupted or lost.”
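The changelog idea in that description can be illustrated with a toy Python journal. This is a sketch under simplifying assumptions, not any vendor's implementation: the `ChangeLog` class, its methods and the integer timestamps are invented for illustration, and a real CDP product journals block-level writes rather than whole file contents.

```python
class ChangeLog:
    """Toy CDP journal: every write is appended with a timestamp."""

    def __init__(self):
        self._entries = []  # (timestamp, key, value) tuples in write order

    def write(self, timestamp, key, value):
        """Record a write; writes are assumed to arrive in time order."""
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("writes must arrive in time order")
        self._entries.append((timestamp, key, value))

    def restore(self, point_in_time):
        """Rebuild state by replaying every write at or before point_in_time."""
        state = {}
        for timestamp, key, value in self._entries:
            if timestamp > point_in_time:
                break
            state[key] = value
        return state


log = ChangeLog()
log.write(1, "report.docx", "draft")
log.write(2, "report.docx", "final")
log.write(3, "report.docx", "ENCRYPTED")  # simulated corruption

clean = log.restore(2)   # state just before the bad write
latest = log.restore(3)  # state including the bad write
print(clean)
print(latest)
```

Because every write is kept, the restore point can be chosen after the fact. The cost is that the journal grows with every change.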

Here, experts say, an IT manager must decide on a hierarchy of storage needs.

“Not every kind of data needs a two-second recovery point objective,” says Rick Vanover, senior director of product strategy for Veeam. “CDP is not a solution for all potential incidents. If you lose a file, standard backup is best. That’s the daily disaster.”

The primary downside of CDP is that it creates a single point of failure. If the CDP software gets corrupted, your data can become toast. Vanover says it's generally recommended to keep standard backups even when using CDP.

Keeping an extensive record of CDP changes is also expensive, says Mark Chuang, VMware’s head of product marketing for cloud storage and data. And it can be dicey to find a safe restore point after ransomware, the most common form of attack. “Ransomware can have very long dwell times. Some bits already may have been infected or encrypted.”

For this reason, Gartner recommends the use of isolated recovery environments on virtual machines, where data can be forensically examined away from the production IT system. “Ransomware attacks will vary in nature, with the level of infection dictating the recovery strategy being used. Organizations can utilize modern backup infrastructure to restore rapidly in certain situations but must ensure ransomware is eradicated and the threat vector is eliminated or risk reinfection,” Gartner notes in a report.

The company also advises that an immutable copy of backup data be placed in an “air-gapped” location closed to all outside networks.

What Is Hypervisor-Based Replication?

Sometimes not using any hardware is the best way to go.

“Hypervisor-based replication is software that is integrated directly with hypervisor software to replicate virtual machines and virtual disks to another hypervisor or other storage location,” notes cloud data company Zerto. “A hypervisor is a software-based operating platform that hosts and runs virtual machines and their virtual disks.”

The hypervisor-based replication approach has key advantages, including scalability and a smaller storage footprint. It also provides real-time continuous data protection without any scheduling. But it has security drawbacks.

“Virtualization has made hacking into virtual machines easier, as unattended VMs are an inviting window for cybercriminals,” writes Amit Malik, COO of Spektra Systems. “Penetrating a virtual machine and getting access to all the stored data which can be confidential and sensitive is now possible, also transferring the data is extremely easy from a VM.”

But what makes the technology so effective is also its Achilles’ heel. “Since virtual machines are designed to be independent of the host server, multiple VMs can be spinned in a matter of seconds and data can be easily cloned and transferred to other hardware devices without getting noticed,” he writes.

This data theft can cause huge losses to an organization and result in security breaches, Malik adds.

Creating new VMs and forgetting about them is a clear pitfall for IT managers. In a ransomware attack, “Your own operating system becomes a crime scene,” says Adam Scamihorn, product director of business continuity services for InterVision Systems. How to avoid it? “You need to clean off your virtual machines and threat hunt.”

What Is Volume Shadow Copy Service?

Volume Shadow Copy Service (VSS) was introduced by Microsoft in 2003 as part of Windows Server. It is an essential element in creating snapshots of computer files, even when they are in use.

These snapshots are read-only, point-in-time copies, and the files aren’t locked while the backup is being created. “When all the components support VSS, you can use them to back up your application data without taking the applications offline,” notes Microsoft.
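One common way to get a read-only, point-in-time view without locking live data is copy-on-write: the old contents of a block are set aside only when that block is about to be overwritten. The toy Python model below illustrates that general idea. It does not call the VSS API or mirror its internals, and the `Volume` and `Snapshot` classes are hypothetical.

```python
class Snapshot:
    """Read-only, point-in-time view of a Volume."""

    def __init__(self, volume):
        self._volume = volume
        self._saved = {}  # block index -> contents at snapshot time

    def preserve(self, index, old_data):
        # Keep only the first saved copy; it is the snapshot-time value.
        self._saved.setdefault(index, old_data)

    def read(self, index):
        # Unchanged blocks are read straight from the live volume.
        return self._saved.get(index, self._volume.blocks[index])


class Volume:
    """Toy block store that supports copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self._snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self._snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Set aside the old block for each snapshot before overwriting it.
        for snap in self._snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")   # the application keeps writing while the snapshot exists
vol.write(1, "BB")

snapshot_view = [snap.read(i) for i in range(3)]
live_view = list(vol.blocks)
print(snapshot_view)
print(live_view)
```

The snapshot still shows the data as it was when it was taken, while the live volume carries the newer writes. The extra copy on each first overwrite also hints at why frequent snapshots can slow a busy system.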

VSS snapshots usually are invisible to users, but sometimes they can bog applications down, writes SQL expert Brent Ozar on his blog. “Now, if you’re just taking one VSS Snap after hours when no one’s around, chances are that pause isn’t going to make a noise. But if you’re relying on VSS Snaps for more frequent backups, or if you’re running a 24×7 shop where users are still doing important stuff, this could be a real performance hit.” The suggested fix is to run no more than 35 databases on a server.

Some consider VSS outmoded. “We don’t use it,” says Chris Rogers, senior technology evangelist for Zerto. “It’s old technology used one time a day.”
