Feb 21 2024
Enhancing Data Integrity: A Guide for State and Local Governments

Agencies need complete, high-quality data to make informed data-driven decisions. Getting there isn’t easy, but there are ways to bolster data integrity.

An organization that relies on data will be only as good as its data, and data is only as good as the method of collection, storage and use. From this reality stem many a government agency’s information woes. And the damage done by bad data has only become more potent with the rise of generative AI. If such systems are trained on bad information, bad outcomes are a foregone conclusion. Thus, the new Holy Grail for state and local data managers is data integrity, an essential goal in evolving an information system to support decision-making and the timely delivery of services to citizens.

What Is Data Integrity?

Data integrity is often conflated and confused with data quality. They are, in fact, two sides of the same coin. Data integrity commonly refers to an overarching “umbrella for data management, including security,” says Eric Sweden, program director for enterprise architecture and governance at the National Association of State Chief Information Officers.

According to the Data Management Association, data integrity refers to information that “complies with all rules regarding definitions, relationships, lineage, and heritage.” Further, when data moves, it’s “probably not changed unexpectedly through transmission between systems.”
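
One common way to check that a record was “not changed unexpectedly” in transit is to compare a cryptographic digest computed before and after the move. The sketch below is only an illustration of that idea, not a method the association or any agency prescribes; the permit record and the function names are hypothetical.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """Return a SHA-256 digest of a record, serialized with a stable key order."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def unchanged_in_transit(sent: dict, received: dict) -> bool:
    """True when the received record hashes to the same value as the one sent."""
    return record_digest(sent) == record_digest(received)


# Hypothetical permit record, before and after a transfer between systems.
sent = {"permit_id": "P-1001", "status": "approved", "fee": 125.0}
received_ok = {"fee": 125.0, "status": "approved", "permit_id": "P-1001"}
received_bad = {"permit_id": "P-1001", "status": "approved", "fee": 12.5}

print(unchanged_in_transit(sent, received_ok))   # same content, different key order
print(unchanged_in_transit(sent, received_bad))  # fee was altered along the way
```

Sorting the keys before hashing means a harmless reordering of fields does not raise a false alarm, while any change to a value does.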

Data integrity assures managers they can rely on and trust any individual piece of information in the pipeline. While that may seem basic, experts say that attaining a high level of integrity is enormously difficult, and that data integrity is a prime target for hackers.

According to IBM: “To achieve a high level of data integrity, an organization implements processes, rules and standards that govern how data is collected, stored, accessed, edited and used … An organization with a high level of data integrity can increase the likelihood and speed of data recoverability in the event of a breach or unplanned downtime, protect against unauthorized access and data modification and achieve and maintain compliance more effectively.”

Data integrity is the primary framework for information trustworthiness. “Without data integrity, you can’t have data quality,” explains Brian Vecci, field CTO at Varonis.

What Is Data Quality?

“Data quality provides a measure of data integrity,” Sweden says. “Data quality assesses the level of data integrity by evaluating accuracy, completeness, reliability, validity and timeliness.” Data quality is a subset of the attributes that make data integrity possible.

The Data Management Association defines data quality as “the degree to which data is accurate, complete, timely, consistent with all requirements and business rules, and relevant for a given use.”
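
Several of the dimensions named above can be turned into simple scores. What follows is a minimal sketch, assuming records arrive as Python dictionaries; the field names, the allowed status values and the 90-day freshness window are hypothetical choices for illustration, not thresholds drawn from NASCIO or the Data Management Association.

```python
from datetime import date


def completeness(records, required_fields):
    """Share of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    ok = sum(
        all(r.get(f) not in (None, "") for f in required_fields) for r in records
    )
    return ok / len(records)


def validity(records, field, allowed_values):
    """Share of records whose field holds one of the allowed values."""
    if not records:
        return 0.0
    return sum(r.get(field) in allowed_values for r in records) / len(records)


def timeliness(records, date_field, as_of, max_age_days):
    """Share of records updated within max_age_days of the as_of date."""
    if not records:
        return 0.0
    ok = sum(
        r.get(date_field) is not None
        and (as_of - r[date_field]).days <= max_age_days
        for r in records
    )
    return ok / len(records)


# Hypothetical service-request records.
records = [
    {"id": 1, "status": "open", "updated": date(2024, 2, 1)},
    {"id": 2, "status": "", "updated": date(2024, 1, 15)},
    {"id": 3, "status": "closed", "updated": date(2023, 6, 30)},
    {"id": 4, "status": "pending", "updated": date(2024, 2, 10)},
]

scores = {
    "completeness": completeness(records, ["id", "status", "updated"]),
    "validity": validity(records, "status", {"open", "closed"}),
    "timeliness": timeliness(records, "updated", date(2024, 2, 21), 90),
}
print(scores)
```

Accuracy is left out on purpose: it can only be scored against a trusted reference, which a self-contained check like this does not have.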

“High-quality data eliminates incongruency across systems and departments and ensures consistent data across processes and procedures,” IBM notes. “Collaboration and decision-making among stakeholders are improved because they all rely on the same data.”

The city of Austin, Texas, recently discovered the challenges that come with a lack of data quality. A July 2023 internal audit found that “the data on the portal did not consistently match departments’ data sources. The discrepancies between data on the portal and in department sources varied from as few as two missing records to hundreds of thousands of missing records. This means community members and City decision-makers who use data from the portal may be getting information that is incomplete, inaccurate, or otherwise different from the data departments may use when making decisions.”
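
The kind of mismatch the auditors describe, records present in a department source but missing from the portal, can be surfaced by comparing the two sets on a shared key. The sketch below assumes both sides can be exported as lists of dictionaries with a common identifier; the `record_id` key and the sample rows are hypothetical and do not reflect Austin’s actual systems.

```python
def reconcile(source_rows, portal_rows, key="record_id"):
    """Compare two record sets by key and report where they disagree."""
    source = {row[key]: row for row in source_rows}
    portal = {row[key]: row for row in portal_rows}
    shared = source.keys() & portal.keys()
    return {
        "missing_from_portal": sorted(source.keys() - portal.keys()),
        "extra_on_portal": sorted(portal.keys() - source.keys()),
        "mismatched": sorted(k for k in shared if source[k] != portal[k]),
    }


# Hypothetical department export versus what the public portal shows.
department = [
    {"record_id": 101, "type": "pothole", "status": "closed"},
    {"record_id": 102, "type": "streetlight", "status": "open"},
    {"record_id": 103, "type": "graffiti", "status": "open"},
]
portal = [
    {"record_id": 101, "type": "pothole", "status": "closed"},
    {"record_id": 102, "type": "streetlight", "status": "closed"},
    {"record_id": 104, "type": "noise", "status": "open"},
]

report = reconcile(department, portal)
print(report)
```

Run on a schedule, a report like this turns a one-time audit finding into a routine check.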

It turns out that organizations of all kinds believe their data integrity isn’t where it needs to be. In a survey conducted by Drexel University, only 34 percent of organizations felt their data quality was “high” or “very high.” Half of the respondents felt poor data quality is the leading challenge to data integrity.

How Do You Boost Data Integrity?

Improving data integrity is a gradual process. You first need to understand what you have, says Timothy Humphrey, chief analytics officer at IBM.

“In most institutions, your metadata is poor,” Humphrey says, because organizations inherit information over time. “You need to understand the map of your data universe.”

For example, what systems does the data go through? What’s the lifecycle all the way to insight? Humphrey also advises taking an iterative approach to data mapping. “Go after the most challenging problems first.”

City officials in Austin ran headlong into this problem. Information about the data fell short. “Metadata on the portal are sometimes missing or incomplete,” the audit noted. “The least consistently provided metadata in our analysis were column descriptions and frequency of updates. This hinders people’s ability to interpret the data or to see how current the data is.”
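
Gaps like the ones the audit lists, missing column descriptions and update frequencies, lend themselves to an automated check. This is a rough sketch under the assumption that each dataset’s metadata can be read as a dictionary; the field names (`description`, `update_frequency`, `columns`) and the sample dataset are hypothetical, not the portal’s real schema.

```python
REQUIRED_DATASET_FIELDS = ("description", "update_frequency")


def is_blank(value) -> bool:
    """Treat None, empty strings and whitespace-only strings as missing."""
    return not str(value or "").strip()


def audit_metadata(dataset: dict) -> list:
    """Return a list of human-readable metadata gaps for one dataset."""
    gaps = []
    for field in REQUIRED_DATASET_FIELDS:
        if is_blank(dataset.get(field)):
            gaps.append(f"dataset: missing {field}")
    for column in dataset.get("columns", []):
        if is_blank(column.get("description")):
            gaps.append(f"column {column['name']}: missing description")
    return gaps


# Hypothetical portal entry with incomplete metadata.
dataset = {
    "name": "Service Requests",
    "description": "Requests submitted by residents.",
    "update_frequency": "",
    "columns": [
        {"name": "request_id", "description": "Unique request number."},
        {"name": "created", "description": "   "},
        {"name": "status"},
    ],
}

for gap in audit_metadata(dataset):
    print(gap)
```

A check of this sort flags only that a description is absent; whether an existing description is accurate still takes a person who knows the data.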

Experts agree that automation is now an essential part of validating and reconciling information. “Manual stare-and-compare between two screens never works” because it introduces too many errors, says Curtis O’Dell, global business manager for Tricentis, an Austin-based software testing company.

Where Does Data Governance Come In?

Data integrity is not possible without strong data governance policies. NASCIO’s Sweden strongly advises state and local governments to create positions for a chief data officer, chief privacy officer and chief security officer if these roles don’t already exist. Each is necessary to create a viable governance framework.

“Governance is how I can ensure that the data is of good quality,” Humphrey says. “It’s about processes end to end — everyone needs the same definition of a term, and the formats need to be the same.” It’s one of the most difficult things an organization can do, he adds, because it requires a dance between people, technology and change management.

How do you meet that challenge? Bring in a consultant to help establish a data governance plan. “A third party gets listened to, and outside counsel tends to have more credibility” with the C-suite, Sweden says. There’s a lot to be learned from an expert who has done it before.

Austin’s audit found a lack of governance plagued its open-data portal: “Departments are responsible for putting data on the portal. [The Communications and Technology Management department] is responsible for managing the portal’s operation. No one person or department is responsible for verifying data or an overall strategy for providing data to the public.”

Establishing data ownership is an essential part of governance that often slips through the cracks, Vecci says. If a system is maintained by human resources but it’s housed on 100 scattered hard drives or in the cloud, who exactly is in charge?

A lack of resources compounds the problem for state and local governments. “They’re facing a whole lot of unknown unknowns,” Vecci says.
