What Is Big Data in State and Local Government?
Joseph Flynn, public sector CTO at Boomi, notes that “data has now become a critical asset of every agency and department.”
“We’re seeing a greater value placed on this data, but we’re also seeing state and local governments get into a state of ‘analysis paralysis’ as the number of analytics tools rapidly expands,” Flynn says. He also notes that there are challenges around the growing IT skills gap in state and local governments, as well as a need to incorporate existing legacy systems into new deployments without disrupting current operations.
Robert Carey, president of Cloudera Government Solutions and former CIO of the U.S. Navy, also points to distinct differences in how states approach Big Data based on their current infrastructure and budgets.
“Some have fairly deep pockets, while some do not,” he says. “Some tend to dive in quickly and make connections with public cloud providers. Others are digging into very specific programs to make the most of their budgets.”
“State and local governments are each on their own curve of digital transformation,” says Rick Vanover, senior director of product strategy at Veeam. “Many are putting citywide surveillance in place, many states have digitized services that used to be paper and law enforcement has more data than ever to manage. Each city or state has its own story, but what is consistent is that there is a constant need for the services powered by this data.”
Carey says data priorities are similar regardless of budget or level of government. From diving into financial data to examining COVID-19 data, extrapolating environmental data and integrating climate data, there’s no shortage of opportunity for governments if they can make the best use of Big Data.
What Are the Five V’s of Big Data?
Often described as the essential and innate characteristics of data, the five V’s can help state and local government data scientists effectively leverage Big Data sources. They are:
- Volume. Volume refers to the amount of data collected and used by organizations. Large amounts of collected data are known as Big Data, but there is no specific amount that defines the term. In 2021, approximately 79 zettabytes of data were created worldwide.
- Variety. Variety highlights the diverse types of data now available. Carey points to common types, including structured, unstructured and semistructured. Each has potential value depending on how, when and why it is collected.
- Velocity. Velocity is all about speed. How quickly is data generated, and how quickly can organizations capture and analyze this data? With the potential value of many data sources tied directly to their timeliness — such as emergent weather events or traffic patterns — the ability to collect data on demand is critical.
- Veracity. Veracity focuses on the quality and accuracy of data. Assessing veracity means evaluating the source and cleanliness of the data to ensure it is reliable. It also means conducting operations such as deduplication to reduce the risk of redundant data collection.
- Value. Value speaks to the ability of state and local organizations to act on the data they collect. Delivering value requires a combination of the previous V’s to ensure the right variety of data is collected at the right time by systems capable of handling the volume and velocity of this information while simultaneously ensuring its quality and accuracy.
It’s worth noting this isn’t an immutable list. The framework was originally described as three V’s (volume, variety and velocity); value and veracity were added when it became apparent that they were also critical characteristics.
Vanover even suggests adding one more: verbalization. “Owners of Big Data systems and lakes should be able to verbalize the five V’s to state and local agencies clearly,” he says.
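To make the veracity point concrete, the deduplication step mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a method described by anyone quoted here: the `deduplicate` function, the sample service requests and the field names are all hypothetical, and real agency data would likely need fuzzier matching than simple trimming and lowercasing.

```python
def deduplicate(records, key_fields):
    """Keep the first record seen for each normalized key.

    Normalization here is deliberately simple (trim whitespace, lowercase);
    this is an assumption for illustration only.
    """
    seen = set()
    unique = []
    for record in records:
        key = tuple(str(record.get(field, "")).strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical service requests; the field names are illustrative only.
requests = [
    {"address": "12 Main St", "issue": "Pothole"},
    {"address": "12 main st ", "issue": "pothole"},  # same request, entered differently
    {"address": "40 Oak Ave", "issue": "Streetlight"},
]

cleaned = deduplicate(requests, ["address", "issue"])
```

Here the second request is treated as a duplicate of the first because the two match once case and stray whitespace are ignored, so `cleaned` holds two records instead of three.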
How Can State and Local Agencies Optimize Their Use of Big Data?
To optimize collection, curation and use, state and local agencies can leverage the five complementary C’s of Big Data.
- Creation (Volume). Creation speaks to strategy building: agencies need to understand what they’re trying to achieve with Big Data efforts. “The first thing is developing a strategy,” Carey says. “It’s not uncommon to find some people diving right in and finding out the water is freezing cold and deeper than they thought.” He recommends creating labs and production environments first to demonstrate that data can align with strategic goals, rather than spending up front on large-scale projects that don’t ultimately deliver.
- Cataloging (Variety). Cataloging is all about discovering the location and format of data and determining how this impacts analytics goals. Is the behavior of data consistent? Are there areas where disparate data types provide more context to drive better decision-making? Robust cataloging helps reduce the natural complexity of data variety and streamline Big Data operations.
- Consumption (Velocity). Consumption is the ability to handle data volumes at speed. For state and local agencies, this could take the form of public, private or hybrid cloud adoption under authorized frameworks, such as StateRAMP, to help maximize consumption speed without sacrificing security.
- Consistency (Veracity). Consistency refers to best practices around Big Data collection, curation and analysis. By creating consistent and repeatable processes, organizations are better able to ensure the accuracy and quality of their data, in turn improving data analytics output.
- Connection (Value). Connection links data sources to intended outcomes. As noted by Flynn, while the value of isolated data events occurs at the point of transaction, “with larger analytics, this shifts. Over time, it becomes a rich set of information.” For government agencies, connection must account for both point-in-time transactions and the broader impact of data sets on state or local trends.
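As a rough illustration of the cataloging step, a catalog can start as little more than a record of where each data set lives and what format it takes. The Python sketch below assumes a hypothetical `CatalogEntry` structure and invented data set names; it is not drawn from any agency's actual inventory or from a specific cataloging product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One data set in a hypothetical agency catalog."""
    name: str
    location: str      # where the data lives
    data_format: str   # "structured", "semistructured" or "unstructured"


def group_by_format(entries):
    """Group data set names by format to show how varied the inventory is."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.data_format].append(entry.name)
    return dict(grouped)


# Invented entries for illustration only.
catalog = [
    CatalogEntry("permit_records", "on-premises database", "structured"),
    CatalogEntry("traffic_sensor_feed", "cloud object storage", "semistructured"),
    CatalogEntry("council_meeting_video", "file share", "unstructured"),
    CatalogEntry("tax_ledger", "on-premises database", "structured"),
]

by_format = group_by_format(catalog)
```

Grouping by format is only one possible view; the same entries could be grouped by location to show which systems would need to be connected for a given analytics goal.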
What’s the bottom line? Making the best use of Big Data is critical for governments to modernize service delivery and make better decisions. Vanover puts it simply: “The Big Data of state and local governments is data with a mission, and now, it’s bigger than ever.”