The Data Management Conundrum in Smart Cities
The world’s most advanced smart city initiatives rely on an impressive number of devices. The city of San Diego, for example, has thousands of connected streetlights.
Data from those devices is shared among a spectrum of public and private sector stakeholders for the benefit of the general population. For instance, San Diego’s streetlights are used to monitor atmospheric data, traffic patterns and more. Meanwhile, Waze uses publicly available traffic data to improve the travel experience for users around the world, while giving urban planners the tools to analyze that data. Further afield, in Seoul, South Korea, solar-powered waste bins incorporating fill-level sensors are helping to refine manual collection routes, bringing significant cost savings to a cleaner city.
All of these sensors and solutions collect vast quantities of data, leaving city administrators with many questions. What data should be saved, and what should be destroyed? What should be aggregated and shared, and what should be treated confidentially? And, critically, how can petabytes — or even exabytes — of smart city data be transported and processed in such a manner that swift action can be taken based on the insights delivered?
Regulation adds further complexity. As more civic processes become automated, public authorities must retain a historical record of the data that drives those processes, particularly when the processes are linked to highly regulated sectors such as healthcare, financial services and the automotive market.
Meanwhile, personally identifiable information (PII) carries its own set of requirements. Serious concerns have been raised about how data is collected and used, creating a quandary over which data to keep and which to discard. Cities must be very careful about how they use the data they collect and ensure that PII is kept secure and anonymized.
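One common way to reduce PII exposure is to replace direct identifiers with keyed hashes before records leave the collecting system. The Python sketch below is illustrative only: the field name `license_plate`, the key value and the function names are assumptions, not part of any particular city's system. Keyed hashing is pseudonymization rather than full anonymization, so whoever holds the key can still link records together.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; a real deployment would load it
# from a managed key store rather than hard-coding it.
SECRET_KEY = b"example-key-for-illustration-only"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with its HMAC-SHA256 digest (64 hex characters)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_record(record: dict, pii_fields: tuple = ("license_plate",)) -> dict:
    """Return a copy of the record with the listed PII fields pseudonymized."""
    cleaned = dict(record)
    for field in pii_fields:
        if field in cleaned:
            cleaned[field] = pseudonymize(str(cleaned[field]))
    return cleaned


# Hypothetical traffic-camera reading.
reading = {"sensor_id": "cam-17", "license_plate": "7ABC123", "speed_kph": 42}
print(scrub_record(reading))
```

The same input always maps to the same token, so aggregate analysis such as counting repeat journeys still works without storing the raw identifier.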
With so many smart systems simultaneously at play and so many challenges around those systems, what’s the most effective way to archive and maintain data for compliance and auditing purposes on a mega scale?
Data Management at the Edge of a Smart City
To answer these questions, it helps to understand the different environments in which data is generated, collected and held.
In smart cities, a significant amount of data is produced at the point of collection — otherwise known as the edge. This might include sensors on a subway train or cameras directed at a waterside park. The challenge for public authorities is to determine what data from edge locations should be retained or moved into a central processing center to be aggregated, shared, analyzed and archived.
Cities need to be able to process data quickly to adjust and refine edge-based applications, but transmitting data from an edge device to a data center and back again often takes too long. A faster and more cost-effective approach is to combine edge storage, where data is processed at the device itself, with edge computing storage, which uses computing and storage infrastructure in switching offices to support intermediate aggregation points.
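To make the idea of intermediate aggregation concrete, here is a minimal Python sketch of an edge node collapsing a window of raw readings into one summary record before forwarding it. The function name, the record layout and the sensor ID are hypothetical; which statistics are worth keeping would depend on the application.

```python
from statistics import mean
from typing import Sequence


def summarize_readings(sensor_id: str, readings: Sequence[float]) -> dict:
    """Collapse a window of raw readings into a single summary record."""
    if not readings:
        # Nothing was collected in this window; report an empty summary.
        return {"sensor_id": sensor_id, "count": 0}
    return {
        "sensor_id": sensor_id,
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


# Hypothetical temperature readings from one streetlight sensor.
window = [21.0, 21.5, 22.0, 21.5]
summary = summarize_readings("streetlight-0042", window)
print(summary)
```

Forwarding one summary per window instead of every raw reading reduces the volume that has to travel to the data center, at the cost of losing the individual data points.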
This can be complemented with a traditional data center that is used for truly centralized storage. In this arrangement, some medium-level processing will likely take place at the intermediate aggregation points, but those points do not yet have the capabilities that traditional data centers offer. That may change in the future as edge computing becomes more commonplace.
For now, 5G technology facilitates beyond-the-edge data stream transactions. 5G stitches together disparate data from sensors, cameras and devices at the edge. Initially, 5G will be most important at the furthest reaches of the network; eventually, the data it carries will travel throughout the entire high-speed wired network.
One Storage Design Does Not Fit All
All of this activity requires a flexible storage architecture that can handle data processing across multiple environments. Processing data at the edge and processing it in a traditional data center are different approaches, each with its own storage requirements.
Storage at the edge tends to have a compact footprint because of the limited space available. Stored data volume is relatively small, as it is mainly used for specific, real-time applications at or near the point of collection. A key question for storage designers to ask is how ephemeral edge-based data is. How long should it be retained for immediate use and when should it be pushed to the core?
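The retention question can be expressed as a simple policy: keep readings locally for a fixed window, then hand anything older off for transfer to the core. The Python sketch below assumes a hypothetical `EdgeBuffer` class, a 300-second window and readings that arrive in timestamp order; real retention periods would be set by the application and by regulation.

```python
from collections import deque


class EdgeBuffer:
    """Hold recent readings locally and release older ones for the core."""

    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        # (timestamp, value) pairs, oldest first. Assumes readings are
        # added in non-decreasing timestamp order.
        self._readings = deque()

    def add(self, timestamp: float, value: float) -> None:
        self._readings.append((timestamp, value))

    def flush_expired(self, now: float) -> list:
        """Remove and return readings older than the retention window."""
        expired = []
        while self._readings and now - self._readings[0][0] > self.retention_seconds:
            expired.append(self._readings.popleft())
        return expired

    def local(self) -> list:
        """Return the readings still held at the edge."""
        return list(self._readings)


buffer = EdgeBuffer(retention_seconds=300)
buffer.add(0, 12.5)
buffer.add(200, 13.0)
buffer.add(400, 13.5)

to_core = buffer.flush_expired(now=450)  # only the reading from t=0 has expired
print(to_core, buffer.local())
```

A shorter window keeps the edge footprint small; a longer one keeps more data close to the applications that use it.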
The destination for most smart city data is a central storage facility typically built on public, private or hybrid cloud infrastructure. Here, massive amounts of information can be aggregated for macro analysis, and as a result, there is a requirement for significant processing power, not just to scale, but to provide performance at scale.
Automated Storage Solutions Scale to Cities’ Needs
Accordingly, the architecture required for both edge and traditional data center storage environments must be robust, built to last and capable of handling vast amounts of data. Legacy storage arrays were not built to accommodate the potential exabytes of data that smart cities will likely consume.
The traditional human-heavy ways of managing vast quantities of data simply will not scale; automation will become key. Automated storage technologies built on open-source software and industry-standard hardware specifications can deliver the required scalability. They can be automatically adjusted based on capacity needs, often making them more cost-effective than traditional storage systems and ideal for cities that are being inundated with data that needs to be processed quickly.
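As a rough illustration of adjusting capacity automatically, the Python sketch below computes how many storage nodes to add so that utilization falls to a target level. The function name, the 70 percent target and the node sizes are assumptions chosen for the example and do not describe any specific storage product.

```python
import math


def nodes_to_add(used_tb: float, node_capacity_tb: float, node_count: int,
                 target_utilization: float = 0.7) -> int:
    """Return how many nodes to add so utilization is at or below the target."""
    # Nodes needed if each one is filled only up to the target utilization.
    required_nodes = math.ceil(used_tb / (node_capacity_tb * target_utilization))
    # Never suggest removing nodes; this sketch only scales out.
    return max(0, required_nodes - node_count)


# Hypothetical cluster: ten 100 TB nodes holding 850 TB of data.
print(nodes_to_add(used_tb=850, node_capacity_tb=100, node_count=10))
```

In this example the cluster needs 13 nodes to stay at or under 70 percent utilization, so the function returns 3. An automated system would run a check like this on a schedule and provision hardware accordingly.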
Why Data Storage Matters in Smart Cities
Storage is just one component of smart city infrastructure, but its complexity and cost often present the largest roadblock to the aspirations of city governments.
One answer is to use automated storage solutions that are massively scalable and compatible with industry-standard hardware infrastructure, and to tailor the architecture to different data storage environments and use cases.
Planning ahead and thinking critically and realistically about the way data is used in urban settings will go a long way to creating cost-effective storage infrastructure that enables a smart city to reach its true potential.