What Is Retrieval-Augmented Generation, and How Are State and Local Agencies Using It?

RAG allows government agencies to infuse generative artificial intelligence models and tools with up-to-date information, creating more trust with citizens.

Phil Goldstein

Phil Goldstein is a former web editor of the CDW family of tech magazines and a veteran technology journalist. He lives in Washington, D.C., with his wife and their animals: a dog named Brenna and two cats, Grady and Princess.

State and local agencies are using generative artificial intelligence for a wide range of tasks, from modernizing old code to intelligently processing documents and improving the citizen experience by augmenting the capabilities of contact center agents.

However, there is a chance that the AI will generate a query response that is factually incorrect. For citizens who are using generative AI tools to answer questions about taxes, government benefits or other high-stakes matters, that could wind up being ruinous.

To mitigate that possibility, agencies are starting to experiment with and even deploy an AI process called retrieval-augmented generation (RAG).

With RAG, the answers users get to queries should be “more accurate, and more specific to what they want to know,” says Amy Hille Glasscock, program director for innovation and emerging issues at the National Association of State Chief Information Officers.

Using RAG, agency IT leaders can ground the models they’re using in agency data to generate more specific answers, and can have their systems look for and retrieve updated information so that answers stay accurate.

For example, if a law or policy changed around taxes or permits, RAG would allow the model to ingest that new information and incorporate it into its answers.

What Is Retrieval-Augmented Generation, Anyway?

If not properly supervised and trained, a generative AI chatbot may attempt to “do something for you that it’s programmed to do, but it’s going to generate an answer that really doesn’t represent the answer that you want,” says Keith Briggs, head of enterprise architecture and innovation at the North Carolina Department of Information Technology (NCDIT).

RAG allows agencies to focus AI models on particular data when responding to queries, increasing the quality of responses. It does this by connecting the large language model an agency is using to external, verified sources of information.

Modern AI chatbots have billions of parameters and are trained on vast amounts of data, but they are “not grounded in the knowledge that you want them to use,” adds Justin Vargas, rapid application development IT director at NCDIT.

One strategy for grounding models is fine-tuning them, Vargas notes, which essentially involves taking an existing model and embedding new information into it. By adding that new information, an IT team can compel the model to use it rather than the information it already knows, Vargas says.

However, there are limitations to such fine-tuning, as the information’s accuracy will still be tied to a static point in time. Fine-tuning is also a complex and time-consuming task, Vargas notes. It is an iterative process in which those managing the model must constantly check whether it is tuned properly.

In contrast, he says, RAG gathers data at the time that the question is being asked. The agency will send the AI model instructions that say, “forget everything, I want you to focus on this block of information that I’m providing you and here’s what I want you to do with that information,” Vargas says. That is the retrieval-augmentation part of the process, in which updated information is retrieved (for example, from a government website) and supplied to the model.

“And then you load that into the same prompts where you’re asking it a question, and then you let it go do its thing,” he says.
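The step Vargas describes, loading retrieved text into the same prompt as the question, can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: the function name `build_rag_prompt`, the instruction wording and the sample passage are all assumptions made for the example.

```python
def build_rag_prompt(question: str, retrieved_passages: list[str]) -> str:
    """Combine retrieved text and the user's question into a single prompt."""
    # Number each passage so a reviewer can trace an answer back to its source.
    context = "\n\n".join(
        f"[Source {i}]\n{passage.strip()}"
        for i, passage in enumerate(retrieved_passages, start=1)
    )
    # Hypothetical instruction text: tell the model to rely on the sources only.
    return (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Placeholder passage standing in for text retrieved from an agency web page.
prompt = build_rag_prompt(
    "When is the filing deadline?",
    ["Placeholder text retrieved from an agency web page."],
)
print(prompt)
```

The resulting string would then be sent to whichever model the agency uses; that call is left out because it depends on the vendor.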

Benefits of RAG for State and Local Governments

There are numerous benefits state and local agencies can gain from using RAG, officials say, including improved accuracy in generative AI tools, which can increase the trust of citizens.

“We have a fundamental business problem in the state of Utah we’re trying to address with artificial intelligence,” says Utah CIO Alan Fuller. “We want to help our employees be more productive. At the same time, the currency of government is trust. So, whatever we do, we need to maintain a high level of trust.”

The Utah Tax Commission, the state’s internal revenue service, has 200 call center agents who answer residents’ tax questions. The state wanted to augment their capabilities with a generative AI assistant, Fuller says, and had a “bake off” with leading AI vendors to test their offerings. The models were grounded on a vast amount of information, including tax law, training materials for call center agents, transcripts of past calls and, using RAG, up-to-date state websites.

Utah then ran a test of the AI chatbots by asking them questions that tax subject-matter experts had come up with, Fuller says. The state rated them on a scale of 1 to 4, with 4 meaning that the chatbot answered the question as well as or better than a knowledgeable help desk agent.

Across the four chatbots tested, at least one of them earned a “4” 92% of the time on average, says Christian Napier, director of AI at the State of Utah Division of Technology Services. The vendors, according to Fuller, said they could do better and expected their numbers to reach at least 99%; the state is doing another round of testing.
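One plausible reading of that metric is the share of test questions on which at least one chatbot earned the top rating. The Python sketch below computes that share under this assumption. The function name and the ratings are invented for illustration; they are not Utah's data or scoring method.

```python
def share_with_top_score(ratings_by_question: list[list[int]], top: int = 4) -> float:
    """Fraction of questions where at least one chatbot earned the top rating."""
    if not ratings_by_question:
        raise ValueError("need at least one rated question")
    hits = sum(1 for ratings in ratings_by_question if max(ratings) >= top)
    return hits / len(ratings_by_question)


# Made-up ratings for illustration only:
# one row per question, one column per chatbot, on the 1-to-4 scale.
sample_ratings = [
    [4, 3, 2, 4],
    [3, 3, 2, 1],
    [2, 4, 4, 3],
    [1, 2, 3, 3],
]
print(share_with_top_score(sample_ratings))  # 2 of 4 questions have a 4 -> 0.5
```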

It’s important for Utah to use RAG, Fuller says, because laws and policies change all the time. In the last legislative session, he estimates, state lawmakers made more than 100 changes to state tax law. RAG can help ensure that citizens can trust generative AI tools and won’t make decisions based on incorrect information, he says.

“You don’t want to have outdated information that you’re grounding to your model, or you’ll get garbage in, garbage out,” he says. RAG helps ensure that outdated information is pulled out of the models the state is using and that they can retrieve the most recent data.

For now, Fuller says, Utah is working to refine its generative AI tools and keep them for internal-facing use cases for state employees. The state wants to avoid a citizen relying on a chatbot that isn’t fully vetted to file their taxes, for example.

However, RAG could help in other areas, Fuller says, such as the state’s IT help desk, or in the Utah Insurance Department, where knowledge bases can be summarized and updated and then fed into AI models to help answer user queries.

“That use case has a lot of legs, and there’s a lot of different places where we can implement something like that,” he says.

RAG also enables state IT teams to hyperfocus a model on specific, usually updated, information when answering questions, Vargas says.

Additionally, RAG allows agencies to get to a higher level of reliability in a less expensive way than other methods, Briggs says.

“It’s much easier, I believe, for a state agency to approach improving the responses through a RAG capability than a fine-tuning capability,” he says. “Because the model that you use and the skill set it takes to fine-tune a model to be responsive is much more complex than establishing a RAG solution.”

Many AI vendors now offer RAG tools along with their models, which allow agencies to identify and build quality data pipelines, Briggs says.

Building and Deploying Your First RAG Pipeline

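Building a RAG pipeline will likely start with the IT team using the RAG tools embedded in whichever AI suite the agency already uses. The pipeline follows a clear, logical flow: documents are ingested, cleaned and split into chunks, converted into embeddings, and stored in a vector database for retrieval when a query comes in.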

Document Ingestion

“The process of document ingestion occurs offline, and when an online query comes in, the retrieval of relevant documents and the generation of a response occurs,” NVIDIA notes in a blog post.

Raw data “from diverse sources, such as databases, documents, or live feeds, is ingested into the RAG system,” the blog notes, and the term “document” could include standard documents, such as PDFs and text files, but also .csv files, emails and more.
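As a rough illustration of offline ingestion, the Python sketch below walks a folder and reads plain-text and .csv files into simple records. The function name `load_documents` and the record layout are assumptions for this example. PDFs, emails and live feeds would each need their own parser and are skipped here.

```python
import csv
from pathlib import Path


def load_documents(folder: str) -> list[dict]:
    """Read every .txt and .csv file under `folder` into plain-text records."""
    documents = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".txt":
            text = path.read_text(encoding="utf-8")
        elif suffix == ".csv":
            with path.open(newline="", encoding="utf-8") as handle:
                # Flatten each row into one line so it can be searched as text.
                text = "\n".join(", ".join(row) for row in csv.reader(handle))
        else:
            continue  # other formats need dedicated parsers
        # Keep the source path so answers can later be traced to a file.
        documents.append({"source": str(path), "text": text})
    return documents
```

Calling `load_documents("some_folder")` returns a list of dictionaries with `source` and `text` keys, ready for cleaning and splitting.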

Document Preprocessing

“When we talk about building a RAG pipeline, it really orients itself around the data,” Briggs says. “You have to make sure you’re organizing the data in a fashion that makes sense.”

Data needs to be cleaned and processed and must be high-quality before being fed into a RAG pipeline, he says.
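What cleaning involves depends heavily on the source data. As one hedged example, the Python sketch below collapses stray whitespace and drops blank and exactly repeated lines, the kind of residue that often comes from web pages. The function name `clean_text` is hypothetical, and dropping repeated lines is an assumption that would not suit every document type.

```python
import re


def clean_text(raw: str) -> str:
    """Collapse whitespace and drop blank or exactly repeated lines."""
    seen = set()
    kept = []
    for line in raw.splitlines():
        # Turn runs of spaces and tabs into a single space.
        line = re.sub(r"\s+", " ", line).strip()
        if not line or line in seen:
            continue
        seen.add(line)
        kept.append(line)
    return "\n".join(kept)
```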

From there, Vargas says, the data must be divided up into chunks in a process known as text splitting, because AI models, just like humans, “have a limited attention span.”

“If you provide it 100 pages of information and then you ask it a specific question, you’re less likely to get the right answer than if you provide it one or two pages of information that you already know are very focused on the answer to the question,” he says.
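Text splitting can be done in many ways. The Python sketch below uses one simple approach, fixed-size word windows that overlap slightly so a sentence cut at a boundary still appears whole in one chunk. The function name and the default sizes are assumptions for illustration, not recommended settings.

```python
def split_text(text: str, chunk_words: int = 200, overlap_words: int = 20) -> list[str]:
    """Split text into chunks of at most `chunk_words` words, with overlap."""
    if chunk_words <= 0:
        raise ValueError("chunk_words must be positive")
    if not 0 <= overlap_words < chunk_words:
        raise ValueError("overlap_words must be at least 0 and less than chunk_words")
    words = text.split()
    step = chunk_words - overlap_words  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

Production tools often split on paragraph or section boundaries instead, which tends to keep related sentences together.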

Generating Embeddings

“Generating embeddings involves converting data into high-dimensional vectors, which represent text in a numerical format,” the NVIDIA blog post notes.

“You’re adding embedding data that gives better semantic reference to what the data is so that when people ask questions or there’s an engagement with the AI, it knows more specifically what to retrieve,” Briggs says.
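To make the idea of text as a vector concrete, the Python sketch below builds a toy embedding by hashing each word into one of 64 slots and normalizing the result. This is a stand-in only: it captures word overlap, not meaning, whereas a real pipeline would call a trained embedding model. The function name `toy_embed` and the dimension count are assumptions.

```python
import hashlib
import math
import re


def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Map text to a unit-length vector by hashing each word into a slot."""
    vector = [0.0] * dims
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # A stable hash keeps results identical across runs and machines.
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        slot = int.from_bytes(digest[:4], "big") % dims
        vector[slot] += 1.0
    norm = math.sqrt(sum(x * x for x in vector))
    # Leave an all-zero vector as is to avoid dividing by zero.
    return [x / norm for x in vector] if norm else vector
```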

From there, the “processed data and generated embeddings are stored in specialized databases known as vector databases,” the NVIDIA post says. “These databases are optimized to handle vectorized data, enabling rapid search and retrieval operations.”

When a query comes in, Vargas says, it is turned into a vector and compared to the vectors of all the data in the database. “You pull back the text that has the vectors in the line closest with a question, and then you are more likely to have the right information to answer the question.”
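The comparison Vargas describes is commonly done with cosine similarity, which scores how closely two vectors point in the same direction. The Python sketch below ranks stored chunks against a query vector using a plain in-memory list. A vector database would perform the same kind of ranking at much larger scale. The function names, chunk labels and three-dimensional vectors are hand-written assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vector, indexed_chunks, k=3):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hand-written vectors for illustration; real embeddings have many more dimensions.
index = [
    ("chunk about permits", [1.0, 0.0, 0.0]),
    ("chunk about income tax", [0.0, 1.0, 0.0]),
    ("chunk about property tax", [0.0, 0.8, 0.6]),
]
results = top_k([0.0, 1.0, 0.1], index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The text of the top-ranked chunks is what gets loaded into the prompt alongside the user's question.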

skynesher/Getty Images

StateTech Magazine
