Three ways agencies can prepare before AI costs skyrocket


These three steps might save you some heartburn.

As a new administration takes the reins in Washington, one of the topics gaining momentum is the desire to leverage artificial intelligence to advance government missions.

There’s been much debate about the appropriate use of AI, how it will affect the government workforce and how to use it ethically. But the consideration that may outweigh all the others is the cost of implementing these solutions.

AI has been used across the federal government for many years, but the cost of implementing AI solutions has escalated dramatically, for a number of reasons. For one, the sheer size of the data sets associated with generative AI models means agencies will have to make sizable investments in compute and data preprocessing to fine-tune and run these large models and get the responses they need from them.

Previously, most government AI implementations focused on discriminative applications such as risk prediction, anomaly detection and natural language processing. Generative AI, and large language models in particular, despite being pretrained, often require large amounts of agency-specific data to be fine-tuned for an agency’s needs. The expanded scope of the data, combined with the much larger size of these deep neural network models, which often run into billions of parameters, translates into significantly higher compute costs than the AI models of just a few years ago.

This in turn can have a multiplier effect on other infrastructure costs such as data storage, preprocessing, governance and management if not appropriately planned in advance. The picture is further complicated by the fact that every use case and every agency’s technology stack is unique. For example, layering a generative AI-powered search capability on top of large swathes of legacy databases has completely different cost dynamics than a document review and summarization use case working on high-velocity transactional data.

In addition, agencies must consider costs associated with hiring personnel who have the skill sets needed to work with these models. Professionals who specialize in deep learning and generative AI are rare and valuable, and software developers or AI engineers at some private sector AI labs may get paid nearly $1 million per year. 

So how can agencies manage and control these escalating costs?

·       First, agencies should develop design approval requests – a detailed analysis that weighs the pros and cons of different architectures and compares the actual production costs each scenario entails. Agencies should do this before they build the solution and deploy it in the cloud. It’s worth noting that many cloud providers offer tools that help agencies model candidate architectures and simulate their costs. By comparing alternative solution architectures based on the amount and type of data as well as the choice of models, agencies can obtain a reasonable estimate of the costs they would incur in real-world scenarios.

·       Second, agencies should add a conscious step of comparing the costs of open-source versus closed-source models. Open-source models can sometimes be run in an agency’s own environment or in a self-hosted environment, which can reduce costs. However, depending on the complexity of the setup, this approach involves higher upfront investment in compute as well as specialized deployment skills. With many closed-source AI platforms, the upfront costs are significantly lower, but their pay-per-use or monthly subscription pricing models may lead to higher operating costs over the longer term. A structured total cost of ownership (TCO) model should be used to assess these trade-offs, as it provides the comprehensive financial analysis that CIOs and CFOs in government agencies need; a simple sketch of such a comparison follows this list. By integrating TCO considerations into AI cost assessments, agencies can better forecast long-term financial implications and make better-informed investment decisions.

·       The third critical component is concurrent and ongoing monitoring. The velocity and volume of an agency’s data may change over time, which often necessitates model retraining. Therefore, it is essential to put in place robust organizational systems and processes in which leaders are held accountable for managing cloud costs and making sure they stay within budget; even a simple spend-projection check of the kind sketched below can serve as a starting point. This will become all the more critical given the new administration's focus on efficiency, and agencies should keep this aspect in mind right from the system design stage.
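
To make that open-versus-closed trade-off concrete, here is a minimal TCO sketch in Python. Every figure in it (hardware cost, monthly operations, token volume, per-token pricing) is a hypothetical placeholder rather than actual vendor pricing; an agency would substitute its own estimates.

```python
# Hypothetical TCO sketch: self-hosted open-source model versus a
# pay-per-use closed-source API. All numbers are placeholder assumptions,
# not real vendor prices or agency workloads.

def self_hosted_tco(months, upfront_hardware=250_000, monthly_ops=20_000):
    """Upfront compute purchase plus ongoing operations and staffing."""
    return upfront_hardware + monthly_ops * months

def api_tco(months, tokens_per_month=2_000_000_000, price_per_million_tokens=15.0):
    """No upfront cost; spend scales with monthly token usage."""
    return months * (tokens_per_month / 1_000_000) * price_per_million_tokens

for horizon in (12, 24, 36):
    hosted, api = self_hosted_tco(horizon), api_tco(horizon)
    cheaper = "self-hosted" if hosted < api else "pay-per-use API"
    print(f"{horizon} months: self-hosted ${hosted:,.0f} vs API ${api:,.0f} -> cheaper: {cheaper}")
```

With these placeholder figures, the pay-per-use option is cheaper over the first year or two, while the self-hosted option pulls ahead by the third year; the crossover point shifts entirely with an agency’s actual usage, hardware and staffing numbers.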
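
For the monitoring step, the sketch below shows one simple way to flag when month-to-date AI spend is on pace to exceed an approved budget. The spend figure, budget and alert threshold are hypothetical; in practice the spend number would come from the cloud provider’s billing or cost-management tooling.

```python
# Hypothetical budget-monitoring sketch: flag when month-to-date AI spend
# is on pace to exceed the approved monthly budget. In practice the spend
# figure would be pulled from the cloud provider's billing data.
import calendar
from datetime import date

def projected_month_end_spend(spend_to_date, today=None):
    """Linearly project month-to-date spend to the end of the month."""
    today = today or date.today()
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month

def check_budget(spend_to_date, monthly_budget, alert_ratio=0.9):
    """Print an alert if projected spend crosses the alert threshold."""
    projected = projected_month_end_spend(spend_to_date)
    if projected >= monthly_budget * alert_ratio:
        print(f"ALERT: projected month-end spend ${projected:,.0f} exceeds "
              f"{int(alert_ratio * 100)}% of the ${monthly_budget:,.0f} budget")
    return projected

# Example with placeholder numbers: $48,000 spent so far against an $80,000 budget.
check_budget(spend_to_date=48_000, monthly_budget=80_000)
```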

AI presents incredible opportunities for government agencies to improve how they operate, achieve their mission goals and deliver services to citizens more efficiently. But in the scramble to capture these benefits, agencies must not lose sight of the need for adequate strategizing and planning to guard against the very real possibility of runaway AI costs.

Ramakrishnan Krishnamurthy is REI's Data Analytics Lead and Anand Trivedi is REI's AI Lead.