New purpose-designed chips poised to boost generative AI development
Ethical use and computing power are two of the top concerns in getting AI into government service.
No advancement in recent years has drawn as much interest from both government and the private sector as generative AI. The newest generative AI platforms, such as OpenAI's GPT-4-powered ChatGPT, can not only examine existing information to solve problems or provide analysis, but also generate new information, much like a human brain.
But there are a few potential roadblocks along the way, especially for getting generative AI into government service. The top two concerns are using the new technology ethically and securing the massive amount of computing power required to support generative AI development, or to sustain a program once deployed.
On the ethical side, governments around the world are now attempting to regulate the technology or provide ethical guardrails for its use. Concerns about generative AI even led the White House to release an executive order outlining how agencies should — and shouldn’t — employ the new technology.
Following the release of the executive order, I moderated a discussion among federal and private sector CTOs about their plans for using generative AI. Everyone in attendance was still extremely excited about what generative AI could do in government service, and confident that it could be used ethically while also fully complying with the executive order.
“On the pragmatic side, we see huge potential for using generative AI, and USAID has been funding some research and programs around the world on gen AI,” said Sukhvinder Singh, chief technology officer and responsible AI official for the U.S. Agency for International Development. “In just one example, we deployed portable X-ray machines equipped with an AI capability to better detect tuberculosis.”
In the future, Singh hopes to use generative AI to predict events such as disease or pandemic outbreaks, allowing USAID to position resources proactively rather than reacting to each new crisis after it happens.
Balaji Subramaniam, director of information technology for the Transportation Security Administration, agreed that the executive order is a strong framework for the responsible use of generative AI in government, but said that public-facing agencies like the TSA need to go even further to ensure its safe use. That includes training their workforces, demonstrating what the new technology can do, and showing how to use it safely and efficiently to accomplish agency missions.
“We have to make sure that our workforce is properly trained and educated on what generative AI is, what the prompt engineering is designed to do and what kind of a product they will ultimately be producing,” Subramaniam said.
Generating computing power for generative AI
Ethical concerns aren’t the only obstacle to getting generative AI into government service. While the software behind the new AI platforms is the heart of the technology, there is also a substantial hardware requirement. Training the large language models that underpin generative AI can take months on standard computing hardware, even when entire data centers are devoted to the task. After that, nearly as much computing power is needed for the trained models to process user prompts and requests quickly.
That need has given rise to an entirely new class of computer chip designed specifically for AI. Based on standard GPUs, these chips feature enhanced and expanded memory built to process AI-related tasks as quickly as possible. The most popular right now is NVIDIA's H100, an impressively powerful chip with memory bandwidth rated at 3.35 terabytes per second and a total memory capacity of 80GB. But even that is not enough for the most advanced generative AI applications, especially for developers who don’t want to wait months every time they need to tweak their models.
Driven by the need for even more powerful hardware, NVIDIA just announced a brand new chip dedicated to AI, the H200. The new H200s are designed to cut AI processing times by 50% or more while also reducing power consumption.
The H200's design is similar to the H100's, but it upgrades the memory considerably. NVIDIA says the two chips are compatible enough that data centers can add H200s to increase their AI computing capacity without first uninstalling their older H100s. Each H200 supports 4.8 terabytes per second of memory throughput and has 141GB of total memory.
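A quick back-of-envelope comparison of the figures cited above shows how much headroom the memory upgrade adds on paper. This is only a rough illustration; real-world speedups depend heavily on the workload and the rest of the system.

```python
# Paper comparison of the H100 and H200 specs cited in this article.
# Actual AI training or inference gains will vary by workload.

H100 = {"bandwidth_tb_s": 3.35, "memory_gb": 80}
H200 = {"bandwidth_tb_s": 4.8, "memory_gb": 141}

bandwidth_gain = H200["bandwidth_tb_s"] / H100["bandwidth_tb_s"]
memory_gain = H200["memory_gb"] / H100["memory_gb"]

print(f"Memory bandwidth: {bandwidth_gain:.2f}x the H100")  # ~1.43x
print(f"Memory capacity:  {memory_gain:.2f}x the H100")     # ~1.76x
```

Because memory bandwidth and capacity are frequent bottlenecks for large language models, those ratios are the main reason fewer H200s can do the work of a larger pool of H100s.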
With the pending release of the H200 chip, the only problem on the hardware side might be getting the H200 into the data centers and laboratories of the private sector organizations and government agencies working on AI. Dedicated AI chips like the current H100 are in terribly short supply right now, with individual chips selling for $25,000 or more. Given that it takes hundreds of such chips to support generative AI development, the price of entry is steep for most organizations, even if a supply of the chips can be located.
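To put that price of entry in perspective, here is a minimal sketch using the figures cited above. The chip count is a hypothetical example chosen to represent "hundreds" of chips, not a quoted requirement.

```python
# Rough entry-cost illustration based on the reported street price.
# The chip count below is a hypothetical assumption for illustration.

price_per_chip = 25_000   # reported price per H100-class chip, USD
chips_needed = 400        # illustrative stand-in for "hundreds" of chips

hardware_cost = price_per_chip * chips_needed
print(f"Estimated hardware outlay: ${hardware_cost:,}")  # $10,000,000
```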
The new H200 chips might alleviate the situation, since fewer chips would be required to get similar results, and because H200s can be added to existing infrastructure already populated with H100s to boost capacity. Because of that and other factors, Sam Altman, the head of OpenAI (the company behind ChatGPT), believes the acute supply problem will likely ease next year.
Assuming NVIDIA can produce enough H200s to meet skyrocketing demand, the new chips could go a long way toward supercharging generative AI development, cutting training times and model adjustments from months to weeks. With the hardware problem finally getting solved and ethical guardrails in place for government use, generative AI looks to have a bright future in federal service.
John Breeden II is an award-winning journalist and reviewer with over 20 years of experience covering technology. He is the CEO of the Tech Writers Bureau, a group that creates technological thought leadership content for organizations of all sizes. Twitter: @LabGuys