Earlier this month, The Wall Street Journal reported that roughly one-third of U.S. nuclear power plants are in talks with technology companies to supply electricity for new data centers. Meanwhile, Goldman Sachs predicts that AI will drive a 160% increase in data center electricity use by 2030, which would more than double current carbon emissions. Each ChatGPT query is estimated to consume at least 10 times more energy than a Google search. The question is whether the exponentially growing cost of training AI models will ultimately limit AI’s potential.
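To put the per-query figure in rough context, here is a hedged back-of-the-envelope sketch in Python. The absolute per-query energy value and the query volume are assumptions (commonly cited estimates, not figures from the reporting above); only the “at least 10 times” ratio comes from the article.

```python
# Back-of-envelope comparison of per-query energy use.
# The absolute figures below are assumptions (roughly 0.3 Wh per traditional
# search is a commonly cited estimate); only the "at least 10x" ratio
# comes from the article.

GOOGLE_SEARCH_WH = 0.3            # assumed energy per traditional search, in watt-hours
CHATGPT_MULTIPLIER = 10           # "at least 10 times more energy" per the article
QUERIES_PER_DAY = 1_000_000_000   # hypothetical daily query volume

chatgpt_wh = GOOGLE_SEARCH_WH * CHATGPT_MULTIPLIER
extra_mwh_per_day = (chatgpt_wh - GOOGLE_SEARCH_WH) * QUERIES_PER_DAY / 1e6

print(f"Assumed energy per AI query: {chatgpt_wh:.1f} Wh")
print(f"Extra energy if {QUERIES_PER_DAY:,} searches became AI queries: "
      f"{extra_mwh_per_day:,.0f} MWh per day")
```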
At VB Transform 2024, a panel led by Hyunjun Park, co-founder and CEO of CATALOG, tackled this topic. To discuss the scope of the problem and potential solutions, Park welcomed Dr. Jamie Garcia, Director of Quantum Algorithms and Partnerships at IBM, Paul Roberts, Director of Strategic Accounts at AWS, and Kirk Bresniker, Chief Architect at Hewlett Packard Labs and HPE Fellow and VP, to the stage.
Unsustainable resources and inequitable technology
“A 2030 timeframe is far enough away that we can course correct, but at the same time it’s real enough that we need to consider the impact of what we’re doing right now,” Bresniker said.
He added that the resources needed to train a single model a single time are projected to exceed global IT spending by around 2030 and U.S. GDP between 2029 and 2031. In other words, we are heading toward a hard ceiling, and it’s time to make decisions, and not only because the costs will be prohibitive.
“Because inherent in the question of sustainability is also equity,” he explained. “If it’s unsustainable, it’s inherently inequitable. So, given that we have this amazing technology that is going to be pervasive and, we hope, available to everyone, we need to ask ourselves: what can we do? What do we need to change? What has to change dramatically to make this technology available to everyone?”
The role of corporate responsibility
Some companies are taking responsibility for the looming environmental toll and working to head off the financial one. On the carbon front, AWS has been charting a path toward more responsible usage and sustainability, and is now looking to deploy the latest liquid cooling solutions for Nvidia hardware, among other measures.
“We’re looking at lower-carbon steel and concrete to reduce our carbon footprint,” Roberts explained, “plus we’re looking at alternative fuels: not just traditional diesel for the generators, but hydrotreated vegetable oil (HVO) and other options.”
The company is also pushing alternative chips: it has released its own training silicon, Trainium, which it says is many times more efficient than alternative options, and its inference chip, Inferentia, which delivers more than 50% better performance per watt than existing alternatives to help bring down the cost of inference.
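To make the performance-per-watt claim concrete, here is a hedged arithmetic sketch. The baseline throughput and power figures are invented for illustration; only the roughly 1.5x performance-per-watt ratio comes from the article.

```python
# What "more than 50% better performance per watt" implies for inference energy.
# The baseline throughput and power figures are assumptions for illustration;
# only the ~1.5x performance-per-watt ratio comes from the article.

BASELINE_TOKENS_PER_SEC = 1_000   # assumed throughput of a baseline accelerator
BASELINE_WATTS = 300              # assumed power draw of that accelerator
PERF_PER_WATT_GAIN = 1.5          # ">50% better performance per watt"

def wh_per_million_tokens(tokens_per_joule: float) -> float:
    """Energy to serve one million tokens, in watt-hours."""
    joules_per_token = 1.0 / tokens_per_joule
    return joules_per_token * 1_000_000 / 3_600  # joules -> watt-hours

baseline_tokens_per_joule = BASELINE_TOKENS_PER_SEC / BASELINE_WATTS
improved_tokens_per_joule = baseline_tokens_per_joule * PERF_PER_WATT_GAIN

print(f"Baseline: {wh_per_million_tokens(baseline_tokens_per_joule):.0f} Wh per 1M tokens")
print(f"Improved: {wh_per_million_tokens(improved_tokens_per_joule):.0f} Wh per 1M tokens")
```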
The company’s second-generation UltraCluster network, used for model training and pre-training, supports up to roughly 20,000 GPUs and delivers around 10 petabits per second of network throughput, with less than 10 microseconds of latency within the same spine, a 25% reduction in overall latency. The end result is that more models can be trained faster and at lower cost.
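For a sense of what those aggregate figures imply per accelerator, here is a rough arithmetic sketch; spreading the quoted throughput evenly across GPUs is my simplifying assumption, not an AWS specification.

```python
# Rough per-GPU view of the quoted UltraCluster figures.
# The aggregate numbers come from the article; dividing them evenly across
# GPUs is a simplifying assumption for illustration only.

GPUS = 20_000            # "up to roughly 20,000 GPUs"
AGGREGATE_PBPS = 10      # ~10 petabits per second of network throughput
SPINE_LATENCY_US = 10    # <10 microseconds within the same spine

per_gpu_gbps = AGGREGATE_PBPS * 1e6 / GPUS   # 1 Pb/s = 1e6 Gb/s
print(f"~{per_gpu_gbps:,.0f} Gb/s of network bandwidth per GPU "
      f"at <{SPINE_LATENCY_US} us spine latency")
```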
Can quantum computing change the future?
Garcia’s research focuses on how quantum and AI interact with each other, and the results are promising. Quantum computing could bring resource savings and speed benefits. Garcia says quantum machine learning can be used for AI in three ways: quantum models on classical data, quantum models on quantum data, and classical models on quantum data.
“There are different theoretical proofs in each category that show there are advantages to using quantum computers to tackle these different areas,” Garcia says, “for example, when you have limited training data, when you have very sparse data, or when you have very interconnected data. One area we see as very promising here is healthcare and life sciences: anything with a quantum mechanical nature that you need to tackle.”
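As a concrete illustration of the first of those categories, a quantum model applied to classical data, here is a minimal sketch using Qiskit’s machine-learning extension. The feature map, kernel, classifier and toy dataset are assumptions chosen for illustration, not anything the panel described.

```python
# Minimal sketch: a quantum-kernel classifier on a small classical dataset.
# Assumes qiskit and qiskit-machine-learning are installed; the feature map,
# kernel and toy data are illustrative choices, not the panel's method.
import numpy as np
from qiskit.circuit.library import ZZFeatureMap
from qiskit_machine_learning.kernels import FidelityQuantumKernel
from qiskit_machine_learning.algorithms import QSVC

# Tiny synthetic two-feature dataset (the "limited training data" regime).
rng = np.random.default_rng(seed=7)
X = rng.uniform(0, np.pi, size=(20, 2))
y = (X[:, 0] + X[:, 1] > np.pi).astype(int)

# Encode classical features into a parameterized quantum circuit, then use
# state fidelities as the kernel for a support vector classifier.
feature_map = ZZFeatureMap(feature_dimension=2, reps=2)
kernel = FidelityQuantumKernel(feature_map=feature_map)
qsvc = QSVC(quantum_kernel=kernel)

qsvc.fit(X, y)
print("Training accuracy:", qsvc.score(X, y))
```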
IBM is actively researching the vast potential of quantum machine learning, and applications are already emerging in fields such as life sciences, industrial use cases and materials science. IBM researchers are also developing Watsonx Code Assistant to help users who are new to quantum computing take advantage of quantum computers in their applications.
“We’re using AI to help with this, helping people optimize their circuits and define their problems in a way that a quantum computer can solve them,” she explained.
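To ground what optimizing a circuit looks like in practice, here is a minimal sketch using Qiskit’s standard transpiler; this is conventional rule-based optimization rather than the AI assistant itself, and the toy circuit is an assumption for illustration.

```python
# Sketch: conventional circuit optimization with Qiskit's transpiler.
# Not the AI assistant described above, just the kind of result (fewer gates,
# shallower depth) that such tooling targets. Toy circuit is illustrative.
from qiskit import QuantumCircuit, transpile

qc = QuantumCircuit(3)
qc.h(0)
qc.cx(0, 1)
qc.cx(1, 2)
qc.cx(1, 2)   # redundant pair the optimizer can cancel
qc.h(0)
qc.h(0)       # another redundant pair

optimized = transpile(qc, optimization_level=3)
print("Original depth:", qc.depth(), "-> optimized depth:", optimized.depth())
```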
The solution, she added, will be a combination of bits, neurons and qubits.
“It’s going to be the CPU, GPU and QPU working together, with different parts of the workflow divided among them,” she says. “We need to push quantum technology to the point where it can run the circuits we’re talking about. Once we get to that point, we think we’ll see exponential speedups in some cases and polynomial speedups in others. But the potential of the algorithms is very promising to us.”
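As a loose sketch of that division of labor, the hypothetical Python below tags each stage of a workflow with the resource class it suits; the stage names, routing and placeholder functions are invented for illustration, not IBM’s architecture.

```python
# Hypothetical sketch of a heterogeneous workflow: each stage is tagged with
# the resource class it suits best (CPU, GPU or QPU). Stages and routing are
# invented for illustration only.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Stage:
    name: str
    target: str                    # "cpu", "gpu" or "qpu"
    run: Callable[[Dict], Dict]

def preprocess(state: Dict) -> Dict:          # classical data wrangling
    return {**state, "cleaned": True}

def train_surrogate(state: Dict) -> Dict:     # dense linear algebra
    return {**state, "model": "surrogate-v0"}

def quantum_subroutine(state: Dict) -> Dict:  # quantum-native piece of the problem
    return {**state, "quantum_result": "placeholder"}

pipeline = [
    Stage("preprocess", "cpu", preprocess),
    Stage("train", "gpu", train_surrogate),
    Stage("sample_hard_term", "qpu", quantum_subroutine),
]

state: Dict = {}
for stage in pipeline:
    print(f"Dispatching '{stage.name}' to the {stage.target.upper()}")
    state = stage.run(state)
print("Final state:", state)
```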
But before quantum can come to the rescue, its infrastructure requirements pose a challenge, including further reducing power consumption and improving component engineering.
“The infrastructure requirements for quantum require a lot of physics research,” she explained. “For me, the real challenge is getting the vision of all three working together to solve problems in the most resource-efficient way.”
Choices and hard ceilings
“The most important thing is radical transparency – going back through the supply chain and giving decision makers a deep understanding of the sustainability, energy, privacy and security properties of every technology we employ, so they can understand the true cost,” Bresniker said. “This will allow us to calculate the true benefits of these investments. Right now, we have experts in each field talking to companies about adoption, but we don’t necessarily enumerate what it takes to actually integrate these technologies successfully, sustainably and equitably.”
And part of it is about choice, Roberts said. The horse has already left the barn: more and more organizations will be leveraging LLMs and gen AI. The opportunity is to choose the performance characteristics best suited to your application, rather than consuming resources indiscriminately.
“From a sustainability and energy perspective, you have to think about what is the use case you’re trying to enable with your particular application and model, and what silicon you’re using to run that inference,” he said.
You can choose the host, or even specific applications or tools that abstract away the underlying details of the use case.
“The reason this is important is because it gives you choice, it gives you a lot of control, and it allows you to choose the most cost-effective and optimal deployment for your application,” he said.
“You put more data, more energy, more water, more people into it and the model gets bigger, but is it actually better for the enterprise? That’s the real question of enterprise adoption,” Bresniker added. “If we continue like this, we’re going to hit a hard ceiling. If we start having conversations with that understanding and start to push back, asking for more transparency, asking where the data comes from, how much energy goes into the model and whether there are alternatives, then maybe a few smaller models are better than one monolithic monoculture. We’re going to address the monoculture before we hit the ceiling.”