It’s no secret we've stumbled upon a formidable obstacle to our AI-powered future: the staggering energy consumption of our current models.
Industry leaders are scrambling to come up with short-term answers to ensure they don’t miss the wave, pouring capital into ambitious infrastructure efforts.
And while that plays out at the big kids’ table, a flock of new startups is building on the progress of recent years, rethinking the fundamentals in search of a long-term solution.
One that doesn’t require hundreds of millions of dollars in infrastructure investment.
Operating large language models in their current form is an energy-intensive process, and that energy demand is rapidly approaching unsustainable levels. Training a single AI model can emit as much carbon as five cars over their entire lifetimes. This isn’t just an environmental concern; it’s a scalability nightmare threatening to derail the AI revolution before it fully takes flight.
And as the industry pushes for more advanced AI capabilities, this energy consumption is set to skyrocket. That’s a problem not only at the operational level but in the bigger picture, as industry leaders like Google have reported sharp increases in emissions driven largely by AI workloads.
The solution may be rather simple: smarter, smaller, more efficient models built for a set of specific purposes.
Narrowing the scope, if you will.
One such example is the open-source Aria model built by Rhymes, which employs selective parameter activation. While Aria boasts a total of 25.3 billion parameters, it activates a mere 3.9 billion for any given task. Traditional dense models like GPT-3 activate all of their parameters for every task, regardless of complexity; Aria’s approach is more like a surgeon using only the instruments a specific procedure calls for. Most surgeons would tell you they don’t need to deploy the entire operating room’s worth of equipment for every operation.
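In mixture-of-experts terms, that selectivity comes from a router that sends each input to only a handful of expert sub-networks, leaving the rest of the parameters idle. Here is a minimal sketch of top-k routing in Python; the expert count, dimensions, and top-2 choice are hypothetical illustrations, not Aria’s actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: 8 experts, but only the top-2 are
# activated per input, so most parameters stay idle on each forward pass.
# (Illustrative sketch only -- sizes and top-k are hypothetical, not Aria's.)
NUM_EXPERTS, TOP_K, D = 8, 2, 16

router_w = rng.standard_normal((D, NUM_EXPERTS))           # routing weights
experts = [rng.standard_normal((D, D)) for _ in range(NUM_EXPERTS)]

def moe_forward(x):
    """Route input vector x to its top-k experts and mix their outputs."""
    logits = x @ router_w
    top = np.argsort(logits)[-TOP_K:]                      # chosen expert indices
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top)), top

x = rng.standard_normal(D)
y, chosen = moe_forward(x)

# Only the chosen experts' weights (plus the router) are touched per input.
active = TOP_K * D * D + D * NUM_EXPERTS
total = NUM_EXPERTS * D * D + D * NUM_EXPERTS
print(f"experts used: {sorted(chosen.tolist())}")
print(f"active parameters: {active} of {total} ({active / total:.0%})")
```

The ratio printed at the end is the point of the design: compute per input scales with the active parameters, not the total, which is how a 25-billion-parameter model can run with the cost profile of a model a fraction of that size.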
Rhymes has put this into practice with BeaGo, which it calls “a smarter, faster AI search.” In my tests, BeaGo’s results were indistinguishable from those of competing products like Perplexity and other more energy- and time-intensive tools.
But it’s about more than simply narrowing the scope: the startup has built a multimodal, open-source mixture-of-experts model that intelligently sorts and manages large, long-context data of all types, including text, video, and images.
Rhymes’ solution may be lighting the pathway for AI in 2025 and beyond, all without the hundreds of millions of dollars in infrastructure spending.
In the end, the work of companies like Rhymes is a reminder that just because we’ve found something that works doesn’t mean the task of innovating is over. While Microsoft and Google run with our existing large language models, working to productize AI and bring it to the mass market, others must keep building something even better.
I’m heartened by the startup-driven approach I see here at the end of 2024: combining multimodal capabilities, selective parameter activation, and open-source collaboration offers a blueprint for achieving an AI that both works and works for the planet.