I have been thinking a lot about the competition between OpenAI, Anthropic, Meta, and Google for who has the best pinnacle AI model.
I think it comes down to 4 key areas.
The Model Itself
Post-training
Internal Tooling
Agent Functionality
Let’s look at each of these.
The model is obviously one of the most important components because it's the base of everything.
So here we’re talking about how big and powerful the base model is, e.g., the size of the neural net. This is a competition around training clusters, energy requirements, time requirements, etc. And with each generation (e.g., GPT-3→4→5), it gets drastically harder to scale.
So it’s largely a resources competition there, plus some smart engineering to use those resources as efficiently as possible.
But a lot of people are figuring out now that it’s not just the model that matters. The post-training of the model is also super key.
Post-training refines and shapes model knowledge to enhance its accuracy, relevance, and performance in real-world applications.
I think of it as a set of highly proprietary tricks that magnify the overall quality of the raw model. Another way to think of this is to say that it’s a way to connect model weights to human problems.
I’ve come to believe that post-training is pivotal to the overall performance of a model, and that a company can potentially still dominate if they have a somewhat worse base model but do this better than others.
I’ve been shouting from the rooftops for nearly two years that there is likely massive slack in the rope, and that the stagnation we saw in 2023 and 2024 around model size will be massively leapfrogged by these tricks.
Post-training is perhaps the most powerful category of those tricks. It’s like teaching a giant alien brain how to be smart, when it had tremendous potential before but no direction.
The model itself might be powerful, but it’s unguided. Post-training teaches the model about the types of real-world problems it will have to work on, and makes it better at solving them.
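To make that concrete, here’s a toy sketch of one common post-training step, instruction tuning. The data is invented for illustration; real post-training pipelines are far larger and mostly proprietary.

```python
# Toy illustration of instruction tuning, one common post-training step.
# The base model only learned next-token prediction; pairs like these
# teach it to map human instructions to useful answers. Data is invented.
sft_examples = [
    {"prompt": "Summarize: The meeting moved to 3pm.",
     "response": "Meeting rescheduled to 3pm."},
    {"prompt": "Translate to French: hello",
     "response": "bonjour"},
]

# In real post-training these pairs are tokenized and the model is
# fine-tuned to maximize the likelihood of each response given its prompt.
for ex in sft_examples:
    print(ex["prompt"], "->", ex["response"])
```

The point is the shape of the work: you’re not making the brain bigger, you’re showing it what humans actually want from it.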
So that’s the model and post-training, which are definitely the two most important pieces. But tooling matters as well.
What we’re seeing in 2024 is that the connective tissue around an AI model really matters. It makes the models more usable. Here are some examples:
High-quality APIs
Larger context sizes
Haystack performance
Strict output control
External tooling functionality (functions, etc)
Trust/Safety features
Mobile apps
Prompt testing/evaluation frameworks
Voice mode on apps
OS integration
Integrations with things like Make, Zapier, and n8n
Anthropic’s prompt caching
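A couple of these, like strict output control and function calling, boil down to the same idea: the model’s reply has to conform to a schema before your code can safely act on it. Here’s a minimal sketch of that pattern using only the standard library; the field names and the simulated reply are hypothetical, not any provider’s actual API.

```python
import json

# Hypothetical schema for a tool call's arguments -- the shape we want
# the model's JSON output to match. Field names are illustrative only.
EXPECTED_FIELDS = {"city": str, "unit": str}

def validate_tool_call(raw: str) -> dict:
    """Parse a model's JSON reply and enforce the expected fields/types.

    Raises ValueError if the output doesn't conform, so the caller can
    retry or re-prompt instead of acting on malformed output.
    """
    data = json.loads(raw)
    for field, ftype in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], ftype):
            raise ValueError(f"bad type for {field}")
    return data

# Simulated model output (in practice this comes back from the API).
reply = '{"city": "Paris", "unit": "celsius"}'
args = validate_tool_call(reply)
print(args["city"])  # Paris
```

Providers increasingly do this validation server-side (that’s what “strict output control” buys you), but the workflow on your end is the same: schema in, conforming JSON out.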
Just like with post-training, these things aren’t as important as the model itself, but they matter because things are only useful to the extent that they can be used.
So, Tooling is about the integration of AI functionality into customer workflows.
Next, let’s talk about Agents.
Right now AI Agent functionality is mostly externally developed and integrated. There are projects like CrewAI, AutoGen, LangChain, LangGraph, etc., that do this with varying levels of success.
But first—real quick—what is an agent?
❝
An AI agent is an AI component that interprets instructions and takes on more of the work in a total AI workflow than just the LLM response, e.g., executing functions, performing data lookups, etc., before passing on results.
Real-world AI Definitions
So basically, an AI Agent is something that emulates giving work to a human who can think, adjust to the input given, and intelligently do things for you as part of a workflow.
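That loop of think, act, observe, repeat can be sketched in a few lines. Everything here is a stand-in: `fake_model` plays the role of the LLM deciding the next step, and the single `lookup` tool stands in for search, database queries, and so on.

```python
# Minimal sketch of an agent loop. `fake_model` is a stub standing in
# for an LLM; it is not any real provider's API.

def fake_model(goal, observations):
    """Stand-in for the LLM: pick the next action given what we know."""
    if not observations:
        return {"action": "lookup", "arg": goal}
    return {"action": "finish", "arg": f"Answer based on: {observations[-1]}"}

# Hypothetical tool registry -- in practice: search, code execution, DBs.
TOOLS = {
    "lookup": lambda q: f"data for '{q}'",
}

def run_agent(goal: str) -> str:
    """Loop: ask the model for a step, execute it, feed the result back."""
    observations = []
    while True:
        step = fake_model(goal, observations)
        if step["action"] == "finish":
            return step["arg"]
        # Execute the chosen tool and record the observation.
        observations.append(TOOLS[step["action"]](step["arg"]))

print(run_agent("capital of France"))
```

Frameworks like LangChain and CrewAI are, at their core, more elaborate versions of this loop plus tool registries and state management.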
I think the future of Agent functionality is to have it deeply integrated into the models themselves. Not in the weights, but in the ecosystem overall.
In other words, we soon won’t be writing code that creates an Agent in LangChain or something, which then calls a particular model and returns the results to the agent.
Instead, we’ll just send our actual goal to the model itself, and the model will figure out which parts need agents to be spun up, using which tools (like search, planning, writing, etc.), and it’ll just go do it and give you back the result when it’s done.
This is part of the larger ecosystem story: taking pieces that are currently external (Agent Frameworks) and bringing them into the native model ecosystem.
We should start thinking about top AI models as Model Ecosystems rather than just models because it’s not just the neural net weights doing the work.
There are four main components to a Model Ecosystem: the Model itself, Post-training, Internal Tooling, and Agent functionality.
#1 (The model) is the most well-known piece, and it’s largely judged by its size (billions of parameters).
#2 (Post-training) is all about teaching that big model how to solve real-world problems.
#3 (Internal Tooling) is about making it easier to use a given model.
#4 (Agent functionality) emulates human intelligence, decision-making, and action as part of workflows.
The company that wins the AI Model Wars will need to excel at all four of these, not just spend lots of money to have the neural net with the most parameters.
Thanks to Jai Patel for informing many thoughts on this, especially around pre-training.