I've been skeptical about AGI. I finally believe it's within reach
tl;dr: we are one 50x improvement away from AGI.
It feels strange to write this, but I think we're at the stage where the real question is what, exactly, stands between us and something we could reasonably call artificial general intelligence (AGI).
Now, there isn't a widely accepted definition of AGI, and many influential voices in the field have incentives to shape it to fit their own narratives. For instance, Microsoft risks losing access to OpenAI's most advanced models if AGI is achieved (though this arrangement might change soon).
Here's an interesting way to think about AGI: compare it to a human worker. Maybe we've reached AGI when AI can do all the thinking parts of a real job. Not the physical parts - just the mental work.
Take translators. AI has basically replaced them already. ChatGPT can translate well enough that most people don't need to hire human translators anymore.
In this case, AI matches human intelligence for the core task. It can translate as well as a person can. Of course, this doesn’t mean that all translators are out of a job. Sometimes we hire translators for other reasons. With legal documents, we need someone to check the work and take responsibility if there are mistakes. This is about liability, not intelligence. We don't need AI to replace these trust-based parts of the job before we can call it AGI.
Now, for AI to be called 'general', it needs to adapt to different tasks the way humans do. Think about how flexible you are during a normal day:
You can write an important email
Then jump into a spreadsheet to analyze numbers
Then explain a complex idea to a colleague
Then plan next month's schedule
We naturally switch between different kinds of thinking.
Today's AI systems are surprisingly capable at replacing many human jobs. They can be a decent sales rep. An OK copywriter. A better-than-average data analyst.
They can do all this right away, with minimal instruction. It's like having a new employee who shows up already knowing how to do most of their job. You just need to point them in the right direction, and they'll produce good work.
In fact, today's AI can do things that most humans can't do simultaneously:
Write code in multiple programming languages (performing at the level of the top 0.1% of competitive programmers)
Translate between dozens of languages
Explain complex topics like quantum physics in simple terms
And if you think the average human can do all of these at the same time, you have a skewed perception of what the average human is capable of.
In fact, based on OpenAI's announcements about o3, it seems that ANY reasoning task can be learned by AI. I had to pause and take a deep breath when I first understood what this means.
So have we reached AGI? I don't think so.
When you hire someone, you expect them to get better at their job over time. Each mistake helps them grow. Each interaction improves the next one. This is what AI can't do yet.
It's not that AI isn't getting better. It's probably the most rapidly developing technology of our lifetime.
But it does not gain experience.
"But wait - companies train AI on our conversations!" That's true, but it's not the same thing. That's more like replacing your employee with a slightly better one. It's not your employee learning the specifics of how to get their job done.
And if you have been at a job for more than a year, you know how much value there is in the latter.
When I hire a salesperson, I don't expect them to be very productive at first. They will probably make mistakes and fumble a lot. But over time, they will develop expertise about our products and our clients. They will build an intuition for which message works when a client has a particular problem. They will learn how to frame an offer so that a prospect wants to buy now.
The AI industry has developed several approaches to help AI remember and learn. You might have heard some of these terms: Fine-Tuning, RAG, longer context windows, and most recently, Reinforcement Fine-Tuning (RFT). Let me explain why none of these really solve our core problem.
I imagine RAG as something like a notepad you keep checking while talking. It helps you remember facts, but that's not really how you learn things. If someone asks me about the economic impact of the Panama Canal, I can look it up on Wikipedia and try to reason about it. I might even offer a fresh perspective. But my analysis wouldn't be as good as that of a trade economist who's studied shipping routes for years.
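To make the notepad analogy concrete, here's a minimal sketch of the RAG pattern. Everything in it is illustrative: the keyword-overlap retrieval stands in for a real vector search, and `call_llm` is a hypothetical stub, not any actual API.

```python
# Toy RAG: look up the most relevant note, staple it to the prompt.
notes = [
    "The Panama Canal opened in 1914, cutting the sea route between "
    "the Atlantic and the Pacific by thousands of kilometres.",
    "Container shipping took off in the 1960s and reshaped global trade.",
]

def retrieve(question: str, docs: list[str]) -> str:
    # score each note by how many words it shares with the question
    # (a crude stand-in for embedding similarity)
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def call_llm(prompt: str) -> str:
    # hypothetical stand-in for a real model call; it just echoes the prompt
    return f"[the model would answer using]\n{prompt}"

def answer(question: str) -> str:
    note = retrieve(question, notes)  # the "notepad" lookup
    return call_llm(f"Note: {note}\n\nQuestion: {question}")

print(answer("What was the economic impact of the Panama Canal?"))
```

The point of the sketch: the knowledge lives in the notes, not in the model. Delete the note and the "expertise" vanishes - nothing was actually learned.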
Context windows - this is like telling you everything about World War II right before asking you to analyze its economic impact. Google's model can handle 2 million tokens - about 8 novels' worth of text! But its ability to reason over that much information at once is still weak. You can't think clearly with that much information in your head at once. And this approach is incredibly expensive.
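For intuition, here's a toy sketch of what "just put everything in the context" amounts to. The token heuristic is a crude rule of thumb, not any real tokenizer, and the 2-million figure is what Google has announced, not something I've measured.

```python
# Context stuffing: concatenate documents into one giant prompt until
# the window is full; whatever doesn't fit is invisible to the model.
MAX_TOKENS = 2_000_000  # Gemini-class context window

def rough_token_count(text: str) -> int:
    # crude heuristic: roughly 1.3 tokens per English word
    return int(len(text.split()) * 1.3)

def build_prompt(question: str, documents: list[str]) -> str:
    context = ""
    for doc in documents:
        if rough_token_count(context + doc) > MAX_TOKENS:
            break  # out of room
        context += doc + "\n"
    return context + "\nQuestion: " + question
```

And every one of those tokens gets processed on every single request, which is where the cost comes from.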
Fine-tuning is closer to real learning - you show the AI examples and it adjusts its behavior. But it's like memorizing flashcards before an exam. You might remember that A leads to B, but you don't understand why. And you need hundreds of examples to learn what a human could grasp from just a few.

RFT (Reinforcement Fine-Tuning) is different. Instead of just memorizing answers, the AI remembers how it reached those answers. This is much closer to how humans learn. When you solve a math problem, you don't just remember the answer - you remember the steps that got you there. That's what makes you better at solving similar problems later.
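Here's how I picture the difference in terms of training data. These field names are made up for illustration - they're my reading of the public descriptions, not any real fine-tuning API.

```python
# Supervised fine-tuning: learn a prompt -> answer mapping by example.
sft_example = {
    "prompt": "Client: 'Your product looks too expensive for us.'",
    "completion": "Acknowledge the concern, then reframe around ROI.",
}

# Reinforcement fine-tuning: the model generates its own reasoning and
# answer, and a grader scores the outcome. The training signal reinforces
# whatever chain of steps earned the reward, not a memorized final answer.
def grader(reasoning: str, answer: str) -> float:
    # toy reward: did the model land on the ROI reframing?
    return 1.0 if "roi" in answer.lower() else 0.0

rft_example = {
    "prompt": "Client: 'Your product looks too expensive for us.'",
    "grader": grader,
}
```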
Effective RFT should lead to AGI.
Current RFT doesn't seem to be very effective. It still requires hundreds of examples to work. OpenAI claims it's just a few dozen, but I believe what I see, not what I hear. A human needs 2-3 examples to learn something*.
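That gap is where my headline number comes from. A back-of-the-envelope version, with the 150 being my own reading of "hundreds":

```python
rft_examples_needed = 150   # low end of "hundreds"; OpenAI's claimed "few dozen" would shrink this
human_examples_needed = 3   # what a person typically needs
print(rft_examples_needed / human_examples_needed)  # -> 50.0
```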
That would mean that a 50x improvement in RFT will give us AGI.
But is that really true? When I hire a sales rep who sells to hospitals, they should get better at selling to pharmacies too. They don't just learn specific facts - they learn principles that work across healthcare. We want AI to make these same mental leaps.
Does RFT lead to this kind of generalisation? Maybe. I haven't really tried it. But it looks like it should. So I will update my thesis one last time.
A 50x improvement in RFT, if models generalise from RFT, will lead to AGI.
But this sounds way too complex to make a good headline. Let's simplify.
A 50x improvement in RFT will lead to AGI.
That's too many terms.
We are a 50x improvement away from AI as smart as humans.
There. That's something anyone can understand. I'll put that on LinkedIn.
Think about this for a second. For decades, AGI seemed like science fiction. We couldn't even get computers to recognize cats in pictures. Now we're just a 50x improvement away. That's not 50 times better at everything - just at learning from examples.
50x isn't a huge gap. We've already seen bigger leaps in the last few years. I'm not making any predictions about when this will happen, or even about who will do it first.
But AGI looks to be on the horizon.