In a talk titled “Sparks of AGI,” delivered at MIT last March, Sébastien Bubeck, a computer scientist at Microsoft Research, discussed a study of OpenAI’s impressive new large language model, GPT-4. Bubeck and his team had subjected GPT-4 to a battery of intelligence tests.
Bubeck noted, “If your focus is on problem-solving, abstract thinking, grasping complex concepts, and reasoning with new information, then GPT-4’s capabilities could be considered intelligent.”
However, despite these capabilities, GPT-4 faltered on certain tasks. When asked to modify one number in a simple math equation so that it would equal 106, it gave an incorrect answer. It similarly struggled to write a poem whose last line mirrors its first, and to solve puzzles like the Towers of Hanoi.
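To make concrete what a puzzle like this demands, here is the classic recursive solution to the Towers of Hanoi in Python. This is a minimal sketch of the standard algorithm, not the prompt Bubeck’s team used:

```python
def hanoi(n, source, target, spare, moves):
    """Move n disks from source to target, using spare as scratch space.

    The solution is inherently recursive: before the largest disk can
    move, you must first plan where the n - 1 disks above it will go.
    """
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)   # clear the way
    moves.append((source, target))               # move the largest disk
    hanoi(n - 1, spare, target, source, moves)   # re-stack on top of it

moves = []
hanoi(3, "A", "C", "B", moves)
print(moves)  # 7 moves, starting [('A', 'C'), ('A', 'B'), ('C', 'B'), ...]
```

Solving it requires reasoning several moves ahead: each move only makes sense as part of a plan for the moves that follow.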
What stumped GPT-4 in these tasks was its inability to simulate the future. Humans naturally simulate outcomes when faced with such challenges: we mentally test which number to change in an equation before committing to an answer, and we work out how a poem must end while still crafting its beginning.
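To see what I mean by simulating the future, consider how a person (or a program) could solve the equation-editing task: try each possible edit in your head and check the result. The short Python sketch below makes that search explicit; the specific expression and target here are my own illustrative stand-ins, not necessarily the ones from the study:

```python
from itertools import product

# Hypothetical stand-in for the task: change exactly one number in the
# expression a*b + c*d so that it evaluates to the target.
terms = [7, 4, 8, 8]   # 7*4 + 8*8 = 92
target = 106

def evaluate(t):
    return t[0] * t[1] + t[2] * t[3]

# Forward simulation: try every single-number edit and check the outcome.
for position, new_value in product(range(4), range(20)):
    candidate = list(terms)
    candidate[position] = new_value
    if evaluate(candidate) == target:
        print(f"change terms[{position}] to {new_value}:", candidate)
        # prints: change terms[1] to 6: [7, 6, 8, 8], since 7*6 + 8*8 = 106
```

The loop is trivial, but the point is the structure: you imagine an edit, simulate its consequence, and only then commit. That is precisely the step GPT-4 skips.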
In my recent piece for The New Yorker titled “Can an A.I. Make Plans?,” I delve into the significance of this limitation. Humans constantly engage in forward-thinking simulations, whether in conversations, navigating daily tasks, or pursuing long-term goals.
As I elaborate:
“In our interactions, we anticipate how responses might affect the mood, just as we predict the progress of various checkout lines at the supermarket. Achieving our goals often requires forward-thinking assessments of potential actions’ outcomes, whether it’s contemplating major life decisions or prioritizing tasks throughout the day.”
For artificial intelligences to truly emulate human-like cognition, they must be able to forecast outcomes. This capability eludes large language models like GPT-4 because of their static, feedforward architecture: each output token is produced in a single fixed pass, with no mechanism for recursion, iteration, or backtracking to explore alternative possibilities before committing.
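A toy contrast makes the architectural point clearer. In the sketch below (my own simplification, not a description of GPT-4’s internals), a “greedy” generator commits to the locally best move in a single pass, while a planner simulates each branch and backtracks when one fails:

```python
def greedy(start, target, steps=5):
    """One-pass generation: commit to the locally best move, never revisit."""
    value, path = start, []
    for _ in range(steps):
        options = [("+3", value + 3), ("*2", value * 2)]
        op, value = min(options, key=lambda o: abs(o[1] - target))
        path.append(op)
        if value == target:
            return path
    return None  # committed to a dead end, with no way to back up

def search(value, target, depth=5, path=()):
    """Planning: simulate each branch, backtrack when a branch fails."""
    if value == target:
        return list(path)
    if depth == 0:
        return None
    for op, nxt in [("+3", value + 3), ("*2", value * 2)]:
        result = search(nxt, target, depth - 1, path + (op,))
        if result is not None:
            return result
    return None

print(greedy(1, 10))  # None: the locally best second move (*2 -> 8) overshoots
print(search(1, 10))  # ['+3', '+3', '+3']: found by exploring and backtracking
```

The greedy pass is fast but can never revisit a choice; the search is slower but can recover from a bad first guess, which is exactly what the tasks above require.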
While this may offer temporary reassurance against scenarios like HAL 9000 from “2001: A Space Odyssey,” other AI systems are already capable of simulating the future. Efforts are underway to combine these planning capabilities with the linguistic prowess of language models, promising a more comprehensive form of digital cognition.
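One shape these integration efforts can take, sketched loosely below, is to wrap a language model inside an explicit search loop: the model proposes candidate next steps, a scorer evaluates them, and the loop pursues only the promising branches. The `propose_steps` and `score_state` functions here are hypothetical placeholders standing in for an LLM call and a value estimate; this is a sketch of the general pattern, not any particular system’s API:

```python
import heapq
from itertools import count

_tie = count()  # tie-breaker so the heap never has to compare states

def propose_steps(state):
    """Hypothetical placeholder: ask a language model for candidate
    next steps. In a real system this would be an LLM call."""
    raise NotImplementedError

def score_state(state):
    """Hypothetical placeholder: estimate how promising a partial
    plan is, e.g. with a learned value function or a verifier."""
    raise NotImplementedError

def plan(initial_state, is_goal, beam_width=3, max_depth=5):
    """Best-first search in which the language model only proposes;
    the surrounding loop decides which futures to pursue or abandon."""
    frontier = [(-score_state(initial_state), next(_tie), 0, initial_state)]
    while frontier:
        _, _, depth, state = heapq.heappop(frontier)
        if is_goal(state):
            return state
        if depth >= max_depth:
            continue
        for nxt in propose_steps(state)[:beam_width]:
            heapq.heappush(frontier, (-score_state(nxt), next(_tie), depth + 1, nxt))
    return None
```

The division of labor is the point: the language model supplies linguistic fluency and plausible options, while the search loop supplies the forward simulation the model itself lacks.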
In my article, I explore these developments further, arguing that the future of artificial intelligence lies not in sheer model size but in the intelligent integration of diverse cognitive functions.