Are we really that close to machines that think like human beings? OpenAI's O3 model has surprising capabilities, but is it a true quantum leap or just an illusion? And while the US and China advance, Europe is lagging behind: it's time to fight back

Artificial general intelligence, or AGI, is the type of artificial intelligence that aims to replicate the human ability to solve problems in a wide variety of domains and contexts.

Unlike specialized AI, which is designed for specific tasks, an AGI should be able to adapt to new situations and tackle tasks for which it was not explicitly programmed, exhibiting a human-like level of flexibility.

In recent years there have been amazing advances in generative AI with large language models (LLMs) such as ChatGPT, yet until recently many believed this would not be the path to AGI. In recent months, however, the global debate about the similarity between machines and human intelligence has been reinvigorated by two significant developments. First, OpenAI has unveiled its new O3 problem-solving model (not yet publicly available), an advanced evolution of ChatGPT. Second, so-called large concept models (LCMs), which instead of generating answers word by word concatenate short sentences that directly express concepts, have shown promising performance.
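For readers who want an intuition for the difference, the following toy sketch contrasts the two approaches: instead of choosing the next word, generation operates on whole-sentence "concepts" compared in an embedding space. Everything in it, from the sentence bank to the bag-of-words stand-in encoder, is a simplifying assumption for illustration, not the actual LCM architecture, which predicts learned sentence embeddings and decodes them back into text.

```python
# Toy illustration of generating at the level of sentence "concepts"
# rather than word by word. The bag-of-words embedding below is a
# deliberately crude stand-in for a learned sentence encoder.

import math
from collections import Counter

SENTENCE_BANK = [
    "The model receives a question.",
    "It breaks the question into subproblems.",
    "Each subproblem is solved separately.",
    "The partial answers are merged into a final reply.",
]

def embed(sentence: str) -> Counter:
    """Stand-in sentence encoder: a bag-of-words vector."""
    return Counter(sentence.lower().strip(".?!").split())

def _norm(v: Counter) -> float:
    return math.sqrt(sum(x * x for x in v.values()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    den = _norm(a) * _norm(b)
    return dot / den if den else 0.0

def next_concept(context: str) -> str:
    """Pick the bank sentence closest in concept space to the context.
    A real LCM *predicts* a new embedding and decodes it to a sentence."""
    ctx = embed(context)
    return max(SENTENCE_BANK, key=lambda s: cosine(ctx, embed(s)))

print(next_concept("How does the system handle a hard question?"))
```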

Are we really on the brink of human-like artificial intelligence or are we simply surprised by the incremental advances in AI technology?

The case of O3

We try to answer here by considering the case of O3. OpenAI's announcement inspired much speculation and lofty statements, based on some of the model's characteristics:

1) First, it achieved remarkable performance in advanced mathematical reasoning on the FrontierMath benchmark, developed by a team of mathematicians. Until recently, AI systems achieved a 2% success rate on it, while O3 reached 25%.

2) Second, it has made tremendous progress in programming tasks. Developers have observed that it can produce code faster and at a lower computational cost than its predecessors.

3) Third, and this seems to be the most surprising feature, it has achieved unexpected performance on the ARC-AGI test (Abstraction and Reasoning Corpus for Artificial General Intelligence, https://arcprize.org/arc). This benchmark challenges AI systems with graphical puzzles designed to assess reasoning and abstraction ability (a toy example of such a task is sketched after this list). While previous models had scored around 55%, O3 achieved an impressive 75% at low computational cost (a few dollars of electricity per answer) and 87% at high computational cost (a few thousand dollars), surpassing the average human score of 76%.
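To see what such a puzzle looks like, here is a toy, heavily simplified rendering of an ARC-style task: a couple of demonstration grid pairs, plus a test input that must be transformed by the same inferred rule. The format is condensed from the public ARC materials, and the rotation rule is an invented example, not one of the actual test items.

```python
# Toy, simplified ARC-style task: small grids of color codes, a few
# demonstration pairs, and a test input the solver must transform
# according to the rule it infers from the demonstrations alone.

Grid = list[list[int]]  # integers are color codes

task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 0], [0, 1]]},
        {"input": [[0, 2], [0, 0]], "output": [[0, 0], [2, 0]]},
    ],
    "test": {"input": [[0, 0], [3, 0]]},
}

def rotate_180(grid: Grid) -> Grid:
    """Candidate rule: rotate the grid by 180 degrees."""
    return [row[::-1] for row in grid[::-1]]

# The rule must fit every demonstration pair...
assert all(rotate_180(pair["input"]) == pair["output"] for pair in task["train"])
# ...and is then applied to the unseen test input.
print(rotate_180(task["test"]["input"]))  # [[0, 3], [0, 0]]
```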

What makes these models different from the large language models like ChatGPT that we have become accustomed to?

Actually, not very much.

The key feature of O3 (an evolution of the already available O1 model) is that it internally generates multiple solution paths with LLMs and compares them, choosing the best one, a strategy built on "chain of thought" reasoning. In a nutshell, many different solutions are produced and, automatically, other AI systems select among them (a minimal sketch follows below). This feature makes these systems of the highest conceptual and practical interest. I recommend that readers try using O1 to solve nontrivial planning problems. It is fascinating to see how problems are set up, decomposed into subproblems, and then often solved. Even when the system fails, the chain of thought remains interesting.
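A minimal sketch of this select-the-best strategy, under the assumption of a generic sampler and an automatic judge (both hypothetical stand-ins, since OpenAI has not published O3's internals), might look like this:

```python
# Best-of-N selection: sample several reasoning paths, keep the one a
# second "judge" scores highest. generate_path and judge are
# hypothetical stand-ins, not OpenAI's actual machinery.

import random

def generate_path(problem: str, rng: random.Random) -> str:
    """Stand-in for one LLM sample producing a chain of thought."""
    steps = rng.randint(2, 6)
    return " -> ".join(f"step {i + 1} of '{problem}'" for i in range(steps))

def judge(path: str) -> float:
    """Stand-in for an AI grader; a real judge would rate correctness.
    Here we simply prefer shorter chains of reasoning."""
    return -path.count("->")

def best_of_n(problem: str, n: int = 8, seed: int = 0) -> str:
    """Sample n candidate reasoning paths and return the best-scoring one."""
    rng = random.Random(seed)
    candidates = [generate_path(problem, rng) for _ in range(n)]
    return max(candidates, key=judge)

print(best_of_n("schedule three meetings without overlaps"))
```

A real system would score candidates for correctness and coherence rather than brevity, but the selection principle is the same.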

However, we cannot say that it really resembles AGI, at least in the interpretation that links it to human intelligence.

What is intelligence?

The question of resemblance to human intelligence is misplaced, because the definition of AGI itself is ambiguous.

Does AGI consist of being able to solve any problem that humans can potentially solve? Or do we mean cognitive abilities that exceed those of humans on some set of problems? This ambiguity leads to confusing remarkable progress in specific areas, such as ARC-AGI, with the achievement of general intelligence.

Humans are able to solve ARC-AGI problems without special exposure to similar tasks, while O3 compares solutions offered by LLMs trained on vast datasets.

If we really had AGI available, it would be able to solve ARC-AGI tests as a special case, but the fact that a specific system knows how to solve such tests does not imply that it is a form of AGI. 

To give an example often cited by Turing Award winner and Meta VP Yann LeCun: young people learn to drive a car in all driving situations with about 20 hours of lessons, while AI systems can do so only in simple traffic conditions despite being trained on the equivalent of millions of driving hours.

Proponents of the so-called scaling hypothesis believe that by using computational resources to create enough data and generate solutions (a strategy known as "compute into data"), AI systems will eventually outperform human intelligence.
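A toy loop makes the "compute into data" idea concrete: spend compute generating candidate solutions, keep only those that pass an automatic check, and accumulate the survivors as synthetic training data. The noisy solver below is a hypothetical stand-in for a model; the whole example is an illustrative assumption, not any lab's published pipeline.

```python
# Toy "compute into data" loop: burn compute on candidate solutions,
# verify them automatically, keep only the verified ones as data.

import random

rng = random.Random(0)

def noisy_solver(a: int, b: int) -> int:
    """Stand-in model: usually adds correctly, sometimes errs."""
    return a + b + (0 if rng.random() < 0.7 else rng.choice([-1, 1]))

def verified(a: int, b: int, answer: int) -> bool:
    """Automatic checker: cheap to run, unlike human labeling."""
    return answer == a + b

dataset = []
for _ in range(1000):                      # the "compute" budget
    a, b = rng.randint(0, 99), rng.randint(0, 99)
    answer = noisy_solver(a, b)
    if verified(a, b, answer):             # keep only checked solutions
        dataset.append(((a, b), answer))   # becomes synthetic training data

print(f"kept {len(dataset)} verified examples out of 1000 attempts")
```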

Critics argue that scalability alone cannot bridge the gap with human intelligence. Unlike humans, AI systems have difficulty generalizing knowledge across domains, adapting to novel situations, or learning from minimal examples. These limitations suggest that new models and methodologies will be needed to achieve true general intelligence.

It should be reiterated, however, that both factions believe that some form of AGI will be achieved within a finite time horizon.

How to deal with the progress of artificial intelligence

The race for AI predominance is mobilizing enormous resources in the United States and, secondarily, in China.

Europe remains at a standstill: the season of discussions and laws on regulation and ethics, entirely legitimate and agreeable, has been overtaken by reality. It is necessary to move to a different phase, in which a competitive European AI is created. The quantitative figures should be made clear: giving birth to competitive European R&D centers takes many tens of billions of euros per year (Microsoft alone invests some $80 billion a year).

There is no doubt that dominance in AI will translate into economic dominance. But there is more: control of AI will be an enormous form of political power.

Right now that control is in the hands of a few large technology companies, and we can already see how much they influence the political system.

The great impact of the AI system developed by DeepSeek shows how sensitive this matter is, and how the battle between open-source and proprietary AI software will dominate the scene in the coming years.

European politicians urgently need to understand that action must be taken at the European Union level; individual countries cannot meet the challenge on their own.

It is urgent to create the conditions for sustainable, open and democratic AI, which is the opposite of the technological oligarchy we are witnessing.

It will be crucial for Europe and for the global economy.

There is already an example of such European collaboration: CERN in Geneva. The cost of building the next particle accelerator there is estimated at several tens of billions of euros. Well, let us avoid building it and instead convert CERN into the European AI center, taking advantage of the top-level managerial and technological experience accumulated there over the decades.

European researchers are ready; high-profile European scientific networks are already working closely with Canada and the UK.

The DeepSeek experience shows that China has managed to make substantial progress.

Can EU politicians take action?
