A recent study from Stanford University has sparked a debate in the artificial intelligence (AI) community by suggesting that claims of extraordinary emergent abilities in AI are largely illusory. The findings challenge the perception that AI is developing at an unprecedented rate and exhibiting abilities far beyond human understanding.

Some of these extraordinary claims come from the developers of the AI systems themselves. Sundar Pichai, the CEO of Google, claimed in a 60 Minutes segment that Google’s large language model (LLM) Bard could translate Bengali despite not being trained to do so.

Researchers at Microsoft also released a preprint paper titled “Sparks of Artificial General Intelligence: Early experiments with GPT-4” that suggested GPT-4 could “solve novel and difficult tasks…without needing any special prompting.”

The idea that LLMs like ChatGPT or Bard suddenly gained abilities they weren’t designed to have never made sense to many AI experts. That’s not how LLMs work.

LLMs are not artificial general intelligence, the term used to describe a system similar to human intelligence that can learn, comprehend, and perform a wide range of intellectual tasks it wasn’t specifically programmed to solve.

Instead, LLMs work by analyzing vast amounts of text data and learning patterns, structures, and relationships within the text. They are trained using deep learning techniques, enabling them to generate human-like responses or predictions based on the context provided.

However, these models cannot produce information they were not exposed to during their training, as their knowledge is solely based on the text data they have processed.
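To make that concrete, here is a deliberately simplified, hypothetical sketch of the underlying idea of next-token prediction. Real LLMs use deep neural networks trained on enormous corpora rather than simple word counts, and the tiny corpus below is invented purely for illustration, but the principle is the same: the model can only continue text using patterns it has picked up during training.

```python
from collections import Counter, defaultdict

# Toy sketch of next-token prediction: count which word tends to follow which,
# then generate text by repeatedly picking the most likely follower.
# (Real LLMs use deep neural networks over huge corpora, not word counts;
# this corpus is invented purely for illustration.)
corpus = ("the model predicts the next word from patterns "
          "seen in the training text").split()

follow_counts = defaultdict(Counter)
for prev_word, next_word in zip(corpus, corpus[1:]):
    follow_counts[prev_word][next_word] += 1

def generate(start_word, length=6):
    """Continue a prompt using only patterns observed during 'training'."""
    output = [start_word]
    for _ in range(length):
        followers = follow_counts.get(output[-1])
        if not followers:          # never seen this context before...
            break                  # ...so the model has nothing to draw on
        output.append(followers.most_common(1)[0][0])
    return " ".join(output)

print(generate("the"))   # e.g. "the model predicts the model predicts the"
```

The same dependence on training data is why, as noted above, a model cannot produce information it was never exposed to.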

What Explains These Extraordinary Claims?

The study, authored by Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo, examined multiple cases of AI systems displaying novel, unexpected, and impressive abilities.

The researchers focused on the impact of measurement techniques on the perception of AI’s emergent abilities. They found that employing non-linear or discontinuous metrics could lead to AI performance appearing to have sudden, unexpected shifts, which are then mistakenly attributed to emergent capabilities.

In actuality, the performance progression is consistent, and the apparent leaps behind claims of emergent abilities are predictable rather than surprising. The researchers add that the problem is likely worsened by not having enough test data to resolve the underlying trend.

The results changed significantly when the researchers used linear metrics that output continuous scores, such as ‘Token Edit Distance’ and ‘Brier Score’. With these metrics, the models’ progress appeared predictable and steady, debunking the supposed “emergent” character of their abilities.
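The effect is easy to reproduce with a rough back-of-the-envelope simulation (the numbers below are invented for illustration and are not taken from the study): if a model’s per-token accuracy improves smoothly as it scales, an all-or-nothing metric such as exact match on a long answer sits near zero and then appears to leap, while the continuous per-token view improves steadily.

```python
# Hypothetical per-token accuracies improving smoothly with model scale.
per_token_accuracy = [0.50, 0.58, 0.66, 0.74, 0.82, 0.90, 0.98]
answer_length = 20  # tokens in the "correct" answer

for step, p in enumerate(per_token_accuracy, start=1):
    exact_match = p ** answer_length   # all-or-nothing metric
    print(f"model {step}: per-token accuracy {p:.2f} -> exact match {exact_match:.4f}")
```

Every model in this toy series improves by the same amount on the continuous metric, yet exact match stays near zero until the last couple of steps, which is exactly the kind of jump that gets labelled an emergent ability.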

The paper uses a baseball analogy to explain what’s happening here. Imagine gauging a player’s hitting skill with two different statistics: the player’s average distance hit, and whether that average is 325 ft or greater.

Both metrics can be helpful, but the continuous metric, average distance hit, provides more insight into a player’s performance, especially for outliers. It changes smoothly with skill, whereas the threshold metric returns only a yes or no.

With a non-linear metric, outliers appear more unusual because there is little data to explain how they got there, while a continuous metric provides a more comprehensive picture.

The Stanford researchers go on to advise AI researchers to confirm their results with more and better metrics before making extraordinary claims.
