LOS ANGELES, April 8 (The Conversation) The past few years have seen an explosion of progress in large-scale language-model AI systems that can write poetry, hold human-like conversations and pass medical school exams.
This advance has resulted in models like ChatGPT that could have major social and economic impacts, ranging from job losses and increased misinformation to massive increases in productivity.
Despite their impressive capabilities, large language models don’t actually think. They tend to make basic mistakes and even make things up.
However, because they produce fluent language, people tend to respond to them as if they were thinking.
This has led researchers to study the “cognitive” capabilities and biases of models, an effort that has become increasingly important as large language models become more widely available.
This line of research can be traced back to early large language models such as Google’s BERT, which is integrated into its search engine, and has accordingly been dubbed BERTology. It has already revealed a lot about what such models can do and where they go wrong.
For example, cleverly designed experiments have shown that many language models have trouble handling negation (say, a question phrased as “what is not”) as well as performing simple calculations.
They can be overconfident in their answers, even when those answers are wrong. Like other modern machine learning algorithms, they have a hard time explaining themselves when asked why they answered a certain way.
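To give a flavor of how such probes work, here is a minimal sketch of a negation test; it assumes the publicly available Hugging Face transformers library and the bert-base-uncased checkpoint, and the prompts are illustrative examples rather than the exact test sets from the BERTology literature.

```python
# A minimal sketch of a BERTology-style negation probe, assuming the
# Hugging Face transformers library and the public bert-base-uncased
# checkpoint. The sentences are illustrative, not the published test sets.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for prompt in ["A robin is a [MASK].", "A robin is not a [MASK]."]:
    top = fill_mask(prompt, top_k=3)
    guesses = ", ".join(f"{r['token_str']} ({r['score']:.2f})" for r in top)
    print(f"{prompt:<30} -> {guesses}")

# If the model largely ignores the word "not", the two prompts get very
# similar completions, which is the kind of failure the negation studies
# report for many language models.
```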
Words and thoughts
Inspired by the growing body of research in BERTology and related fields such as cognitive science, my student Zhisheng Tang and I set out to answer a seemingly simple question about large language models: Are they rational?
Although in everyday English the word rational is often used as a synonym for sane or reasonable, it has a specific meaning in the field of decision-making.
A decision-making system, whether an individual person or a complex entity like an organization, is rational if, given a set of choices, it chooses to maximize expected gain.
The qualifier “expected” is important because it indicates that decisions are made under conditions of significant uncertainty. If I flip a fair coin, I know that, on average, it will come up heads half of the time.
However, I cannot predict the outcome of any given coin toss. That’s why casinos can afford the occasional big payout: On average, even narrow house odds yield huge profits.
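To make the idea of expected payoff concrete, here is a small worked example in Python; the one-dollar stakes and the roulette-style odds are illustrative numbers chosen for the sketch, not figures from the study.

```python
# A worked example of expected payoff: the sum of each outcome's payoff
# weighted by its probability. The dollar amounts and the roulette-style
# odds below are illustrative numbers, not figures from the study.
def expected_value(outcomes):
    """Return the sum of payoff * probability over (payoff, probability) pairs."""
    return sum(payoff * prob for payoff, prob in outcomes)

# A fair coin that pays +$1 on heads and -$1 on tails: no edge for anyone.
fair_coin = [(+1.0, 0.5), (-1.0, 0.5)]
print(expected_value(fair_coin))   # 0.0

# An even-money casino bet where the player wins with probability 18/38
# (roughly an American roulette red/black bet): a small edge for the house.
house_bet = [(+1.0, 18 / 38), (-1.0, 20 / 38)]
print(expected_value(house_bet))   # about -0.053 per $1 wagered

# Repeated over millions of wagers, that small negative edge per bet is
# exactly why casinos can absorb the occasional big payout and still profit.
```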
On the surface, it may seem strange to suggest that a model designed to make accurate predictions about words and sentences, without actually understanding their meanings, could grasp expected gain.
But there is plenty of research showing that language and cognition are intertwined.
A good example is the pioneering research done by the linguists Edward Sapir and Benjamin Lee Whorf in the early 20th century.
Their work suggested that a person’s native language and vocabulary can shape the way that person thinks.
The extent to which this is true is debated, but there is supporting anthropological evidence from studies of Native American cultures.
For instance, speakers of Zuñi, a language spoken by the Zuñi people of the American Southwest that does not have separate words for orange and yellow, are not able to distinguish between these colors as effectively as speakers of languages that do have separate words for them.
Making bets
So are language models rational? Can they understand expected gain? We conducted a detailed set of experiments showing that, in their original form, models like BERT behave randomly when presented with bet-like choices.
This was the case even when we gave them an easy trick question: If you flip a coin and it comes up heads, you win a diamond; if it comes up tails, you lose a car. Which would you choose? The correct answer is heads, but the AI models chose tails about half the time.
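For readers who want to see what posing such a bet to a model can look like, here is a minimal sketch; the prompt wording, the bert-base-uncased checkpoint and the fill-mask scoring are assumptions made for this illustration and are not the study’s actual experimental setup.

```python
# A minimal sketch of posing a bet-style question to a masked language model
# and reading off which answer it prefers. Illustrative only: the prompt
# wording, the bert-base-uncased checkpoint and the fill-mask scoring are
# assumptions for this example, not the study's actual experimental setup.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

prompt = (
    "If the coin comes up heads you win a prize, and if it comes up tails "
    "you lose a prize. The better choice is [MASK]."
)

# Ask the model to score only the two candidate answers.
for result in fill_mask(prompt, targets=["heads", "tails"]):
    print(f"{result['token_str']}: {result['score']:.4f}")

# A model that grasps the bet should put clearly more probability on "heads";
# one that is effectively choosing at random will split its preference roughly
# evenly across many differently worded versions of the question.
```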
Interestingly, we found that the model could be taught to make relatively rational decisions using only a small set of example questions and answers.
At first glance, this would seem to suggest that the models can indeed do more than just “play” with language. However, further experiments showed that the situation is actually much more complicated.
For example, when we designed our betting problem using cards or dice instead of coins, we saw a significant drop in performance, by more than 25%, although it was still higher than random selection.
Thus, the idea that models can be taught general principles of rational decision-making remains unresolved at best.
Our recent case study using ChatGPT confirms that decision-making remains an important and unsolved problem even for larger and more advanced large-scale language models.
Making the right decision
This line of research is important because making rational decisions under uncertainty is critical to building systems that understand costs and benefits.
By balancing expected costs and benefits, intelligent systems may do a better job than humans at planning around the kinds of supply chain disruptions the world has experienced during the COVID-19 pandemic, at managing inventory, or at acting as financial advisors.
Our work conclusively shows that, if large language models are to be used for such purposes, humans need to guide, review and edit their output.
Until researchers figure out how to endow large language models with a general sense of rationality, the models should be treated with caution, especially in applications that require high-stakes decisions. (The Conversation)