A pair of new studies presents a troubling paradox for OpenAI’s ChatGPT large language model programs. Although its popular generative text responses are now all but indistinguishable from human answers, according to multiple studies and sources, GPT appears to be getting less accurate over time. Perhaps even more disturbing, no one has a good explanation for why the deterioration has occurred.
A team from Stanford and UC Berkeley noted in a research study published on Tuesday that ChatGPT’s behavior has changed noticeably over time, and not for the better. What’s more, the researchers are somewhat at a loss as to exactly why the quality of its responses is declining.
To examine the consistency of ChatGPT’s underlying GPT-3.5 and GPT-4 programs, the team tested the AI’s tendency to “drift,” i.e., to offer answers of varying quality and accuracy over time, as well as its ability to properly follow given commands. The researchers asked both ChatGPT-3.5 and -4 to solve math problems, answer sensitive and dangerous questions, visually reason from prompts, and generate code.
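For a sense of what such a drift test looks like in practice, here is a minimal sketch, not the paper’s actual harness: it sends the same independently verified math questions to the two dated GPT-4 snapshots the study compared and tallies each snapshot’s accuracy. It assumes the official `openai` Python package (v1+), an `OPENAI_API_KEY` set in the environment, and that OpenAI still serves those dated snapshots; the prompts are illustrative stand-ins for the study’s prime-testing benchmark.

```python
# Sketch of a "drift" probe: ask two dated snapshots of the same model
# identical questions and compare how often each answers correctly.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The March vs. June 2023 GPT-4 snapshots compared in the study.
SNAPSHOTS = ["gpt-4-0314", "gpt-4-0613"]

# (question, expected answer) pairs; answers verified by hand.
PROBES = [
    ("Is 17077 a prime number? Answer only yes or no.", "yes"),
    ("Is 21791 a prime number? Answer only yes or no.", "no"),  # 7 x 11 x 283
]

for model in SNAPSHOTS:
    correct = 0
    for question, expected in PROBES:
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
            temperature=0,  # minimize sampling noise, so differences reflect the model
        )
        answer = reply.choices[0].message.content.strip().lower()
        correct += answer.startswith(expected)
    print(f"{model}: {correct}/{len(PROBES)} correct")
```

Setting the temperature to zero and fixing the wording of every prompt is what lets a comparison like this attribute any change in accuracy to the model snapshot itself rather than to randomness in generation.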
