With prompt engineers among the workers most in demand in the wake of generative AI’s arrival in the enterprise, it was inevitable that someone would investigate whether their role, too, could be automated, or at least facilitated, by AI.
And, indeed, a recent study focused on how to write the best prompts for a large language model (LLM) to solve mathematical problems has found that another AI gets better results than a human. The study sought to determine whether human-generated “positive thinking” prompts—such as “this will be fun!” or “take a deep breath and think”—produce better responses. The results were mixed across different LLMs.
AI-optimized prompts win
However, using AI-optimized prompts instead “consistently equaled or surpassed the effectiveness of our manually generated ‘positive thinking’ prompts in nearly all instances,” wrote researchers Rick Battle and Teja Gollapudi of VMware.
Their conclusion: It’s easy to get an LLM to come up with new answers by feeding it different prompts. It’s more difficult to produce consistently great answers through human-generated prompts.
“Affecting performance is trivial,” they wrote. “Improving performance, when tuning the prompt by hand, is laborious and computationally prohibitive when using scientific processes to evaluate every change.”
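The loop the researchers describe — propose a prompt variant, score it against a benchmark, keep the best — can be sketched in a few lines. Everything below is illustrative, not the VMware setup: the candidate mutations are made up, and the scoring function is a toy stand-in for running an LLM against a math problem set.

```python
def score(prompt: str) -> float:
    # Stand-in for a real benchmark: in the study, each candidate
    # prompt was scored by running an LLM on math problems and
    # measuring accuracy. Here we use a toy proxy (count of unique
    # words) just so the loop is runnable end to end.
    return len(set(prompt.lower().split()))

def optimize_prompt(seed: str, mutations: list[str], rounds: int = 3) -> str:
    """Greedy search: each round, append whichever mutation scores
    highest; stop early if no mutation improves the score."""
    best = seed
    for _ in range(rounds):
        candidates = [best + " " + m for m in mutations]
        challenger = max(candidates, key=score)
        if score(challenger) > score(best):
            best = challenger
        else:
            break  # no mutation helped; stop
    return best

mutations = ["Show your work.", "Think step by step.", "Check the result."]
best = optimize_prompt("Solve the problem.", mutations)
```

The point of automating this, as the researchers note, is not that the loop is clever — it is that evaluating every hand-tuned change scientifically is what makes manual prompt engineering so laborious.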
Battle and Gollapudi cite a 2023 study by Chengrun Yang of Google DeepMind and other researchers that came to a similar conclusion. AI-optimized prompts can be model- and task-specific, while similar human-generated prompts can produce “drastically different performance,” the Yang study says.
Their research, along with Yang’s, “highlights the superior capability” of AIs to optimize their own prompts.
“Engaging in the iterative process of refining prompts and monitoring the subsequent score progression can be an enjoyable endeavor,” Battle and Gollapudi write. “However, this approach proves to be highly time-inefficient, especially when systematically assessing all modifications from a scientific standpoint.”
It probably shouldn’t be surprising that AI prompts net the best AI results, even if the concept of “best” may be subjective, said Daniel Freeman, a senior consultant and AI expert at Scotwork International, a global negotiation skills and training company.
“We’re already addressing the biases inherent in AI models, so perhaps the ‘best’ prompt is one that yields unintended outcomes for the user,” he said.
A recent article on Business Insider suggested that AI is coming for the high-paying AI prompt engineering jobs, and Freeman raised similar concerns as researchers and AI developers explore the boundaries of AI.
Ethical considerations
“When we reach a point where we’re asking AI to prompt itself, the question arises: Are we inadvertently prioritizing cost-cutting over larger ethical considerations?” he said. “Do we really want to side-line human involvement?”
The recent studies suggest the possibility of removing humans from AI and training, but that’s a “precarious path to tread,” he added. “We rely on the human touch to guide the development of this emerging technology sensibly.”
Human oversight is needed to steer AI chatbots away from giving out inaccurate information, to curb plagiarism in academia, and to resolve ethical dilemmas in AI-generated art, for example, Freeman noted.
Freeman has seen AI become increasingly integrated into negotiation training processes in his role at Scotwork. “It’s foreseeable that AI will continue to expand its role in future business negotiations and beyond, but striking the right balance between AI and human capabilities will be paramount for successful integration and the preservation of negotiation as an art form,” he said.