With yet another AI-powered chatbot entering the fray, the chatbot competition is starting to feel like a fantasy battle game in which all that matters is who has the bigger weapon and the more fragile ego. As things heat up in the artificial intelligence space, the rush to outperform rivals has led to some embarrassing fails this week.
Expensive fails: What’s happening?
Google’s Bard chatbot, designed to rival OpenAI’s ChatGPT and launched on 7 February, had a rough start after it delivered incorrect information in its promotional video. Google posted a short video of Bard in action on Twitter, describing it as a "launchpad for curiosity" that can simplify complex topics. However, Reuters caught an inaccurate answer in the video a few hours before the launch event in Paris.
Asked what to tell a nine-year-old about new discoveries from the James Webb Space Telescope, Bard said the telescope was used to take the first pictures of a planet outside the Earth’s solar system, which is not true. The European Southern Observatory’s Very Large Telescope (VLT) took the first pictures of an exoplanet in 2004, as confirmed by NASA. The error quickly raised questions about Bard’s integrity and wiped about $100 billion off the market value of parent company Alphabet.
Once one error is spotted, scrutiny quickly widens. Attention soon shifted to Microsoft’s AI-powered revamp of Bing, which, as it turns out, has been making up information. Independent AI researcher Dmitri Brereton claimed that Bing AI made several errors and fabricated information about certain products at a public demo last week, as reported by Mint.
For instance, as Brereton elaborated in his blog post, when the chatbot listed the cons of the Bissell Pet Hair Eraser Handheld Vacuum, it included the noise level and a short cord. However, when Brereton compared the response with the article it cited as a source, he found that the review did not mention noise and, as it turns out, the vacuum is cordless.
He further added, “Bing AI did a great job of creating media hype, but their product is no better than Google’s Bard.”
Why are the chatbots failing?
Last week, Prabhakar Raghavan, senior vice president at Google and head of Google Search, told Germany's Welt am Sonntag newspaper that the kind of AI that’s become a popular topic today can lead to something the tech space calls “hallucinations.”
"This then expresses itself in such a way that a machine provides a convincing but completely made-up answer," Raghavan told the paper, as reported by Reuters. This might explain Microsoft’s Bing making up the cons for the vacuum.
A research paper titled ‘Survey of Hallucination in Natural Language Generation,’ published in ACM Computing Surveys last year, examines the problem in natural language processing, a subfield of linguistics, computer science, and artificial intelligence focused on the interactions between computers and human language. It states that “hallucinated text gives the impression of being fluent and natural despite being unfaithful and nonsensical. It appears to be grounded in the real context provided, although it is actually hard to specify or verify the existence of such contexts.”
For instance, in his blog post, Brereton claimed that Bing AI's attempt to summarize a financial report for Gap clothing presented wrong financial data and that the answer was riddled with inaccuracies. He specifically highlighted that diluted earnings per share was a “completely made up number.”
With AI hallucinations, the information might seem plausible at first glance, but when you dig deeper, it becomes clear that it came from the AI’s imagination. According to the paper, AI hallucinations can be intrinsic, meaning “generated output that contradicts the source content,” or extrinsic, meaning “generated output that cannot be verified from the source content.”
The researchers state that hallucination is concerning as it affects performance, raises safety concerns for real-world applications and could lead to privacy violations.
Not well integrated…yet
OpenAI CEO Sam Altman recently admitted ChatGPT’s flaws on the New York Times tech podcast, Hard Fork. "People really love it, which makes us very happy. But no one would say this was like a great, well-integrated product yet," Altman said.
On a different platform, Greg Brockman, cofounder of OpenAI, said that launching ChatGPT in its current form was “a last resort for the company after internal issues with beta testers”, as Business Insider reported.
Race against time?
With each tech giant trying to outdo another’s AI chatbot, the race seems to be about who can release the best chatbot first, with less focus on the nitty-gritty that often makes such tech developments worth the wait.
The user-reported error messages of ChatGPT, Bing AI’s made-up figures, and Bard’s misinformation are a reminder that AI technology is still in its infancy, so the many debates about tech taking over the world might need to pause for a bit.