Air Canada chatbot error underscores AI’s enterprise liability danger

A civil tribunal in Canada has ordered Air Canada to pay for a mistake made by a customer-service chatbot, highlighting the need for companies to better train and monitor their artificial intelligence (AI) tools.

British Columbia resident Jake Moffatt visited Air Canada’s website in November 2022 to book a flight for his grandmother’s funeral in Ontario. The website’s chatbot told him he could receive a partial refund on the next-day ticket and the return ticket if he applied for the bereavement discount within 90 days.

That information was incorrect; Air Canada’s policy, available on its website, is to provide bereavement discounts if the customer applies in advance. After Air Canada refused to provide the discount, a Canadian tribunal ordered the airline to pay about $600 in bereavement refunds and tribunal costs — about half of what Moffatt paid for the tickets.

Companies using chatbots and other generative AI (genAI) tools must invest in monitoring efforts “in order to save money from chatbot productivity gains,” said Avivah Litan, a distinguished vice president analyst focused on AI at Gartner. “Otherwise, they will end up spending more on legal fees and fines than they earn from productivity gains.”

In the Air Canada case, Christopher Rivers, a member of the British Columbia Civil Resolution Tribunal, sided with Moffatt and rejected the airline’s assertion that the chatbot is “a separate legal entity that is responsible for its own actions.”

Air Canada could not explain why the information on bereavement discounts on its website was more reliable than what was provided by the chatbot, Rivers wrote in his Feb. 14 ruling. “Air Canada owed Mr. Moffatt a duty of care,” he added. “Generally, the applicable standard of care requires a company to take reasonable care to ensure their representations are accurate and not misleading.”

Three analysts who focus on the AI market agreed that companies using chatbots and other AI tools need to check their output. About 30% of genAI answers are fictional, an output called a “hallucination,” Litan said.

“Companies using chatbots must use guardrails that highlight output anomalies such as hallucinations, inaccurate, and illegal information — and set up human review operations that investigate and block or approve these outputs before they are disseminated,” she said. “They must ensure that outputs, especially in customer-facing applications, are accurate and that they do not steer customers or the organization managing the chatbot down the wrong path.”

GenAI chatbots aren’t ready for customer-service interactions unless companies using them invest in reliability, security, and safety controls, she argued. Companies using chatbots should set up new operations to manually review unanticipated responses highlighted by anomaly detection tools.
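
As a rough illustration of the review workflow Litan describes, the Python sketch below holds any flagged chatbot reply in a queue until a person approves or blocks it. Everything here is hypothetical: the names ReviewGate, Draft and SENSITIVE_TERMS are invented for this example, and the keyword match is only a stand-in for whatever anomaly detection tool a company might actually use.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    SENT = "sent"
    HELD = "held"
    BLOCKED = "blocked"


# Hypothetical trigger terms: replies on these topics may commit the company
# to something, so this sketch routes them to a person first.
SENSITIVE_TERMS = ("refund", "discount", "compensation", "policy")


@dataclass(eq=False)  # compare drafts by identity, not by content
class Draft:
    reply: str
    status: Status = Status.HELD


@dataclass
class ReviewGate:
    queue: list = field(default_factory=list)

    def is_anomalous(self, reply: str) -> bool:
        # Stand-in for a real anomaly detector: a simple keyword match.
        text = reply.lower()
        return any(term in text for term in SENSITIVE_TERMS)

    def submit(self, reply: str) -> Draft:
        draft = Draft(reply)
        if self.is_anomalous(reply):
            self.queue.append(draft)  # held until a human decides
        else:
            draft.status = Status.SENT
        return draft

    def review(self, draft: Draft, approve: bool) -> Draft:
        # A human reviewer approves or blocks a held draft.
        if draft.status is not Status.HELD:
            raise ValueError("only held drafts can be reviewed")
        self.queue.remove(draft)
        draft.status = Status.SENT if approve else Status.BLOCKED
        return draft
```

In this sketch, a reply such as "You can apply for a refund within 90 days" would be held rather than sent, and would reach the customer only if a reviewer called review(draft, approve=True).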

Cases where chatbots provide the wrong information highlight the need for companies to focus on responsible AI, said Hayley Sutherland, research manager for knowledge discovery and conversational AI at IDC. Companies need to invest in testing and training the AI tools they use, she recommended.

“Regardless of what format or UI [AI is] delivered in, companies tend to be held responsible for the information they provide to customers, so it’s wise to proceed with caution,” she said.

Sutherland recommended that companies eyeing chatbots and other AI tools first use them for less sensitive internal cases, such as employee knowledge assistance, instead of jumping straight into customer service.

AI hallucinations can sound plausible even when the information they provide is incorrect, she noted. To combat the problem, “generative AI systems should include a ‘human in the loop’ and other mechanisms to bound, ground, and validate chatbot output during the training phase, as well as with continuous testing,” Sutherland said.

Another problem is that current chatbots can only handle a few simple tasks, said David Truog, vice president and principal analyst at Forrester. “Unfortunately, companies deploying chatbots are often overconfident in bots’ effectiveness,” he said. “They underestimate the complexity of creating an effective bot; the thing they most often and disastrously neglect is much-needed expertise in human-centered design and in conversation design.”

Companies shouldn’t expect chatbots to get a special legal status, he said.

“Chatbots are software, just like the rest of a company’s website or app,” Truog said. “And any organization that deploys software to interact with its customers on its behalf is responsible for whatever that software does. It’s common for customers to anthropomorphize chatbots somewhat since they use human languages, but that’s no excuse for companies to do the same to the point of washing their hands of any responsibility when their bot misbehaves.”