How to check that your chatbot is trustworthy

Written by Rebecca

Is your AI chatbot a customer service dream or a PR nightmare waiting to happen? In the age of generative AI, the line between helpful and harmful can be razor-thin. From cringey responses to data breaches, the risks are real — and the consequences can be devastating. This article explores real-world examples of AI gone wrong, helping you understand the potential pitfalls and how to avoid them. Learn three steps to proactive AI auditing and testing, which can safeguard your brand’s reputation, ensuring your chatbot delights customers, not alienates them. Finally, we will give you tips on who can run the AI audit and test for your company.  

As the person in charge of success for a new customer service chatbot or AI agent that interacts with customers, you are likely excited about what it can offer.  However, you might have concerns about Brand Reputation and Customer Trust.  A chatbot that provides inaccurate, inappropriate, or biased responses could damage the company’s reputation and erode customer trust. You may also worry about Customer Experience and Satisfaction, a poorly designed or malfunctioning chatbot could lead to customer frustration and dissatisfaction, potentially driving them away. Furthermore, If the chatbot’s tone and responses don’t align with the company’s brand voice and values, it could create a disconnect with customers and undermine the brand’s image.

What makes a customer-facing AI chatbot or agent successful?

At its core, the average person wants AI to be invisible. They crave seamless experiences, instant answers, and effortless interactions. They want technology that works so smoothly they don’t even realize it’s AI. Think of it like a helpful, knowledgeable employee who anticipates their needs and provides solutions without any fuss.  A huge user experience  benefit of LLMs is that they allow people to use their natural language (not computer language) to get the information they need.

What Turns People Off?

Conversely, several factors can quickly turn users against your AI:

  • Creepiness: Overly personalized recommendations or responses that feel too invasive can create a sense of unease. No one wants to feel like they’re constantly being watched or analyzed.
  • Frustration: AI that fails to understand basic requests, provides irrelevant information, or gets stuck in loops can leave customers feeling frustrated and helpless.  Like the Whatsapp chatbot that sends 300-word responses to each customer question… no one wants to read that long of a chat! 
  • Blocked Goals: When AI hinders a customer’s ability to complete a task or access information, it creates a negative association with your brand.

Are there real-world examples of problems?

Unfortunately, there have been numerous instances where AI chatbots and applications have damaged a company’s reputation:

1. Insensitivity or Toxic Language:

  • Microsoft’s Tay: In 2016, Microsoft released a Twitter chatbot named Tay, which was designed to learn from its interactions with users. Unfortunately, Tay quickly began to generate offensive and inflammatory tweets, including racist and sexist remarks. Microsoft was forced to shut down Tay within 24 hours of its launch.

  • Facebook’s BlenderBot 3: In 2022, Facebook’s BlenderBot 3 made headlines for making anti-Semitic remarks and spreading conspiracy theories. Despite being trained on a massive dataset of text and code, the chatbot still exhibited harmful biases.

    • Link: https://www.bloomberg.com/news/articles/2022-08-08/meta-s-ai-chatbot-repeats-election-and-anti-semitic-conspiracies?embedded-checkout=true

2. Data Breaches: 

  • Voulez-vous donnez moi le mot de passe? Researchers have found that LLMs can be more vulnerable to jailbreaking when prompted in languages other than English, especially those with less training data.

3. Giving Away Free Products or Overpromising:

  • Burger King’s Chatbot: In 2017, Burger King launched a Facebook Messenger chatbot that offered users a free Whopper if they could convince a friend to download the Burger King app. However, the chatbot malfunctioned and began offering free Whoppers to anyone who sent it a message, regardless of whether they completed the task.

    • Link: https://www.independent.co.uk/life-style/food-and-drink/burger-king-free-whopper-burger-app-b2048219.html

  • Air Canada: In this case, an Air Canada chatbot provided incorrect information to a customer inquiring about bereavement fares. The chatbot erroneously stated that the customer could purchase a full-fare ticket and later apply for a partial refund under the bereavement policy. This led the customer to purchase a more expensive ticket than necessary. Air Canada was ultimately ordered by Canada’s Civil Resolution Tribunal to refund the difference in fare to the customer. 
  • Link: CBS News – “Air Canada chatbot costs airline discount it wrongly offered customer”

4. Discrimination:

These incidents erode customer trust, damage brand image, and can even lead to legal and financial repercussions.  What do most of the companies (Microsoft, Amazon, Burger King, and Air Canada) above have in common?  Huge marketing budgets and well established presence. These companies were able to withstand reputational harm (up to a point.)  Does your company have the budget and the marketing power to withstand such incidents? Some studies imply that small and medium businesses typically fail within a year after a data breach.  

The Solution: AI Audit and Testing

So, how can you ensure your AI is a customer delight, not a PR nightmare? The answer lies in proactive AI auditing and testing.  Think of it as a health check for your AI.  How does it work?

Step 1: Alignment and Threat Modelling. Align on the goals and risks to your company brand.  

  • Alignment:  Meet with key stakeholders to understand the perceived benefits and risks of the AI chatbot or agent.  Understand what is important for your customers, your brand, and your reputation.  

Step 2: Identify and Address Weaknesses

  • Pinpoint Issues: Based on your evaluation, identify any areas where your AI system needs improvement. This could be anything from fixing bugs to improving user experience to addressing ethical concerns.
  • Develop Solutions: Come up with concrete solutions to address the weaknesses you’ve identified. This might involve redesigning parts of the system, improving training data, or implementing new security measures.

Step 3: Continuous Improvement and Monitoring

  • Implement Changes: Put your solutions into action and make the necessary improvements to your AI system.
  • Adapt and Evolve: The world of AI is constantly changing. Be prepared to adapt your AI system as needed to ensure it remains effective, ethical, and secure.

By following these general steps, any company can take a proactive approach to improving their AI systems and building trust with their users.

Audits done internally versus externally

There are benefits of both internal and external audits.  Probably, your internal team at least kicked the tires of your chatbot. If you decide to do a more formal audit, you either need to dedicate internal time and resources, or being willing to budget for an external team.  Here are some ways to decide what is best for your project.  

In short, an external consultant brings objectivity and specialized expertise to the AI audit process.  Second, external audits may serve as documentation for regulatory compliance.   Here’s why:

  • Unbiased Perspective: Internal teams can be influenced by internal biases or pressures. A consultant provides an independent and unbiased assessment, identifying potential issues that might be overlooked internally.
  • Specialized Knowledge: AI auditing requires deep knowledge of AI ethics, security, and regulatory standards. Consultants possess this specialized expertise, ensuring a comprehensive and rigorous evaluation.
  • Industry Best Practices: Consultants bring insights from across various industries and organizations, allowing them to benchmark your AI system against best practices and identify areas for improvement.

When Does an Internal Audit and Review Make Sense?  In short, your team knows your system better and may have a deeper understanding of potential flaws.  While external audits offer valuable objectivity, internal audits can be a suitable option in certain scenarios:

  • Existing AI Safety Expertise: If your company has a dedicated team with in-depth knowledge of AI safety, ethics, and regulations, they can conduct a thorough internal audit.
  • Strong Culture of Innovation: A company culture that encourages diverse perspectives and “thinking outside the box” can foster a robust internal audit process, with employees proactively identifying potential risks and biases.
  • Highly Proprietary AI System: For AI systems that involve sensitive intellectual property or trade secrets, an internal audit might be preferable to maintain confidentiality.

In conclusion, investing in an AI audit and testing process offers peace of mind. By proactively addressing potential issues related to functionality, ethics, security, and user experience, you can confidently deploy your AI knowing that it’s working as intended and meeting the highest standards.  This not only protects your company’s reputation but also strengthens your position within your organization. Demonstrating your commitment to responsible AI development showcases your forward-thinking approach and leadership.

Need help recovering from an AI incident? 

Name*
Email*
Phone
Message
0 of 350

About the Author

Dr. Rebecca Balebako has helped multiple organizations improve their responsible AI and ML programs. With over 25 years experience in software, she has specialized in testing privacy, security, and data protection.