A reflection by Batiste Roger, CTO of Odonatech
I'll spare you the clichéd questions like 'Have we chosen the right use case?' This post contains 10 questions that, from experience, will truly make a difference when integrating generative AI.
I'll give you one preliminary piece of advice: it's often useful to put real users in front of ChatGPT and see whether they can solve the problem or use case that way. This will give you valuable insights into what they expect from your layer-2 AI, and how they interact with it (e.g., do they prefer formal or informal language?).
🔒 Security and compliance
Data protection: What data do we agree to transmit to ChatGPT? Are these messages transmitted individually, or as a complete conversation flow? (We could send every other message to ChatGPT and the rest to Claude, for example, to avoid transmitting an entire conversation.) Can we anonymize certain elements? Replacing first names with 'Michael' and last names with 'Jordan', for instance, is easy to do, as sketched below.
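To make that concrete, here is a minimal sketch of the 'Michael Jordan' trick, assuming one known customer per conversation; the function names and the restore step are mine, not any specific product's API:

```python
import re

def pseudonymize(first_name: str, last_name: str, text: str) -> str:
    """Replace this customer's real name with stand-ins before the text
    leaves our infrastructure."""
    text = re.sub(rf"\b{re.escape(first_name)}\b", "Michael", text)
    text = re.sub(rf"\b{re.escape(last_name)}\b", "Jordan", text)
    return text

def restore(first_name: str, last_name: str, text: str) -> str:
    """Put the real name back into the model's reply before display.
    A production system would use unambiguous placeholder tokens to avoid
    clashing with genuine occurrences of 'Michael' or 'Jordan'."""
    return text.replace("Michael", first_name).replace("Jordan", last_name)

# Usage:
# clean = pseudonymize("Alice", "Dupont", user_message)  # send `clean` to the LLM
# reply = restore("Alice", "Dupont", llm_reply)          # show `reply` to the user
```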
Reliability of responses: What strategy will we implement to prevent, detect, and quickly correct potentially erroneous or biased AI responses? Will human intervention be necessary when the bot makes a mistake? Within what timeframe can we guarantee that correction, given that the bot is available 24/7? (We can't expect Tom Brady to fix every pass, can we?)
Expertise and compliance: How will we ensure that the AI's responses are expert, personalized, and compliant with banking regulations? What knowledge bases and technologies will be used to (1) train the AI upstream, and (2) control the AI downstream (RAG, critics, layer 2, etc.)? How will these knowledge bases be created and maintained (is the chatbot manager responsible, or the subject-matter expert for each database)? A sketch of the 'critic' idea follows below.
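For illustration, one way the downstream 'critic' could look: a second model pass that checks a draft answer against a compliance checklist before anything reaches the customer. This is only a sketch; `call_llm` is a hypothetical placeholder for your actual API client, and the checklist items are invented examples:

```python
COMPLIANCE_CHECKLIST = """
- No personalized investment advice without the required disclaimers.
- No promises of guaranteed returns.
- Only mention products present in the approved knowledge base.
"""

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for your real LLM API client."""
    raise NotImplementedError

def critic_check(draft_answer: str) -> bool:
    """Second model pass: does the draft respect the checklist?"""
    verdict = call_llm(
        "You are a compliance reviewer for a bank.\n"
        f"Checklist:\n{COMPLIANCE_CHECKLIST}\n"
        f"Draft answer:\n{draft_answer}\n"
        "Reply with exactly PASS or FAIL."
    )
    return verdict.strip().upper().startswith("PASS")

def answer(question: str) -> str:
    draft = call_llm(question)
    if critic_check(draft):
        return draft
    # Safe fallback: escalate to a human rather than ship a risky answer.
    return "Let me put you in touch with an advisor for this question."
```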
🔧 Technical integration and differentiation
Integration with existing systems: What methodology will we adopt to integrate AI into our existing systems (CRM, operations) while preserving their security and stability? Do we want the AI to read from our systems? Write to them? In real time or not? Can we interface with a buffer database (sketched below)? We need to make sure our AI isn't just a bull in a china shop.
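Here is a minimal sketch of that buffer-database idea, using SQLite for illustration: the AI proposes CRM updates into a staging table, and only a separate, supervised job pushes approved rows to the real CRM. Table and field names are invented:

```python
import sqlite3

# Staging table: AI-proposed CRM updates wait here instead of hitting the CRM directly.
conn = sqlite3.connect("ai_buffer.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS pending_crm_updates (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id TEXT NOT NULL,
        field TEXT NOT NULL,
        proposed_value TEXT NOT NULL,
        status TEXT DEFAULT 'pending'  -- pending / approved / rejected
    )
""")

def propose_update(customer_id: str, field: str, value: str) -> None:
    """Called by the AI layer; nothing touches the CRM yet."""
    conn.execute(
        "INSERT INTO pending_crm_updates (customer_id, field, proposed_value) "
        "VALUES (?, ?, ?)",
        (customer_id, field, value),
    )
    conn.commit()

# A separate, human-supervised job reads 'approved' rows and syncs them to the CRM.
```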
Update management: What process will we establish to effectively manage the evolution of underlying AI models without disrupting our services? For example, can we easily switch from ChatGPT to Mistral? I'm not just talking about connecting APIs, but also about re-testing everything and adjusting prompts (see the abstraction sketch below). We don't want to fumble this handoff like the New York Jets' quarterback situation.
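A minimal sketch of the kind of provider abstraction that makes such a switch thinkable. The interface and class names are mine, the actual API calls are left as placeholders, and the re-testing still has to happen on top:

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """The single interface the rest of the codebase depends on."""
    @abstractmethod
    def complete(self, system_prompt: str, user_message: str) -> str: ...

class OpenAIProvider(LLMProvider):
    def complete(self, system_prompt: str, user_message: str) -> str:
        raise NotImplementedError  # wire your OpenAI client here

class MistralProvider(LLMProvider):
    def complete(self, system_prompt: str, user_message: str) -> str:
        raise NotImplementedError  # wire your Mistral client here

def get_provider(name: str) -> LLMProvider:
    """One config switch; the hard part is re-running the test suite afterwards."""
    return {"openai": OpenAIProvider, "mistral": MistralProvider}[name]()
```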
Brand differentiation: How will we customize our AI to fully reflect our values and brand, despite using common base models (like ChatGPT) that our competitors also use? Instructions, prompts, fine-tuning, RAG, something else? We need to ensure our AI isn't just another voice in the crowd, but the Beyoncé of banking bots.
🔄 Change management and customer experience of generative AI
Training and tooling: What interfaces/tools and training programs will we develop for our teams? Who will be the key players involved in supervising and complementing AI interactions? Will our employees be enthusiastic about becoming AI supervisors (for example)? We need to make sure we're not turning our staff into glorified robot babysitters, but empowering them to be AI whisperers.
Performance measurement: What key performance indicators (KPIs) will we implement to evaluate the success and ROI of our AI implementation? There are, of course, overall business indicators, but also more micro indicators, such as those that verify that version n+1 is actually better than version n (it's unrealistic to rely only on manual testing and gut feeling to say whether 'it seems better'; see the sketch below). We need to be as precise in our measurements as a Swiss watchmaker, not just winging it like a weekend golfer.
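In practice this means a fixed regression set and an automated scorer, so that 'n+1 beats n' is a number rather than a feeling. A minimal sketch, with invented test cases and a deliberately crude scorer (real suites mix exact checks, rubrics, and LLM-as-judge scoring):

```python
# Fixed regression set: (question, fact the answer must contain). Invented examples.
TEST_CASES = [
    ("What are your withdrawal fees?", "2 euros"),
    ("Up to what amount are my deposits insured?", "100,000 euros"),
]

def score(answer: str, expected_fact: str) -> float:
    """Crude stand-in: 1.0 if the expected fact appears in the answer, else 0.0."""
    return 1.0 if expected_fact.lower() in answer.lower() else 0.0

def evaluate(bot) -> float:
    """Average score of one bot version (a question -> answer callable)."""
    return sum(score(bot(q), fact) for q, fact in TEST_CASES) / len(TEST_CASES)

# Ship version n+1 only if evaluate(v_next) >= evaluate(v_current).
```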
Preference management: What procedure will we establish to accommodate users who prefer not to interact with AI or who explicitly express dissatisfaction?
Continuity plan: What business continuity strategy will we develop in case of AI API interruptions or major changes at our provider (e.g., OpenAI goes under, or Mistral has a 30-minute outage)? A fallback sketch follows below.
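As a sketch of the degraded-mode logic: try the primary provider, fall back to a secondary one, and finally to a scripted answer. It assumes provider objects exposing a `complete(system, user)` method, like the hypothetical `LLMProvider` interface above:

```python
SYSTEM_PROMPT = "You are our bank's assistant."  # illustrative

def answer_with_fallback(question: str, providers) -> str:
    """Try each provider in order; fall back to a scripted reply if all fail."""
    for provider in providers:
        try:
            return provider.complete(SYSTEM_PROMPT, question)
        except Exception:
            continue  # in production: log, alert, and measure the failure
    # Degraded mode: no AI available; hand over to humans or a static FAQ.
    return "Our assistant is temporarily unavailable; an advisor will reply shortly."
```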
Conclusion
I hope this will help you ask the right questions!
If this seems complicated, you're somewhat right: you will indeed need to develop automated testing software specific to AI, set up deployment processes, give your botmasters (or support teams) a supervision interface, coordinate compliance teams and data scientists, and so on. These are things we've already done at Odonatech, and they're one of the sources of value in our solution. So we can also discuss this if you'd like.