Migrating AI-infused chat to Kubernetes
Steven Jones and Nicholas Fong walk you through migrating a chatbot, cognitive search, and other services to a Kubernetes-based architecture. Technologies covered include multiregion clusters, load balancers, Express and Flask server integration, and high-speed data transfer for importing models.
Talk Title | Migrating AI-infused chat to Kubernetes |
Speakers | Steven Jones (IBM), Nicholas Fong (IBM) |
Conference | O’Reilly Software Architecture Conference |
Conf Tag | Engineering the Future of Software |
Location | New York, New York |
Date | February 24-26, 2020 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
Steven Jones and Nicholas Fong explain how a team of messaging and AI engineers at IBM builds, tests, deploys, and manages more than 20 customer-facing chatbots that serve people on the web every day across various geographies and languages. The chatbots had to answer customers’ most common questions, direct users to the right resource, and transfer the chat to the most knowledgeable human.

To process natural language queries, IBM needed a Python-based microservice that could leverage the vast number of natural language and data processing libraries already available in Python. This service used large existing language models as well as custom-built models and had high disk, CPU, and memory requirements. IBM fronted the Python-based microservice with a Node.js Express server that could handle large request volumes and hand requests off either to other microservices or to IBM’s own NLP microservice. The Express server also serves IBM’s frontend chat experience, and the same JavaScript developers who work on that frontend also worked on this microservice. The Node.js microservice was lean and required little memory and disk space.

The teams needed an easy way to manage, deploy, scale, and update these microservices. They found that out-of-the-box platform-as-a-service (PaaS) technologies had CPU, memory, and disk limitations that prevented them from leveraging large models, while bare-metal servers are hard to manage. IBM’s microservices also intercommunicate frequently, so network communication speed is important. Kubernetes met all of the teams’ needs: it’s scalable, flexible, manageable, and fast.
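
The contrast between the lean Node.js server and the resource-hungry Python service could be expressed in Kubernetes roughly as follows. This is only an illustrative sketch, not IBM's actual configuration: the service names, image URLs, ports, replica counts, and resource figures are all assumptions invented for the example. It builds two `apps/v1` Deployment manifests as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts as well as YAML.

```python
import json


def deployment(name, image, port, replicas, cpu, memory, ephemeral_storage):
    """Build a Kubernetes apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    resources = {
        "cpu": cpu,
        "memory": memory,
        "ephemeral-storage": ephemeral_storage,
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # Requests equal limits here, purely for simplicity.
                            "resources": {
                                "requests": dict(resources),
                                "limits": dict(resources),
                            },
                        }
                    ]
                },
            },
        },
    }


# Hypothetical lean Node.js Express front end: small footprint, more replicas.
chat_gateway = deployment(
    "chat-gateway", "registry.example.com/chat-gateway:1.0",
    port=3000, replicas=3, cpu="250m", memory="256Mi", ephemeral_storage="1Gi",
)

# Hypothetical Python NLP service: large models need more CPU, memory, and disk.
nlp_service = deployment(
    "nlp-service", "registry.example.com/nlp-service:1.0",
    port=5000, replicas=2, cpu="2", memory="8Gi", ephemeral_storage="20Gi",
)

if __name__ == "__main__":
    # Wrap both manifests in a v1 List so they can be applied together.
    bundle = {"apiVersion": "v1", "kind": "List", "items": [chat_gateway, nlp_service]}
    print(json.dumps(bundle, indent=2))
```

Sizing each service separately is the point of the sketch: the two workloads scale independently, so the heavy NLP pods do not force the lightweight front end onto equally large nodes.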