Estée Lauder Companies Leverages Cloud Run Worker Pools for Scalable AI Workloads
Estée Lauder Companies migrated its Rostrum platform, a polymorphic chat service for LLM-powered applications, to a producer-consumer model built on Cloud Run worker pools. The change was made to handle the holiday-season traffic surge for its consumer-facing AI application, Jo Malone London's AI Scent Advisor.
Why it matters
Estée Lauder Companies' use of Cloud Run worker pools demonstrates how serverless platforms can be leveraged to build scalable and reliable AI-powered applications, especially for handling high-traffic, distributed workloads.
Key Points
- Estée Lauder Companies decoupled the user-facing web tier from LLM operations using Cloud Run worker pools
- This architecture provided 100% message durability, strong UI latency SLAs, and minimal operations overhead
- Cloud Run worker pools are now the foundation for Estée Lauder Companies to rapidly launch specialized AI advisors across its diverse brand portfolio
Details
Cloud Run has traditionally been used for request-driven web applications and batch processing jobs. However, as developers build more complex applications, such as pipelines that process continuous streams of data or distributed AI workloads, they need an environment designed for continuous background execution.

Estée Lauder Companies' Rostrum platform, a polymorphic chat service for LLM-powered applications, originally ran as a standalone Cloud Run service. To launch its first consumer-facing generative AI application, Jo Malone London's AI Scent Advisor, the company needed an architecture that could sustain the load of AI prompts from thousands of simultaneous users. It migrated to a producer-consumer model using Cloud Run worker pools: the web tier acts as the producer, immediately publishing user messages to Cloud Pub/Sub, and the worker pool deployments act as 'always-on' consumers, pulling messages from the queue to handle LLM inference.

Because the web tier no longer waits on LLM inference, this decoupled architecture provided 100% message durability, strong UI latency SLAs, and minimal operations overhead, allowing the team to focus on the user experience rather than infrastructure. The modular design now serves as the blueprint for Estée Lauder Companies to rapidly launch specialized AI advisors across its diverse house of brands.
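
The producer-consumer split described above can be illustrated with a minimal in-process sketch. It is not Estée Lauder Companies' code and it does not call the Pub/Sub API: Python's standard-library `queue.Queue` stands in for the Pub/Sub topic, threads stand in for worker pool instances, and `publish`, `worker`, `fake_llm_inference`, and `run_demo` are hypothetical names chosen for illustration.

```python
import queue
import threading

# Stand-in for a Pub/Sub topic and subscription (assumption: in-process only).
message_queue = queue.Queue()
results = {}
results_lock = threading.Lock()


def fake_llm_inference(prompt):
    """Hypothetical placeholder for a slow LLM call."""
    return f"Suggested scent for: {prompt}"


def publish(session_id, prompt):
    """Producer (web tier): enqueue the message and return right away."""
    message_queue.put({"session_id": session_id, "prompt": prompt})
    return "accepted"


def worker():
    """Consumer (worker pool instance): pull messages until a sentinel arrives."""
    while True:
        message = message_queue.get()
        try:
            if message is None:  # sentinel value: stop this worker
                return
            reply = fake_llm_inference(message["prompt"])
            with results_lock:
                results[message["session_id"]] = reply
        finally:
            # Roughly analogous to acknowledging a message after processing.
            message_queue.task_done()


def run_demo(num_workers=2, num_messages=5):
    """Start consumers, publish a few messages, and wait for them to drain."""
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for i in range(num_messages):
        publish(f"user-{i}", f"prompt {i}")
    message_queue.join()  # block until every published message is processed
    for _ in threads:
        message_queue.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return dict(results)


if __name__ == "__main__":
    for session_id, reply in sorted(run_demo().items()):
        print(session_id, "->", reply)
```

The point of the sketch is that `publish` returns without waiting for inference, so web-tier latency does not depend on how long the model takes. In a real deployment the queue would be a durable managed service rather than process memory, which is where the message durability described in the article would come from.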