Estée Lauder Companies Leverages Cloud Run Worker Pools for Scalable AI Workloads

Estée Lauder Companies migrated its Rostrum platform, a polymorphic chat service for LLM-powered applications, to a producer-consumer model built on Cloud Run worker pools. The change was made to handle the holiday-season traffic surge for its consumer-facing AI application, Jo Malone London's AI Scent Advisor.

Why it matters

Estée Lauder Companies' use of Cloud Run worker pools demonstrates how serverless platforms can be leveraged to build scalable and reliable AI-powered applications, especially for handling high-traffic, distributed workloads.

Key Points

  • Estée Lauder Companies decoupled the user-facing web tier from LLM operations using Cloud Run worker pools
  • This architecture provided 100% message durability, strong UI latency SLAs, and minimal operations overhead
  • Cloud Run worker pools are now the foundation for Estée Lauder Companies to rapidly launch specialized AI advisors across its diverse brand portfolio

Details

Cloud Run has traditionally been used for request-driven web applications and batch processing jobs. However, as developers build more complex applications, such as pipelines that process continuous streams of data or distributed AI workloads, they need an environment designed for continuous background execution.

Estée Lauder Companies' Rostrum platform, a polymorphic chat service for LLM-powered applications, originally ran as a standalone Cloud Run service. To launch its first consumer-facing generative AI application, Jo Malone London's AI Scent Advisor, the company needed an architecture that could sustain the load of AI prompts from thousands of simultaneous users.

Estée Lauder Companies migrated to a producer-consumer model using Cloud Run worker pools. The web tier acts as the producer, immediately publishing user messages to Cloud Pub/Sub, and the worker pool deployments act as 'always-on' consumers, pulling messages from the queue to handle LLM inference. Because the web tier no longer waits on inference, this decoupled architecture provided 100% message durability, strong UI latency SLAs, and minimal operations overhead, allowing the team to focus on the user experience rather than infrastructure. This modular architecture now serves as the blueprint for Estée Lauder Companies to rapidly launch specialized AI advisors across its diverse house of brands.
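
The producer-consumer pattern described above can be illustrated with a minimal, self-contained Python sketch. It is not Estée Lauder Companies' implementation: an in-process `queue.Queue` stands in for Cloud Pub/Sub, threads stand in for worker pool instances, and every name here (`fake_llm_inference`, `produce`, `worker`, `run_demo`) is a hypothetical placeholder chosen for illustration. An in-memory queue also offers none of the durability that a managed message service provides.

```python
import queue
import threading

# Sentinel that tells a worker to shut down (an assumption of this sketch).
STOP = None


def fake_llm_inference(prompt: str) -> str:
    """Hypothetical stand-in for a slow LLM call."""
    return f"advice for: {prompt}"


def produce(q: queue.Queue, messages: list[str]) -> None:
    """Web tier (producer): enqueue each message and return immediately."""
    for message in messages:
        q.put(message)


def worker(q: queue.Queue, results: dict[str, str], lock: threading.Lock) -> None:
    """Always-on consumer: pull messages until the STOP sentinel arrives."""
    while True:
        message = q.get()
        try:
            if message is STOP:
                return
            reply = fake_llm_inference(message)
            with lock:
                results[message] = reply
        finally:
            # Mark the item as handled so q.join() can complete.
            q.task_done()


def run_demo(messages: list[str], num_workers: int = 3) -> dict[str, str]:
    """Start the workers, publish the messages, and wait for all replies."""
    q: queue.Queue = queue.Queue()
    results: dict[str, str] = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(q, results, lock))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    produce(q, messages)
    for _ in threads:
        q.put(STOP)  # one sentinel per worker
    q.join()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    replies = run_demo(["citrus", "woody", "floral"], num_workers=2)
    for prompt in sorted(replies):
        print(prompt, "->", replies[prompt])
```

The point of the sketch is that the producer never waits on inference: it only enqueues, so the user-facing latency is independent of how long each worker takes. In a real deployment the queue would be an external, durable service and the workers would run as separate long-lived processes.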