Google Cloud AI | 5h ago | Business & Industry | Products & Services

Estée Lauder Companies Leverages Cloud Run Worker Pools for Scalable AI Workloads

Estée Lauder Companies migrated Rostrum, its polymorphic chat service for LLM-powered applications, to a producer-consumer model built on Cloud Run worker pools. The migration was made to handle the holiday-season traffic surge for its consumer-facing AI application, Jo Malone London's AI Scent Advisor.

Why it matters

Estée Lauder Companies' use of Cloud Run worker pools demonstrates how serverless platforms can be leveraged to build scalable and reliable AI-powered applications, especially for handling high-traffic, distributed workloads.

Key Points

1. Estée Lauder Companies decoupled the user-facing web tier from LLM operations using Cloud Run worker pools.
2. This architecture provided 100% message durability, strong UI latency SLAs, and minimal operations overhead.
3. Cloud Run worker pools are now the foundation for Estée Lauder Companies to rapidly launch specialized AI advisors across its diverse brand portfolio.

Details

Cloud Run has traditionally been used for request-driven web applications and batch processing jobs. However, as developers build more complex applications, such as pipelines that process continuous streams of data or distributed AI workloads, they need an environment designed for continuous background execution.

Estée Lauder Companies' Rostrum platform, a polymorphic chat service for LLM-powered applications, originally ran as a standalone Cloud Run service. To launch its first consumer-facing generative AI application, Jo Malone London's AI Scent Advisor, the company needed an architecture that could sustain the load of AI prompts from thousands of simultaneous users.

Estée Lauder Companies migrated to a producer-consumer model using Cloud Run worker pools. The web tier acts as the producer, instantly publishing user messages to Cloud Pub/Sub. The worker pool deployments act as 'always-on' consumers, pulling messages from the queue to handle LLM inference. This decoupled architecture provided 100% message durability, strong UI latency SLAs, and minimal operations overhead, allowing the team to focus on the user experience rather than infrastructure. It now serves as the blueprint for Estée Lauder Companies to rapidly launch specialized AI advisors across its diverse house of brands.
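The producer-consumer split described above can be illustrated with a minimal, in-process sketch. This is not Estée Lauder's code and does not use the Cloud Run or Pub/Sub APIs: a standard-library `queue.Queue` stands in for the Pub/Sub topic, threads stand in for worker pool instances, and `handle_user_message`, `fake_llm_inference`, `worker`, and `run_demo` are hypothetical names chosen for illustration.

```python
import queue
import threading

# Assumption: an in-memory queue stands in for a Pub/Sub topic and subscription.
message_queue = queue.Queue()
replies: dict[str, str] = {}
replies_lock = threading.Lock()


def fake_llm_inference(prompt: str) -> str:
    """Hypothetical placeholder for the slow LLM call."""
    return f"advice for: {prompt}"


def handle_user_message(session_id: str, prompt: str) -> str:
    """Producer (web tier): enqueue the message and return without waiting for the LLM."""
    message_queue.put({"session_id": session_id, "prompt": prompt})
    return "accepted"


def worker() -> None:
    """Consumer (worker instance): pull messages until a None sentinel arrives."""
    while True:
        message = message_queue.get()
        try:
            if message is None:
                return
            reply = fake_llm_inference(message["prompt"])
            with replies_lock:
                replies[message["session_id"]] = reply
        finally:
            # Rough analogue of acknowledging a message once it has been handled.
            message_queue.task_done()


def run_demo(num_workers: int = 3) -> dict[str, str]:
    """Start the consumers, publish a few messages, then shut down cleanly."""
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for i in range(5):
        handle_user_message(f"user-{i}", f"prompt {i}")
    for _ in threads:
        message_queue.put(None)  # one shutdown sentinel per worker
    message_queue.join()
    for t in threads:
        t.join()
    return dict(replies)


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that `handle_user_message` returns as soon as the message is queued, so the producer's response time does not depend on how long inference takes. In a real deployment the queue would be a managed service that persists messages independently of both tiers, which an in-memory queue does not do.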



