Reranking 565K Products Using Deep Learning at SeeStocks
SeeStocks, a price comparison engine, built a multi-stage deep learning pipeline to rerank over 565,000 products and improve relevance on their category pages.
Why it matters
This case study shows how a multi-stage retrieve-then-rerank pipeline built on deep learning can improve product search and discovery on a large ecommerce catalog.
Key Points
- Implemented a 3-stage pipeline: candidate retrieval, cross-encoder reranking, and business rules/diversity
- Leveraged vision-language models, taxonomic distance, and price distribution to score product relevance
- Faced challenges with flat product taxonomies and built a hierarchical disambiguation layer to improve classification
- Deployed the pipeline in production, significantly improving relevance and user engagement and reducing misclassification
Details
SeeStocks, a Spanish price comparison platform, manages a catalog of over 565,000 products across multiple retailers. To ensure the most relevant products appear first on its category pages, the team built a multi-stage deep learning pipeline. The first stage uses approximate nearest neighbor search against pre-computed category embeddings to retrieve a broad set of candidate products. The second stage reranks these candidates with a cross-encoder model that evaluates visual similarity, taxonomic distance, title-category coherence, and price distribution. The third stage applies business rules such as deduplication, retailer diversity, and freshness decay. A key challenge was the limitation of flat product taxonomies, which the team addressed by building a hierarchical disambiguation layer. The full pipeline runs on a single GPU server with under 200 ms end-to-end latency, and has significantly improved relevance and user engagement metrics while reducing misclassification.
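The three stages described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not SeeStocks' implementation: the `Product` fields, the function names, the 30-day half-life, and the per-retailer cap are all hypothetical. Stage 1 uses exact cosine similarity as a stand-in for approximate nearest neighbor search, and Stage 2 takes any caller-supplied scoring function where a production system would call the cross-encoder.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Product:
    # All fields are hypothetical; the source does not describe the schema.
    product_id: str
    group_id: str       # listings of the same item share a group_id
    retailer: str
    age_days: float     # days since the offer was last refreshed
    embedding: Sequence[float]


Scored = List[Tuple[float, Product]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(category_embedding: Sequence[float],
             products: List[Product], k: int) -> Scored:
    """Stage 1 stand-in: exact top-k by cosine similarity.
    A production system would query an ANN index instead."""
    scored = [(cosine(category_embedding, p.embedding), p) for p in products]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return scored[:k]


def rerank(candidates: Scored, scorer: Callable[[Product], float]) -> Scored:
    """Stage 2 stand-in: rescore each candidate with `scorer`,
    which is where a cross-encoder would be invoked."""
    rescored = [(scorer(p), p) for _, p in candidates]
    rescored.sort(key=lambda sp: sp[0], reverse=True)
    return rescored


def apply_business_rules(ranked: Scored, half_life_days: float = 30.0,
                         max_per_retailer: int = 2) -> Scored:
    """Stage 3: freshness decay, then deduplication and a per-retailer cap."""
    # Exponential decay: the score halves every `half_life_days`.
    decayed = [(s * 0.5 ** (p.age_days / half_life_days), p) for s, p in ranked]
    decayed.sort(key=lambda sp: sp[0], reverse=True)

    seen_groups = set()
    per_retailer = {}
    result = []
    for score, p in decayed:
        if p.group_id in seen_groups:
            continue  # duplicate listing of an item already shown
        if per_retailer.get(p.retailer, 0) >= max_per_retailer:
            continue  # keep any single retailer from dominating the page
        seen_groups.add(p.group_id)
        per_retailer[p.retailer] = per_retailer.get(p.retailer, 0) + 1
        result.append((score, p))
    return result
```

Chaining the stages looks like `apply_business_rules(rerank(retrieve(category_embedding, products, k), scorer))`. Keeping the stages as separate functions mirrors the design in the write-up: a cheap, broad first pass narrows the catalog so that the expensive scorer only runs on a small candidate set.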