Dev.to | Machine Learning | 3h ago | Research & Papers | Products & Services

ONNX Runtime + pgvector in Django: Semantic Search Without PyTorch or External APIs

This article shows how to add semantic search to a Django web application without calling external APIs or installing PyTorch. It uses ONNX Runtime for model inference and pgvector, a PostgreSQL extension, for vector storage.


Why it matters

Embeddings are computed inside the application and stored in PostgreSQL, so small-to-medium web applications avoid per-request API costs, keep user text on their own servers, and skip the heavyweight PyTorch dependency.

Key Points

  • Calling external embedding APIs introduces costs, latency, and privacy concerns for small-to-medium web applications
  • ONNX (Open Neural Network Exchange) is an open model format: a model exported to ONNX can run without the framework it was trained in, which reduces deployment complexity
  • ONNX Runtime is a lean inference engine that executes ONNX models on CPU or GPU, so PyTorch is not needed at serving time
  • The article uses the paraphrase-multilingual-MiniLM-L12-v2 model, exported to ONNX format, to produce the embeddings for semantic search
  • pgvector stores and queries the vectors inside PostgreSQL, eliminating the need for a separate vector database
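To make the inference step concrete, here is a minimal sketch of turning ONNX Runtime output into sentence embeddings. It is not taken from the original article. It assumes the exported model returns per-token hidden states as its first output, that `session` is an `onnxruntime.InferenceSession`, and that `tokenizer` is a callable (for example a Hugging Face tokenizer) returning NumPy arrays; the function names `mean_pool` and `embed` are hypothetical. Mean pooling over non-padding tokens is the pooling this model family normally uses.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padding, then L2-normalize."""
    mask = attention_mask[..., np.newaxis].astype(np.float32)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)             # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)             # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)


def embed(texts, session, tokenizer):
    """Hypothetical wiring around an ONNX Runtime session.

    Assumes `session` is an onnxruntime.InferenceSession created once at
    startup and `tokenizer` returns a dict of NumPy arrays.
    """
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="np")
    # Feed only the inputs this particular export declares.
    wanted = {arg.name for arg in session.get_inputs()}
    feeds = {k: v.astype(np.int64) for k, v in enc.items() if k in wanted}
    # Assumption: the first output holds per-token hidden states.
    token_embeddings = session.run(None, feeds)[0]
    return mean_pool(token_embeddings, enc["attention_mask"])
```

Because the vectors are L2-normalized, cosine similarity between two embeddings reduces to a dot product.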

Details

The common way to add semantic search is to call an external embedding API. The author argues that this introduces per-request costs, network latency, and privacy concerns, which weigh most heavily on small-to-medium applications. The article's alternative is to compute embeddings inside the Django application itself. ONNX (Open Neural Network Exchange) is an open format for serialized models, so a pre-trained model exported to ONNX no longer needs its original framework at serving time. ONNX Runtime is a lean inference engine that executes such models on CPU or GPU, which removes PyTorch from the deployment. The article demonstrates this with the paraphrase-multilingual-MiniLM-L12-v2 model exported to ONNX format. The resulting vectors are stored and queried with pgvector in PostgreSQL, so no separate vector database is required.
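The storage and query side can be sketched as follows. This is an illustration under stated assumptions, not the article's code: the table `documents`, its columns, and the helper names are hypothetical, and the 384-dimension column matches the output size of paraphrase-multilingual-MiniLM-L12-v2. `<=>` is pgvector's cosine-distance operator, and `cursor` is any DB-API cursor using `%s` placeholders, such as the one from Django's `connection.cursor()`.

```python
def to_pgvector_literal(vec):
    """Format a sequence of numbers in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


# Hypothetical schema: an existing `documents` table gains an embedding column.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
ALTER TABLE documents ADD COLUMN embedding vector(384);
"""

# Smaller cosine distance means more similar, so sort ascending.
SEARCH_SQL = """
SELECT id, title, embedding <=> %s::vector AS distance
FROM documents
ORDER BY distance
LIMIT %s
"""


def search(cursor, query_vec, limit=5):
    """Return the `limit` nearest rows to `query_vec` by cosine distance."""
    cursor.execute(SEARCH_SQL, [to_pgvector_literal(query_vec), limit])
    return cursor.fetchall()
```

In a Django view this might be called as `with connection.cursor() as cur: rows = search(cur, query_vec)`. The pgvector Python package also ships Django helpers (a `VectorField` and distance expressions such as `CosineDistance`) that express the same query through the ORM; which of the two the original article uses is not stated in this summary.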
