Why So Many ML Algorithms Exist Despite Big Names
The article discusses why multiple machine learning algorithms continue to exist and be used, even when there are dominant algorithms like XGBoost, LightGBM, and CatBoost. It argues that the idea of a single 'best' algorithm is a myth, as algorithms optimize for different constraints beyond just accuracy.
Why it matters
Choosing a model on benchmark accuracy alone ignores the constraints of the system it has to run in. The article explains why a diverse set of machine learning algorithms persists even as a few become dominant.
Key Points
- Benchmarks optimize for a single metric, but real-world production systems also care about latency, memory footprint, and infrastructure compatibility
- Algorithms don't compete in a vacuum; they compete inside systems with specific constraints
- Cloud providers make money based on resource consumption, not just model accuracy, so inference efficiency is crucial
- Dominant algorithms are not designed to optimize for every constraint, leaving room for specialized alternatives
Details
The article argues that the machine learning field has not converged on a single 'best' algorithm because algorithms optimize for different constraints, not only accuracy. In production environments, benchmark scores alone do not decide the choice: latency, memory usage, and infrastructure compatibility matter too. Each algorithm has its own strengths, such as tree ensembles for accuracy, linear models for speed, and KNN for interpretability. Because cloud providers make money based on resource consumption rather than model accuracy alone, inference efficiency is also a key consideration. Dominant algorithms like XGBoost are highly engineered but were not designed to optimize for every possible constraint, which leaves room for specialized alternatives like SmartKNN that focus on predictable inference cost and low latency. The article concludes that the continued existence of multiple ML algorithms is a sign that the technology is being used in the real world, not just optimized on paper.
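The idea that algorithms compete inside systems rather than on a leaderboard can be made concrete with a small sketch. The Python below is not from the article: the `Candidate` type, the `pick_model` helper, and every number in `CANDIDATES` are hypothetical placeholders chosen only to illustrate the argument, not measurements of any real model. It shows how the same set of candidates can yield different winners once latency and memory budgets are applied.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float        # offline benchmark score, higher is better
    p99_latency_ms: float  # tail latency per prediction
    memory_mb: float       # size of the loaded model


def pick_model(
    candidates: List[Candidate],
    max_latency_ms: float,
    max_memory_mb: float,
) -> Optional[Candidate]:
    """Return the most accurate candidate that fits both budgets, or None."""
    feasible = [
        c for c in candidates
        if c.p99_latency_ms <= max_latency_ms and c.memory_mb <= max_memory_mb
    ]
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Placeholder numbers for illustration only; these are NOT measurements.
CANDIDATES = [
    Candidate("boosted_trees", accuracy=0.94, p99_latency_ms=12.0, memory_mb=300.0),
    Candidate("linear_model", accuracy=0.89, p99_latency_ms=0.5, memory_mb=2.0),
    Candidate("knn", accuracy=0.91, p99_latency_ms=4.0, memory_mb=150.0),
]

# A benchmark looks at accuracy alone.
benchmark_winner = max(CANDIDATES, key=lambda c: c.accuracy)

# A tight deployment (hypothetical edge device) adds latency and memory budgets.
edge_choice = pick_model(CANDIDATES, max_latency_ms=5.0, max_memory_mb=50.0)

# A roomy deployment (hypothetical batch server) barely constrains the choice.
server_choice = pick_model(CANDIDATES, max_latency_ms=50.0, max_memory_mb=1024.0)

print(benchmark_winner.name)  # boosted_trees
print(edge_choice.name)       # linear_model
print(server_choice.name)     # boosted_trees
```

With these made-up figures the benchmark winner and the server choice coincide, while the tighter budget selects a less accurate but cheaper model. In practice the latency and memory values would come from profiling on the target hardware.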