Reddit Machine Learning · 8h ago | Research & Papers · Products & Services

Achieving 0.9975 F1 on HDFS Log Anomaly Detection with Mamba-3

The author trained a log anomaly detector using the Mamba-3 architecture, achieving an F1 score of 0.9975 on the HDFS benchmark dataset. This outperforms previous state-of-the-art models, and the author is now curious to explore the limits of the approach.

💡

Why it matters

This work demonstrates significant performance improvements in log anomaly detection, which is an important task for system monitoring and reliability.

Key Points

  • Mamba-3-based log anomaly detector achieved a 0.9975 F1 score on the HDFS benchmark
  • Key improvement came from template-based tokenization instead of an NLP-style approach
  • Model is small (4.9M params), trains quickly (36 min on an RTX 4090), and runs inference fast (500+ logs/sec)
  • Continuous anomaly scores can enable adaptive thresholds for production use
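The adaptive-threshold idea in the last point can be sketched as a rolling quantile over recent anomaly scores. This is a hypothetical illustration, not the author's implementation; the class name, history size, and quantile choice are all assumptions.

```python
from collections import deque

class AdaptiveThreshold:
    """Illustrative sketch: flag a log window as anomalous when its
    continuous anomaly score exceeds a high quantile of recent scores.
    Parameters are placeholders, not values from the post."""

    def __init__(self, history=1000, quantile=0.99, warmup=100):
        self.scores = deque(maxlen=history)  # sliding window of recent scores
        self.quantile = quantile
        self.warmup = warmup

    def update(self, score: float) -> bool:
        flagged = False
        if len(self.scores) >= self.warmup:  # only trust a populated window
            ranked = sorted(self.scores)
            cutoff = ranked[int(self.quantile * (len(ranked) - 1))]
            flagged = score > cutoff
        self.scores.append(score)
        return flagged
```

The threshold tracks the traffic's own score distribution, so a fixed cutoff does not have to be re-tuned as workloads drift.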

Details

The author built a log anomaly detection model using the recently published Mamba-3 / SSM architecture. Starting with a standard NLP-style tokenization approach, the model initially achieved an F1 score of only 0.61-0.74. The breakthrough came from switching to template-based tokenization, in which each log event type is represented by a single token. This reduced the vocabulary size from 8000 to 50, shrank the model by roughly 10x, cut training time significantly, and reduced overfitting. The final model has 4.9M parameters, trains in 36 minutes on an RTX 4090, and runs inference at over 500 logs per second on a single consumer GPU. The author now plans to apply the approach to other log anomaly detection benchmarks such as BGL, Thunderbird, and Spirit.
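The template-based tokenization step described above, mapping each log event type to one token, can be sketched as follows. The template patterns and event IDs here are illustrative; a real pipeline would typically mine templates automatically with a log parser such as Drain rather than hand-write them.

```python
import re

# Illustrative HDFS-style templates; in practice these would be mined
# from the log corpus, yielding a vocabulary of roughly 50 event types.
TEMPLATES = [
    (re.compile(r"Receiving block \S+ src: \S+ dest: \S+"), "E1"),
    (re.compile(r"PacketResponder \d+ for block \S+ terminating"), "E2"),
    (re.compile(r"Verification succeeded for \S+"), "E3"),
]

def tokenize(line: str) -> str:
    """Map a raw log line to a single event-type token."""
    for pattern, event_id in TEMPLATES:
        if pattern.search(line):
            return event_id
    return "UNK"  # unseen event type

# A session of raw log lines becomes a short token sequence over a
# tiny vocabulary, instead of thousands of subword tokens.
session = [
    "Receiving block blk_123 src: /10.0.0.1 dest: /10.0.0.2",
    "PacketResponder 0 for block blk_123 terminating",
]
print([tokenize(line) for line in session])  # → ['E1', 'E2']
```

Collapsing variable fields (block IDs, IP addresses) into the template token is what shrinks the vocabulary from ~8000 subwords to ~50 event types, which in turn allows the much smaller model.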


AI Curator - Daily AI News Curation
