A Tool That Pits 9 Free LLMs Against Your Code as Adversarial Reviewers
The author built a tool called AIF (Architecture Impact Framework) that sends your code to 9 different large language models (LLMs) across 6 providers, each acting as an independent, adversarial reviewer. The tool then synthesizes the findings and walks the user through an interactive triage session.
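The fan-out step can be sketched roughly as follows. This is a minimal illustration, not AIF's actual implementation; the model names are stand-ins for the post's free stack, and `review_fn` is a hypothetical wrapper around each provider's API call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical model identifiers standing in for the post's free stack.
MODELS = ["llama", "qwen3", "nemotron", "mistral"]

def fan_out_reviews(code: str, review_fn, models=MODELS) -> dict:
    """Send the same code to every model independently.

    Each model reviews in isolation (no model sees another's output),
    which is the independence property the cross-referencing relies on.
    `review_fn(model, code)` is assumed to return that model's findings.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {m: pool.submit(review_fn, m, code) for m in models}
        return {m: f.result() for m, f in futures.items()}
```

Running the reviews concurrently rather than sequentially matters at this scale: nine network-bound API calls in parallel take roughly as long as the slowest single call.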
Why it matters
Every model has its own blind spots, so aggregating independent reviews from multiple models can surface a wider range of issues than any single model, and the agreement level across models gives a built-in signal for which findings deserve attention first.
Key Points
- AIF sends your code to 9 different LLMs across 6 providers for independent code review
- The tool cross-references the findings and assigns agreement levels (ALL, MAJORITY, SINGLE)
- The tool uses a free model stack including Llama, Qwen3, Nemotron, and Mistral
- The author found the ALL-agree findings were consistently real issues, while MAJORITY and SINGLE findings were more mixed
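The agreement-level bucketing can be sketched as below. The thresholds follow the post's description (ALL at 60%+ of models, MAJORITY at 2 or more, SINGLE at exactly one); the issue names and model labels are hypothetical examples, not AIF's real output.

```python
TOTAL_MODELS = 9  # size of the post's free model stack

def agreement_level(flag_count: int, total: int = TOTAL_MODELS) -> str:
    """Bucket a finding by how many models flagged it."""
    if flag_count >= 0.6 * total:   # 60%+ of models -> ALL
        return "ALL"
    if flag_count >= 2:             # 2 or more -> MAJORITY
        return "MAJORITY"
    return "SINGLE"                 # exactly one model

# Hypothetical cross-referenced findings: issue -> models that flagged it
findings = {
    "sql-injection-in-query-builder": ["llama", "qwen3", "nemotron",
                                       "mistral", "llama-70b", "qwen-coder"],
    "missing-null-check": ["qwen3", "mistral"],
    "prefer-f-strings": ["nemotron"],
}

levels = {issue: agreement_level(len(models))
          for issue, models in findings.items()}
```

With 9 models, the 60% threshold means at least 6 models must flag the same issue before it lands in the ALL bucket, which matches the author's observation that ALL-agree findings were consistently real.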
Details
The author built AIF (Architecture Impact Framework) to address the problem of relying on a single AI model for code review, since each model has its own blind spots. AIF sends your code to 9 different LLMs across 6 providers simultaneously, and the models review the code independently, without seeing each other's outputs.

The tool then cross-references the findings and assigns agreement levels: ALL (60%+ of models flagged the same issue), MAJORITY (2 or more models agree), and SINGLE (only one model flagged it). After the synthesis, the tool runs an interactive triage session where the user reviews each finding and marks it as valid, partially valid, already fixed, defer, reject, or skip.

The author found that the ALL-agree findings were consistently real issues, while the MAJORITY and SINGLE findings were a mix of real issues, false positives, and stylistic concerns. The tool is designed to be fully configurable, allowing users to swap in any model or provider they want.
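The triage session described above can be sketched as a simple loop over the synthesized findings. This is an assumed shape, not AIF's actual code: the status names come from the post, while `decide` is a hypothetical stand-in for the interactive prompt.

```python
# Triage verdicts, taken from the post's description.
STATUSES = ("valid", "partially valid", "already fixed",
            "defer", "reject", "skip")

def triage(findings: dict, decide) -> dict:
    """Walk each finding and record the user's verdict.

    `decide(issue, detail)` stands in for the interactive prompt;
    here it is any callable returning one of STATUSES.
    """
    verdicts = {}
    for issue, detail in findings.items():
        choice = decide(issue, detail)
        if choice not in STATUSES:
            raise ValueError(f"unknown triage status: {choice}")
        verdicts[issue] = choice
    return verdicts
```

Separating the verdict logic from the prompt like this also makes the triage step easy to test, since `decide` can be replaced with any scripted callable.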