https://echo-wiki.win/index.php/Why_Citations_Are_Not_Safety:_The_Illusion_of_Accuracy_in_RAG_Systems
Benchmarks are noisy in 2026, and your measured hallucination rate changes with the test you run. Even with web search, error rates on HalluHard still reach 30.2%. Don't rely on generic scores.