News
Microsoft released Visual Studio Code 1.104, adding Auto model selection and support for contributed language models in chat, alongside security, productivity, and Model Context Protocol (MCP) ...
OpenBench provides standardized, reproducible benchmarking for LLMs across 30+ evaluation suites (and growing) spanning knowledge, math, reasoning, coding, science, reading comprehension, health, long ...