LLM Benchmarks

(2 articles)

NousCoder-14B Benchmark: How a $3M Open Model Matches Claude

Nous Research's 14B coding model hits 67.87% on LiveCodeBench, closing the gap with proprietary rivals in just 4 days of training. Here's what the numbers...

March 12, 2026 · 6 min

ChatGPT Just Made Math Homework Less Painful—Here's How

OpenAI rolled out interactive visual explanations for math and science in ChatGPT, letting students actually understand formulas instead of memorizing them....

March 12, 2026 · 5 min