Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite an adequate token budget. By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: (1) low-complexity tasks where