Test-Time Scaling

#Pocket#NLP#LanguageModel#ICLR
Issue Date: 2025-07-01 Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search, Yuichi Inoue+, ICLR25 Comment元ポスト:https://x.com/iwiwi/status/1939914618132168961?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #Pocket#NLP#LanguageModel#QuestionAnswering#KnowledgeGraph#FactualConsistency#Reasoning#PostTraining
Issue Date: 2025-05-20 Scaling Reasoning can Improve Factuality in Large Language Models, Mike Zhang+, arXiv25 Comment元ポスト:https://x.com/_akhaliq/status/1924477447120068895?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #EfficiencyImprovement#Pocket#NLP#LanguageModel#ICLR#Verification
Issue Date: 2025-05-13 Faster Cascades via Speculative Decoding, Harikrishna Narasimhan+, ICLR25 Comment元ポスト:https://x.com/hillbig/status/1922059828429832259?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-QOpenReview: https://openreview.net/forum?id=vo9t20wsmd ...

#Survey#Pocket#NLP#LanguageModel
Issue Date: 2025-04-02 What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models, Qiyuan Zhang+, arXiv25 Comment元ポスト:https://x.com/hesamation/status/1907095419793911893?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Qとてつもない量だ…網羅性がありそう。What to Scaleがよくあるselfconsistency(Parallel Sca ... #Pocket#NLP#LanguageModel#LLM-as-a-Judge
Issue Date: 2025-03-27 Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators, Seungone Kim+, arXiv25 Comment元ポスト:https://x.com/jinulee_v/status/1905025016401428883?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-QLLM-as-a-JudgeもlongCoT+self-consistencyで性能が改善するらしい。![image](https ... #Pocket#NLP#LanguageModel#Verification
Issue Date: 2025-03-18 Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification, Eric Zhao+, arXiv25 Comment元ポスト:https://x.com/ericzhao28/status/1901704339229732874?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Qざっくりしか読めていないが、複数の解答をサンプリングして、self-verificationをさせて最も良かったものを選択するア ... #Pocket#NLP#LanguageModel
Issue Date: 2025-02-12 Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling, Runze Liu+, arXiv25 #Pocket#NLP#LanguageModel#Architecture
Issue Date: 2025-02-10 Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach, Jonas Geiping+, arXiv25 #Pocket#NLP#LanguageModel#Supervised-FineTuning (SFT)#read-later
Issue Date: 2025-02-07 s1: Simple test-time scaling, Niklas Muennighoff+, arXiv25 Comment解説:https://x.com/hillbig/status/1887260791981941121?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #Pocket#NLP#LanguageModel#Reasoning
Issue Date: 2025-01-28 Evolving Deeper LLM Thinking, Kuang-Huei Lee+, arXiv25 CommentOpenReview: https://openreview.net/forum?id=nGP1UxhAbV&referrer=%5Bthe%20profile%20of%20Kuang-Huei%20Lee%5D(%2Fprofile%3Fid%3D~Kuang-Huei_Lee1) ... #EfficiencyImprovement#Pocket#NLP#LanguageModel
Issue Date: 2024-11-12 Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters, Charlie Snell+, arXiv24 Comment![image](https://github.com/user-attachments/assets/0562a65e-b2f1-4ff4-b806-107313fc255e)[Perplexity(参考;Hallucinationに注意)](https://www.perplexity.ai/s ... #Article#Tutorial#NLP#LanguageModel#Blog#Reasoning
Issue Date: 2025-03-09 The State of LLM Reasoning Models, Sebastian Raschka, 2025.03 #Article#Pocket#LanguageModel#Blog
Issue Date: 2024-12-17 Scaling test-time-compute, Huggingface, 2024.12 Commentこれは必読 ... #Article#NLP#LanguageModel#Chain-of-Thought#Reasoning
Issue Date: 2024-09-13 OpenAI o1, 2024.09 CommentJason Wei氏のポスト:https://x.com/_jasonwei/status/1834278706522849788?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q#1072 や #1147 で似たような考えはすでに提案されていたが、どのような点が異なるのだろうか? たと ...