Reasoning
#RecommenderSystems#CollaborativeFiltering#Pocket#NLP#LanguageModel#RAG(RetrievalAugmentedGeneration)
Issue Date: 2025-03-27 RALLRec+: Retrieval Augmented Large Language Model Recommendation with Reasoning, Sichun Luo+, arXiv25 Comment Original post: https://x.com/_reachsumit/status/1905107217663336832?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q Apparently the first study to apply reasoning LLMs to RecSys (or so the Related Work section claims). ... #Survey#Pocket#NLP#LanguageModel
Issue Date: 2025-03-23 Thinking Machines: A Survey of LLM based Reasoning Strategies, Dibyanayan Bandyopadhyay+, arXiv25 Comment Original post: https://x.com/dair_ai/status/1903843684568666450?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q Reasoning strategies are categorized into three types: RL, test-time compute, and self-training; each category is then broken down into finer ... #Survey#Efficiency/SpeedUp#Pocket#LanguageModel
Issue Date: 2025-03-22 Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models, Yang Sui+, arXiv25 Comment A survey of methods for mitigating the overthinking phenomenon in reasoning models (generating unnecessary reasoning steps).#Adapter/LoRA
Issue Date: 2025-03-19 The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models, Ke Ji+, arXiv25 Comment Only skimmed, but: the opening of a reasoning trace plays an important role, and from the intuition that whatever is shared across the reasoning traces of many sampled responses must be important (Prefix Self-Consistency), the model is trained to properly generate the opening of the reasoning trace ... #Survey#Pocket#NLP#LanguageModel#Finetuning (SFT)
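The Prefix Self-Consistency intuition above can be sketched as follows: sample several reasoning traces for the same prompt and measure which opening tokens they agree on. This is a minimal illustration, not the paper's actual method; the whitespace tokenization, the `k`-token window, and the example traces are all assumptions.

```python
from collections import Counter

def most_common_prefix(traces, k=8):
    """Return the most frequent k-token prefix among sampled reasoning
    traces, plus the fraction of traces that share it."""
    prefixes = Counter(tuple(t.split()[:k]) for t in traces)
    prefix, count = prefixes.most_common(1)[0]
    return " ".join(prefix), count / len(traces)

# Four hypothetical traces sampled for the same math prompt
traces = [
    "First factor the equation then solve for x",
    "First factor the equation then check signs",
    "First factor the equation then solve for x",
    "Try plugging in values for x directly",
]
prefix, agreement = most_common_prefix(traces, k=4)
print(prefix, agreement)  # the shared opening and its agreement rate
```

A high-agreement prefix like this would then be a candidate target for the unsupervised prefix fine-tuning the paper proposes.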
Issue Date: 2025-03-15 A Survey on Post-training of Large Language Models, Guiyao Tie+, arXiv25 Comment The diagram of how post-training has evolved over time is very clear (though it seems to lack rigor; it is perhaps best read as a map of the key techniques behind each model's novelty). The scope, layer, and data characteristics covered by the individual techniques do not feel consistent, and each LLM being a single element on the y-axis ... #Survey#Pocket#NLP#LanguageModel#Finetuning (SFT)
Issue Date: 2025-03-04 LLM Post-Training: A Deep Dive into Reasoning Large Language Models, Komal Kumar+, arXiv25 Comment Very clearly written. Original post: https://x.com/gm8xx8/status/189639919559626371 ... #Survey#Pocket#NLP#LanguageModel
Issue Date: 2025-02-26 From System 1 to System 2: A Survey of Reasoning Large Language Models, Zhong-Zhi Li+, arXiv25 Comment Original post: https://x.com/_reachsumit/status/1894282083956396544?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #Tools#NLP#LanguageModel#LLMAgent
Issue Date: 2025-02-20 OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning, Pan Lu+, arXiv25 Comment Original post: https://x.com/lupantech/status/1892260474320015861?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #Pocket#NLP#Dataset#LanguageModel#SyntheticData#Distillation
Issue Date: 2025-02-19 NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions, Weizhe Yuan+, arXiv25 Comment Original post: https://x.com/jaseweston/status/1892041992127021300?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #Pocket#NLP#LanguageModel#Finetuning (SFT)
Issue Date: 2025-02-07 LIMO: Less is More for Reasoning, Yixin Ye+, arXiv25 Comment Original post: https://x.com/arankomatsuzaki/status/1887353699644940456?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #Analysis#Pocket#NLP#LanguageModel#Chain-of-Thought#LongSequence
Issue Date: 2025-02-07 Demystifying Long Chain-of-Thought Reasoning in LLMs, Edward Yeo+, arXiv25 Comment Original post: https://x.com/xiangyue96/status/1887332772198371514?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q The thread under the original post lists eleven findings from the paper, all of them very interesting. As with the DeepSeek-R1 technical paper ... #Pocket#NLP#LanguageModel#Test-time Compute
Issue Date: 2025-01-28 Evolving Deeper LLM Thinking, Kuang-Huei Lee+, arXiv25 #NLP#LanguageModel#RLHF (ReinforcementLearningFromHumanFeedback)#Mathematics
Issue Date: 2025-01-04 DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models, Zhihong Shao+, arXiv24 Comment Original post: https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_the-rlhf-method-behind-the-best-open-models-activity-7280850174522843137-3V9v?utm_source= ... #Pocket#NLP#QuestionAnswering#Zero/FewShotPrompting#Chain-of-Thought#RAG(RetrievalAugmentedGeneration)
Issue Date: 2025-01-03 AutoReason: Automatic Few-Shot Reasoning Decomposition, Arda Sevinc+, arXiv24 Comment Original post: https://x.com/dair_ai/status/1868299926897074309?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #Survey#Pocket#NLP#LanguageModel#Mathematics
Issue Date: 2025-01-03 A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges, Yibo Yan+, arXiv24 #Pocket#NLP#LanguageModel
Issue Date: 2024-12-31 Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search, Huanjin Yao+, arXiv24 #NLP#LanguageModel#SelfTaughtReasoner
Issue Date: 2024-12-16 Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions, Yu Zhao+, arXiv24 Comment Original post: https://x.com/bilzrd/status/1868568258468774048?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q This was the first time I had seen the term Large Reasoning Model (LRM). ... #Survey#NLP#LanguageModel#Evaluation
Issue Date: 2024-11-07 Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey, Philipp Mondorf+, arXiv24 Comment Paper walkthrough (sei_shinagawa): https://www.docswell.com/s/sei_shinagawa/KL1QXL-beyond-accuracy-evaluating-the-behaivior-of-llm-survey ... the model is trained in a supervised fashion to output a special token (THINK) in the answer portion of each subquestion; in the end, the THINK-token portions are ... #Survey#NLP#LanguageModel#Prompting
Issue Date: 2023-07-18 Reasoning with Language Model Prompting: A Survey, ACL23 Summary This paper provides a comprehensive survey of recent research on reasoning and offers resources to support newcomers. It also discusses the factors behind reasoning ability and future research directions. The resources are updated regularly. #Article#MachineLearning#Pocket#LanguageModel#Article
Issue Date: 2025-03-22 Understanding R1-Zero-Like Training: A Critical Perspective, 2025.03 Comment Related work: #1815. Explainer post: https://x.com/wenhuchen/status/1903464313391624668?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q Reading the explainer post, something like the token-level policy update in DAPO, with respect to length ... #Article#NLP#LanguageModel#ProprietaryLLM#SSM (StateSpaceModel)
Issue Date: 2025-03-22 Hunyuan T1, Tencent, 2025.03 Comment Original post: https://x.com/txhunyuan/status/1903121005809373386?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q Image taken from the blog. Compared with DeepSeek-R1, it wins on some tasks and loses on others, so it is hard to draw a clear conclusion. GPT4.5 ... #Article#NLP#Dataset#LanguageModel
Issue Date: 2025-03-21 Sudoku-bench, SakanaAI, 2025.03 Comment Sudoku-Bench features the kind of Sudoku puzzles featured on Cracking the Cryptic (CTC). These Sudoku variants employ unique rulesets to evoke creativ ... #Article#NLP#LanguageModel#OpenWeightLLM
Issue Date: 2025-03-19 Llama Nemotron, Nvidia, 2025.03 Comment Nvidia's first reasoning model. Original post: https://x.com/kuchaev/status/1902078122792775771?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q Benchmarks by Artificial Analysis: https://x ... #Article#NLP#LanguageModel#OpenWeightLLM
Issue Date: 2025-03-18 EXAONE-Deep-32B, LG AI Research, 2025.03 Comment Original post: https://x.com/ai_for_success/status/1901908168805912602?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q EXAONE AI Model License Agreement 1.1 NC: commercial use not permitted ... #Article#NLP#LanguageModel#MultiLingual#OpenWeightLLM
Issue Date: 2025-03-12 Reasoning with Reka Flash, Reka, 2025.03 Comment Weights: https://huggingface.co/RekaAI/reka-flash-3 Apache-2.0. Forcing the model to output < /reasoning > cuts reasoning short, which reportedly enables budget control ... #Article#Tutorial#NLP#LanguageModel#Article#Test-time Compute
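The budget-control trick described above (cutting reasoning short by emitting the closing tag) can be sketched as a simple truncation step. This is an illustrative assumption, not Reka's actual API; the function name and the crude whitespace "token" counting are both made up for the sketch.

```python
def cap_reasoning(reasoning_so_far: str, budget_tokens: int) -> str:
    """Truncate an in-progress reasoning trace to a token budget and
    append the closing tag so the model moves on to its final answer."""
    tokens = reasoning_so_far.split()  # crude whitespace "tokens"
    if len(tokens) <= budget_tokens:
        return reasoning_so_far  # under budget: leave the trace alone
    return " ".join(tokens[:budget_tokens]) + " </reasoning>"

# Example: cap a runaway trace at 5 tokens
print(cap_reasoning("let me reconsider this from the start once more", 5))
```

In practice the capped text (ending in `</reasoning>`) would be fed back to the model as a prefix, forcing it to produce its answer within the remaining budget.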
Issue Date: 2025-03-09 The State of LLM Reasoning Models, Sebastian Raschka, 2025.03 #Article#NLP#LanguageModel#ReinforcementLearning#OpenWeightLLM
Issue Date: 2025-03-06 QwQ-32B: Embracing the Power of Reinforcement Learning, Qwen Team, 2025.03 Comment Original post: https://x.com/hillbig/status/1897426898642460724?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q #1787 Benchmark scores by Artificial Analysis: https://x.com/artificialanlys/ ... #Article#MachineLearning#NLP#LanguageModel#Library#ReinforcementLearning#python
Issue Date: 2025-03-02 Open Reasoner Zero, Open-Reasoner-Zero, 2025.02 Comment Original post: https://x.com/dair_ai/status/1893698293965725708?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q We introduce Open-Reasoner-Zero, the first open source implementati ... #Article#NLP#LanguageModel#OpenWeightLLM
Issue Date: 2025-02-17 Mistral-24B-Reasoning, yentinglin, 2025.02 Comment Apache-2.0 ... #Article#NLP#LanguageModel#Finetuning (SFT)
Issue Date: 2025-02-07 Training Your Own R1 Reasoning Model with Unsloth, npaka, 2025.02 Comment Very practical and informative; in particular, it explicitly states roughly how much VRAM is recommended for models of what scale, which is helpful. ... #Article#Tutorial#NLP#LanguageModel#Alignment#Finetuning (SFT)#Chain-of-Thought#Mathematics
Issue Date: 2024-12-27 A Genealogy of Methods for Aligning LLMs to Math Tasks: From GPT-3 to Qwen2.5, bilzard, 2024.12 Comment In #1618, as a training approach for math where scaling model parameters yields performance gains, the idea is to train a verifier separately from the model so that a good candidate can be selected among the model's outputs; at first I did not quite grasp the motivation, but the latter half's explanation of why sample & select, in the article ... #Article#Pocket#LanguageModel#Article#SelfCorrection
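The sample-and-select idea mentioned above (best-of-N with a learned verifier) can be sketched in a few lines. The scoring heuristic below merely stands in for a learned verifier model and is purely illustrative; the function names and example answers are assumptions.

```python
def verifier_score(candidate: str) -> float:
    # Stand-in for a learned verifier; this toy heuristic prefers
    # answers that show their work (purely for illustration).
    return len(candidate) * (2.0 if "therefore" in candidate else 1.0)

def best_of_n(candidates: list[str]) -> str:
    """Sample-and-select: generate N candidates, keep the one the
    verifier ranks highest."""
    return max(candidates, key=verifier_score)

answers = ["x = 3", "we factor the quadratic, therefore x = 3", "x = -3"]
print(best_of_n(answers))
```

The appeal noted in the article is that the generator itself need not improve: extra parameters (and extra samples) go into the verifier and the selection step instead.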
Issue Date: 2024-12-22 Reproducing OpenAI o1 (How to Build a Reasoning Model), はち, 2024.12 Comment The prompt for encouraging reflection after thinking is interesting ...