Reproducibility
#Pocket#NLP#Dataset#LanguageModel#LLMAgent#Evaluation
Issue Date: 2025-06-30 The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements, Bingchen Zhao+, arXiv25 Comment: Original post: https://x.com/karpathy/status/1939709449956126910?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ...
#Pocket#NLP#LanguageModel#Reasoning
Issue Date: 2025-06-13 Give Me FP32 or Give Me Death? Challenges and Solutions for Reproducible Reasoning, Jiayi Yuan+, arXiv25
#RecommenderSystems#Pocket#read-later
Issue Date: 2025-05-16 A Troubling Analysis of Reproducibility and Progress in Recommender Systems Research, Maurizio Ferrari Dacrema+, TOIS21
#RecommenderSystems#NeuralNetwork#CollaborativeFiltering#Pocket#MatrixFactorization#RecSys#read-later
Issue Date: 2025-05-16 Neural Collaborative Filtering vs. Matrix Factorization Revisited, Steffen Rendle+, RecSys20
#RecommenderSystems#RecSys#read-later
Issue Date: 2025-05-14 Are We Evaluating Rigorously? Benchmarking Recommendation for Reproducible Evaluation and Fair Comparison, Zhu Sun+, RecSys20 Comment: Japanese-language explanation: https://qiita.com/smochi/items/c4cecc48e4aba0071ead ...
#RecommenderSystems#Pocket#read-later
Issue Date: 2025-05-14 On the Difficulty of Evaluating Baselines: A Study on Recommender Systems, Steffen Rendle+, arXiv19