List of notes on ICML papers and technical articles

ICML

#Pocket #NLP #LanguageModel #MoE(Mixture-of-Experts) #Scaling Laws
Issue Date: 2025-06-21 Scaling Laws for Upcycling Mixture-of-Experts Language Models, Seng Pei Liew+, ICML25 Comment: Original post: https://x.com/sbintuitions/status/1935970879923540248?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q OpenReview: https://openreview.net/forum?id=ZBBo19jldX Related: #1546 ...

#Pocket #NLP #LanguageModel #Hallucination
Issue Date: 2025-06-14 Steer LLM Latents for Hallucination Detection, Seongheon Park+, ICML25 Comment: Original post: https://x.com/sharonyixuanli/status/1933522788645810493?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ...

#EfficiencyImprovement #Pocket #NLP #LanguageModel #PEFT(Adaptor/LoRA)
Issue Date: 2025-06-12 Text-to-LoRA: Instant Transformer Adaption, Rujikorn Charakorn+, ICML25 Comment: Original post: https://x.com/roberttlange/status/1933074366603919638?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q Ah, I see, so there is a trick like this...! ...

#MachineLearning #Pocket #NLP #LanguageModel #KnowledgeEditing
Issue Date: 2025-06-10 Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing, Kento Nishi+, ICML25 Comment: Original post: https://x.com/kento_nishi/status/1932072335726539063?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ...

#Pocket #NLP #DataGeneration #DataDistillation #SyntheticData
Issue Date: 2025-05-07 R.I.P.: Better Models by Survival of the Fittest Prompts, Ping Yu+, ICML25 Comment: Original post: https://x.com/jaseweston/status/1885160135053459934?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q The authors explain the paper in the thread. ...

#NLP #LanguageModel #Reasoning #PostTraining
Issue Date: 2025-05-07 Thinking LLMs: General Instruction Following with Thought Generation, Tianhao Wu+, ICML25 Comment: Original post: https://x.com/tesatory/status/1919461701206081813?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q This appears to be about improving an LLM's reasoning capability without using external CoT data. The arrival of DeepSeek-R1 ...

#ComputerVision #Embeddings #Analysis #Pocket #NLP #LanguageModel #Supervised-FineTuning (SFT) #Chain-of-Thought #SSM (StateSpaceModel) #PostTraining #read-later
Issue Date: 2025-05-04 Layer by Layer: Uncovering Hidden Representations in Language Models, Oscar Skean+, ICML25 Comment: For the representative architectures of modern language models (decoder-only model, encoder-only model, SSM), comparing final-layer embeddings with intermediate-layer embeddings on downstream tasks (the average over 32 MTEB tasks), the intermediate-layer embeddings consistently (though this is MTE ...

#Pocket #NLP #LanguageModel #Alignment #Supervised-FineTuning (SFT)
Issue Date: 2024-11-07 Self-Consistency Preference Optimization, Archiki Prasad+, ICML25 Comment: Original post: https://x.com/jaseweston/status/1854532624116547710?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q As in Self-Consistency, the model is made to produce multiple outputs, and the most frequent answer and a low-frequency answer are used as the two sides of the DPO pair data ...

#Analysis #Pocket #NLP #LanguageModel #Alignment #ReinforcementLearning #PPO (ProximalPolicyOptimization) #DPO #On-Policy
Issue Date: 2025-06-25 Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data, Fahim Tajwar+, ICML24

#Pocket #NLP #Dataset #LanguageModel #Alignment #InstructionTuning #PostTraining
Issue Date: 2025-05-11 UltraFeedback: Boosting Language Models with Scaled AI Feedback, Ganqu Cui+, ICML24

#Pocket #NLP #LanguageModel #SSM (StateSpaceModel)
Issue Date: 2025-03-24 Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality, Tri Dao+, ICML24 Comment: Read this when you want the details of Mamba2 ...

#MachineLearning #Pocket #NLP #LanguageModel #Alignment #PostTraining
Issue Date: 2024-10-27 KTO: Model Alignment as Prospect Theoretic Optimization, Kawin Ethayarajh+, N_A, ICML24 Comment: The Kahneman-Tversky Optimization (KTO) paper, which aligns LLMs using binary feedback data ...

#Analysis #MachineLearning #Pocket #NLP #SSM (StateSpaceModel)
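Issue Date: 2024-08-27 The Illusion of State in State-Space Models, William Merrill+, N_A, ICML24 Summary: SSMs (state-space models) were expected to have greater expressive power for state tracking than transformers, but in practice their expressive power is limited and similar to that of transformers. SSMs cannot express computation outside the complexity class $\mathsf{TC}^0$ and cannot solve simple state-tracking problems. As a result, SSMs may be limited in their ability to solve real-world state-tracking problems. Comment: > But do SSMs truly have an advantage (over transformers) in expressive power for state tracking? Surprisingly, the answer is no. Our analysis reveals that the expressive power of SSMs is limited very similarly to transformers: for SSMs, computation outside the complexity class $\mathsf{TC}^0$ ...

#Pocket #NLP #LanguageModel #Alignment #InstructionTuning #LLM-as-a-Judge #SelfImprovement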
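Issue Date: 2024-01-22 Self-Rewarding Language Models, Weizhe Yuan+, N_A, ICML24 Summary: Training future models will require superhuman feedback, and this work studies Self-Rewarding Language Models, which provide their own rewards. Using LLM-as-a-Judge prompting, the language model itself supplies its rewards, and the paper shows that its ability to give high-quality rewards improves. Fine-tuning Llama 2 70B for three iterations yields a model that outperforms existing systems. The study points to the possibility of models that can keep improving. Comment: A method for improving LLM alignment without human intervention (without human-annotated preference data). Using LLM-as-a-Judge prompting, the LLM itself plays the roles of both the policy model and the reward model. Unlabeled prompts ...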
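
The dual policy/reward role described in the Self-Rewarding Language Models entry above can be sketched as a data-construction step. This is a minimal sketch under assumptions, not the authors' implementation: `generate` and `judge` are hypothetical callables standing in for the same LLM used as policy and as LLM-as-a-Judge, `num_candidates` is an illustrative parameter, and pairing the highest-scored response against the lowest-scored one is an assumed selection rule. The resulting pairs would then feed a preference-optimization step, which is not shown.

```python
from typing import Callable, List, Tuple

# (prompt, chosen response, rejected response)
PreferencePair = Tuple[str, str, str]


def build_preference_pairs(
    prompts: List[str],
    generate: Callable[[str, int], str],  # hypothetical: policy role of the LLM
    judge: Callable[[str, str], float],   # hypothetical: LLM-as-a-Judge role
    num_candidates: int = 4,              # illustrative value, must be >= 1
) -> List[PreferencePair]:
    """Build preference pairs from unlabeled prompts using self-assigned scores."""
    pairs: List[PreferencePair] = []
    for prompt in prompts:
        # Policy role: sample several candidate responses for the same prompt.
        candidates = [generate(prompt, i) for i in range(num_candidates)]
        # Reward role: the same model scores each of its own candidates.
        scores = [judge(prompt, c) for c in candidates]
        best = max(range(num_candidates), key=lambda i: scores[i])
        worst = min(range(num_candidates), key=lambda i: scores[i])
        # Assumed rule: skip prompts where the scores carry no preference signal.
        if scores[best] == scores[worst]:
            continue
        pairs.append((prompt, candidates[best], candidates[worst]))
    return pairs
```

With toy stand-ins such as `generate = lambda p, i: f"{p}-draft-{i}"` and a `judge` that reads the trailing digit as the score, each prompt yields one pair whose chosen response is the highest-scored draft.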

#Pocket
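Issue Date: 2023-05-22 Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling, Weijia Xu+, N_A, ICML24 Summary: This work introduces Reprompting, an iterative sampling algorithm that solves a given task by searching for Chain-of-Thought (CoT) recipes. Reprompting infers CoT recipes that consistently perform well by iteratively sampling new recipes, using previously sampled solutions as parent prompts. On five Big-Bench Hard tasks that require multi-step reasoning, Reprompting consistently outperforms zero-shot, few-shot, and human-written CoT baselines. Reprompting can also facilitate the transfer of knowledge from a stronger model to a weaker one, substantially improving the weaker model's performance. Overall, Reprompting yields up to +17 points of improvement over the previous state-of-the-art method that uses human-written CoT prompts. Comment: Hmm, there is no comparison with IterCoT or AutoPrompting, so it is hard to say... The survey seems insufficient. Also, I wish they would stop using ChatGPT. ...

#Pocket #NLP #LanguageModel #Poisoning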
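Issue Date: 2023-05-04 Poisoning Language Models During Instruction Tuning, Alexander Wan+, N_A, ICML23 Summary: Instruction-tuned LMs (such as ChatGPT, FLAN, and InstructGPT) are finetuned on datasets that contain user-submitted examples. This work shows that an adversary can manipulate the LM's predictions by contributing poisoned examples. To construct the poisoned examples, the inputs and outputs are optimized using a bag-of-words approximation of the LM. Larger LMs are more vulnerable to poisoning attacks, and defenses based on data filtering or reducing model capacity provide only moderate protection while lowering test accuracy.

#ComputerVision #NLP #MulltiModal #ContrastiveLearning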
Issue Date: 2023-04-27 Learning Transferable Visual Models From Natural Language Supervision, Radford+, OpenAI, ICML21 Comment: The CLIP paper. A model trained with contrastive learning on a large number of pairs of images and their corresponding text, so that it can measure the similarity between an image and a text ![image](https://user-images.githubusercontent.com/12249301/234729329-dfa5dc1e ...

#DocumentSummarization #NeuralNetwork #NLP #Admin'sPick
Issue Date: 2025-05-13 PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization, Jingqing Zhang+, ICML20 Comment: PEGASUS was also missing, so I added it. Along with BART, it still appears to be used in research as a backbone for document summarization. Related: #984 ...

#NeuralNetwork #ComputerVision #EfficiencyImprovement #Pocket #Scaling Laws #Admin'sPick
Issue Date: 2025-05-12 EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, Mingxing Tan+, ICML19 Comment: I had not made a note of the original paper, so I added it. See also #346. ...

#NeuralNetwork #NaturalLanguageGeneration #Controllable #NLP #DataToTextGeneration #ConceptToTextGeneration
Issue Date: 2017-12-31 Toward Controlled Generation of Text, Hu+, ICML17 Comment: Text generation today basically just produces text according to the likelihood of a trained language model, and the output text cannot be controlled; this paper makes such control possible. The model resembles VAE-based text generation combined with a GAN. The source for decoding ...

#NeuralNetwork #Tutorial #MachineLearning
Issue Date: 2018-02-22 Tutorial: Deep Reinforcement Learning, David Silver, ICML16

#MachineLearning #Pocket #LanguageModel #Transformer #Normalization #Admin'sPick
Issue Date: 2025-04-02 Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Sergey Ioffe+, ICML15 Comment: I had not made a note of this, so I added it belatedly. For an explanation of covariate shift and Batch Normalization, the slides listed in #261 are easy to follow. ...

#NeuralNetwork #MachineLearning #Admin'sPick
Issue Date: 2018-02-19 An Empirical Exploration of Recurrent Network Architectures, Jozefowicz+, ICML15 Comment: Ideal for understanding the differences between GRU and LSTM ...

#InformationRetrieval #LearningToRank #Online/Interactive
Issue Date: 2018-01-01 Interactively Optimizing Information Retrieval Systems as a Dueling Bandits Problem, Yue+, ICML09 Comment: A paper that is frequently referenced in work on online learning to rank. The proposed method is called Dueling Bandit Gradient Descent (DBGD). It is a method that can perform learning to rank online; the current weights w, and w moved in a random direction ...

#Pocket #NLP #MultitaskLearning #Admin'sPick
Issue Date: 2018-02-05 A unified architecture for natural language processing: Deep neural networks with multitask learning, Collobert+, ICML2008. Comment: A paper that solves NLP tasks (POS tagging, Semantic Role Labeling, Chunking, etc.) through multitask learning with a deep neural network. It has more than 2,000 citations. The multitask learning training process ...

#InformationRetrieval #LearningToRank #ListWise #Pocket
Issue Date: 2018-01-01 Listwise Approach to Learning to Rank - Theory and Algorithm (ListMLE), Xia+, ICML2008

#NaturalLanguageGeneration #SingleFramework #NLP #DataToTextGeneration
Issue Date: 2017-12-31 Learning to sportscast: a test of grounded language acquisition, Chen+, ICML08

#InformationRetrieval #LearningToRank #ListWise #Admin'sPick
Issue Date: 2018-01-01 Learning to Rank: From Pairwise Approach to Listwise Approach (ListNet), Cao+, ICML2007 Comment: Explanatory slides: http://www.nactem.ac.uk/tsujii/T-FaNT2/T-FaNT.files/Slides/liu.pdf Explanatory blog post: https://qiita.com/koreyou/items/a69750696fd0b9d88608 The conventionally used Learning t ...

#InformationRetrieval #LearningToRank #PairWise #Admin'sPick
Issue Date: 2018-01-01 Learning to Rank using Gradient Descent (RankNet), Burges+, ICML2005 Comment: The RankNet paper, a representative pair-wise learning2rank method. Explanatory blog post: https://qiita.com/sz_dr/items/0e50120318527a928407 For the loss, given a pair of two instances, A and B, when A is ranked higher than B, the prob ...
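
Because the note on the RankNet loss above is cut off, the following is a minimal sketch of the commonly cited pairwise formulation, not a transcription of the note. It assumes the model outputs real-valued scores for A and B, that the probability of A ranking above B is modeled as the sigmoid of the score difference, and that the loss is the cross-entropy against a target probability. The function name and the `target_prob` parameter are illustrative.

```python
import math


def ranknet_pair_loss(score_a: float, score_b: float, target_prob: float = 1.0) -> float:
    """Cross-entropy between the target P(A ranked above B) and sigmoid(score_a - score_b)."""
    diff = score_a - score_b
    # log(sigmoid(diff)), computed in a numerically stable way for either sign of diff.
    if diff >= 0:
        log_p = -math.log1p(math.exp(-diff))
    else:
        log_p = diff - math.log1p(math.exp(diff))
    # log(1 - sigmoid(diff)) equals log(sigmoid(-diff)) = log(sigmoid(diff)) - diff.
    log_not_p = log_p - diff
    return -(target_prob * log_p + (1.0 - target_prob) * log_not_p)
```

When A should rank above B (`target_prob=1.0`), the loss shrinks as `score_a` grows relative to `score_b`, and it grows roughly linearly in the score gap when the order is wrong. Equal scores give a loss of log 2.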