Diversity
#Pocket#NLP#LanguageModel#Alignment#ICLR#DPO#PostTraining
Issue Date: 2025-02-01 Diverse Preference Optimization, Jack Lanchantin+, ICLR25 Comment元ポスト:https://x.com/jaseweston/status/1885399530419450257?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-QOpenReview: https://openreview.net/forum?id=pOq9vDIYevDPOと同じ最適化方 ...
Issue Date: 2025-02-01 Diverse Preference Optimization, Jack Lanchantin+, ICLR25 Comment元ポスト:https://x.com/jaseweston/status/1885399530419450257?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-QOpenReview: https://openreview.net/forum?id=pOq9vDIYevDPOと同じ最適化方 ...