List of notes on papers and technical articles about NeuralNetwork

NeuralNetwork

#Multi #RecommenderSystems #Survey #Pocket #MultitaskLearning #MultiModal
Issue Date: 2025-03-03 Joint Modeling in Recommendations: A Survey, Xiangyu Zhao+, arXiv25 Comment: Original post: https://x.com/_reachsumit/status/1896408792952410496?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q ... #NaturalLanguageGeneration #NLP #Dataset #LanguageModel #Evaluation #LLM-as-a-Judge
Issue Date: 2024-12-15 Striking Gold in Advertising: Standardization and Exploration of Ad Text Generation, Masato Mita+, ACL24 Comment: Ad Text Generation has so far been evaluated only on each group's proprietary data, and the task setting itself has not been sufficiently specified, so this work apparently puts those pieces in order. In particular, it builds CAMERA, the first open dataset for ad text generation. The dataset Tab ... #RecommenderSystems #CTRPrediction #ContrastiveLearning
Issue Date: 2024-11-19 Collaborative Contrastive Network for Click-Through Rate Prediction, Chen Gao+, arXiv24 Comment: Reference: [Generated definition of Mini-app (beware of hallucination)](https://www.perplexity.ai/search/what-is-the-definition-of-the-sW4uZPZIQe6Iq53HbwuG7Q) Figure in the paper: in the Mini-app, the trigger and ## ...

#NLP #LanguageModel #Chain-of-Thought #ACL
Issue Date: 2023-04-27 Active prompting with chain-of-thought for large language models, Diao+, The Hong Kong University of Science and Technology, ACL24 Comment: I have not read this carefully, but the motivation seems to be: given training data that has no CoT answers, make few-shot prediction on the test data possible by annotating only n samples with a CoT and an answer. To that end, for each question, training ... #NLP #Chain-of-Thought #Prompting #AutomaticPromptEngineering #NAACL
Issue Date: 2023-04-25 Enhancing LLM Chain-of-Thought with Iterative Bootstrapping, Sun+, Xiamen University （w_ MSRA et al.）, NAACL24 Comment: Starting from zero-shot CoT, it runs a loop that keeps revising the prompt to improve the reasoning until the problem is answered correctly. The results of the loop are finally summarized and pooled. For the test set, N shots are sampled from the pool and used for inference. ![im ... #Survey #GraphBased #NLP
Issue Date: 2023-04-25 Graph Neural Networks for Text Classification: A Survey, Wang+, Artificial Intelligence Review24 Comment: Text Classification is the most essential and fundamental problem in Natural Language Processing. While numerous recent text classification models ap ... #MachineLearning #Pocket #Grokking
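Issue Date: 2023-09-30 Explaining grokking through circuit efficiency, Vikrant Varma+, N_A, arXiv23 Summary: Grokking is the phenomenon in which a network with perfect training accuracy but poor generalization transitions, with further training, to good generalization. It is thought to occur when the task admits both a generalizing solution and a memorizing solution. The generalizing solution is slower to learn but more efficient, producing larger logits with the same parameter norm. The hypothesis is that memorizing circuits become less efficient as the training dataset grows, while generalizing circuits do not. This suggests there is a critical dataset size at which memorization and generalization are equally efficient. Four new predictions about grokking are made and confirmed, providing significant evidence for the explanation. Two new phenomena beyond grokking are also shown: ungrokking and semi-grokking. Ungrokking is a regression from perfect test accuracy back to low test accuracy, and semi-grokking is delayed generalization to partial rather than perfect test accuracy. Comment: A study presenting a theory of when and why grokking occurs. The network first learns memorization, then at some point switches to Gen, the generalizing circuit. The stated reason for the switch is that Gen attains a lower loss than memorization. This ... #NLP #LanguageModel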
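Issue Date: 2023-06-16 RWKV: Reinventing RNNs for the Transformer Era, Bo Peng+, N_A, arXiv23 Summary: This work proposes RWKV, a new model architecture that combines the advantages of both Transformers and RNNs: computation can be parallelized during training, while compute and memory complexity stay constant during inference. RWKV performs on par with Transformers of the same size, suggesting that the architecture can be used to build more efficient models in the future. Comment: Comparison of compute and memory consumption between different Transformers and RWKV. RWKV is basically built by stacking residual blocks. One residual block consists of time-mixing (mixing along the time direction) and channel-mixing (mixing across elements ...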
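
The summary's point that RWKV can be computed in parallel during training yet with constant compute and memory per step during inference can be illustrated with a small sketch. It assumes the single-channel WKV weighted average as I understand it from the RWKV paper, with a decay `w` and a current-token bonus `u`; the function names `wkv_direct` and `wkv_recurrent` and the toy inputs are hypothetical, and the code skips the numerical-stability tricks a real implementation would need.

```python
import math

def wkv_direct(k, v, w, u):
    """History-based form: each output is a weighted average over all past tokens.
    Every t can be computed independently, which is what allows parallel training."""
    out = []
    for t in range(len(k)):
        bonus = math.exp(u + k[t])          # extra weight for the current token
        num, den = bonus * v[t], bonus
        for i in range(t):
            weight = math.exp(-(t - 1 - i) * w + k[i])  # older tokens decay by w per step
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out

def wkv_recurrent(k, v, w, u):
    """Recurrent form: the whole history is summarized by two running sums (a, b),
    so each inference step costs O(1) time and memory."""
    a, b = 0.0, 0.0
    out = []
    for kt, vt in zip(k, v):
        bonus = math.exp(u + kt)
        out.append((a + bonus * vt) / (b + bonus))
        a = math.exp(-w) * a + math.exp(kt) * vt  # decay old state, add current token
        b = math.exp(-w) * b + math.exp(kt)
    return out
```

Both functions should agree up to floating-point error on the same inputs; the recurrent one only ever keeps `a` and `b`, regardless of sequence length.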

#MachineLearning #LanguageModel #NeuralArchitectureSearch
Issue Date: 2023-04-27 Can GPT-4 Perform Neural Architecture Search? Zhang+, The University of Sydney, arXiv23 Comment: A study that asks GPT to propose neural model architectures using prompts that require no domain knowledge. It apparently repeats a loop in which accuracy is given as feedback and a better structure is proposed. ![image](https://user-images.githubusercontent.com/1224Ne ... #NLP #LanguageModel #Chain-of-Thought
Issue Date: 2023-04-27 Self-consistency improves chain of thought reasoning in language models, Wang+, Google Research, ICLR23 Comment: Proposes a new CoT decoding method called self-consistency. It is based on the intuition that tasks requiring difficult reasoning admit multiple reasoning paths. Self-consistency first performs ordinary CoT. Then greSel ... #NLP #LanguageModel #Chain-of-Thought #ICLR
Issue Date: 2023-04-27 Automatic Chain of Thought Prompting in Large Language Models, Zhang+, Shanghai Jiao Tong University, ICLR23 Comment: Reportedly shows that reasoning chains produced by an LLM are better than human-written ones (from #532). By using a clustering-based method, it shows that erroneous examples tend to be grouped into a single cluster, which reduces the number of excessive wrong demonstrations. The method ... #NLP #LanguageModel #Chain-of-Thought
Issue Date: 2023-04-27 Automatic prompt augmentation and selection with chain-of-thought from labeled data, Shum+, The Hong Kong University of Science and Technology, arXiv23 Comment: Reportedly shows that reasoning chains produced by an LLM are better than human-written ones (from #532). In the selection phase, wrong examples are removed directly. A demonstration selection model is then trained with reinforcement learning ... #NLP #LanguageModel #Transformer
Issue Date: 2023-04-25 Scaling Transformer to 1M tokens and beyond with RMT, Bulatov+, DeepPavlov, arXiv23 Comment: Uses the Recurrent Memory Transformer #523 to handle 2M tokens. Harry Potter is said to be about 1.5M tokens, so writing an entire novel may soon be within reach. ... #Survey #EfficiencyImprovement #NLP #TACL
Issue Date: 2023-04-25 Efficient Methods for Natural Language Processing: A Survey, Treviso+, TACL23 Comment: Summarizes methods for doing things "efficiently" rather than brute-forcing with parameter count ![image](https://user-images.githubusercontent.com/12249301/234287218-2d42766f-5c5c-4cf9-859e-c2b0a5dR ... #ComputerVision #Pocket #SIGGRAPH
Issue Date: 2022-12-01 Sketch-Guided Text-to-Image Diffusion Models, Andrey+, Google Research, SIGGRAPH23 Comment: A technique that takes a sketch and a prompt as input and generates sketch-biased images. Impressive. ![image](https://user-images.githubusercontent.com/12249301/205189823-66052368-60a8-4f03-a4b6-37T ... #DocumentSummarization #NLP #Abstractive #EACL
Issue Date: 2022-09-02 Long Document Summarization with Top-down and Bottom-up Inference, Pang+, Salesforce Research, EACL23 Comment: Japanese explainer: https://zenn.dev/ty_nlp/articles/9f5e5dd3084dbd Below I summarize what I understood from reading the Japanese explainer above. Thank you. # Overview Basically, Transformer-based models (e.g. BERTSum, BART,>The ... #Survey #MachineLearning #Pocket
Issue Date: 2021-06-19 Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better, Menghani, ACM Computing Surveys23 Comment: Apparently collects techniques for training efficiency, speed-ups, and so on. Deep Learning has revolutionized the fields of computer vision, natural language understanding, speech recognition, inform ... #RecommenderSystems #Pocket #CTRPrediction
Issue Date: 2024-11-19 Deep Intention-Aware Network for Click-Through Rate Prediction, Yaxian Xia+, arXiv22 Comment: A baseline used in the experiments of #1531 ... #RecommenderSystems #Pocket #CTRPrediction
Issue Date: 2024-11-19 Deep Interest Highlight Network for Click-Through Rate Prediction in Trigger-Induced Recommendation, Qijie Shen+, WWW22 Comment: A baseline used in the experiments of #1531 ... #DocumentSummarization #Analysis #Pocket #NLP #IJCNLP #AACL #Repetition
Issue Date: 2023-08-13 Self-Repetition in Abstractive Neural Summarizers, Nikita Salkar+, N_A, AACL-IJCNLP22 Summary: We analyze self-repetition in the outputs of three neural models, BART, T5, and Pegasus, fine-tuned on different datasets. Regression analysis shows that these models differ in their tendency to repeat content across the output summaries of different inputs. Fine-tuning on more abstractive data, or on data featuring formulaic language, tends to raise the rate of self-repetition. Qualitative analysis finds that the systems produce artefacts and formulaic phrases. These results may help in developing methods for optimizing summarizer training data. #MachineLearning #Transformer #TabularData
Issue Date: 2023-04-28 Why do tree-based models still outperform deep learning on typical tabular data?, Grinsztajn+, Soda, Inria Saclay , arXiv22 Comment: Confirms that tree-based models outperform neural models on tabular data and offers several explanations for why this happens. ![image](https://user-images.githubusercontent.com/12249301/235 ... #NLP #LanguageModel #Chain-of-Thought #Prompting
Issue Date: 2023-04-27 Large Language Models are Zero-Shot Reasoners, Kojima+, University of Tokyo, NeurIPS22 Comment: The Zero-Shot CoT (Let's think step-by-step.) paper <img width="856" alt="image" src="https://user-images.githubusercontent.com/12249301/234746367-2cd80e23-8dc ... #NLP #Zero/FewShotPrompting #Chain-of-Thought #Prompting #NeurIPS
Issue Date: 2023-04-27 Chain of thought prompting elicits reasoning in large language models, Wei+, Google Research, NeurIPS22 Comment: The paper that proposed Chain-of-Thought. Worth keeping in mind that CoT is not very effective for models with fewer than 100B parameters. ![image](https://user-images.githubusercontent.com/12249301/234739 ... #Pocket #NLP #LanguageModel
Issue Date: 2022-12-05 UNIFIEDSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models, Xie+, EMNLP22 #AdaptiveLearning #EducationalDataMining #KnowledgeTracing
Issue Date: 2022-08-26 Using Neural Network-Based Knowledge Tracing for a Learning System with Unreliable Skill Tags, Karumbaiah+, （w_ Ryan Baker）, EDM22 Comment: A very important paper that should be read carefully. # In a nutshell In systems that were not designed with KT in mind from the start, skills have to be mapped to problems after the fact, which is often very difficult. In the data used in the paper, from a commercial US blended-learning system for mathematics, partway ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing
Issue Date: 2022-04-28 Empirical Evaluation of Deep Learning Models for Knowledge Tracing: Of Hyperparameters and Metrics on Performance and Replicability, Sami+, Aalto University, JEDM22 Comment: The explanation of DKT is excellent, properly covering hard-to-understand points that the original paper does not spell out. (The input is a tuple of (skill tag, correctness), the output is a vector y whose dimensionality is the number of skill tags, each dimension represents mastery of the corresponding skill, and as for model training, next attempt ... #MachineTranslation #Embeddings #Pocket #NLP #AAAI
Issue Date: 2021-06-07 Improving Neural Machine Translation with Compact Word Embedding Tables, Kumar+, AAAI22 Comment: Apparently investigates how word embeddings affect NMT ... #CollaborativeFiltering #Pocket #Evaluation #RecSys
Issue Date: 2025-04-15 Revisiting the Performance of iALS on Item Recommendation Benchmarks, Steffen Rendle+, arXiv21 #MachineLearning #Grokking #ICLR
Issue Date: 2023-04-25 GROKKING: GENERALIZATION BEYOND OVERFITTING ON SMALL ALGORITHMIC DATASETS, Power+, ICLR21 Workshop Comment: Reports the phenomenon (Grokking) in which the model memorizes the training data soon after training starts and seems to have lost its generalization ability, then suddenly generalizes after 10^3 steps ![image](https://user-images.githubusercontent.com/12249301/234430324-a23 ... #ComputerVision #NaturalLanguageGeneration #NLP
Issue Date: 2022-09-15 Generating Racing Game Commentary from Vision, Language, and Structured Data, Tatsuya+, INLG21 Comment: Dataset: https://kirt.airc.aist.go.jp/corpus/ja/RacingCommentary ... #Pocket #EducationalDataMining #KnowledgeTracing
Issue Date: 2022-08-31 Behavioral Testing of Deep Neural Network Knowledge Tracing Models, Kim+, Riiid, EDM21 #NaturalLanguageGeneration #NLP #Dataset #DataToTextGeneration
Issue Date: 2022-08-18 Biomedical Data-to-Text Generation via Fine-Tuning Transformers, Ruslan+, INLG21 Comment: Provides a new data2text dataset for the biomedical domain. Shows that text can be generated with high accuracy by fine-tuning pretrained BART, T5, etc. ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing
Issue Date: 2022-05-02 Learning Process-consistent Knowledge Tracing, Shen+, SIGKDD21 Comment: A study that starts from the observation that DKT lowers the proficiency of the corresponding concept when a student answers incorrectly, even though some learning gain should be obtained even from a wrong answer, which seems odd. Rather than student performance prediction accuracy, Knowle# ... #Pocket #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing
Issue Date: 2022-04-28 BEKT: Deep Knowledge Tracing with Bidirectional Encoder Representations from Transformers, Tian+ （緒方先生）, Kyoto University, ICCE21 Comment: A study applying BERT to KT. Since #453 and others show little difference among deep-learning-based models, I am curious how strong this work really is. ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing #AAAI
Issue Date: 2022-04-28 Do we need to go Deep? Knowledge Tracing with Big Data, Varun+, University of Maryland Baltimore County, AAAI21 Workshop on AI Education Comment: With small data, SAKT and DKT are comparable, but as the data grows SAKT outperforms DKT. ![image](https://user-images.githubusercontent.com/12249301/165698674-279a7e0c-6429-48db-8cIn ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics
Issue Date: 2022-04-28 An Empirical Comparison of Deep Learning Models for Knowledge Tracing on Large-Scale Dataset, Pandey+, AAAI workshop on AI in Education21 Comment: A paper comparing the performance of DKT, DKVMN, SAKT, and RKT on EdNet data ![image](https://user-images.githubusercontent.com/12249301/165658767-24fda9a1-3ff1-47d1-b328-91fa18aec8 ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing
Issue Date: 2022-04-27 A Survey of Knowledge Tracing, Liu+, IEEE Transactions on Learning Technologies, arXiv21 Comment: Covers not only the classical BKT and PFA but also deep models such as DKT, DKVMN, EKT, and AKT. ![image](https://user-images.githubusercontent.com/12249301/165438026-70f407c9-8eb2-43c3 ... #ComputerVision #NeurIPS
Issue Date: 2021-11-04 ResNet strikes back: An improved training procedure in timm, Wightman+, NeurIPS21 Workshop ImageNet PPF Comment: Since 2015, many optimization algorithms, regularization methods, and data augmentations have been proposed. They are applied to models with the latest architectures but not to the ResNet baseline, for which only the numbers from the original paper are cited. Because this is unfair, this study pursues a training procedure that improves ResNet's performance ... #AdaptiveLearning #EducationalDataMining #StudentPerformancePrediction #LAK
Issue Date: 2021-10-28 SAINT+: Integrating Temporal Features for EdNet Correctness Prediction, Shin+, RiiiD AI Research, LAK21 Comment: The first study to use a Transformer for Student Performance Prediction ![image](https://user-images.githubusercontent.com/12249301/139178783-ae4d4e2d-9fc5-44f5-9769- ... #NaturalLanguageGeneration #Pocket #NLP #DataToTextGeneration
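Issue Date: 2021-10-08 Automatic Generation of Sports Digests Incorporating Content Selection of Past Information (過去情報の内容選択を取り入れたスポーツダイジェストの自動生成), Kato+, Tokyo Institute of Technology, NLP21 #DocumentSummarization #NaturalLanguageGeneration #NLP #LanguageModel #PEFT(Adaptor/LoRA) #ACL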
Issue Date: 2021-09-09 Prefix-Tuning: Optimizing Continuous Prompts for Generation, Lisa+ （Percy Liang）, Stanford University, ACL21 Comment: When fine-tuning a language model, proposes giving a "prefix" as a latent representation at encoding time and fine-tuning only the prefix part (all other parameters are frozen), so that fine-tuning is achieved with far fewer parameters. This way of providing the prefix as a latent representation, from GPT-3 prompting ... #RecommenderSystems #CollaborativeFiltering #Pocket #MatrixFactorization #RecSys #read-later #Reproducibility
Issue Date: 2025-05-16 Neural Collaborative Filtering vs. Matrix Factorization Revisited, Steffen Rendle+, RecSys20 #DocumentSummarization #NLP #ICML #Admin'sPick
Issue Date: 2025-05-13 PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization, Jingqing Zhang+, ICML20 Comment: PEGASUS was also missing, so I added it. Along with BART, it apparently is still used in research as a backbone for document summarization. Related: #984 ... #Pretraining #Pocket #NLP #TransferLearning #PostTraining #Admin'sPick
Issue Date: 2025-05-12 Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, Colin Raffel+, JMLR20 Comment: I had not noted T5 either, so I am adding it belatedly. It treats every NLP task as converting a text sequence into a text sequence, pretrains an Encoder-Decoder Transformer on a large corpus, and transfers to downstream tasks through fine-tuning. ... #ComputerVision #MachineLearning #Pocket #NLP #ICLR #KnowledgeEditing #read-later
Issue Date: 2025-05-07 Editable Neural Networks, Anton Sinitsin+, ICLR20 Comment: (Probably) the first study to propose Knowledge Editing. OpenReview: https://openreview.net/forum?id=HJedXaEtvS ... #Embeddings #Pocket #CTRPrediction #RecSys #SIGKDD #numeric
Issue Date: 2025-04-22 An Embedding Learning Framework for Numerical Features in CTR Prediction, Huifeng Guo+, arXiv20 Comment: Proposes a method that turns numerical inputs, which were previously either discretized or merely embedded with e.g. an MLP, into good embeddings, improving performance. It uses a mechanism that maps numerical values into another space and discretizes them automatically, together with per-field global information held in meta-e ... #NLP #LanguageModel #Transformer
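Issue Date: 2024-05-24 GLU Variants Improve Transformer, Noam Shazeer, N_A, arXiv20 Summary: Tests GLU variants in the feed-forward sublayers of the Transformer and finds that some variants improve quality over the usual activation functions. Comment: In a typical FFN, the mainstream approach is to apply a linear layer and then some activation function. GLU is one such structure, but there is room to improve the linear layer and the activation, and many variants are conceivable, so the paper tries a range of them. ...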
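
The difference between a plain FFN and a GLU-style FFN can be sketched as follows. This is a minimal illustration assuming the bias-free form (act(xW) ⊗ xV)W2, where the activation decides the variant; the helper `matvec`, the `ACTIVATIONS` table, and the toy matrices in the usage note are hypothetical names introduced here, and plain Python lists stand in for tensors.

```python
import math

def matvec(x, W):
    """Multiply row vector x (length d_in) by matrix W (d_in rows, d_out columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Gate activation per variant; "Bilinear" uses no activation at all.
ACTIVATIONS = {
    "GLU": sigmoid,
    "Bilinear": lambda z: z,
    "ReGLU": lambda z: max(0.0, z),
    "GEGLU": lambda z: 0.5 * z * (1.0 + math.erf(z / math.sqrt(2.0))),  # exact GELU
    "SwiGLU": lambda z: z * sigmoid(z),  # Swish with beta = 1
}

def ffn_plain(x, W1, W2):
    """Conventional FFN: linear layer, ReLU, linear layer."""
    return matvec([max(0.0, z) for z in matvec(x, W1)], W2)

def ffn_glu(x, W, V, W2, variant="SwiGLU"):
    """GLU-style FFN: an activated gate xW multiplies a second linear projection xV."""
    act = ACTIVATIONS[variant]
    gate = [act(z) for z in matvec(x, W)]
    value = matvec(x, V)
    hidden = [g * h for g, h in zip(gate, value)]  # element-wise product
    return matvec(hidden, W2)
```

For example, `ffn_glu([1.0, -2.0], [[1, 0], [0, 1]], [[1, 0], [0, 1]], [[1], [1]], "ReGLU")` gates out the negative component before the final projection. Note that the GLU form uses three weight matrices instead of two, so a fair comparison would shrink the hidden size accordingly.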

#Pocket #NLP #LanguageModel #Zero/FewShotPrompting #In-ContextLearning #NeurIPS #Admin'sPick
Issue Date: 2023-04-27 Language Models are Few-Shot Learners, Tom B. Brown+, NeurIPS20 Comment: The paper that proposed In-Context Learning. The definition of In-Context Learning given in the paper is worth knowing well. The figure below shows where in-context learning sits from a meta-learning perspective. Updating the parameters with SGD during pretraining is the outer ... #NaturalLanguageGeneration #NLP #LanguageModel #DataToTextGeneration #pretrained-LM #Zero/FewShotLearning
Issue Date: 2022-12-01 Few-Shot NLG with Pre-Trained Language Model, Chen+, University of California, ACL20 Comment: # Overview Neural end-to-end NLG approaches are data-hungry, so this work proposes a method that performs well in a few-shot setting (Few shot NLG). Table-to-Text task (WikiBIO data, plus additionally collected Book and Song domain Wiki ... #DocumentSummarization #MachineTranslation #NLP #Transformer #pretrained-LM
Issue Date: 2022-12-01 Leveraging Pre-trained Checkpoints for Sequence Generation Tasks, Rothe+, Google Research, TACL20 Comment: # Overview The BERT-to-BERT paper. Research using pre-trained checkpoints had mainly been done for NLU and not for Seq2Seq, so this work does it. Using publicly available BERT checkpoints, BERT as en ... #NaturalLanguageGeneration #NLP #DataToTextGeneration #pretrained-LM
Issue Date: 2022-12-01 Template Guided Text Generation for Task-Oriented Dialogue, Kale+, Google, EMNLP20 Comment: # Overview Rather than linearizing the Dialogue Act as-is and feeding it to the language model, turning it into simple template-based sentences before feeding it improves performance in zero-shot and few-shot settings (T5-based). ![image]low ... #NaturalLanguageGeneration #NLP #DataToTextGeneration #Transformer
Issue Date: 2022-09-16 Text-to-Text Pre-Training for Data-to-Text Tasks, Mihir+, Google Research, INLG20 Comment: # Overview Proposes fine-tuning a pre-trained T5 on Data2Text datasets. On WebNLG (graph-to-text), ToTTo (table-to-text), and Multiwoz (task oriented dialogue) data # ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing
Issue Date: 2022-04-28 When is Deep Learning the Best Approach to Knowledge Tracing?, Theophile+ （Ken Koedinger）, CMU+, JEDM20 Comment: A study comparing the following models on nine datasets in terms of AUC and RMSE: DLKT (DKT, SAKT, FFN) and Regression Models (IRT, PFA, DAS3H, Logist ...). Incidentally, when multiple KCs are linked to a single item ... #Pocket #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing #SIGKDD
Issue Date: 2022-04-27 Context-Aware Attentive Knowledge Tracing, Ghosh+, University of Massachusetts Amherst, KDD20 Comment: In this paper's experiments SAKT fails to beat DKVMN and DKT ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics #DropoutPrediction
Issue Date: 2022-04-14 Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment, Riiid AI Research, Lee+, CSEDU20 Comment: Previous dropout research has addressed school dropout, course dropout, and dropout in MOOCs, but little work has considered mobile learning environments. In mobile learning environments there are many external factors such as incoming calls and social apps, so study-session dropout frequently ... #MachineLearning #Pocket #NLP #NeurIPS
Issue Date: 2021-06-09 All Word Embeddings from One Embedding, Takase+, NeurIPS20 Comment: Most of the parameters of NN-based NLP models come from embeddings, and conventionally a separate embedding for each word has been stored in the form of a matrix. To reduce the number of model parameters, this study represents each word embedding through a transformation of a shared embedding ... #ComputerVision #EfficiencyImprovement #Pocket #ICML #Scaling Laws #Admin'sPick
Issue Date: 2025-05-12 EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, Mingxing Tan+, ICML19 Comment: I had not noted the original paper, so I am adding it. See also #346. ... #DocumentSummarization #NLP #Extractive
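Issue Date: 2023-08-28 Text Summarization with Pretrained Encoders, Liu+ （with Lapata）, EMNLP-IJCNLP19 Summary: This work proposes a general framework for text summarization using BERT, a recent pretrained language model. For the extractive model, a new encoder is introduced to obtain sentence representations. For abstractive summarization, the mismatch between encoder and decoder is alleviated by using different optimization schemes for each. A two-stage fine-tuning approach further improves summary quality. Experiments show that the proposed method achieves state-of-the-art results. Comment: The BERTSUMEXT paper. Compared with the standard BERT architecture, a [CLS] token is inserted at the start of every sentence and the Segment Embeddings alternate from sentence to sentence, so that sentence representations can be obtained. Then, corresponding to the [CLS] token of each encoded sentence ...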
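
The input construction described in the comment (a [CLS] token before every sentence and alternating segment ids) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_bertsum_input` is hypothetical, whitespace splitting stands in for a real WordPiece tokenizer, and closing each sentence with [SEP] is an assumption based on the usual BERT convention.

```python
def build_bertsum_input(sentences):
    """Return (tokens, segment_ids, cls_positions) for a list of sentence strings."""
    tokens, segment_ids, cls_positions = [], [], []
    for idx, sentence in enumerate(sentences):
        # Assumption: whitespace tokenization and a trailing [SEP] per sentence.
        sent_tokens = ["[CLS]"] + sentence.split() + ["[SEP]"]
        cls_positions.append(len(tokens))  # where this sentence's [CLS] lands
        tokens.extend(sent_tokens)
        # Segment id alternates 0, 1, 0, 1, ... from sentence to sentence.
        segment_ids.extend([idx % 2] * len(sent_tokens))
    return tokens, segment_ids, cls_positions
```

The encoder output at each index in `cls_positions` would then serve as the representation of the corresponding sentence.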

#NLP #Library
Issue Date: 2022-07-29 Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, Reimers+, UKP-TUDA, EMNLP19 Comment: A model in which sentence vectors, produced by embedding tokens with BERT and mean pooling, are trained with metric learning (fine-tuning) using a Siamese Network. <img width="655" alt="image" src="https://user-images.githu ... #MachineLearning #AdaptiveLearning #EducationalDataMining #KnowledgeTracing
Issue Date: 2022-07-22 Deep-IRT: Make Deep Learning Based Knowledge Tracing Explainable Using Item Response Theory, Chun-Kit Yeung, EDM19 Comment: # In a nutshell A study that improves interpretability by passing the summary vector f_t of DKVMN #352 and the KC embedding k_t independently through fully connected layers to obtain scalars, so that the student's per-skill ability parameter θ and the skill difficulty parameter β can be derived. # ... #Pocket #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing
Issue Date: 2022-04-28 Knowledge Tracing with Sequential Key-Value Memory Networks, Ghodai+, Research School of Computer Science, Australian National University, SIGIR19 #RecommenderSystems #CollaborativeFiltering #Evaluation #RecSys
Issue Date: 2022-04-11 Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches, Politecnico di Milano, Maurizio+, RecSys19 Comment: Best paper of RecSys'19. Japanese explainer: https://qiita.com/smochi/items/98dbd9429c15898c5dc7 An important study ... #AdaptiveLearning #EducationalDataMining #StudentPerformancePrediction #EDM
Issue Date: 2021-10-28 A Self-Attentive model for Knowledge Tracing, Pandey+ （with George Karypis）, EDM19 Comment: The first study to introduce a self-attention layer to the Knowledge Tracing task. Given interactions (e_{t}, r_{t}) and the current exercise (e_{t+1}), the goal is to predict the correctness of current_exercise. * e_{ ... #NaturalLanguageGeneration #NLP #DataToTextGeneration #EMNLP
Issue Date: 2021-10-08 Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions （Row, Column and Time）, Gong+, Harbin Institute of Technology, EMNLP19 Comment: ## Overview Prior work has encoded a table as a set of records or as a long sequence, but 1. information along the other (column) dimension is lost (?) 2. table cells are time-series data that change over time ![imag ... #GraphConvolutionalNetwork #Education #EducationalDataMining #KnowledgeTracing #WI
Issue Date: 2021-07-08 GRAPH-BASED KNOWLEDGE TRACING: MODELING STUDENT PROFICIENCY USING GRAPH NEURAL NETWORK, Nakagawa+, Tokyo University, WI19 Comment: A paper that does Knowledge Tracing with a graph neural network. It also seems to visualize the proficiency of each concept properly. ... #NaturalLanguageGeneration #NLP #DataToTextGeneration #AAAI
Issue Date: 2021-06-26 Data-to-Text Generation with Content Selection and Planning, Puduppully+, AAAI19 Comment: One of the representative papers in Data2Text research on the Rotowire Dataset. Often used as a baseline along with the Wiseman model #207. Implementation: https://github.com/ratishsp/data2text-plan-py ... #ComputerVision #Pocket #NLP
Issue Date: 2021-06-15 On Empirical Comparisons of Optimizers for Deep Learning, Dami Choi+, N_A, arXiv19 Summary: Comparing deep learning optimizers is important, and the hyperparameter search space is suggested to affect the measured performance. In particular, experiments show that adaptive gradient methods never underperform the other optimizers, and practical tips on hyperparameter tuning are also provided. Comment: A study comparing which optimizer, among SGD, Momentum, RMSProp, Adam, NAdam, etc., performs best on image classification and language modeling (quoted from the Japanese explainer below). Japanese explainer: https://akichan-f.medium.com/optimizerはどれ ... #RecommenderSystems #CTRPrediction #CVRPrediction #SIGKDD
Issue Date: 2021-06-01 Conversion Prediction Using Multi-task Conditional Attention Networks to Support the Creation of Effective Ad Creatives, Kitada+, KDD19 Comment: # Overview Formulates ad CVR prediction as multi-task learning with CTR prediction. Analyzing the attention distribution of the resulting prediction model supports the creation of high-quality creatives. With information such as gender and genre, atten ... #Pocket #NLP #CommentGeneration #ACL
Issue Date: 2019-08-24 Coherent Comment Generation for Chinese Articles with a Graph-to-Sequence Model, Li+, ACL19 #RecommenderSystems #NaturalLanguageGeneration #Pocket #NLP #ReviewGeneration #WWW
Issue Date: 2019-08-17 Review Response Generation in E-Commerce Platforms with External Product Information, Zhao+, WWW19 #RecommenderSystems #NaturalLanguageGeneration #Pocket #NLP #ReviewGeneration #ACL
Issue Date: 2019-08-17 Automatic Generation of Personalized Comment Based on User Profile, Zeng+, ACL19 Student Research Workshop #RecommenderSystems #NaturalLanguageGeneration #NLP #ReviewGeneration #WWW
Issue Date: 2019-05-31 Multimodal Review Generation for Recommender Systems, Truong+, WWW19 Comment: A study that jointly learns Personalized Review Generation and Rating Prediction (joint learning itself already exists in prior work). In addition, prior work usually takes the user and item as input, whereas this work uses review photos as multi-modal input ... #NaturalLanguageGeneration #Pocket #NLP #ContextAware #AAAI
Issue Date: 2019-01-24 Response Generation by Context-aware Prototype Editing, Wu+, AAAI19 #ComputerVision #MachineLearning #Pocket #Normalization
Issue Date: 2025-04-02 Group Normalization, Yuxin Wu+, arXiv18 Comment: BatchNormalization does not work well with small batch sizes, which is a problem when memory constraints prevent a large batch size, so the authors devised a normalization that does not depend on batch size. LayerNorm and InstanceNorm are also independent of batch size, but on image tasks the proposed method ... #Embeddings #NLP #RepresentationLearning
Issue Date: 2022-06-08 Deep contextualized word representations, Peters+, Allen Institute for Artificial Intelligence, NAACL18 Comment: The ELMo paper. Ordinary word embeddings could assign only one meaning per word; this work realizes embeddings that can express different meanings depending on context (the same word can change meaning with context; for example, "right" can mean the direction, correct, or a legal right depending on context ... #EducationalDataMining #StudentPerformancePrediction #EDM
Issue Date: 2021-11-12 Modeling Hint-Taking Behavior and Knowledge State of Students with Multi-Task Learning, Chaudry+, Indian Institute of Technology, EDM18 Comment: A study that trains DKVMN (#352) jointly with a hint-taking task via multi-task learning ![image](https://user-images.githubusercontent.com/12249301/141440172-6f708367-1804-4b0c-8c1a-4 ... #NaturalLanguageGeneration #NLP #DataToTextGeneration #COLING
Issue Date: 2021-10-25 Point precisely: Towards ensuring the precision of data in generated texts using delayed copy mechanism., Li+, Peking University, COLING18 Comment: # Overview Proposes a method to improve the precision of the data in generated text for the DataToText task, using a two-stage algorithm: ① an encoder-decoder model generates template text containing slots; ② a Copy Mechanism fills in the slot data. ① and ② are each ... #NaturalLanguageGeneration #NLP #DataToTextGeneration #EMNLP
Issue Date: 2021-09-16 Operation-guided Neural Networks for High Fidelity Data-To-Text Generation, Nie+, Sun Yat-Sen University, EMNLP18 Comment: # Overview Existing neural models cannot generate language based on raw data or on facts inferred from it (which matters in domains such as finance, medicine, and sports). For example, as shown in the table below, the word "edge" indicates that the score is close (95-94=1 -> the score difference is small# ... #RecommenderSystems #CollaborativeFiltering #Contents-based #NewsRecommendation #WWW
Issue Date: 2021-06-01 DKN: Deep Knowledge-Aware Network for News Recommendation, Wang+, WWW18 Comment: # Overview Recommends news by predicting CTR with a contents-based method. Entities in news titles are linked to a knowledge graph so that richer information can be used. A CNN uses not only word embeddings but also entity embeddings, cont#3 ... #EducationalDataMining #LearningAnalytics #StudentPerformancePrediction #AAAI
Issue Date: 2021-05-28 Exercise-Enhanced Sequential Modeling for Student Performance Prediction, Hu+, AAAI18 Comment: Conventional Student Performance Prediction takes Knowledge Components and past correctness on problems as input, and could not explicitly account for the difficulty of the problem itself as conveyed by the problem text. Therefore, with knowledge components ... #RecommenderSystems #CollaborativeFiltering #FactorizationMachines #CTRPrediction #WWW
Issue Date: 2020-08-29 Field Weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising, Pan+, WWW18 Comment: Field Aware Factorization Machines (FFM), said to be the best-performing model for CTR prediction, have a very large number of parameters, on the order of (number of fields × number of features); this work proposes a method that uses memory more efficiently. Compared with FFM, performance ... #Pocket #NLP #CommentGeneration #WWW
Issue Date: 2019-08-24 Netizen-Style Commenting on Fashion Photos: Dataset and Diversity Measures, Lin+, WWW18 #RecommenderSystems #NaturalLanguageGeneration #Pocket #NLP #ReviewGeneration #RecSys
Issue Date: 2019-08-17 Improving Explainable Recommendations with Synthetic Reviews, Ouyang+, RecSys18 #MachineLearning #GraphBased #Pocket #GraphConvolutionalNetwork #ESWC
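Issue Date: 2019-05-31 Modeling Relational Data with Graph Convolutional Networks, Michael Schlichtkrull+, N_A, ESWC18 Summary: Because knowledge graphs contain incomplete information, relational graph convolutional networks (R-GCNs) are applied to knowledge base completion tasks. R-GCNs are neural networks developed to handle highly multi-relational data, and are shown to be effective for both entity classification and link prediction. Furthermore, using them as an encoder model improves link prediction, with a substantial performance gain. #RecommenderSystems #GraphBased #Pocket #GraphConvolutionalNetwork #SIGKDD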
Issue Date: 2019-05-31 Graph Convolutional Neural Networks for Web-Scale Recommender Systems, Ying+, KDD18 #NLP #ReviewGeneration #ACL
Issue Date: 2019-04-12 Personalized Review Generation by Expanding Phrases and Attending on Aspect-Aware Representations, Ni+, ACL18 Comment: ![image](https://user-images.githubusercontent.com/12249301/56010165-8fd44a00-5d1d-11e9-8cad-81a5178d95d2.png) The Personalized Review Generation task, uPy ... #NaturalLanguageGeneration #Pocket #NLP #AAAI
Issue Date: 2019-01-24 A Knowledge-Grounded Neural Conversation Model, Ghazvininejad+, AAAI18 #RecommenderSystems #Survey
Issue Date: 2018-04-16 Deep Learning based Recommender System: A Survey and New Perspectives, Zhang+, CSUR18 #Pocket #NLP #DialogueGeneration #ACL
Issue Date: 2018-02-08 Personalizing Dialogue Agents: I have a dog, do you have pets too?, Zhang+, ACL18 #NaturalLanguageGeneration #Pocket #NLP #TACL
Issue Date: 2017-12-31 Generating Sentences by Editing Prototypes, Guu+, TACL18 #RecommenderSystems #General #Embeddings #MachineLearning #AAAI #Admin'sPick
Issue Date: 2017-12-28 StarSpace: Embed All The Things, Wu+, AAAI18 Comment: Proposes an embedding learning method that can be used generically across tasks such as classification, ranking, and recommendation. The objects whose embeddings are learned are called Entities, and an Entity is described as a bag-of-features. Anything can be an Entity as long as it can be described as a bag-of-features. In practice, S ... #Pocket #NLP #MoE(Mixture-of-Experts) #ICLR
Issue Date: 2025-04-29 Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, Noam Shazeer+, ICLR17 Comment: The study that proposed the Mixture-of-Experts (MoE) Layer ... #ComputerVision #Pocket #Optimizer
Issue Date: 2023-12-13 Large Batch Training of Convolutional Networks, Yang You+, N_A, arXiv17 Summary: To speed up the training of large convolutional networks, a new training algorithm is proposed. It uses Layer-wise Adaptive Rate Scaling (LARS) to train with large batch sizes without loss of model accuracy. Specifically, Alexnet was scaled up to a batch size of 8K and Resnet-50 to a batch size of 32K. Comment: A paper in the "increasing the batch size degrades performance" vein (CNN). OpenReview: https://openreview.net/forum?id=rJ4uaX2aW Rejected at ICLR'18. Synchronized SGD that can handle larger batch sizes than those proposed in prior work is a strength, but ...
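
The layer-wise scaling idea behind LARS can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: each layer gets a local learning rate proportional to the ratio of its weight norm to its gradient norm, momentum is omitted for brevity, the function names and default hyperparameter values are hypothetical, and returning 1.0 for zero norms is a fallback chosen here.

```python
import math

def l2_norm(xs):
    return math.sqrt(sum(x * x for x in xs))

def lars_local_lr(weights, grads, trust_coef=0.001, weight_decay=0.0005):
    """Local rate: trust_coef * ||w|| / (||g|| + weight_decay * ||w||)."""
    w_norm, g_norm = l2_norm(weights), l2_norm(grads)
    denom = g_norm + weight_decay * w_norm
    if w_norm == 0.0 or denom == 0.0:
        return 1.0  # assumption: fall back to the global rate when norms vanish
    return trust_coef * w_norm / denom

def lars_step(layers, grads, global_lr=0.1, trust_coef=0.001, weight_decay=0.0005):
    """One plain-SGD step in which every layer uses its own scaled learning rate."""
    new_layers = []
    for w, g in zip(layers, grads):
        local_lr = lars_local_lr(w, g, trust_coef, weight_decay)
        new_layers.append([
            wi - global_lr * local_lr * (gi + weight_decay * wi)
            for wi, gi in zip(w, g)
        ])
    return new_layers
```

With `weight_decay=0`, the size of each layer's update depends only on the weight norm and the gradient direction, not on the gradient magnitude, which is the property that keeps layers with very large or very small gradients from diverging or stalling at large batch sizes.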

#EducationalDataMining #LearningAnalytics
Issue Date: 2021-06-10 Deep Model for Dropout Prediction in MOOCs, Wang+, ICCSE17 Comment: One major problem in MOOCs is the high dropout rate, and various models have been proposed to prevent it. The problem with the models proposed so far is that they require manual feature engineering, because feature engineering is domai ... #EducationalDataMining #LearningAnalytics #AffectDetection #AIED
Issue Date: 2021-06-08 Improving Sensor-Free Affect Detection Using Deep Learning, Botelho+, AIED17 Comment: Mentions, with references, that DKT actually performs not much differently from BKT. A Ryan Baker and Neil Heffernan paper. Affect Detection research has used physical/psychological sensors, and these, due to various constraints ... #EducationalDataMining #StudentPerformancePrediction #KnowledgeTracing #WWW
Issue Date: 2021-05-28 Dynamic Key-Value Memory Networks for Knowledge Tracing, Yeung+, WWW17 Comment: One of the representative deep Knowledge Tracing models. In KT research it is often compared as a baseline alongside DKT. It is usually called DKVMN, and its distinguishing feature is that it can do Knowledge Tracking. The model splits into the left and right sides of the figure below. The left side, for exercise qt, the student ... #RecommenderSystems #NLP #ReviewGeneration #SIGIR
Issue Date: 2019-04-12 Neural rating regression with abstractive tips generation for recommendation, Li+, SIGIR17 Comment: The first study to improve both tasks by performing Rating Prediction and tips generation simultaneously. Tips are short texts (e.g. a single sentence) that concisely describe a user's experience and impressions. ![image](https://user-images.githubusercontent ... #NLP #ReviewGeneration #INLG
Issue Date: 2019-04-12 Towards automatic generation of product reviews from aspect-sentiment scores, Zang+, INLG17 Comment: A paper that tackles long review generation with a hierarchical NN ... #NLP #ReviewGeneration #EACL
Issue Date: 2019-03-08 Learning to Generate Product Reviews from Attributes, Dong+, EACL17 Comment: (Probably) the first review generation paper ... #RecommenderSystems #NaturalLanguageGeneration #CollaborativeFiltering #NLP #ReviewGeneration #IJCNLP
Issue Date: 2019-02-01 Estimating Reactions and Recommending Products with Generative Models of Reviews, Ni+, IJCNLP17 Comment: Jointly learns content recommendation by Collaborative Filtering (CF) and Review Generation, improving both. A very interesting setting, and the first study to perform Review Generation in this experimental setup. In CF, Matrix Factoriza ... #RecommenderSystems #CollaborativeFiltering #MatrixFactorization #WWW #Admin'sPick
Issue Date: 2018-02-16 Neural Collaborative Filtering, He+, WWW17 Comment: Neural Collaborative Filtering, which generalizes Collaborative Filtering with an MLP, and Matrix Factorization as the element-wise product of user and item embeddings + linear transformat ... #RecommenderSystems #Tutorial #InformationRetrieval #SIGKDD
Issue Date: 2018-02-16 Deep Learning for Personalized Search and Recommender Systems, KDD17 #Tutorial #NeurIPS
Issue Date: 2018-02-06 Deep Learning: Practice and Trends, NIPS17 Comment: A tutorial covering a wide range of topics from the basics to the latest developments ... #Survey #NLP
Issue Date: 2018-02-04 Recent Trends in Deep Learning Based Natural Language Processing, Young+, arXiv17 #Pocket #NLP #GenerativeAdversarialNetwork #NeurIPS
Issue Date: 2018-02-04 Adversarial Ranking for Language Generation, Lin+, NIPS17 #MachineTranslation #NLP #Transformer #FoundationModel #Attention #NeurIPS #Admin'sPick
Issue Date: 2018-01-19 Attention is all you need, Vaswani+, NIPS17 Comment: The Transformer paper (uses self-attention). Explanatory slides: https://www.slideshare.net/DeepLearningJP2016/dlattention-is-all-you-need Explanatory article: https://qiita.com/nishiba/i分か ... #Tutorial #MachineTranslation #NLP
Issue Date: 2018-01-15 ゼロから始めるニューラルネットワーク機械翻訳, 中澤敏明, NLP17 Comment: An NMT tutorial by Nakazawa. ... #MachineLearning #Online/Interactive #Pocket
Issue Date: 2018-01-01 Online Deep Learning: Learning Deep Neural Networks on the Fly, Doyen Sahoo+, N_A, arXiv17 Summary: This work proposes a new framework for training deep neural networks (DNNs) in real time in an online setting. Because conventional backpropagation is not well suited to online learning, it proposes a new method, Hedge Backpropagation (HBP). The method is validated as effective on large-scale datasets, including stationary and concept-drift scenarios. #DocumentSummarization #Document #Supervised #Pocket #NLP #ACL
Issue Date: 2018-01-01 Coarse-to-Fine Attention Models for Document Summarization, Ling+ （with Rush）, ACL17 Workshop on New Frontiers in Summarization #NaturalLanguageGeneration #NLP #DataToTextGeneration #EMNLP #Admin'sPick
Issue Date: 2018-01-01 Challenges in Data-to-Document Generation, Wiseman+ （with Rush）, EMNLP17 Comment: ・Collected and released the RotoWire data (NBA table data + summaries) ![image](https://user-images.githubusercontent.com/12249301/119625430-23f1c480-be45-11eb-8ff8-5e9223d41481.png)【 ... #Single #DocumentSummarization #Document #Supervised #NLP #Abstractive #ACL #Admin'sPick
Issue Date: 2017-12-31 Get To The Point: Summarization with Pointer-Generator Networks, See+, ACL17 Comment: Explanatory slides: https://www.slideshare.net/akihikowatanabe3110/get-to-the-point-summarization-with-pointergenerator-networks/1 Can both generate words and copy words: a hybrid neural document ... #DocumentSummarization #Supervised #Pocket #NLP #Abstractive #EACL
Issue Date: 2017-12-31 Cutting-off redundant repeating generations for neural abstractive summarization, Suzuki+, EACL17 #Multi #DocumentSummarization #Document #Supervised #GraphBased #NLP #GraphConvolutionalNetwork #Extractive #CoNLL
Issue Date: 2017-12-31 Graph-based Neural Multi-Document Summarization, Yasunaga+, CoNLL17 Comment: Applies a Graph Convolutional Network (GCN) to MDS. Existing neural MDS models [Cao et al., 2015, 2017] could not take relations between sentences into account; this work does so with a GCN. In addition, the MDS training data ... #NaturalLanguageGeneration #Controllable #NLP #DataToTextGeneration #ConceptToTextGeneration #ICML
Issue Date: 2017-12-31 Toward Controlled Generation of Text, Hu+, ICML17 Comment: Text generation today basically just generates text according to the likelihood of a trained language model, with no way to control the output text; this paper makes such control possible. The model resembles VAE-based text generation combined with a GAN. The source from which decoding ... #ComputerVision #NaturalLanguageGeneration #NLP #ACL
Issue Date: 2017-12-31 Multi-Task Video Captioning with Video and Entailment Generation, Pasunuru+, ACL17 Comment: Explanatory slides: https://www.slideshare.net/HangyoMasatsugu/hangyo-acl-paperreading2017multitask-video-captioning-with-video-and-entailment-generation/1 multitas ... #Pretraining #Unsupervised #NLP #EMNLP
Issue Date: 2017-12-31 Unsupervised Pretraining for Sequence to Sequence Learning, Ramachandran+, EMNLP17 Comment: Proposes a method for pretraining the weights of seq2seq models. seq2seq tends to overfit when the training data is small, so the model is pretrained in an unsupervised manner on large-scale data and then fine-tuned on the target data, which improved accuracy ... #EfficiencyImprovement #NLP #ACL
Issue Date: 2017-12-31 Learning to skim text, Yu+, ACL17 Comment: Explanatory slides: http://www.lr.pi.titech.ac.jp/~haseshun/acl2017suzukake/slides/07.pdf ![image](https://user-images.githubusercontent.com/12249301/34460775-f64d4 ... #Embeddings #Analysis #NLP #Word #ACL
Issue Date: 2017-12-30 Skip-Gram – Zipf + Uniform = Vector Additivity, Gittens+, ACL17 Comment: Explanatory slides: http://www.lr.pi.titech.ac.jp/~haseshun/acl2017suzukake/slides/09.pdf Gives a theoretical justification for the additive compositionality of embeddings (e.g. man+royal=king) (taken from the explanatory slides) ... #Embeddings #NLP #Word #NeurIPS
Issue Date: 2017-12-29 Poincare Embeddings for Learning Hierarchical Representations, Nickel+, NIPS17 Comment: Explanation: http://tech-blog.abeja.asia/entry/poincare-embeddings Explanatory slides: https://speakerdeck.com/eumesy/poincare-embeddings-for-learning-hierarchical-represe・ ... #Sentence #Embeddings #NLP #EMNLP
Issue Date: 2017-12-28 Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, Conneau+, EMNLP17 Comment: slide: https://www.slideshare.net/naoakiokazaki/supervised-learning-of-universal-sentence-representations-from-natural-language-inference-data A general-purpose sentence en ... #Sentence #Embeddings #NLP #ICLR #Admin'sPick
Issue Date: 2017-12-28 A structured self-attentive sentence embedding, Lin+ （Bengio group）, ICLR17 Comment: OpenReview: https://openreview.net/forum?id=BJC_jUqxe ... #MachineTranslation #Pocket #NLP #ACL
Issue Date: 2017-12-28 What do Neural Machine Translation Models Learn about Morphology?, Yonatan Belinkov+, ACL17 Comment: http://www.lr.pi.titech.ac.jp/~haseshun/acl2017suzukake/slides/06.pdf (Added 2025.05.12) The link above is an explanatory slide deck from the ACL 2017 reading group held at Suzukakedai in 2017. ... #MachineTranslation #NLP #ACL
Issue Date: 2017-12-28 Sequence-to-Dependency Neural Machine Translation, Wu+, ACL17 #MachineTranslation #Pocket #NLP #EMNLP
Issue Date: 2017-12-28 Neural Machine Translation with Source-Side Latent Graph Parsing, Kazuma Hashimoto+, EMNLP17 #Tutorial #ComputerVision #Pocket #GenerativeAdversarialNetwork
Issue Date: 2017-12-28 Generative Adversarial Networks: An Overview, Dumoulin+, IEEE-SPM17 #Pocket #SpeechProcessing #Admin'sPick
Issue Date: 2025-06-13 WaveNet: A Generative Model for Raw Audio, Aaron van den Oord+, arXiv16 #Controllable #NLP #EMNLP #Length
Issue Date: 2025-01-03 Controlling Output Length in Neural Encoder-Decoders, Yuta Kikuchi+, EMNLP16 Comment: The first study to propose a method for controlling output length in encoder-decoder models ... #AdaptiveLearning #EducationalDataMining #LearningAnalytics #KnowledgeTracing #NeurIPS
Issue Date: 2022-04-27 Estimating student proficiency: Deep learning is not the panacea, Wilson+, Knewton+, NIPS16 workshop Comment: Cites #355, a study comparing the performance of DKT with methods such as BKT and PFA, and discusses the difference in how AUC is computed for DKT and BKT ... #EducationalDataMining #LearningAnalytics #StudentPerformancePrediction #EDM
Issue Date: 2021-05-29 Back to the basics: Bayesian extensions of IRT outperform neural networks for proficiency estimation, Ekanadham+, EDM16 Comment: Research by Knewton. Performs Student Performance Prediction with IRT and extended IRT models and compares them with DKT #297 on three datasets. The comparison shows that IRT and the extended IRT models perform on par with or better than DKT. Compared with DKT, IRT ... #EducationalDataMining #LearningAnalytics #StudentPerformancePrediction #KnowledgeTracing #EDM
Issue Date: 2021-05-28 Going Deeper with Deep Knowledge Tracing, Beck+, EDM16 Comment: Describes the differences in the inputs of BKT, PFA, and DKT, which makes it very easy to follow ![image](https://user-images.githubusercontent.com/12249301/119996969-310be080-c00a-11eb-84ce-631413ecaa4e.ちな ... #EducationalDataMining #LearningAnalytics #StudentPerformancePrediction #KnowledgeTracing #EDM
Issue Date: 2021-05-28 How Deep is Knowledge Tracing?, Mozer+, EDM16 Comment: Points out four types of regularity that DKT can account for but BKT cannot, and extends BKT to account for them (forgetting, interactions among skills, incorporating latent student abilities); as a result, DKT ... #RecommenderSystems #Pocket #RecSys #Admin'sPick
Issue Date: 2018-12-27 Deep Neural Networks for YouTube Recommendations, Covington+, RecSys16 #DocumentSummarization #NaturalLanguageGeneration #Pocket #NLP
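Issue Date: 2018-10-06 Neural Headline Generation with Minimum Risk Training, Ayana+, N_A, arXiv16 Summary: For automatic headline generation, model parameters are optimized with a minimum risk training strategy, which improves headline generation. The proposed method outperforms state-of-the-art systems on English and Chinese headline generation tasks. #MachineLearning #Pocket #GraphConvolutionalNetwork #NeurIPS #Admin'sPick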
Issue Date: 2018-03-30 Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering, Defferrard+, NIPS16 Comment: Reportedly worth reading when studying GCNs. See also: Semi-Supervised Classification with Graph Convolutional Networks, Kipf+, ICLR'17 https://github.com/tkipf/gcn ... #Tutorial #MachineLearning #ICML
Issue Date: 2018-02-22 Tutorial: Deep Reinforcement Learning, David Silver, ICML16 #MachineLearning #Normalization #Admin'sPick
Issue Date: 2018-02-19 Layer Normalization, Ba+, arXiv16 Comment: Explanatory slides: https://www.slideshare.net/KeigoNishida/layer-normalizationnips Training state-of-the-art, deep neural networks is computationally expensive. O ... #NaturalLanguageGeneration #Pocket #NLP #CoNLL #Admin'sPick
Issue Date: 2018-02-14 Generating Sentences from a Continuous Space, Bowman+, CoNLL16 Comment: Sentence generation using a VAE. 【Variational Autoencoder徹底解説】 https://qiita.com/kenmatsu4/items/b029d697e9995d93aa24 ... #Tutorial #GenerativeAdversarialNetwork #NeurIPS
Issue Date: 2018-02-06 Generative Adversarial Networks （GANS）, NIPS16 Comment: A GAN tutorial by Goodfellow ... #RecommenderSystems #CollaborativeFiltering #WSDM #Admin'sPick
Issue Date: 2018-01-02 Collaborative Denoising Auto-Encoders for Top-N Recommender Systems, Wu+, WSDM16 Comment: Proposes the Collaborative Denoising Auto-Encoder (CDAE), a top-N recommendation method based on Denoising Auto-Encoders. It corresponds to a model-based Collaborative Filtering method. One that reconstructs a corrupted input, De# ... #Tutorial #SentimentAnalysis #NLP #EMNLP
Issue Date: 2018-01-01 Neural Network for Sentiment Analysis, EMNLP16 #Single #DocumentSummarization #Document #Supervised #NLP #Abstractive #ACL #Admin'sPick
Issue Date: 2017-12-31 Incorporating Copying Mechanism in Sequence-to-Sequence Learning, Gu+, ACL16 Comment: Explanatory slides: https://www.slideshare.net/akihikowatanabe3110/incorporating-copying-mechanism-in-sequene-to-sequence-learning Proposes a network that can both copy and generate words. locati ... #Single #DocumentSummarization #Document #Supervised #NLP #Abstractive #IJCAI
Issue Date: 2017-12-31 Distraction-Based Neural Networks for Modeling Documents, Chen+, IJCAI16 Comment: A study that summarizes "documents" with a neural model. The proposed method introduces a mechanism called distraction into an attention-based sequence-to-sequence model. The motivation for introducing distraction is to traverse the different pieces of information in the input document ... #Single #DocumentSummarization #Document #Supervised #NLP #Extractive #ACL
Issue Date: 2017-12-31 Neural Summarization by Extracting Sentences and Words, Cheng+, ACL16 Comment: For extractive, neural single-document summarization, this may be worth using as a baseline ... #NaturalLanguageGeneration #NLP #Dataset #ConceptToTextGeneration #EMNLP
Issue Date: 2017-12-31 Neural Text Generation from Structured Data with Application to the Biography Domain, Lebret+, EMNLP16 #BeamSearch #NLP #EMNLP
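Issue Date: 2017-12-30 Sequence-to-Sequence Learning as Beam-Search Optimization, Wiseman+, EMNLP16 Comment: When training seq2seq, the gold history is used (the words generated so far are identical to the gold ones) and the model is trained to maximize the likelihood of the next word, but this has the following issues: 1. Exposure Bias: at test time, unlike training time, the gold history is not available, and trai ... #Sentence #NLP #LanguageModel #ACL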
Issue Date: 2017-12-28 Larger-context language modelling with recurrent neural networks, Wang+, ACL16 Comment: ## Overview Ordinary neural language models assume independence between sentences; by dropping this assumption and modeling dependence on the preceding sentences, the corpus-level perplexity of the language model improved. The proposed language ... #DocumentSummarization #Document #Supervised #NLP #Abstractive #IJCAI
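Issue Date: 2017-12-28 Distraction-Based Neural Networks for Modeling Documents, Chen+, IJCAI16 Comment: A study that summarizes "documents" with a neural model. The proposed method introduces a mechanism called distraction into an attention-based sequence-to-sequence model. The motivation for introducing distraction is to traverse the different pieces of information in the input document Dist ... #Sentence #Embeddings #NLP #NAACL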
Issue Date: 2017-12-28 Learning Distributed Representations of Sentences from Unlabelled Data, Hill+, NAACL16 Comment: About learning sentence representations. Representative methods for building sentence representations (CBOW, SkipGram, SkipThought, Paragraph Vec, NMT, etc.) under supervised evaluation (task-oriented + supervised) and un ... #MachineTranslation #NLP #ACL #Admin'sPick
Issue Date: 2017-12-28 Pointing the unknown words, Gulcehre+, ACL16 Comment: Addresses the unknown-word problem by introducing a mechanism that can copy from the source text when generating text. Presented at the same time as CopyNet (in fact, at the same conference) ... #ComputerVision #Visual Words #CVPR
Issue Date: 2017-12-28 Image Captioning with Semantic Attention, You+, CVPR16. Comment: A paper showing that captioning accuracy improves when Visual Words are explicitly added to the model input in addition to the image itself ... #ComputerVision #Visual Words #CVPR
Issue Date: 2017-12-28 What Value Do Explicit High Level Concepts Have in Vision to Language Problems?, Wu+, CVPR16. #ComputerVision #ECCV
Issue Date: 2017-12-28 Generating Visual Explanations, Hendricks+, ECCV16 #MachineTranslation #Pocket #NLP #Attention #ICLR #Admin'sPick
Issue Date: 2025-05-12 Neural Machine Translation by Jointly Learning to Align and Translate, Dzmitry Bahdanau+, ICLR15 Comment: The study that first proposed (Cross-)Attention. I had not made a note of it, so I am adding it belatedly. Attention started here (as far as I understand) ... #RecommenderSystems #Pocket #CTRPrediction #SequentialRecommendation #SIGKDD
Issue Date: 2025-04-25 E-commerce in Your Inbox: Product Recommendations at Scale, Mihajlo Grbovic+, KDD15 Comment: A study on product recommendation in Yahoo mail ![image](https://github.com/user-attachments/assets/6f54d2c7-6f30-411b-94c9-888c62811bd8) From the receipt information in Yahoo mail, information about product purchases and timest Related: ... #MachineTranslation #NLP #EMNLP #Admin'sPick
Issue Date: 2021-06-02 Effective Approaches to Attention-based Neural Machine Translation, Luong+, EMNLP15 Comment: The Luong paper. Whenever attention comes up, either Bahdanau+ or Luong+ is usually cited. It describes Global Attention and Local Attention. Global Attention is the one commonly used. Global Attentionやはり菊 ... #MachineLearning #ICML #Admin'sPick
Issue Date: 2018-02-19 An Empirical Exploration of Recurrent Network Architectures, Jozefowicz+, ICML15 Comment: Ideal for understanding the differences between GRU and LSTM ... #NLP #ACL #Admin'sPick
Issue Date: 2018-02-13 Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, Tai+, ACL15 Comment: The Tree-LSTM paper ... #InformationRetrieval #Search #MultitaskLearning #QueryClassification #WebSearch #NAACL
Issue Date: 2018-02-05 Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval, Liu+, NAACL-HLT15 Comment: A study that applies multi-task learning with a neural net to query classification and search. Tasks that require different operations, classification (multi-class classification) and ranking (pairwise learning-to-rank), within the multi-task learning framework ... #DocumentSummarization #Sentence #Supervised #NLP #Abstractive #EMNLP #Admin'sPick
Issue Date: 2017-12-31 A Neural Attention Model for Sentence Summarization, Rush+, EMNLP15 Comment: Explanatory slides: https://www.slideshare.net/akihikowatanabe3110/a-neural-attention-model-for-sentence-summarization-65612331 ... #Single #DocumentSummarization #Sentence #Document #NLP #Dataset #Abstractive #EMNLP #Admin'sPick
Issue Date: 2017-12-28 LCSTS: A large scale chinese short text summarization dataset, Hu+, EMNLP15 Comment: Builds the Large Chinese Short Text Summarization (LCSTS) dataset. The dataset construction exploits a characteristic of posts made by specific organizations on Weibo. When news is posted on Weibo, at the beginning of the post, the news's very short sCop ... #Document #Embeddings #NLP #ACL
Issue Date: 2017-12-28 A hierarchical neural autoencoder for paragraphs and documents, Li+, ACL15 Comment: Extends the standard seq2seq LSTM model in order to generate multiple sentences (here, as an autoencoder). In short, the goal is a paragraph/document representation; the idea is an LSTM layer that handles word-level information and ... #Document #Embeddings #SentimentAnalysis #NLP #EMNLP
Issue Date: 2017-12-28 Document Modeling with Gated Recurrent Neural Network for Sentiment Classification, Tang+, EMNLP15 Comment: Computes representations at the word level -> sentence level -> document level and performs sentiment classification of documents. May be a useful reference when generating document representations. sen ... #DocumentSummarization #Sentence #NLP #EMNLP #Admin'sPick
Issue Date: 2017-12-28 Sentence Compression by Deletion with LSTMs, Filippova+, EMNLP15 Comment: slide: https://www.slideshare.net/akihikowatanabe3110/sentence-compression-by-deletion-with-lstms ... #TimeSeriesDataProcessing #MachineLearning #Financial
Issue Date: 2017-12-31 Recurrent neural network and a hybrid model for prediction of stock returns, Akhter+, Expert Systems with Applications14 Comment: Applies NNs to the stock return prediction task. Applies the AR-MRNN model to an RNN and shows high performance. By using values with the moving reference subtracted as input-output, preprocessing such as normalization and detrending becomes un ... #RecommenderSystems #InformationRetrieval #Contents-based #CIKM
Issue Date: 2021-06-01 Learning Deep Structured Semantic Models for Web Search using Clickthrough Data, Huang+, CIKM13 Comment: Japanese explanation: https://shunk031.me/paper-survey/summary/others/Learning-Deep-Structured-Semantic-Models-for-Web-Search-using-Clickthrough-Data ... #RecommenderSystems #MatrixFactorization #NeurIPS #Admin'sPick
Issue Date: 2018-01-11 Deep content-based music recommendation, Oord+, NIPS13 Comment: A content-based music recommendation method (robust to the cold-start problem). To Weighted Matrix Factorization (WMF) (a Matrix Factorization method specialized for implicit-feedback data) #225, Convolu ... #ComputerVision #NeurIPS #Admin'sPick #ImageClassification
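Issue Date: 2025-05-13 ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky+, NIPS12 Comment: AlexNet, the study that ignited modern Deep Learning by showing overwhelming performance at ILSVRC 2012. I had not made a note of it, so I added it belatedly. Ushiku-sensei has summarized the image recognition techniques that preceded AlexNet (the challenges of the time and their solutions, yet more challenges remained... and so on, one challenge after another ... #CollaborativeFiltering #MatrixFactorization #EducationalDataMining #StudentPerformancePrediction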
Issue Date: 2021-10-29 Collaborative Filtering Applied to Educational Data Mining, Andreas+, KDD Cup10 Comment: The method that took third place in the Student Performance Prediction task of KDD Cup'10. Performs Student Performance Prediction using memory-based collaborative filtering and Matrix Factorization models. In the end, these mo ... #TimeSeriesDataProcessing #MachineLearning #Financial
Issue Date: 2017-12-31 Prediction-based portfolio optimization model using neural networks, Freitas+, Neurocomputing09 Comment: Applies NNs to the stock return prediction task. Proposes the AR-MRNN model, which uses as the NN input-output not the raw return values but values from which the return at a certain point in time (the moving reference) has been subtracted. ... #MachineLearning #Pocket #MoE(Mixture-of-Experts)
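Issue Date: 2025-04-29 Adaptive Mixture of Local Experts, Jacobs+, Neural Computation91 Comment: I thought this was the origin of Mixture of Experts, but the study below appears to carry an earlier year, so is this one not the origin...? However, its abstract says something to the effect that it compares the performance of the MoE proposed in the paper above, so the chronology is unclear to me. [Evaluation of Adaptive Mixtu ... #Article #RecommenderSystems #Embeddings #EfficiencyImprovement #AWS #MLOps #Article #A/B Testing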
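Issue Date: 2025-06-29 日経電子版のアプリトップ「おすすめ」をTwo Towerモデルでリプレースしました, NIKKEI, 2025.05 Comment: In a real-time recommendation use case, switching from rule-based + collaborative filtering (Jubatus) to a Two Tower model increased latency by 300 ms; the team identified the bottleneck and resolved it by moving part of the pipeline to batch processing while retaining real-time behavior. The AWS architecture, A/B testing, and load ... #Article #Embeddings #NLP #Word #STS (SemanticTextualSimilarity)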
Issue Date: 2024-11-20 Zipf 白色化：タイプとトークンの区別がもたらす良質な埋め込み空間と損失関数, Sho Yokoi, 2024.11 Comment: Original paper: [Yokoi, Bao, Kurita, Shimodaira, “Zipfian Whitening,” NeurIPS 2024. ](https://arxiv.org/abs/2411.00680) The word embedding space in neural models ... #Article #RecommenderSystems #CTRPrediction #NewsRecommendation #MLOps #Evaluation #Article #A/B Testing
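Issue Date: 2024-08-31 NewsPicksに推薦システムを本番投入する上で一番優先すべきだったこと, 2024.08 Comment: >It was to create a state in which experiments that can evaluate the quality of a recommendation model with higher confidence can be run more easily. Put plainly, "how to design a recommender system that is easy to A/B test" was what mattered most. A story of the kind where offline and online evaluation do not correlate; there was no environment in which A/B tests could be run easily, and CTR Also ... #Article #ComputerVision #EfficiencyImprovement #NLP #LanguageModel #DiffusionModel #Article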
Issue Date: 2023-10-29 StableDiffusion, LLMのGPUメモリ削減のあれこれ Comment: The explanations of Gradient Accumulation and Gradient Checkpointing were thorough and easy to follow. ... #Article #NLP #LanguageModel #Library #Transformer
Issue Date: 2023-05-04 OpenLLaMA Comment: Apparently builds a commercially usable LLaMA by applying the same method as LLaMA to a similar dataset ... #Article #EfficiencyImprovement #NLP #LanguageModel #Library #PEFT(Adaptor/LoRA)
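Issue Date: 2023-04-25 LoRA論文解説, Hayato Tsukagoshi, 2023.04 Comment: An explanation of a fine-tuning method that introduces low-rank matrices A and B next to some of the linear layers of the base pretrained model and fine-tunes only the parameters of A and B, drastically reducing the number of tuned parameters while achieving equivalent predictive performance and leaving inference speed unchanged. With LoRA, models that are too large ... #Article #Tutorial #MachineLearning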
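As a rough illustration of the LoRA idea summarized in the LoRA entry above, the following is a minimal pure-Python sketch, not the implementation from the paper or the linked article. The layer sizes, the zero initialization of B, and the helper names `matvec`, `lora_forward`, and `merge` are assumptions made for this example. It shows a frozen weight W with a trainable low-rank update B·A added next to it, and how folding B·A into W leaves a single matrix at inference time.

```python
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x):
    """Frozen base layer W x plus the trainable low-rank update B (A x)."""
    return [b + u for b, u in zip(matvec(W, x), matvec(B, matvec(A, x)))]

def merge(W, A, B):
    """Fold B A into W so that inference uses one matrix, like the base model."""
    r = len(A)
    return [[W[i][j] + sum(B[i][k] * A[k][j] for k in range(r))
             for j in range(len(W[0]))]
            for i in range(len(W))]

random.seed(0)
d_out, d_in, r = 4, 6, 2  # hypothetical sizes; the rank r is kept small
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]      # trainable, r x d_in
B = [[0.0] * r for _ in range(d_out)]                                  # trainable, d_out x r (assumed zero init)
x = [random.gauss(0, 1) for _ in range(d_in)]

full_params = d_out * d_in        # parameters updated by full fine-tuning of this layer
lora_params = r * (d_in + d_out)  # parameters updated when only A and B are tuned
```

With B at zero the adapted layer reproduces the base layer exactly, and the parameter saving grows as the layer dimensions grow relative to r.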
Issue Date: 2023-01-21 tuning_playbook, Google Research Comment: Know-how on training Deep Learning models, published by Google. A must-read. Japanese translation: https://github.com/Valkyrja3607/tuning_playbook_ja ... #Article #Tutorial #Library #Transformer
Issue Date: 2022-12-01 BetterTransformer, Out of the Box Performance for Hugging Face Transformers Comment: An article explaining BetterTransformer, which speeds up Transformer inference by up to 4.5x by adding just one line: better_model = BetterTransformer.transform(model) ... #Article #Tutorial #ComputerVision
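Issue Date: 2022-10-27 CNN vs. ViT, Ushiku-sensei Comment: ・Covers Swin Transformer, depth-wise conv, ConvNeXt, and the differences in robustness between ViT and CNN, which was informative ・It was interesting that the final conclusion is that CNNs and Transformers are no different (no clear winner; a draw for now) depth-wise co ... #Article #Tutorial #NLP #Transformer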
Issue Date: 2022-09-06 Transformerの最前線〜畳込みニューラルネットワークの先へ〜, Ushiku-sensei, 2022 #Article #NLP #LanguageModel #PEFT(Adaptor/LoRA)
Issue Date: 2022-08-19 The Power of Scale for Parameter-Efficient Prompt Tuning, Lester+, Google Research, EMNLP‘21 Comment: Japanese explanation: https://qiita.com/kts_plea/items/79ffbef685d362a7b6ce A method that, when fine-tuning a large language model such as T5, freezes the parameters of the large language model and independently learns the parameters that embed the prompt ... #Article #AdaptiveLearning #EducationalDataMining #KnowledgeTracing
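Issue Date: 2022-07-25 独立な学習者・項目ネットワークをもつ Deep-IRT, Tsutsumi+, IEICE Transactions (電子情報通信学会論文誌), 2021 Comment: # Motivation The ability values estimated by Deep-IRT depend on item characteristics, and the model assumes that all items within the same skill are homogeneous, so ability estimates cannot be obtained from items of differing difficulty. As a result, the interpretability of the ability and difficulty parameters is limited compared with conventional IRT. On the other hand, proposed by Kinoshita et al. ... #Article #ComputerVision #CVPR #Admin'sPick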
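Issue Date: 2021-11-04 Deep Residual Learning for Image Recognition, He+, Microsoft Research, CVPR’16 Comment: The ResNet paper. In ResNet, the function computed by a layer is defined as the sum of a residual F(x) and the identity function x. As a result, the layer only needs to learn the difference from its input, which makes optimization easier even as the model gets deeper. A Residual Connection is introduced every few layers, and via the identity function ショ同 ... #Article #AdaptiveLearning #EducationalDataMining #StudentPerformancePrediction #KnowledgeTracing #L@S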
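The residual formulation described in the ResNet entry above, where a layer outputs F(x) + x, can be sketched in a few lines of plain Python. This is only an illustrative toy under stated assumptions: `residual_block`, `zero_residual`, and `small_residual` are hypothetical names, and F is a stand-in function rather than the convolutional layers used in the paper.

```python
def residual_block(x, f):
    """Return F(x) + x: f supplies the residual, the identity shortcut carries x."""
    fx = f(x)
    return [a + b for a, b in zip(fx, x)]

def zero_residual(x):
    """A residual branch that has learned nothing yet: F(x) = 0."""
    return [0.0 for _ in x]

def small_residual(x):
    """A toy residual branch that nudges the input by 10%."""
    return [0.1 * v for v in x]

x = [1.0, -2.0, 3.0]

# Stacking blocks whose residual is zero leaves the input unchanged,
# so extra depth starts out as an identity mapping.
out = x
for _ in range(3):
    out = residual_block(out, zero_residual)

nudged = residual_block(x, small_residual)
```

The point of the sketch is the design choice: each block only has to model the difference from its input, so a block that contributes nothing still passes the signal through unchanged.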
Issue Date: 2021-10-29 Addressing Two Problems in Deep Knowledge Tracing via Prediction-Consistent Regularization, Yeung+, 2018, L@S Comment: Deep Knowledge Tracing (DKT) has the following problems: mastery goes down/up even though the student answered the corresponding skill correctly/incorrectly (the input is not reconstructed); proficiency suddenly jumps or drops (along the time axis, the mastery level is cons Implementation: ht ... #Article #NLP #LanguageModel
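Issue Date: 2021-09-09 GPT-3から我々は何を学べば良いのか, Yamamoto, Japio year book 2020 Comment: Overview of GPT-3: GPT-3 is trained on a 570GB extract of Common Crawl, a dataset collected from websites over several years (about 130 times the size of English Wikipedia). It learns a language model through repeated unsupervised learning that predicts the word following a given word sequence (an autoregressive language model). GPT-3 ... #Article #Survey #Pocket #NLP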
Issue Date: 2021-06-17 Pre-Trained Models: Past, Present and Future, Han+, AI Open‘21 Comment: Large-scale pre-trained models (PTMs) such as [BERT](https://www.sciencedirect.com/topics/computer-science/bidirectional-encoder-representations-from ... #Article #Tools #Library #python #Article
Issue Date: 2021-06-12 pytorch_lightning tips Comment: PyTorch Lightning 2021 (for MLコンペ) https://qiita.com/fam_taro/items/df8656a6c3b277f58781 ... #Article #EfficiencyImprovement #NLP #Transformer #ACL
Issue Date: 2021-06-10 FastSeq: Make Sequence Generation Faster, Yan+, ACL’21 Comment: Proposes a method that improves inference speed by 4-9x for a range of Transformer-based models such as BART, DistilBART, T5, and GPT2. ... #Article #Survey #NLP
Issue Date: 2021-06-09 A survey of Transformers, Lin+, AI Open‘22 Comment: A paper summarizing Transformer variants across various fields ![image](https://user-images.githubusercontent.com/12249301/121394765-a40f4280-c98c-11eb-8fac-0114715ec738.png)Transforme ... #Article #Tutorial #Tools #Library #python
Issue Date: 2021-06-06 TRTorch Comment: A library that can speed up PyTorch inference. Reportedly about 6x faster. Because conversion goes through TorchScript, it can apparently run in C++ as well as Python. ... #Article #MachineTranslation #NLP #NAACL
Issue Date: 2021-06-03 Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers, NAACL‘21 Comment: For Transformer-based NMT, refutes the conventional view that the encoder interprets the input and the decoder does the translating, and points out that translation already begins at the encoding stage, and even at the input embedding stage. If translation has already begun at the encoding stage, エこ ... #Article #DocumentSummarization #NaturalLanguageGeneration #NLP #ACL
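Issue Date: 2021-06-03 Incorporating Copying Mechanism in Sequence-to-Sequence Learning, Gu+, ACL’16 Comment: Like #371, a paper proposing a copy mechanism. Known as the Joint Copy Model or COPYNET. It scores whether the next word is "generated" or "copied", and represents this as a joint probability distribution that mixes each word's probability of being copied with its probability of being generated (also explained in #207 and elsewhere 解 ... #Article #DocumentSummarization #NaturalLanguageGeneration #NLP #ACL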
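Issue Date: 2021-06-02 Pointing the Unknown Words, Gulcehre+, ACL’16 Comment: A paper proposing the Conditional Copy Model (Pointer Softmax). When generating a word, it computes a distribution for generating from in-vocabulary words and a distribution for generating from words in the source text. The latter comes from the attention distribution. It introduces a random variable that decides whether or not to copy (sigmoid), 解 ... #Article #EducationalDataMining #LearningAnalytics #KnowledgeTracing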
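The Pointer Softmax entry above describes two distributions (one over in-vocabulary words, one over source positions taken from attention) combined through a sigmoid switch. A minimal sketch of that mixing step follows. It is a simplified reading of the note rather than the paper's exact formulation, and the function names and the toy scores are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pointer_softmax(vocab_scores, attention_scores, switch_logit):
    """Mix a vocabulary distribution and a copy (attention) distribution.

    The switch probability p_copy = sigmoid(switch_logit) decides how much
    probability mass goes to copying a source position versus generating
    a word from the vocabulary.
    """
    p_copy = 1.0 / (1.0 + math.exp(-switch_logit))
    gen = [(1.0 - p_copy) * p for p in softmax(vocab_scores)]
    copy = [p_copy * p for p in softmax(attention_scores)]
    return gen, copy

# Toy scores (hypothetical): three vocabulary words, two source positions.
gen, copy = pointer_softmax([2.0, 0.5, -1.0], [0.1, 3.0], switch_logit=0.0)
```

Because the two parts are weighted by 1 - p_copy and p_copy, the generate and copy probabilities together still sum to one, which is what lets an out-of-vocabulary source word receive probability mass through the copy branch.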
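Issue Date: 2021-06-02 Deep Knowledge Tracingの拡張による擬似知識タグの生成, Nakagawa+, Transactions of the Japanese Society for Artificial Intelligence (人工知能学会論文誌), Vol. 33, No. 33, C, 2018 Comment: The DKT model presupposes that knowledge tags (knowledge components) are assigned to each problem. However, not all data comes with knowledge tags, and for areas outside traditional education, such as programming education, structurally assigning knowledge tags in the first place ... #Article #SentimentAnalysis #NLP #RepresentationLearning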
Issue Date: 2021-06-01 Sentiment analysis with deeply learned distributed representations of variable length texts, Hong+, Technical Report. Technical report, Stanford University, 2015 Comment: From #363, which cites this paper: "CNN-based models have been empirically shown to outperform other models (e.g. Recurrent Neural Network, Recursive Neural Network) because they can detect and extract specific local patterns from sentences through convolution operations" ... #Article #EducationalDataMining #LearningAnalytics #StudentPerformancePrediction
Issue Date: 2021-05-29 Behavior-Based Grade Prediction for MOOCs Via Time Series Neural Networks, Chiang+, IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 11, NO. 5, AUGUST 2017 Comment: Proposes a model for predicting student grades in MOOCs. In MOOCs, student responses to assessments are sparse and a personalized model is needed, so grade prediction is a challenging task. Using lecture-video-watching clickstreams, N ... #Article #EducationalDataMining #LearningAnalytics #StudentPerformancePrediction #KnowledgeTracing
Issue Date: 2021-05-28 EKT: Exercise-aware Knowledge Tracing for Student Performance Prediction, Hu+, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019 Comment: Deep models such as DKT had not previously used information such as problem text; this study uses not only the learning log but also information such as problem text for KT. It appears to be a more refined journal version of #354. In #354, rather than KT, a model that predicts whether a problem is answered correctly モデ ... #Article #RecommenderSystems #CollaborativeFiltering #Pocket #FactorizationMachines #CTRPrediction #IJCAI
Issue Date: 2021-05-25 DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, Guo+, IJCAI’17 Comment: A paper that combines Factorization Machines and a Deep Neural Network in Wide&Deep fashion. Wide=Factorization Machines, Deep=DNN. Not only does it handle both high-order and low-order features, but through FM, per field #2 ... #Article #RecommenderSystems #CollaborativeFiltering #Pocket #FactorizationMachines #CTRPrediction #SIGKDD
Issue Date: 2021-05-25 xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems, Lian+, KDD‘18 Comment: An extension of #349 DeepFM. As also noted in #281, an overview is given at the link below. Trends around DeepFM: https://data.gunosy.io/entry/deep-factorization-machines-2018 It also describes the development of DeepFM in detail, and is very ... #Article #RecommenderSystems #LanguageModel #CIKM #SequentialRecommendation
Issue Date: 2021-05-25 BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer, Sun+, CIKM2019 Comment: Repurposes BERT for the sequential recommendation task in recsys and achieves SoTA. I have not read it closely, but the model structure is almost identical to BERT. The difference is that training does not use Next Sentence Prediction and performs only Cloze. Cloze is, 実BE ... #Article #Tutorial #ComputerVision #EfficiencyImprovement #Article #ImageClassification
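Issue Date: 2021-05-24 EfficientNet解説, omiita （オミータ）, 2019 Comment: Without changing the structure of existing image recognition models, compound scaling of width, depth, and resolution achieves SoTA with fewer parameters and less training time than before. Width, depth, and resolution each influence performance in interdependent ways, so instead of scaling them separately as before, the three are scaled in balance with one another. Scal ... #Article #Survey #ComputerVision #NLP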
Issue Date: 2021-05-19 MLP-like Architecture Comment: gMLP: Even a simple MLP equipped with a Spatial Gating Unit, with no large-scale self-attention, can approach Transformer performance (especially in CV). In other words, self-attention does not appear to be essential. For NLP, with gMLP, Transまあ ... #Article #Tools #NLP #Dataset #LanguageModel #Library #Article
Issue Date: 2020-03-13 BERT 日本語Pre-trained Model, NICT, 2020 Comment: Released by NICT. It also compares performance on benchmark data against previously released BERT models, and outperforms the other released BERT models. ... #Article #Survey #NLP #LanguageModel #Slide #Admin'sPick
Issue Date: 2019-11-09 事前学習言語モデルの動向 _ Survey of Pretrained Language Models, Kyosuke Nishida, 2019 Comment: [Up to 2019/06] ・ELMo (bidirectional 2-layer LSTM language model) ・GPT (left-to-right 12-layer Transformer autoregressive language model) ・BERT (24-layer bidirectional Transformer language model) ・MT-DNN (a study that adds multi-task layers on top of BERT) ・XLM (ELMo, ... #Article #Tools #NLP #Library
Issue Date: 2019-09-22 【黒橋研】BERT日本語Pretrainedモデル Comment: 【huggingface transformersで使える日本語モデルのまとめ】 https://tech.yellowback.net/posts/transformers-japanese-models ... #Article #Tutorial #GraphBased
Issue Date: 2019-05-31 Representation Learning on Graphs: Methods and Applications, Hamilton+, 2017 #Article #Tutorial #Tools #NLP
Issue Date: 2018-11-16 AllenNLP Comment: https://docs.google.com/presentation/d/17NoJY2SnC2UMbVegaRCWA7Oca7UCZ3vHnMqBV4SUayc/preview?slide=id.g43b8d8e880_0_8 ... #Article #Tutorial #MachineLearning #NLP
Issue Date: 2018-06-29 Pytorchによるtransformer実装チュートリアル #Article #Tutorial #MachineLearning #NLP
Issue Date: 2018-02-19 ニューラルネット勉強会（LSTM編）, Seitaro Shinagawa, 2016 Comment: Covers everything from the basics of LSTM to tips for implementing it: zero padding, how to apply dropout, normalization methods, and so on. ... #Article #Tutorial #NLP #Slide #Admin'sPick
Issue Date: 2018-01-15 自然言語処理のためのDeep Learning, Yuta Kikuchi #Article #RecommenderSystems #CollaborativeFiltering #MatrixFactorization #SIGKDD #Admin'sPick
Issue Date: 2018-01-11 Collaborative Deep Learning for Recommender Systems, Wang+, KDD’15 Comment: When learning user and item latent vectors from the rating matrix, it makes use of item embeddings produced by a Stacked Denoising Auto Encoder (SDAE). Collaborative Filtering and Contents-based Fil解 ... #Article #Survey #TimeSeriesDataProcessing
Issue Date: 2017-12-31 Artificial neural networks in business: Two decades of research, Tkac+, Applied Soft Computing 2016 Comment: A survey summarizing use cases of neural networks in business domains (e.g. stock market price prediction). May be a useful reference for handling time-series data and the like. ... #Article #NaturalLanguageGeneration #NLP #DataToTextGeneration #NAACL
Issue Date: 2017-12-31 What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment, Mei+, NAACL-HLT’16 Comment: Solves content selection and surface realization jointly using an encoder-decoder aligner. Adds mechanisms called a Refiner and a Pre-Selector to an ordinary attention-based model. In ordinary attention, atte ... #Article #Tutorial #EfficiencyImprovement
Issue Date: 2017-12-31 Efficient Methods and Hardware for Deep Learning, Han, Stanford University, 2017 #Article #Document #NLP #QuestionAnswering #NeurIPS
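Issue Date: 2017-12-28 Teaching Machines to Read and Comprehend, Hermann+, NIPS 2015 Comment: I read this quite a while ago, so my memory of it is hazy. The paper that created the CNN/DailyMail dataset (the one often used recently for training neural "document" summarization). In CNN/DailyMail, news articles come with human-written summaries, and by turning the entities in the summaries into blanks to fill in, and so on, ...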