List: Deep Learning/Paper (4)
Patrick's Data World

https://arxiv.org/abs/1910.13461
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-b..

https://arxiv.org/abs/1901.11196
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion. On five text classificati
Introduction: Te..

https://arxiv.org/abs/2003.10555
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Masked language modeling (MLM) pre-training methods such as BERT corrupt the input by replacing some tokens with [MASK] and then train a model to reconstruct the original tokens. While they produce good results when transferred to downstream NLP tasks, the
Title: ELECTRA: Pre-train..

https://arxiv.org/abs/1905.03677v1
Learning Loss for Active Learning
The performance of deep neural networks improves with more annotated data. The problem is that the budget for annotation is limited. One solution to this is active learning, where a model asks human to annotate data that it perceived as uncertain. A variet
Title: Learning Loss for Active Learning
Authors: Donggeun Yoo, In So Kw..