Should We Still Pretrain Encoders with Masked Language Modeling?
Paper • 2507.00994 • Published • 80
Research on pretraining encoders, with an extensive comparison of the masked language modeling (MLM) paradigm versus causal language modeling (CLM).
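For readers unfamiliar with how the two objectives differ in practice, a minimal sketch follows, not taken from the paper's code, using Hugging Face `transformers` data collators; the tokenizer name and example sentence are illustrative placeholders.

```python
# Sketch: contrasting MLM and CLM label construction with transformers' collators.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # any encoder tokenizer

# Masked language modeling: a fraction of tokens is masked and predicted
# from bidirectional context; non-masked positions get label -100 (ignored).
mlm_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

# Causal language modeling: labels are a copy of the inputs; each token is
# predicted from its left context only (the shift happens inside the model).
clm_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

batch = [tokenizer("Should we still pretrain encoders with MLM?")]
print(mlm_collator(batch)["labels"])  # -100 everywhere except masked positions
print(clm_collator(batch)["labels"])  # a copy of input_ids
```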