Shuffled-token Detection for Refining Pre-trained RoBERTa
2021
State-of-the-art transformer models have achieved robust performance on a variety of NLP tasks. Many of these approaches employ domain-agnostic pre-training tasks to train models that yield highly generalized sentence representations, which can then be fine-tuned for specific downstream tasks. We propose refining a pre-trained NLP model using the objective of detecting shuffled tokens. We take a sequential approach: starting from the pre-trained RoBERTa model, we continue training it with our objective. Applying a random word-level shuffling strategy, we find that our approach enables the RoBERTa model to achieve better performance on 4 out of 7 GLUE tasks. Our results indicate that learning to detect shuffled tokens is a promising way to learn more coherent sentence representations.
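To make the objective concrete, the following is a minimal sketch (our assumption, not the paper's released code) of how word-level shuffled-token detection data could be constructed: a random subset of word positions is permuted, and each word receives a binary label indicating whether its surface form changed. Such labels could then drive per-token binary classification on top of RoBERTa (e.g., a token-classification head with two labels); the exact shuffling ratio and head configuration here are illustrative assumptions.

```python
# Minimal sketch: building word-level shuffled-token detection examples.
# The shuffle_ratio and labeling scheme are assumptions for illustration,
# not the paper's exact specification.
import random

def make_shuffled_example(words, shuffle_ratio=0.15, rng=random):
    """Permute a random subset of word positions and label each word:
    1 if the word at that position changed, 0 otherwise."""
    n = len(words)
    k = min(n, max(2, int(n * shuffle_ratio)))  # need >= 2 positions to permute
    positions = rng.sample(range(n), k)         # positions selected for shuffling
    permuted = positions[:]
    rng.shuffle(permuted)

    shuffled = list(words)
    for src, dst in zip(positions, permuted):
        shuffled[dst] = words[src]

    labels = [int(shuffled[i] != words[i]) for i in range(n)]
    return shuffled, labels

words = "the quick brown fox jumps over the lazy dog".split()
tokens, labels = make_shuffled_example(words, shuffle_ratio=0.3)
print(list(zip(tokens, labels)))  # e.g. [('the', 0), ('fox', 1), ...]
```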