Improved Pretraining for Domain-specific Contextual Embedding Models

2020 
We investigate methods to mitigate catastrophic forgetting during domain-specific pretraining of contextual embedding models such as BERT, DistilBERT, and RoBERTa. Recently proposed domain-specific models such as BioBERT, SciBERT, and ClinicalBERT are constructed by continuing the pretraining phase on a domain-specific text corpus. Such pretraining is susceptible to catastrophic forgetting, where the model loses some of the information learned in the general domain. We propose the use of two continual learning techniques, rehearsal and elastic weight consolidation, to improve domain-specific pretraining. Our results show that models trained with the proposed approaches better maintain their performance on general-domain tasks while outperforming domain-specific baseline models on downstream domain tasks.
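The abstract names the two continual learning techniques but gives no implementation details. The sketch below shows one common way rehearsal and elastic weight consolidation (EWC) are wired into a continued-pretraining loop; it is a minimal illustration, assuming PyTorch, a precomputed diagonal Fisher-information estimate, and hyperparameters (`ewc_lambda`, `mix_prob`) chosen for illustration rather than taken from the paper.

```python
import random
import torch


class EWCPenalty:
    """Elastic weight consolidation: a quadratic penalty that anchors parameters
    to their general-domain values, weighted by a diagonal Fisher estimate."""

    def __init__(self, model, fisher_diag, ewc_lambda=0.1):
        # Snapshot of the weights before domain-specific pretraining begins.
        self.anchor = {n: p.detach().clone() for n, p in model.named_parameters()}
        self.fisher = fisher_diag      # dict: parameter name -> Fisher diagonal tensor
        self.ewc_lambda = ewc_lambda   # illustrative regularization strength

    def penalty(self, model):
        # Sum of F_i * (theta_i - theta_i*)^2 over all anchored parameters.
        loss = torch.zeros((), device=next(model.parameters()).device)
        for n, p in model.named_parameters():
            if n in self.fisher:
                loss = loss + (self.fisher[n] * (p - self.anchor[n]) ** 2).sum()
        return 0.5 * self.ewc_lambda * loss


def rehearsal_batches(domain_loader, general_loader, mix_prob=0.25):
    """Rehearsal: interleave general-domain batches into the domain-specific
    stream with probability mix_prob so the model keeps revisiting old data."""
    general_iter = iter(general_loader)
    for domain_batch in domain_loader:
        if random.random() < mix_prob:
            try:
                yield next(general_iter)
            except StopIteration:
                general_iter = iter(general_loader)
                yield next(general_iter)
        yield domain_batch
```

In a training step the consolidation term would simply be added to the masked-language-modeling objective, e.g. `total_loss = mlm_loss + ewc.penalty(model)`, while the rehearsal generator replaces the plain domain-corpus data loader; the Fisher diagonal is typically estimated from squared gradients of the pretraining loss on general-domain data before continued pretraining starts.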