Glossary of 🤗 Transformers


General terms

  • autoencoding models: see MLM
  • autoregressive models: see CLM
  • CLM: causal language modeling, a pretraining task where the model reads the text in order and has to predict the next word. It’s usually done by reading the whole sentence but using a mask inside the model to hide the tokens that come after the current timestep.
  • MLM: masked language modeling, a pretraining task where the model sees a corrupted version of the text, usually produced by randomly masking some tokens, and has to predict the original text.
  • multimodal: a task that combines text with another kind of input (for instance images).
  • NLG: natural language generation, all tasks related to generating text (for instance talk with transformers, translation).
  • NLP: natural language processing, a generic way to say “deal with texts”.
  • NLU: natural language understanding, all tasks related to understanding what is in a text (for instance classifying the whole text or individual words).
  • pretrained model: a model that has been pretrained on some data (for instance all of Wikipedia). Pretraining methods involve a self-supervised objective, which can be reading the text and trying to predict the next word (see CLM) or masking some words and trying to predict them (see MLM).
  • RNN: recurrent neural network, a type of model that uses a loop over a layer to process texts.
  • seq2seq or sequence-to-sequence: models that generate a new sequence from an input, like translation models or summarization models (such as BART or T5).
  • token: a part of a sentence, usually a word, but can also be a subword (less common words are often split into subwords) or a punctuation symbol.
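
The CLM and MLM entries above describe two ways of hiding part of the input from the model. The sketch below illustrates both on a toy token list using only the Python standard library. It is not the 🤗 Transformers implementation: the helper names `causal_mask` and `mask_tokens`, the `[MASK]` placeholder string, and the masking probability are assumptions made for illustration.

```python
import random


def causal_mask(seq_len):
    """Hypothetical helper: entry [i][j] is True when position i may see position j."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Hypothetical helper: randomly replace tokens with mask_token.

    Returns (corrupted, labels). labels holds the original token at each
    masked position and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            corrupted.append(mask_token)
            labels.append(tok)
        else:
            corrupted.append(tok)
            labels.append(None)
    return corrupted, labels


tokens = ["the", "cat", "sat", "on", "the", "mat"]

# CLM: at timestep i the model sees tokens[0..i] and has to predict tokens[i + 1].
mask = causal_mask(len(tokens))
for i in range(len(tokens) - 1):
    visible = [tok for tok, allowed in zip(tokens, mask[i]) if allowed]
    print(visible, "->", tokens[i + 1])

# MLM: the model sees the corrupted list and has to recover the labels.
corrupted, labels = mask_tokens(tokens, mask_prob=0.3)
print(corrupted)
print(labels)
```

In the CLM loop the whole sentence is available, but each row of the mask hides everything after the current position, which matches the "mask inside the model" wording of the CLM entry. In the MLM part, the model is only asked to predict the positions whose label is not None.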

Language Modeling


Author: CarlYoung
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit CarlYoung when reposting!