General terms
autoencoding models
: see MLM

autoregressive models
: see CLM

CLM
: causal language modeling, a pretraining task where the model reads the text in order and has to predict the next word. It's usually done by reading the whole sentence, but using a mask inside the model to hide the future tokens at a certain timestep.
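As a rough illustration of that mask, here is a minimal PyTorch sketch (with made-up attention scores) of how the future positions can be hidden at each timestep:

```python
import torch

seq_len = 5
# Causal mask: position i may only look at positions 0..i, never at the future.
causal_mask = torch.tril(torch.ones(seq_len, seq_len)).bool()

# Toy attention scores; future positions are set to -inf so that
# softmax gives them zero weight.
scores = torch.randn(seq_len, seq_len)
scores = scores.masked_fill(~causal_mask, float("-inf"))
attention_weights = scores.softmax(dim=-1)
```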
MLM
: masked language modeling, a pretraining task where the model sees a corrupted version of the text, usually done by masking some tokens randomly, and has to predict the original text.
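A minimal sketch of that random corruption, assuming a BERT-style tokenizer with a `[MASK]` token (bert-base-uncased is just an example checkpoint):

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
input_ids = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")["input_ids"]

# Corrupt roughly 15% of the tokens by replacing them with [MASK];
# the model is trained to predict the original tokens at those positions.
special = (input_ids == tokenizer.cls_token_id) | (input_ids == tokenizer.sep_token_id)
mask = (torch.rand(input_ids.shape) < 0.15) & ~special
corrupted = input_ids.masked_fill(mask, tokenizer.mask_token_id)
print(tokenizer.decode(corrupted[0]))
```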
multimodal
: a task that combines text with another kind of input (for instance images).

NLG
: natural language generation, all tasks related to generating text (for instance talking with a transformer, translation).

NLP
: natural language processing, a generic way to say “deal with texts”.

NLU
: natural language understanding, all tasks related to understanding what is in a text (for instance classifying the whole text or individual words).

pretrained model
: a model that has been pretrained on some data (for instance all of Wikipedia). Pretraining methods involve a self-supervised objective, which can be reading the text and trying to predict the next word (see CLM) or masking some words and trying to predict them (see MLM).
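In the library, such a pretrained checkpoint is typically downloaded and loaded with `from_pretrained`; a minimal sketch, using bert-base-uncased purely as an example checkpoint:

```python
from transformers import AutoModel, AutoTokenizer

# Load a checkpoint that was pretrained with a self-supervised objective (MLM for BERT).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```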
RNN
: recurrent neural network, a type of model that uses a loop over a layer to process texts.
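A minimal sketch of that loop, applying a single PyTorch `RNNCell` token by token (the sizes and inputs are made up):

```python
import torch
import torch.nn as nn

vocab_size, hidden_size = 1000, 128
embedding = nn.Embedding(vocab_size, hidden_size)
cell = nn.RNNCell(hidden_size, hidden_size)

token_ids = torch.randint(0, vocab_size, (10,))  # a toy sequence of 10 token ids
hidden = torch.zeros(1, hidden_size)

# The same layer is applied in a loop, once per token, carrying a hidden
# state from one step to the next.
for token_id in token_ids:
    hidden = cell(embedding(token_id).unsqueeze(0), hidden)
```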
seq2seq or sequence-to-sequence
: models that generate a new sequence from an input, like translation models, or summarization models (such as Bart or T5).
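For instance, the `pipeline` API can use a T5 checkpoint as a sequence-to-sequence translator (t5-small is just a small example checkpoint):

```python
from transformers import pipeline

# T5 reads the input sequence and generates a brand new sequence,
# here an English-to-French translation.
translator = pipeline("translation_en_to_fr", model="t5-small")
print(translator("The glossary defines the most common NLP terms."))
```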
token
: a part of a sentence, usually a word, but can also be a subword (non-common words are often split into subwords) or a punctuation symbol.
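A quick way to see this splitting is to call a tokenizer directly; bert-base-uncased is used here only as an example (it uses WordPiece, which marks subword pieces with `##`):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Common words stay whole, while a rarer word like "GPU" is split into
# subword tokens (WordPiece prefixes the continuation pieces with "##").
print(tokenizer.tokenize("I have a new GPU!"))
```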