{{backlinks>.}}

====== 🤖 Large language models ======

  * [[wpfr>Grand modèle de langage]] (LLM)

===== How it works =====

  * [[wp>Large language model]]
  * [[wp>Generative pre-trained transformer]]
  * [[wp>Transformer (deep learning architecture)]]
  * [[https://arxiv.org/pdf/2401.02038v2|Understanding LLMs: A Comprehensive Overview from Training to Inference]]
    * The training of LLMs can be broadly divided into three steps:
      - The first step involves **data collection** and processing.
      - The second step encompasses the **pre-training** process, which includes determining the model's architecture and pre-training tasks, and using suitable parallel training algorithms to complete the training.
      - The third step involves **fine-tuning** and alignment.
    * The paper then gives an overview of model training techniques, covering the relevant training datasets, data preparation and preprocessing, model architecture, specific training methodologies, model evaluation, and commonly used training frameworks for LLMs.
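The Transformer architecture linked above is built around scaled dot-product self-attention. A minimal NumPy sketch of that single operation (shapes, names, and dimensions here are illustrative assumptions, not any library's API):

```python
# Minimal sketch of scaled dot-product self-attention, the core
# operation of the Transformer architecture. Illustrative only:
# matrix names and sizes are assumptions for this toy example.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # (seq_len, d_head)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # 4 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Each output row is a weighted mix of the value vectors, with weights derived from query/key similarity; a full Transformer stacks many such heads with feed-forward layers.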
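The three training steps from the survey above can be sketched end to end with a deliberately tiny stand-in model. Everything below is hypothetical: a toy bigram counter plays the role of the LLM, and all function names are invented for this sketch.

```python
# Hypothetical sketch of the three LLM training steps (data collection,
# pre-training, fine-tuning/alignment) using a toy bigram model as a
# stand-in for a real LLM. Names and logic are illustrative assumptions.
from collections import Counter, defaultdict

def collect_and_preprocess(texts):
    # Step 1: data collection and processing (here: lowercase + split).
    return [t.lower().split() for t in texts]

def pretrain(corpus):
    # Step 2: pre-training (here: count bigram statistics over the corpus).
    counts = defaultdict(Counter)
    for tokens in corpus:
        for a, b in zip(tokens, tokens[1:]):
            counts[a][b] += 1
    return counts

def finetune(model, corpus, weight=5):
    # Step 3: fine-tuning/alignment (here: upweight in-domain bigrams).
    for tokens in corpus:
        for a, b in zip(tokens, tokens[1:]):
            model[a][b] += weight
    return model

def predict_next(model, token):
    nxt = model.get(token)
    return nxt.most_common(1)[0][0] if nxt else None

raw = ["The model predicts text", "the model generates text"]
model = pretrain(collect_and_preprocess(raw))
model = finetune(model, collect_and_preprocess(["the model follows instructions"]))
print(predict_next(model, "model"))  # fine-tuning shifts the prediction
```

The point is only the pipeline shape: raw data is normalized once, a general model is fitted on the large corpus, then a smaller targeted pass adjusts its behavior.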