Source: Diagram of the stages in building an LLM model
To post-train models, we take a pre-trained base model, do supervised fine-tuning on a broad set of ideal responses written by humans or existing models, and then run reinforcement learning with reward signals from a variety of sources.
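The supervised fine-tuning stage can be sketched as minimizing cross-entropy on ideal responses. The following is a minimal, hypothetical illustration: a toy bigram "model" (one logit row per context token) trained by gradient descent on a single ideal response. The vocabulary, data, and learning rate are all placeholder assumptions, not anything from the quoted source.

```python
import math

# Toy supervised fine-tuning: a bigram "model" with one logit row per
# context token, trained by cross-entropy on an ideal response.
# Vocabulary and training data are hypothetical placeholders.
vocab = ["<bos>", "hello", "world", "!"]
ideal = ["<bos>", "hello", "world", "!"]   # one "ideal response"

# logits[context][next_token] -- zero-initialized (uniform predictions)
logits = {c: [0.0] * len(vocab) for c in vocab}

def softmax(row):
    exps = [math.exp(x) for x in row]
    z = sum(exps)
    return [e / z for e in exps]

lr = 0.5
for _ in range(200):
    for ctx, nxt in zip(ideal, ideal[1:]):
        p = softmax(logits[ctx])
        t = vocab.index(nxt)
        # Gradient of cross-entropy w.r.t. logits is p - one_hot(target):
        # push the target token's logit up, the others down.
        for j in range(len(vocab)):
            logits[ctx][j] -= lr * (p[j] - (1.0 if j == t else 0.0))

# After fine-tuning, the model assigns high probability to each
# next token of the ideal response.
p_hello = softmax(logits["<bos>"])[vocab.index("hello")]
```

Real SFT applies the same cross-entropy objective, but over a transformer's next-token logits on a large corpus of curated responses rather than a four-token toy.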
During reinforcement learning, we present the language model with a prompt and ask it to write responses. We then rate its response according to the reward signals, and update the language model to make it more likely to produce higher-rated responses and less likely to produce lower-rated responses.
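The rate-and-update loop described above can be sketched with a REINFORCE-style policy-gradient step. This is a deliberately tiny illustration under assumed conditions: the "language model" is just a categorical distribution over three canned responses, and the reward values are invented for the example (they are not OpenAI's reward signals).

```python
import math, random

random.seed(0)

# Toy "policy": a categorical distribution over candidate responses,
# parameterized by logits (a hypothetical stand-in for an LLM).
responses = ["helpful answer", "flattering answer", "off-topic answer"]
logits = [0.0, 0.0, 0.0]

# Hypothetical reward signal: rewards the helpful response and
# penalizes the sycophantic and off-topic ones.
rewards = {"helpful answer": 1.0,
           "flattering answer": -0.5,
           "off-topic answer": -1.0}

def probs(logits):
    exps = [math.exp(x) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample(p):
    r, acc = random.random(), 0.0
    for i, pi in enumerate(p):
        acc += pi
        if r < acc:
            return i
    return len(p) - 1

lr = 0.1
for step in range(2000):
    p = probs(logits)
    i = sample(p)                    # the model "writes" a response
    r = rewards[responses[i]]        # rate it with the reward signal
    # REINFORCE: d(log p_i)/d(logit_j) = 1[i==j] - p_j; scaling by the
    # reward makes higher-rated responses more likely and lower-rated
    # responses less likely.
    for j in range(len(logits)):
        grad = (1.0 if j == i else 0.0) - p[j]
        logits[j] += lr * r * grad

final = probs(logits)
```

Production post-training uses far more elaborate algorithms (e.g. PPO-family methods with learned reward models), but the core signal flow is the same: sample, score, and shift probability mass toward higher-rated outputs.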
Source: OpenAI, Expanding on what we missed with sycophancy
Site licensed under Creative Commons BY-NC-ND 4.0: Milovann Yanatchkov