SFT
Supervised Fine-Tuning
SFT tasks are structured datasets used to train Large Language Models (LLMs) with human-created data.
Overview
Each SFT task is designed to improve a model’s capabilities, either by supplying source data or by improving a model’s response.
Task Structure
An SFT task consists of:
- A prompt (question or instruction)
- A model response
- A human-written response (the optimal response)
Additional annotations can be included when needed to classify the prompt and response.
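As a rough illustration, a task might be represented along these lines. This is a minimal sketch only: the field names (`messages`, `role`, `content`, `annotations`) and the role used for the human-written response are assumptions, not the exact schema.

```python
# Illustrative sketch of an SFT task; key names are assumptions, not the real schema.
sft_task = {
    "messages": [
        # Prompt (question or instruction).
        {"role": "user", "content": "Explain what supervised fine-tuning is."},
        # Model response from the reference model.
        {"role": "assistant", "content": "Supervised fine-tuning is ..."},
        # Human-written (optimal) response; the role used here is an assumption.
        {"role": "assistant", "content": "Supervised fine-tuning (SFT) trains a model on ..."},
    ],
    # Optional annotations classifying the prompt and response.
    "annotations": {"category": "explanation", "domain": "machine learning"},
}
```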
Messages
Each turn consists of sequential messages representing the user prompt, the model response, and the human-written response.
User Message
Contains the initial prompt or instruction (role: `user`).
Model Response
Contains the reference model’s answer (role: `assistant`). The model response includes a `source_id` that uniquely identifies the model that generated the response.
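A minimal sketch of such a message, assuming a flat dictionary shape; only `role` and `source_id` come from this page, the remaining keys are illustrative.

```python
# Hypothetical model-response message; only "role" and "source_id" are documented fields.
model_response = {
    "role": "assistant",
    "source_id": "model-a",  # uniquely identifies the model that generated the response
    "content": "Here is the model's answer ...",
}
```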
Message Annotations
The model response can be evaluated across multiple dimensions, which may include:
- Instruction following
- Truthfulness
- Factuality
- Tone
If the model response is rewritten, the `rewrite` annotation is added to the list. A message’s `annotations` include the ratings for each dimension.
The rating dimensions are flexible and can be customized based on project requirements and objectives.
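For example, message-level annotations might be recorded along these lines. The dimension names follow the list above, while the key names and the integer rating scale are assumptions.

```python
# Hypothetical message-level annotations; keys and rating scale are assumptions.
message_annotations = [
    {"dimension": "instruction_following", "rating": 4},
    {"dimension": "truthfulness", "rating": 5},
    {"dimension": "factuality", "rating": 5},
    {"dimension": "tone", "rating": 3},
    # Added to the list only when the model response was rewritten.
    {"dimension": "rewrite", "rating": 1},
]
```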
Turn-Level Annotations
The `annotations` at the turn level specify preference-related or aggregated information. Some common examples are:
- A detailed justification for why a certain response is better
- Any comparative analysis between model responses
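A hypothetical shape for turn-level annotations, with key names chosen purely for illustration:

```python
# Illustrative turn-level annotations; key names are assumptions.
turn_annotations = {
    # Detailed justification for why a certain response is better.
    "justification": "The human-written response answers every part of the prompt, "
                     "while the model response omits the requested example.",
    # Comparative analysis between model responses.
    "comparative_analysis": "Response B is preferred over Response A because it is factually correct.",
}
```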
Expanded SFT Task Output
This is a sample expanded SFT task output returned by `/v2/task`.
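A hedged sketch of retrieving and inspecting such an output. Only the `/v2/task` path comes from this page; the base URL, authentication header, and query parameters are placeholders and assumptions.

```python
# Illustrative only: base URL, auth header, and query parameter names are assumptions.
import requests

resp = requests.get(
    "https://api.example.com/v2/task",
    headers={"Authorization": "Bearer <API_KEY>"},
    params={"task_id": "<TASK_ID>"},
)
resp.raise_for_status()
task = resp.json()

# Walk the expanded task: messages carry roles, source_id, and message-level annotations.
for message in task.get("messages", []):
    print(message.get("role"), message.get("source_id"), message.get("annotations"))
```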