ai
RLHF
Reinforcement learning from human feedback.
Definition
RLHF (Reinforcement Learning from Human Feedback) is a training technique that uses human preference ratings to align model outputs with human expectations: a reward model is trained on human rankings of model responses, and the language model is then fine-tuned with reinforcement learning to maximize that reward. It was a key ingredient in making ChatGPT possible.
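A minimal sketch of the reward-model step, assuming a standard Bradley-Terry pairwise loss (the scores and values here are illustrative, not from any specific implementation):

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss for an RLHF reward model:
    -log(sigmoid(r_chosen - r_rejected)).
    The loss is small when the reward model scores the
    human-preferred response higher than the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A wider margin in favor of the preferred response yields a lower loss.
print(preference_loss(2.0, 0.0))  # small loss: model agrees with the human
print(preference_loss(0.0, 2.0))  # large loss: model disagrees
```

In practice the reward scores come from a neural network scoring whole responses, and this loss is minimized over many human-labeled comparison pairs before the RL fine-tuning stage.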