Alignment

Nature Study: Training Language Models to Be 'Warm' Reduces Accuracy and Increases Sycophancy

Oxford University researchers published a study in Nature showing that training language models to be warmer and friendlier significantly reduces their factual accuracy and increases sycophancy, the tendency to agree with users rather than give correct answers.