Search results for: RLHF
Displayed results came from alternative search method.
Search results for: RLHF
-
WikiPrefs: human preferences dataset build from text edits
Open Research DataThe WikiPrefs dataset is a human preferences dataset for Large Language Models alignment. It was built using the EditPrefs method from historical edits of Wikipedia featured articles