Search results for: RLHF
No results were found in this context, but some were found in others.
Other results: Show all results (1)
WikiPrefs: human preferences dataset build from text edits
Open Research Data
The WikiPrefs dataset is a human preferences dataset for Large Language Model alignment. It was built using the EditPrefs method from historical edits of Wikipedia featured articles.