matt783 - Overview
Pinned
Forked from resistzzz/Co-rewarding
Co-Reward: Self-supervised RL for LLM Reasoning via Contrastive Agreement
Python