rewardhacker00 - Overview
oat Public
Forked from sail-sg/oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
Python