rewardhacker00 - Overview

oat Public

Forked from sail-sg/oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python