THUKEG

Pinned

  1. GLM (General Language Model)

    Python, 3.4k stars, 345 forks

  2. slime is an LLM post-training framework for RL Scaling.

    Python, 4.5k stars, 598 forks

  3. An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks

    Python, 2.1k stars, 207 forks

  4. ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

    Python, 692 stars, 50 forks

  5. RL Scaling and Test-Time Scaling (ICML'25)

    114 stars, 1 fork

  6. Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework

    Python, 230 stars, 16 forks