THUKEG

Pinned

GLM (General Language Model)

Python, 3.4k stars, 345 forks
slime is an LLM post-training framework for RL Scaling.

Python, 4.5k stars, 598 forks
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks

Python, 2.1k stars, 207 forks
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python, 692 stars, 50 forks
RL Scaling and Test-Time Scaling (ICML'25)

114 stars, 1 fork
Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework

Python, 230 stars, 16 forks