dingn42 - Overview
Build something that works.
Tsinghua University
[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
Python 1.6k 102
Scalable RL solution for advanced reasoning of language models
Python 1.8k 111
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
Python 1.1k 81
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
Python 2.8k 137
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3x+ generation speedup on reasoning tasks
Jupyter Notebook 8.8k 569
An Open-Source Framework for Prompt-Learning.
Python 4.9k 485