【FlexCheckpoint】Upgrade FlexCheckpoint by xingmingyyj · Pull Request #75613 · PaddlePaddle/Paddle
xingmingyyj changed the title from "upgrade_fc" to "【FlexCheckpoint】Upgrade FlexCheckpoint"
From00 previously approved these changes Oct 10, 2025
SigureMo pushed a commit to cattidea/Paddle that referenced this pull request
Oct 14, 2025
xingmingyyj added a commit to xingmingyyj/Paddle that referenced this pull request
Oct 22, 2025
swgu98 pushed a commit that referenced this pull request
Oct 23, 2025
…#75996)
* [Flex CP]Fix merge_sharded_state_dict with aoa and offload (#75062)
* fix merge_state_dict with aoa and offload
* add tests
* refine
* fix
* fix
* add log
* fix
* fix
* 【FlexCheckpoint】Upgrade some macros and optimize load_state_dict communication (#75282)
* upgrad macros and load_state_dict comm task fix fix support 0-d tensor fix balance save and fix
* fix test
* Add the test about the sharded_state_dict of optimizer (#75067)
* fix the share_weight_bug
* add note
* add the unit test
* set the timeout
* add more test
* Trigger CI rebuild
* fix the CmakeLists
* handle_missing_edge_cases_in_fc (#75413)
* up_grade fc (#75613) fix and add test fix fix fix fix cmakelists add notion
---------
Co-authored-by: Chen Zhiyang <1792266893@qq.com>
Co-authored-by: Tianyu Zheng <129518799+zty-king@users.noreply.github.com>
xingmingyyj added a commit to xingmingyyj/Paddle that referenced this pull request
Nov 5, 2025
sneaxiy pushed a commit that referenced this pull request
Nov 6, 2025
….2 (#76249)
* 【FlexCP】merge_sharded_state_dict support distribute merge (#75005)
* fix data is nullptr
* add dist merge
* change test
* change test
* 【FlexCP】add Skip param param for merge_shard_state_dict (#75061)
* fix data is nullptr
* add dist merge
* change test
* change test
* add skip optimizer param
* [Flex CP]Fix merge_sharded_state_dict with aoa and offload (#75062)
* fix merge_state_dict with aoa and offload
* add tests
* refine
* fix
* fix
* add log
* fix
* fix
* 【FlexCheckpoint】Upgrade some macros and optimize load_state_dict communication (#75282)
* upgrad macros and load_state_dict comm task fix fix support 0-d tensor fix balance save and fix
* fix test
* Add the test about the sharded_state_dict of optimizer (#75067)
* fix the share_weight_bug
* add note
* add the unit test
* set the timeout
* add more test
* Trigger CI rebuild
* fix the CmakeLists
* handle_missing_edge_cases_in_fc (#75413)
* up_grade fc (#75613) fix and add test fix fix fix fix cmakelists add notion
* 【FlexCheckpoint】fix_the_layer_id_macro (#75556)
* fix_the_layer_id_macro
* fix the ctest
* add expert_id_macro
* fix the assert bug
* fix the code style
* Pr support load hf checkpoint (#75928)
* support hf checkpoint fix support cast add id macro fix
* add test and fix some bug
* fix full param bug
* add full param cast test
---------
Co-authored-by: xingmingyyj <zxm_3791@163.com>
* 【Flexcheckpoint】add_get_var_mapping_chain_macro (#76013)
* add_get_var_mapping_chain_macro
* add note
* fix the bug input_vars and resolve_mapping_chain
* fix the code style
* fit the dtype assert bug
* fix the bug
* fix the merge_sharded_state_dict bug
* fix aoa transpose corner case (#76234)
---------
Co-authored-by: xiaoguoguo626807 <100397923+xiaoguoguo626807@users.noreply.github.com>
Co-authored-by: Chen Zhiyang <1792266893@qq.com>
Co-authored-by: Tianyu Zheng <129518799+zty-king@users.noreply.github.com>