272 Commits

Author SHA1 Message Date
sunjunjie
bf59cb5f66 !3371 [pytorch][bugfix]restore the einsum operation for next states of mamba
Merge pull request !3371 from sunjunjie/2.1.0
2025-09-24 06:59:07 +00:00
丁子叉
3a8385c4fa !3372 [pytorch][bugfix]fix profile step setting and qwen3 scripts
Merge pull request !3372 from 丁子叉/210_profile
2025-09-24 01:39:06 +00:00
chenzeng
9f409f6a42 !3288 [pytorch][bugfix]optimize attention mask memory in tuning and dpo.
Merge pull request !3288 from chenzeng/2.1.0
2025-09-16 11:19:28 +00:00
jzh
710228fb5f !3284 [pytorch][bugfix] icsl for nltk.load
Merge pull request !3284 from jzh/210icsl
2025-09-13 01:57:47 +00:00
jzh
56b3cda738 !3127 [pytorch][bugfix] update icsl for weights_only
Merge pull request !3127 from jzh/210_aicsl
2025-08-11 15:58:41 +00:00
jzh
6785f325f3 !3113 [pytorch][bugfix]fix some bug for icsl
Merge pull request !3113 from jzh/210_uicsl
2025-08-08 07:13:20 +00:00
mhh001
a4567e5b65 !3099 [pytorch][feature] optimize memory usage
Merge pull request !3099 from mhh001/2.1.0
2025-08-07 01:50:11 +00:00
qu_yueze
24749ad078 !3036 [pytorch][bugfix]fix ckpt of mamba2
Merge pull request !3036 from qu_yueze/2.1.0
2025-07-22 03:37:35 +00:00
sunjunjie
0d412be9c5 !3020 [pytorch][bugfix]fix the stuck problem of dense+moe mixed structure model
Merge pull request !3020 from sunjunjie/2.1.0
2025-07-21 02:56:54 +00:00
shenjiarun
23ef979834 !2978 [pytorch][bugfix]Fix redundant validation for RL
Merge pull request !2978 from shenjiarun/2.1.0
2025-07-16 08:35:20 +00:00
yanzhixiao
1258bd2e1c !3005 [pytorch][bugfix]add warning for use_mc2
Merge pull request !3005 from yanzhixiao/warning-use-mc2
2025-07-12 11:17:44 +00:00
shenjiarun
f7cdc199ed !2928 [pytorch][bugfix]fix when use attention_mask_type instead of cp_attent…
Merge pull request !2928 from shenjiarun/2.1.0
2025-07-11 07:48:01 +00:00
sunjunjie
a1dac6b3f0 !2994 [pytorch][bugfix]qwen3 235b update HCCL_CONNECT_TIMEOUT
Merge pull request !2994 from sunjunjie/2.1.0
2025-07-11 06:47:58 +00:00
mhh001
969c88bbfb !2996 [pytorch][bugfix]fix bug in mla when use --mla-fa-divide-qk
Merge pull request !2996 from mhh001/2.1.0
2025-07-11 01:51:28 +00:00
jzh
f6185391ae !2988 [pytorch][bugfix]update for support 32k-pack
Merge pull request !2988 from jzh/2.1.0
2025-07-10 09:23:19 +00:00
jzh
7b034d1703 !2970 [pytorch][cleancode]update for cleancode
Merge pull request !2970 from jzh/2.1.0
2025-07-05 06:36:10 +00:00
jzh
80dca6c59d !2965 [pytorch][cleancode]cleancode for parallel_state
Merge pull request !2965 from jzh/2.1.0
2025-07-05 02:56:48 +00:00
sunjunjie
bcc5f2dbc7 !2968 [pytorch][bugfix]Solve the evaluation timeout problem
Merge pull request !2968 from sunjunjie/2.1.0
2025-07-05 01:39:40 +00:00
jzh
288fbb9745 !2960 [pytorch][bugfix]oot
Merge pull request !2960 from jzh/210-oot
2025-07-04 10:03:36 +00:00
qu_yueze
ebc6e52826 !2886 [pytorch][feature]add noop-layer and tp-extend-ep in mg-hf of ckpt
Merge pull request !2886 from qu_yueze/2.1.0
2025-07-04 08:30:06 +00:00
shenjiarun
2da96c84d0 !2959 [python][fixbug]Add comment to human eval utils
Merge pull request !2959 from shenjiarun/2.1.0
2025-07-03 13:29:04 +00:00
HanhuiChen
4176febcb5 !2955 [pytorch][bugfix]cleancode
Merge pull request !2955 from HanhuiChen/2.1.0
2025-07-03 09:38:38 +00:00
wangyibo
5c726a2314 !2940 [mindspore][bugfix]cleancode for mindspeed_llm/mindspore
Merge pull request !2940 from wangyibo/2.1.0
2025-07-02 07:51:56 +00:00
jzh
4b4731f963 !2942 [pytorch][bugfix]cleancode
Merge pull request !2942 from jzh/210_cleancode
2025-07-02 06:57:23 +00:00
Dring
eaa6e5a2d0 !2932 [mindspore][bugfix][2.1.0]fix patch
Merge pull request !2932 from Dring/2.1.0
2025-07-01 02:40:08 +00:00
xinyuan
0d18e2980b !2920 [mindspore][bugfix]fix_wrapper_patch
Merge pull request !2920 from xinyuan/fix_wrapper_patch2.1.0
2025-06-28 10:09:14 +00:00
Dring
e2460e8ece !2903 [mindspore][bugfix][2.1.0]fix glm4 memory
Merge pull request !2903 from Dring/2.1.0
2025-06-28 01:17:52 +00:00
wangjialin
4cab8be881 !2869 [mindspore][bugfix] add utils to update wrapper
Merge pull request !2869 from wangjialin/reset_wrapper_patch_2.1.0
2025-06-26 09:48:38 +00:00
yanzhixiao
c26554670e !2902 [pytorch][bugfix]fix 2.1.0 load ckpt
Merge pull request !2902 from yanzhixiao/bugfix-2.1.0
2025-06-25 03:37:01 +00:00
陈子豪
86874ccdd7 !2885 [mindspore][bugfix]add index_copy_ patch
Merge pull request !2885 from 陈子豪/master
2025-06-23 12:20:22 +00:00
杨承翰
a774a0a652 !2887 [mindspore][bugfix]Solve the problem ticket regarding the validation of --reuse-fp32-param usage
Merge pull request !2887 from 杨承翰/wtd_2.1.0
2025-06-23 11:25:47 +00:00
杨承翰
b308dce790 !2842 [mindspore][bugfix]Fix error enabling moe-zerc + dualpipe
Merge pull request !2842 from 杨承翰/bugfix-2.1.0
2025-06-23 06:17:23 +00:00
xinyuan
6bcc7569c2 !2876 [mindspore][bugfix] fix_forward_step_patch
Merge pull request !2876 from xinyuan/fix_forward_step_patch_2.1.0
2025-06-23 05:02:13 +00:00
xinyuan
e3ca71c40a !2875 [mindspore][bugfix] fix recompute activation and grad accumulation
Merge pull request !2875 from xinyuan/bugfix_recomputeactivation_gradaccumulation_2.1.0
2025-06-23 05:01:54 +00:00
xinyuan
b2e705ca3f !2874 [mindspore][bugfix] add_ms_dualpipe_forward_step_warp
Merge pull request !2874 from xinyuan/add_ms_dualpipe_forward_step_warp_2.1.0
2025-06-23 05:01:29 +00:00
ningbenzhe1
e85d303a4b !2872 [pytorch][remove]remove rlxf
Merge pull request !2872 from ningbenzhe1/2.1.0
2025-06-21 06:26:57 +00:00
qu_yueze
63daf08170 !2865 [pytorch][bugfix]fix tp-extend-ep in ckpt
Merge pull request !2865 from qu_yueze/2.1.0
2025-06-20 08:11:56 +00:00
朱家兴
ed7cdad717 !2852 [mindspore][bugfix]Fix the bug where the parameters of the concat API are inconsistent.
Merge pull request !2852 from 朱家兴/2.1.0
2025-06-19 13:26:27 +00:00
shengjy
00bdda3c32 !2846 [2.1.0][pytorch][bugfix] moe token drop for expert bias
Merge pull request !2846 from shengjy/drop_expert_bias
2025-06-19 09:25:17 +00:00
wangjialin
d88b311c92 !2800 [mindspore][bugfix] fix PackProb's patch of moe-unperm2-mem-optim
Merge pull request !2800 from wangjialin/fix_moe_unperm2_mem_optim
2025-06-16 08:24:02 +00:00
shengjy
f38bb1fd30 !2667 [pytorch][feature]dualpipe in sft
Merge pull request !2667 from shengjy/sft_dp
2025-06-16 08:12:31 +00:00
shenjiarun
7ce6c0c786 !2801 [pytorch][feature]Reset Attention Mask adapt to Causal Attention Mask
Merge pull request !2801 from shenjiarun/master
2025-06-14 11:09:22 +00:00
shengjy
ed47019e69 !2814 [pytorch][bugfix]dualpipe mtp micro batch loss scale
Merge pull request !2814 from shengjy/dpscale
2025-06-13 07:26:37 +00:00
LeiZhenzhen
3ce08c5cef !2811 [refactor]high availability to features_manager
Merge pull request !2811 from LeiZhenzhen/master
2025-06-12 02:30:55 +00:00
ZhihaoLi
540fb7e122 !2786 add moba attn and mindspore args parser
Merge pull request !2786 from ZhihaoLi/moba
2025-06-10 12:50:50 +00:00
ZhihaoLi
62f7f9fd79 !2802 delete not-found arg gradient_accumulation_fusion
Merge pull request !2802 from ZhihaoLi/master
2025-06-10 08:35:16 +00:00
ZhihaoLi
f392e9661e !2784 add npu_matmul_add_fp32 and npu_groupmatmul_add_fp32(mindspore patch)
Merge pull request !2784 from ZhihaoLi/master
2025-06-09 11:41:11 +00:00
guozhihua
82d6440fff !2748 Optimized the performance of mamba2 pretrain.
Merge pull request !2748 from guozhihua/master
2025-06-09 08:30:34 +00:00
xinyuan
7a812cdc70 !2767 ms_dualpipe_bugfix
Merge pull request !2767 from xinyuan/ms_dualpipe_bugfix_0604
2025-06-06 02:46:06 +00:00
sunjunjie
a56bca1ccb !2768 add qwen3 moe dir& rename scripts
Merge pull request !2768 from sunjunjie/master
2025-06-05 14:02:11 +00:00