sunjunjie
|
bf59cb5f66
|
!3371 [pytorch][bugfix]restore the einsum operation for next states of mamba
Merge pull request !3371 from sunjunjie/2.1.0
|
2025-09-24 06:59:07 +00:00 |
|
丁子叉
|
3a8385c4fa
|
!3372 [pytorch][bugfix]fix profile step setting and qwen3 scripts
Merge pull request !3372 from 丁子叉/210_profile
|
2025-09-24 01:39:06 +00:00 |
|
chenzeng
|
9f409f6a42
|
!3288 [pytorch][bugfix]optimize attention mask memory in tuning and dpo.
Merge pull request !3288 from chenzeng/2.1.0
|
2025-09-16 11:19:28 +00:00 |
|
jzh
|
710228fb5f
|
!3284 [pytorch][bugfix] icsl for nltk.load
Merge pull request !3284 from jzh/210icsl
|
2025-09-13 01:57:47 +00:00 |
|
jzh
|
56b3cda738
|
!3127 [pytorch][bugfix] update icsl for weights_only
Merge pull request !3127 from jzh/210_aicsl
|
2025-08-11 15:58:41 +00:00 |
|
jzh
|
6785f325f3
|
!3113 [pytorch][bugfix]fix some bug for icsl
Merge pull request !3113 from jzh/210_uicsl
|
2025-08-08 07:13:20 +00:00 |
|
mhh001
|
a4567e5b65
|
!3099 [pytorch][feature] optimize memory usage
Merge pull request !3099 from mhh001/2.1.0
|
2025-08-07 01:50:11 +00:00 |
|
qu_yueze
|
24749ad078
|
!3036 [pytorch][bugfix]fix ckpt of mamba2
Merge pull request !3036 from qu_yueze/2.1.0
|
2025-07-22 03:37:35 +00:00 |
|
sunjunjie
|
0d412be9c5
|
!3020 [pytorch][bugfix]fix the stuck problem of dense+moe mixed structure model
Merge pull request !3020 from sunjunjie/2.1.0
|
2025-07-21 02:56:54 +00:00 |
|
shenjiarun
|
23ef979834
|
!2978 [pytorch][bugfix]Fix redundant validation for RL
Merge pull request !2978 from shenjiarun/2.1.0
|
2025-07-16 08:35:20 +00:00 |
|
yanzhixiao
|
1258bd2e1c
|
!3005 [pytorch][bugfix]add warning for use_mc2
Merge pull request !3005 from yanzhixiao/warning-use-mc2
|
2025-07-12 11:17:44 +00:00 |
|
shenjiarun
|
f7cdc199ed
|
!2928 [pytorch][bugfix]fix when use attention_mask_type instead of cp_attent…
Merge pull request !2928 from shenjiarun/2.1.0
|
2025-07-11 07:48:01 +00:00 |
|
sunjunjie
|
a1dac6b3f0
|
!2994 [pytorch][bugfix]qwen3 235b update HCCL_CONNECT_TIMEOUT
Merge pull request !2994 from sunjunjie/2.1.0
|
2025-07-11 06:47:58 +00:00 |
|
mhh001
|
969c88bbfb
|
!2996 [pytorch][bugfix]fix bug in mla when use --mla-fa-divide-qk
Merge pull request !2996 from mhh001/2.1.0
|
2025-07-11 01:51:28 +00:00 |
|
jzh
|
f6185391ae
|
!2988 [pytorch][bugfix]update for support 32k-pack
Merge pull request !2988 from jzh/2.1.0
|
2025-07-10 09:23:19 +00:00 |
|
jzh
|
7b034d1703
|
!2970 [pytorch][cleancode]update for cleancode
Merge pull request !2970 from jzh/2.1.0
|
2025-07-05 06:36:10 +00:00 |
|
jzh
|
80dca6c59d
|
!2965 [pytorch][cleancode]cleancode for parallel_state
Merge pull request !2965 from jzh/2.1.0
|
2025-07-05 02:56:48 +00:00 |
|
sunjunjie
|
bcc5f2dbc7
|
!2968 [pytorch][bugfix]Solve the evaluation timeout problem
Merge pull request !2968 from sunjunjie/2.1.0
|
2025-07-05 01:39:40 +00:00 |
|
jzh
|
288fbb9745
|
!2960 [pytorch][bugfix]oot
Merge pull request !2960 from jzh/210-oot
|
2025-07-04 10:03:36 +00:00 |
|
qu_yueze
|
ebc6e52826
|
!2886 [pytorch][feature]add noop-layer and tp-extend-ep in mg-hf of ckpt
Merge pull request !2886 from qu_yueze/2.1.0
|
2025-07-04 08:30:06 +00:00 |
|
shenjiarun
|
2da96c84d0
|
!2959 [python][fixbug]Add comment to human eval utils
Merge pull request !2959 from shenjiarun/2.1.0
|
2025-07-03 13:29:04 +00:00 |
|
HanhuiChen
|
4176febcb5
|
!2955 [pytorch][bugfix]cleancode
Merge pull request !2955 from HanhuiChen/2.1.0
|
2025-07-03 09:38:38 +00:00 |
|
wangyibo
|
5c726a2314
|
!2940 [mindspore][bugfix]cleancode for mindspeed_llm/mindspore
Merge pull request !2940 from wangyibo/2.1.0
|
2025-07-02 07:51:56 +00:00 |
|
jzh
|
4b4731f963
|
!2942 [pytorch][bugfix]cleancode
Merge pull request !2942 from jzh/210_cleancode
|
2025-07-02 06:57:23 +00:00 |
|
Dring
|
eaa6e5a2d0
|
!2932 [mindspore][bugfix][2.1.0]fix patch
Merge pull request !2932 from Dring/2.1.0
|
2025-07-01 02:40:08 +00:00 |
|
xinyuan
|
0d18e2980b
|
!2920 [mindspore][bugfix]fix_wrapper_patch
Merge pull request !2920 from xinyuan/fix_wrapper_patch2.1.0
|
2025-06-28 10:09:14 +00:00 |
|
Dring
|
e2460e8ece
|
!2903 [mindspore][bugfix][2.1.0]fix glm4 memory
Merge pull request !2903 from Dring/2.1.0
|
2025-06-28 01:17:52 +00:00 |
|
wangjialin
|
4cab8be881
|
!2869 [mindspore][bugfix] add utils to update wrapper
Merge pull request !2869 from wangjialin/reset_wrapper_patch_2.1.0
|
2025-06-26 09:48:38 +00:00 |
|
yanzhixiao
|
c26554670e
|
!2902 [pytorch][bugfix]fix 2.1.0 load ckpt
Merge pull request !2902 from yanzhixiao/bugfix-2.1.0
|
2025-06-25 03:37:01 +00:00 |
|
陈子豪
|
86874ccdd7
|
!2885 [mindspore][bugfix]add index_copy_ patch
Merge pull request !2885 from 陈子豪/master
|
2025-06-23 12:20:22 +00:00 |
|
杨承翰
|
a774a0a652
|
!2887 [mindspore][bugfix]Solve the problem ticket regarding the validation of --reuse-fp32-param usage
Merge pull request !2887 from 杨承翰/wtd_2.1.0
|
2025-06-23 11:25:47 +00:00 |
|
杨承翰
|
b308dce790
|
!2842 [mindspore][bugfix]Fix error enabling moe-zerc + dualpipe
Merge pull request !2842 from 杨承翰/bugfix-2.1.0
|
2025-06-23 06:17:23 +00:00 |
|
xinyuan
|
6bcc7569c2
|
!2876 [mindspore][bugfix] fix_forward_step_patch
Merge pull request !2876 from xinyuan/fix_forward_step_patch_2.1.0
|
2025-06-23 05:02:13 +00:00 |
|
xinyuan
|
e3ca71c40a
|
!2875 [mindspore][bugfix] fix recompute activation and grad accumulation
Merge pull request !2875 from xinyuan/bugfix_recomputeactivation_gradaccumulation_2.1.0
|
2025-06-23 05:01:54 +00:00 |
|
xinyuan
|
b2e705ca3f
|
!2874 [mindspore][bugfix] add_ms_dualpipe_forward_step_warp
Merge pull request !2874 from xinyuan/add_ms_dualpipe_forward_step_warp_2.1.0
|
2025-06-23 05:01:29 +00:00 |
|
ningbenzhe1
|
e85d303a4b
|
!2872 [pytorch][remove]remove rlxf
Merge pull request !2872 from ningbenzhe1/2.1.0
|
2025-06-21 06:26:57 +00:00 |
|
qu_yueze
|
63daf08170
|
!2865 [pytorch][bugfix]fix tp-extend-ep in ckpt
Merge pull request !2865 from qu_yueze/2.1.0
|
2025-06-20 08:11:56 +00:00 |
|
朱家兴
|
ed7cdad717
|
!2852 [mindspore][bugfix]Fix the bug where the parameters of the concat API are inconsistent.
Merge pull request !2852 from 朱家兴/2.1.0
|
2025-06-19 13:26:27 +00:00 |
|
shengjy
|
00bdda3c32
|
!2846 [2.1.0][pytorch][bugfix] moe token drop for expert bias
Merge pull request !2846 from shengjy/drop_expert_bias
|
2025-06-19 09:25:17 +00:00 |
|
wangjialin
|
d88b311c92
|
!2800 [mindspore][bugfix] fix PackProb's patch of moe-unperm2-mem-optim
Merge pull request !2800 from wangjialin/fix_moe_unperm2_mem_optim
|
2025-06-16 08:24:02 +00:00 |
|
shengjy
|
f38bb1fd30
|
!2667 [pytorch][feature]dualpipe in sft
Merge pull request !2667 from shengjy/sft_dp
|
2025-06-16 08:12:31 +00:00 |
|
shenjiarun
|
7ce6c0c786
|
!2801 [pytorch][feature]Reset Attention Mask adapt to Causal Attention Mask
Merge pull request !2801 from shenjiarun/master
|
2025-06-14 11:09:22 +00:00 |
|
shengjy
|
ed47019e69
|
!2814 [pytorch][bugfix]dualpipe mtp micro batch loss scale
Merge pull request !2814 from shengjy/dpscale
|
2025-06-13 07:26:37 +00:00 |
|
LeiZhenzhen
|
3ce08c5cef
|
!2811 [refactor]high availability to features_manager
Merge pull request !2811 from LeiZhenzhen/master
|
2025-06-12 02:30:55 +00:00 |
|
ZhihaoLi
|
540fb7e122
|
!2786 add moba attn and mindspore args parser
Merge pull request !2786 from ZhihaoLi/moba
|
2025-06-10 12:50:50 +00:00 |
|
ZhihaoLi
|
62f7f9fd79
|
!2802 delete not-found arg gradient_accumulation_fusion
Merge pull request !2802 from ZhihaoLi/master
|
2025-06-10 08:35:16 +00:00 |
|
ZhihaoLi
|
f392e9661e
|
!2784 add npu_matmul_add_fp32 and npu_groupmatmul_add_fp32(mindspore patch)
Merge pull request !2784 from ZhihaoLi/master
|
2025-06-09 11:41:11 +00:00 |
|
guozhihua
|
82d6440fff
|
!2748 Optimized the performance of mamba2 pretrain.
Merge pull request !2748 from guozhihua/master
|
2025-06-09 08:30:34 +00:00 |
|
xinyuan
|
7a812cdc70
|
!2767 ms_dualpipe_bugfix
Merge pull request !2767 from xinyuan/ms_dualpipe_bugfix_0604
|
2025-06-06 02:46:06 +00:00 |
|
sunjunjie
|
a56bca1ccb
|
!2768 add qwen3 moe dir& rename scripts
Merge pull request !2768 from sunjunjie/master
|
2025-06-05 14:02:11 +00:00 |
|