686 Commits

Author SHA1 Message Date
i-robot
773328361c !7779 【master】【bugfix】Fix documentation spelling
Merge pull request !7779 from 森镇/code_docs_fix_spelling
2025-12-04 07:40:58 +00:00
i-robot
5cde08f9cc !7787 【master】【mcore】【bugfix】Fix the incorrect path in the YAML file and Fix…
Merge pull request !7787 from zhangyihui/master-bugfix
2025-12-03 14:42:12 +00:00
zhangyihuiben
d261447482 【master】【mcore】【bugfix】Fix the incorrect path in the YAML file and Fix inconsistencies in formatting, capitalization, and typos in the documentation. 2025-12-03 16:59:15 +08:00
senzhen
a044bdbc35 Fix documentation spelling 2025-12-03 14:46:56 +08:00
zxq
e2ee4478fb 【master】【bug-fix】Fix spelling errors in the documentation 2025-12-02 09:59:46 +08:00
i-robot
dc5ae6a17b !7600 fix compute dtype in residual and rope.
Merge pull request !7600 from niujunhao/bugfix/fix_compute_dtype
2025-11-11 13:09:13 +00:00
niujunhao
3b7a97f709 fix residual fp32 and rope dtype. 2025-11-11 15:50:54 +08:00
Hsshuai
63cf1fba97 fix precision issue when vocab_emb_dp=True
(cherry picked from commit 10bbf69209)
2025-11-10 21:11:19 +08:00
zhangyihuiben
add4cda1e7 【master】【mcore】【bugfix】fix moe_aux_loss in qwen3_moe 2025-10-29 15:19:16 +08:00
i-robot
8b297995e9 !7525 【docs】Add training-to-inference documentation link to the Qwen3 series README
Merge pull request !7525 from SaiYao/code_docs_qwen3_t2i
2025-10-20 02:34:37 +00:00
SaiYao
a8cf6a293d 【docs】Add training-to-inference documentation link to the Qwen3 series README 2025-10-17 17:19:57 +08:00
zhangyihuiben
7abfc7e024 【master】【mcore】【bugfix】delete recompute_slice_activation 2025-10-17 10:48:45 +08:00
senzhen
f135e3b1cd Fix qwen3 training error 2025-10-15 16:30:14 +08:00
i-robot
2289d16afb !6996 【code_docs】glm4 documentation
Merge pull request !6996 from nashturing/glm_master_doc
2025-09-24 04:03:54 +00:00
JavaZero
6d172177d1 set vocab_emb_dp default to False 2025-09-19 17:09:51 +08:00
i-robot
80b13bc2c5 !7030 Add pretraining and fine-tuning template configurations
Merge pull request !7030 from lan/dev_model_yaml
2025-09-10 06:19:59 +00:00
lanxiang
3bf3a0d135 Add pretraining and fine-tuning template configurations 2025-09-10 11:43:33 +08:00
JavaZero
b7cfd27505 update parallel configuration for training scripts in Qwen3 README 2025-09-09 16:23:55 +08:00
i-robot
bac1b0c8d8 !7250 【docs】【master】Add qwen3 fine-tuning documentation
Merge pull request !7250 from JavaZero/code_docs_add_mcore_qwen3_sft_readme
2025-09-09 07:50:04 +00:00
JavaZero
6c8af8f670 add qwen3 sft readme 2025-09-09 11:14:10 +08:00
ccsszz
d25fa86623 mcore deepseek yaml add use_fused_mla switch 2025-09-08 19:42:56 +08:00
pengjingyou
8002390917 【bugfix】【infer】router module selects the fused operator and the routing algorithm's activation function based on the configuration 2025-09-05 16:08:34 +08:00
JavaZero
770a9bae03 support qwen3 32b finetune 2025-09-04 11:40:05 +08:00
zhangyihuiben
0d8c4a0334 【master】【bugfix】 qwen3-moe configuration parameter error 2025-09-02 15:11:08 +08:00
zhangyihuiben
6d6ab4b138 【master】【feature】qwen3-moe pretraining yaml + README.md 2025-09-01 21:33:20 +08:00
husichao
a139117591 add new MLP with interleaved weight layout 2025-08-22 19:39:20 +08:00
Xinrui Chen
0bba89b148 [Docs] Add model readme template 2025-08-20 17:37:43 +08:00
nashturing
cb8278bfd6 【code_docs】glm4_moe documentation 2025-08-18 20:52:11 +08:00
nashturing
1efb128a11 【code_docs】glm4 documentation 2025-08-14 17:14:34 +08:00
zxq
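00d6fcd205 Add Qwen3 model documentation (inference section), including model description, supported specifications, usage examples, and configuration recommendations 2025-07-26 18:22:35 +08:00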
i-robot
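1ac1d70eae !6877 Add Qwen3 model documentation, including model description, supported specifications, usage examples, and configuration recommendations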
Merge pull request !6877 from JavaZero/code_docs_add_mcore_qwen3_readme
2025-07-25 08:11:39 +00:00
JavaZero
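05766c8f0a Add Qwen3 model documentation, including model description, supported specifications, usage examples, and configuration recommendations 2025-07-25 14:34:35 +08:00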
i-robot
3af357374d !6880 modify links
Merge pull request !6880 from 宦晓玲/code_docs_0723
2025-07-25 03:29:59 +00:00
huan
83a307e139 modify links 2025-07-24 09:35:55 +08:00
i-robot
78438775c3 !6860 【master】Simplify training configuration
Merge pull request !6860 from lan/dev_yaml_update
2025-07-23 10:21:53 +00:00
lanxiang
f5656852e8 Simplify training configuration 2025-07-22 10:44:24 +08:00
i-robot
d6b44f26e9 !6584 【dev】Add qwen3 to mcore
Merge pull request !6584 from JavaZero/mcore_qwen3
2025-07-16 08:03:57 +00:00
JavaZero
b18b9a6a4b mcore add qwen3 2025-07-16 15:22:38 +08:00
zxq
dc386615df 【feature】【dev】Modify the Qwen2 model configuration and yaml to adapt to run_mindformer 2025-07-14 14:39:13 +08:00
Yule100
352d8377af Adapt deepseek3 to run_mindformer 2025-06-27 11:50:34 +08:00
sunyu-xuan
c915432b5a optimize config and fix qwen3_moe name in comment 2025-06-26 20:04:21 +08:00
i-robot
19677dc409 !6465 【dev】Remove bert and the T5 minimal set
Merge pull request !6465 from yiyison/bert_off
2025-06-26 01:47:23 +00:00
i-robot
4371c76746 !6598 bugfix hf-tokenizer
Merge pull request !6598 from Yule100/bugfix_hf_tokenizer
2025-06-26 01:21:35 +00:00
Yule100
3a8a0ac796 bugfix hf-tokenizer 2025-06-25 19:59:36 +08:00
yiyison
e8b9bcc58d Remove bert and t5 model code 2025-06-25 17:47:21 +08:00
i-robot
1b03cc9e38 !6371 【dev】【removal】Remove outdated api
Merge pull request !6371 from 魏琢艺/delete_api
2025-06-25 06:53:54 +00:00
i-robot
6fab43fbe5 !6614 [Models] Delete Llama2
Merge pull request !6614 from Xinrui Chen/dev-del-llama
2025-06-25 06:53:29 +00:00
Xinrui Chen
80b61f5668 [Models] Delete Llama2 2025-06-24 21:27:49 +08:00
qinsichun
74ee12de76 fix_generation_load 2025-06-24 18:50:17 +08:00
魏琢艺
7d75d5692f delete api 2025-06-24 17:24:56 +08:00