[Graph Optimization] SOT+CUDAGraph support ERNIE4.5T VL 28B / 424B #4645
Thanks for your contribution!
You may need to update your branch to resolve errors.
LGTM
if self.num_shared_experts > 0:
    shared_experts_out = self.shared_experts(hidden_states)
hidden_states, vl_moe_meta.text_input, vl_moe_meta.image_input = text_image_gather_scatter(
hidden_states, text_input, image_input = text_image_gather_scatter(
Will this conflict with the XPU change?
Once the XPU PR #4636 is merged, merging this one afterwards will be fine.
LGTM
Motivation
Get SOT+CUDAGraph running end to end on ERNIE4.5T VL 28B / 424B.
Prerequisite PR: #4610
Modifications
Usage or Command
Accuracy Tests
NULL
Checklist
Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
Run pre-commit before commit.
If the PR targets the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.