@wangdongxing4

Each block is responsible for a set of groups, so the receive phase should process those groups with a block-stride loop. The increment of the for loop should therefore be blockDim.x, not gridDim.x * expertsPerBlock.
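
For illustration, here is a minimal sketch of the block-stride pattern, not the kernel touched by this change. The names receiveGroups, countVisits, visits, and groupsPerBlock are hypothetical, and the sketch assumes each block owns a contiguous range of groups. Each thread starts at its own threadIdx.x offset inside the block's range and advances by blockDim.x.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: block b is assumed to own the contiguous range
// [b * groupsPerBlock, (b + 1) * groupsPerBlock), clipped to numGroups.
__global__ void receiveGroups(int *visits, int numGroups, int groupsPerBlock) {
    const int first = blockIdx.x * groupsPerBlock;
    int end = first + groupsPerBlock;
    if (end > numGroups) end = numGroups;

    // Block-stride loop: thread t handles first + t, first + t + blockDim.x, ...
    // so the threads of this block together cover the block's own groups.
    for (int group = first + threadIdx.x; group < end; group += blockDim.x) {
        atomicAdd(&visits[group], 1);  // record one visit to this group
    }
}

// Hypothetical host helper: launches the kernel and returns the number of
// times each group was visited (an empty vector if allocation fails).
std::vector<int> countVisits(int numGroups, int groupsPerBlock, int threadsPerBlock) {
    std::vector<int> host(numGroups > 0 ? numGroups : 0, 0);
    if (numGroups <= 0) return host;

    int *dev = nullptr;
    const size_t bytes = static_cast<size_t>(numGroups) * sizeof(int);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return {};
    cudaMemset(dev, 0, bytes);

    const int numBlocks = (numGroups + groupsPerBlock - 1) / groupsPerBlock;
    receiveGroups<<<numBlocks, threadsPerBlock>>>(dev, numGroups, groupsPerBlock);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

Under these assumptions, countVisits(10, 4, 2) launches three blocks of two threads and should report one visit for every group. If the loop instead advanced by a stride larger than blockDim.x, such as a grid-sized one, a thread would jump past groups in its block's range whenever the block owns more groups than it has threads.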

@abcdabcd987 requested a review from nandor on July 28, 2025 17:46