Use allreduce_coalesced for factor allreduce #35

@gpauloski

Description

pytorch/pytorch#62140

"grouped comm on a set of unflattened tensors can be more performant than flattening+a single flat nccl call."
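A minimal sketch of the two approaches the quote compares, assuming the factors are a list of same-dtype tensors. The helper names `allreduce_factors_flat` and `allreduce_factors_coalesced` are hypothetical and not taken from this repository. The demo runs as a single-process `gloo` group so it can be tried without GPUs; the performance claim in the quote concerns NCCL and is not measured here. Recent PyTorch releases may emit a deprecation warning for `all_reduce_coalesced`.

```python
import os
import tempfile

import torch
import torch.distributed as dist


def allreduce_factors_flat(factors):
    """Baseline: flatten into one buffer, one all_reduce, then copy back."""
    flat = torch.cat([f.reshape(-1) for f in factors])
    dist.all_reduce(flat, op=dist.ReduceOp.SUM)
    offset = 0
    for f in factors:
        n = f.numel()
        # Write the reduced slice back into the original tensor in place.
        f.copy_(flat[offset:offset + n].view_as(f))
        offset += n


def allreduce_factors_coalesced(factors):
    """Grouped: hand the unflattened tensors to a single coalesced call."""
    # Operates in place on every tensor in the list.
    dist.all_reduce_coalesced(factors, op=dist.ReduceOp.SUM)


# Single-process demo (world_size=1), so a SUM leaves the values unchanged.
init_file = os.path.join(tempfile.mkdtemp(), "pg_init")
dist.init_process_group(
    backend="gloo",
    init_method=f"file://{init_file}",
    rank=0,
    world_size=1,
)

factors = [
    torch.ones(4, 4),
    torch.arange(6, dtype=torch.float32).reshape(2, 3),
]
expected = [f.clone() for f in factors]

allreduce_factors_coalesced(factors)
allreduce_factors_flat(factors)

dist.destroy_process_group()
```

With more than one rank, both helpers should leave each factor holding the elementwise sum across ranks; dividing by `dist.get_world_size()` afterwards would give the average.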

Metadata

    Labels

    enhancement (New feature or request), help wanted (Extra attention is needed), pytorch-1.11 (Features available in PyTorch 1.11)
