Pull requests: hpcaitech/ColossalAI
Pull requests list
fix: resolve multi-node training hanging in Kubernetes environments
#6377 opened Aug 5, 2025 by amyanger
    
fix: wrong dp-rank condition when enable pp
#6374 opened Jul 30, 2025 by liuqh16 · 2 of 11 tasks
  
[feat] Support Zero Bubble StreamRL-like RL Training
#6356 opened Jun 27, 2025 by YeAnbang · 11 tasks
  
[checkpoint_io] Fix gather_state_dict_fast
#6333 opened May 28, 2025 by pbelevich · 1 of 2 tasks
  
[DOC]: Update the documentation of ShardConfig for 1D, 2D, 2.5D, 3D tensor parallelism
#6278 opened Apr 26, 2025 by tongyu0924 · 11 tasks
  
[applications/ColossalChat/examples/training_scripts/lora_finetune.py]: Fixed bug, added save_interval and added auto resume functions
#6223 opened Feb 26, 2025 by bbbolt · 1 task
  
[checkpointio]support distributed checkpoint io for model saving.
#6181 opened Jan 16, 2025 by flybird11111 · 11 tasks
  
[fix] launching error on special env variables (#6032)
#6173 opened Dec 31, 2024 by GCS-ZHN · 9 of 10 tasks
  
[enhance] make input datatype ready for allgather
#6162 opened Dec 18, 2024 by BurkeHulk · 11 tasks
  
[feature] support Gemma2Model for tensor parallem training
#6122 opened Nov 9, 2024 by jing-4369 · 7 of 11 tasks
  
[Colossalai-Ascend] Support llama2-7b, chatglm2-6b finetune and inference on NPU
#6118 opened Nov 8, 2024 by duanjunwen · 11 tasks
  
It is recommended to use np.asarray instead of np.array to avoid cessary copies of the data
#6015 opened Aug 17, 2024 by monster29000 · 11 tasks
  
[WIP][Infer] Inference Distributed RPC Framework Optimization
Labels: colossal-inference, tensor-parallel (related to the tensor-parallel feature)
#5756 opened May 27, 2024 by LRY89757
    
[Inference] Clean duplicated vector utils
#5715 opened May 14, 2024 by Courtesy-Xs · 11 tasks
  