Pull requests: ggml-org/llama.cpp
CUDA: Remove unneded bias/gate dims in fused mmvq
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16858
opened Oct 30, 2025 by
ORippler
CUDA: add expert reduce kernel
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#16857
opened Oct 30, 2025 by
am17an
cann: update L2_NORM op support
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#16856
opened Oct 30, 2025 by
TecJesh
webui: recognize AsciiDoc files as valid text files
examples
server
#16850
opened Oct 29, 2025 by
jhradilek
Enable CUDA graphs for embed gemma 300m
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16844
opened Oct 29, 2025 by
ArshM17-NV
CUDA: Volta tensor core support for MMF
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16843
opened Oct 29, 2025 by
JohannesGaessler
improve CUDA cpy memory bandwidth when copying transposed tensor
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#16841
opened Oct 29, 2025 by
bssrdf
clip : use FA
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
examples
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
ggml-hexagon: respect input size when getting/setting tensor data
ggml
changes relating to the ggml tensor library for machine learning
#16836
opened Oct 29, 2025 by
l3utterfly
hip: add RDNA4 support for mmf and mma
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16835
opened Oct 29, 2025 by
zhang-hui-yulo
cpu: introduce chunking for repack matmuls and enable matmul-id chunking
ggml
changes relating to the ggml tensor library for machine learning
#16833
opened Oct 29, 2025 by
max-krasnyansky
Model: Minimax M2
python
python script changes
testing
Everything test related
#16831
opened Oct 28, 2025 by
pwilkin
CUDA: Conv2d tensor core
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
vulkan: remove the need for the dryrun
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#16826
opened Oct 28, 2025 by
jeffbolznv
docs: explain CUDA 11 compilation [no ci]
documentation
Improvements or additions to documentation
#16824
opened Oct 28, 2025 by
JohannesGaessler
Implement SparseK Attention mechanism — new GGML operator with CPU backend (GPU planned next)
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#16817
opened Oct 28, 2025 by
yael-works
ggml webgpu: minor set rows optimization
ggml
changes relating to the ggml tensor library for machine learning
#16810
opened Oct 27, 2025 by
reeselevine
ggml-cpu: templateify ggml_compute_forward_rope_f32 and _f16
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#16805
opened Oct 27, 2025 by
duduta
vulkan: Fix crash when FP16 mul_mat accumulation is not supported
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#16796
opened Oct 27, 2025 by
rillomas