@reeselevine
Collaborator

  • Improves parallelization of SET_ROWS by having multiple threads work on each row, and adds vectorization
  • Adds more useful labels to buffers for debugging
  • Adds Dawn-specific toggles which disable some safety protections when running natively, for better performance

Better matrix multiplication coming soon!
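
To illustrate the SET_ROWS change described above, here is a minimal CPU-side sketch of how a flat thread index could be split into a row and a chunk within that row, so that several threads cooperate on each row. It is not the shader from this PR: the function name `set_rows_sim`, the `vec_width` parameter, and the flat row-major layout are assumptions made for illustration. With `vec_width = 1` it models the case of one thread per element.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU model of a SET_ROWS dispatch (not the PR's actual shader).
// Semantics: dst[row_idx[r]][:] = src[r][:] for every source row r.
// Each simulated "thread" copies one chunk of vec_width elements from one row.
// Assumes n_cols is a multiple of vec_width and every index is a valid dst row.
void set_rows_sim(const std::vector<float>& src,
                  const std::vector<std::int64_t>& row_idx,
                  std::size_t n_cols,
                  std::size_t vec_width,
                  std::vector<float>& dst) {
    const std::size_t chunks_per_row = n_cols / vec_width;
    const std::size_t n_threads = row_idx.size() * chunks_per_row;

    // On a GPU these iterations would run in parallel, one per thread.
    for (std::size_t tid = 0; tid < n_threads; ++tid) {
        const std::size_t src_row = tid / chunks_per_row;  // which row this thread serves
        const std::size_t chunk   = tid % chunks_per_row;  // which part of that row
        const std::size_t dst_row = static_cast<std::size_t>(row_idx[src_row]);

        // Copy vec_width consecutive elements (a vec4-style load/store when vec_width == 4).
        for (std::size_t k = 0; k < vec_width; ++k) {
            const std::size_t col = chunk * vec_width + k;
            dst[dst_row * n_cols + col] = src[src_row * n_cols + col];
        }
    }
}
```

For a 2 x 8 source and `vec_width = 4`, this sketch uses four simulated threads (two per row) instead of two, which is the kind of finer-grained split the first bullet refers to.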

reeselevine and others added 5 commits October 15, 2025 19:04
* updated optimization, fixed errors

* non vectorized version now dispatches one thread per element

* Simplify

* Change logic for set_rows pipelines

---------

Co-authored-by: Neha Abbas <nehaabbas@macbookpro.lan>
Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
@github-actions github-actions bot added the "ggml" label (changes relating to the ggml tensor library for machine learning) on Oct 28, 2025