ggml-hexagon: respect input size when getting/setting tensor data #16836
It seems the input `size` parameter is currently ignored when getting/setting tensors. This crashes when attempting to save/load state data, because we only save/load filled kv cells during `llama_kv_cache::state_read_data` and `llama_kv_cache::state_write_data`. The cache saving/loading wants to read partial tensors, so it fails the `assert` in `ggml_backend_hexagon_buffer_get_tensor` and `ggml_backend_hexagon_buffer_set_tensor`.

This PR updates get/set tensor to read and repack partial rows based on the passed-in `size` input. This is tested to allow saving and loading kv caches successfully on an S25+ Ultra.
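
For readers unfamiliar with the idea, here is a minimal sketch of a size-aware read over row-repacked storage. It is not the code from this PR: `unpack_row` and `get_tensor_partial` are hypothetical names, and the byte-reversal "layout" is a toy stand-in for whatever per-row repacking the backend actually uses. It only illustrates splitting the requested `size` into full rows plus a trailing partial row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy stand-in for a backend-specific row layout (hypothetical):
// each row is stored with its bytes in reverse order.
static void unpack_row(const uint8_t * packed, uint8_t * plain, size_t row_size) {
    for (size_t i = 0; i < row_size; ++i) {
        plain[i] = packed[row_size - 1 - i];
    }
}

// Copy the first `size` bytes of the tensor, in plain layout, into `dst`.
// `size` may be smaller than the full tensor and need not be a multiple of row_size.
static void get_tensor_partial(const uint8_t * packed, size_t row_size, size_t n_rows,
                               uint8_t * dst, size_t size) {
    assert(row_size > 0);
    assert(size <= row_size * n_rows);  // never read past the tensor

    const size_t full_rows = size / row_size;  // rows that are read completely
    const size_t tail      = size % row_size;  // bytes needed from the next row

    for (size_t r = 0; r < full_rows; ++r) {
        unpack_row(packed + r * row_size, dst + r * row_size, row_size);
    }

    if (tail > 0) {
        // Unpack the whole row into scratch space, then copy only the requested bytes,
        // so nothing is written beyond dst[size - 1].
        std::vector<uint8_t> scratch(row_size);
        unpack_row(packed + full_rows * row_size, scratch.data(), row_size);
        std::memcpy(dst + full_rows * row_size, scratch.data(), tail);
    }
}
```

With two 4-byte rows and `size = 6`, this unpacks the first row in full and copies only the first two bytes of the second row. A set path would presumably mirror this, repacking full rows directly and merging a partial final row into the existing packed row.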