feat: kvbm v2 memory crate #4012
base: main
Conversation
Signed-off-by: Ryan Olson <rolson@nvidia.com>
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 9
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (14)
- Cargo.toml (1 hunks)
- lib/memory/Cargo.toml (1 hunks)
- lib/memory/src/actions.rs (1 hunks)
- lib/memory/src/arena.rs (1 hunks)
- lib/memory/src/device.rs (1 hunks)
- lib/memory/src/disk.rs (1 hunks)
- lib/memory/src/lib.rs (1 hunks)
- lib/memory/src/nixl.rs (1 hunks)
- lib/memory/src/offset.rs (1 hunks)
- lib/memory/src/pinned.rs (1 hunks)
- lib/memory/src/prelude.rs (1 hunks)
- lib/memory/src/system.rs (1 hunks)
- lib/memory/src/tests.rs (1 hunks)
- lib/memory/src/torch.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 2746
File: lib/llm/src/grpc/service/tensor.rs:269-447
Timestamp: 2025-09-19T18:21:03.693Z
Learning: In lib/llm/src/grpc/service/tensor.rs, GuanLuo prefers using unsafe Vec::from_raw_parts operations to avoid data copying for performance reasons when converting raw tensor data from Vec<u8> to typed vectors like Vec<u32>, Vec<f32>, etc., even when it involves safety risks.
📚 Learning: 2025-09-19T18:21:03.693Z
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 2746
File: lib/llm/src/grpc/service/tensor.rs:269-447
Timestamp: 2025-09-19T18:21:03.693Z
Learning: In lib/llm/src/grpc/service/tensor.rs, GuanLuo prefers using unsafe Vec::from_raw_parts operations to avoid data copying for performance reasons when converting raw tensor data from Vec<u8> to typed vectors like Vec<u32>, Vec<f32>, etc., even when it involves safety risks.
Applied to files:
lib/memory/src/torch.rs
📚 Learning: 2025-08-18T20:51:51.324Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/pipeline/network/egress/push_router.rs:0-0
Timestamp: 2025-08-18T20:51:51.324Z
Learning: The runtime crate cannot depend on the llm crate due to architectural dependency constraints, preventing imports from lib/llm into lib/runtime.
Applied to files:
Cargo.toml
📚 Learning: 2025-08-15T02:01:01.238Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2453
File: deploy/dynamo_check.py:739-741
Timestamp: 2025-08-15T02:01:01.238Z
Learning: In the dynamo project, missing cargo should be treated as a fatal error in dynamo_check.py because developers need cargo to build the Rust components. The tool is designed to validate the complete development environment, not just import functionality.
Applied to files:
lib/memory/Cargo.toml
📚 Learning: 2025-09-18T21:49:28.906Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 2989
File: lib/llm/src/block_manager/block/transfer/cuda.rs:416-420
Timestamp: 2025-09-18T21:49:28.906Z
Learning: Pinned memory (page-locked host memory) being accessible by GPU via DMA doesn't change the fact that host and device pointers exist in separate virtual address spaces, so overlap checks between host and device pointers are still invalid regardless of whether the host memory is pinned.
Applied to files:
lib/memory/src/pinned.rs
🧬 Code graph analysis (10)

Related symbols (`size`, `addr`, `storage_kind`, `as_any`, `nixl_descriptor`, `nixl_params`, `new`, `drop`, and similar accessors) were traced across the crate; the analyzed files and the files they reference are:

- lib/memory/src/offset.rs (3): arena.rs, lib.rs, nixl.rs
- lib/memory/src/disk.rs (6): arena.rs, device.rs, lib.rs, offset.rs, system.rs, nixl.rs
- lib/memory/src/actions.rs (3): pinned.rs, system.rs, lib.rs
- lib/memory/src/lib.rs (8): offset.rs, actions.rs, arena.rs, nixl.rs, device.rs, disk.rs, pinned.rs, system.rs
- lib/memory/src/nixl.rs (7): arena.rs, lib.rs, offset.rs, device.rs, disk.rs, pinned.rs, system.rs
- lib/memory/src/system.rs (6): actions.rs, arena.rs, device.rs, offset.rs, pinned.rs, nixl.rs
- lib/memory/src/device.rs (3): pinned.rs, nixl.rs, system.rs
- lib/memory/src/arena.rs (4): lib.rs, nixl.rs, offset.rs, system.rs
- lib/memory/src/tests.rs (6): nixl.rs, device.rs, disk.rs, lib.rs, pinned.rs, system.rs
- lib/memory/src/pinned.rs (4): actions.rs, device.rs, nixl.rs, system.rs
🪛 GitHub Actions: Copyright Checks
lib/memory/Cargo.toml
[error] 1-1: Copyright checkers detected missing or invalid copyright headers. File 'lib/memory/Cargo.toml' is missing required header pattern.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: sglang (arm64)
- GitHub Check: trtllm (amd64)
- GitHub Check: trtllm (arm64)
- GitHub Check: sglang (amd64)
- GitHub Check: operator (arm64)
- GitHub Check: operator (amd64)
- GitHub Check: vllm (amd64)
- GitHub Check: vllm (arm64)
- GitHub Check: Build and Test - dynamo
- GitHub Check: clippy (.)
- GitHub Check: tests (launch/dynamo-run)
- GitHub Check: clippy (launch/dynamo-run)
- GitHub Check: tests (lib/runtime/examples)
- GitHub Check: tests (.)
- GitHub Check: tests (lib/bindings/python)
- GitHub Check: clippy (lib/bindings/python)
🔇 Additional comments (11)
Cargo.toml (1)
10-10: Verify default-members inclusion for lib/memory.

The new `lib/memory` crate is added to `members` but not to `default-members`, unlike other core library crates (`lib/llm`, `lib/runtime`, `lib/tokens`, etc.). Confirm whether this exclusion is intentional (e.g., if the crate is slow to build or only used internally).

lib/memory/Cargo.toml (1)
16-25: All external dependency versions are current and compatible.

The dependencies declared are nixl-sys 0.7 (latest 0.7.0), nix 0.30 (latest 0.30.1), offset-allocator 0.2 (latest 0.2.0), and libc 0.2 (latest 0.2.176). All are within compatible semver ranges as declared in the manifest.
lib/memory/src/offset.rs (5)
8-25: LGTM: Clean structure and debug formatting.

The `OffsetBuffer` struct design is appropriate for representing sub-region views, and the manual `Debug` implementation provides clear visibility into the view's composition.
31-41: LGTM: Robust bounds validation.

The constructor correctly validates that the offset region fits within the base buffer bounds, using `checked_add` to prevent integer overflow.
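The bounds check praised here can be sketched in isolation. This is a hypothetical standalone helper, not the crate's actual constructor; `validate_view` and the `String` error type are illustrative:

```rust
// Hypothetical sketch of the constructor's bounds check; the real
// OffsetBuffer uses its own field layout and error type.
fn validate_view(base_size: usize, offset: usize, len: usize) -> Result<(), String> {
    // checked_add returns None on overflow, so `offset + len` can never wrap
    // around and silently pass the comparison below.
    match offset.checked_add(len) {
        Some(end) if end <= base_size => Ok(()),
        Some(end) => Err(format!("view end {end} exceeds base size {base_size}")),
        None => Err("offset + len overflows usize".to_string()),
    }
}

fn main() {
    assert!(validate_view(1024, 512, 512).is_ok());
    assert!(validate_view(1024, 512, 513).is_err());
    // Near usize::MAX an unchecked addition would wrap; checked_add reports it.
    assert!(validate_view(1024, usize::MAX, 1).is_err());
    println!("bounds checks ok");
}
```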
44-51: LGTM: Simple and correct accessors.

The accessor methods provide appropriate read-only access to the offset and base buffer.
55-57: Verify that address arithmetic won't overflow.

The `addr()` method performs unchecked addition `self.base.addr() + self.offset`. While the constructor validates that `offset + size <= base.size()`, it doesn't guarantee that adding the offset to the base address won't overflow `usize::MAX`. This is unlikely in practice for valid memory addresses, but could be a concern if base addresses are near the upper limit of the address space.

Consider a saturating (or checked) addition here for defensive programming:

```diff
 fn addr(&self) -> usize {
-    self.base.addr() + self.offset
+    self.base.addr().saturating_add(self.offset)
 }
```
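To make the trade-off concrete, the following sketch contrasts the overflow behaviors; `safe_addr` is an illustrative helper, not part of the crate:

```rust
// Illustrative helper: checked addition returns None instead of wrapping.
fn safe_addr(base: usize, offset: usize) -> Option<usize> {
    base.checked_add(offset)
}

fn main() {
    // Realistic addresses: behaves like plain addition.
    assert_eq!(safe_addr(0x7f00_0000, 16), Some(0x7f00_0010));
    // Pathological base near the top of the address space: overflow is caught.
    assert_eq!(safe_addr(usize::MAX - 8, 16), None);
    // saturating_add, as in the diff above, clamps instead of failing,
    // which keeps addr() infallible at the cost of a silently wrong address.
    assert_eq!((usize::MAX - 8).saturating_add(16), usize::MAX);
    println!("ok");
}
```

The design choice is whether `addr()` should stay infallible (saturate) or surface the error (checked); either is defensible for a debug-only concern like this.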
71-76: LGTM: Consistent with arena.rs pattern.

The `nixl_descriptor()` implementation correctly adjusts the base descriptor to reflect the offset sub-region, consistent with the pattern used in `arena.rs` (lines 172-186).

lib/memory/src/prelude.rs (1)
1-6: LGTM: Standard prelude pattern.

The re-exports are appropriate for exposing the memory crate's public API surface, following standard Rust conventions for prelude modules.
lib/memory/src/torch.rs (3)
4-8: LGTM: Clean enum design.

The `TorchDevice` enum provides a clean abstraction for device types with appropriate derives and an extensible `Other` variant for future device types.
10-21: LGTM: Idiomatic helper methods.

The helper methods `is_cuda()` and `cuda_device_index()` are well-designed and follow Rust conventions, providing convenient access to device information.
23-29: LGTM: Well-designed trait for PyTorch interop.

The `TorchTensor` trait provides a clean abstraction for tensor metadata with appropriate supertraits (`Send + Sync + Debug`). The use of `u64` for `data_ptr()` is reasonable given the target platforms for CUDA.
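The shape the three comments above describe can be reconstructed roughly as follows. This is a sketch modeled on the review text: variant payloads and exact signatures are assumptions, not the crate's verified API.

```rust
use std::fmt::Debug;

// Illustrative reconstruction of the device enum and tensor trait the
// review describes; names follow the review, payloads are assumed.
#[derive(Clone, Debug, PartialEq, Eq)]
enum TorchDevice {
    Cpu,
    Cuda(usize),
    Other(String), // extensible variant for future device types
}

impl TorchDevice {
    fn is_cuda(&self) -> bool {
        matches!(self, TorchDevice::Cuda(_))
    }
    fn cuda_device_index(&self) -> Option<usize> {
        match self {
            TorchDevice::Cuda(idx) => Some(*idx),
            _ => None,
        }
    }
}

// Tensor metadata trait: u64 data_ptr covers both host and CUDA pointers.
trait TorchTensor: Send + Sync + Debug {
    fn data_ptr(&self) -> u64;
    fn device(&self) -> TorchDevice;
}

fn main() {
    let dev = TorchDevice::Cuda(1);
    assert!(dev.is_cuda());
    assert_eq!(dev.cuda_device_index(), Some(1));
    assert_eq!(TorchDevice::Cpu.cuda_device_index(), None);
    println!("ok");
}
```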
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Signed-off-by: Ryan Olson <rolson@nvidia.com>
Signed-off-by: Ryan Olson <rolson@nvidia.com>
Signed-off-by: Ryan Olson <rolson@nvidia.com>
Signed-off-by: Ryan Olson <rolson@nvidia.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
@coderabbitai - full review
@ryanolson: I'll perform a full review of the PR.

✅ Actions performed

Full review triggered.
Actionable comments posted: 6
♻️ Duplicate comments (1)
lib/memory/src/actions.rs (1)
173-189: Handle zero-sized element types before dividing
`as_slice_typed_mut` still divides by `size_of::<T>()` without guarding against ZSTs, so passing `T = ()` (or any zero-sized type) will panic in safe code. Please mirror the immutable path: reject `elem_size == 0` up front and reuse the cached value for both the division and modulo checks.

Apply this diff:

```diff
-    let ptr = bytes.as_mut_ptr() as *mut T;
-    let len = bytes.len() / std::mem::size_of::<T>();
-
-    if !(bytes.as_ptr() as usize).is_multiple_of(std::mem::align_of::<T>()) {
+    let ptr = bytes.as_mut_ptr() as *mut T;
+    let elem_size = std::mem::size_of::<T>();
+    if elem_size == 0 {
+        return Err(StorageError::Unsupported(
+            "zero-sized types are not supported".into(),
+        ));
+    }
+    let len = bytes.len() / elem_size;
+
+    if !(bytes.as_ptr() as usize).is_multiple_of(std::mem::align_of::<T>()) {
         return Err(StorageError::Unsupported(format!(
             "memory not aligned for type (required alignment: {})",
             std::mem::align_of::<T>()
         )));
     }
-    if bytes.len() % std::mem::size_of::<T>() != 0 {
+    if bytes.len() % elem_size != 0 {
         return Err(StorageError::Unsupported(format!(
             "size {} is not a multiple of type size {}",
             bytes.len(),
-            std::mem::size_of::<T>()
+            elem_size
         )));
     }
```
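The guarded conversion the review asks for can be shown end to end in a standalone sketch. The function name and `String` error type here are illustrative, not the crate's actual `as_slice_typed_mut` signature:

```rust
// Standalone sketch of a byte-to-typed-slice conversion with the three
// guards the review requires: ZST rejection, alignment, and length.
fn as_slice_typed<T>(bytes: &[u8]) -> Result<&[T], String> {
    let elem_size = std::mem::size_of::<T>();
    // Guard 1: a zero-sized T would make the division below panic.
    if elem_size == 0 {
        return Err("zero-sized types are not supported".into());
    }
    // Guard 2: the buffer start must satisfy T's alignment.
    if (bytes.as_ptr() as usize) % std::mem::align_of::<T>() != 0 {
        return Err(format!(
            "memory not aligned (required alignment: {})",
            std::mem::align_of::<T>()
        ));
    }
    // Guard 3: the byte length must be an exact multiple of T's size.
    if bytes.len() % elem_size != 0 {
        return Err(format!(
            "size {} is not a multiple of type size {}",
            bytes.len(),
            elem_size
        ));
    }
    // SAFETY: alignment and length were checked above; T must additionally
    // be valid for any bit pattern (true for plain integer types).
    Ok(unsafe {
        std::slice::from_raw_parts(bytes.as_ptr() as *const T, bytes.len() / elem_size)
    })
}

fn main() {
    let buf = [0u8; 16];
    // ZST is rejected up front instead of dividing by zero.
    assert!(as_slice_typed::<()>(&buf).is_err());
    assert_eq!(as_slice_typed::<u8>(&buf).unwrap().len(), 16);
    println!("ok");
}
```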
🧹 Nitpick comments (2)
lib/memory/src/pinned.rs (1)
13-25: Consider exposing device selection to `PinnedStorage::new`.

The `cuda_context` function supports arbitrary device IDs, but `PinnedStorage::new` (line 51) hardcodes device 0. If multi-device pinned allocations are expected, consider adding a `device_id` parameter to `PinnedStorage::new` to leverage this flexibility.

lib/memory/src/lib.rs (1)
61-63: Remove or implement CUDA feature gating.

Line 61 has a commented-out `#[cfg(feature = "cuda")]`, but the `Cuda` variant (lines 62-63) is not actually feature-gated. This appears in multiple places (lines 61, 76, 80). Either remove the comments or implement consistent feature gating to make the CUDA dependency optional.
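Consistent gating would look roughly like the sketch below. The enum name and variants are assumptions for illustration; only the `cuda` feature name comes from the commented attribute the review quotes:

```rust
// Sketch of consistent feature gating: the variant simply does not exist
// unless the crate is built with `--features cuda`, so every match arm and
// constructor touching it must carry the same cfg attribute.
#[derive(Debug, PartialEq)]
enum StorageKind {
    System,
    Pinned,
    #[cfg(feature = "cuda")]
    Cuda { device_id: usize },
    Disk,
}

fn main() {
    let kind = StorageKind::System;
    // Without the feature, this statement (and the variant) compiles away.
    #[cfg(feature = "cuda")]
    let _gpu = StorageKind::Cuda { device_id: 0 };
    assert_eq!(kind, StorageKind::System);
    println!("ok");
}
```

Gating the variant (rather than leaving comments) lets the compiler enforce that no CUDA code path is reachable in a non-CUDA build.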
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (18)
- Cargo.toml (3 hunks)
- lib/config/Cargo.toml (1 hunks)
- lib/config/src/lib.rs (1 hunks)
- lib/memory/Cargo.toml (1 hunks)
- lib/memory/src/actions.rs (1 hunks)
- lib/memory/src/arena.rs (1 hunks)
- lib/memory/src/device.rs (1 hunks)
- lib/memory/src/disk.rs (1 hunks)
- lib/memory/src/lib.rs (1 hunks)
- lib/memory/src/nixl.rs (1 hunks)
- lib/memory/src/nixl/agent.rs (1 hunks)
- lib/memory/src/nixl/config.rs (1 hunks)
- lib/memory/src/offset.rs (1 hunks)
- lib/memory/src/pinned.rs (1 hunks)
- lib/memory/src/prelude.rs (1 hunks)
- lib/memory/src/system.rs (1 hunks)
- lib/memory/src/tests.rs (1 hunks)
- lib/memory/src/torch.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (11)
📚 Learning: 2025-10-17T22:57:12.526Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 3700
File: lib/llm/src/block_manager.rs:19-19
Timestamp: 2025-10-17T22:57:12.526Z
Learning: The dynamo library (ai-dynamo/dynamo repository) is currently developed to run on Linux only, so Linux-specific code and syscalls do not require platform guards.
Applied to files:
Cargo.toml, lib/config/Cargo.toml, lib/memory/Cargo.toml
📚 Learning: 2025-08-18T20:51:51.324Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/pipeline/network/egress/push_router.rs:0-0
Timestamp: 2025-08-18T20:51:51.324Z
Learning: The runtime crate cannot depend on the llm crate due to architectural dependency constraints, preventing imports from lib/llm into lib/runtime.
Applied to files:
Cargo.toml
📚 Learning: 2025-08-15T02:01:01.238Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2453
File: deploy/dynamo_check.py:739-741
Timestamp: 2025-08-15T02:01:01.238Z
Learning: In the dynamo project, missing cargo should be treated as a fatal error in dynamo_check.py because developers need cargo to build the Rust components. The tool is designed to validate the complete development environment, not just import functionality.
Applied to files:
Cargo.toml, lib/memory/Cargo.toml
📚 Learning: 2025-10-17T22:57:18.773Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 3700
File: lib/llm/src/block_manager/numa_allocator.rs:4-6
Timestamp: 2025-10-17T22:57:18.773Z
Learning: The Dynamo library (lib/llm) is developed to run on Linux only, so platform-specific cfg guards for Linux-only features are not required.
Applied to files:
Cargo.toml
📚 Learning: 2025-09-22T18:09:23.513Z
Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3165
File: components/backends/sglang/src/dynamo/sglang/args.py:201-202
Timestamp: 2025-09-22T18:09:23.513Z
Learning: KrishnanPrash suggested adding early validation for custom Jinja template paths in the Rust layer (lib/bindings/python/rust/lib.rs) to benefit both vLLM and SGLang workflows, using PathBuf::from() and path.exists() checks with appropriate PyFileNotFoundError handling.
Applied to files:
lib/memory/src/disk.rs
📚 Learning: 2025-09-11T03:24:47.820Z
Learnt from: kthui
Repo: ai-dynamo/dynamo PR: 3004
File: lib/runtime/src/pipeline/network/ingress/push_handler.rs:271-277
Timestamp: 2025-09-11T03:24:47.820Z
Learning: In lib/runtime/src/pipeline/network/ingress/push_handler.rs, the maintainer prefers to keep the existing error comparison logic using format!("{:?}", err) == STREAM_ERR_MSG unchanged until proper error types are implemented, even though it has technical debt. Avoid suggesting changes to working legacy code that will be refactored later.
Applied to files:
lib/memory/src/disk.rs, lib/memory/src/actions.rs
📚 Learning: 2025-09-22T18:09:23.513Z
Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3165
File: components/backends/sglang/src/dynamo/sglang/args.py:201-202
Timestamp: 2025-09-22T18:09:23.513Z
Learning: Both vLLM and SGLang implementations of --custom-jinja-template pass the path directly to register_llm() without any validation. KrishnanPrash suggested implementing early validation in the Rust layer (lib/bindings/python/rust/lib.rs) using PathBuf::from() and path.exists() checks with PyFileNotFoundError for consistent error handling across both backends.
Applied to files:
lib/memory/src/disk.rs
📚 Learning: 2025-09-19T18:21:03.693Z
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 2746
File: lib/llm/src/grpc/service/tensor.rs:269-447
Timestamp: 2025-09-19T18:21:03.693Z
Learning: In lib/llm/src/grpc/service/tensor.rs, GuanLuo prefers using unsafe Vec::from_raw_parts operations to avoid data copying for performance reasons when converting raw tensor data from Vec<u8> to typed vectors like Vec<u32>, Vec<f32>, etc., even when it involves safety risks.
Applied to files:
lib/memory/src/torch.rs
📚 Learning: 2025-09-18T21:49:28.906Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 2989
File: lib/llm/src/block_manager/block/transfer/cuda.rs:416-420
Timestamp: 2025-09-18T21:49:28.906Z
Learning: Pinned memory (page-locked host memory) being accessible by GPU via DMA doesn't change the fact that host and device pointers exist in separate virtual address spaces, so overlap checks between host and device pointers are still invalid regardless of whether the host memory is pinned.
Applied to files:
lib/memory/src/pinned.rs
📚 Learning: 2025-08-05T22:51:59.230Z
Learnt from: dmitry-tokarev-nv
Repo: ai-dynamo/dynamo PR: 2300
File: pyproject.toml:64-66
Timestamp: 2025-08-05T22:51:59.230Z
Learning: The ai-dynamo/dynamo project does not ship ARM64 wheels, so platform markers to restrict dependencies to x86_64 are not needed in pyproject.toml dependencies.
Applied to files:
lib/config/Cargo.toml
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.
Applied to files:
lib/config/Cargo.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: sglang (amd64)
- GitHub Check: vllm (arm64)
- GitHub Check: trtllm (arm64)
- GitHub Check: trtllm (amd64)
- GitHub Check: operator (arm64)
- GitHub Check: operator (amd64)
- GitHub Check: sglang (arm64)
- GitHub Check: vllm (amd64)
- GitHub Check: Build and Test - dynamo
- GitHub Check: tests (lib/runtime/examples)
- GitHub Check: tests (launch/dynamo-run)
- GitHub Check: tests (lib/bindings/python)
- GitHub Check: tests (.)
- GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (3)
lib/memory/src/pinned.rs (2)
117-119: Clarify nixl_descriptor design intent.

`nixl_descriptor()` returns `None`, but `NixlCompatible` is implemented below (lines 123-128), suggesting NIXL registration is supported. Is this intentional? If pinned memory requires explicit NIXL registration via the `NixlCompatible` interface rather than automatic descriptor creation, consider adding a comment explaining this design choice.
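The two-path design the comment asks about can be sketched as follows. Trait and method names mirror the review text only; field layouts and signatures are assumptions, not the crate's verified API:

```rust
// Hypothetical sketch: automatic descriptor exposure vs. explicit
// registration. Returning None from nixl_descriptor() signals "no
// descriptor until the caller registers via NixlCompatible".
#[derive(Debug, Clone, PartialEq)]
struct NixlDescriptor {
    addr: u64,
    size: usize,
}

trait MemoryDescription {
    // Default: storage is not automatically NIXL-visible.
    fn nixl_descriptor(&self) -> Option<NixlDescriptor> {
        None
    }
}

trait NixlCompatible: MemoryDescription {
    // Parameters needed for an explicit registration call.
    fn nixl_params(&self) -> (u64, usize);
}

struct PinnedStorage {
    addr: u64,
    size: usize,
}

impl MemoryDescription for PinnedStorage {} // keeps the default None

impl NixlCompatible for PinnedStorage {
    fn nixl_params(&self) -> (u64, usize) {
        (self.addr, self.size)
    }
}

fn main() {
    let p = PinnedStorage { addr: 0x2000, size: 4096 };
    assert_eq!(p.nixl_descriptor(), None); // nothing until registered
    assert_eq!(p.nixl_params(), (0x2000, 4096)); // explicit path
    println!("ok");
}
```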
130-146: LGTM: Overflow protection properly implemented.

The previous overflow concern has been correctly addressed using `checked_add` (lines 132-134). The bounds checking and unsafe operations are now properly protected.

lib/memory/src/lib.rs (1)
108-151: LGTM: Clean type erasure pattern.

The `Buffer` wrapper around `Arc<dyn MemoryDescription>` provides an excellent type-erased abstraction. The `Deref` implementation enables transparent access, and the `create_buffer` helper offers good ergonomics for wrapping concrete storage types.
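The pattern praised here is the standard `Arc<dyn Trait>` + `Deref` idiom; a minimal sketch, with trait methods and a `SystemStorage` stand-in assumed for illustration:

```rust
use std::fmt::Debug;
use std::sync::Arc;

// Trait object behind the erased buffer; methods are assumed from the
// review's description, not copied from the crate.
trait MemoryDescription: Debug + Send + Sync {
    fn addr(&self) -> usize;
    fn size(&self) -> usize;
}

#[derive(Debug)]
struct SystemStorage {
    addr: usize,
    size: usize,
}

impl MemoryDescription for SystemStorage {
    fn addr(&self) -> usize {
        self.addr
    }
    fn size(&self) -> usize {
        self.size
    }
}

// Type-erased, cheaply cloneable handle over any concrete storage.
#[derive(Clone, Debug)]
struct Buffer(Arc<dyn MemoryDescription>);

impl std::ops::Deref for Buffer {
    type Target = dyn MemoryDescription;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

fn create_buffer(storage: impl MemoryDescription + 'static) -> Buffer {
    Buffer(Arc::new(storage))
}

fn main() {
    let buf = create_buffer(SystemStorage { addr: 0x1000, size: 4096 });
    // Deref makes trait methods available directly on Buffer.
    assert_eq!(buf.size(), 4096);
    assert_eq!(buf.addr(), 0x1000);
    let clone = buf.clone(); // bumps the Arc refcount, no storage copy
    assert_eq!(clone.size(), 4096);
    println!("ok");
}
```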
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Signed-off-by: Ryan Olson <rolson@nvidia.com>
Summary by CodeRabbit
New Features
Tests
Chores