-
Notifications
You must be signed in to change notification settings - Fork 796
[UR][L0v2 adapter] Fix buffers for integrated gpu #20495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
Conversation
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
5516bc4 to
c3ce32b
Compare
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
49057b5 to
d002d08
Compare
| "opencl": ("cpu", "gpu", "fpga"), | ||
| "cuda": "gpu", | ||
| "level_zero": "gpu", | ||
| "level_zero": ("gpu", "0", "1"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed for local tests, but may be useful in CI, too, depending on hw config. Allows to point out which gpu should be used for tests (eg. integrated/discrete). Example:
llvm-lit --param "sycl_devices=level_zero_v2:1" - the tests are run on device level_zero:1 with v2 adapter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After closer look, as some tests appeared as unsupported, additional one-line fix was necessary.
So, if any of reviewers thinks the change does not belong to this PR, I am also OK with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be changing the default config. CI and other dev systems where this is used may have a different configuration where :0 and :1 mean e.g., different dGPUs. Or there may not be a second GPU at all.
| void *rawPtr; | ||
| UR_CALL_THROWS(hContext->getDefaultUSMPool()->allocate( | ||
| hContext, nullptr, nullptr, UR_USM_TYPE_HOST, size, &rawPtr)); | ||
| if (hostPtr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when can hostPtr be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking the argument. Can we be sure it will never be null?
| auto ret = | ||
| getMemoryAttrs(hContext->getZeHandle(), hostPtr, nullptr, &memProps); | ||
|
|
||
| if (ret == UR_RESULT_SUCCESS && memProps.type != ZE_MEMORY_TYPE_UNKNOWN) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check against ZE_MEMORY_TYPE_UNKNOWN and not something specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand specification right, mem type can be known to be USM (host, device or shared) or unknown. The condition checks if mem type is USM.
| wait_list_view & /*waitListView*/) { | ||
| // TODO: if writeBackPtr is set, we should copy the data back | ||
| /* nop */ | ||
| // No-op: integrated buffers use zero-copy, no synchronization needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if we go in this path when creating the buffer:
"// Import failed - allocate backing buffer and set up copy-back"
?
If "import failed" is a possible scenario, we should have a test. If it's not, that code path should be removed.
| // TODO: if writeBackPtr is set, we should map to that pointer | ||
| // because that's what SYCL expects, SYCL will attempt to call free | ||
| // on the resulting pointer leading to double free with the current | ||
| // implementation. Investigate the SYCL implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this comment false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, the comment was not false as long as map/unmap functions missed full handling of copy-back path. Last commit addresses the issue.
cd3bd6e to
5c52b4c
Compare
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
5c52b4c to
2a0d2df
Compare
This PR implements zero-copy buffers for integrated GPUs.