Description
Version
latest master / 4.0.0
What behaviour are you expecting?
I was reproducing the client-server setup with the HAL server as described in https://github.com/codeplaysoftware/oneapi-construction-kit/tree/main/examples/hal_cpu_remote_server, and noticed that my large kernels fail on both of my RISC-V devices. I am sure both devices have sufficient memory, and the allocation itself succeeds as expected.
What actual behaviour are you seeing?
On the local client I see the following (the first run behaves as expected):
$ HAL_REMOTE_PORT=5906 ./test $((1<<25))
Running on ock cpu
Allocated 128 MB
$ HAL_REMOTE_PORT=5906 ./test $((1<<26))
Running on ock cpu
Allocated 256 MB
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
what(): Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)
Aborted (core dumped)
On the RISC-V server, I get a segmentation fault shortly afterwards: Segmentation fault (core dumped). After that, attempting to restart the server on the same port fails with Unable to start server on requested port 5906, node 127.0.0.1.
By contrast, an empty kernel, or no kernel at all, works fine.
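For reference, the sizes in the log above can be cross-checked with a small sketch. `alloc_mib` is a hypothetical helper, not part of the test program, and it assumes a 4-byte `float`; it only mirrors the `len * sizeof(float) / 1024 / 1024` expression that the test prints.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the reproducer): size in MB, as printed by
// the test program, of a buffer holding `count` elements of `elem_size` bytes.
// Assumes sizeof(float) == 4 when the default element size is used.
constexpr std::uint64_t alloc_mib(std::uint64_t count,
                                  std::size_t elem_size = sizeof(float)) {
    return count * elem_size / 1024 / 1024;
}

// 1 << 25 floats is the run where the kernel completes.
static_assert(alloc_mib(1ULL << 25) == 128, "first run in the log");
// 1 << 26 floats is the run where the kernel fails and the server crashes.
static_assert(alloc_mib(1ULL << 26) == 256, "second run in the log");
// The default length in the test program, 1 << 28 floats.
static_assert(alloc_mib(1ULL << 28) == 1024, "default length");
```

So, as far as these two runs show, the failure first appears somewhere between a 128 MB and a 256 MB buffer, and the default length of the test (1024 MB) lies well above both.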
What steps are required to reproduce the bug?
To reproduce, on the client side:
#include <iostream>
#include <string>
#include <sycl/sycl.hpp>

int main(int argc, char **argv) {
  // Number of floats to allocate; can be overridden on the command line.
  unsigned long long len = 1 << 28;
  if (argc > 1) {
    len = std::stoull(argv[1]);
  }
  sycl::queue queue(sycl::accelerator_selector_v);
  std::cout << "Running on " << queue.get_device().get_info<sycl::info::device::name>() << std::endl;
  float *d_a = sycl::malloc_device<float>(len, queue);
  queue.wait();
  std::cout << "Allocated " << len * sizeof(float) / 1024 / 1024 << " MB" << std::endl;
  // Write each element's index into the device buffer; this is the kernel that fails.
  queue.parallel_for(sycl::range<1>(len), [=](sycl::id<1> idx) {
    d_a[idx] = idx;
  }).wait();
  return 0;
}
On the server, simply listen on a port as usual.
Minimal test case
No response
Anything else we should know?
No response