RISCV64-CI: don't rely on dependency resolution for qemu-user #5506

martin-frbg · 2025-10-14T20:47:23Z

No description provided.

martin-frbg · 2025-10-31T12:02:42Z

@ChipKerchner do we expect casts from bfloat16 to float32 to "just work" for C code on RISCV64? AFAICT this is not implemented, at least not in the cross-compiler setup that this gh workflow uses (even with the latest LLVM and the latest riscv-gnu-toolchain). This causes test failures, because the intermediate result0 = (float)A[ai] * (float)B[bi] in your sbgemm kernel turns the small bfloat16 numbers into huge floats...
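One way the "huge floats" symptom can arise is sketched below, assuming the bfloat16 storage type is only an integer typedef (the fallback discussed later in this thread). The names bf16_bits, bf16_by_cast and bf16_by_bits are hypothetical and are not taken from OpenBLAS. Whether this is the actual cause of the CI failures is not established here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a bfloat16 storage type that is only an
   integer typedef (an assumption for this sketch). */
typedef uint16_t bf16_bits;

/* Plain cast: converts the integer VALUE of the bit pattern to float,
   which is not the number the bfloat16 encoding represents. */
static float bf16_by_cast(bf16_bits h) {
    return (float)h;
}

/* Bit-level conversion: bfloat16 is the upper 16 bits of an IEEE-754
   binary32, so shift the bits into the high half and reinterpret. */
static float bf16_by_bits(bf16_bits h) {
    uint32_t w = (uint32_t)h << 16;
    float f;
    memcpy(&f, &w, sizeof f); /* avoids strict-aliasing problems */
    return f;
}
```

For the bit pattern 0x3F80, which encodes 1.0 in bfloat16, the plain cast yields 16256.0 while the bit-level conversion yields 1.0.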

ChipKerchner · 2025-10-31T12:19:08Z

Scalar casting should just work from bfloat16 to float. I don't see any issue. These are the qemu flags I use.

qemu-riscv64 -cpu rv64,g=true,f=true,d=true,c=true,v=true,vlen=256,elen=64,vext_spec=v1.0,zfh=true,zvfh=true,zvfbfwma=true,rvv_ma_all_1s=true,rvv_ta_all_1s=true,zbc=true,zvbc=true -L /home/ckerchner/tools/tt-riscv-toolchain-ae8a01f3/sysroot

ChipKerchner · 2025-10-31T12:50:06Z

Actually, after syncing, I'm seeing a failure in sbgemm; sbgemv seems fine. BTW, I didn't write sbgemm.

martin-frbg · 2025-10-31T13:13:54Z

Thanks for the flags. Unfortunately, adding the missing ones did not change the outcome for me. I'm also getting SGEMV FAILURES: 789504 with that setup, while the BGEMM test passes (as do all float16 ones). Most likely your TT toolchain is more advanced, and I should just leave out the SB tests in this CI job for now?
I just noticed the use of plain (float) casts in some of the code, while the tests all go to sbf16tos() for conversions.

ChipKerchner · 2025-10-31T13:19:07Z

Are you saying that some architectures besides RISC-V are using plain casts to float while others are using an external function?

ChipKerchner · 2025-10-31T13:24:19Z

BTW, I tried an external function and I'm still getting failures.

martin-frbg · 2025-10-31T13:36:39Z

> Are you saying that some architectures besides RISC-V are using plain casts to float while others are using an external function?

No, on the contrary: I see RISC-V using plain casts while everything else uses an external function.
And at least the first few intermediate calculations in the sbgemm_kernel_16x8_zvl256 seem to make more sense now that I've changed them from casts to calls to the float16to32 wrapper around sbf16tos, as in the test helper header.
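The change described above, replacing casts with a conversion call inside a multiply-accumulate, can be sketched as follows. This is a minimal scalar illustration, not the sbgemm kernel: bf16_bits, bf16_to_f32, dot_converted and dot_plain_cast are hypothetical names, and the helper only stands in for whatever wrapper the kernel actually calls. It again assumes the bfloat16 storage type is a 16-bit integer typedef.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_bits; /* hypothetical integer-typedef bfloat16 */

/* Stand-in conversion helper: widen the bit pattern to binary32. */
static float bf16_to_f32(bf16_bits h) {
    uint32_t w = (uint32_t)h << 16;
    float f;
    memcpy(&f, &w, sizeof f);
    return f;
}

/* Multiply-accumulate with an explicit conversion of each operand. */
static float dot_converted(const bf16_bits *a, const bf16_bits *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += bf16_to_f32(a[i]) * bf16_to_f32(b[i]);
    return acc;
}

/* Same loop with plain casts: multiplies the raw integer values. */
static float dot_plain_cast(const bf16_bits *a, const bf16_bits *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += (float)a[i] * (float)b[i];
    return acc;
}
```

With a = {1.0, 2.0} and b = {3.0, 0.5} encoded as {0x3F80, 0x4000} and {0x4040, 0x3F00}, the converted version returns 4.0, whereas the plain-cast version returns a value in the hundreds of millions.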

ChipKerchner · 2025-10-31T15:48:59Z

The strange thing is that SHGEMM uses the same type casting and all tests pass there.

martin-frbg · 2025-10-31T17:04:18Z

Yes, this got me thinking that maybe there is a conflict between the compiler having (or being expected to have) some "native" support for a floating point "bf16" type and OpenBLAS' fallback solution of assuming bfloat16 is a uint16_t.
Replacing all obvious casts with calls to the conversion function did not solve the test errors for me, however: a lot of the result matrix elements became similar enough to their SGEMM counterparts, but not all. And I have no way of finding out whether the cross-compiler is at fault, or qemu-riscv64 10.1 is not handling all aspects of bfloat16 correctly. My Banana Pi F3 does great for checking fp16 code but appears to lack support for the bfloat16 extensions.

ChipKerchner · 2025-10-31T17:39:33Z

Yes, unfortunately the BananaPi does NOT support the bf16 format.

Another weird thing is that the tests pass for sizes 1 -> 100 but fail for size = 256.

martin-frbg added 20 commits October 14, 2025 22:46

install qemu-user package directly

87c0cd2

add riscv64 elf loader&libraries package

466314f

add library and loader paths

08109f3

Update riscv64_vector.yml

822a7f3

Merge branch 'OpenMathLib:develop' into fixup5496

8412041

add sbgemm/shgemm test for zvl256b target

ec03b83

Enable relevant b/hfloat extensions in qemu cpu string

c1878de

comment out the sh/sb options and tests as they require a newer qemu

1ef1100

add local build of qemu-10.1.1

c26e223

typo fix

0a79085

Update riscv64_vector.yml

928e4e9

Update riscv64_vector.yml

211f1ed

fix gist link to qemu

f305d25

Update riscv64_vector.yml

aaf5329

Update riscv64_vector.yml

d2ec4e0

Update riscv64_vector.yml

99196a6

Update riscv64_vector.yml

18da9c3

Update riscv64_vector.yml

db84577

Update riscv64_vector.yml

8c234ce

fix gcc version

d5d0ce9
