Summary
A recent change in src/cpu/rv64/rvv_matmul.cpp seems to have introduced a regression, causing the test_benchdnn_modeC_matmul_ci_cpu test to fail.
Version
[xzz@localhost benchdnn]$ ONEDNN_VERBOSE=all ./benchdnn --matmul --stag=ab --dtag=ab --attr-dropout=0.5:12345678 1x1:1x1
onednn_verbose,v1,info,oneDNN v3.10.0 (commit 382986ec76ca0cb199614ff470c8b52923347f63)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:64
onednn_verbose,v1,info,cpu,isa:Generic
onednn_verbose,v1,info,gpu,runtime:none
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,primitive,create:cache_miss,cpu,matmul,RISCV64GCV,undef,src:f32::blocked:ab::f0 wei:f32:a:blocked:ab::f0 dst:f32::blocked:ab::f0,attr-dropout:any,,1x1:1x1,0.0839844
onednn_verbose,v1,primitive,create:cache_hit,cpu,matmul,RISCV64GCV,undef,src:f32::blocked:ab::f0 wei:f32:a:blocked:ab::f0 dst:f32::blocked:ab::f0,attr-dropout:any,,1x1:1x1,0.00708008
onednn_verbose,v1,common,create:check,memory,unsupported format tag,src/common/memory.cpp:190
Error: Function 'initialize_memory_create' at (/home/xzz/oneDNN-raw-2/tests/benchdnn/dnnl_memory.cpp:867) returned 'invalid_arguments'
[CHECK_MEM][ERROR]: Allocations were not cleared
[CHECK_MEM][ERROR]: Total size wasn't reduced to 0
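When triaging a log like the one above, it can help to split each primitive line into named fields so the selected implementation (here RISCV64GCV) and the attributes are easy to pick out. The following is a minimal sketch, not part of oneDNN or benchdnn: the helper name `parse_primitive_line` is hypothetical, the field names are copied from the `template:` line in the log, and it assumes no field contains a comma, which holds for the lines shown here but may not hold in general.

```python
# Field names copied from the "template:" line of the verbose log above.
PRIMITIVE_FIELDS = [
    "operation", "engine", "primitive", "implementation", "prop_kind",
    "memory_descriptors", "attributes", "auxiliary", "problem_desc",
    "exec_time",
]


def parse_primitive_line(line):
    """Return a dict for a primitive verbose line, or None for any other line.

    Hypothetical helper; assumes that no field contains a comma.
    """
    parts = line.strip().split(",")
    if parts[:3] != ["onednn_verbose", "v1", "primitive"]:
        return None
    values = parts[3:]
    # Skip the "info,template:..." header and anything with an unexpected shape.
    if values[0] == "info" or len(values) != len(PRIMITIVE_FIELDS):
        return None
    return dict(zip(PRIMITIVE_FIELDS, values))


# Usage with the first primitive line from the log above.
sample = (
    "onednn_verbose,v1,primitive,create:cache_miss,cpu,matmul,RISCV64GCV,undef,"
    "src:f32::blocked:ab::f0 wei:f32:a:blocked:ab::f0 dst:f32::blocked:ab::f0,"
    "attr-dropout:any,,1x1:1x1,0.0839844"
)
record = parse_primitive_line(sample)
print(record["implementation"], record["attributes"])  # RISCV64GCV attr-dropout:any
```

The `unsupported format tag` line is reported under `common` rather than `primitive`, so this helper returns `None` for it; it would need separate handling.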
Environment
- CPU make and model
[xzz@localhost benchdnn]$ lscpu
Architecture: riscv64
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Vulnerabilities:
Gather data sampling: Not affected
Indirect target selection: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Not affected
Spectre v1: Not affected
Spectre v2: Not affected
Srbds: Not affected
Tsx async abort: Not affected
[xzz@localhost benchdnn]$ cat /proc/cpuinfo
processor : 0
hart : 1
isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
mmu : sv48
mvendorid : 0x5b7
marchid : 0x80000000090c0d00
mimpid : 0x2047000
hart isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
processor : 1
hart : 0
isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
mmu : sv48
mvendorid : 0x5b7
marchid : 0x80000000090c0d00
mimpid : 0x2047000
hart isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
processor : 2
hart : 2
isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
mmu : sv48
mvendorid : 0x5b7
marchid : 0x80000000090c0d00
mimpid : 0x2047000
hart isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
processor : 3
hart : 3
isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
mmu : sv48
mvendorid : 0x5b7
marchid : 0x80000000090c0d00
mimpid : 0x2047000
hart isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbs_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_sscofpmf_sstc_svinval_svnapot_svpbmt
......
- OS version
[xzz@localhost benchdnn]$ uname -a
Linux localhost.localdomain 6.12.35.eos30.riscv64+ #3 SMP Thu Jul 3 16:04:12 CST 2025 riscv64 riscv64 riscv64 GNU/Linux
- Compiler version
[xzz@localhost benchdnn]$ clang --version
clang version 17.0.6 ( 17.0.6-17.eos30)
Target: riscv64-EulixOS-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
- CMake version
[xzz@localhost benchdnn]$ cmake --version
cmake version 3.27.9
CMake suite maintained and supported by Kitware (kitware.com/cmake).
- CMake output log
[xzz@localhost build2]$ cmake .. -DDNNL_ARCH_OPT_FLAGS=-march=rv64gcv -DCMAKE_CXX_FLAGS=-march=rv64gcv -DCMAKE_C_FLAGS=-march=rv64gcv
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is Clang 17.0.6
-- The CXX compiler identification is Clang 17.0.6
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/lib64/ccache/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/lib64/ccache/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found OpenMP_C: -fopenmp=libomp (found version "5.1")
-- Found OpenMP_CXX: -fopenmp=libomp (found version "5.1")
-- Found OpenMP: TRUE (found version "5.1")
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: TRUE
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: TRUE
-- Using RV64 march flag: -march=rv64gcv_zvfh
-- Found Doxygen: /usr/bin/doxygen (found version "1.9.6") found components: doxygen dot
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE)
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.6", minimum required is "3.7") found components: Interpreter
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE)
-- Found Git: /usr/bin/git (found version "2.43.0")
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (8.7s)
-- Generating done (1.2s)
-- Build files have been written to: /home/xzz/oneDNN-raw-2/build2
- git hash
382986ec76ca0cb199614ff470c8b52923347f63
Steps to reproduce
ctest -R test_benchdnn_modeC_matmul_ci_cpu
or, for a faster reproduction, run the failing case directly:
./benchdnn --matmul --stag=ab --dtag=ab --attr-dropout=0.5:12345678 1x1:1x1
Observed behavior
The test fails with the following error log:
create: --matmul --stag=ab --dtag=ab --attr-dropout=0.5:12345678 1x1:1x1
run: --matmul --stag=ab --dtag=ab --attr-dropout=0.5:12345678 1x1:1x1
Error: Function 'initialize_memory_create' at (/home/xiazhuozhao/oneDNN/tests/benchdnn/dnnl_memory.cpp:867) returned 'invalid_arguments'
[CHECK_MEM][ERROR]: Allocations were not cleared
[CHECK_MEM][ERROR]: Total size wasn't reduced to 0
Expected behavior
xiazhuozhao-desktop% ./benchdnn --matmul --stag=ab --dtag=ab --attr-dropout=0.5:12345678 1x1:1x1
0:PASSED (41 ms) __REPRO: --matmul --stag=ab --dtag=ab --attr-dropout=0.5:12345678 1x1:1x1
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 0.05s; create_pd: 0.01s (12%); create_prim: 0.00s (1%); fill: 0.02s (40%); execute: 0.00s (3%); compute_ref: 0.00s (0%); compare: 0.00s (9%);
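For scripted checks, the `tests:... passed:...` summary line in the passing output above can be turned into counters. This is only a sketch under assumptions: `parse_summary` is a hypothetical helper, not a benchdnn feature, and it relies solely on the `name:count` layout visible in that line. In the failing log shown under "Observed behavior", no such summary line appears, so a missing line would have to be treated as a failure as well.

```python
def parse_summary(line):
    """Parse a 'name:count name:count ...' summary line into a dict of ints.

    Hypothetical helper; tokens that are not 'name:<digits>' are ignored.
    """
    counts = {}
    for token in line.split():
        name, sep, value = token.partition(":")
        if sep and value.isdigit():
            counts[name] = int(value)
    return counts


# Usage with the summary line from the expected output above.
summary = (
    "tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 "
    "invalid_arguments:0 failed:0 listed:0"
)
counts = parse_summary(summary)
run_ok = counts["failed"] == 0 and counts["invalid_arguments"] == 0
print(counts["passed"], run_ok)  # 1 True
```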