Skip to content

Conversation

@zhangjian29
Copy link
Contributor

@zhangjian29 zhangjian29 commented Sep 30, 2025

Description

This PR adds binary operation support for rvv_postops.hpp using a primitive-based approach, following the implementation style of acl_post_ops (aarch64) and jit_brgemm_post_ops (x64). The binary operations are implemented by integrating with rvv_binary introduced in #3899 (already merged).

Refactoring rvv_postops

Background

The current rvv_postops.hpp implementation uses an apply method during execution, which accepts a vfloat32m1_t input and vl and returns a vfloat32m1_t output. However, this kind of fine-grained approach:

  • Limits the vectorization of other postop sources
  • Requires redundant vectoization logic when applying postops
  • Restricts data type support extensibility

Since only two implementations currently use this postop approach (merged: rvv_matmul, unmerged: #3948), making changes at this early stage requires minimal effort.

Motivation

To address these limitations, we propose a primitive-based implementation following the acl_post_ops(AArch64) and jit_brgemm_post_ops(x64) patterns for better maintainability.

Key Changes

  • Added primitive-style init and execute methods (aligned with acl_postops.hpp)
  • Enabled binary operation support via rvv_binary.hpp integration (future rvv_binary changes won't affect this PR)
  • Kept the original post_ops_ok and apply methods unchanged in a fused way, making other implementation used original version still correct.
  • Supported post-ops:
    1. relu (via original methods)
    2. Any binary operations (via new primitive methods): add, div, max, min, mul, sub, ge, gt, le, lt, eq, ne, and the ternary select.
  • Demonstrated functionality by adding primitive-style binary postop support to rvv_nchw_pooling

Future Plans

  1. Rebase after cpu: rv64: add support for rv64 binary feature with RISC-V Vector Extension #3899 is merged (done)
  2. Update postop implementation in rvv_matmul and other related code
  3. Remove the original post_ops_ok and apply methods safely
  4. Add support for eltwise and sum operations
  5. Extend data type supports.

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Performance improvements

  • Have you submitted performance data that demonstrates performance improvements?

We test the changes using the RISC-V GNU toolchain(14.2) and verify the functionality under the QEMU RISCV64 emulator with cmd:

ONEDNN_VERBOSE=1 ./benchdnn --pool --dir=FWD_I --attr-post-ops=add:f32,div:f32,mul:f32,max:f32,min:f32,sub:f32,ge:f32,gt:f32,le:f32,lt:f32,eq:f32,ne:f32,select:f32,linear:0.5:-1,relu --batch=shapes_basic

All tests passed with proper dispatching confirmed (relu and linear are dispatched to ref while others are to RISCV64GCV implementation). Calls to the implemented rvv_nchw_pooling with all kinds of postops can be traced by searching for RISCV64GCV in:

@zhangjian29 zhangjian29 requested a review from a team as a code owner September 30, 2025 06:11
@zhangjian29 zhangjian29 changed the title Add rvv postops binary op support via a primitive-based approach cpu: rv64: postops: add rvv postops binary op support via a primitive-based approach Sep 30, 2025
@zhangjian29 zhangjian29 marked this pull request as draft September 30, 2025 06:13
@zhangjian29 zhangjian29 force-pushed the add-rvv-postops-binary branch from f82c9e0 to 57065a8 Compare October 4, 2025 00:13
@zhangjian29 zhangjian29 force-pushed the add-rvv-postops-binary branch from 57065a8 to f18c4bb Compare October 9, 2025 07:24
@zhangjian29 zhangjian29 marked this pull request as ready for review October 9, 2025 07:26
@zhangjian29
Copy link
Contributor Author

Hi @mgouicem ,

Could you please take a look at this PR when you have time? Thank you so much!

Copy link
Contributor

@mgouicem mgouicem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @zhangjian29.
In general relying on primitives and splitting post-ops init and execute makes sense.

However, this PR currently calls init and creates primitives related to post-ops (binary) within the main primitive execute calls (poolin), which is contrary to oneDNN external API principles (when calling execute, there should no more be any creation overhead).

@vpirogov vpirogov requested a review from mgouicem October 27, 2025 21:58
@zhangjian29 zhangjian29 force-pushed the add-rvv-postops-binary branch 2 times, most recently from 72dfaee to de2f273 Compare October 28, 2025 02:01
@zhangjian29
Copy link
Contributor Author

Thank you so much for your feedback @mgouicem. Ready for review now.

@zhangjian29 zhangjian29 requested a review from mgouicem October 28, 2025 12:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants