- 
                Notifications
    You must be signed in to change notification settings 
- Fork 15k
[OpenMP][Flang] Fix atomic operations on complex types #165366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-flang-openmp Author: Krish Gupta (KrxGu) ChangesFixes #165184 ProblemWhen using OpenMP atomic read/write on Fortran  Root CauseIn  On 64-bit systems, this resulted in only 8 bytes being transferred regardless of the actual struct size. SolutionChanged both functions to use  
 TestsAdded three regression tests: 
 Full diff: https://github.com/llvm/llvm-project/pull/165366.diff 4 Files Affected: 
 diff --git a/flang/test/Integration/OpenMP/atomic-read-complex.f90 b/flang/test/Integration/OpenMP/atomic-read-complex.f90
new file mode 100644
index 0000000000000..53ad7acbfaf49
--- /dev/null
+++ b/flang/test/Integration/OpenMP/atomic-read-complex.f90
@@ -0,0 +1,44 @@
+!===----------------------------------------------------------------------===!
+! This directory can be used to add Integration tests involving multiple
+! stages of the compiler (for eg. from Fortran to LLVM IR). It should not
+! contain executable tests. We should only add tests here sparingly and only
+! if there is no other way to test. Repeat this message in each test that is
+! added to this directory and sub-directories.
+!===----------------------------------------------------------------------===!
+
+! REQUIRES: x86-registered-target || aarch64-registered-target
+
+! RUN: %flang_fc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fopenmp %s -o - | FileCheck %s
+! RUN: %flang_fc1 -triple aarch64-unknown-linux-gnu -emit-llvm -fopenmp %s -o - | FileCheck %s
+
+! Test that atomic read operations with complex types emit the correct
+! size to __atomic_load. This is a regression test for issue #165184.
+!
+! For complex(4) (8 bytes total): should call __atomic_load(i64 8, ...)
+! For complex(8) (16 bytes total): should call __atomic_load(i64 16, ...)
+! For complex(16) (32 bytes total): should call __atomic_load(i64 32, ...)
+
+program atomic_read_complex
+  implicit none
+
+  ! Test complex(4) - single precision (8 bytes)
+  complex(4) :: c41, c42
+  ! Test complex(8) - double precision (16 bytes)
+  complex(8) :: c81, c82
+  
+  c42 = (1.0_4, 1.0_4)
+  c82 = (1.0_8, 1.0_8)
+
+  ! CHECK-LABEL: define {{.*}} @_QQmain
+
+  ! Single precision complex: 8 bytes
+  ! CHECK: call void @__atomic_load(i64 8, ptr {{.*}}, ptr {{.*}}, i32 {{.*}})
+!$omp atomic read
+  c41 = c42
+  
+  ! Double precision complex: 16 bytes (this was broken before the fix)
+  ! CHECK: call void @__atomic_load(i64 16, ptr {{.*}}, ptr {{.*}}, i32 {{.*}})
+!$omp atomic read
+  c81 = c82
+
+end program atomic_read_complex
diff --git a/flang/test/Integration/OpenMP/atomic-write-complex.f90 b/flang/test/Integration/OpenMP/atomic-write-complex.f90
new file mode 100644
index 0000000000000..7302328fba3bf
--- /dev/null
+++ b/flang/test/Integration/OpenMP/atomic-write-complex.f90
@@ -0,0 +1,44 @@
+!===----------------------------------------------------------------------===!
+! This directory can be used to add Integration tests involving multiple
+! stages of the compiler (for eg. from Fortran to LLVM IR). It should not
+! contain executable tests. We should only add tests here sparingly and only
+! if there is no other way to test. Repeat this message in each test that is
+! added to this directory and sub-directories.
+!===----------------------------------------------------------------------===!
+
+! REQUIRES: x86-registered-target || aarch64-registered-target
+
+! RUN: %flang_fc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fopenmp %s -o - | FileCheck %s
+! RUN: %flang_fc1 -triple aarch64-unknown-linux-gnu -emit-llvm -fopenmp %s -o - | FileCheck %s
+
+! Test that atomic write operations with complex types emit the correct
+! size to __atomic_store. This is a regression test for issue #165184.
+!
+! For complex(4) (8 bytes total): should call __atomic_store(i64 8, ...)
+! For complex(8) (16 bytes total): should call __atomic_store(i64 16, ...)
+! For complex(16) (32 bytes total): should call __atomic_store(i64 32, ...)
+
+program atomic_write_complex
+  implicit none
+
+  ! Test complex(4) - single precision (8 bytes)
+  complex(4) :: c41, c42
+  ! Test complex(8) - double precision (16 bytes)  
+  complex(8) :: c81, c82
+  
+  c42 = (1.0_4, 1.0_4)
+  c82 = (1.0_8, 1.0_8)
+
+  ! CHECK-LABEL: define {{.*}} @_QQmain
+  
+  ! Single precision complex: 8 bytes
+  ! CHECK: call void @__atomic_store(i64 8, ptr {{.*}}, ptr {{.*}}, i32 {{.*}})
+!$omp atomic write
+  c41 = c42
+  
+  ! Double precision complex: 16 bytes (this was broken before the fix)
+  ! CHECK: call void @__atomic_store(i64 16, ptr {{.*}}, ptr {{.*}}, i32 {{.*}})
+!$omp atomic write
+  c81 = c82
+
+end program atomic_write_complex
diff --git a/flang/test/Integration/OpenMP/issue-165184-atomic-complex8.f90 b/flang/test/Integration/OpenMP/issue-165184-atomic-complex8.f90
new file mode 100644
index 0000000000000..8c930191ad233
--- /dev/null
+++ b/flang/test/Integration/OpenMP/issue-165184-atomic-complex8.f90
@@ -0,0 +1,32 @@
+! RUN: %flang_fc1 -emit-llvm -fopenmp -o - %s | FileCheck %s
+! RUN: %flang -fopenmp %s -o %t && %t | FileCheck %s --check-prefix=EXEC
+
+! Regression test for issue #165184:
+! Atomic write to complex(8) was only storing 8 bytes instead of 16,
+! causing the imaginary part to remain 0.
+
+program test_atomic_write_complex8
+  implicit none
+  complex(8) :: c81, c82
+  
+  c81 = (0.0_8, 0.0_8)
+  c82 = (0.0_8, 0.0_8)
+  
+  ! Verify the fix: __atomic_store should be called with size 16
+  ! CHECK: call void @__atomic_store(i64 16,
+  
+!$omp parallel
+!$omp atomic write
+  c81 = c82 + (1.0_8, 1.0_8)
+!$omp end parallel
+  
+  ! EXEC: c81 = (1.00000000000000,1.00000000000000)
+  write(*,*) "c81 = ", c81
+  
+  ! Verify both parts are correct
+  if (real(c81) /= 1.0_8 .or. aimag(c81) /= 1.0_8) then
+    write(*,*) "ERROR: Expected (1.0,1.0) but got", c81
+    stop 1
+  end if
+  
+end program test_atomic_write_complex8
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 286ed039b1214..c71b4a1b6eaa1 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -9338,9 +9338,8 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc,
     // target does not support `atomicrmw` of the size of the struct
     LoadInst *OldVal = Builder.CreateLoad(XElemTy, X.Var, "omp.atomic.read");
     OldVal->setAtomic(AO);
-    const DataLayout &LoadDL = OldVal->getModule()->getDataLayout();
-    unsigned LoadSize =
-        LoadDL.getTypeStoreSize(OldVal->getPointerOperand()->getType());
+    const DataLayout &DL = OldVal->getModule()->getDataLayout();
+    unsigned LoadSize = DL.getTypeStoreSize(XElemTy);
     OpenMPIRBuilder::AtomicInfo atomicInfo(
         &Builder, XElemTy, LoadSize * 8, LoadSize * 8, OldVal->getAlign(),
         OldVal->getAlign(), true /* UseLibcall */, AllocaIP, X.Var);
@@ -9384,9 +9383,8 @@ OpenMPIRBuilder::createAtomicWrite(const LocationDescription &Loc,
     XSt->setAtomic(AO);
   } else if (XElemTy->isStructTy()) {
     LoadInst *OldVal = Builder.CreateLoad(XElemTy, X.Var, "omp.atomic.read");
-    const DataLayout &LoadDL = OldVal->getModule()->getDataLayout();
-    unsigned LoadSize =
-        LoadDL.getTypeStoreSize(OldVal->getPointerOperand()->getType());
+    const DataLayout &DL = OldVal->getModule()->getDataLayout();
+    unsigned LoadSize = DL.getTypeStoreSize(XElemTy);
     OpenMPIRBuilder::AtomicInfo atomicInfo(
         &Builder, XElemTy, LoadSize * 8, LoadSize * 8, OldVal->getAlign(),
         OldVal->getAlign(), true /* UseLibcall */, AllocaIP, X.Var);
 | 
e536d92    to
    b0b04fd      
    Compare
  
    b0b04fd    to
    e820cd5      
    Compare
  
    | @kparzysz or @tblah Can one of you please review this fix for #165184? The issue was that atomic operations on  The fix changes the size calculation in  Thanks in advance for your time! CI all green | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| In the actual commit, please only put a summary of the PR, for example 
 | 
Fixes llvm#165184 In OMPIRBuilder::createAtomicRead() and createAtomicWrite(), the size parameter for __atomic_load/__atomic_store was incorrectly computed from the pointer type instead of the pointee (element) type. On 64-bit systems, this resulted in only 8 bytes being transferred regardless of the actual struct size. Changed both functions to use XElemTy (element type) instead of the pointer type when computing LoadSize. This ensures the full struct is transferred.
e820cd5    to
    5524ecd      
    Compare
  
    | If the fix is in OpenMPIRBuilder, it is probably best to have the test also close to it. | 
| @kparzysz Commit message updated and squashed per review. CI all green!! @kiranchandramohan Re: colocating tests with the fix in OpenMPIRBuilder, plan is: 
 I can keep the Flang IR tests as frontend coverage, or drop them if you prefer only the two tests above. Please advise. @kparzysz Does this test plan align with your expectations? and are we on the same page? | 
| You can just add the additional tests to this PR. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the bug fix!
| ✅ With the latest revision this PR passed the C/C++ code formatter. | 
OpenMP atomic operations on complex(8) were using pointer size (8 bytes) instead of the actual element size (16 bytes), causing only 8 bytes to be stored/loaded instead of the full 16 bytes. Fixed by using DL.getTypeStoreSize(XElemTy) instead of pointer type size in createAtomicRead and createAtomicWrite for both atomic operations. Added Flang IR tests to verify correct size generation for complex(4) and complex(8) atomic operations, and unit test to verify struct types with __atomic_store/__atomic_load. Fixes llvm#165184
1b45069    to
    81e5343      
    Compare
  
    | @kiranchandramohan , @kparzysz - I've addressed the feedback and made the requested changes. The PR now has: Unit test for atomic operations on struct types (test  
 The commit history has been squashed into a single clean commit with a descriptive message. Please review when you get a chance. CI is all green, Good to Merge!! | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Fixes llvm#165184 In OMPIRBuilder::createAtomicRead() and createAtomicWrite(), the size parameter for __atomic_load/__atomic_store was incorrectly computed from the pointer type instead of the pointee (element) type. On 64-bit systems, this resulted in only 8 bytes being transferred regardless of the actual struct size. Changed both functions to use XElemTy (element type) instead of the pointer type when computing LoadSize. This ensures the full struct is transferred.
Fixes #165184
Problem
When using OpenMP atomic read/write on Fortran
complex(8)(16 bytes) orcomplex(16)(32 bytes), only the real part was being stored/loaded. The imaginary part remained unchanged (typically 0).Root Cause
In
OMPIRBuilder::createAtomicRead()andcreateAtomicWrite(), the size parameter for__atomic_load/__atomic_storewas incorrectly computed from the pointer type instead of the pointee (element) type.On 64-bit systems, this resulted in only 8 bytes being transferred regardless of the actual struct size.
Solution
Changed both functions to use
XElemTy(element type) instead of the pointer type when computingLoadSize. This ensures the full struct is transferred:complex(4): 8 bytes ✓complex(8): 16 bytes ✓ (was 8 before)complex(16): 32 bytes ✓ (was 8 before)Tests
Added three regression tests:
atomic-write-complex.f90- Verifies IR for atomic writeatomic-read-complex.f90- Verifies IR for atomic readissue-165184-atomic-complex8.f90- Direct reproducer from the issue