Skip to content

Conversation

@int3
Copy link
Contributor

@int3 int3 commented Oct 27, 2025

Since equalsVariable runs a lot more times, we want to minimize the work it
needs to do. Anything not dependent on the icfEqClass values should get hoisted
out.

With this change, ICF runs ~1.7% faster when linking clang.

Benchmarking approach:

    cbdr sample -b ~/extract-icf-time.sh ~/old/ld64.lld bin/ld64.lld --timeout=300s | cbdr analyze -s 95

extract-icf-time.sh runs the clang link command with the --icf=all --time-trace flags, then parses out the ICF duration from the resulting time
trace using jq:

    jq '{ICF: (.traceEvents[] | select(.name == "Fold Identical Code Sections") | .dur)}'

Output:

             </Users/jezng/extract-icf-time.sh ["/Users/jezng/old/ld64.lld"]>  </Users/jezng/extract-icf-time.sh ["bin/ld64.lld"]>  difference (95% CI)
    ICF      83678.207 ± 1502.778                                              82234.751 ± 1290.984                                 [  -2.0% ..   -1.4%]
    samples  208                                                               225

@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-lld

Author: Jez Ng (int3)

Changes

Since equalsVariable runs a lot more times, we want to minimize the work it
needs to do. Anything not dependent on the icfEqClass values should get hoisted
out.

With this change, ICF runs ~1.7% faster when linking clang.

Benchmarking approach:

    cbdr sample -b ~/extract-icf-time.sh ~/old/ld64.lld bin/ld64.lld --timeout=300s | cbdr analyze -s 95

extract-icf-time.sh runs the clang link command with the --icf=all --time-trace flags, then parses out the ICF duration from the resulting time
trace using jq:

    jq '{ICF: (.traceEvents[] | select(.name == "Fold Identical Code Sections") | .dur)}'

Output:

             &lt;/Users/jezng/extract-icf-time.sh ["/Users/jezng/old/ld64.lld"]&gt;  &lt;/Users/jezng/extract-icf-time.sh ["bin/ld64.lld"]&gt;  difference (95% CI)
    ICF      83678.207 ± 1502.778                                              82234.751 ± 1290.984                                 [  -2.0% ..   -1.4%]
    samples  208                                                               225

Full diff: https://github.com/llvm/llvm-project/pull/165325.diff

1 Files Affected:

  • (modified) lld/MachO/ICF.cpp (+34-22)
diff --git a/lld/MachO/ICF.cpp b/lld/MachO/ICF.cpp
index 7b31378c3781e..aa72e0d6b15d6 100644
--- a/lld/MachO/ICF.cpp
+++ b/lld/MachO/ICF.cpp
@@ -179,8 +179,30 @@ bool ICF::equalsConstant(const ConcatInputSection *ia,
       return isecA->getOffset(ra.addend) == isecB->getOffset(rb.addend);
     }
   };
-  return std::equal(ia->relocs.begin(), ia->relocs.end(), ib->relocs.begin(),
-                    f);
+  if (!std::equal(ia->relocs.begin(), ia->relocs.end(), ib->relocs.begin(), f))
+    return false;
+
+  // Check unwind info structural compatibility: if there are symbols with
+  // associated unwind info, check that both sections have compatible symbol
+  // layouts. For simplicity, we only attempt folding when all symbols are at
+  // offset zero within the section (which is typically the case with
+  // .subsections_via_symbols.)
+  auto hasUnwind = [](Defined *d) { return d->unwindEntry() != nullptr; };
+  const auto *itA = llvm::find_if(ia->symbols, hasUnwind);
+  const auto *itB = llvm::find_if(ib->symbols, hasUnwind);
+  if (itA == ia->symbols.end())
+    return itB == ib->symbols.end();
+  if (itB == ib->symbols.end())
+    return false;
+  const Defined *da = *itA;
+  const Defined *db = *itB;
+  if (da->value != 0 || db->value != 0)
+    return false;
+  auto isZero = [](Defined *d) { return d->value == 0; };
+  return std::find_if_not(std::next(itA), ia->symbols.end(), isZero) ==
+             ia->symbols.end() &&
+         std::find_if_not(std::next(itB), ib->symbols.end(), isZero) ==
+             ib->symbols.end();
 }
 
 // Compare the "moving" parts of two ConcatInputSections -- i.e. everything not
@@ -220,28 +242,18 @@ bool ICF::equalsVariable(const ConcatInputSection *ia,
   if (!std::equal(ia->relocs.begin(), ia->relocs.end(), ib->relocs.begin(), f))
     return false;
 
-  // If there are symbols with associated unwind info, check that the unwind
-  // info matches. For simplicity, we only handle the case where there are only
-  // symbols at offset zero within the section (which is typically the case with
-  // .subsections_via_symbols.)
+  // Compare unwind info equivalence classes.
   auto hasUnwind = [](Defined *d) { return d->unwindEntry() != nullptr; };
   const auto *itA = llvm::find_if(ia->symbols, hasUnwind);
-  const auto *itB = llvm::find_if(ib->symbols, hasUnwind);
-  if (itA == ia->symbols.end())
-    return itB == ib->symbols.end();
-  if (itB == ib->symbols.end())
-    return false;
-  const Defined *da = *itA;
-  const Defined *db = *itB;
-  if (da->unwindEntry()->icfEqClass[icfPass % 2] !=
-          db->unwindEntry()->icfEqClass[icfPass % 2] ||
-      da->value != 0 || db->value != 0)
-    return false;
-  auto isZero = [](Defined *d) { return d->value == 0; };
-  return std::find_if_not(std::next(itA), ia->symbols.end(), isZero) ==
-             ia->symbols.end() &&
-         std::find_if_not(std::next(itB), ib->symbols.end(), isZero) ==
-             ib->symbols.end();
+  if (itA != ia->symbols.end()) {
+    const Defined *da = *itA;
+    // equalsConstant() guarantees that both sections have unwind info.
+    const Defined *db = *llvm::find_if(ib->symbols, hasUnwind);
+    if (da->unwindEntry()->icfEqClass[icfPass % 2] !=
+        db->unwindEntry()->icfEqClass[icfPass % 2])
+      return false;
+  }
+  return true;
 }
 
 // Find the first InputSection after BEGIN whose equivalence class differs

@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-lld-macho

Author: Jez Ng (int3)

Changes

Since equalsVariable runs a lot more times, we want to minimize the work it
needs to do. Anything not dependent on the icfEqClass values should get hoisted
out.

With this change, ICF runs ~1.7% faster when linking clang.

Benchmarking approach:

    cbdr sample -b ~/extract-icf-time.sh ~/old/ld64.lld bin/ld64.lld --timeout=300s | cbdr analyze -s 95

extract-icf-time.sh runs the clang link command with the --icf=all --time-trace flags, then parses out the ICF duration from the resulting time
trace using jq:

    jq '{ICF: (.traceEvents[] | select(.name == "Fold Identical Code Sections") | .dur)}'

Output:

             &lt;/Users/jezng/extract-icf-time.sh ["/Users/jezng/old/ld64.lld"]&gt;  &lt;/Users/jezng/extract-icf-time.sh ["bin/ld64.lld"]&gt;  difference (95% CI)
    ICF      83678.207 ± 1502.778                                              82234.751 ± 1290.984                                 [  -2.0% ..   -1.4%]
    samples  208                                                               225

Full diff: https://github.com/llvm/llvm-project/pull/165325.diff

1 Files Affected:

  • (modified) lld/MachO/ICF.cpp (+34-22)
diff --git a/lld/MachO/ICF.cpp b/lld/MachO/ICF.cpp
index 7b31378c3781e..aa72e0d6b15d6 100644
--- a/lld/MachO/ICF.cpp
+++ b/lld/MachO/ICF.cpp
@@ -179,8 +179,30 @@ bool ICF::equalsConstant(const ConcatInputSection *ia,
       return isecA->getOffset(ra.addend) == isecB->getOffset(rb.addend);
     }
   };
-  return std::equal(ia->relocs.begin(), ia->relocs.end(), ib->relocs.begin(),
-                    f);
+  if (!std::equal(ia->relocs.begin(), ia->relocs.end(), ib->relocs.begin(), f))
+    return false;
+
+  // Check unwind info structural compatibility: if there are symbols with
+  // associated unwind info, check that both sections have compatible symbol
+  // layouts. For simplicity, we only attempt folding when all symbols are at
+  // offset zero within the section (which is typically the case with
+  // .subsections_via_symbols.)
+  auto hasUnwind = [](Defined *d) { return d->unwindEntry() != nullptr; };
+  const auto *itA = llvm::find_if(ia->symbols, hasUnwind);
+  const auto *itB = llvm::find_if(ib->symbols, hasUnwind);
+  if (itA == ia->symbols.end())
+    return itB == ib->symbols.end();
+  if (itB == ib->symbols.end())
+    return false;
+  const Defined *da = *itA;
+  const Defined *db = *itB;
+  if (da->value != 0 || db->value != 0)
+    return false;
+  auto isZero = [](Defined *d) { return d->value == 0; };
+  return std::find_if_not(std::next(itA), ia->symbols.end(), isZero) ==
+             ia->symbols.end() &&
+         std::find_if_not(std::next(itB), ib->symbols.end(), isZero) ==
+             ib->symbols.end();
 }
 
 // Compare the "moving" parts of two ConcatInputSections -- i.e. everything not
@@ -220,28 +242,18 @@ bool ICF::equalsVariable(const ConcatInputSection *ia,
   if (!std::equal(ia->relocs.begin(), ia->relocs.end(), ib->relocs.begin(), f))
     return false;
 
-  // If there are symbols with associated unwind info, check that the unwind
-  // info matches. For simplicity, we only handle the case where there are only
-  // symbols at offset zero within the section (which is typically the case with
-  // .subsections_via_symbols.)
+  // Compare unwind info equivalence classes.
   auto hasUnwind = [](Defined *d) { return d->unwindEntry() != nullptr; };
   const auto *itA = llvm::find_if(ia->symbols, hasUnwind);
-  const auto *itB = llvm::find_if(ib->symbols, hasUnwind);
-  if (itA == ia->symbols.end())
-    return itB == ib->symbols.end();
-  if (itB == ib->symbols.end())
-    return false;
-  const Defined *da = *itA;
-  const Defined *db = *itB;
-  if (da->unwindEntry()->icfEqClass[icfPass % 2] !=
-          db->unwindEntry()->icfEqClass[icfPass % 2] ||
-      da->value != 0 || db->value != 0)
-    return false;
-  auto isZero = [](Defined *d) { return d->value == 0; };
-  return std::find_if_not(std::next(itA), ia->symbols.end(), isZero) ==
-             ia->symbols.end() &&
-         std::find_if_not(std::next(itB), ib->symbols.end(), isZero) ==
-             ib->symbols.end();
+  if (itA != ia->symbols.end()) {
+    const Defined *da = *itA;
+    // equalsConstant() guarantees that both sections have unwind info.
+    const Defined *db = *llvm::find_if(ib->symbols, hasUnwind);
+    if (da->unwindEntry()->icfEqClass[icfPass % 2] !=
+        db->unwindEntry()->icfEqClass[icfPass % 2])
+      return false;
+  }
+  return true;
 }
 
 // Find the first InputSection after BEGIN whose equivalence class differs

@int3 int3 requested review from alx32, ellishg and nico October 27, 2025 22:08
Comment on lines +202 to +205
return std::find_if_not(std::next(itA), ia->symbols.end(), isZero) ==
ia->symbols.end() &&
std::find_if_not(std::next(itB), ib->symbols.end(), isZero) ==
ib->symbols.end();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't check if all symbols have offset zero, it only checks if all symbols after the first one with unwind info has offset zero. Is that what we want?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my memory is a bit hazy here, but this is preserving the logic of the existing code :)

but I think it works because symbols is sorted in ascending order of offset, and since we're checking that the first one with unwind info has offset zero, this subsequent check ensures all symbols in the section have offset zero too.

Since equalsVariable runs a lot more times, we want to minimize the work it
needs to do. Anything not dependent on the icfEqClass values should get hoisted
out.

With this change, ICF runs ~1.7% faster when linking clang.

Benchmarking approach:

	cbdr sample -b ~/extract-icf-time.sh ~/old/ld64.lld bin/ld64.lld --timeout=300s | cbdr analyze -s 95

`extract-icf-time.sh` runs the clang link command with the `--icf=all
--time-trace` flags, then parses out the ICF duration from the resulting time
trace using `jq`:

	jq '{ICF: (.traceEvents[] | select(.name == "Fold Identical Code Sections") | .dur)}'

Output:
		 </Users/jezng/extract-icf-time.sh ["/Users/jezng/old/ld64.lld"]>  </Users/jezng/extract-icf-time.sh ["bin/ld64.lld"]>  difference (95% CI)
	ICF      83678.207 ± 1502.778                                              82234.751 ± 1290.984                                 [  -2.0% ..   -1.4%]
	samples  208                                                               225
@int3
Copy link
Contributor Author

int3 commented Oct 27, 2025

hmm, looks like the aarch64 test runner is not responding. the other tests seem fine though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants