-
Notifications
You must be signed in to change notification settings - Fork 124
Enable ThreadSanitizer across the entire multi-threaded JIT pipeline #618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmarks
| Benchmark suite | Current: 061612c | Previous: 2b643f0 | Ratio |
|---|---|---|---|
Dhrystone |
1246 Average DMIPS over 10 runs |
1304 Average DMIPS over 10 runs |
1.05 |
Coremark |
917.55 Average iterations/sec over 10 runs |
914.776 Average iterations/sec over 10 runs |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
13b3ac5 to
8234e97
Compare
2497867 to
2b06e85
Compare
ThreadSanitizer (TSAN) can now detect race conditions across the entire multi-threaded JIT pipeline with full 4GB address space emulation. This enables testing of the tier-2 LLVM compilation thread while maintaining production memory layout. Memory Layout (TSAN-compatible): - Main memory: MAP_FIXED at 0x7d0000000000 (4GB) - JIT buffer: MAP_FIXED at 0x7d1000000000 - Both allocations within TSAN app range (0x7cf-0x7ff trillion) - Prevents conflicts with TSAN shadow memory (0x02a-0x7ce trillion) ASLR Mitigation: - Added setarch -R wrapper for TSAN test execution - Disables ASLR to prevent random allocations in shadow memory - Only affects test runs, not production builds SDL Conflict Resolution: - SDL (uninstrumented system library) creates threads TSAN cannot track - Disabled SDL when TSAN enabled to focus on built-in race detection - Production builds still fully support SDL
This commit adds ThreadSanitizer (TSAN) support for ARM64/Apple Silicon and fixes critical JIT instruction cache coherency issues. ARM64 TSAN Support: - Extended TSAN-compatible memory allocation to ARM64 architecture - Main memory allocated at fixed address 0x150000000000 (21TB) - JIT buffer allocated at 0x151000000000 with MAP_JIT for Apple Silicon - Both allocations avoid TSAN shadow memory and enable race detection - Note: Requires ASLR disabled on macOS (SIP restrictions may apply) JIT Cache Coherency Fixes: 1. Fixed pthread_jit_write_protect_np() ordering in update_branch_imm 2. Added sys_icache_invalidate() in update_branch_imm 3. Added cache invalidation in resolve_jumps() for x86_64 Fix JIT regalloc conflicts in memory load After reset_reg() clears the register allocator state, load instructions (lb/lh/lw/lbu/lhu) could reallocate the same host register for both the address and destination, causing data corruption. This commit uses map_vm_reg_reserved() to prevent reusing the address register.
b7da692 to
13ecaff
Compare
This commit introduces a comprehensive JIT debugging infrastructure to catch register allocation conflicts and cache coherency issues before they cause subtle runtime failures in production.
User-space emulation tests were failing because ENABLE_ELF_LOADER defaulted to 0, preventing ELF file loading. The fix automatically enables ELF_LOADER when SYSTEM=0, as user-space mode always requires it to load test ELF files.
75f6782 to
7711e50
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 issues found across 13 files
Prompt for AI agents (all 3 issues)
Understand the root cause of the following 3 issues and fix them.
<file name=".github/workflows/main.yml">
<violation number="1" location=".github/workflows/main.yml:308">
Piping `apt update` through `tee` hides the command’s non-zero exit status, so cases like signature failures or apt lock conflicts skip retries and we continue with a broken package index. Please preserve the real exit code (e.g., enable `set -o pipefail` or capture `${PIPESTATUS[0]}` after the pipeline) before deciding to break.</violation>
<violation number="2" location=".github/workflows/main.yml:310">
The failure filter ignores `InRelease`, so an `apt update` error like `Failed to fetch … InRelease` is treated as success and we skip the remaining retries. Please include `InRelease` in the pattern so we keep retrying on those critical errors.</violation>
</file>
<file name="src/main.c">
<violation number="1" location="src/main.c:31">
The new TSAN options block is guarded by __SANITIZE_THREAD__ only, so clang TSAN builds skip __tsan_default_options entirely and never apply the intended runtime configuration.</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 13 files
Prompt for AI agents (all 2 issues)
Understand the root cause of the following 2 issues and fix them.
<file name="Makefile">
<violation number="1" location="Makefile:109">
`setarch` is Linux-only; with TSAN enabled on macOS the test recipe expands to `setarch $(uname -m) -R …` and fails because the binary is missing. Please gate the wrapper by host OS or provide a macOS-safe alternative so TSAN checks remain usable cross-platform.</violation>
</file>
<file name=".github/workflows/main.yml">
<violation number="1" location=".github/workflows/main.yml:310">
`apt update` reports release failures as `Failed to fetch …/InRelease`, but this regex only looks for Packages/Sources/Release. As a result the retry loop treats an `InRelease` failure as success and never retries, undermining the fix and leaving the job without refreshed indices. Please include `InRelease` in the match so real release failures still trigger retries.</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
| override ENABLE_LTO := 0 # LTO interferes with TSAN instrumentation | ||
| CFLAGS += -DTSAN_ENABLED # Signal code to use TSAN-compatible allocations | ||
| # Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory | ||
| BIN_WRAPPER = setarch $(shell uname -m) -R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setarch is Linux-only; with TSAN enabled on macOS the test recipe expands to setarch $(uname -m) -R … and fails because the binary is missing. Please gate the wrapper by host OS or provide a macOS-safe alternative so TSAN checks remain usable cross-platform.
Prompt for AI agents
Address the following comment on Makefile at line 109:
<comment>`setarch` is Linux-only; with TSAN enabled on macOS the test recipe expands to `setarch $(uname -m) -R …` and fails because the binary is missing. Please gate the wrapper by host OS or provide a macOS-safe alternative so TSAN checks remain usable cross-platform.</comment>
<file context>
@@ -80,6 +73,44 @@ endif
+override ENABLE_LTO := 0 # LTO interferes with TSAN instrumentation
+CFLAGS += -DTSAN_ENABLED # Signal code to use TSAN-compatible allocations
+# Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory
+BIN_WRAPPER = setarch $(shell uname -m) -R
+else
+BIN_WRAPPER =
</file context>
|
@cubic-dev-ai The review finding about MAP_JIT is incorrect. Verification: The code at src/jit.c:2473-2477 (x86_64 TSAN path) already includes MAP_JIT: state->buf = mmap(jit_addr, size, PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED
#if defined(__APPLE__)
| MAP_JIT // ← Present on line 2476
#endif
,
-1, 0);All three mmap paths include MAP_JIT on macOS:
The MAP_JIT flag was added in commit f1b685e ("Add Arm64 TSAN support and fix JIT cache coherency") and is present in the current PR. Please re-scan or verify against the actual code at commit 7711e50. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 13 files
Prompt for AI agents (all 2 issues)
Understand the root cause of the following 2 issues and fix them.
<file name="src/main.c">
<violation number="1" location="src/main.c:307">
Raising the default log level to LOG_INFO suppresses the rv_log_trace() stream that powers the -t instruction trace (rv_log_set_level(attr->log_level) now filters those messages), so trace mode stops emitting output.</violation>
</file>
<file name="Makefile">
<violation number="1" location="Makefile:109">
TSAN runs now always prepend `setarch`, but macOS does not ship this binary; enabling TSAN on mac hosts will make `make check` fail before the emulator starts. Please gate the wrapper to Linux or gracefully skip it when `setarch` is unavailable.</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
| override ENABLE_LTO := 0 # LTO interferes with TSAN instrumentation | ||
| CFLAGS += -DTSAN_ENABLED # Signal code to use TSAN-compatible allocations | ||
| # Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory | ||
| BIN_WRAPPER = setarch $(shell uname -m) -R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TSAN runs now always prepend setarch, but macOS does not ship this binary; enabling TSAN on mac hosts will make make check fail before the emulator starts. Please gate the wrapper to Linux or gracefully skip it when setarch is unavailable.
Prompt for AI agents
Address the following comment on Makefile at line 109:
<comment>TSAN runs now always prepend `setarch`, but macOS does not ship this binary; enabling TSAN on mac hosts will make `make check` fail before the emulator starts. Please gate the wrapper to Linux or gracefully skip it when `setarch` is unavailable.</comment>
<file context>
@@ -80,6 +73,44 @@ endif
+override ENABLE_LTO := 0 # LTO interferes with TSAN instrumentation
+CFLAGS += -DTSAN_ENABLED # Signal code to use TSAN-compatible allocations
+# Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory
+BIN_WRAPPER = setarch $(shell uname -m) -R
+else
+BIN_WRAPPER =
</file context>
| BIN_WRAPPER = setarch $(shell uname -m) -R | |
| BIN_WRAPPER = $(if $(filter Linux,$(UNAME_S)),setarch $(shell uname -m) -R,) |
fad6d02 to
669efc1
Compare
This commit addresses multiple regressions introduced in recent changes: 1. DTB compilation regression - DTB dependencies moved outside CC_IS_EMCC conditional - Ensures DTB builds for system mode regardless of compiler - Fixes mk/wasm.mk structure for cross-platform consistency 2. Makefile syntax error in mk/toolchain.mk - Fixed TAB characters before $(warning) on lines 25, 28 - Changed to spaces for proper control flow - This was blocking all Makefile parsing 3. emcc configuration pollution - Added 'make distclean' before emcc builds in workflow - Prevents ENABLE_SYSTEM=1 from leaking between builds - Fixes "build/minimal.dtb does not exist" errors 4. Ubuntu ARM64 apt-get failures - Implemented exponential backoff retry mechanism (30s, 60s delays) - Added pipefail to preserve apt exit codes through tee - Explicit APT_EXIT capture to detect masked failures - Added InRelease to failure pattern (modern combined Release+GPG) - Ignore non-critical dep11 metadata failures - Focus on core package indices (Packages/Sources/Release/InRelease) 5. TSAN cross-compiler compatibility (fixed __has_feature issue) - Changed from defined(__has_feature) to defined(__clang__) - GCC doesn't support __has_feature, causing preprocessor errors - __has_feature only works when __clang__ is defined - Ensures __tsan_default_options() works with both GCC and clang 6. TSAN cross-platform compatibility - Guarded setarch with ifeq ($(UNAME_S),Linux) in Makefile - setarch doesn't exist on macOS, now conditionally applied - macOS TSAN builds require SIP disabled for ASLR control 7. Trace functionality regression - Reverted .log_level from LOG_INFO back to LOG_TRACE - LOG_INFO suppressed rv_log_trace() stream used by -t flag - Restores instruction trace output for debugging
669efc1 to
d1beb5a
Compare
Add comprehensive ThreadSanitizer output validation to prevent silent test failures. TSAN tests now explicitly check for data races and fail immediately with diagnostic output instead of masking errors. Implementation: - Capture TSAN stderr/stdout to log files for analysis - Pattern match race indicators: "ThreadSanitizer: data race", "ThreadSanitizer: race on", "WARNING: ThreadSanitizer:" - Exit with status 1 immediately upon race detection - Display race context (10 lines) for debugging - Progress indicators for 3-tier validation (Interpreter, JIT-T1, JIT-T2) Platform-Specific Handling: - Linux (x64/ARM64): Use setarch -R for ASLR mitigation (already gated in Makefile with ifeq ($(UNAME_S),Linux)) - macOS: NO setarch (not available), rely on MAP_FIXED allocations at 0x150000000000; gracefully handle SIP restrictions by distinguishing MAP_FAILED errors from actual race conditions Race Detection Patterns: ThreadSanitizer: data race # Standard race report ThreadSanitizer: race on # Race on specific object WARNING: ThreadSanitizer: # General TSAN warnings Error Handling (macOS): MAP_FAILED # mmap failure unexpected memory mapping # TSAN shadow conflict FATAL: ThreadSanitizer # Initialization failure → Skip test with warning (SIP restriction) → Still fail hard on actual races Benefits: - Immediate failure on race detection (fail-fast principle) - Clear diagnostic output in CI logs with race location/context - Platform-aware: Linux uses setarch -R, macOS handles SIP gracefully - No silent failures: Previously masked errors now cause test failure - Debugging support: Log files preserved for post-mortem analysis Validates race condition fixes from: - 2a2a5c4: TSAN with FULL4G and T2C support - f1b685e: ARM64 TSAN support and JIT cache coherency - 669efc1: Build system regressions (setarch gating, TSAN compatibility)
9dc456d to
49cd220
Compare
ARM64 CI builds fail when apt update returns exit code 100 due to dep11 metadata size mismatches during Ubuntu mirror synchronization. The fix uses '|| true' to prevent the exit code from failing the action, then manually checks for critical package index failures (Packages/Sources/ Release/InRelease). Non-critical dep11 metadata failures are ignored. Package installation uses apt-get with --fix-missing for robustness. Also fixes stale build configuration bug where 'make clean' doesn't regenerate build/.config, causing JIT extension tests to fail with 'map_file() (null) failed' errors. Changed to 'make distclean' for proper configuration reset between test runs.
2b643f0 to
061612c
Compare
Enable ThreadSanitizer across the entire multi-threaded JIT pipeline with FULL4G emulation support and comprehensive race detection.
Key Improvements
ThreadSanitizer Integration
JIT Performance & Stability
Race Condition Fixes
Build & CI
Validation
Performance optimization validated through systematic analysis and external AI review. See claudedocs/PR618-Multi-Model-Review-Summary.md for details.