Skip to content

Conversation

@markovejnovic
Copy link
Collaborator

What does this PR do?

How did you verify your code works?

@linear
Copy link

linear bot commented Oct 28, 2025

@robobun
Copy link
Collaborator

robobun commented Oct 28, 2025

Updated 10:23 AM PT - Oct 28th, 2025

@markovejnovic, your commit ae765a0 is building: #30429

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 28, 2025

Walkthrough

Adds a hosted-git-info Zig implementation, integrates it into dependency parsing, introduces URL fragment accessors, several string/path utilities, and extensive tests for hosted-git-info parsing and parsing helpers.

Changes

Cohort / File(s) Summary
URL Fragment Accessors
src/bun.js/bindings/BunString.cpp, src/bun.js/bindings/URL.zig
Added two C externs URL__hash and URL__fragmentIdentifier and Zig externs plus public wrapper methods hash and fragmentIdentifier to return URL fragment with and without leading #.
Hosted Git Info Implementation
src/install/hosted_git_info.zig
New module HostedGitInfo with parsing, normalization, provider configs, representation formatting, JS testing APIs, and memory/error handling for Git URL/shortcut support across multiple providers.
Dependency Resolution Updates
src/install/dependency.zig
Removed isGitHubRepoPath; switched to hosted_git_info detection (isGitHubShorthand / fromUrl) across parse/infer paths, added allocator-backed slow-paths and bridging logic for non-GitHub hosts.
String Utilities & Buffers
src/string/immutable.zig, src/string/immutable/paths.zig
Added helpers: lastIndexBeforeChar, charAtT, substring, countLeadingChar, case-insensitive endsWith variants, SlicedBuffer type; removed toWDirNormalized; updated concat signatures to bun.OOM.
Path Separator Detection
src/resolver/resolve_path.zig
Added pub fn hasPlatformPathSeparators(input_path: []const u8) bool to detect platform-specific path separators.
Comptime Map Helper
src/comptime_string_map.zig
Added pub fn getKey(value: V) ?[]const KeyType to lookup a key by value via linear search.
JS Testing Exports
src/js/internal-for-testing.ts
Exposed hostedGitInfo testing object with parseUrl and fromUrl Zig-backed functions.
Tests / Fixtures
test/cli/install/hosted-git-info/cases.ts, test/cli/install/hosted-git-info/from-url.test.ts, test/cli/install/hosted-git-info/parse-url.test.ts
Added comprehensive valid/invalid URL datasets and tests asserting parseUrl/fromUrl behavior across providers and formats.

Possibly related PRs

  • fix(npa): Cleanup NPA code #23358: Adds the same URL fragment accessor functions (URL__hash, URL__fragmentIdentifier) and corresponding Zig bindings, indicating a direct code-level overlap.

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description contains only the template section headers "What does this PR do?" and "How did you verify your code works?" without any substantive content. Both required sections remain empty placeholders, failing to explain what changes were made or how the implementation was verified. The template structure is present but unfilled, making the description incomplete and providing no meaningful information about the PR. The author should fill in both sections of the description. The "What does this PR do?" section should explain that this implements a Zig clone of the npm hosted-git-info library, including URL parsing, normalization, and multi-provider support. The "How did you verify your code works?" section should describe the testing approach, such as the test files added (parse-url.test.ts, from-url.test.ts) and any manual verification performed.
✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title "feat(ENG-21324): Implement hosted_git_info.zig" is specific and concise, clearly identifying the main change without noise or vague terminology. It directly corresponds to the primary objective outlined in the linked issue ENG-21324, which is to implement the hosted_git_info.zig library. The title accurately reflects that this is a feature addition for implementing this specific module.
Linked Issues Check ✅ Passed The PR successfully implements the primary objective from ENG-21324: creating a Zig library that clones the npm hosted-git-info package. The new src/install/hosted_git_info.zig file provides comprehensive functionality including HostedGitInfo struct, fromUrl parsing, toString representation, URL normalization, and multi-provider support (github, bitbucket, gitlab, gist, sourcehut). Supporting changes integrate this library into the codebase (dependency.zig refactoring, testing APIs in internal-for-testing.ts) and include test cases validating the parsing logic.
Out of Scope Changes Check ✅ Passed The PR includes supporting infrastructure changes beyond the core hosted_git_info.zig implementation: URL binding additions (BunString.cpp, URL.zig), string utility functions (immutable.zig), path resolution utilities (resolve_path.zig), and data structure enhancements (comptime_string_map.zig). These appear to be foundational utilities needed to support URL parsing, normalization, and string manipulation within the hosted_git_info implementation, representing necessary supporting infrastructure rather than unrelated scope creep.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3f7668a and de14245.

📒 Files selected for processing (1)
  • src/install/dependency.zig (10 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.zig: Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Wrap the Bun____toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

Files:

  • src/install/dependency.zig
src/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/building-bun.mdc)

When adding debug logs in Zig, create a scoped logger and log via Bun APIs: const log = bun.Output.scoped(.${SCOPE}, .hidden); then log("...", .{})

src/**/*.zig: Use Zig private fields with the # prefix for encapsulation (e.g., struct { #foo: u32 })
Prefer Decl literals for initialization (e.g., const decl: Decl = .{ .binding = 0, .value = 0 };)
Place @import statements at the bottom of the file (formatter will handle ordering)

src/**/*.zig: In Zig code, manage memory carefully: use appropriate allocators and defer for cleanup
Cache JavaScriptCore class structures in ZigGlobalObject when adding new classes

Files:

  • src/install/dependency.zig
🔇 Additional comments (2)
src/install/dependency.zig (2)

820-828: JS bridge: conversion path looks good

Converts JSValue -> BunString -> UTF‑8 and defers cleanup, then calls Tag.infer. Solid and safe.


1505-1505: Import placement matches project guidance

@import("./hosted_git_info.zig") is at the bottom with other imports, aligning with src/**/*.zig guideline.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 668eba0 and 3f7668a.

📒 Files selected for processing (12)
  • src/bun.js/bindings/BunString.cpp (1 hunks)
  • src/bun.js/bindings/URL.zig (1 hunks)
  • src/comptime_string_map.zig (1 hunks)
  • src/install/dependency.zig (10 hunks)
  • src/install/hosted_git_info.zig (1 hunks)
  • src/js/internal-for-testing.ts (1 hunks)
  • src/resolver/resolve_path.zig (1 hunks)
  • src/string/immutable.zig (10 hunks)
  • src/string/immutable/paths.zig (0 hunks)
  • test/cli/install/hosted-git-info/cases.ts (1 hunks)
  • test/cli/install/hosted-git-info/from-url.test.ts (1 hunks)
  • test/cli/install/hosted-git-info/parse-url.test.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • src/string/immutable/paths.zig
🧰 Additional context used
📓 Path-based instructions (13)
**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.zig: Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Wrap the Bun____toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

Files:

  • src/comptime_string_map.zig
  • src/bun.js/bindings/URL.zig
  • src/resolver/resolve_path.zig
  • src/install/dependency.zig
  • src/install/hosted_git_info.zig
  • src/string/immutable.zig
src/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/building-bun.mdc)

When adding debug logs in Zig, create a scoped logger and log via Bun APIs: const log = bun.Output.scoped(.${SCOPE}, .hidden); then log("...", .{})

src/**/*.zig: Use Zig private fields with the # prefix for encapsulation (e.g., struct { #foo: u32 })
Prefer Decl literals for initialization (e.g., const decl: Decl = .{ .binding = 0, .value = 0 };)
Place @import statements at the bottom of the file (formatter will handle ordering)

src/**/*.zig: In Zig code, manage memory carefully: use appropriate allocators and defer for cleanup
Cache JavaScriptCore class structures in ZigGlobalObject when adding new classes

Files:

  • src/comptime_string_map.zig
  • src/bun.js/bindings/URL.zig
  • src/resolver/resolve_path.zig
  • src/install/dependency.zig
  • src/install/hosted_git_info.zig
  • src/string/immutable.zig
test/**

📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)

Place all tests under the test/ directory

Files:

  • test/cli/install/hosted-git-info/from-url.test.ts
  • test/cli/install/hosted-git-info/parse-url.test.ts
  • test/cli/install/hosted-git-info/cases.ts
test/cli/**/*.{js,ts}

📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)

test/cli/**/*.{js,ts}: Place CLI command tests (e.g., bun install, bun init) under test/cli/
When testing Bun as a CLI, use spawn with bunExe() and bunEnv from harness, and capture stdout/stderr via pipes

Files:

  • test/cli/install/hosted-git-info/from-url.test.ts
  • test/cli/install/hosted-git-info/parse-url.test.ts
  • test/cli/install/hosted-git-info/cases.ts
test/**/*.{js,ts}

📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)

test/**/*.{js,ts}: Write tests in JavaScript or TypeScript using Bun’s Jest-style APIs (test, describe, expect) and run with bun test
Prefer data-driven tests (e.g., test.each) to reduce boilerplate
Use shared utilities from test/harness.ts where applicable

Files:

  • test/cli/install/hosted-git-info/from-url.test.ts
  • test/cli/install/hosted-git-info/parse-url.test.ts
  • test/cli/install/hosted-git-info/cases.ts
test/**/*.test.{ts,js,jsx,tsx,mjs,cjs}

📄 CodeRabbit inference engine (test/CLAUDE.md)

test/**/*.test.{ts,js,jsx,tsx,mjs,cjs}: Use bun:test for files ending with *.test.{ts,js,jsx,tsx,mjs,cjs}
Prefer concurrent tests (test.concurrent/describe.concurrent) over sequential when feasible
Organize tests with describe blocks to group related tests
Use utilities like describe.each, toMatchSnapshot, and lifecycle hooks (beforeAll, beforeEach, afterEach) and track resources for cleanup

Files:

  • test/cli/install/hosted-git-info/from-url.test.ts
  • test/cli/install/hosted-git-info/parse-url.test.ts
test/**/*.{ts,tsx,js,jsx,mjs,cjs}

📄 CodeRabbit inference engine (test/CLAUDE.md)

For large/repetitive strings, use Buffer.alloc(count, fill).toString() instead of "A".repeat(count)

Files:

  • test/cli/install/hosted-git-info/from-url.test.ts
  • test/cli/install/hosted-git-info/parse-url.test.ts
  • test/cli/install/hosted-git-info/cases.ts
test/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

test/**/*.test.{ts,tsx}: Test files must live under test/ and end with .test.ts or .test.tsx
In tests, always use port: 0; do not hardcode ports or invent random port functions
Prefer snapshot assertions and use normalizeBunSnapshot for snapshot output in tests
Never write tests that assert absence of crashes (e.g., no "panic" or "uncaught exception") in output
Use tempDir from "harness" for temporary directories; do not use tmpdirSync or fs.mkdtempSync in tests
When spawning processes in tests, assert on stdout before asserting exitCode
Do not use setTimeout in tests; await conditions instead to avoid flakiness

Files:

  • test/cli/install/hosted-git-info/from-url.test.ts
  • test/cli/install/hosted-git-info/parse-url.test.ts
test/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Avoid shell commands in tests (e.g., find, grep); use Bun's Glob and built-in tools

Files:

  • test/cli/install/hosted-git-info/from-url.test.ts
  • test/cli/install/hosted-git-info/parse-url.test.ts
  • test/cli/install/hosted-git-info/cases.ts
src/bun.js/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/zig-javascriptcore-classes.mdc)

src/bun.js/**/*.zig: In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS and re-export toJS/fromJS/fromJSDirect
Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions
Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors
Implement getters as get(this, globalObject) returning JSC.JSValue and matching the .classes.ts interface
Provide deinit() for resource cleanup and finalize() that calls deinit(); use bun.destroy(this) or appropriate destroy pattern
Access JS call data via CallFrame (argument(i), argumentCount(), thisValue()) and throw errors with globalObject.throw(...)
For properties marked cache: true, use the generated Zig accessors (NameSetCached/GetCached) to work with GC-owned values
In finalize() for objects holding JS references, release them using .deref() before destroy

Files:

  • src/bun.js/bindings/URL.zig
**/*.{cpp,h}

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.{cpp,h}: When exposing a JS class with public Constructor and Prototype, define three C++ types: class Foo : public JSC::DestructibleObject (if it has C++ fields), class FooPrototype : public JSC::JSNonFinalObject, and class FooConstructor : public JSC::InternalFunction
If the class has C++ data members, inherit from JSC::DestructibleObject and provide proper destruction; if it has no C++ fields (only JS properties), avoid a class and use JSC::constructEmptyObject(vm, structure) with putDirectOffset
Prefer placing the subspaceFor implementation in the .cpp file rather than the header when possible

Files:

  • src/bun.js/bindings/BunString.cpp
**/*.cpp

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.cpp: Include "root.h" at the top of C++ binding files to satisfy lints
Define prototype properties using a const HashTableValue array and declare accessors/functions with JSC_DECLARE_* macros
Prototype classes should subclass JSC::JSNonFinalObject, provide create/createStructure, DECLARE_INFO, finishCreation that reifies static properties, and set mayBePrototype on the Structure
Custom getters should use JSC_DEFINE_CUSTOM_GETTER, jsDynamicCast to validate this, and throwThisTypeError on mismatch
Custom setters should use JSC_DEFINE_CUSTOM_SETTER, validate this via jsDynamicCast, and store via WriteBarrier/set semantics
Prototype functions should use JSC_DEFINE_HOST_FUNCTION, validate this with jsDynamicCast, and return encoded JSValue
Constructors should subclass JSC::InternalFunction, return internalFunctionSpace in subspaceFor, set the prototype property as non-configurable/non-writable, and provide create/createStructure
Provide a setup function that builds the Prototype, Constructor, and Structure, and assigns them to the LazyClassStructure initializer
Use the cached Structure via globalObject->m_.get(globalObject) when constructing instances
Expose constructors to Zig via an extern "C" function that returns the constructor from the LazyClassStructure
Provide an extern "C" Bun____toJS function that creates an instance using the cached Structure and returns an EncodedJSValue

Files:

  • src/bun.js/bindings/BunString.cpp
src/bun.js/bindings/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

src/bun.js/bindings/**/*.cpp: When implementing a JavaScript class with a public constructor in C++, create Foo, FooPrototype, and FooConstructor classes
Define class properties using HashTableValue arrays in C++ bindings
Add iso subspaces for classes with C++ fields in JavaScriptCore bindings

Files:

  • src/bun.js/bindings/BunString.cpp
🧬 Code graph analysis (2)
test/cli/install/hosted-git-info/from-url.test.ts (2)
test/cli/install/hosted-git-info/cases.ts (2)
  • validGitUrls (26-2407)
  • invalidGitUrls (2409-2472)
src/js/internal-for-testing.ts (1)
  • hostedGitInfo (214-217)
test/cli/install/hosted-git-info/parse-url.test.ts (1)
src/js/internal-for-testing.ts (1)
  • hostedGitInfo (214-217)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Format
🔇 Additional comments (11)
src/comptime_string_map.zig (1)

307-312: Let me examine the actual file directly:

Add test coverage for the new getKey method.

This new public API method lacks test coverage. The existing test suite validates forward lookups (get, has, indexOf) but doesn't test the reverse lookup functionality.

Consider adding tests that verify:

  • Finding a key for an existing value
  • Returning null for a non-existent value
  • Behavior when multiple keys map to the same value (returns first in sorted order)
src/string/immutable.zig (7)

417-422: LGTM: lastIndexBeforeChar utility is correct and efficient.


476-481: LGTM: charAtT is bounds-safe and generic.


1028-1036: LGTM: case-insensitive suffix checks reuse existing ASCII-insensitive compare.


1561-1575: LGTM: countLeadingChar is correct and clear.


1815-1818: LGTM: trimSpaces wraps trim with shared whitespace set.


1825-1832: LGTM: containsUppercaseAscii is fine; consider SIMD later.


2084-2097: API change verified: all call sites properly handle bun.OOM error type.

The verification confirms that the concatWithLength and concat functions now correctly return bun.OOM![]u8, and all identified call sites are properly handling the error type:

  • Try-wrapped calls propagate the error to callers (e.g., src/linker.zig:311, src/cli/Arguments.zig:1101/1105/1109)
  • Unreachable assertions are used when allocation is guaranteed not to fail (e.g., src/string/immutable.zig:2131)
  • handleOom wrappers provide explicit error handling (e.g., src/install/hosted_git_info.zig:345)
  • Tail calls correctly propagate the error type (e.g., src/string/immutable.zig:2107)

No call sites ignore or mismatch the new return type.

src/js/internal-for-testing.ts (1)

214-217: LGTM: HostedGitInfo testing hooks correctly exposed.

Verification complete. The symbols TestingAPIs.jsParseUrl and TestingAPIs.jsFromUrl are correctly exported in src/install/hosted_git_info.zig (lines 1737 and 1767 respectively, within the TestingAPIs struct at line 1736). The filename-only reference pattern used here matches all other $newZigFunction calls throughout the file (e.g., patch.zig, shell.zig, socket.zig), confirming the bindings are correct.

src/bun.js/bindings/URL.zig (1)

19-33: URL fragment getters look good

Bindings mirror the C++ externs and match existing getter patterns. No issues found.

src/install/dependency.zig (1)

820-830: JS glue looks fine

Conversion to UTF‑8 and delegation to Tag.infer is straightforward.

Comment on lines +627 to +642
extern "C" BunString URL__hash(WTF::URL* url)
{
const auto& fragment = url->fragmentIdentifier().isEmpty()
? emptyString()
: url->fragmentIdentifierWithLeadingNumberSign().toString();
return Bun::toStringRef(fragment);
}

extern "C" BunString URL__fragmentIdentifier(WTF::URL* url)
{
const auto& fragment = url->fragmentIdentifier().isEmpty()
? emptyString()
: url->fragmentIdentifier().toString();
return Bun::toStringRef(fragment);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Fix namespace for emptyString() to avoid compile error

emptyString() is in the WTF namespace; unqualified calls will not compile here. Qualify with WTF::emptyString(). Optionally use toStringWithoutCopying() to avoid an extra copy.

-    const auto& fragment = url->fragmentIdentifier().isEmpty()
-        ? emptyString()
-        : url->fragmentIdentifierWithLeadingNumberSign().toString();
+    const auto& fragment = url->fragmentIdentifier().isEmpty()
+        ? WTF::emptyString()
+        : url->fragmentIdentifierWithLeadingNumberSign().toString();
@@
-    const auto& fragment = url->fragmentIdentifier().isEmpty()
-        ? emptyString()
-        : url->fragmentIdentifier().toString();
+    const auto& fragment = url->fragmentIdentifier().isEmpty()
+        ? WTF::emptyString()
+        : url->fragmentIdentifier().toString();

Nit: consider replacing both .toString() calls above with .toStringWithoutCopying() for consistency with nearby helpers.


🏁 Script executed:

#!/bin/bash
# Check includes and namespace context at top of BunString.cpp
echo "=== First 50 lines of BunString.cpp (includes/namespace) ==="
head -50 src/bun.js/bindings/BunString.cpp

echo -e "\n=== Lines 620-650 to see target code and context ==="
sed -n '620,650p' src/bun.js/bindings/BunString.cpp

echo -e "\n=== Check for 'using namespace' or 'using WTF' declarations ==="
grep -n "using namespace\|using WTF" src/bun.js/bindings/BunString.cpp || echo "No 'using namespace/WTF' found"

echo -e "\n=== Check for emptyString usage elsewhere in file ==="
grep -n "emptyString()" src/bun.js/bindings/BunString.cpp | head -20

Length of output: 2795


🏁 Script executed:

#!/bin/bash
# Check for toStringWithoutCopying usage and fragmentIdentifier() API
echo "=== Search for toStringWithoutCopying in file ==="
grep -n "toStringWithoutCopying" src/bun.js/bindings/BunString.cpp || echo "Not found in this file"

echo -e "\n=== Search for fragmentIdentifier calls to understand return type ==="
grep -n "fragmentIdentifier\|fragmentIdentifierWithLeadingNumberSign" src/bun.js/bindings/BunString.cpp

echo -e "\n=== Check URL header for fragmentIdentifier() return type ==="
grep -A 2 "fragmentIdentifier\|fragmentIdentifierWithLeadingNumberSign" /usr/include/wtf/URL.h 2>/dev/null | head -20 || echo "Standard header not available, checking includes in file"

Length of output: 795


Apply both the namespace fix and optional optimization for consistency

The unqualified emptyString() calls at lines 630 and 638 will fail to compile—the file does not have using namespace WTF; and already uses WTF::emptyString() elsewhere (lines 793, 883, 918). The fix is critical.

Additionally, other URL accessors in this file (protocol, query, host, path at lines 655, 680, 692, 720) use .toStringWithoutCopying(), so aligning fragmentIdentifier methods to the same pattern is a valid consistency improvement, not just a nit.

 extern "C" BunString URL__hash(WTF::URL* url)
 {
     const auto& fragment = url->fragmentIdentifier().isEmpty()
-        ? emptyString()
-        : url->fragmentIdentifierWithLeadingNumberSign().toString();
+        ? WTF::emptyString()
+        : url->fragmentIdentifierWithLeadingNumberSign().toStringWithoutCopying();
     return Bun::toStringRef(fragment);
 }

 extern "C" BunString URL__fragmentIdentifier(WTF::URL* url)
 {
     const auto& fragment = url->fragmentIdentifier().isEmpty()
-        ? emptyString()
-        : url->fragmentIdentifier().toString();
+        ? WTF::emptyString()
+        : url->fragmentIdentifier().toStringWithoutCopying();
     return Bun::toStringRef(fragment);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
extern "C" BunString URL__hash(WTF::URL* url)
{
const auto& fragment = url->fragmentIdentifier().isEmpty()
? emptyString()
: url->fragmentIdentifierWithLeadingNumberSign().toString();
return Bun::toStringRef(fragment);
}
extern "C" BunString URL__fragmentIdentifier(WTF::URL* url)
{
const auto& fragment = url->fragmentIdentifier().isEmpty()
? emptyString()
: url->fragmentIdentifier().toString();
return Bun::toStringRef(fragment);
}
extern "C" BunString URL__hash(WTF::URL* url)
{
const auto& fragment = url->fragmentIdentifier().isEmpty()
? WTF::emptyString()
: url->fragmentIdentifierWithLeadingNumberSign().toStringWithoutCopying();
return Bun::toStringRef(fragment);
}
extern "C" BunString URL__fragmentIdentifier(WTF::URL* url)
{
const auto& fragment = url->fragmentIdentifier().isEmpty()
? WTF::emptyString()
: url->fragmentIdentifier().toStringWithoutCopying();
return Bun::toStringRef(fragment);
}
🤖 Prompt for AI Agents
In src/bun.js/bindings/BunString.cpp around lines 627-642, the code uses
unqualified emptyString() (which will not compile here) and returns
.toString()/toStringWithLeadingNumberSign(), which is inconsistent with other
URL accessors; change emptyString() to WTF::emptyString() at both sites and
replace .toString()/toStringWithLeadingNumberSign().toString() with
.toStringWithoutCopying() (or .toStringWithoutCopyingWithLeadingNumberSign() if
available) so the fragment branches match the pattern used by
protocol/query/host/path, then return Bun::toStringRef of that non-copying
string.

Comment on lines +303 to +306
/// Lookup the first-defined string key for a given value.
///
/// Linear search.
pub fn getKey(value: V) ?[]const KeyType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Clarify "first-defined" in documentation.

The documentation states "first-defined string key," but since kvs is sorted by length and then lexicographically (line 40), this method returns the first key in sorted order, not definition order. When multiple keys map to the same value, the result is deterministic but based on sorted order rather than the order keys were defined in the input list.

Consider revising to "first matching key (in sorted order)" for accuracy.

🤖 Prompt for AI Agents
In src/comptime_string_map.zig around lines 303 to 306, the doc comment "Lookup
the first-defined string key" is misleading because the internal kvs array is
sorted by length then lexicographically (see line 40), so the function actually
returns the first matching key in sorted order, not definition order; update the
doc comment to read something like "Lookup the first matching key (in sorted
order)" and mention that when multiple keys map to the same value the choice is
deterministic based on the map's sort order rather than the input definition
order.

Comment on lines +497 to +500
// Allocator necessary for slow paths.
var stackFallback = std.heap.stackFallback(1024, bun.default_allocator);
var allocator = stackFallback.get();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Minor: reduce duplicate slow-path parsing and allocations

Several branches dupe dependency and call HostedGitInfo.fromUrl just to select .github vs .git. Consider a small helper that:

  • accepts allocator and dependency,
  • returns { is_github: bool, is_shortcut: bool } or the provider enum,
  • encapsulates the dupe/free/fromUrl/deinit dance.

This removes repetition and centralizes edge-case handling.

Also applies to: 561-589, 616-640, 680-688, 708-717, 785-803

🤖 Prompt for AI Agents
In src/install/dependency.zig around lines 497-500 (and also affecting 561-589,
616-640, 680-688, 708-717, 785-803), multiple branches duplicate the slow-path
parsing/allocations by calling HostedGitInfo.fromUrl, inspecting .github vs
.git, and repeating free/deinit logic; introduce a small helper function (e.g.,
parseDependencyProvider) that accepts the allocator and the dependency string
and returns a compact result (either {is_github: bool, is_shortcut: bool} or a
provider enum) and internally runs HostedGitInfo.fromUrl, handles error paths,
and performs deinit/free before returning; replace the duplicated blocks with
calls to this helper and update callers to use the returned flags/enum, removing
repeated allocation/deinit/free code and centralizing edge-case handling.

Comment on lines +1467 to +1514
sb.count(user_part);
sb.count(project);
if (committish) |c| sb.count(c);

sb.allocate(allocator) catch return null;

const user_slice = blk: {
const writable = sb.writable();
var stream = std.io.fixedBufferStream(writable);
const decoded_len = PercentEncoding.decode(
@TypeOf(stream.writer()),
stream.writer(),
user_part,
) catch return null;
sb.len += decoded_len;
break :blk writable[0..decoded_len];
};
const project_slice = blk: {
const writable = sb.writable();
var stream = std.io.fixedBufferStream(writable);
const decoded_len = PercentEncoding.decode(
@TypeOf(stream.writer()),
stream.writer(),
project,
) catch return null;
sb.len += decoded_len;
break :blk writable[0..decoded_len];
};
const committish_slice = if (committish) |c| blk: {
const writable = sb.writable();
var stream = std.io.fixedBufferStream(writable);
const decoded_len = PercentEncoding.decode(
@TypeOf(stream.writer()),
stream.writer(),
c,
) catch return null;
sb.len += decoded_len;
break :blk writable[0..decoded_len];
} else null;

return .{
.user = user_slice,
.project = project_slice,
.committish = committish_slice,
._owned_buffer = sb.allocatedSlice(),
._allocator = allocator,
};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consistency: reuse decodeAndAppend in sourcehut extractor

The sourcehut extractor inlines percent-decoding three times. For consistency and less code, consider using HostedGitInfo.decodeAndAppend like other extractors.

🤖 Prompt for AI Agents
In src/install/hosted_git_info.zig around lines 1467 to 1514, the sourcehut
extractor duplicates percent-decoding logic for user, project, and committish;
replace the three inline decoding blocks with calls to the existing
HostedGitInfo.decodeAndAppend helper to avoid repetition and ensure consistency
with other extractors. Specifically, after sb.allocate(allocator) use
decodeAndAppend(sb, allocator, user_part) and decodeAndAppend(sb, allocator,
project), and if committish is present call decodeAndAppend(sb, allocator,
committish) (handling and propagating errors the same way decodeAndAppend does),
then use sb.allocatedSlice() and sb._allocator as before to return the struct.
Ensure sb.len updates and error handling are delegated to decodeAndAppend so the
duplicate fixedBufferStream/PercentEncoding.decode code is removed.

Comment on lines +26 to +27
export const validGitUrls: { [K in Provider]: { [K in string]: object } } = {
bitbucket: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Type tightening for dataset shape

The mapped type can be simplified to improve readability:

-export const validGitUrls: { [K in Provider]: { [K in string]: object } } = {
+export const validGitUrls: Record<Provider, Record<string, Record<string, unknown>>> = {

If you prefer stricter value typing later, replace Record<string, unknown> with a dedicated interface.

🤖 Prompt for AI Agents
In test/cli/install/hosted-git-info/cases.ts around lines 26-27, the mapped type
{ [K in Provider]: { [K in string]: object } } is overly verbose and uses the
broad object type; change the declaration to use Record for clarity, e.g.
Record<Provider, Record<string, unknown>>, or swap the inner Record<string,
unknown> for a dedicated interface if you want stricter value typing, and update
the variable's type annotation accordingly.

Comment on lines +28 to +31
// shortucts
//
// NOTE auth is accepted but ignored
"bitbucket:foo/bar": { ...defaults.bitbucket, default: "shortcut" },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Nit: fix typos in comments

  • “shortucts” → “shortcuts”
  • “protocos” → “protocols”

No behavior change, just clarity.

-    // shortucts
+    // shortcuts
@@
-    // shortucts
+    // shortcuts
@@
-    // invalid protocos
+    // invalid protocols

Also applies to: 2358-2365, 2467-2471

🤖 Prompt for AI Agents
In test/cli/install/hosted-git-info/cases.ts around lines 28-31 (and also update
occurrences at 2358-2365 and 2467-2471), there are typos in comment text: change
“shortucts” to “shortcuts” and “protocos” to “protocols”; update each affected
comment occurrence accordingly with no code behavior changes.

Comment on lines +5 to +19
describe("fromUrl", () => {
describe("valid urls", () => {
describe.each(Object.entries(validGitUrls))("%s", (_, urlset: object) => {
it.each(Object.entries(urlset))("parses %s", (url, expected) => {
expect(hostedGitInfo.fromUrl(url)).toMatchObject({
...(expected.type && { type: expected.type }),
...(expected.domain && { domain: expected.domain }),
...(expected.user && { user: expected.user }),
...(expected.project && { project: expected.project }),
...(expected.committish && { committish: expected.committish }),
...(expected.default && { default: expected.default }),
});
});
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Prefer concurrent tests; tighten TS typing for expected.

  • Use it.concurrent.each for faster runs.
  • Replace loose object typing with Record<string, any> to avoid TS property access warnings in editors.

Apply:

-describe.each(Object.entries(validGitUrls))("%s", (_, urlset: object) => {
-  it.each(Object.entries(urlset))("parses %s", (url, expected) => {
+describe.each(Object.entries(validGitUrls))("%s", (_, urlset: Record<string, any>) => {
+  it.concurrent.each(Object.entries(urlset))("parses %s", (url, expected: any) => {
🤖 Prompt for AI Agents
In test/cli/install/hosted-git-info/from-url.test.ts around lines 5 to 19,
change the test to use concurrent runs and tighten the TypeScript types: replace
it.each with it.concurrent.each to run tests in parallel, and replace the loose
object typing for the iterated values with Record<string, any> (e.g., urlset:
Record<string, any> and expected: Record<string, any>) so property access like
expected.type does not emit TS warnings; keep the same assertion structure.

Comment on lines +21 to +31
// TODO(markovejnovic): Unskip these tests.
describe.skip("invalid urls", () => {
describe.each(Object.entries(invalidGitUrls))("%s", (_, urls: (string | null | undefined)[]) => {
it.each(urls)("does not permit %s", url => {
expect(() => {
hostedGitInfo.fromUrl(url);
}).toThrow();
});
});
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Track enabling invalid-url tests; consider returning null instead of throw.

Tests are skipped. Decide whether fromUrl should throw or return null for invalid inputs and align tests accordingly. Document behavior either way.

I can adjust tests once the desired behavior is confirmed.

🤖 Prompt for AI Agents
In test/cli/install/hosted-git-info/from-url.test.ts around lines 21-31, the
invalid-url tests are skipped and the review asks to standardize behavior;
decide to have hostedGitInfo.fromUrl return null for invalid inputs, then unskip
these tests and change their expectations to assert returned value is null
(instead of expecting a throw), and update the fromUrl implementation and its
JSDoc/type definitions to explicitly return null for invalid or non-matching
inputs and add a brief note in the module README describing this behavior.

Comment on lines +1 to +21
/**
* Mimics https://github.com/npm/hosted-git-info/blob/main/test/parse-url.js
*/
import { hostedGitInfo } from "bun:internal-for-testing";
import { describe, expect, it } from "bun:test";

const okCases = [
// These come straight out of the hosted-git-info tests
"git+ssh://git@abc:frontend/utils.git#6d45447e0c5eb6cd2e3edf05a8c5a9bb81950c79",
// These are custom cases added for Bun
"ssh://:password@bitbucket.org:foo/bar.git",
"git@bitbucket.org:foo/bar",
"gist:user:password@/feedbeef#branch",
"github:foo/bar#branch with space",
];

describe("parseUrl", () => {
it.each(okCases)("parses %s", url => {
expect(hostedGitInfo.parseUrl(url)).not.toBeNull();
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Make tests concurrent and centralize cases.

  • Prefer concurrent tests for speed.
  • Consider moving okCases into cases.ts for consistency with from-url tests.

Apply:

-import { describe, expect, it } from "bun:test";
+import { describe, expect, it } from "bun:test";
@@
-describe("parseUrl", () => {
-  it.each(okCases)("parses %s", url => {
+describe("parseUrl", () => {
+  it.concurrent.each(okCases)("parses %s", url => {
     expect(hostedGitInfo.parseUrl(url)).not.toBeNull();
   });
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
* Mimics https://github.com/npm/hosted-git-info/blob/main/test/parse-url.js
*/
import { hostedGitInfo } from "bun:internal-for-testing";
import { describe, expect, it } from "bun:test";
const okCases = [
// These come straight out of the hosted-git-info tests
"git+ssh://git@abc:frontend/utils.git#6d45447e0c5eb6cd2e3edf05a8c5a9bb81950c79",
// These are custom cases added for Bun
"ssh://:password@bitbucket.org:foo/bar.git",
"git@bitbucket.org:foo/bar",
"gist:user:password@/feedbeef#branch",
"github:foo/bar#branch with space",
];
describe("parseUrl", () => {
it.each(okCases)("parses %s", url => {
expect(hostedGitInfo.parseUrl(url)).not.toBeNull();
});
});
/**
* Mimics https://github.com/npm/hosted-git-info/blob/main/test/parse-url.js
*/
import { hostedGitInfo } from "bun:internal-for-testing";
import { describe, expect, it } from "bun:test";
const okCases = [
// These come straight out of the hosted-git-info tests
"git+ssh://git@abc:frontend/utils.git#6d45447e0c5eb6cd2e3edf05a8c5a9bb81950c79",
// These are custom cases added for Bun
"ssh://:password@bitbucket.org:foo/bar.git",
"git@bitbucket.org:foo/bar",
"gist:user:password@/feedbeef#branch",
"github:foo/bar#branch with space",
];
describe("parseUrl", () => {
it.concurrent.each(okCases)("parses %s", url => {
expect(hostedGitInfo.parseUrl(url)).not.toBeNull();
});
});
🤖 Prompt for AI Agents
In test/cli/install/hosted-git-info/parse-url.test.ts lines 1-21, the test
should run concurrently and reuse centralized test cases; replace the local
okCases array by importing the shared cases from
test/cli/install/hosted-git-info/cases.ts (or appropriate cases.ts used by other
from-url tests) and change the test runner from it.each to it.concurrent.each
(or the Bun equivalent) so the table-driven tests run in parallel for speed
while reusing the canonical cases file.

@markovejnovic markovejnovic force-pushed the user/markovejnovic/hosted-git-info branch from de14245 to ae765a0 Compare October 28, 2025 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants