Conversation

@andrewbranch (Member)

Fixes a race noticed by @sheetalkamat
Probably fixes #1983

@jakebailey (Member) left a comment

I never noticed that we had two things with nearly the same implementation

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR addresses race conditions in reference-counted cache implementations by adding checks for deleted entries and adjusting lock timing. The changes ensure that operations handle cases where entries are marked for deletion (refCount <= 0) between the time they're looked up and when their locks are acquired.

Key Changes

  • Added resurrection logic in Ref() methods to handle entries deleted while acquiring locks
  • Added recursive retry logic in loadOrStoreNewLockedEntry() for deleted entries
  • Moved mu.Unlock() in Deref() methods to after Delete() operations to prevent race conditions
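
For illustration, here is a minimal sketch of the Ref/Deref pattern these changes aim for. The names and types (cache, entry) are invented for the example and use a plain sync.Map; the real parsecache.go and extendedconfigcache.go differ in detail.

package cachesketch

import "sync"

// entry and cache are illustrative stand-ins, not the real cache types.
type entry struct {
	mu       sync.Mutex
	refCount int
}

type cache struct {
	entries sync.Map // string -> *entry
}

// Ref bumps the refCount for key, creating an entry if needed. If the
// loaded entry was deleted (refCount <= 0) between the Load and acquiring
// its lock, the lookup is retried instead of handing out a dead entry.
func (c *cache) Ref(key string) *entry {
	for {
		if v, ok := c.entries.Load(key); ok {
			e := v.(*entry)
			e.mu.Lock()
			if e.refCount <= 0 {
				// Entry was deleted while we were acquiring the lock; retry.
				e.mu.Unlock()
				continue
			}
			e.refCount++
			e.mu.Unlock()
			return e
		}
		e := &entry{refCount: 1}
		if _, loaded := c.entries.LoadOrStore(key, e); loaded {
			// Someone else stored an entry first; go back and ref theirs.
			continue
		}
		return e
	}
}

// Deref drops a reference and removes the entry once the count hits zero.
// The entry lock is released only after the map Delete, mirroring the
// unlock-ordering fix described above: a concurrent Ref that already loaded
// the entry will block on the lock and then observe refCount <= 0.
func (c *cache) Deref(key string) {
	v, ok := c.entries.Load(key)
	if !ok {
		return
	}
	e := v.(*entry)
	e.mu.Lock()
	e.refCount--
	if e.refCount <= 0 {
		c.entries.Delete(key)
	}
	e.mu.Unlock()
}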

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Files reviewed:

  • internal/project/parsecache.go: Adds race condition handling for deleted parse cache entries and fixes unlock ordering in Deref
  • internal/project/extendedconfigcache.go: Adds race condition handling for deleted config cache entries and fixes unlock ordering in Deref

Comments suppressed due to low confidence (1)

internal/project/extendedconfigcache.go:1 (code context: package project)

  • This recursive retry could loop indefinitely under sustained contention. Consider adding a maximum retry count to prevent potential stack overflow or infinite loops in edge cases where entries are repeatedly deleted.

Comment on lines 106 to 110

if existing.refCount <= 0 {
	// Existing entry was deleted while we were acquiring the lock
	existing.mu.Unlock()
	return c.loadOrStoreNewLockedEntry(key)
}

Copilot AI Oct 30, 2025

This recursive retry pattern could potentially loop indefinitely if there's sustained contention where entries are constantly being deleted. Consider adding a retry limit or exponential backoff to prevent unbounded recursion in pathological cases.
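
As a rough illustration of the bounded-retry idea only (not the actual extendedconfigcache.go code, whose signature and contract may differ), the recursion could be flattened into a loop with a cap:

package cachesketch

import (
	"errors"
	"sync"
)

type entry struct {
	mu       sync.Mutex
	refCount int
}

type cache struct {
	entries sync.Map // string -> *entry
}

const maxRetries = 8

// loadOrStoreNewLockedEntry returns an entry for key with its mutex held,
// retrying in a bounded loop (rather than recursing) when the loaded entry
// turns out to have been deleted while its lock was being acquired.
func (c *cache) loadOrStoreNewLockedEntry(key string) (*entry, error) {
	for attempt := 0; attempt < maxRetries; attempt++ {
		fresh := &entry{refCount: 1}
		fresh.mu.Lock()
		v, loaded := c.entries.LoadOrStore(key, fresh)
		if !loaded {
			return fresh, nil // stored our new, already-locked entry
		}
		fresh.mu.Unlock()

		existing := v.(*entry)
		existing.mu.Lock()
		if existing.refCount > 0 {
			existing.refCount++
			return existing, nil // still alive; returned locked
		}
		// Existing entry was deleted while we were acquiring its lock;
		// Deref removes it from the map, so loop and try again.
		existing.mu.Unlock()
	}
	return nil, errors.New("loadOrStoreNewLockedEntry: gave up after repeated deletions")
}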

@andrewbranch (Member, Author)

Oh, it's blowing up because we never delete entries in our test suites since they're likely to come back

@jakebailey (Member)

Doesn't that imply we aren't testing this new code?

@andrewbranch (Member, Author)

Yes. AFAIK it's not possible to deterministically trigger the race, because there's no callback into test code that could occur between the sync map load and the mutex lock.
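
To make that window concrete: the race sits between c.entries.Load and entry.mu.Lock, and hitting it deterministically would require a seam like the hypothetical testHookAfterLoad sketched below, which the real cache does not have.

package cachesketch

import "sync"

type entry struct {
	mu       sync.Mutex
	refCount int
}

type cache struct {
	entries sync.Map // string -> *entry

	// testHookAfterLoad, if non-nil, runs between the map Load and the
	// entry lock acquisition. No such seam exists in the real cache,
	// which is why the deleted-entry branch can't be forced from a test.
	testHookAfterLoad func()
}

// tryRef reports whether key could be referenced; it takes the deleted-entry
// branch only if a concurrent Deref lands inside the Load-to-Lock window.
func (c *cache) tryRef(key string) bool {
	v, ok := c.entries.Load(key)
	if !ok {
		return false
	}
	e := v.(*entry)
	if c.testHookAfterLoad != nil {
		c.testHookAfterLoad() // a test could run Deref here deterministically
	}
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.refCount <= 0 {
		// Entry was deleted while we were acquiring the lock.
		return false
	}
	e.refCount++
	return true
}

Without such a hook, the branch is only reachable when a concurrent Deref happens to land in that window.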

if entry, ok := c.entries.Load(path); ok {
	entry.mu.Lock()
	if entry.refCount <= 0 {
		// Entry was deleted while we were acquiring the lock

Member

Isn't it possible that the entry got deleted before the call to c.entries.Load(path) as well?

Development

Successfully merging this pull request may close these issues.

textDocument/diagnostic failed & panic
