Refactor: centralize lookup and named type helpers in SemanticAnalyzer (#4157) #20136

rchristof · 2025-10-28T15:12:42Z

This PR is an incremental cleanup in SemanticAnalyzer to move toward issue #4157 ("too many lookup helpers, too many subtly different code paths"). It does two things:

Centralizes fully qualified name lookup logic.
Centralizes Instance construction from named types.

Both changes follow the same pattern:

Extract the real logic into a single private helper (not plugin API).
Turn the existing public helpers into thin, intention-revealing wrappers.
Keep public signatures and behavior stable as much as possible.

The goal is to reduce duplication and make future cleanup / movement into mypy/lookup.py easier without breaking callers immediately.

Part 1: Fully qualified lookup refactor

Previously:

lookup_fully_qualified_or_none() in semanal.py contained a large block of logic that:
- tried to resolve a fully qualified name by walking self.modules,
- handled partially loaded / incomplete namespaces and called record_incomplete_ref(),
- walked into TypeInfo members for nested attributes,
- and had two different branches for "the module name matches directly" vs "peel components until we find the longest loaded prefix".
lookup_fully_qualified() just called that function and asserted the result was not None.

Now:

A new private helper _lookup_fully_qualified(fullname, *, allow_missing) implements the traversal logic in one place.
lookup_fully_qualified() becomes a thin "strict" wrapper: it calls the helper with allow_missing=False, asserts non-None, and returns the node.
lookup_fully_qualified_or_none() becomes a thin "permissive" wrapper: it calls the helper with allow_missing=True and returns that result (possibly None).

Key properties:

The logic that finds the longest loaded module prefix and then walks remaining name parts now lives in one place instead of being duplicated across branches.
The behavior around is_incomplete_namespace() and record_incomplete_ref() is preserved and documented.
These wrappers keep their original names and signatures, so existing code and plugins can continue to call them.

Behavioral note:

When the fully qualified name corresponds exactly to a module (e.g. "pkg.mod"), _lookup_fully_qualified() now returns None instead of treating the module object itself as a symbol defined in the parent package. This matches the direction of the ongoing lookup cleanup (avoid treating modules like regular definitions), but I'm calling it out in case something relied on the old behavior.

This puts us in a place where future work can move the core logic into mypy/lookup.py and update other ad-hoc lookups to use it.

Part 2: Named type / Instance construction refactor

Previously:

named_type(), named_type_or_none(), and builtin_type() each reimplemented very similar logic:
- resolve the symbol with lookup_fully_qualified() or lookup_fully_qualified_or_none(),
- handle PlaceholderNode,
- sometimes unwrap TypeAlias targets to a TypeInfo,
- build an Instance, filling in default type arguments using a particular AnyType flavor (TypeOfAny.special_form vs TypeOfAny.unannotated).

The differences between these helpers (strict vs permissive, unwrap alias or not, which AnyType flavor to use) were spread out and not very explicit at the call site.

Now:

A new private helper _build_named_instance(fullname, args, *, allow_missing, any_flavor, unwrap_alias) is the single implementation of that logic.
- allow_missing mirrors strict (named_type) vs permissive (named_type_or_none) lookup.
- unwrap_alias controls whether we unwrap a TypeAlias to get the underlying TypeInfo (only the or_none variant historically did this).
- any_flavor selects which AnyType flavor to use when no explicit type arguments are provided.
named_type() now calls this helper with strict parameters and asserts a result exists.
named_type_or_none() calls it permissively and may return None.
builtin_type() now also calls the helper directly instead of indirectly delegating through named_type().

Key properties:

The public APIs are unchanged:
- named_type(fullname, args?) -> Instance
- named_type_or_none(fullname, args?) -> Instance | None
- builtin_type(fullname) -> Instance
The differences between these helpers are now expressed as parameters to the shared implementation instead of being duplicated in three places.
Future fixes to how we synthesize type arguments or unwrap aliases now only need to happen in one function.

Why this helps #4157

Issue #4157 calls out that there are ~9 different lookup / type access helpers across the analyzer/checker, with slightly different semantics. That makes it easy for behavior to drift and hard for plugin authors (and even core devs) to know which one is "correct."

This PR moves us toward a world where:

there is one authoritative code path for fully qualified lookups,
there is one authoritative code path for turning a qualified name into an Instance,
the public helpers become stable, intention-revealing wrappers,
and we can start migrating other places in the codebase to use these helpers instead of reimplementing their own traversal logic.

This should make future steps (for example, moving more of this logic into mypy/lookup.py) much easier to review and reason about.

Scope / compatibility

No public helper was removed or renamed.
Signatures of lookup_fully_qualified(), lookup_fully_qualified_or_none(), named_type(), named_type_or_none(), and builtin_type() are unchanged.
_lookup_fully_qualified() and _build_named_instance() are documented as internal-only and not part of the plugin API.
The only semantic difference worth flagging is that exact-module names (like "pkg.mod") now return None from the fully qualified lookup helper instead of trying to return the module symbol through the parent package. This aligns with the new lookup direction but is called out here explicitly for review.

Follow-up work

With these helpers in place, future cleanups can:

Update other ad hoc lookup logic in semanal.py / checker.py to call these helpers instead of duplicating traversal.
Potentially move the shared logic into mypy/lookup.py, so the analyzer and other passes use a single centralized implementation.

This PR intentionally keeps each step reviewable and does not attempt the full migration in one shot.

This refactor extracts the core logic for resolving fully qualified names into a new private helper _lookup_fully_qualified(), and rewrites lookup_fully_qualified() / lookup_fully_qualified_or_none() as thin wrappers around it. Before this change: lookup_fully_qualified_or_none() in semanal.py directly implemented a large amount of lookup logic with two separate code paths ("exact module match" vs "longest loaded prefix"). This made it harder to maintain and reason about. lookup_fully_qualified() just called the above and asserted non-None, so the only authoritative behavior lived inside one big public method. After this change: _lookup_fully_qualified(fullname, allow_missing) is now the single internal implementation of fully qualified lookup: - It finds the longest prefix of the name that exists in self.modules. - It walks the remaining components through module / class symbol tables. - It preserves record_incomplete_ref() calls for incomplete namespaces. - It validates input and handles partial / nested names consistently. lookup_fully_qualified() now just calls the helper in "strict" mode and asserts the result is not None. lookup_fully_qualified_or_none() now just calls the helper in "permissive" mode and returns the result (or None). Why this matters: The public API remains unchanged: - lookup_fully_qualified(name) -> SymbolTableNode - lookup_fully_qualified_or_none(name) -> SymbolTableNode | None The duplicated traversal logic is no longer embedded in a public method. Instead, there is a single internal source of truth with a documented contract (_lookup_fully_qualified is explicitly not plugin API). This makes it easier to migrate other ad-hoc lookup sites to use this common logic, which is one of the goals in issue python#4157 ("too many lookup helpers across the semantic analyzer / checker leads to confusion"). Behavioral note: When the fully qualified name refers exactly to a module (e.g. "pkg.mod"), _lookup_fully_qualified() now returns None instead of attempting to treat the module object itself as a definitional symbol via its parent package. This matches the intent of the ongoing lookup cleanup work (avoid treating modules like regular definitions), but I'm calling it out in case external plugins depended on the old quirk. This is an incremental, internal refactor toward python#4157. No plugin-facing API signatures were changed.

This refactor extracts the common logic for constructing Instance objects from fully qualified type names into a private helper _build_named_instance(), and rewrites named_type(), named_type_or_none(), and builtin_type() to delegate to it. Before this change: named_type() and named_type_or_none() each reimplemented the same core steps: - resolve a fully qualified name using either lookup_fully_qualified() (strict) or lookup_fully_qualified_or_none() (permissive), - reject unresolved symbols / placeholders, - optionally unwrap a TypeAlias to get to the underlying TypeInfo, - build an Instance, synthesizing default type arguments with a particular AnyType flavor (special_form vs unannotated). builtin_type() just wrapped named_type(), obscuring the fact that it's semantically "strict + use special_form AnyType defaults". This meant the same idea ("turn a fully qualified name into an Instance") was duplicated in several places, with subtle differences that were not obvious at the call site. After this change: _build_named_instance(fullname, args, *, allow_missing, any_flavor, unwrap_alias) becomes the single internal implementation of that logic: - allow_missing controls whether we allow unresolved names (mirrors strict vs permissive lookup). - any_flavor selects which AnyType flavor to use for synthesized type arguments when args is not provided. - unwrap_alias controls whether a TypeAlias target should be unwrapped to its underlying TypeInfo (this matches the historical behavior of named_type_or_none()). named_type() now calls the helper with allow_missing=False, any_flavor=TypeOfAny.special_form, unwrap_alias=False, and asserts that a result exists. named_type_or_none() calls it with allow_missing=True, any_flavor=TypeOfAny.unannotated, unwrap_alias=True, and returns None if the symbol couldn't be resolved or is still a placeholder. builtin_type() now directly calls the helper with the same strict parameters as named_type(), making its semantics explicit while keeping the same behavior. Why this matters: The public helpers keep the same signatures and the same intended contracts: - named_type(fullname, args?) -> Instance - named_type_or_none(fullname, args?) -> Instance | None - builtin_type(fullname) -> Instance The differences between them (strict vs permissive lookup, alias unwrapping, AnyType flavor for default arguments) are now explicit in how they call _build_named_instance() instead of being duplicated in multiple ad hoc implementations. Future changes to how we resolve names or synthesize default type arguments only need to happen in one place. This continues the cleanup called out in issue python#4157: reducing the number of slightly different lookup/type helpers and making the remaining ones more uniform. _build_named_instance() is internal-only and not part of the plugin API surface.

for more information, see https://pre-commit.ci

github-actions · 2025-10-28T15:32:15Z

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

rchristof and others added 3 commits October 28, 2025 12:00

[pre-commit.ci] auto fixes from pre-commit.com hooks

08594d3

for more information, see https://pre-commit.ci

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Refactor: centralize lookup and named type helpers in SemanticAnalyzer (#4157) #20136

Refactor: centralize lookup and named type helpers in SemanticAnalyzer (#4157) #20136

rchristof commented Oct 28, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Oct 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Refactor: centralize lookup and named type helpers in SemanticAnalyzer (#4157) #20136

Are you sure you want to change the base?

Refactor: centralize lookup and named type helpers in SemanticAnalyzer (#4157) #20136

Conversation

rchristof commented Oct 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Part 1: Fully qualified lookup refactor

Part 2: Named type / Instance construction refactor

Why this helps #4157

Scope / compatibility

Follow-up work

Uh oh!

github-actions bot commented Oct 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rchristof commented Oct 28, 2025 •

edited

Loading