Skip to content

Conversation

@rchristof
Copy link

@rchristof rchristof commented Oct 28, 2025

This PR is an incremental cleanup in SemanticAnalyzer to move toward issue #4157 ("too many lookup helpers, too many subtly different code paths"). It does two things:

  1. Centralizes fully qualified name lookup logic.

  2. Centralizes Instance construction from named types.

Both changes follow the same pattern:

  • Extract the real logic into a single private helper (not plugin API).

  • Turn the existing public helpers into thin, intention-revealing wrappers.

  • Keep public signatures and behavior stable as much as possible.

The goal is to reduce duplication and make future cleanup / movement into mypy/lookup.py easier without breaking callers immediately.


Part 1: Fully qualified lookup refactor

Previously:

  • lookup_fully_qualified_or_none() in semanal.py contained a large block of logic that:

    • tried to resolve a fully qualified name by walking self.modules,

    • handled partially loaded / incomplete namespaces and called record_incomplete_ref(),

    • walked into TypeInfo members for nested attributes,

    • and had two different branches for "the module name matches directly" vs "peel components until we find the longest loaded prefix".

  • lookup_fully_qualified() just called that function and asserted the result was not None.

Now:

  • A new private helper _lookup_fully_qualified(fullname, *, allow_missing) implements the traversal logic in one place.

  • lookup_fully_qualified() becomes a thin "strict" wrapper: it calls the helper with allow_missing=False, asserts non-None, and returns the node.

  • lookup_fully_qualified_or_none() becomes a thin "permissive" wrapper: it calls the helper with allow_missing=True and returns that result (possibly None).

Key properties:

  • The logic that finds the longest loaded module prefix and then walks remaining name parts now lives in one place instead of being duplicated across branches.

  • The behavior around is_incomplete_namespace() and record_incomplete_ref() is preserved and documented.

  • These wrappers keep their original names and signatures, so existing code and plugins can continue to call them.

Behavioral note:

  • When the fully qualified name corresponds exactly to a module (e.g. "pkg.mod"), _lookup_fully_qualified() now returns None instead of treating the module object itself as a symbol defined in the parent package. This matches the direction of the ongoing lookup cleanup (avoid treating modules like regular definitions), but I'm calling it out in case something relied on the old behavior.

This puts us in a place where future work can move the core logic into mypy/lookup.py and update other ad-hoc lookups to use it.


Part 2: Named type / Instance construction refactor

Previously:

  • named_type(), named_type_or_none(), and builtin_type() each reimplemented very similar logic:

    • resolve the symbol with lookup_fully_qualified() or lookup_fully_qualified_or_none(),

    • handle PlaceholderNode,

    • sometimes unwrap TypeAlias targets to a TypeInfo,

    • build an Instance, filling in default type arguments using a particular AnyType flavor (TypeOfAny.special_form vs TypeOfAny.unannotated).

The differences between these helpers (strict vs permissive, unwrap alias or not, which AnyType flavor to use) were spread out and not very explicit at the call site.

Now:

  • A new private helper _build_named_instance(fullname, args, *, allow_missing, any_flavor, unwrap_alias) is the single implementation of that logic.

    • allow_missing mirrors strict (named_type) vs permissive (named_type_or_none) lookup.

    • unwrap_alias controls whether we unwrap a TypeAlias to get the underlying TypeInfo (only the or_none variant historically did this).

    • any_flavor selects which AnyType flavor to use when no explicit type arguments are provided.

  • named_type() now calls this helper with strict parameters and asserts a result exists.

  • named_type_or_none() calls it permissively and may return None.

  • builtin_type() now also calls the helper directly instead of indirectly delegating through named_type().

Key properties:

  • The public APIs are unchanged:

    • named_type(fullname, args?) -> Instance

    • named_type_or_none(fullname, args?) -> Instance | None

    • builtin_type(fullname) -> Instance

  • The differences between these helpers are now expressed as parameters to the shared implementation instead of being duplicated in three places.

  • Future fixes to how we synthesize type arguments or unwrap aliases now only need to happen in one function.


Why this helps #4157

Issue #4157 calls out that there are ~9 different lookup / type access helpers across the analyzer/checker, with slightly different semantics. That makes it easy for behavior to drift and hard for plugin authors (and even core devs) to know which one is "correct."

This PR moves us toward a world where:

  • there is one authoritative code path for fully qualified lookups,

  • there is one authoritative code path for turning a qualified name into an Instance,

  • the public helpers become stable, intention-revealing wrappers,

  • and we can start migrating other places in the codebase to use these helpers instead of reimplementing their own traversal logic.

This should make future steps (for example, moving more of this logic into mypy/lookup.py) much easier to review and reason about.


Scope / compatibility

  • No public helper was removed or renamed.

  • Signatures of lookup_fully_qualified(), lookup_fully_qualified_or_none(), named_type(), named_type_or_none(), and builtin_type() are unchanged.

  • _lookup_fully_qualified() and _build_named_instance() are documented as internal-only and not part of the plugin API.

  • The only semantic difference worth flagging is that exact-module names (like "pkg.mod") now return None from the fully qualified lookup helper instead of trying to return the module symbol through the parent package. This aligns with the new lookup direction but is called out here explicitly for review.


Follow-up work

With these helpers in place, future cleanups can:

  • Update other ad hoc lookup logic in semanal.py / checker.py to call these helpers instead of duplicating traversal.

  • Potentially move the shared logic into mypy/lookup.py, so the analyzer and other passes use a single centralized implementation.

This PR intentionally keeps each step reviewable and does not attempt the full migration in one shot.

rchristof and others added 3 commits October 28, 2025 12:00
This refactor extracts the core logic for resolving fully qualified names
into a new private helper _lookup_fully_qualified(), and rewrites
lookup_fully_qualified() / lookup_fully_qualified_or_none() as thin
wrappers around it.

Before this change:

lookup_fully_qualified_or_none() in semanal.py directly implemented
a large amount of lookup logic with two separate code paths
("exact module match" vs "longest loaded prefix"). This made it harder
to maintain and reason about.
lookup_fully_qualified() just called the above and asserted non-None,
so the only authoritative behavior lived inside one big public method.

After this change:

_lookup_fully_qualified(fullname, allow_missing) is now the single
internal implementation of fully qualified lookup:
  - It finds the longest prefix of the name that exists in self.modules.
  - It walks the remaining components through module / class symbol tables.
  - It preserves record_incomplete_ref() calls for incomplete namespaces.
  - It validates input and handles partial / nested names consistently.
lookup_fully_qualified() now just calls the helper in "strict" mode
and asserts the result is not None.
lookup_fully_qualified_or_none() now just calls the helper in
"permissive" mode and returns the result (or None).

Why this matters:

The public API remains unchanged:
  - lookup_fully_qualified(name) -> SymbolTableNode
  - lookup_fully_qualified_or_none(name) -> SymbolTableNode | None
The duplicated traversal logic is no longer embedded in a public method.
Instead, there is a single internal source of truth with a documented
contract (_lookup_fully_qualified is explicitly not plugin API).
This makes it easier to migrate other ad-hoc lookup sites to use this
common logic, which is one of the goals in issue python#4157 ("too many lookup
helpers across the semantic analyzer / checker leads to confusion").

Behavioral note:

When the fully qualified name refers exactly to a module
(e.g. "pkg.mod"), _lookup_fully_qualified() now returns None instead
of attempting to treat the module object itself as a definitional
symbol via its parent package.
This matches the intent of the ongoing lookup cleanup work (avoid
treating modules like regular definitions), but I'm calling it out in
case external plugins depended on the old quirk.

This is an incremental, internal refactor toward python#4157. No plugin-facing
API signatures were changed.
This refactor extracts the common logic for constructing Instance
objects from fully qualified type names into a private helper
_build_named_instance(), and rewrites named_type(),
named_type_or_none(), and builtin_type() to delegate to it.

Before this change:

named_type() and named_type_or_none() each reimplemented the same
core steps:
  - resolve a fully qualified name using either
    lookup_fully_qualified() (strict) or
    lookup_fully_qualified_or_none() (permissive),
  - reject unresolved symbols / placeholders,
  - optionally unwrap a TypeAlias to get to the underlying TypeInfo,
  - build an Instance, synthesizing default type arguments with a
    particular AnyType flavor (special_form vs unannotated).
builtin_type() just wrapped named_type(), obscuring the fact that
it's semantically "strict + use special_form AnyType defaults".

This meant the same idea ("turn a fully qualified name into an Instance")
was duplicated in several places, with subtle differences that were not
obvious at the call site.

After this change:

_build_named_instance(fullname, args, *, allow_missing, any_flavor,
unwrap_alias) becomes the single internal implementation of that logic:
  - allow_missing controls whether we allow unresolved names
    (mirrors strict vs permissive lookup).
  - any_flavor selects which AnyType flavor to use for synthesized
    type arguments when args is not provided.
  - unwrap_alias controls whether a TypeAlias target should be
    unwrapped to its underlying TypeInfo (this matches the historical
    behavior of named_type_or_none()).
named_type() now calls the helper with
  allow_missing=False, any_flavor=TypeOfAny.special_form,
  unwrap_alias=False, and asserts that a result exists.
named_type_or_none() calls it with
  allow_missing=True, any_flavor=TypeOfAny.unannotated,
  unwrap_alias=True, and returns None if the symbol couldn't be
  resolved or is still a placeholder.
builtin_type() now directly calls the helper with the same strict
parameters as named_type(), making its semantics explicit while
keeping the same behavior.

Why this matters:

The public helpers keep the same signatures and the same intended
contracts:
  - named_type(fullname, args?) -> Instance
  - named_type_or_none(fullname, args?) -> Instance | None
  - builtin_type(fullname) -> Instance
The differences between them (strict vs permissive lookup, alias
unwrapping, AnyType flavor for default arguments) are now explicit in
how they call _build_named_instance() instead of being duplicated in
multiple ad hoc implementations.
Future changes to how we resolve names or synthesize default type
arguments only need to happen in one place.

This continues the cleanup called out in issue python#4157: reducing the number
of slightly different lookup/type helpers and making the remaining ones
more uniform. _build_named_instance() is internal-only and not part of
the plugin API surface.
@github-actions
Copy link
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant