[maintenance] change logic in sklearnex `get_namespace` to follow sklearn array API expected behavior #2747

icfaust · 2025-10-23T10:51:36Z

Description

DPNP and DPCTL now support the `__array_namespace__` attribute from the array API standard. Sklearn's `get_namespace` includes a validation check that all of its array inputs share the same namespace. sklearnex calls `get_namespace` at several points: before and after `validate_data`, and before and after `sklearnex._device_offload.dispatch`. This change enables that namespace verification when `array_api_dispatch` is enabled.
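To make the namespace verification concrete, here is a minimal pure-Python sketch of the idea. It is not the sklearnex or sklearn implementation: `FakeArray`, `check_single_namespace`, `ns_a` and `ns_b` are hypothetical names invented for illustration, and the error text only mirrors the "Multiple namespaces for array inputs" message quoted later in this thread.

```python
import types


class FakeArray:
    """Hypothetical stand-in for an array type that exposes __array_namespace__."""

    def __init__(self, namespace):
        self._namespace = namespace

    def __array_namespace__(self, api_version=None):
        # Real array libraries return their array API module here.
        return self._namespace


def check_single_namespace(*arrays):
    """Return the namespace shared by all inputs, or raise TypeError.

    Illustrative only: an assumption about the shape of the check,
    not the actual get_namespace logic.
    """
    namespaces = []
    for array in arrays:
        if array is None:
            continue  # None inputs (e.g. a missing y) are skipped
        ns = array.__array_namespace__()
        if ns not in namespaces:
            namespaces.append(ns)
    if not namespaces:
        raise TypeError("Unrecognized array input")
    if len(namespaces) > 1:
        raise TypeError(f"Multiple namespaces for array inputs: {namespaces}")
    return namespaces[0]


# Two hypothetical namespaces standing in for, e.g., dpnp and numpy.
ns_a = types.ModuleType("ns_a")
ns_b = types.ModuleType("ns_b")
```

With this sketch, `check_single_namespace(FakeArray(ns_a), FakeArray(ns_a))` returns `ns_a`, while mixing `FakeArray(ns_a)` with `FakeArray(ns_b)` raises `TypeError`, which is the kind of failure reported in the comments below when a model fitted on dpnp data is given numpy data.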

@david-cortes-intel I think your reviews with intermixed array types should be formalized and integrated to improve the quality of the codebase. I would also recommend testing device intermixing. This way we can accelerate #2209, #2201, #2700 and #2654 and minimize maintenance without getting bogged down in a bug hunt that affects the larger array API scope (i.e. it's likely that other estimators are failing in similar ways and need follow-up work).

Checklist:

Completeness and readability

I have commented my code, particularly in hard-to-understand areas.
I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.
I have extended testing suite if new functionality was introduced in this PR.

Performance

I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

david-cortes-intel · 2025-10-23T12:27:58Z

@icfaust I'm still seeing the same error message after merging this PR in your RF branch:

import os, sys
os.environ["SCIPY_ARRAY_API"] = "1"
import numpy as np
import dpnp
from sklearnex import config_context, set_config
from sklearnex.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(random_state=123)
Xd = dpnp.array(X, dtype=np.float32, device="cpu")
yd = dpnp.array(y, dtype=np.float32, device="cpu")

set_config(array_api_dispatch=True)
model = RandomForestClassifier(n_estimators=1, max_depth=5).fit(Xd, yd)
model.predict(X[:5])

ValueError: `mode` must be `wrap` or `clip`.Got `raise`.

I cannot reproduce the error with any of the other current array API classes, although I see that previous ones all use arrays of the given class to store fitted attributes like coefficients, whereas forests work quite differently.

codecov · 2025-10-23T12:29:23Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag	Coverage Δ
azure	`80.48% <83.33%> (+<0.01%)`	⬆️
github	`82.08% <100.00%> (?)`

Flags with carried forward coverage won't be shown.

Files with missing lines	Coverage Δ
sklearnex/utils/_array_api.py	`91.66% <100.00%> (+0.23%)`	⬆️

... and 3 files with indirect coverage changes

icfaust · 2025-10-23T13:53:13Z

Please make sure to apply 665b903 as well. My reproduction of this now raises:

TypeError: Multiple namespaces for array inputs: {<module 'dpnp' from 'dpnp/__init__.py'>, <module 'array_api_compat.numpy' from 'array_api_compat/numpy/__init__.py

How does sklearn itself behave here? See https://github.com/scikit-learn/scikit-learn/blob/c60dae2060/sklearn/decomposition/_base.py#L167 and try `inverse_transform` using numpy data (that is one of the major array API estimators in their codebase).

david-cortes-intel · 2025-10-23T14:02:26Z

I was able to trigger the new error message after merging that commit.

But if this is the intended behavior, then please modify the docs too, since they currently state that it should work:

scikit-learn-intelex/doc/sources/array_api.rst

Line 147 in 55571d3

# Fitted models can be passed array API inputs of a different class

(Note that when that doc was written, fitting on array API inputs on CPU and then predicting on numpy would work for the other estimators that support array API.)

icfaust · 2025-10-23T14:10:52Z

/intelci: run

icfaust · 2025-10-24T12:54:24Z

/intelci: run

Update _array_api.py

af4db1a

icfaust mentioned this pull request Oct 23, 2025

[enhancement] Enable Array API in ensemble algos #2201

Open

13 tasks

icfaust added 3 commits October 23, 2025 13:40

Update _array_api.py

daf6e5b

Update _array_api.py

a7bd059

Update _array_api.py

3322331

icfaust added 2 commits October 24, 2025 05:30

Merge branch 'uxlfoundation:main' into dev/update_get_namespace

f021183

Merge branch 'uxlfoundation:main' into dev/update_get_namespace

bfe2bc5
