-
Couldn't load subscription status.
- Fork 184
[maintenance] change logic in sklearnex get_namespace to follow sklearn array API expected behavior
#2747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@icfaust I'm still seeing the same error message after merging this PR in your RF branch: import os, sys
os.environ["SCIPY_ARRAY_API"] = "1"
import numpy as np
import dpnp
from sklearnex import config_context, set_config
from sklearnex.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(random_state=123)
Xd = dpnp.array(X, dtype=np.float32, device="cpu")
yd = dpnp.array(y, dtype=np.float32, device="cpu")
set_config(array_api_dispatch=True)
model = RandomForestClassifier(n_estimators=1, max_depth=5).fit(Xd, yd)
model.predict(X[:5])I cannot reproduce the error with any of the other current array API classes, although I see that previous ones all use arrays of the given class to store fitted attributes like coefficients, whereas forests work quite differently. |
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 3 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Please make sure to apply 665b903 as well. My reproduction of this now raises: TypeError: Multiple namespaces for array inputs: {<module 'dpnp' from 'dpnp/__init__.py'>, <module 'array_api_compat.numpy' from 'array_api_compat/numpy/__init__.pyHow about with sklearn: https://github.com/scikit-learn/scikit-learn/blob/c60dae2060/sklearn/decomposition/_base.py#L167 Try |
I was able to trigger the new error message after merging that commit. But if this is the intended behavior, then please modify the docs too, since they currently state that it should work:
(note that when that doc was written. fitting on array API on CPU and then predicting on numpy would work for the other estimators that support array API) |
|
/intelci: run |
|
/intelci: run |
Description
DPNP and DPCTL now support the
__array_namespace__attribute for array API. Sklearn'sget_namespacecontains checks for validation in verifying the same namespace. There is a bit of a dance withget_namespacewhere it is used before and aftervalidate_dataand before and aftersklearnex._device_offload.dispatch. This change allows for the array namespace verification in the case that array_api_dispatch is enabled.@david-cortes-intel I think your reviews with intermixed array types should be formalized and integrated for better quality of the codebase. I would recommend also device intermixing be done. This way we can accelerate #2209 #2201 #2700 and #2654 to minimize maintenance without getting bogged down in a bug hunt impacting larger array API scope (i.e. its likely that other estimators are failing in similar ways and need follow up work).
Checklist:
Completeness and readability
Testing
Performance