-
Notifications
You must be signed in to change notification settings - Fork 52
feat: add isin
to the specification
#959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,7 +1,50 @@ | ||||||
__all__ = ["unique_all", "unique_counts", "unique_inverse", "unique_values"] | ||||||
__all__ = ["isin", "unique_all", "unique_counts", "unique_inverse", "unique_values"] | ||||||
|
||||||
|
||||||
from ._types import Tuple, array | ||||||
from ._types import Tuple, Union, array | ||||||
|
||||||
|
||||||
def isin( | ||||||
x1: Union[array, int, float, complex, bool], | ||||||
x2: Union[array, int, float, complex, bool], | ||||||
/, | ||||||
*, | ||||||
invert: bool = False, | ||||||
) -> array: | ||||||
""" | ||||||
Tests whether each element in ``x1`` is in ``x2``. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about this wording:
Suggested change
|
||||||
Parameters | ||||||
---------- | ||||||
x1: Union[array, int, float, complex, bool] | ||||||
first input array. **May** have any data type. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to double-check, are we happy with e.g. torch not allowing complex values here:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ev-br How difficult would this be to work around in the compat layer? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's doable of course. That said, we are then adding to the growing "thickness" of the allegedly thin compat shim that is array-api-compat, and we'd be adding one more thing that realistically pytorch is not going to implement in a foreseeable future. Meaning we should be very clear that these small steps all move the compat layer from a temporary solution to permanent. Which looks like we should just stop pretending that the compat layer is temporary. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What are actual use-cases for (In the sense, that one point of the Array API was to not add a lot of complicated/awkward API.) |
||||||
x2: Union[array, int, float, complex, bool] | ||||||
second input array. **May** have any data type. | ||||||
invert: bool | ||||||
boolean indicating whether to invert the test criterion. If ``True``, the function **must** test whether each element in ``x1`` is *not* in ``x2``. If ``False``, the function **must** test whether each element in ``x1`` is in ``x2``. Default: ``False``. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I think that may be true... which in fact might be a bit awkward, because I am not sure if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As long as nans are never
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suppose the weird thing is whether There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, sorry, mind-slip. Somehow I sometimes think just inverting can lead to weird things with NaNs, but that only works with the other comparisons not There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, here it only works because of equality comparison IIUC. Otherwise you're completely right, nans throw off logical inversion There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ev-br Yes, in principle, we could drop the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I recognize that the performance optimization argument I just stated is a bit tenuous here, given that the OP explicitly advocates against including |
||||||
Returns | ||||||
------- | ||||||
out: array | ||||||
an array containing element-wise test results. The returned array **must** have the same shape as ``x1`` and **must** have a boolean data type. | ||||||
Notes | ||||||
----- | ||||||
- At least one of ``x1`` or ``x2`` **must** be an array. | ||||||
- If an element in ``x1`` is in ``x2``, the corresponding element in the output array **must** be ``True``; otherwise, the corresponding element in the output array **must** be ``False``. | ||||||
- Testing whether an element in ``x1`` corresponds to an element in ``x2`` **must** be determined based on value equality (see :func:`~array_api.equal`). For input arrays having floating-point data types, value-based equality implies the following behavior. When ``invert`` is ``False``, | ||||||
- As ``nan`` values compare as ``False``, if an element in ``x1`` is ``nan``, the corresponding element in the returned array **must** be ``False``. | ||||||
- As complex floating-point values having at least one ``nan`` component compare as ``False``, if an element in ``x1`` is a complex floating-point value having one or more ``nan`` components, the corresponding element in the returned array **must** be ``False``. | ||||||
- As ``-0`` and ``+0`` compare as ``True``, if an element in ``x1`` is ``±0`` and ``x2`` contains at least one element which is ``±0``, the corresponding element in the returned array **must** be ``True``. | ||||||
When ``invert`` is ``True``, the returned array **must** contain the same results as if the operation is implemented as ``logical_not(isin(x1, x2))``. | ||||||
- Comparison of arrays without a corresponding promotable data type (see :ref:`type-promotion`) is unspecified and thus implementation-defined. | ||||||
""" | ||||||
|
||||||
|
||||||
def unique_all(x: array, /) -> Tuple[array, array, array, array]: | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the very late comment! I'm afraid allowing
x2
to be anarray
is problematic for ndonnx (or rather ONNX in general). Ifx2
is known "at build time" - as it would be if it were, for instance, aSequence[int | float | complex | bool]
- then there are efficient ways to implement this in the ONNX standard using a HashMap. However, if the values ofx2
are not known ahead of time, then one would be forced to implement this using broadcasting, such asx1[:, None] == x2[None, :]
and a subsequentany
reduction. The performance of such an implementation may be surprisingly bad in some circumstances. This is also the reason why we use aSequence
rather than an array in the current ndonnx implementation.