🌐 AI搜索 & 代理 主页
Skip to content

Conversation

@mhvk
Copy link
Contributor

@mhvk mhvk commented Oct 14, 2025

With #28576, it has become possible to ensure that when calling a ufunc, the output is guaranteed to be an array.

This PR uses that to replace np.asanyarray calls in some functions. I'm not sure this is complete -- these were just parts of code for which test failed if np.anyarray(scalar) is made to return a read-only array (see #29876).

I do think there are all small improvements, though.

@mhvk mhvk force-pushed the use-ellipses-to-keep-array branch from 4f7fcd0 to 3e27ff0 Compare October 14, 2025 16:52
@jorenham jorenham changed the title MAINT: replace use of asanyarray with ... to keep arrays MAINT: replace use of asanyarray with out=... to keep arrays Oct 14, 2025
Copy link
Contributor Author

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments to help review.

x = um.subtract(arr, arrmean, out=...)
if issubclass(arr.dtype.type, (nt.floating, nt.integer)):
x = um.multiply(x, x, out=x)
x = um.square(x, out=x)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, couldn't resist making a few more small improvements as I was here...

q = np.quantile(x, .5)
assert_equal(q, 1.75)
assert_equal(type(q), np.float64)
assert isinstance(q, float)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a side effect, the function now returns regular float. Since here one is dealing with object arrays, that arguably is more appropriate. But it really should not matter at all -- it is not like one could count on an np.float64, in most cases one gets a Fraction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I had skipped some cases like this when first adding out=..., but should have followed up on it. I think this is fine for object array input... It is arguably a bug-fix.
But it is also a very subtle small change that might break someone in theory!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really only in theory, I think, since even before one couldn't count on getting np.float64...


# Convention is to return scalars instead of 0d arrays
if r.ndim == 0:
r = r[()]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change below takes care of this now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, yeah with r potentially being scalar might be awkward, but I don't think this makes it less awkward and we use dtype.type(val) in enough places that I think we can rely on it.
(It seemed slightly awkward for object dtype, but it works even there.)

@mhvk mhvk force-pushed the use-ellipses-to-keep-array branch from 3e27ff0 to 86c9d84 Compare October 14, 2025 19:15
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks. The change in quantile for object input is very subtle, but I think we should do it and it's technically fixing bugs for decimal or fractional inputs.

I'll wait a bit in case someone has thoughts, but I think we can put this in.

q = np.quantile(x, .5)
assert_equal(q, 1.75)
assert_equal(type(q), np.float64)
assert isinstance(q, float)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I had skipped some cases like this when first adding out=..., but should have followed up on it. I think this is fine for object array input... It is arguably a bug-fix.
But it is also a very subtle small change that might break someone in theory!

diff_b_a = subtract(b, a)
# asanyarray is a stop-gap until gh-13105
lerp_interpolation = asanyarray(add(a, diff_b_a * t, out=out))
diff_b_a = b - a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to do subtract(b, a, out=...) actually and stick to arrays?
(but doesn't matter, TBH.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My logic was that if both are scalars, this is faster. I had hoped more generally to go to just using the operators, but in other places it turned out annoying, and then I decided to keep the changes minimal... Anyway, would suggest to leave it, but can change back if desired.


# Convention is to return scalars instead of 0d arrays
if r.ndim == 0:
r = r[()]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, yeah with r potentially being scalar might be awkward, but I don't think this makes it less awkward and we use dtype.type(val) in enough places that I think we can rely on it.
(It seemed slightly awkward for object dtype, but it works even there.)

@seberg
Copy link
Member

seberg commented Oct 28, 2025

Let's get this in then, thanks Marten!

The biggest change (I am not sure how big), is that quantiles propagate Fractions better with this which might change things in some niche situation (arguably a fix, but possibly a subtle change).

@seberg seberg merged commit a3a2f2f into numpy:main Oct 28, 2025
77 checks passed
@mhvk mhvk deleted the use-ellipses-to-keep-array branch October 28, 2025 12:11
@neutrinoceros
Copy link
Contributor

FYI it seems that, as a side effect, this fixed some odd behavior we had in unyt
xref: yt-project/unyt#603

I don't understand it myself without a deeper dig, but I thought that'd be interesting to you guys !

@mhvk
Copy link
Contributor Author

mhvk commented Nov 1, 2025

@neutrinoceros - nice to hear. Agreed it is not that obvious what change actually made the code work better.

"""
# promote back to an array if flattened
res = nx.asanyarray(nx.ceil(x, out=out))
res = nx.ceil(x, out=... if out is None else out)
Copy link

@rhshadrach rhshadrach Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is causing a failure in pandas

https://github.com/pandas-dev/pandas/actions/runs/19195436099/job/54875728980?pr=63046#step:5:63

Not clear to me this was intentional. This is complicated by the fact that this was already xfailed on the main branch due to pandas-dev/pandas#51082 (comment), see also the comment a few below this one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not intentional, but we should allow out=... in pandas' __array_ufunc__ implementation maybe (it is the same as None except for reductions).

This allows fixing weird corner cases with object dtype in NumPy, so it would be nice to not stall on it forever.

(I don't mind reverting this or similar cases for now if you prefer.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm. We should just change fix to forward to trunc, that should be safe and good right now and not worry about this regression.

I'll have a look at the __array_ufunc__ in pandas so that this type of change doesn't break anything in the future.

cakedev0 pushed a commit to cakedev0/numpy that referenced this pull request Dec 5, 2025
With numpygh-28576, it has become possible to ensure that when calling a ufunc, the output is guaranteed to be an array.

This PR uses that to replace np.asanyarray calls in some functions. I'm not sure this is complete -- these were just parts of code for which test failed if np.anyarray(scalar) is made to return a read-only array (see numpygh-29876).

I do think there are all small improvements, though.
IndifferentArea pushed a commit to IndifferentArea/numpy that referenced this pull request Dec 7, 2025
With numpygh-28576, it has become possible to ensure that when calling a ufunc, the output is guaranteed to be an array.

This PR uses that to replace np.asanyarray calls in some functions. I'm not sure this is complete -- these were just parts of code for which test failed if np.anyarray(scalar) is made to return a read-only array (see numpygh-29876).

I do think there are all small improvements, though.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants