🌐 AI搜索 & 代理 主页
Skip to content

Conversation

@mtsokol
Copy link
Member

@mtsokol mtsokol commented Sep 8, 2025

This PR fixes a refcount bug raised in #29707 and adds a test which fails on main branch for linux, x86.

ngoldbaum
ngoldbaum previously approved these changes Sep 8, 2025
Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'll let someone else look this over before merging

@charris charris added the 09 - Backport-Candidate PRs tagged should be backported label Sep 9, 2025
@ngoldbaum
Copy link
Member

@benburrill thanks for the code review help!

The reference counting here is tricky - I'd like @seberg to look this over. He's on vacation this week so might not look at it until next week.

@mtsokol
Copy link
Member Author

mtsokol commented Sep 10, 2025

Sure, let's wait for another pair of eyes. In general, PyArg_ParseTupleAndKeywords increments refcount for newtype, and PyArray_CastToType steals it - therefore the standard path where we pass dtype doesn't alter refcount.

When we don't pass it, and have copy=True, we need an additional Py_INCREF(newtype); to even out PyArray_CastToType.

When we pass it (and is the same as array's) and have copy=None, we have one refcount extra from PyArg_ParseTupleAndKeywords parsing, ergo we need an additional Py_XDECREF(newtype); for that path to even out missing stealing call (which is only present later, when newtype is present and differs from array's dtype).

The test covers: dtype(none | same | different) x copy(true | false) (6) scenarios.

@ngoldbaum
Copy link
Member

I think it would help if you added an edited-down subset of that last comment as a comment in the source code. Probably some of it makes more sense as a comment in the test and some of it makes sense as a comment in the __array__ implementation.

@benburrill
Copy link

In general, PyArg_ParseTupleAndKeywords increments refcount for newtype

Just to be slightly more precise, PyArray_DescrConverter2 is (as I understand it) what's actually responsible for incrementing the refcount for newtype, not PyArg_ParseTupleAndKeywords itself.

@mtsokol
Copy link
Member Author

mtsokol commented Sep 10, 2025

Correct, PyArray_DescrConverter2 converter function which is passed to PyArg_ParseTupleAndKeywords.

@ngoldbaum ngoldbaum dismissed their stale review September 10, 2025 17:33

I approved this with a bug

@benburrill
Copy link

Yeah, I just thought the wording could potentially be a bit misleading to readers (if you're planning to put something similar in a comment in the source code), since although

In general, PyArg_ParseTupleAndKeywords increments refcount for newtype

is true for this specific code,

In general, PyArg_ParseTupleAndKeywords increments refcount

is not true.

return self.val.__array__(dtype=dtype, copy=copy)

# test all possible scenarios:
# dtype(none | same | different) x copy(true | false)
Copy link

@benburrill benburrill Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized you haven't fully tested the COPY_NEVER case. Really, you should be testing copy(True | False | None). Currently, you're actually testing copy(True | None).

This test fails:

with pytest.raises(ValueError):
    np.array(MyArray(dt), dtype=dt2, copy=False)
assert_equal(old_refcount2, sys.getrefcount(dt2))

Adding Py_DECREF(newtype); to the npy_no_copy_err_msg branch fixes it.

I'm also suspicious about the if (!PyArray_CheckExact(self)) branch (I think at the very least there should be Py_XDECREF(newtype) before the return NULL;, and possibly more than that), but I'm not totally sure how to test it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, copy=False when dtypes are different was a path missed in the test - thanks!

About PyArray_CheckExact: Py_XDECREF was clearly missing there, but I think it's already an esoteric path, as PyArray_NewFromDescrAndBase needs to fail and set an exception.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although I think the new comments are more verbose than helpful. Half the paths (at least the error paths) have no comment. What is useful, IMO, is that CastToType steals the reference (because reference stealing is confusing API basically).

If we want to really clarify this all, I think the right thing would be to simplify the code instead...

  • Swap self and new naming. Because new is clearly a new reference ;), while self is suprisingly a new reference.
  • One could move newtype == NULL logic to the top directly after the argument parsing, that way it get's much clearer why the Incref is needed. (at the cost of an unnecessary incref/decref.)
  • The best way to avoid piling up DECREF() before the return would be to just make it goto finish; that has the logic. But since that requires adding newtype = NULL; // Stolen by CastToType (or another incref), I am not sure it's nicer. It would be clearer though.

Anyway, besides thinking that the comments are not useful except as comments during review, this looks right to me.

Comment on lines 966 to 968
// Take a new strong reference to compensate for
// PyArray_CastToType stealing a reference to newtype.
Py_INCREF(newtype);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Take a new strong reference to compensate for
// PyArray_CastToType stealing a reference to newtype.
Py_INCREF(newtype);
Py_INCREF(newtype); // newtype is owned.

Maybe. I don't like the overly long comments, TBH.

We could add a comment on all CastToType calls that it steals the reference, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Also added CastToType comments.

Comment on lines 975 to 976
// If newtype isn't NULL then release a strong reference
// introduced by PyArray_DescrConverter2 in arg parsing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If newtype isn't NULL then release a strong reference
// introduced by PyArray_DescrConverter2 in arg parsing.

I disagree that it is helpful to add it here when it's not everywhere else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

@mtsokol
Copy link
Member Author

mtsokol commented Sep 16, 2025

@seberg thanks for feedback! I found moving newtype == NULL check to the top really useful - now it's clearer how we count refs here. I decided skip the other two points if it's OK.

@seberg
Copy link
Member

seberg commented Sep 17, 2025

Nice, thanks! Yeah, I didn't care about doing all of those things, more of a rambling that in my opinion code reorganization would do more good than long comments :).

@seberg seberg merged commit c9e7592 into numpy:main Sep 17, 2025
77 checks passed
@mtsokol mtsokol deleted the ms/fix-__array__-refcount branch September 17, 2025 07:19
bwhitt7 pushed a commit to bwhitt7/numpy that referenced this pull request Sep 23, 2025
* BUG: Fix `dtype` refcount in `__array__`

* Consider all possible code paths

* Remove else-if branch

* Move refcount checks to a separate test

* Add code comments

* Add missing `Py_DECREF` for error path

* Apply review comments
charris pushed a commit to charris/numpy that referenced this pull request Sep 23, 2025
* BUG: Fix `dtype` refcount in `__array__`

* Consider all possible code paths

* Remove else-if branch

* Move refcount checks to a separate test

* Add code comments

* Add missing `Py_DECREF` for error path

* Apply review comments
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Sep 23, 2025
charris added a commit that referenced this pull request Sep 23, 2025
BUG: Fix `dtype` refcount in `__array__` (#29715)
IndifferentArea pushed a commit to IndifferentArea/numpy that referenced this pull request Dec 7, 2025
* BUG: Fix `dtype` refcount in `__array__`

* Consider all possible code paths

* Remove else-if branch

* Move refcount checks to a separate test

* Add code comments

* Add missing `Py_DECREF` for error path

* Apply review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Reference counting problem on dtypes with __array__

5 participants