🌐 AI搜索 & 代理 主页
Skip to content

Conversation

@scratchmex
Copy link
Contributor

@scratchmex scratchmex commented Dec 3, 2025

remove begin repeat and use C macros. work towards #29528

for now I'm just refactoring to use macros and whenever the c++ templates are needed I will change to it. moving to C macros still has it's advantages cause I'm aiming for grepability and thus LSP advantages: Py@NAME@ArrType_Type vs PyNumberArrType_Type

the objective is to have a scalartypes.c or scalartypes.cpp

edit: the objective now is scalartypes.cpp

- DEF_PYARRTYPE
- DEF_BINOP
@seberg
Copy link
Member

seberg commented Dec 3, 2025

I think the helpful first step may to be convert the file to C++ to continue in steps from there. These custom macros seem a step in the wrong direction, since if we are to replace custom templating with C++ templating, we should use that.

@scratchmex
Copy link
Contributor Author

scratchmex commented Dec 3, 2025

I'm going to do that in another PR because apparently is not as trivial as I thought. converting from scalartypes.c.src to scalartypes.cpp.src

edit: I pushed the changes to review, and because this will have a dependency on it. Just confirm me if you want this in separate PR for easy review and to avoid noise on this

edit 2: well apparently for the migration to C++ in windows it requires C++ 20 :/. do you have a suggestion? I think it is not suitable to do it without designated initializers { .tp_name = val }

error C7555: use of designated initializers requires at least '/std:c++20'

edit 3: learning stuff: struct factories at top level and compile time. introduce make_pyarr_type

edit 4: removed all designated initializers just trying to make it compile in MSVC so that I can mark as "completed" the migration to C++.

> warning: ISO C++11 does not allow conversion from string literal to 'char *'

in python >=3.13 they changed PyArg_ParseTupleAndKeywords to have char *const* in C++, instead of char**
ref: https://docs.python.org/3/c-api/arg.html#c.PyArg_ParseTupleAndKeywords

TODO: remove these (char **) castings when we drop support for 3.12
@scratchmex scratchmex force-pushed the gh-29528--scalartypes branch from 567fb25 to 69fa910 Compare December 3, 2025 23:01
this should be reverted when moving to c++20 in MSVC
@scratchmex scratchmex force-pushed the gh-29528--scalartypes branch from d4daaff to b09125a Compare December 4, 2025 01:17
@scratchmex
Copy link
Contributor Author

scratchmex commented Dec 4, 2025

nicee. finally C++ compiles jaja!

Just confirm me if you want the migration to C++ in a separate PR for easy review and to avoid noise on the removal of .src x2

edit 5: cannot make factories for array of methods because a) trying to use static variables inside a constexpr is not allowed and b) returning a stack address associated with a local variable is not something we want.

a) definition of a static variable in a constexpr function is a C++23 extension
b) address of stack memory associated with local variable 'methods' returned

because of that, I will just repeat the definition of those

edit 6: cannot use constexpr std::map until C++26 to avoid using a chain of if-else constexpr

edit 7: cannot use template <const char* mystr> to create function factories. nor I can make a function return another function (lambda). will need to make definition of each function that wraps the generic one

@seberg
Copy link
Member

seberg commented Dec 5, 2025

I think it is not suitable to do it without designated initializers { .tp_name = val }

I think they are allowed to be written as long as they are in the exactly correct order? I am not sure how many holes they have, this is one annoyance with C++. If there are few we might also just fill them in explicitly, otherwise, I guess this tricky is fine.

Maybe it makes sense to keep going here for now and then see if we want to split out just the initial conversion for example. (A bit conflicting, I would prefer to split it out, but if we don't follow through with the full thing there is no point.)

For what it's worth, I think the changes you did beyond the main change are not far reaching enough to be nice, we need to go all-in on templates for things to make sense. (Almost) anything repeated should be a template instead, the bufferprocs are a good example, that should just be something like bufferprocs<typename T>.
(The gentype version is then the overload that kicks in if a more specialized version is not defined.)

EDIT: CC @athurdekoos for awareness (and maybe if you want to even review).

@scratchmex
Copy link
Contributor Author

the bufferprocs are a good example, that should just be something like bufferprocs<typename T>

you want

static auto unicode_arrtype_as_buffer = generic_pybufferprocs<unicode_getbuffer>;
static auto bool_arrtype_as_buffer = generic_pybufferprocs<dunder_buffer_impl1<npy_bool, PyBoolScalarObject>>;

instead of

static auto unicode_arrtype_as_buffer = make_pybufferprocs(unicode_getbuffer);
static auto bool_arrtype_as_buffer = make_pybufferprocs(dunder_buffer_impl1<npy_bool, PyBoolScalarObject>);

?

@seberg
Copy link
Member

seberg commented Dec 6, 2025

No, unicode_arrtype_as_buffer need not exisst at all.

@scratchmex
Copy link
Contributor Author

scratchmex commented Dec 6, 2025

unicode_arrtype_as_buffer needs to live in the stack at the top level scope because later its pointer is passed to their type definition like this

// that's why I do
static auto unicode_arrtype_as_buffer = generic_pybufferprocs<unicode_getbuffer>;  // << top level
PyUnicodeArrType_Type.tp_as_buffer = &unicode_arrtype_as_buffer;
// instead of
PyUnicodeArrType_Type.tp_as_buffer = &make_pybufferprocs(unicode_getbuffer);

and doing something like

PyUnicodeArrType_Type.tp_as_buffer = &arrtype_as_buffer<unicode_getbuffer>

I wasn't able to do it yet. Can you give an example for that?

@scratchmex
Copy link
Contributor Author

I think the PR can be taken a first look. Remarking the following points

  1. The most difficult part of the migration was the __new__ impls block, I would like a review on that
  2. I want to get rid of the macros but I don't find a better way to do it right now
  3. seberg comment about assigning a pointer to a templated struct. I don't know how to do it

ccing the top 3 commiters of scalartypes.c.src @seberg @ngoldbaum @mtsokol

Comment on lines +840 to 843
PyMem_Free(ip);
if (new_ == NULL) {
return NULL;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PyMem_Free(ip);
if (new_ == NULL) {
return NULL;
}
if (new_ == NULL) {
PyMem_Free(ip);
return NULL;
}

@scratchmex scratchmex marked this pull request as ready for review December 7, 2025 00:30
@scratchmex scratchmex changed the title MAINT: refactor scalartypes.c.src to avoid using custom metaprogramming MAINT: refactor scalartypes.c.src to pure C++ Dec 7, 2025
@ngoldbaum
Copy link
Member

ping @athurdekoos as well

Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed over most of the changes in scalartypes.cpp and left a few comments. No careful review here.

npy_type scalar = PyArrayScalar_VAL(self, ScalarName); \
uint8_t count = npy_popcount_func(scalar); \
return ConvertFrom(count); \
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this be a template function?

gentype_squeeze(PyObject *self, PyObject *args, PyObject *kwds)
{
return gentype_generic_method(self, args, kwds, "squeeze");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that it's trivial it's probably better to make this a macro though, it'll be a lot more concise.

(PyCFunction)npy_int_bit_count,
METH_NOARGS, NULL},
{NULL, NULL, 0, NULL} /* sentinel */
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that these are all explicitly defined in the source now, it's OK that it's more verbose.

else if constexpr (std::is_same_v<PT, PyCDoubleScalarObject>) {
return "Zd";
}
else if constexpr (std::is_same_v<PT, PyCLongDoubleScalarObject>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

general question about code like this from a non-expert in C++, code like this using constexpr is evaluated at compile-time, right?

@seberg
Copy link
Member

seberg commented Dec 9, 2025

seberg comment about assigning a pointer to a templated struct. I don't know how to do it

Maybe check out what jax/ml_dtypes is doing and see if you get inspired from a pattern. I don't really want to try around right now, but I think this has to happen in some form.
I would hope all explicit instantiation with a name is unnecessary with the exception a single global type probably.
(In part, focusing on this type of discussion is why it might be helpful to do just the conversion first.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants