
BUG: MemoryError when using negative width in numpy.strings.xxx or numpy.char.xxx with StringDType arrays #29359

@Kairoven

Describe the issue:

Calling numpy.strings.center or numpy.char.center on an array of the new experimental StringDType with a negative width raises a MemoryError.

However, the same negative width works fine with Python str objects, numpy.str_, and '<U' string arrays: each returns the input unchanged, which matches Python's str.center behavior.
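For reference, the baseline behavior can be checked with the built-in str methods alone. This is a minimal sketch that needs no NumPy; it also covers ljust, rjust, and zfill, which follow the same rule in plain Python.

```python
# Built-in str padding methods treat a negative width like any width
# shorter than the string: the input comes back unchanged.
s = "test"
padded = {
    name: getattr(s, name)(-1)
    for name in ("center", "ljust", "rjust", "zfill")
}
for name, result in padded.items():
    print(f"str.{name}(-1) -> {result!r}")
```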

Reproduce the code example:

import numpy as np
from numpy.dtypes import StringDType

a1 = 'test'
a2 = np.array('test', dtype=np.str_)
a3 = np.array('test', dtype='<U4')
a4 = np.array('test', dtype=StringDType())

res1 = np.strings.center(a1, -1)
print(res1) # test

res2 = np.strings.center(a2, -1)
print(res2) # test

res3 = np.strings.center(a3, -1)
print(res3) # test

res4 = np.strings.center(a4, -1)  # MemoryError: Failed to allocate string in _center
print(res4)  # never reached

Error message:

Traceback (most recent call last):
  File "xxx/test_string.py", line 45, in <module>
    res4 = numpy.strings.center(a4, -1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "xxx/python3.12/site-packages/numpy/_core/strings.py", line 765, in center
    return _center(a, width, fillchar)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
MemoryError: Failed to allocate string in _center

Python and NumPy Versions:

Numpy: 2.3.1
Python: 3.12.0 | packaged by Anaconda, Inc. | (main, Oct 2 2023, 17:29:18) [GCC 11.2.0]

Runtime Environment:

[{'numpy_version': '2.3.1',
  'python': '3.12.0 | packaged by Anaconda, Inc. | (main, Oct 2 2023, '
            '17:29:18) [GCC 11.2.0]',
  'uname': uname_result(system='Linux', node='gpu-node5', release='5.4.0-100-generic', version='#113-Ubuntu SMP Thu Feb 3 18:43:29 UTC 2022', machine='x86_64')},
 {'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2',
                                'AVX512F',
                                'AVX512CD',
                                'AVX512_SKX',
                                'AVX512_CLX',
                                'AVX512_CNL',
                                'AVX512_ICL'],
                      'not_found': ['AVX512_KNL', 'AVX512_KNM', 'AVX512_SPR']}},
 {'architecture': 'SkylakeX',
  'filepath': '/home/root/miniconda3/envs/pbt/lib/python3.12/site-packages/numpy.libs/libscipy_openblas64_-56d6093b.so',
  'internal_api': 'openblas',
  'num_threads': 64,
  'prefix': 'libscipy_openblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.29'}]

Context for the issue:

I found that this issue is not limited to center, but also occurs in other string/char functions such as ljust, rjust, and zfill when used with StringDType arrays and negative width values. This appears to be a broader issue with StringDType handling in these functions.

Although the documentation states that StringDType is supported, inconsistent or erroneous behavior (e.g., MemoryError) suggests there may be an underlying bug. Should we take any action to address this inconsistency and improve robustness for users?
