🌐 AI搜索 & 代理 主页
Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Nov 29, 2025

Summary by CodeRabbit

  • New Features

    • Metaclass-driven Structure and Union types with automatic field processing
    • CArgObject class exposed to ctypes for byref/reference handling
    • Platform-aware _wstring_at_addr attribute for wide-string address resolution
    • Array type multiplication to create nested array types
    • Field descriptors implementing descriptor protocol for access and mutation
  • Bug Fixes

    • byref() now validates input and returns a proper error or CArgObject
  • Improvements

    • Buffer-backed, sequence-enabled arrays with direct byte access and item operations

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 29, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Reworks _ctypes: adds metaclass-driven Structure/Union types, converts C arrays to buffer-backed, sequence-enabled types, implements descriptor-based PyCField read/write, changes byref to validate and return a new CArgObject, and exposes a platform-aware _wstring_at_addr attr.

Changes

Cohort / File(s) Summary
Core ctypes module and wiring
crates/vm/src/stdlib/ctypes.rs
Registers PyCStructType and PyCUnionType; exposes CArgObject pyclass with _obj getter; changes byref signature to (obj, offset) returning a validated CArgObject; adds _wstring_at_addr attr; imports ToPyObject, PyPayload, PyObjectRef.
Array refactor & behavior
crates/vm/src/stdlib/ctypes/array.rs, crates/vm/src/stdlib/ctypes/base.rs, crates/vm/src/stdlib/ctypes/pointer.rs
PyCArrayType now wraps an inner PyCArray; PyCArray drops value in favor of element_size: AtomicCell and buffer: PyRwLock<Vec<u8>>; adds sequence semantics (AsSequence), numeric ops (AsNumber/__mul__), item access (__getitem__, __setitem__, __len__), packing helpers, and to_arg from buffer contents.
Field descriptor implementation
crates/vm/src/stdlib/ctypes/field.rs
PyCField becomes a descriptor: adds PyCField::new, descr_get, descr_set, __set__, __delete__; proto changed to PyObjectRef; implements bytes↔value helpers and returns type as PyObjectRef.
Structure metaclass & Structure class
crates/vm/src/stdlib/ctypes/structure.rs
Adds PyCStructType metaclass with py_new, __init_subclass__, process_fields, get_field_size, __mul__; introduces FieldInfo; refactors PyCStructure to use buffer, fields map and size, with py_new populating fields and buffer.
Union metaclass & Union class
crates/vm/src/stdlib/ctypes/union.rs
Replaces direct Union with PyCUnionType metaclass; adds py_new, process_fields, and get_field_size; registers PyCField descriptors (offset 0) during class creation.

Sequence Diagram

sequenceDiagram
    participant User as User code
    participant Meta as PyCStructType / PyCUnionType
    participant Cls as Structure/Union Class
    participant Field as PyCField (descriptor)
    participant Inst as Structure/Union Instance
    participant Core as _ctypes module

    User->>Meta: define class with _fields_
    Meta->>Meta: process_fields(_fields_)
    Meta->>Field: create PyCField descriptors (compute size/offset)
    Meta->>Cls: register descriptors and fields metadata
    Meta-->>User: return created class

    User->>Cls: instantiate(args/kwargs)
    Cls->>Inst: py_new -> allocate buffer (Vec<u8>) and fields map
    Inst->>Inst: populate buffer from args/kwargs
    Inst-->>User: instance returned

    User->>Inst: read attribute
    Inst->>Field: descriptor __get__ (reads buffer[offset...])
    Field->>Field: bytes_to_value -> PyObject
    Field-->>User: value

    User->>Inst: write attribute
    Inst->>Field: descriptor __set__ (value_to_bytes -> bytes)
    Field->>Inst: write bytes into buffer[offset...]
    Field-->>User: done
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

  • Pay special attention to:
    • PyCArray buffer lifecycle, element_size correctness, and safety/to_arg pointer/lifetime behavior.
    • Descriptor bytes↔value conversion (bitfields, alignment, endianness).
    • process_fields offset/size calculations and descriptor registration.
    • byref → CArgObject validation, VM exposure, and ABI of returned wrapper.

Possibly related PRs

  • ctypes struct/union/array #6309 — appears to implement the same ctypes changes (CArgObject, metaclass registrations, byref signature/behavior).
  • Ctypes __mul__ #6305 — overlaps on adding AsNumber/mul behavior used by pointer/array multiplication and numeric semantics.
  • Downcastable #5986 — touches ctypes array implementation and to_arg/downcasting details overlapping with buffer/to_arg changes.

Suggested reviewers

  • arihant2math

"🐰
I nibble bytes and line them neat,
Fields and buffers now all meet,
Metaclasses hop, descriptors sing,
Byref brings CArg with a spring,
Hooray — ctypes leaps to greet!"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 67.27% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'ctypes struct/union/array' is a concise summary of the main changes, which focus on implementing metaclass-driven ctypes structures, unions, and arrays with field processing and buffer management.

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10fe041 and 711ea51.

📒 Files selected for processing (3)
  • crates/vm/src/stdlib/ctypes.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review November 29, 2025 02:04
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/ctypes/field.rs (1)

68-76: Repr displays boolean as size value.

bitfield_size is a bool (line 29), but the repr format bit_size={bitfield_size} suggests it should be a numeric size. This will display bit_size=true instead of an actual bit count.

Either rename the field to is_bitfield or change its type to store the actual bit size:

-    pub(super) bitfield_size: bool,
+    pub(super) bitfield_size: Option<usize>,  // None if not a bitfield, Some(n) for n-bit field
🧹 Nitpick comments (3)
crates/vm/src/stdlib/ctypes/union.rs (1)

85-109: Consider extracting shared size calculation logic.

This get_field_size logic duplicates the element size derivation in base.rs (lines 311-324) and array.rs. Consider extracting this into a shared helper function in _ctypes to reduce duplication and ensure consistent behavior.

Example shared helper:

// In _ctypes module
pub fn get_field_size_from_type(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> {
    // ... shared implementation
}
crates/vm/src/stdlib/ctypes/array.rs (1)

374-382: Silent failure when value is not bytes.

The set_value setter silently ignores non-PyBytes values and returns Ok(()). This could mask user errors. Consider raising a TypeError for consistency with set_raw.

     #[pygetset(setter)]
     fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> {
         if let Some(bytes) = value.downcast_ref::<PyBytes>() {
             let mut buffer = self.buffer.write();
             let src = bytes.as_bytes();
             let len = std::cmp::min(src.len(), buffer.len());
             buffer[..len].copy_from_slice(&src[..len]);
+            Ok(())
+        } else {
+            Err(vm.new_type_error("expected bytes".to_owned()))
         }
-        Ok(())
     }
crates/vm/src/stdlib/ctypes/structure.rs (1)

233-242: Inconsistent error handling: silently skipping malformed fields.

process_fields returns errors for malformed field tuples, but py_new silently skips them with continue. This inconsistency could cause hard-to-debug issues where a structure appears to work but has missing fields.

Consider reusing validation logic or returning errors consistently:

-            let Some(field_tuple) = field.downcast_ref::<PyTuple>() else {
-                continue;
-            };
-            if field_tuple.len() < 2 {
-                continue;
-            }
-            let Some(name) = field_tuple.first().unwrap().downcast_ref::<PyStr>() else {
-                continue;
-            };
+            let field_tuple = field
+                .downcast_ref::<PyTuple>()
+                .ok_or_else(|| vm.new_type_error("_fields_ must contain tuples".to_string()))?;
+            if field_tuple.len() < 2 {
+                return Err(vm.new_type_error("_fields_ tuple must have at least 2 elements".to_string()));
+            }
+            let name = field_tuple.first().unwrap()
+                .downcast_ref::<PyStr>()
+                .ok_or_else(|| vm.new_type_error("field name must be a string".to_string()))?;
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d8f0b9 and df369d0.

📒 Files selected for processing (7)
  • crates/vm/src/stdlib/ctypes.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/base.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/field.rs (4 hunks)
  • crates/vm/src/stdlib/ctypes/pointer.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/union.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmt to format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • crates/vm/src/stdlib/ctypes/base.rs
  • crates/vm/src/stdlib/ctypes/pointer.rs
  • crates/vm/src/stdlib/ctypes/union.rs
  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/array.rs
  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/ctypes/base.rs
  • crates/vm/src/stdlib/ctypes/pointer.rs
  • crates/vm/src/stdlib/ctypes/union.rs
  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/array.rs
  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
🧠 Learnings (2)
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (6)
crates/vm/src/stdlib/ctypes/base.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
crates/vm/src/stdlib/ctypes.rs (16)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
crates/vm/src/stdlib/ctypes/union.rs (3)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
crates/vm/src/stdlib/ctypes/field.rs (1)
  • type_ (271-273)
crates/vm/src/stdlib/ctypes/structure.rs (1)
  • field_type (247-248)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
crates/vm/src/stdlib/ctypes/field.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (13)
  • new (169-175)
  • value (44-46)
  • value (48-50)
  • value (52-54)
  • value (84-84)
  • value (94-94)
  • value (104-104)
  • value (111-111)
  • value (112-112)
  • value (123-123)
  • value (134-134)
  • value (135-135)
  • value (287-292)
crates/vm/src/stdlib/ctypes/array.rs (5)
  • index (324-324)
  • index (341-341)
  • value (367-371)
  • value (375-375)
  • value (392-392)
crates/vm/src/stdlib/ctypes/structure.rs (2)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
crates/vm/src/stdlib/ctypes/field.rs (2)
  • name (266-268)
  • size (246-248)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
🔇 Additional comments (14)
crates/vm/src/stdlib/ctypes/pointer.rs (1)

27-34: LGTM!

The pointer size calculation using std::mem::size_of::<usize>() is correct for pointer types, and the initialization of element_size and buffer fields aligns with the new PyCArray structure. The empty buffer is appropriate since this creates a type object, not an instance.

crates/vm/src/stdlib/ctypes/base.rs (2)

306-324: LGTM!

The element size derivation logic correctly extracts the type code from the _type_ attribute and uses get_size for single-character codes, with appropriate fallbacks to size_of::<usize>() for unknown or complex types.


325-334: LGTM!

The PyCArrayType initialization is consistent with the new buffer-backed array structure introduced across the ctypes module.

crates/vm/src/stdlib/ctypes/union.rs (2)

16-36: LGTM!

The metaclass constructor correctly delegates to PyType::py_new and processes _fields_ only when defined directly on the class (not inherited), which is the expected behavior for ctypes Union types.


41-83: LGTM!

The process_fields implementation correctly implements union semantics where all fields start at offset 0. The validation of _fields_ format (list/tuple of tuples with at least 2 elements) is thorough.

crates/vm/src/stdlib/ctypes.rs (2)

24-25: LGTM!

The metaclass registrations for PyCStructType and PyCUnionType are correctly placed before the module extension, ensuring the metaclasses are available when creating the base classes.


63-78: LGTM!

The CArgObject implementation correctly follows the pyclass macro conventions. The _obj getter provides access to the wrapped object as expected by ctypes semantics.

crates/vm/src/stdlib/ctypes/array.rs (3)

33-65: LGTM!

The Callable implementation correctly creates array instances from positional arguments, initializing the buffer with proper element sizing. The bounds checking (if i >= length) prevents overflow when more arguments than array length are provided.


87-116: LGTM!

The __mul__ implementation correctly creates nested array types where the element size is the total size of the inner array. This enables expressions like (c_int * 5) * 3 to create a 2D array structure.


248-256: LGTM!

The int_to_bytes helper correctly uses native endianness (to_ne_bytes) which matches platform ABI expectations for ctypes. The size handling covers common C integer sizes.

crates/vm/src/stdlib/ctypes/structure.rs (2)

21-41: LGTM!

The metaclass constructor correctly delegates to PyType::py_new and processes _fields_ only when defined directly on the class (avoiding inherited fields).


56-104: LGTM!

Field processing correctly validates structure, extracts field metadata, and registers CField descriptors with proper offsets.

crates/vm/src/stdlib/ctypes/field.rs (2)

34-53: LGTM!

Clean constructor with sensible defaults for optional fields.


102-134: LGTM!

The descriptor correctly returns itself on class access and reads from the structure buffer on instance access with proper bounds checking.

Comment on lines +258 to +277
fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef {
match size {
1 => vm.ctx.new_int(bytes[0] as i8).into(),
2 => {
let val = i16::from_ne_bytes([bytes[0], bytes[1]]);
vm.ctx.new_int(val).into()
}
4 => {
let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
vm.ctx.new_int(val).into()
}
8 => {
let val = i64::from_ne_bytes([
bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7],
]);
vm.ctx.new_int(val).into()
}
_ => vm.ctx.new_int(0).into(),
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential panic if buffer is too short.

The bytes_to_int function accesses bytes[0], bytes[1], etc. without verifying that the slice has enough elements. If called with an undersized slice, this will panic.

     fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef {
+        if bytes.len() < size {
+            return vm.ctx.new_int(0).into();
+        }
         match size {
             1 => vm.ctx.new_int(bytes[0] as i8).into(),
             2 => {
-                let val = i16::from_ne_bytes([bytes[0], bytes[1]]);
+                let val = i16::from_ne_bytes(bytes[..2].try_into().unwrap());
                 vm.ctx.new_int(val).into()
             }
             4 => {
-                let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
+                let val = i32::from_ne_bytes(bytes[..4].try_into().unwrap());
                 vm.ctx.new_int(val).into()
             }
             8 => {
-                let val = i64::from_ne_bytes([
-                    bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7],
-                ]);
+                let val = i64::from_ne_bytes(bytes[..8].try_into().unwrap());
                 vm.ctx.new_int(val).into()
             }
             _ => vm.ctx.new_int(0).into(),
         }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef {
match size {
1 => vm.ctx.new_int(bytes[0] as i8).into(),
2 => {
let val = i16::from_ne_bytes([bytes[0], bytes[1]]);
vm.ctx.new_int(val).into()
}
4 => {
let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
vm.ctx.new_int(val).into()
}
8 => {
let val = i64::from_ne_bytes([
bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7],
]);
vm.ctx.new_int(val).into()
}
_ => vm.ctx.new_int(0).into(),
}
}
fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef {
if bytes.len() < size {
return vm.ctx.new_int(0).into();
}
match size {
1 => vm.ctx.new_int(bytes[0] as i8).into(),
2 => {
let val = i16::from_ne_bytes(bytes[..2].try_into().unwrap());
vm.ctx.new_int(val).into()
}
4 => {
let val = i32::from_ne_bytes(bytes[..4].try_into().unwrap());
vm.ctx.new_int(val).into()
}
8 => {
let val = i64::from_ne_bytes(bytes[..8].try_into().unwrap());
vm.ctx.new_int(val).into()
}
_ => vm.ctx.new_int(0).into(),
}
}
🤖 Prompt for AI Agents
crates/vm/src/stdlib/ctypes/array.rs around lines 258 to 277: the bytes_to_int
function indexes bytes[0..7] without checking slice length which can panic on
short slices; validate the slice length against the requested size before
indexing and handle the error path instead of panicking — either return a Python
exception (e.g. raise a ValueError/IndexError via the VM context) or return a
safe default; implement a guard like if bytes.len() < size { return
vm.ctx.new_value_error("buffer too small").into() } (or equivalent using VM
APIs) then proceed to construct integers using safe indexing or by copying into
fixed-size arrays after confirming length.

Comment on lines 404 to 412
impl PyCArray {
#[allow(unused)]
pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> {
let value = self.value.read();
let py_bytes = value.downcast_ref::<PyBytes>().unwrap();
let bytes = py_bytes.payload().to_vec();
Ok(libffi::middle::Arg::new(&bytes))
let buffer = self.buffer.read();
Ok(libffi::middle::Arg::new(&buffer.clone()))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Use-after-free: Arg::new references a temporary that is immediately dropped.

libffi::middle::Arg::new(&buffer.clone()) creates an Arg referencing a cloned Vec<u8> that is dropped at the end of the expression, leaving the Arg pointing to freed memory. This will cause undefined behavior when the Arg is used.

The to_arg implementation needs to ensure the buffer lives long enough. This likely requires a different approach, such as storing the buffer in a way that outlives the FFI call, or using a raw pointer approach that matches how ctypes handles memory for FFI.

     pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> {
-        let buffer = self.buffer.read();
-        Ok(libffi::middle::Arg::new(&buffer.clone()))
+        // TODO: This needs a different approach to ensure buffer lifetime
+        // The buffer must outlive the Arg returned here
+        let buffer = self.buffer.read();
+        let ptr = buffer.as_ptr();
+        Ok(libffi::middle::Arg::new(&ptr))
     }

Note: The suggested fix passes a pointer, but the overall lifetime management for FFI calls may need architectural review.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/array.rs around lines 404–409, the current
implementation calls libffi::middle::Arg::new(&buffer.clone()), creating an Arg
that points at a temporary Vec<u8> that is dropped immediately (use-after-free).
Fix by ensuring the buffer outlives the Arg: do not pass a reference to a
temporary clone; instead make the byte storage owned and stored with a lifetime
tied to the PyCArray (e.g., store Arc<Vec<u8>> or Box<[u8]>/Vec<u8> as a field
or in a dedicated arg_storage on self), create the Arg from a reference to that
owned storage (or use .as_ptr() with explicit length and ensure ownership
remains valid until the ffi call completes). Update PyCArray fields and to_arg
to return an Arg that references the persistent storage rather than a dropped
temporary.

Comment on lines 75 to +157
impl PyCField {
/// Convert bytes to a Python value based on size
fn bytes_to_value(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyResult {
match size {
1 => Ok(vm.ctx.new_int(bytes[0] as i8).into()),
2 => {
let val = i16::from_ne_bytes([bytes[0], bytes[1]]);
Ok(vm.ctx.new_int(val).into())
}
4 => {
let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
Ok(vm.ctx.new_int(val).into())
}
8 => {
let val = i64::from_ne_bytes([
bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7],
]);
Ok(vm.ctx.new_int(val).into())
}
_ => Ok(vm.ctx.new_int(0).into()),
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Type conversion only handles signed integers; missing unsigned and floating-point support.

This implementation always interprets bytes as signed integers, but ctypes supports unsigned types (c_uint, c_ulong, etc.) and floating-point types (c_float, c_double). Reading an unsigned field or float will produce incorrect values.

Consider using the field's proto type information to determine signedness and type:

-    fn bytes_to_value(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyResult {
+    fn bytes_to_value(bytes: &[u8], size: usize, proto: &PyObjectRef, vm: &VirtualMachine) -> PyResult {
+        // Check _type_ to determine signed/unsigned/float
+        let type_code = proto.get_attr("_type_", vm).ok()
+            .and_then(|t| t.str(vm).ok())
+            .map(|s| s.to_string());
+        
         match size {
             1 => Ok(vm.ctx.new_int(bytes[0] as i8).into()),
+            // Add unsigned variants based on type_code
             // ...
         }
     }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 136 to 157, bytes_to_value
currently always decodes signed integers; change it to inspect the field's
proto/type information to choose the correct decoding path (signed int, unsigned
int, or float) and return the appropriate Python object: for unsigned sizes use
u8/u16/u32/u64 via from_ne_bytes and pass the value to vm.ctx.new_int (Python
ints cover unsigned), for floats use f32::from_ne_bytes/f64::from_ne_bytes and
vm.ctx.new_float, and for signed sizes keep the existing i8/i16/i32/i64
handling; also validate that bytes.len() >= size and return an error or sensible
default on mismatch rather than silently returning 0.

Comment on lines +159 to +186
/// Convert a Python value to bytes
fn value_to_bytes(value: &PyObjectRef, size: usize, vm: &VirtualMachine) -> PyResult<Vec<u8>> {
if let Ok(int_val) = value.try_int(vm) {
let i = int_val.as_bigint();
match size {
1 => {
let val = i.to_i8().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
2 => {
let val = i.to_i16().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
4 => {
let val = i.to_i32().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
8 => {
let val = i.to_i64().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
_ => Ok(vec![0u8; size]),
}
} else {
Ok(vec![0u8; size])
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Silent truncation on integer overflow.

unwrap_or(0) silently handles values that don't fit in the target size. For example, assigning 300 to a 1-byte field silently becomes 0 instead of raising an OverflowError (as CPython ctypes does).

Consider returning an error on overflow:

             1 => {
-                let val = i.to_i8().unwrap_or(0);
+                let val = i.to_i8().ok_or_else(|| {
+                    vm.new_overflow_error(format!("int too large to convert to c_byte"))
+                })?;
                 Ok(val.to_ne_bytes().to_vec())
             }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// Convert a Python value to bytes
fn value_to_bytes(value: &PyObjectRef, size: usize, vm: &VirtualMachine) -> PyResult<Vec<u8>> {
if let Ok(int_val) = value.try_int(vm) {
let i = int_val.as_bigint();
match size {
1 => {
let val = i.to_i8().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
2 => {
let val = i.to_i16().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
4 => {
let val = i.to_i32().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
8 => {
let val = i.to_i64().unwrap_or(0);
Ok(val.to_ne_bytes().to_vec())
}
_ => Ok(vec![0u8; size]),
}
} else {
Ok(vec![0u8; size])
}
}
}
/// Convert a Python value to bytes
fn value_to_bytes(value: &PyObjectRef, size: usize, vm: &VirtualMachine) -> PyResult<Vec<u8>> {
if let Ok(int_val) = value.try_int(vm) {
let i = int_val.as_bigint();
match size {
1 => {
let val = i.to_i8().ok_or_else(|| {
vm.new_overflow_error(format!("int too large to convert to c_byte"))
})?;
Ok(val.to_ne_bytes().to_vec())
}
2 => {
let val = i.to_i16().ok_or_else(|| {
vm.new_overflow_error(format!("int too large to convert to c_short"))
})?;
Ok(val.to_ne_bytes().to_vec())
}
4 => {
let val = i.to_i32().ok_or_else(|| {
vm.new_overflow_error(format!("int too large to convert to c_int"))
})?;
Ok(val.to_ne_bytes().to_vec())
}
8 => {
let val = i.to_i64().ok_or_else(|| {
vm.new_overflow_error(format!("int too large to convert to c_long"))
})?;
Ok(val.to_ne_bytes().to_vec())
}
_ => Ok(vec![0u8; size]),
}
} else {
Ok(vec![0u8; size])
}
}
}
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 159-186, the code currently
uses unwrap_or(0) when converting BigInt -> i8/i16/i32/i64 which silently
truncates values that don't fit; change those conversions to check the Option
return and return an OverflowError via the VM on failure. Concretely, replace
each let val = i.to_i8().unwrap_or(0); (and the i16/i32/i64 variants) with a
match or if-let that returns Err(vm.new_overflow_error(format!("integer
overflow: value {} does not fit into {} bytes", i, size))) when the conversion
returns None, otherwise proceed to to_ne_bytes().to_vec() for the successful
value. Keep the existing default branch for unsupported sizes.

Comment on lines +106 to +134
/// Get the size of a ctypes type
fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> {
// Try to get _type_ attribute for simple types
if let Some(size) = field_type
.get_attr("_type_", vm)
.ok()
.and_then(|type_attr| type_attr.str(vm).ok())
.and_then(|type_str| {
let s = type_str.to_string();
(s.len() == 1).then(|| get_size(&s))
})
{
return Ok(size);
}

// Try sizeof for other types
if let Some(s) = field_type
.get_attr("size_of_instances", vm)
.ok()
.and_then(|size_method| size_method.call((), vm).ok())
.and_then(|size| size.try_int(vm).ok())
.and_then(|n| n.as_bigint().to_usize())
{
return Ok(s);
}

// Default to pointer size for unknown types
Ok(std::mem::size_of::<usize>())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential panic when _type_ contains unrecognized type code.

The get_size() function (from ctypes.rs) uses unreachable!() for unknown type strings. While the s.len() == 1 check filters multi-char strings, a single unrecognized character (e.g., "x") will still trigger a panic.

Consider validating the type code before calling get_size, or wrap it with a fallback:

         .and_then(|type_str| {
             let s = type_str.to_string();
-            (s.len() == 1).then(|| get_size(&s))
+            const VALID_TYPES: &[&str] = &[
+                "u", "c", "b", "h", "H", "i", "I", "l", "q", "L", "Q", "f", "d", "g", "?", "B", "P", "z", "Z", "O"
+            ];
+            (s.len() == 1 && VALID_TYPES.contains(&s.as_str())).then(|| get_size(&s))
         })
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// Get the size of a ctypes type
fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> {
// Try to get _type_ attribute for simple types
if let Some(size) = field_type
.get_attr("_type_", vm)
.ok()
.and_then(|type_attr| type_attr.str(vm).ok())
.and_then(|type_str| {
let s = type_str.to_string();
(s.len() == 1).then(|| get_size(&s))
})
{
return Ok(size);
}
// Try sizeof for other types
if let Some(s) = field_type
.get_attr("size_of_instances", vm)
.ok()
.and_then(|size_method| size_method.call((), vm).ok())
.and_then(|size| size.try_int(vm).ok())
.and_then(|n| n.as_bigint().to_usize())
{
return Ok(s);
}
// Default to pointer size for unknown types
Ok(std::mem::size_of::<usize>())
}
/// Get the size of a ctypes type
fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> {
// Try to get _type_ attribute for simple types
if let Some(size) = field_type
.get_attr("_type_", vm)
.ok()
.and_then(|type_attr| type_attr.str(vm).ok())
.and_then(|type_str| {
let s = type_str.to_string();
const VALID_TYPES: &[&str] = &[
"u", "c", "b", "h", "H", "i", "I", "l", "q", "L", "Q", "f", "d", "g", "?", "B", "P", "z", "Z", "O"
];
(s.len() == 1 && VALID_TYPES.contains(&s.as_str())).then(|| get_size(&s))
})
{
return Ok(size);
}
// Try sizeof for other types
if let Some(s) = field_type
.get_attr("size_of_instances", vm)
.ok()
.and_then(|size_method| size_method.call((), vm).ok())
.and_then(|size| size.try_int(vm).ok())
.and_then(|n| n.as_bigint().to_usize())
{
return Ok(s);
}
// Default to pointer size for unknown types
Ok(std::mem::size_of::<usize>())
}
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/structure.rs around lines 106 to 134, the call to
get_size(...) can panic if the single-character _type_ value is an unrecognized
code (get_size currently uses unreachable!()). Before calling get_size, validate
the single-character type code against the set of supported ctypes codes (or map
it with a safe match) and only call get_size for known codes; if the code is
unknown, fall back to the pointer-size default (or the sizeof logic already
used) so the function returns Ok(pointer_size) instead of panicking.

Comment on lines 136 to 155
#[pymethod]
fn __mul__(cls: PyTypeRef, n: isize, vm: &VirtualMachine) -> PyResult {
use super::array::{PyCArray, PyCArrayType};
if n < 0 {
return Err(vm.new_value_error(format!("Array length must be >= 0, not {n}")));
}
// For structures, element size is the structure size (sum of field sizes)
let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields
Ok(PyCArrayType {
inner: PyCArray {
typ: PyRwLock::new(cls),
length: AtomicCell::new(n as usize),
element_size: AtomicCell::new(element_size),
buffer: PyRwLock::new(vec![]),
},
}
todo!("Implement PyCStructure::py_new")
.to_pyobject(vm))
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Incorrect element size calculation for Structure arrays.

The element_size is hardcoded to size_of::<usize>() (pointer size) instead of the actual structure size. This will cause incorrect array sizing and memory access issues.

Consider computing the actual structure size:

     fn __mul__(cls: PyTypeRef, n: isize, vm: &VirtualMachine) -> PyResult {
         use super::array::{PyCArray, PyCArrayType};
         if n < 0 {
             return Err(vm.new_value_error(format!("Array length must be >= 0, not {n}")));
         }
-        // For structures, element size is the structure size (sum of field sizes)
-        let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields
+        // Calculate element size from _fields_
+        let element_size = if let Some(fields_attr) = cls.get_attr(vm.ctx.intern_str("_fields_"), vm).ok() {
+            // Sum field sizes (reuse process_fields logic or store size on class)
+            Self::calculate_struct_size(&fields_attr, vm).unwrap_or(std::mem::size_of::<usize>())
+        } else {
+            std::mem::size_of::<usize>()
+        };

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/structure.rs around lines 136–154, the
element_size for structure arrays is incorrectly hardcoded to
size_of::<usize>(); replace this with computation of the actual structure size
derived from the Structure type's field definitions (iterate the structure's
fields or call the type's layout/size helper), summing each field's C type size
and applying proper alignment/padding or using the structure's computed layout
helper if available, then store that computed size in element_size; also handle
any errors when reading field metadata and return a PyResult error if the
structure size cannot be determined.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
crates/vm/src/stdlib/ctypes/array.rs (2)

258-277: Potential panic if buffer slice is too short.

The bytes_to_int function directly indexes bytes[0], bytes[1], etc. without verifying slice length. If called with an undersized slice, this will panic.

Add a length guard at the start:

     fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef {
+        if bytes.len() < size {
+            return vm.ctx.new_int(0).into();
+        }
         match size {
             1 => vm.ctx.new_int(bytes[0] as i8).into(),

404-412: Lifetime concern: Arg::new(&ptr) references a stack-local pointer.

The ptr variable is local to this function. While Arg::new(&ptr) takes a reference to the pointer value itself (not the buffer data), this still creates an Arg that references memory that will be invalid after the function returns and the read guard is dropped.

This requires architectural changes to ensure the buffer outlives the FFI call. Consider storing the buffer pointer in a way that persists for the duration of the FFI call, or restructuring how arguments are passed to FFI.

crates/vm/src/stdlib/ctypes/structure.rs (2)

106-134: Potential panic with unrecognized single-character type codes.

The get_size() function uses unreachable!() for unknown type strings. While the s.len() == 1 check filters multi-char strings, an unrecognized single character (e.g., "x") will still trigger a panic.

Validate the type code before calling get_size:

         .and_then(|type_str| {
             let s = type_str.to_string();
-            (s.len() == 1).then(|| get_size(&s))
+            const VALID_TYPES: &str = "cbBhHiIlqLQfdg?PzZOu";
+            (s.len() == 1 && VALID_TYPES.contains(&s)).then(|| get_size(&s))
         })

136-155: Element size is hardcoded to pointer size instead of actual structure size.

The element_size should be calculated from the structure's fields, not hardcoded to size_of::<usize>(). This will cause incorrect array sizing when creating arrays of structures.

Consider computing the actual structure size from _fields_:

     fn __mul__(cls: PyTypeRef, n: isize, vm: &VirtualMachine) -> PyResult {
         use super::array::{PyCArray, PyCArrayType};
         if n < 0 {
             return Err(vm.new_value_error(format!("Array length must be >= 0, not {n}")));
         }
-        // TODO: Calculate element size properly
-        // For structures, element size is the structure size (sum of field sizes)
-        let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields
+        // Calculate element size from _fields_
+        let element_size = if let Ok(fields_attr) = cls.as_object().get_attr("_fields_", vm) {
+            Self::calculate_struct_size_from_fields(&fields_attr, vm).unwrap_or(std::mem::size_of::<usize>())
+        } else {
+            std::mem::size_of::<usize>()
+        };

You'll need to add a helper method or reuse logic from process_fields to sum field sizes.

🧹 Nitpick comments (1)
crates/vm/src/stdlib/ctypes/structure.rs (1)

308-312: _fields_ getter returns None instead of actual fields.

The _fields_ property always returns vm.ctx.none() instead of retrieving the actual _fields_ from the class. While the comment suggests this is intentional, this may break code that expects to read _fields_ from an instance.

Consider returning the class's _fields_ attribute:

     #[pygetset]
     fn _fields_(&self, vm: &VirtualMachine) -> PyObjectRef {
-        // Return the _fields_ from the class, not instance
-        vm.ctx.none()
+        // Return the _fields_ from the class
+        self.class()
+            .get_attr(vm.ctx.intern_str("_fields_"), vm)
+            .unwrap_or_else(|_| vm.ctx.none())
     }
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between df369d0 and 39e6885.

📒 Files selected for processing (3)
  • crates/vm/src/stdlib/ctypes.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmt to format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
  • crates/vm/src/stdlib/ctypes/array.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
  • crates/vm/src/stdlib/ctypes/array.rs
🧠 Learnings (3)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (1)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (8)
crates/vm/src/stdlib/ctypes.rs (3)

63-78: LGTM! Clean CArgObject implementation.

The CArgObject struct follows the RustPython macro conventions correctly with #[pyclass] and #[pygetset]. The #[allow(dead_code)] on offset is appropriate since it's stored for future use but not yet exposed.


345-367: LGTM! byref error message now includes actual type name.

The fix correctly uses obj.class().name() to provide helpful error messages showing the actual type passed, addressing the previous review feedback.


436-439: _cast_addr uses todo!() which will panic at runtime.

Similar to _wstring_at_addr, this will panic if called. If this is intentional for tracking unimplemented features, consider using a more explicit approach.

Is this intentional as a development placeholder? If users might call this, consider returning a Python exception instead.

crates/vm/src/stdlib/ctypes/array.rs (2)

33-65: LGTM! Clean Callable implementation for PyCArrayType.

The implementation correctly creates array instances from the type metadata, bounds-checks the positional arguments, and initializes the buffer appropriately.


87-116: LGTM! Well-structured __mul__ for nested array types.

The implementation correctly validates negative lengths and creates nested array types with proper element size calculation. The approach of using the current array type as the element type for the new outer array is correct for ctypes semantics.

crates/vm/src/stdlib/ctypes/structure.rs (3)

21-41: LGTM! Clean metaclass constructor.

The py_new correctly delegates to PyType::py_new and processes _fields_ only when defined directly on the new class (not inherited), which is the correct behavior for Structure metaclasses.


56-104: Well-structured field processing with proper offset tracking.

The process_fields method correctly validates _fields_ format, computes offsets, and creates PyCField descriptors. Using set_attr to register descriptors on the class is the right approach.


220-265: Good: IndexMap preserves insertion order for positional arguments.

The use of IndexMap instead of HashMap ensures that fields_map.keys() returns field names in declaration order, so positional arguments are correctly assigned. This addresses the concern from the previous review.

Also applies to: 287-297

Comment on lines +374 to +382
fn set_value(&self, value: PyObjectRef, _vm: &VirtualMachine) -> PyResult<()> {
if let Some(bytes) = value.downcast_ref::<PyBytes>() {
let mut buffer = self.buffer.write();
let src = bytes.as_bytes();
let len = std::cmp::min(src.len(), buffer.len());
buffer[..len].copy_from_slice(&src[..len]);
}
Ok(())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

set_value silently ignores non-bytes input.

When value is not a PyBytes, the function returns Ok(()) without any indication that the value wasn't set. This could lead to confusing behavior where assignments appear to succeed but have no effect.

Consider raising a type error for non-bytes input:

     fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> {
         if let Some(bytes) = value.downcast_ref::<PyBytes>() {
             let mut buffer = self.buffer.write();
             let src = bytes.as_bytes();
             let len = std::cmp::min(src.len(), buffer.len());
             buffer[..len].copy_from_slice(&src[..len]);
+            Ok(())
+        } else {
+            Err(vm.new_type_error("expected bytes".to_owned()))
         }
-        Ok(())
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn set_value(&self, value: PyObjectRef, _vm: &VirtualMachine) -> PyResult<()> {
if let Some(bytes) = value.downcast_ref::<PyBytes>() {
let mut buffer = self.buffer.write();
let src = bytes.as_bytes();
let len = std::cmp::min(src.len(), buffer.len());
buffer[..len].copy_from_slice(&src[..len]);
}
Ok(())
}
fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> {
if let Some(bytes) = value.downcast_ref::<PyBytes>() {
let mut buffer = self.buffer.write();
let src = bytes.as_bytes();
let len = std::cmp::min(src.len(), buffer.len());
buffer[..len].copy_from_slice(&src[..len]);
Ok(())
} else {
Err(vm.new_type_error("expected bytes".to_owned()))
}
}
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/array.rs around lines 374 to 382, the set_value
function currently returns Ok(()) when value is not a PyBytes which silently
ignores the assignment; change this to return a Python TypeError instead. Update
the function to check if value.downcast_ref::<PyBytes>() is Some and perform the
copy as before, but if it is None return Err(vm.new_type_error("expected a
bytes-like object".to_owned())) (or a similar clear message), so non-bytes input
raises an error rather than silently succeeding.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/ctypes/base.rs (1)

44-52: Duplicate condition check - likely copy-paste error.

The same PyBytes with length 1 check appears twice. The second check (lines 48-51) should likely check a different type (perhaps PyByteArray?).

         if value
             .clone()
             .downcast_exact::<PyBytes>(vm)
             .is_ok_and(|v| v.len() == 1)
-            || value
-                .clone()
-                .downcast_exact::<PyBytes>(vm)
-                .is_ok_and(|v| v.len() == 1)
+            || value
+                .clone()
+                .downcast_exact::<PyByteArray>(vm)
+                .is_ok_and(|v| v.len() == 1)
             || value
                 .clone()
                 .downcast_exact::<PyInt>(vm)
♻️ Duplicate comments (8)
crates/vm/src/stdlib/ctypes.rs (1)

422-434: Windows todo!() will panic at runtime.

This was flagged in a previous review. The todo!() macro will cause a runtime panic on Windows. Consider returning 0 with a TODO comment, or raising a proper Python exception.

         #[cfg(target_os = "windows")]
         {
-            todo!("On Windows, use wcslen from ucrt")
+            // TODO: RUSTPYTHON - On Windows, should use wcslen from ucrt
+            // Returning 0 as placeholder until Windows support is added
+            0
         }
crates/vm/src/stdlib/ctypes/field.rs (2)

136-157: Type conversion only handles signed integers.

This was flagged in a previous review. The implementation always interprets bytes as signed integers, but ctypes supports unsigned types (c_uint, c_ulong) and floating-point types (c_float, c_double). Consider using the field's proto type information to determine the correct interpretation.


159-186: Silent truncation on integer overflow.

This was flagged in a previous review. Using unwrap_or(0) silently handles values that don't fit, whereas CPython ctypes raises OverflowError. For example, assigning 300 to a 1-byte field silently becomes 0.

             1 => {
-                let val = i.to_i8().unwrap_or(0);
+                let val = i.to_i8().ok_or_else(|| {
+                    vm.new_overflow_error("int too large to convert".to_owned())
+                })?;
                 Ok(val.to_ne_bytes().to_vec())
             }
crates/vm/src/stdlib/ctypes/array.rs (3)

258-277: Potential panic if buffer is too short.

This was flagged in a previous review. The bytes_to_int function accesses array indices without validating slice length. If called with an undersized slice, this will panic.

     fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef {
+        if bytes.len() < size {
+            return vm.ctx.new_int(0).into();
+        }
         match size {
             1 => vm.ctx.new_int(bytes[0] as i8).into(),
             2 => {
-                let val = i16::from_ne_bytes([bytes[0], bytes[1]]);
+                let val = i16::from_ne_bytes(bytes[..2].try_into().unwrap());
                 vm.ctx.new_int(val).into()
             }

373-382: set_value silently ignores non-bytes input.

This was flagged in a previous review. When value is not PyBytes, the function returns Ok(()) without any indication that the value wasn't set.

     fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> {
         if let Some(bytes) = value.downcast_ref::<PyBytes>() {
             let mut buffer = self.buffer.write();
             let src = bytes.as_bytes();
             let len = std::cmp::min(src.len(), buffer.len());
             buffer[..len].copy_from_slice(&src[..len]);
+            Ok(())
+        } else {
+            Err(vm.new_type_error("expected bytes".to_owned()))
         }
-        Ok(())
     }

404-412: Lifetime issue with to_arg - pointer may dangle after read lock is released.

This was flagged in a previous review. The buffer.read() guard is dropped at the end of the function, but ptr still references that memory. The Arg returned will point to potentially invalid memory when used.

The TODO comment acknowledges this, but this is a use-after-free that will cause undefined behavior. Consider:

  1. Storing a persistent buffer pointer in PyCArray that outlives FFI calls
  2. Using Arc<RwLock<Vec<u8>>> and keeping the guard alive during FFI
  3. Copying the buffer to a location that's guaranteed to outlive the call
crates/vm/src/stdlib/ctypes/structure.rs (2)

106-134: Guard get_size against unknown _type_ codes to avoid VM panics

get_field_size calls get_size(&s) for any single‑char _type_ value without validating the code. If _type_ is a single unknown character (e.g. user‑defined or corrupted), get_size’s internal unreachable!() will panic the VM.

Consider explicitly validating the code before calling get_size, or delegating to a non‑panicking helper in _ctypes:

-            .and_then(|type_str| {
-                let s = type_str.to_string();
-                (s.len() == 1).then(|| get_size(&s))
-            })
+            .and_then(|type_str| {
+                let s = type_str.to_string();
+                if s.len() != 1 {
+                    return None;
+                }
+                // Keep this set in sync with `_ctypes::get_size`.
+                const VALID_TYPES: &[&str] = &[
+                    "u", "c", "b", "h", "H", "i", "I", "l", "q", "L", "Q", "f", "d", "g", "?",
+                    "B", "P", "z", "Z", "O",
+                ];
+                if !VALID_TYPES.contains(&s.as_str()) {
+                    return None;
+                }
+                Some(get_size(&s))
+            })

This keeps get_field_size total‑failure‑free and avoids panics even if _type_ is malformed.


136-175: Structure * n uses pointer size instead of structure size for arrays

PyCStructType::__mul__ sets:

// TODO: Calculate element size properly
let element_size = std::mem::size_of::<usize>();

For arrays of structures (Point * 10), this will allocate n * sizeof(usize) bytes instead of n * sizeof(struct). Any field descriptor that uses struct offsets larger than sizeof(usize) risks indexing past the end of the array buffer and panicking.

You should compute the real structure size from _fields_, ideally reusing the same logic as in PyCStructure::py_new:

-        // TODO: Calculate element size properly
-        // For structures, element size is the structure size (sum of field sizes)
-        let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields
+        // Element size is the structure size derived from the class's `_fields_`
+        let element_size = {
+            let fields_attr = cls
+                .as_object()
+                .get_attr("_fields_", vm)
+                .map_err(|_| vm.new_type_error("Structure has no _fields_".to_owned()))?;
+            // Factor this out of `PyCStructure::py_new` into a shared helper
+            PyCStructType::get_struct_size_from_fields(&fields_attr, vm)?
+        };

…and add a get_struct_size_from_fields helper that mirrors the size/offset accumulation in PyCStructure::py_new, so arrays and instances share one definition of layout.

Until this is fixed, arrays of non‑trivial Structure are very likely incorrect and fragile.

🧹 Nitpick comments (3)
crates/vm/src/stdlib/ctypes/union.rs (1)

38-110: Code duplication with PyCStructType - consider extracting shared logic.

The process_fields and get_field_size methods are nearly identical to those in structure.rs (only differing in offset calculation - always 0 for unions vs. cumulative for structs). Consider extracting the common field parsing and size calculation into a shared helper module.

For example, a shared helper could be:

// In a common module
pub fn parse_fields(fields_attr: &PyObjectRef, vm: &VirtualMachine) 
    -> PyResult<Vec<(String, PyObjectRef, usize)>> { ... }

pub fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) 
    -> PyResult<usize> { ... }
crates/vm/src/stdlib/ctypes/array.rs (1)

248-256: int_to_bytes silently truncates on overflow (same issue as field.rs).

Using unwrap_or(0) silently handles values that don't fit. Consider returning a PyResult<Vec<u8>> and raising OverflowError for consistency with CPython.

crates/vm/src/stdlib/ctypes/structure.rs (1)

306-313: Implement _fields_ getter for instances (currently always returns None)

The _fields_ getter on PyCStructure currently returns None, despite the comment saying it should expose the class’s _fields_. This breaks introspection like instance._fields_ and diverges from expectations around ctypes.Structure.

Now that you store fields: IndexMap<String, FieldInfo> and the class already has a _fields_ attribute, consider:

  • Either returning the class attribute:
    #[pygetset]
    fn _fields_(&self, vm: &VirtualMachine) -> PyObjectRef {
-        // Return the _fields_ from the class, not instance
-        vm.ctx.none()
+        self.as_object()
+            .class()
+            .get_attr("_fields_", vm)
+            .unwrap_or_else(|_| vm.ctx.none())
    }
  • Or reconstructing _fields_ from self.fields if you want the instance to be the single source of truth.

Either way, returning None permanently is likely to surprise users.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 39e6885 and 10fe041.

📒 Files selected for processing (7)
  • crates/vm/src/stdlib/ctypes.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (5 hunks)
  • crates/vm/src/stdlib/ctypes/base.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/field.rs (4 hunks)
  • crates/vm/src/stdlib/ctypes/pointer.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/union.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/vm/src/stdlib/ctypes/pointer.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmt to format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • crates/vm/src/stdlib/ctypes/union.rs
  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/vm/src/stdlib/ctypes/base.rs
  • crates/vm/src/stdlib/ctypes/array.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/ctypes/union.rs
  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/vm/src/stdlib/ctypes/base.rs
  • crates/vm/src/stdlib/ctypes/array.rs
🧠 Learnings (6)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to Lib/**/*.py : When modifying files in Lib/, add a `# TODO: RUSTPYTHON` comment to mark modifications, or use `unittest.skip("TODO: RustPython <reason>")` or `unittest.expectedFailure` with a comment

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (5)
crates/vm/src/stdlib/ctypes/union.rs (3)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
crates/vm/src/stdlib/ctypes/field.rs (1)
  • type_ (271-273)
crates/vm/src/stdlib/ctypes/structure.rs (1)
  • field_type (248-249)
crates/vm/src/stdlib/ctypes.rs (3)
crates/vm/src/stdlib/ctypes/field.rs (4)
  • name (266-268)
  • obj (120-120)
  • obj (205-205)
  • offset (276-278)
crates/vm/src/stdlib/ctypes/function.rs (2)
  • obj (351-351)
  • f (273-274)
crates/vm/src/frame.rs (1)
  • class (122-124)
crates/vm/src/stdlib/ctypes/field.rs (2)
crates/vm/src/stdlib/ctypes/array.rs (5)
  • index (324-324)
  • index (341-341)
  • value (367-371)
  • value (375-375)
  • value (392-392)
crates/vm/src/stdlib/ctypes.rs (1)
  • obj (337-337)
crates/vm/src/stdlib/ctypes/base.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
  • get_size (140-159)
  • mem (142-142)
  • mem (143-143)
  • mem (144-144)
  • mem (145-145)
  • mem (146-146)
  • mem (147-147)
  • mem (148-148)
  • mem (149-149)
  • mem (150-150)
  • mem (151-151)
  • mem (152-152)
  • mem (153-153)
  • mem (154-154)
  • mem (155-155)
  • mem (156-156)
  • size_of (197-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (9)
crates/vm/src/stdlib/ctypes/base.rs (1)

306-334: LGTM - element_size computation with proper fallbacks.

The logic correctly derives element size from the _type_ attribute when available, with sensible fallback to size_of::<usize>(). The initialization of PyCArray with element_size and empty buffer aligns with the new buffer-backed design.

crates/vm/src/stdlib/ctypes/union.rs (1)

16-36: LGTM - Clean metaclass constructor pattern.

The py_new implementation correctly creates the type via PyType::py_new, then processes _fields_ only when directly defined on the new class (not inherited). This follows proper ctypes semantics.

crates/vm/src/stdlib/ctypes.rs (2)

63-78: LGTM - Clean CArgObject wrapper for byref().

The CArgObject struct properly wraps the ctypes object reference with an offset, and the _obj getter provides access to the underlying object. This aligns with CPython's ctypes behavior.


344-367: LGTM - Proper ctypes instance validation in byref().

The implementation correctly checks for both PyCData and PyCSimple instances, uses a dynamic error message with the actual type name (addressing the previous review), and properly wraps the result in CArgObject.

crates/vm/src/stdlib/ctypes/field.rs (1)

34-52: LGTM - Clean constructor for PyCField.

The new constructor properly initializes all fields with sensible defaults for non-bitfield cases (anonymous: false, bitfield_size: false, bit_offset: 0).

crates/vm/src/stdlib/ctypes/array.rs (3)

33-65: LGTM - Clean Callable implementation for array instantiation.

The implementation correctly initializes the buffer based on element size and length, and properly handles positional argument initialization with bounds checking.


87-117: LGTM - Proper nested array type creation.

The __mul__ implementation correctly creates nested array types where the element size becomes the total size of the inner array. The conversion from PyObjectRef to PyType is handled appropriately.


225-244: LGTM - Clean AsSequence implementation.

The sequence methods properly delegate to the helper functions with appropriate error handling for deletion attempts.

crates/vm/src/stdlib/ctypes/structure.rs (1)

16-104: Metaclass wiring and _fields_ processing look coherent

The PyCStructType metaclass plus process_fields pipeline for _fields_ (including direct-attr checks in both py_new and __init_subclass__) is consistent and keeps field layout logic centralized. This is a solid base for the new Structure model.

Comment on lines +102 to +134
impl GetDescriptor for PyCField {
fn descr_get(
zelf: PyObjectRef,
obj: Option<PyObjectRef>,
_cls: Option<PyObjectRef>,
vm: &VirtualMachine,
) -> PyResult {
let zelf = zelf
.downcast::<PyCField>()
.map_err(|_| vm.new_type_error("expected CField".to_owned()))?;

// If obj is None, return the descriptor itself (class attribute access)
let obj = match obj {
Some(obj) if !vm.is_none(&obj) => obj,
_ => return Ok(zelf.into()),
};

// Instance attribute access - read value from the structure's buffer
if let Some(structure) = obj.downcast_ref::<PyCStructure>() {
let buffer = structure.buffer.read();
let offset = zelf.byte_offset;
let size = zelf.byte_size;

if offset + size <= buffer.len() {
let bytes = &buffer[offset..offset + size];
return PyCField::bytes_to_value(bytes, size, vm);
}
}

// Fallback: return 0 for uninitialized or unsupported types
Ok(vm.ctx.new_int(0).into())
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Descriptor fallback silently returns 0 for buffer overruns and non-Structure types.

Lines 125-128 check bounds but line 132 silently returns 0 when:

  1. The object isn't a PyCStructure
  2. The buffer is too small

This can mask bugs. Consider raising an error or at least logging when the fallback is hit unexpectedly.

         // Instance attribute access - read value from the structure's buffer
         if let Some(structure) = obj.downcast_ref::<PyCStructure>() {
             let buffer = structure.buffer.read();
             let offset = zelf.byte_offset;
             let size = zelf.byte_size;

             if offset + size <= buffer.len() {
                 let bytes = &buffer[offset..offset + size];
                 return PyCField::bytes_to_value(bytes, size, vm);
             }
+            // Buffer too small - structure not properly initialized
+            return Err(vm.new_value_error("structure buffer too small for field access".to_owned()));
         }

-        // Fallback: return 0 for uninitialized or unsupported types
-        Ok(vm.ctx.new_int(0).into())
+        // Field descriptor accessed on non-Structure type
+        Err(vm.new_type_error(format!(
+            "descriptor works only on Structure instances, got {}",
+            obj.class().name()
+        )))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
impl GetDescriptor for PyCField {
fn descr_get(
zelf: PyObjectRef,
obj: Option<PyObjectRef>,
_cls: Option<PyObjectRef>,
vm: &VirtualMachine,
) -> PyResult {
let zelf = zelf
.downcast::<PyCField>()
.map_err(|_| vm.new_type_error("expected CField".to_owned()))?;
// If obj is None, return the descriptor itself (class attribute access)
let obj = match obj {
Some(obj) if !vm.is_none(&obj) => obj,
_ => return Ok(zelf.into()),
};
// Instance attribute access - read value from the structure's buffer
if let Some(structure) = obj.downcast_ref::<PyCStructure>() {
let buffer = structure.buffer.read();
let offset = zelf.byte_offset;
let size = zelf.byte_size;
if offset + size <= buffer.len() {
let bytes = &buffer[offset..offset + size];
return PyCField::bytes_to_value(bytes, size, vm);
}
}
// Fallback: return 0 for uninitialized or unsupported types
Ok(vm.ctx.new_int(0).into())
}
}
impl GetDescriptor for PyCField {
fn descr_get(
zelf: PyObjectRef,
obj: Option<PyObjectRef>,
_cls: Option<PyObjectRef>,
vm: &VirtualMachine,
) -> PyResult {
let zelf = zelf
.downcast::<PyCField>()
.map_err(|_| vm.new_type_error("expected CField".to_owned()))?;
// If obj is None, return the descriptor itself (class attribute access)
let obj = match obj {
Some(obj) if !vm.is_none(&obj) => obj,
_ => return Ok(zelf.into()),
};
// Instance attribute access - read value from the structure's buffer
if let Some(structure) = obj.downcast_ref::<PyCStructure>() {
let buffer = structure.buffer.read();
let offset = zelf.byte_offset;
let size = zelf.byte_size;
if offset + size <= buffer.len() {
let bytes = &buffer[offset..offset + size];
return PyCField::bytes_to_value(bytes, size, vm);
}
// Buffer too small - structure not properly initialized
return Err(vm.new_value_error("structure buffer too small for field access".to_owned()));
}
// Field descriptor accessed on non-Structure type
Err(vm.new_type_error(format!(
"descriptor works only on Structure instances, got {}",
obj.class().name()
)))
}
}
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 102 to 134, the descriptor
accessor currently returns a silent 0 when the target is not a PyCStructure or
when the structure buffer is too small, which can hide bugs; update the function
to (1) return a proper TypeError (via vm.new_type_error) when obj is not a
PyCStructure, and (2) return a BufferError/IndexError (via vm.new_index_error or
vm.new_value_error) when offset+size exceeds buffer.len(); ensure you include a
clear message describing the expected type or insufficient buffer size so
callers see the real failure instead of a silent zero.

Comment on lines +206 to +216
match value {
PySetterValue::Assign(value) => {
let offset = zelf.byte_offset;
let size = zelf.byte_size;
let bytes = PyCField::value_to_bytes(&value, size, vm)?;

let mut buffer = structure.buffer.write();
if offset + size <= buffer.len() {
buffer[offset..offset + size].copy_from_slice(&bytes);
}
Ok(())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Silent no-op when buffer is too small for write.

Lines 213-215 silently skip the write if the buffer is too small. This masks initialization bugs and deviates from CPython behavior which would raise an error.

                     let mut buffer = structure.buffer.write();
-                    if offset + size <= buffer.len() {
-                        buffer[offset..offset + size].copy_from_slice(&bytes);
+                    if offset + size > buffer.len() {
+                        return Err(vm.new_value_error(
+                            "structure buffer too small for field assignment".to_owned()
+                        ));
                     }
+                    buffer[offset..offset + size].copy_from_slice(&bytes);
                     Ok(())
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
match value {
PySetterValue::Assign(value) => {
let offset = zelf.byte_offset;
let size = zelf.byte_size;
let bytes = PyCField::value_to_bytes(&value, size, vm)?;
let mut buffer = structure.buffer.write();
if offset + size <= buffer.len() {
buffer[offset..offset + size].copy_from_slice(&bytes);
}
Ok(())
match value {
PySetterValue::Assign(value) => {
let offset = zelf.byte_offset;
let size = zelf.byte_size;
let bytes = PyCField::value_to_bytes(&value, size, vm)?;
let mut buffer = structure.buffer.write();
if offset + size > buffer.len() {
return Err(vm.new_value_error(
"structure buffer too small for field assignment".to_owned()
));
}
buffer[offset..offset + size].copy_from_slice(&bytes);
Ok(())
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 206 to 216, the current
code silently skips writing when offset + size > buffer.len(); instead of a
no-op, detect this condition and return a Python exception so it fails loudly
like CPython. Replace the silent branch with code that constructs and returns a
VM error (e.g., vm.new_value_error or an appropriate vm error type) containing a
clear message with the field name/offset, required size, and actual buffer
length; keep the buffer write path unchanged and ensure the buffer lock is still
held when performing the bounds check and returning the error.

Comment on lines +213 to 300
impl Constructor for PyCStructure {
type Args = FuncArgs;

fn py_new(cls: PyTypeRef, args: Self::Args, vm: &VirtualMachine) -> PyResult {
// Get _fields_ from the class using get_attr to properly search MRO
let fields_attr = cls.as_object().get_attr("_fields_", vm).ok();

let mut fields_map = IndexMap::new();
let mut total_size = 0usize;

if let Some(fields_attr) = fields_attr {
let fields: Vec<PyObjectRef> = if let Some(list) = fields_attr.downcast_ref::<PyList>()
{
list.borrow_vec().to_vec()
} else if let Some(tuple) = fields_attr.downcast_ref::<PyTuple>() {
tuple.to_vec()
} else {
vec![]
};

let mut offset = 0usize;
for field in fields.iter() {
let Some(field_tuple) = field.downcast_ref::<PyTuple>() else {
continue;
};
if field_tuple.len() < 2 {
continue;
}
let Some(name) = field_tuple.first().unwrap().downcast_ref::<PyStr>() else {
continue;
};
let name = name.to_string();
let field_type = field_tuple.get(1).unwrap().clone();
let size = PyCStructType::get_field_size(&field_type, vm)?;

let type_ref = field_type
.downcast::<PyType>()
.unwrap_or_else(|_| vm.ctx.types.object_type.to_owned());

fields_map.insert(
name.clone(),
FieldInfo {
name,
offset,
size,
type_ref,
},
);

offset += size;
}
total_size = offset;
}

// Initialize buffer with zeros
let buffer = vec![0u8; total_size];

let instance = PyCStructure {
buffer: PyRwLock::new(buffer),
fields: PyRwLock::new(fields_map.clone()),
size: AtomicCell::new(total_size),
};

// Handle keyword arguments for field initialization
let py_instance = instance.into_ref_with_type(vm, cls.clone())?;
let py_obj: PyObjectRef = py_instance.clone().into();

// Set field values from kwargs using standard attribute setting
for (key, value) in args.kwargs.iter() {
if fields_map.contains_key(key.as_str()) {
py_obj.set_attr(vm.ctx.intern_str(key.as_str()), value.clone(), vm)?;
}
}

// Set field values from positional args
let field_names: Vec<String> = fields_map.keys().cloned().collect();
for (i, value) in args.args.iter().enumerate() {
if i < field_names.len() {
py_obj.set_attr(
vm.ctx.intern_str(field_names[i].as_str()),
value.clone(),
vm,
)?;
}
}

Ok(py_instance.into())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

��️ Potential issue | 🟡 Minor

PyCStructure.__new__ silently ignores invalid fields and extra arguments

A few behaviors here differ from ctypes.Structure and can hide bugs:

  • Invalid _fields_ entries at instance‑creation time are silently skipped (continue on bad tuples / names), even though class creation previously enforced stricter validation.
  • Unknown keyword arguments are ignored:
for (key, value) in args.kwargs.iter() {
    if fields_map.contains_key(key.as_str()) {
        py_obj.set_attr(...)?;
    }
}
  • Extra positional arguments beyond the number of fields are also silently dropped due to if i < field_names.len() without an error.

Consider tightening this up so user mistakes surface early:

-        for (key, value) in args.kwargs.iter() {
-            if fields_map.contains_key(key.as_str()) {
-                py_obj.set_attr(vm.ctx.intern_str(key.as_str()), value.clone(), vm)?;
-            }
-        }
+        for (key, value) in args.kwargs.iter() {
+            let field_name = key.as_str();
+            if !fields_map.contains_key(field_name) {
+                return Err(vm.new_type_error(format!(
+                    "'{field_name}' is not a valid field name for {}",
+                    cls.name()
+                )));
+            }
+            py_obj.set_attr(vm.ctx.intern_str(field_name), value.clone(), vm)?;
+        }
+
+        if args.args.len() > field_names.len() {
+            return Err(vm.new_type_error(format!(
+                "{}() expected at most {} positional arguments, got {}",
+                cls.name(),
+                field_names.len(),
+                args.args.len()
+            )));
+        }

This keeps instance initialization predictable and closer to CPython’s ctypes behavior.

@youknowone youknowone merged commit 4fcce4d into RustPython:main Nov 29, 2025
12 of 13 checks passed
@youknowone youknowone deleted the ctypes-struct branch November 29, 2025 03:13
@coderabbitai coderabbitai bot mentioned this pull request Nov 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant