-
Notifications
You must be signed in to change notification settings - Fork 1.4k
ctypes struct/union/array #6309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Caution Review failedThe pull request is closed. WalkthroughReworks _ctypes: adds metaclass-driven Structure/Union types, converts C arrays to buffer-backed, sequence-enabled types, implements descriptor-based PyCField read/write, changes byref to validate and return a new CArgObject, and exposes a platform-aware _wstring_at_addr attr. Changes
Sequence DiagramsequenceDiagram
participant User as User code
participant Meta as PyCStructType / PyCUnionType
participant Cls as Structure/Union Class
participant Field as PyCField (descriptor)
participant Inst as Structure/Union Instance
participant Core as _ctypes module
User->>Meta: define class with _fields_
Meta->>Meta: process_fields(_fields_)
Meta->>Field: create PyCField descriptors (compute size/offset)
Meta->>Cls: register descriptors and fields metadata
Meta-->>User: return created class
User->>Cls: instantiate(args/kwargs)
Cls->>Inst: py_new -> allocate buffer (Vec<u8>) and fields map
Inst->>Inst: populate buffer from args/kwargs
Inst-->>User: instance returned
User->>Inst: read attribute
Inst->>Field: descriptor __get__ (reads buffer[offset...])
Field->>Field: bytes_to_value -> PyObject
Field-->>User: value
User->>Inst: write attribute
Inst->>Field: descriptor __set__ (value_to_bytes -> bytes)
Field->>Inst: write bytes into buffer[offset...]
Field-->>User: done
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro 📒 Files selected for processing (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
1e983b1 to
df369d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/ctypes/field.rs (1)
68-76: Repr displays boolean as size value.
bitfield_sizeis abool(line 29), but the repr formatbit_size={bitfield_size}suggests it should be a numeric size. This will displaybit_size=trueinstead of an actual bit count.Either rename the field to
is_bitfieldor change its type to store the actual bit size:- pub(super) bitfield_size: bool, + pub(super) bitfield_size: Option<usize>, // None if not a bitfield, Some(n) for n-bit field
🧹 Nitpick comments (3)
crates/vm/src/stdlib/ctypes/union.rs (1)
85-109: Consider extracting shared size calculation logic.This
get_field_sizelogic duplicates the element size derivation inbase.rs(lines 311-324) andarray.rs. Consider extracting this into a shared helper function in_ctypesto reduce duplication and ensure consistent behavior.Example shared helper:
// In _ctypes module pub fn get_field_size_from_type(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> { // ... shared implementation }crates/vm/src/stdlib/ctypes/array.rs (1)
374-382: Silent failure when value is not bytes.The
set_valuesetter silently ignores non-PyBytesvalues and returnsOk(()). This could mask user errors. Consider raising aTypeErrorfor consistency withset_raw.#[pygetset(setter)] fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> { if let Some(bytes) = value.downcast_ref::<PyBytes>() { let mut buffer = self.buffer.write(); let src = bytes.as_bytes(); let len = std::cmp::min(src.len(), buffer.len()); buffer[..len].copy_from_slice(&src[..len]); + Ok(()) + } else { + Err(vm.new_type_error("expected bytes".to_owned())) } - Ok(()) }crates/vm/src/stdlib/ctypes/structure.rs (1)
233-242: Inconsistent error handling: silently skipping malformed fields.
process_fieldsreturns errors for malformed field tuples, butpy_newsilently skips them withcontinue. This inconsistency could cause hard-to-debug issues where a structure appears to work but has missing fields.Consider reusing validation logic or returning errors consistently:
- let Some(field_tuple) = field.downcast_ref::<PyTuple>() else { - continue; - }; - if field_tuple.len() < 2 { - continue; - } - let Some(name) = field_tuple.first().unwrap().downcast_ref::<PyStr>() else { - continue; - }; + let field_tuple = field + .downcast_ref::<PyTuple>() + .ok_or_else(|| vm.new_type_error("_fields_ must contain tuples".to_string()))?; + if field_tuple.len() < 2 { + return Err(vm.new_type_error("_fields_ tuple must have at least 2 elements".to_string())); + } + let name = field_tuple.first().unwrap() + .downcast_ref::<PyStr>() + .ok_or_else(|| vm.new_type_error("field name must be a string".to_string()))?;
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
crates/vm/src/stdlib/ctypes.rs(5 hunks)crates/vm/src/stdlib/ctypes/array.rs(5 hunks)crates/vm/src/stdlib/ctypes/base.rs(1 hunks)crates/vm/src/stdlib/ctypes/field.rs(4 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(1 hunks)crates/vm/src/stdlib/ctypes/structure.rs(1 hunks)crates/vm/src/stdlib/ctypes/union.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/structure.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/structure.rs
🧠 Learnings (2)
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (6)
crates/vm/src/stdlib/ctypes/base.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
crates/vm/src/stdlib/ctypes.rs (16)
mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)
crates/vm/src/stdlib/ctypes/union.rs (3)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)crates/vm/src/stdlib/ctypes/field.rs (1)
type_(271-273)crates/vm/src/stdlib/ctypes/structure.rs (1)
field_type(247-248)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)
crates/vm/src/stdlib/ctypes/field.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (13)
new(169-175)value(44-46)value(48-50)value(52-54)value(84-84)value(94-94)value(104-104)value(111-111)value(112-112)value(123-123)value(134-134)value(135-135)value(287-292)crates/vm/src/stdlib/ctypes/array.rs (5)
index(324-324)index(341-341)value(367-371)value(375-375)value(392-392)
crates/vm/src/stdlib/ctypes/structure.rs (2)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)crates/vm/src/stdlib/ctypes/field.rs (2)
name(266-268)size(246-248)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
🔇 Additional comments (14)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
27-34: LGTM!The pointer size calculation using
std::mem::size_of::<usize>()is correct for pointer types, and the initialization ofelement_sizeandbufferfields aligns with the newPyCArraystructure. The empty buffer is appropriate since this creates a type object, not an instance.crates/vm/src/stdlib/ctypes/base.rs (2)
306-324: LGTM!The element size derivation logic correctly extracts the type code from the
_type_attribute and usesget_sizefor single-character codes, with appropriate fallbacks tosize_of::<usize>()for unknown or complex types.
325-334: LGTM!The
PyCArrayTypeinitialization is consistent with the new buffer-backed array structure introduced across the ctypes module.crates/vm/src/stdlib/ctypes/union.rs (2)
16-36: LGTM!The metaclass constructor correctly delegates to
PyType::py_newand processes_fields_only when defined directly on the class (not inherited), which is the expected behavior for ctypes Union types.
41-83: LGTM!The
process_fieldsimplementation correctly implements union semantics where all fields start at offset 0. The validation of_fields_format (list/tuple of tuples with at least 2 elements) is thorough.crates/vm/src/stdlib/ctypes.rs (2)
24-25: LGTM!The metaclass registrations for
PyCStructTypeandPyCUnionTypeare correctly placed before the module extension, ensuring the metaclasses are available when creating the base classes.
63-78: LGTM!The
CArgObjectimplementation correctly follows thepyclassmacro conventions. The_objgetter provides access to the wrapped object as expected by ctypes semantics.crates/vm/src/stdlib/ctypes/array.rs (3)
33-65: LGTM!The
Callableimplementation correctly creates array instances from positional arguments, initializing the buffer with proper element sizing. The bounds checking (if i >= length) prevents overflow when more arguments than array length are provided.
87-116: LGTM!The
__mul__implementation correctly creates nested array types where the element size is the total size of the inner array. This enables expressions like(c_int * 5) * 3to create a 2D array structure.
248-256: LGTM!The
int_to_byteshelper correctly uses native endianness (to_ne_bytes) which matches platform ABI expectations for ctypes. The size handling covers common C integer sizes.crates/vm/src/stdlib/ctypes/structure.rs (2)
21-41: LGTM!The metaclass constructor correctly delegates to
PyType::py_newand processes_fields_only when defined directly on the class (avoiding inherited fields).
56-104: LGTM!Field processing correctly validates structure, extracts field metadata, and registers CField descriptors with proper offsets.
crates/vm/src/stdlib/ctypes/field.rs (2)
34-53: LGTM!Clean constructor with sensible defaults for optional fields.
102-134: LGTM!The descriptor correctly returns itself on class access and reads from the structure buffer on instance access with proper bounds checking.
| fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef { | ||
| match size { | ||
| 1 => vm.ctx.new_int(bytes[0] as i8).into(), | ||
| 2 => { | ||
| let val = i16::from_ne_bytes([bytes[0], bytes[1]]); | ||
| vm.ctx.new_int(val).into() | ||
| } | ||
| 4 => { | ||
| let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]); | ||
| vm.ctx.new_int(val).into() | ||
| } | ||
| 8 => { | ||
| let val = i64::from_ne_bytes([ | ||
| bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7], | ||
| ]); | ||
| vm.ctx.new_int(val).into() | ||
| } | ||
| _ => vm.ctx.new_int(0).into(), | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential panic if buffer is too short.
The bytes_to_int function accesses bytes[0], bytes[1], etc. without verifying that the slice has enough elements. If called with an undersized slice, this will panic.
fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef {
+ if bytes.len() < size {
+ return vm.ctx.new_int(0).into();
+ }
match size {
1 => vm.ctx.new_int(bytes[0] as i8).into(),
2 => {
- let val = i16::from_ne_bytes([bytes[0], bytes[1]]);
+ let val = i16::from_ne_bytes(bytes[..2].try_into().unwrap());
vm.ctx.new_int(val).into()
}
4 => {
- let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
+ let val = i32::from_ne_bytes(bytes[..4].try_into().unwrap());
vm.ctx.new_int(val).into()
}
8 => {
- let val = i64::from_ne_bytes([
- bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7],
- ]);
+ let val = i64::from_ne_bytes(bytes[..8].try_into().unwrap());
vm.ctx.new_int(val).into()
}
_ => vm.ctx.new_int(0).into(),
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef { | |
| match size { | |
| 1 => vm.ctx.new_int(bytes[0] as i8).into(), | |
| 2 => { | |
| let val = i16::from_ne_bytes([bytes[0], bytes[1]]); | |
| vm.ctx.new_int(val).into() | |
| } | |
| 4 => { | |
| let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]); | |
| vm.ctx.new_int(val).into() | |
| } | |
| 8 => { | |
| let val = i64::from_ne_bytes([ | |
| bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7], | |
| ]); | |
| vm.ctx.new_int(val).into() | |
| } | |
| _ => vm.ctx.new_int(0).into(), | |
| } | |
| } | |
| fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef { | |
| if bytes.len() < size { | |
| return vm.ctx.new_int(0).into(); | |
| } | |
| match size { | |
| 1 => vm.ctx.new_int(bytes[0] as i8).into(), | |
| 2 => { | |
| let val = i16::from_ne_bytes(bytes[..2].try_into().unwrap()); | |
| vm.ctx.new_int(val).into() | |
| } | |
| 4 => { | |
| let val = i32::from_ne_bytes(bytes[..4].try_into().unwrap()); | |
| vm.ctx.new_int(val).into() | |
| } | |
| 8 => { | |
| let val = i64::from_ne_bytes(bytes[..8].try_into().unwrap()); | |
| vm.ctx.new_int(val).into() | |
| } | |
| _ => vm.ctx.new_int(0).into(), | |
| } | |
| } |
🤖 Prompt for AI Agents
crates/vm/src/stdlib/ctypes/array.rs around lines 258 to 277: the bytes_to_int
function indexes bytes[0..7] without checking slice length which can panic on
short slices; validate the slice length against the requested size before
indexing and handle the error path instead of panicking — either return a Python
exception (e.g. raise a ValueError/IndexError via the VM context) or return a
safe default; implement a guard like if bytes.len() < size { return
vm.ctx.new_value_error("buffer too small").into() } (or equivalent using VM
APIs) then proceed to construct integers using safe indexing or by copying into
fixed-size arrays after confirming length.
| impl PyCArray { | ||
| #[allow(unused)] | ||
| pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> { | ||
| let value = self.value.read(); | ||
| let py_bytes = value.downcast_ref::<PyBytes>().unwrap(); | ||
| let bytes = py_bytes.payload().to_vec(); | ||
| Ok(libffi::middle::Arg::new(&bytes)) | ||
| let buffer = self.buffer.read(); | ||
| Ok(libffi::middle::Arg::new(&buffer.clone())) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use-after-free: Arg::new references a temporary that is immediately dropped.
libffi::middle::Arg::new(&buffer.clone()) creates an Arg referencing a cloned Vec<u8> that is dropped at the end of the expression, leaving the Arg pointing to freed memory. This will cause undefined behavior when the Arg is used.
The to_arg implementation needs to ensure the buffer lives long enough. This likely requires a different approach, such as storing the buffer in a way that outlives the FFI call, or using a raw pointer approach that matches how ctypes handles memory for FFI.
pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> {
- let buffer = self.buffer.read();
- Ok(libffi::middle::Arg::new(&buffer.clone()))
+ // TODO: This needs a different approach to ensure buffer lifetime
+ // The buffer must outlive the Arg returned here
+ let buffer = self.buffer.read();
+ let ptr = buffer.as_ptr();
+ Ok(libffi::middle::Arg::new(&ptr))
}Note: The suggested fix passes a pointer, but the overall lifetime management for FFI calls may need architectural review.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/array.rs around lines 404–409, the current
implementation calls libffi::middle::Arg::new(&buffer.clone()), creating an Arg
that points at a temporary Vec<u8> that is dropped immediately (use-after-free).
Fix by ensuring the buffer outlives the Arg: do not pass a reference to a
temporary clone; instead make the byte storage owned and stored with a lifetime
tied to the PyCArray (e.g., store Arc<Vec<u8>> or Box<[u8]>/Vec<u8> as a field
or in a dedicated arg_storage on self), create the Arg from a reference to that
owned storage (or use .as_ptr() with explicit length and ensure ownership
remains valid until the ffi call completes). Update PyCArray fields and to_arg
to return an Arg that references the persistent storage rather than a dropped
temporary.
| impl PyCField { | ||
| /// Convert bytes to a Python value based on size | ||
| fn bytes_to_value(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyResult { | ||
| match size { | ||
| 1 => Ok(vm.ctx.new_int(bytes[0] as i8).into()), | ||
| 2 => { | ||
| let val = i16::from_ne_bytes([bytes[0], bytes[1]]); | ||
| Ok(vm.ctx.new_int(val).into()) | ||
| } | ||
| 4 => { | ||
| let val = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]); | ||
| Ok(vm.ctx.new_int(val).into()) | ||
| } | ||
| 8 => { | ||
| let val = i64::from_ne_bytes([ | ||
| bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7], | ||
| ]); | ||
| Ok(vm.ctx.new_int(val).into()) | ||
| } | ||
| _ => Ok(vm.ctx.new_int(0).into()), | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type conversion only handles signed integers; missing unsigned and floating-point support.
This implementation always interprets bytes as signed integers, but ctypes supports unsigned types (c_uint, c_ulong, etc.) and floating-point types (c_float, c_double). Reading an unsigned field or float will produce incorrect values.
Consider using the field's proto type information to determine signedness and type:
- fn bytes_to_value(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyResult {
+ fn bytes_to_value(bytes: &[u8], size: usize, proto: &PyObjectRef, vm: &VirtualMachine) -> PyResult {
+ // Check _type_ to determine signed/unsigned/float
+ let type_code = proto.get_attr("_type_", vm).ok()
+ .and_then(|t| t.str(vm).ok())
+ .map(|s| s.to_string());
+
match size {
1 => Ok(vm.ctx.new_int(bytes[0] as i8).into()),
+ // Add unsigned variants based on type_code
// ...
}
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 136 to 157, bytes_to_value
currently always decodes signed integers; change it to inspect the field's
proto/type information to choose the correct decoding path (signed int, unsigned
int, or float) and return the appropriate Python object: for unsigned sizes use
u8/u16/u32/u64 via from_ne_bytes and pass the value to vm.ctx.new_int (Python
ints cover unsigned), for floats use f32::from_ne_bytes/f64::from_ne_bytes and
vm.ctx.new_float, and for signed sizes keep the existing i8/i16/i32/i64
handling; also validate that bytes.len() >= size and return an error or sensible
default on mismatch rather than silently returning 0.
| /// Convert a Python value to bytes | ||
| fn value_to_bytes(value: &PyObjectRef, size: usize, vm: &VirtualMachine) -> PyResult<Vec<u8>> { | ||
| if let Ok(int_val) = value.try_int(vm) { | ||
| let i = int_val.as_bigint(); | ||
| match size { | ||
| 1 => { | ||
| let val = i.to_i8().unwrap_or(0); | ||
| Ok(val.to_ne_bytes().to_vec()) | ||
| } | ||
| 2 => { | ||
| let val = i.to_i16().unwrap_or(0); | ||
| Ok(val.to_ne_bytes().to_vec()) | ||
| } | ||
| 4 => { | ||
| let val = i.to_i32().unwrap_or(0); | ||
| Ok(val.to_ne_bytes().to_vec()) | ||
| } | ||
| 8 => { | ||
| let val = i.to_i64().unwrap_or(0); | ||
| Ok(val.to_ne_bytes().to_vec()) | ||
| } | ||
| _ => Ok(vec![0u8; size]), | ||
| } | ||
| } else { | ||
| Ok(vec![0u8; size]) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent truncation on integer overflow.
unwrap_or(0) silently handles values that don't fit in the target size. For example, assigning 300 to a 1-byte field silently becomes 0 instead of raising an OverflowError (as CPython ctypes does).
Consider returning an error on overflow:
1 => {
- let val = i.to_i8().unwrap_or(0);
+ let val = i.to_i8().ok_or_else(|| {
+ vm.new_overflow_error(format!("int too large to convert to c_byte"))
+ })?;
Ok(val.to_ne_bytes().to_vec())
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Convert a Python value to bytes | |
| fn value_to_bytes(value: &PyObjectRef, size: usize, vm: &VirtualMachine) -> PyResult<Vec<u8>> { | |
| if let Ok(int_val) = value.try_int(vm) { | |
| let i = int_val.as_bigint(); | |
| match size { | |
| 1 => { | |
| let val = i.to_i8().unwrap_or(0); | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| 2 => { | |
| let val = i.to_i16().unwrap_or(0); | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| 4 => { | |
| let val = i.to_i32().unwrap_or(0); | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| 8 => { | |
| let val = i.to_i64().unwrap_or(0); | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| _ => Ok(vec![0u8; size]), | |
| } | |
| } else { | |
| Ok(vec![0u8; size]) | |
| } | |
| } | |
| } | |
| /// Convert a Python value to bytes | |
| fn value_to_bytes(value: &PyObjectRef, size: usize, vm: &VirtualMachine) -> PyResult<Vec<u8>> { | |
| if let Ok(int_val) = value.try_int(vm) { | |
| let i = int_val.as_bigint(); | |
| match size { | |
| 1 => { | |
| let val = i.to_i8().ok_or_else(|| { | |
| vm.new_overflow_error(format!("int too large to convert to c_byte")) | |
| })?; | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| 2 => { | |
| let val = i.to_i16().ok_or_else(|| { | |
| vm.new_overflow_error(format!("int too large to convert to c_short")) | |
| })?; | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| 4 => { | |
| let val = i.to_i32().ok_or_else(|| { | |
| vm.new_overflow_error(format!("int too large to convert to c_int")) | |
| })?; | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| 8 => { | |
| let val = i.to_i64().ok_or_else(|| { | |
| vm.new_overflow_error(format!("int too large to convert to c_long")) | |
| })?; | |
| Ok(val.to_ne_bytes().to_vec()) | |
| } | |
| _ => Ok(vec![0u8; size]), | |
| } | |
| } else { | |
| Ok(vec![0u8; size]) | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 159-186, the code currently
uses unwrap_or(0) when converting BigInt -> i8/i16/i32/i64 which silently
truncates values that don't fit; change those conversions to check the Option
return and return an OverflowError via the VM on failure. Concretely, replace
each let val = i.to_i8().unwrap_or(0); (and the i16/i32/i64 variants) with a
match or if-let that returns Err(vm.new_overflow_error(format!("integer
overflow: value {} does not fit into {} bytes", i, size))) when the conversion
returns None, otherwise proceed to to_ne_bytes().to_vec() for the successful
value. Keep the existing default branch for unsupported sizes.
| /// Get the size of a ctypes type | ||
| fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> { | ||
| // Try to get _type_ attribute for simple types | ||
| if let Some(size) = field_type | ||
| .get_attr("_type_", vm) | ||
| .ok() | ||
| .and_then(|type_attr| type_attr.str(vm).ok()) | ||
| .and_then(|type_str| { | ||
| let s = type_str.to_string(); | ||
| (s.len() == 1).then(|| get_size(&s)) | ||
| }) | ||
| { | ||
| return Ok(size); | ||
| } | ||
|
|
||
| // Try sizeof for other types | ||
| if let Some(s) = field_type | ||
| .get_attr("size_of_instances", vm) | ||
| .ok() | ||
| .and_then(|size_method| size_method.call((), vm).ok()) | ||
| .and_then(|size| size.try_int(vm).ok()) | ||
| .and_then(|n| n.as_bigint().to_usize()) | ||
| { | ||
| return Ok(s); | ||
| } | ||
|
|
||
| // Default to pointer size for unknown types | ||
| Ok(std::mem::size_of::<usize>()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential panic when _type_ contains unrecognized type code.
The get_size() function (from ctypes.rs) uses unreachable!() for unknown type strings. While the s.len() == 1 check filters multi-char strings, a single unrecognized character (e.g., "x") will still trigger a panic.
Consider validating the type code before calling get_size, or wrap it with a fallback:
.and_then(|type_str| {
let s = type_str.to_string();
- (s.len() == 1).then(|| get_size(&s))
+ const VALID_TYPES: &[&str] = &[
+ "u", "c", "b", "h", "H", "i", "I", "l", "q", "L", "Q", "f", "d", "g", "?", "B", "P", "z", "Z", "O"
+ ];
+ (s.len() == 1 && VALID_TYPES.contains(&s.as_str())).then(|| get_size(&s))
})📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Get the size of a ctypes type | |
| fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> { | |
| // Try to get _type_ attribute for simple types | |
| if let Some(size) = field_type | |
| .get_attr("_type_", vm) | |
| .ok() | |
| .and_then(|type_attr| type_attr.str(vm).ok()) | |
| .and_then(|type_str| { | |
| let s = type_str.to_string(); | |
| (s.len() == 1).then(|| get_size(&s)) | |
| }) | |
| { | |
| return Ok(size); | |
| } | |
| // Try sizeof for other types | |
| if let Some(s) = field_type | |
| .get_attr("size_of_instances", vm) | |
| .ok() | |
| .and_then(|size_method| size_method.call((), vm).ok()) | |
| .and_then(|size| size.try_int(vm).ok()) | |
| .and_then(|n| n.as_bigint().to_usize()) | |
| { | |
| return Ok(s); | |
| } | |
| // Default to pointer size for unknown types | |
| Ok(std::mem::size_of::<usize>()) | |
| } | |
| /// Get the size of a ctypes type | |
| fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> { | |
| // Try to get _type_ attribute for simple types | |
| if let Some(size) = field_type | |
| .get_attr("_type_", vm) | |
| .ok() | |
| .and_then(|type_attr| type_attr.str(vm).ok()) | |
| .and_then(|type_str| { | |
| let s = type_str.to_string(); | |
| const VALID_TYPES: &[&str] = &[ | |
| "u", "c", "b", "h", "H", "i", "I", "l", "q", "L", "Q", "f", "d", "g", "?", "B", "P", "z", "Z", "O" | |
| ]; | |
| (s.len() == 1 && VALID_TYPES.contains(&s.as_str())).then(|| get_size(&s)) | |
| }) | |
| { | |
| return Ok(size); | |
| } | |
| // Try sizeof for other types | |
| if let Some(s) = field_type | |
| .get_attr("size_of_instances", vm) | |
| .ok() | |
| .and_then(|size_method| size_method.call((), vm).ok()) | |
| .and_then(|size| size.try_int(vm).ok()) | |
| .and_then(|n| n.as_bigint().to_usize()) | |
| { | |
| return Ok(s); | |
| } | |
| // Default to pointer size for unknown types | |
| Ok(std::mem::size_of::<usize>()) | |
| } |
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/structure.rs around lines 106 to 134, the call to
get_size(...) can panic if the single-character _type_ value is an unrecognized
code (get_size currently uses unreachable!()). Before calling get_size, validate
the single-character type code against the set of supported ctypes codes (or map
it with a safe match) and only call get_size for known codes; if the code is
unknown, fall back to the pointer-size default (or the sizeof logic already
used) so the function returns Ok(pointer_size) instead of panicking.
| #[pymethod] | ||
| fn __mul__(cls: PyTypeRef, n: isize, vm: &VirtualMachine) -> PyResult { | ||
| use super::array::{PyCArray, PyCArrayType}; | ||
| if n < 0 { | ||
| return Err(vm.new_value_error(format!("Array length must be >= 0, not {n}"))); | ||
| } | ||
| // For structures, element size is the structure size (sum of field sizes) | ||
| let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields | ||
| Ok(PyCArrayType { | ||
| inner: PyCArray { | ||
| typ: PyRwLock::new(cls), | ||
| length: AtomicCell::new(n as usize), | ||
| element_size: AtomicCell::new(element_size), | ||
| buffer: PyRwLock::new(vec![]), | ||
| }, | ||
| } | ||
| todo!("Implement PyCStructure::py_new") | ||
| .to_pyobject(vm)) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect element size calculation for Structure arrays.
The element_size is hardcoded to size_of::<usize>() (pointer size) instead of the actual structure size. This will cause incorrect array sizing and memory access issues.
Consider computing the actual structure size:
fn __mul__(cls: PyTypeRef, n: isize, vm: &VirtualMachine) -> PyResult {
use super::array::{PyCArray, PyCArrayType};
if n < 0 {
return Err(vm.new_value_error(format!("Array length must be >= 0, not {n}")));
}
- // For structures, element size is the structure size (sum of field sizes)
- let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields
+ // Calculate element size from _fields_
+ let element_size = if let Some(fields_attr) = cls.get_attr(vm.ctx.intern_str("_fields_"), vm).ok() {
+ // Sum field sizes (reuse process_fields logic or store size on class)
+ Self::calculate_struct_size(&fields_attr, vm).unwrap_or(std::mem::size_of::<usize>())
+ } else {
+ std::mem::size_of::<usize>()
+ };Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/structure.rs around lines 136–154, the
element_size for structure arrays is incorrectly hardcoded to
size_of::<usize>(); replace this with computation of the actual structure size
derived from the Structure type's field definitions (iterate the structure's
fields or call the type's layout/size helper), summing each field's C type size
and applying proper alignment/padding or using the structure's computed layout
helper if available, then store that computed size in element_size; also handle
any errors when reading field metadata and return a PyResult error if the
structure size cannot be determined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (4)
crates/vm/src/stdlib/ctypes/array.rs (2)
258-277: Potential panic if buffer slice is too short.The
bytes_to_intfunction directly indexesbytes[0],bytes[1], etc. without verifying slice length. If called with an undersized slice, this will panic.Add a length guard at the start:
fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef { + if bytes.len() < size { + return vm.ctx.new_int(0).into(); + } match size { 1 => vm.ctx.new_int(bytes[0] as i8).into(),
404-412: Lifetime concern:Arg::new(&ptr)references a stack-local pointer.The
ptrvariable is local to this function. WhileArg::new(&ptr)takes a reference to the pointer value itself (not the buffer data), this still creates anArgthat references memory that will be invalid after the function returns and the read guard is dropped.This requires architectural changes to ensure the buffer outlives the FFI call. Consider storing the buffer pointer in a way that persists for the duration of the FFI call, or restructuring how arguments are passed to FFI.
crates/vm/src/stdlib/ctypes/structure.rs (2)
106-134: Potential panic with unrecognized single-character type codes.The
get_size()function usesunreachable!()for unknown type strings. While thes.len() == 1check filters multi-char strings, an unrecognized single character (e.g.,"x") will still trigger a panic.Validate the type code before calling
get_size:.and_then(|type_str| { let s = type_str.to_string(); - (s.len() == 1).then(|| get_size(&s)) + const VALID_TYPES: &str = "cbBhHiIlqLQfdg?PzZOu"; + (s.len() == 1 && VALID_TYPES.contains(&s)).then(|| get_size(&s)) })
136-155: Element size is hardcoded to pointer size instead of actual structure size.The
element_sizeshould be calculated from the structure's fields, not hardcoded tosize_of::<usize>(). This will cause incorrect array sizing when creating arrays of structures.Consider computing the actual structure size from
_fields_:fn __mul__(cls: PyTypeRef, n: isize, vm: &VirtualMachine) -> PyResult { use super::array::{PyCArray, PyCArrayType}; if n < 0 { return Err(vm.new_value_error(format!("Array length must be >= 0, not {n}"))); } - // TODO: Calculate element size properly - // For structures, element size is the structure size (sum of field sizes) - let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields + // Calculate element size from _fields_ + let element_size = if let Ok(fields_attr) = cls.as_object().get_attr("_fields_", vm) { + Self::calculate_struct_size_from_fields(&fields_attr, vm).unwrap_or(std::mem::size_of::<usize>()) + } else { + std::mem::size_of::<usize>() + };You'll need to add a helper method or reuse logic from
process_fieldsto sum field sizes.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/ctypes/structure.rs (1)
308-312:_fields_getter returns None instead of actual fields.The
_fields_property always returnsvm.ctx.none()instead of retrieving the actual_fields_from the class. While the comment suggests this is intentional, this may break code that expects to read_fields_from an instance.Consider returning the class's
_fields_attribute:#[pygetset] fn _fields_(&self, vm: &VirtualMachine) -> PyObjectRef { - // Return the _fields_ from the class, not instance - vm.ctx.none() + // Return the _fields_ from the class + self.class() + .get_attr(vm.ctx.intern_str("_fields_"), vm) + .unwrap_or_else(|_| vm.ctx.none()) }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
crates/vm/src/stdlib/ctypes.rs(5 hunks)crates/vm/src/stdlib/ctypes/array.rs(5 hunks)crates/vm/src/stdlib/ctypes/structure.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/array.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/array.rs
🧠 Learnings (3)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (1)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (8)
crates/vm/src/stdlib/ctypes.rs (3)
63-78: LGTM! Clean CArgObject implementation.The
CArgObjectstruct follows the RustPython macro conventions correctly with#[pyclass]and#[pygetset]. The#[allow(dead_code)]onoffsetis appropriate since it's stored for future use but not yet exposed.
345-367: LGTM! byref error message now includes actual type name.The fix correctly uses
obj.class().name()to provide helpful error messages showing the actual type passed, addressing the previous review feedback.
436-439:_cast_addrusestodo!()which will panic at runtime.Similar to
_wstring_at_addr, this will panic if called. If this is intentional for tracking unimplemented features, consider using a more explicit approach.Is this intentional as a development placeholder? If users might call this, consider returning a Python exception instead.
crates/vm/src/stdlib/ctypes/array.rs (2)
33-65: LGTM! Clean Callable implementation for PyCArrayType.The implementation correctly creates array instances from the type metadata, bounds-checks the positional arguments, and initializes the buffer appropriately.
87-116: LGTM! Well-structured__mul__for nested array types.The implementation correctly validates negative lengths and creates nested array types with proper element size calculation. The approach of using the current array type as the element type for the new outer array is correct for ctypes semantics.
crates/vm/src/stdlib/ctypes/structure.rs (3)
21-41: LGTM! Clean metaclass constructor.The
py_newcorrectly delegates toPyType::py_newand processes_fields_only when defined directly on the new class (not inherited), which is the correct behavior for Structure metaclasses.
56-104: Well-structured field processing with proper offset tracking.The
process_fieldsmethod correctly validates_fields_format, computes offsets, and createsPyCFielddescriptors. Usingset_attrto register descriptors on the class is the right approach.
220-265: Good: IndexMap preserves insertion order for positional arguments.The use of
IndexMapinstead ofHashMapensures thatfields_map.keys()returns field names in declaration order, so positional arguments are correctly assigned. This addresses the concern from the previous review.Also applies to: 287-297
| fn set_value(&self, value: PyObjectRef, _vm: &VirtualMachine) -> PyResult<()> { | ||
| if let Some(bytes) = value.downcast_ref::<PyBytes>() { | ||
| let mut buffer = self.buffer.write(); | ||
| let src = bytes.as_bytes(); | ||
| let len = std::cmp::min(src.len(), buffer.len()); | ||
| buffer[..len].copy_from_slice(&src[..len]); | ||
| } | ||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_value silently ignores non-bytes input.
When value is not a PyBytes, the function returns Ok(()) without any indication that the value wasn't set. This could lead to confusing behavior where assignments appear to succeed but have no effect.
Consider raising a type error for non-bytes input:
fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> {
if let Some(bytes) = value.downcast_ref::<PyBytes>() {
let mut buffer = self.buffer.write();
let src = bytes.as_bytes();
let len = std::cmp::min(src.len(), buffer.len());
buffer[..len].copy_from_slice(&src[..len]);
+ Ok(())
+ } else {
+ Err(vm.new_type_error("expected bytes".to_owned()))
}
- Ok(())
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn set_value(&self, value: PyObjectRef, _vm: &VirtualMachine) -> PyResult<()> { | |
| if let Some(bytes) = value.downcast_ref::<PyBytes>() { | |
| let mut buffer = self.buffer.write(); | |
| let src = bytes.as_bytes(); | |
| let len = std::cmp::min(src.len(), buffer.len()); | |
| buffer[..len].copy_from_slice(&src[..len]); | |
| } | |
| Ok(()) | |
| } | |
| fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> { | |
| if let Some(bytes) = value.downcast_ref::<PyBytes>() { | |
| let mut buffer = self.buffer.write(); | |
| let src = bytes.as_bytes(); | |
| let len = std::cmp::min(src.len(), buffer.len()); | |
| buffer[..len].copy_from_slice(&src[..len]); | |
| Ok(()) | |
| } else { | |
| Err(vm.new_type_error("expected bytes".to_owned())) | |
| } | |
| } |
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/array.rs around lines 374 to 382, the set_value
function currently returns Ok(()) when value is not a PyBytes which silently
ignores the assignment; change this to return a Python TypeError instead. Update
the function to check if value.downcast_ref::<PyBytes>() is Some and perform the
copy as before, but if it is None return Err(vm.new_type_error("expected a
bytes-like object".to_owned())) (or a similar clear message), so non-bytes input
raises an error rather than silently succeeding.
39e6885 to
10fe041
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/ctypes/base.rs (1)
44-52: Duplicate condition check - likely copy-paste error.The same
PyByteswith length 1 check appears twice. The second check (lines 48-51) should likely check a different type (perhapsPyByteArray?).if value .clone() .downcast_exact::<PyBytes>(vm) .is_ok_and(|v| v.len() == 1) - || value - .clone() - .downcast_exact::<PyBytes>(vm) - .is_ok_and(|v| v.len() == 1) + || value + .clone() + .downcast_exact::<PyByteArray>(vm) + .is_ok_and(|v| v.len() == 1) || value .clone() .downcast_exact::<PyInt>(vm)
♻️ Duplicate comments (8)
crates/vm/src/stdlib/ctypes.rs (1)
422-434: Windowstodo!()will panic at runtime.This was flagged in a previous review. The
todo!()macro will cause a runtime panic on Windows. Consider returning 0 with a TODO comment, or raising a proper Python exception.#[cfg(target_os = "windows")] { - todo!("On Windows, use wcslen from ucrt") + // TODO: RUSTPYTHON - On Windows, should use wcslen from ucrt + // Returning 0 as placeholder until Windows support is added + 0 }crates/vm/src/stdlib/ctypes/field.rs (2)
136-157: Type conversion only handles signed integers.This was flagged in a previous review. The implementation always interprets bytes as signed integers, but ctypes supports unsigned types (
c_uint,c_ulong) and floating-point types (c_float,c_double). Consider using the field'sprototype information to determine the correct interpretation.
159-186: Silent truncation on integer overflow.This was flagged in a previous review. Using
unwrap_or(0)silently handles values that don't fit, whereas CPython ctypes raisesOverflowError. For example, assigning300to a 1-byte field silently becomes0.1 => { - let val = i.to_i8().unwrap_or(0); + let val = i.to_i8().ok_or_else(|| { + vm.new_overflow_error("int too large to convert".to_owned()) + })?; Ok(val.to_ne_bytes().to_vec()) }crates/vm/src/stdlib/ctypes/array.rs (3)
258-277: Potential panic if buffer is too short.This was flagged in a previous review. The
bytes_to_intfunction accesses array indices without validating slice length. If called with an undersized slice, this will panic.fn bytes_to_int(bytes: &[u8], size: usize, vm: &VirtualMachine) -> PyObjectRef { + if bytes.len() < size { + return vm.ctx.new_int(0).into(); + } match size { 1 => vm.ctx.new_int(bytes[0] as i8).into(), 2 => { - let val = i16::from_ne_bytes([bytes[0], bytes[1]]); + let val = i16::from_ne_bytes(bytes[..2].try_into().unwrap()); vm.ctx.new_int(val).into() }
373-382:set_valuesilently ignores non-bytes input.This was flagged in a previous review. When
valueis notPyBytes, the function returnsOk(())without any indication that the value wasn't set.fn set_value(&self, value: PyObjectRef, vm: &VirtualMachine) -> PyResult<()> { if let Some(bytes) = value.downcast_ref::<PyBytes>() { let mut buffer = self.buffer.write(); let src = bytes.as_bytes(); let len = std::cmp::min(src.len(), buffer.len()); buffer[..len].copy_from_slice(&src[..len]); + Ok(()) + } else { + Err(vm.new_type_error("expected bytes".to_owned())) } - Ok(()) }
404-412: Lifetime issue withto_arg- pointer may dangle after read lock is released.This was flagged in a previous review. The
buffer.read()guard is dropped at the end of the function, butptrstill references that memory. TheArgreturned will point to potentially invalid memory when used.The TODO comment acknowledges this, but this is a use-after-free that will cause undefined behavior. Consider:
- Storing a persistent buffer pointer in
PyCArraythat outlives FFI calls- Using
Arc<RwLock<Vec<u8>>>and keeping the guard alive during FFI- Copying the buffer to a location that's guaranteed to outlive the call
crates/vm/src/stdlib/ctypes/structure.rs (2)
106-134: Guardget_sizeagainst unknown_type_codes to avoid VM panics
get_field_sizecallsget_size(&s)for any single‑char_type_value without validating the code. If_type_is a single unknown character (e.g. user‑defined or corrupted),get_size’s internalunreachable!()will panic the VM.Consider explicitly validating the code before calling
get_size, or delegating to a non‑panicking helper in_ctypes:- .and_then(|type_str| { - let s = type_str.to_string(); - (s.len() == 1).then(|| get_size(&s)) - }) + .and_then(|type_str| { + let s = type_str.to_string(); + if s.len() != 1 { + return None; + } + // Keep this set in sync with `_ctypes::get_size`. + const VALID_TYPES: &[&str] = &[ + "u", "c", "b", "h", "H", "i", "I", "l", "q", "L", "Q", "f", "d", "g", "?", + "B", "P", "z", "Z", "O", + ]; + if !VALID_TYPES.contains(&s.as_str()) { + return None; + } + Some(get_size(&s)) + })This keeps
get_field_sizetotal‑failure‑free and avoids panics even if_type_is malformed.
136-175:Structure * nuses pointer size instead of structure size for arrays
PyCStructType::__mul__sets:// TODO: Calculate element size properly let element_size = std::mem::size_of::<usize>();For arrays of structures (
Point * 10), this will allocaten * sizeof(usize)bytes instead ofn * sizeof(struct). Any field descriptor that uses struct offsets larger thansizeof(usize)risks indexing past the end of the array buffer and panicking.You should compute the real structure size from
_fields_, ideally reusing the same logic as inPyCStructure::py_new:- // TODO: Calculate element size properly - // For structures, element size is the structure size (sum of field sizes) - let element_size = std::mem::size_of::<usize>(); // Default, should calculate from fields + // Element size is the structure size derived from the class's `_fields_` + let element_size = { + let fields_attr = cls + .as_object() + .get_attr("_fields_", vm) + .map_err(|_| vm.new_type_error("Structure has no _fields_".to_owned()))?; + // Factor this out of `PyCStructure::py_new` into a shared helper + PyCStructType::get_struct_size_from_fields(&fields_attr, vm)? + };…and add a
get_struct_size_from_fieldshelper that mirrors the size/offset accumulation inPyCStructure::py_new, so arrays and instances share one definition of layout.Until this is fixed, arrays of non‑trivial
Structureare very likely incorrect and fragile.
🧹 Nitpick comments (3)
crates/vm/src/stdlib/ctypes/union.rs (1)
38-110: Code duplication withPyCStructType- consider extracting shared logic.The
process_fieldsandget_field_sizemethods are nearly identical to those instructure.rs(only differing in offset calculation - always 0 for unions vs. cumulative for structs). Consider extracting the common field parsing and size calculation into a shared helper module.For example, a shared helper could be:
// In a common module pub fn parse_fields(fields_attr: &PyObjectRef, vm: &VirtualMachine) -> PyResult<Vec<(String, PyObjectRef, usize)>> { ... } pub fn get_field_size(field_type: &PyObjectRef, vm: &VirtualMachine) -> PyResult<usize> { ... }crates/vm/src/stdlib/ctypes/array.rs (1)
248-256:int_to_bytessilently truncates on overflow (same issue as field.rs).Using
unwrap_or(0)silently handles values that don't fit. Consider returning aPyResult<Vec<u8>>and raisingOverflowErrorfor consistency with CPython.crates/vm/src/stdlib/ctypes/structure.rs (1)
306-313: Implement_fields_getter for instances (currently always returnsNone)The
_fields_getter onPyCStructurecurrently returnsNone, despite the comment saying it should expose the class’s_fields_. This breaks introspection likeinstance._fields_and diverges from expectations aroundctypes.Structure.Now that you store
fields: IndexMap<String, FieldInfo>and the class already has a_fields_attribute, consider:
- Either returning the class attribute:
#[pygetset] fn _fields_(&self, vm: &VirtualMachine) -> PyObjectRef { - // Return the _fields_ from the class, not instance - vm.ctx.none() + self.as_object() + .class() + .get_attr("_fields_", vm) + .unwrap_or_else(|_| vm.ctx.none()) }
- Or reconstructing
_fields_fromself.fieldsif you want the instance to be the single source of truth.Either way, returning
Nonepermanently is likely to surprise users.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
crates/vm/src/stdlib/ctypes.rs(5 hunks)crates/vm/src/stdlib/ctypes/array.rs(5 hunks)crates/vm/src/stdlib/ctypes/base.rs(1 hunks)crates/vm/src/stdlib/ctypes/field.rs(4 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(1 hunks)crates/vm/src/stdlib/ctypes/structure.rs(1 hunks)crates/vm/src/stdlib/ctypes/union.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/stdlib/ctypes/pointer.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/array.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/array.rs
🧠 Learnings (6)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to Lib/**/*.py : When modifying files in Lib/, add a `# TODO: RUSTPYTHON` comment to mark modifications, or use `unittest.skip("TODO: RustPython <reason>")` or `unittest.expectedFailure` with a comment
Applied to files:
crates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (5)
crates/vm/src/stdlib/ctypes/union.rs (3)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)crates/vm/src/stdlib/ctypes/field.rs (1)
type_(271-273)crates/vm/src/stdlib/ctypes/structure.rs (1)
field_type(248-249)
crates/vm/src/stdlib/ctypes.rs (3)
crates/vm/src/stdlib/ctypes/field.rs (4)
name(266-268)obj(120-120)obj(205-205)offset(276-278)crates/vm/src/stdlib/ctypes/function.rs (2)
obj(351-351)f(273-274)crates/vm/src/frame.rs (1)
class(122-124)
crates/vm/src/stdlib/ctypes/field.rs (2)
crates/vm/src/stdlib/ctypes/array.rs (5)
index(324-324)index(341-341)value(367-371)value(375-375)value(392-392)crates/vm/src/stdlib/ctypes.rs (1)
obj(337-337)
crates/vm/src/stdlib/ctypes/base.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes.rs (17)
get_size(140-159)mem(142-142)mem(143-143)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)size_of(197-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (9)
crates/vm/src/stdlib/ctypes/base.rs (1)
306-334: LGTM - element_size computation with proper fallbacks.The logic correctly derives element size from the
_type_attribute when available, with sensible fallback tosize_of::<usize>(). The initialization ofPyCArraywithelement_sizeand emptybufferaligns with the new buffer-backed design.crates/vm/src/stdlib/ctypes/union.rs (1)
16-36: LGTM - Clean metaclass constructor pattern.The
py_newimplementation correctly creates the type viaPyType::py_new, then processes_fields_only when directly defined on the new class (not inherited). This follows proper ctypes semantics.crates/vm/src/stdlib/ctypes.rs (2)
63-78: LGTM - Clean CArgObject wrapper for byref().The
CArgObjectstruct properly wraps the ctypes object reference with an offset, and the_objgetter provides access to the underlying object. This aligns with CPython's ctypes behavior.
344-367: LGTM - Proper ctypes instance validation in byref().The implementation correctly checks for both
PyCDataandPyCSimpleinstances, uses a dynamic error message with the actual type name (addressing the previous review), and properly wraps the result inCArgObject.crates/vm/src/stdlib/ctypes/field.rs (1)
34-52: LGTM - Clean constructor for PyCField.The
newconstructor properly initializes all fields with sensible defaults for non-bitfield cases (anonymous: false,bitfield_size: false,bit_offset: 0).crates/vm/src/stdlib/ctypes/array.rs (3)
33-65: LGTM - Clean Callable implementation for array instantiation.The implementation correctly initializes the buffer based on element size and length, and properly handles positional argument initialization with bounds checking.
87-117: LGTM - Proper nested array type creation.The
__mul__implementation correctly creates nested array types where the element size becomes the total size of the inner array. The conversion fromPyObjectReftoPyTypeis handled appropriately.
225-244: LGTM - Clean AsSequence implementation.The sequence methods properly delegate to the helper functions with appropriate error handling for deletion attempts.
crates/vm/src/stdlib/ctypes/structure.rs (1)
16-104: Metaclass wiring and_fields_processing look coherentThe
PyCStructTypemetaclass plusprocess_fieldspipeline for_fields_(including direct-attr checks in bothpy_newand__init_subclass__) is consistent and keeps field layout logic centralized. This is a solid base for the newStructuremodel.
| impl GetDescriptor for PyCField { | ||
| fn descr_get( | ||
| zelf: PyObjectRef, | ||
| obj: Option<PyObjectRef>, | ||
| _cls: Option<PyObjectRef>, | ||
| vm: &VirtualMachine, | ||
| ) -> PyResult { | ||
| let zelf = zelf | ||
| .downcast::<PyCField>() | ||
| .map_err(|_| vm.new_type_error("expected CField".to_owned()))?; | ||
|
|
||
| // If obj is None, return the descriptor itself (class attribute access) | ||
| let obj = match obj { | ||
| Some(obj) if !vm.is_none(&obj) => obj, | ||
| _ => return Ok(zelf.into()), | ||
| }; | ||
|
|
||
| // Instance attribute access - read value from the structure's buffer | ||
| if let Some(structure) = obj.downcast_ref::<PyCStructure>() { | ||
| let buffer = structure.buffer.read(); | ||
| let offset = zelf.byte_offset; | ||
| let size = zelf.byte_size; | ||
|
|
||
| if offset + size <= buffer.len() { | ||
| let bytes = &buffer[offset..offset + size]; | ||
| return PyCField::bytes_to_value(bytes, size, vm); | ||
| } | ||
| } | ||
|
|
||
| // Fallback: return 0 for uninitialized or unsupported types | ||
| Ok(vm.ctx.new_int(0).into()) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Descriptor fallback silently returns 0 for buffer overruns and non-Structure types.
Lines 125-128 check bounds but line 132 silently returns 0 when:
- The object isn't a
PyCStructure - The buffer is too small
This can mask bugs. Consider raising an error or at least logging when the fallback is hit unexpectedly.
// Instance attribute access - read value from the structure's buffer
if let Some(structure) = obj.downcast_ref::<PyCStructure>() {
let buffer = structure.buffer.read();
let offset = zelf.byte_offset;
let size = zelf.byte_size;
if offset + size <= buffer.len() {
let bytes = &buffer[offset..offset + size];
return PyCField::bytes_to_value(bytes, size, vm);
}
+ // Buffer too small - structure not properly initialized
+ return Err(vm.new_value_error("structure buffer too small for field access".to_owned()));
}
- // Fallback: return 0 for uninitialized or unsupported types
- Ok(vm.ctx.new_int(0).into())
+ // Field descriptor accessed on non-Structure type
+ Err(vm.new_type_error(format!(
+ "descriptor works only on Structure instances, got {}",
+ obj.class().name()
+ )))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| impl GetDescriptor for PyCField { | |
| fn descr_get( | |
| zelf: PyObjectRef, | |
| obj: Option<PyObjectRef>, | |
| _cls: Option<PyObjectRef>, | |
| vm: &VirtualMachine, | |
| ) -> PyResult { | |
| let zelf = zelf | |
| .downcast::<PyCField>() | |
| .map_err(|_| vm.new_type_error("expected CField".to_owned()))?; | |
| // If obj is None, return the descriptor itself (class attribute access) | |
| let obj = match obj { | |
| Some(obj) if !vm.is_none(&obj) => obj, | |
| _ => return Ok(zelf.into()), | |
| }; | |
| // Instance attribute access - read value from the structure's buffer | |
| if let Some(structure) = obj.downcast_ref::<PyCStructure>() { | |
| let buffer = structure.buffer.read(); | |
| let offset = zelf.byte_offset; | |
| let size = zelf.byte_size; | |
| if offset + size <= buffer.len() { | |
| let bytes = &buffer[offset..offset + size]; | |
| return PyCField::bytes_to_value(bytes, size, vm); | |
| } | |
| } | |
| // Fallback: return 0 for uninitialized or unsupported types | |
| Ok(vm.ctx.new_int(0).into()) | |
| } | |
| } | |
| impl GetDescriptor for PyCField { | |
| fn descr_get( | |
| zelf: PyObjectRef, | |
| obj: Option<PyObjectRef>, | |
| _cls: Option<PyObjectRef>, | |
| vm: &VirtualMachine, | |
| ) -> PyResult { | |
| let zelf = zelf | |
| .downcast::<PyCField>() | |
| .map_err(|_| vm.new_type_error("expected CField".to_owned()))?; | |
| // If obj is None, return the descriptor itself (class attribute access) | |
| let obj = match obj { | |
| Some(obj) if !vm.is_none(&obj) => obj, | |
| _ => return Ok(zelf.into()), | |
| }; | |
| // Instance attribute access - read value from the structure's buffer | |
| if let Some(structure) = obj.downcast_ref::<PyCStructure>() { | |
| let buffer = structure.buffer.read(); | |
| let offset = zelf.byte_offset; | |
| let size = zelf.byte_size; | |
| if offset + size <= buffer.len() { | |
| let bytes = &buffer[offset..offset + size]; | |
| return PyCField::bytes_to_value(bytes, size, vm); | |
| } | |
| // Buffer too small - structure not properly initialized | |
| return Err(vm.new_value_error("structure buffer too small for field access".to_owned())); | |
| } | |
| // Field descriptor accessed on non-Structure type | |
| Err(vm.new_type_error(format!( | |
| "descriptor works only on Structure instances, got {}", | |
| obj.class().name() | |
| ))) | |
| } | |
| } |
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 102 to 134, the descriptor
accessor currently returns a silent 0 when the target is not a PyCStructure or
when the structure buffer is too small, which can hide bugs; update the function
to (1) return a proper TypeError (via vm.new_type_error) when obj is not a
PyCStructure, and (2) return a BufferError/IndexError (via vm.new_index_error or
vm.new_value_error) when offset+size exceeds buffer.len(); ensure you include a
clear message describing the expected type or insufficient buffer size so
callers see the real failure instead of a silent zero.
| match value { | ||
| PySetterValue::Assign(value) => { | ||
| let offset = zelf.byte_offset; | ||
| let size = zelf.byte_size; | ||
| let bytes = PyCField::value_to_bytes(&value, size, vm)?; | ||
|
|
||
| let mut buffer = structure.buffer.write(); | ||
| if offset + size <= buffer.len() { | ||
| buffer[offset..offset + size].copy_from_slice(&bytes); | ||
| } | ||
| Ok(()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent no-op when buffer is too small for write.
Lines 213-215 silently skip the write if the buffer is too small. This masks initialization bugs and deviates from CPython behavior which would raise an error.
let mut buffer = structure.buffer.write();
- if offset + size <= buffer.len() {
- buffer[offset..offset + size].copy_from_slice(&bytes);
+ if offset + size > buffer.len() {
+ return Err(vm.new_value_error(
+ "structure buffer too small for field assignment".to_owned()
+ ));
}
+ buffer[offset..offset + size].copy_from_slice(&bytes);
Ok(())📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| match value { | |
| PySetterValue::Assign(value) => { | |
| let offset = zelf.byte_offset; | |
| let size = zelf.byte_size; | |
| let bytes = PyCField::value_to_bytes(&value, size, vm)?; | |
| let mut buffer = structure.buffer.write(); | |
| if offset + size <= buffer.len() { | |
| buffer[offset..offset + size].copy_from_slice(&bytes); | |
| } | |
| Ok(()) | |
| match value { | |
| PySetterValue::Assign(value) => { | |
| let offset = zelf.byte_offset; | |
| let size = zelf.byte_size; | |
| let bytes = PyCField::value_to_bytes(&value, size, vm)?; | |
| let mut buffer = structure.buffer.write(); | |
| if offset + size > buffer.len() { | |
| return Err(vm.new_value_error( | |
| "structure buffer too small for field assignment".to_owned() | |
| )); | |
| } | |
| buffer[offset..offset + size].copy_from_slice(&bytes); | |
| Ok(()) |
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/field.rs around lines 206 to 216, the current
code silently skips writing when offset + size > buffer.len(); instead of a
no-op, detect this condition and return a Python exception so it fails loudly
like CPython. Replace the silent branch with code that constructs and returns a
VM error (e.g., vm.new_value_error or an appropriate vm error type) containing a
clear message with the field name/offset, required size, and actual buffer
length; keep the buffer write path unchanged and ensure the buffer lock is still
held when performing the bounds check and returning the error.
| impl Constructor for PyCStructure { | ||
| type Args = FuncArgs; | ||
|
|
||
| fn py_new(cls: PyTypeRef, args: Self::Args, vm: &VirtualMachine) -> PyResult { | ||
| // Get _fields_ from the class using get_attr to properly search MRO | ||
| let fields_attr = cls.as_object().get_attr("_fields_", vm).ok(); | ||
|
|
||
| let mut fields_map = IndexMap::new(); | ||
| let mut total_size = 0usize; | ||
|
|
||
| if let Some(fields_attr) = fields_attr { | ||
| let fields: Vec<PyObjectRef> = if let Some(list) = fields_attr.downcast_ref::<PyList>() | ||
| { | ||
| list.borrow_vec().to_vec() | ||
| } else if let Some(tuple) = fields_attr.downcast_ref::<PyTuple>() { | ||
| tuple.to_vec() | ||
| } else { | ||
| vec![] | ||
| }; | ||
|
|
||
| let mut offset = 0usize; | ||
| for field in fields.iter() { | ||
| let Some(field_tuple) = field.downcast_ref::<PyTuple>() else { | ||
| continue; | ||
| }; | ||
| if field_tuple.len() < 2 { | ||
| continue; | ||
| } | ||
| let Some(name) = field_tuple.first().unwrap().downcast_ref::<PyStr>() else { | ||
| continue; | ||
| }; | ||
| let name = name.to_string(); | ||
| let field_type = field_tuple.get(1).unwrap().clone(); | ||
| let size = PyCStructType::get_field_size(&field_type, vm)?; | ||
|
|
||
| let type_ref = field_type | ||
| .downcast::<PyType>() | ||
| .unwrap_or_else(|_| vm.ctx.types.object_type.to_owned()); | ||
|
|
||
| fields_map.insert( | ||
| name.clone(), | ||
| FieldInfo { | ||
| name, | ||
| offset, | ||
| size, | ||
| type_ref, | ||
| }, | ||
| ); | ||
|
|
||
| offset += size; | ||
| } | ||
| total_size = offset; | ||
| } | ||
|
|
||
| // Initialize buffer with zeros | ||
| let buffer = vec![0u8; total_size]; | ||
|
|
||
| let instance = PyCStructure { | ||
| buffer: PyRwLock::new(buffer), | ||
| fields: PyRwLock::new(fields_map.clone()), | ||
| size: AtomicCell::new(total_size), | ||
| }; | ||
|
|
||
| // Handle keyword arguments for field initialization | ||
| let py_instance = instance.into_ref_with_type(vm, cls.clone())?; | ||
| let py_obj: PyObjectRef = py_instance.clone().into(); | ||
|
|
||
| // Set field values from kwargs using standard attribute setting | ||
| for (key, value) in args.kwargs.iter() { | ||
| if fields_map.contains_key(key.as_str()) { | ||
| py_obj.set_attr(vm.ctx.intern_str(key.as_str()), value.clone(), vm)?; | ||
| } | ||
| } | ||
|
|
||
| // Set field values from positional args | ||
| let field_names: Vec<String> = fields_map.keys().cloned().collect(); | ||
| for (i, value) in args.args.iter().enumerate() { | ||
| if i < field_names.len() { | ||
| py_obj.set_attr( | ||
| vm.ctx.intern_str(field_names[i].as_str()), | ||
| value.clone(), | ||
| vm, | ||
| )?; | ||
| } | ||
| } | ||
|
|
||
| Ok(py_instance.into()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyCStructure.__new__ silently ignores invalid fields and extra arguments
A few behaviors here differ from ctypes.Structure and can hide bugs:
- Invalid
_fields_entries at instance‑creation time are silently skipped (continueon bad tuples / names), even though class creation previously enforced stricter validation. - Unknown keyword arguments are ignored:
for (key, value) in args.kwargs.iter() {
if fields_map.contains_key(key.as_str()) {
py_obj.set_attr(...)?;
}
}- Extra positional arguments beyond the number of fields are also silently dropped due to
if i < field_names.len()without an error.
Consider tightening this up so user mistakes surface early:
- for (key, value) in args.kwargs.iter() {
- if fields_map.contains_key(key.as_str()) {
- py_obj.set_attr(vm.ctx.intern_str(key.as_str()), value.clone(), vm)?;
- }
- }
+ for (key, value) in args.kwargs.iter() {
+ let field_name = key.as_str();
+ if !fields_map.contains_key(field_name) {
+ return Err(vm.new_type_error(format!(
+ "'{field_name}' is not a valid field name for {}",
+ cls.name()
+ )));
+ }
+ py_obj.set_attr(vm.ctx.intern_str(field_name), value.clone(), vm)?;
+ }
+
+ if args.args.len() > field_names.len() {
+ return Err(vm.new_type_error(format!(
+ "{}() expected at most {} positional arguments, got {}",
+ cls.name(),
+ field_names.len(),
+ args.args.len()
+ )));
+ }This keeps instance initialization predictable and closer to CPython’s ctypes behavior.
10fe041 to
711ea51
Compare
Summary by CodeRabbit
New Features
Bug Fixes
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.