🌐 AI搜索 & 代理 主页
Skip to content

Commit 6a0135a

Browse files
authored
gh-138122: Add exception profiling mode to the sampling profiler (#142561)
1 parent e38967e commit 6a0135a

File tree

19 files changed

+802
-17
lines changed

19 files changed

+802
-17
lines changed

Doc/library/profiling.sampling.rst

Lines changed: 67 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -470,9 +470,10 @@ which you can use to judge whether the data is sufficient for your analysis.
470470
Profiling modes
471471
===============
472472

473-
The sampling profiler supports three modes that control which samples are
473+
The sampling profiler supports four modes that control which samples are
474474
recorded. The mode determines what the profile measures: total elapsed time,
475-
CPU execution time, or time spent holding the global interpreter lock.
475+
CPU execution time, time spent holding the global interpreter lock, or
476+
exception handling.
476477

477478

478479
Wall-clock mode
@@ -553,6 +554,67 @@ single-threaded programs to distinguish Python execution time from time spent
553554
in C extensions or I/O.
554555

555556

557+
Exception mode
558+
--------------
559+
560+
Exception mode (``--mode=exception``) records samples only when a thread has
561+
an active exception::
562+
563+
python -m profiling.sampling run --mode=exception script.py
564+
565+
Samples are recorded in two situations: when an exception is being propagated
566+
up the call stack (after ``raise`` but before being caught), or when code is
567+
executing inside an ``except`` block where exception information is still
568+
present in the thread state.
569+
570+
The following example illustrates which code regions are captured:
571+
572+
.. code-block:: python
573+
574+
def example():
575+
try:
576+
raise ValueError("error") # Captured: exception being raised
577+
except ValueError:
578+
process_error() # Captured: inside except block
579+
finally:
580+
cleanup() # NOT captured: exception already handled
581+
582+
def example_propagating():
583+
try:
584+
try:
585+
raise ValueError("error")
586+
finally:
587+
cleanup() # Captured: exception propagating through
588+
except ValueError:
589+
pass
590+
591+
def example_no_exception():
592+
try:
593+
do_work()
594+
finally:
595+
cleanup() # NOT captured: no exception involved
596+
597+
Note that ``finally`` blocks are only captured when an exception is actively
598+
propagating through them. Once an ``except`` block finishes executing, Python
599+
clears the exception information before running any subsequent ``finally``
600+
block. Similarly, ``finally`` blocks that run during normal execution (when no
601+
exception was raised) are not captured because no exception state is present.
602+
603+
This mode is useful for understanding where your program spends time handling
604+
errors. Exception handling can be a significant source of overhead in code
605+
that uses exceptions for flow control (such as ``StopIteration`` in iterators)
606+
or in applications that process many error conditions (such as network servers
607+
handling connection failures).
608+
609+
Exception mode helps answer questions like "how much time is spent handling
610+
exceptions?" and "which exception handlers are the most expensive?" It can
611+
reveal hidden performance costs in code that catches and processes many
612+
exceptions, even when those exceptions are handled gracefully. For example,
613+
if a parsing library uses exceptions internally to signal format errors, this
614+
mode will capture time spent in those handlers even if the calling code never
615+
sees the exceptions.
616+
617+
556618
Output formats
557619
==============
558620

@@ -1006,8 +1068,9 @@ Mode options
10061068

10071069
.. option:: --mode <mode>
10081070

1009-
Sampling mode: ``wall`` (default), ``cpu``, or ``gil``.
1010-
The ``cpu`` and ``gil`` modes are incompatible with ``--async-aware``.
1071+
Sampling mode: ``wall`` (default), ``cpu``, ``gil``, or ``exception``.
1072+
The ``cpu``, ``gil``, and ``exception`` modes are incompatible with
1073+
``--async-aware``.
10111074

10121075
.. option:: --async-mode <mode>
10131076

Doc/whatsnew/3.15.rst

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -146,6 +146,8 @@ Key features include:
146146
and blocking. Use this to identify CPU-bound bottlenecks and optimize computational work.
147147
* **GIL-holding time** (``--mode gil``): Measures time spent holding Python's Global Interpreter
148148
Lock. Use this to identify which threads dominate GIL usage in multi-threaded applications.
149+
* **Exception handling time** (``--mode exception``): Captures samples only from threads with
150+
an active exception. Use this to analyze exception handling overhead.
149151

150152
* **Thread-aware profiling**: Option to profile all threads (``-a``) or just the main thread,
151153
essential for understanding multi-threaded application behavior.
@@ -175,6 +177,10 @@ Key features include:
175177
(``--async-aware``). See which coroutines are consuming time, with options to show only
176178
running tasks or all tasks including those waiting.
177179

180+
* **Opcode-level profiling**: Gather bytecode opcode information for instruction-level
181+
profiling (``--opcodes``). Shows which bytecode instructions are executing, including
182+
specializations from the adaptive interpreter.
183+
178184
See :mod:`profiling.sampling` for the complete documentation, including all
179185
available output formats, profiling modes, and configuration options.
180186

Include/internal/pycore_debug_offsets.h

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -110,8 +110,15 @@ typedef struct _Py_DebugOffsets {
110110
uint64_t status;
111111
uint64_t holds_gil;
112112
uint64_t gil_requested;
113+
uint64_t current_exception;
114+
uint64_t exc_state;
113115
} thread_state;
114116

117+
// Exception stack item offset
118+
struct {
119+
uint64_t exc_value;
120+
} err_stackitem;
121+
115122
// InterpreterFrame offset;
116123
struct _interpreter_frame {
117124
uint64_t size;
@@ -282,6 +289,11 @@ typedef struct _Py_DebugOffsets {
282289
.status = offsetof(PyThreadState, _status), \
283290
.holds_gil = offsetof(PyThreadState, holds_gil), \
284291
.gil_requested = offsetof(PyThreadState, gil_requested), \
292+
.current_exception = offsetof(PyThreadState, current_exception), \
293+
.exc_state = offsetof(PyThreadState, exc_state), \
294+
}, \
295+
.err_stackitem = { \
296+
.exc_value = offsetof(_PyErr_StackItem, exc_value), \
285297
}, \
286298
.interpreter_frame = { \
287299
.size = sizeof(_PyInterpreterFrame), \

Lib/profiling/sampling/_flamegraph_assets/flamegraph.js

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -717,6 +717,10 @@ function populateThreadStats(data, selectedThreadId = null) {
717717

718718
const gcPctElem = document.getElementById('gc-pct');
719719
if (gcPctElem) gcPctElem.textContent = `${(threadStats.gc_pct || 0).toFixed(1)}%`;
720+
721+
// Exception stats
722+
const excPctElem = document.getElementById('exc-pct');
723+
if (excPctElem) excPctElem.textContent = `${(threadStats.has_exception_pct || 0).toFixed(1)}%`;
720724
}
721725

722726
// ============================================================================

Lib/profiling/sampling/_flamegraph_assets/flamegraph_template.html

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -161,6 +161,10 @@ <h3 class="section-title">Runtime Stats</h3>
161161
<div class="stat-tile-value" id="gc-pct">--</div>
162162
<div class="stat-tile-label">GC</div>
163163
</div>
164+
<div class="stat-tile stat-tile--red" id="exc-stat">
165+
<div class="stat-tile-value" id="exc-pct">--</div>
166+
<div class="stat-tile-label">Exception</div>
167+
</div>
164168
</div>
165169
</div>
166170
</section>

Lib/profiling/sampling/cli.py

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616
PROFILING_MODE_WALL,
1717
PROFILING_MODE_CPU,
1818
PROFILING_MODE_GIL,
19+
PROFILING_MODE_EXCEPTION,
1920
SORT_MODE_NSAMPLES,
2021
SORT_MODE_TOTTIME,
2122
SORT_MODE_CUMTIME,
@@ -90,6 +91,7 @@ def _parse_mode(mode_string):
9091
"wall": PROFILING_MODE_WALL,
9192
"cpu": PROFILING_MODE_CPU,
9293
"gil": PROFILING_MODE_GIL,
94+
"exception": PROFILING_MODE_EXCEPTION,
9395
}
9496
return mode_map[mode_string]
9597

@@ -213,10 +215,12 @@ def _add_mode_options(parser):
213215
mode_group = parser.add_argument_group("Mode options")
214216
mode_group.add_argument(
215217
"--mode",
216-
choices=["wall", "cpu", "gil"],
218+
choices=["wall", "cpu", "gil", "exception"],
217219
default="wall",
218220
help="Sampling mode: wall (all samples), cpu (only samples when thread is on CPU), "
219-
"gil (only samples when thread holds the GIL). Incompatible with --async-aware",
221+
"gil (only samples when thread holds the GIL), "
222+
"exception (only samples when thread has an active exception). "
223+
"Incompatible with --async-aware",
220224
)
221225
mode_group.add_argument(
222226
"--async-mode",

Lib/profiling/sampling/collector.py

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55
THREAD_STATUS_ON_CPU,
66
THREAD_STATUS_GIL_REQUESTED,
77
THREAD_STATUS_UNKNOWN,
8+
THREAD_STATUS_HAS_EXCEPTION,
89
)
910

1011
try:
@@ -170,7 +171,7 @@ def _collect_thread_status_stats(self, stack_frames):
170171
171172
Returns:
172173
tuple: (aggregate_status_counts, has_gc_frame, per_thread_stats)
173-
- aggregate_status_counts: dict with has_gil, on_cpu, etc.
174+
- aggregate_status_counts: dict with has_gil, on_cpu, has_exception, etc.
174175
- has_gc_frame: bool indicating if any thread has GC frames
175176
- per_thread_stats: dict mapping thread_id to per-thread counts
176177
"""
@@ -179,6 +180,7 @@ def _collect_thread_status_stats(self, stack_frames):
179180
"on_cpu": 0,
180181
"gil_requested": 0,
181182
"unknown": 0,
183+
"has_exception": 0,
182184
"total": 0,
183185
}
184186
has_gc_frame = False
@@ -200,6 +202,8 @@ def _collect_thread_status_stats(self, stack_frames):
200202
status_counts["gil_requested"] += 1
201203
if status_flags & THREAD_STATUS_UNKNOWN:
202204
status_counts["unknown"] += 1
205+
if status_flags & THREAD_STATUS_HAS_EXCEPTION:
206+
status_counts["has_exception"] += 1
203207

204208
# Track per-thread statistics
205209
thread_id = getattr(thread_info, "thread_id", None)
@@ -210,6 +214,7 @@ def _collect_thread_status_stats(self, stack_frames):
210214
"on_cpu": 0,
211215
"gil_requested": 0,
212216
"unknown": 0,
217+
"has_exception": 0,
213218
"total": 0,
214219
"gc_samples": 0,
215220
}
@@ -225,6 +230,8 @@ def _collect_thread_status_stats(self, stack_frames):
225230
thread_stats["gil_requested"] += 1
226231
if status_flags & THREAD_STATUS_UNKNOWN:
227232
thread_stats["unknown"] += 1
233+
if status_flags & THREAD_STATUS_HAS_EXCEPTION:
234+
thread_stats["has_exception"] += 1
228235

229236
# Check for GC frames in this thread
230237
frames = getattr(thread_info, "frame_info", None)

Lib/profiling/sampling/constants.py

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55
PROFILING_MODE_CPU = 1
66
PROFILING_MODE_GIL = 2
77
PROFILING_MODE_ALL = 3 # Combines GIL + CPU checks
8+
PROFILING_MODE_EXCEPTION = 4 # Only samples when thread has an active exception
89

910
# Sort mode constants
1011
SORT_MODE_NSAMPLES = 0
@@ -25,10 +26,12 @@
2526
THREAD_STATUS_ON_CPU,
2627
THREAD_STATUS_UNKNOWN,
2728
THREAD_STATUS_GIL_REQUESTED,
29+
THREAD_STATUS_HAS_EXCEPTION,
2830
)
2931
except ImportError:
3032
# Fallback for tests or when module is not available
3133
THREAD_STATUS_HAS_GIL = (1 << 0)
3234
THREAD_STATUS_ON_CPU = (1 << 1)
3335
THREAD_STATUS_UNKNOWN = (1 << 2)
3436
THREAD_STATUS_GIL_REQUESTED = (1 << 3)
37+
THREAD_STATUS_HAS_EXCEPTION = (1 << 4)

Lib/profiling/sampling/gecko_collector.py

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,13 +9,14 @@
99
from .collector import Collector
1010
from .opcode_utils import get_opcode_info, format_opcode
1111
try:
12-
from _remote_debugging import THREAD_STATUS_HAS_GIL, THREAD_STATUS_ON_CPU, THREAD_STATUS_UNKNOWN, THREAD_STATUS_GIL_REQUESTED
12+
from _remote_debugging import THREAD_STATUS_HAS_GIL, THREAD_STATUS_ON_CPU, THREAD_STATUS_UNKNOWN, THREAD_STATUS_GIL_REQUESTED, THREAD_STATUS_HAS_EXCEPTION
1313
except ImportError:
1414
# Fallback if module not available (shouldn't happen in normal use)
1515
THREAD_STATUS_HAS_GIL = (1 << 0)
1616
THREAD_STATUS_ON_CPU = (1 << 1)
1717
THREAD_STATUS_UNKNOWN = (1 << 2)
1818
THREAD_STATUS_GIL_REQUESTED = (1 << 3)
19+
THREAD_STATUS_HAS_EXCEPTION = (1 << 4)
1920

2021

2122
# Categories matching Firefox Profiler expectations
@@ -28,6 +29,7 @@
2829
{"name": "CPU", "color": "purple", "subcategories": ["Other"]},
2930
{"name": "Code Type", "color": "red", "subcategories": ["Other"]},
3031
{"name": "Opcodes", "color": "magenta", "subcategories": ["Other"]},
32+
{"name": "Exception", "color": "lightblue", "subcategories": ["Other"]},
3133
]
3234

3335
# Category indices
@@ -39,6 +41,7 @@
3941
CATEGORY_CPU = 5
4042
CATEGORY_CODE_TYPE = 6
4143
CATEGORY_OPCODES = 7
44+
CATEGORY_EXCEPTION = 8
4245

4346
# Subcategory indices
4447
DEFAULT_SUBCATEGORY = 0
@@ -88,6 +91,8 @@ def __init__(self, sample_interval_usec, *, skip_idle=False, opcodes=False):
8891
self.python_code_start = {} # Thread running Python code (has GIL)
8992
self.native_code_start = {} # Thread running native code (on CPU without GIL)
9093
self.gil_wait_start = {} # Thread waiting for GIL
94+
self.exception_start = {} # Thread has an exception set
95+
self.no_exception_start = {} # Thread has no exception set
9196

9297
# GC event tracking: track GC start time per thread
9398
self.gc_start_per_thread = {} # tid -> start_time
@@ -204,6 +209,13 @@ def collect(self, stack_frames):
204209
self._add_marker(tid, "Waiting for GIL", self.gil_wait_start.pop(tid),
205210
current_time, CATEGORY_GIL)
206211

212+
# Track exception state (Has Exception / No Exception)
213+
has_exception = bool(status_flags & THREAD_STATUS_HAS_EXCEPTION)
214+
self._track_state_transition(
215+
tid, has_exception, self.exception_start, self.no_exception_start,
216+
"Has Exception", "No Exception", CATEGORY_EXCEPTION, current_time
217+
)
218+
207219
# Track GC events by detecting <GC> frames in the stack trace
208220
# This leverages the improved GC frame tracking from commit 336366fd7ca
209221
# which precisely identifies the thread that initiated GC collection
@@ -622,6 +634,8 @@ def _finalize_markers(self):
622634
(self.native_code_start, "Native Code", CATEGORY_CODE_TYPE),
623635
(self.gil_wait_start, "Waiting for GIL", CATEGORY_GIL),
624636
(self.gc_start_per_thread, "GC Collecting", CATEGORY_GC),
637+
(self.exception_start, "Has Exception", CATEGORY_EXCEPTION),
638+
(self.no_exception_start, "No Exception", CATEGORY_EXCEPTION),
625639
]
626640

627641
for state_dict, marker_name, category in marker_states:

Lib/profiling/sampling/live_collector/collector.py

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@
1717
THREAD_STATUS_ON_CPU,
1818
THREAD_STATUS_UNKNOWN,
1919
THREAD_STATUS_GIL_REQUESTED,
20+
THREAD_STATUS_HAS_EXCEPTION,
2021
PROFILING_MODE_CPU,
2122
PROFILING_MODE_GIL,
2223
PROFILING_MODE_WALL,
@@ -61,6 +62,7 @@ class ThreadData:
6162
on_cpu: int = 0
6263
gil_requested: int = 0
6364
unknown: int = 0
65+
has_exception: int = 0
6466
total: int = 0 # Total status samples for this thread
6567

6668
# Sample counts
@@ -82,6 +84,8 @@ def increment_status_flag(self, status_flags):
8284
self.gil_requested += 1
8385
if status_flags & THREAD_STATUS_UNKNOWN:
8486
self.unknown += 1
87+
if status_flags & THREAD_STATUS_HAS_EXCEPTION:
88+
self.has_exception += 1
8589
self.total += 1
8690

8791
def as_status_dict(self):
@@ -91,6 +95,7 @@ def as_status_dict(self):
9195
"on_cpu": self.on_cpu,
9296
"gil_requested": self.gil_requested,
9397
"unknown": self.unknown,
98+
"has_exception": self.has_exception,
9499
"total": self.total,
95100
}
96101

@@ -160,6 +165,7 @@ def __init__(
160165
"on_cpu": 0,
161166
"gil_requested": 0,
162167
"unknown": 0,
168+
"has_exception": 0,
163169
"total": 0, # Total thread count across all samples
164170
}
165171
self.gc_frame_samples = 0 # Track samples with GC frames
@@ -359,6 +365,7 @@ def collect(self, stack_frames):
359365
thread_data.on_cpu += stats.get("on_cpu", 0)
360366
thread_data.gil_requested += stats.get("gil_requested", 0)
361367
thread_data.unknown += stats.get("unknown", 0)
368+
thread_data.has_exception += stats.get("has_exception", 0)
362369
thread_data.total += stats.get("total", 0)
363370
if stats.get("gc_samples", 0):
364371
thread_data.gc_frame_samples += stats["gc_samples"]
@@ -723,6 +730,7 @@ def reset_stats(self):
723730
"on_cpu": 0,
724731
"gil_requested": 0,
725732
"unknown": 0,
733+
"has_exception": 0,
726734
"total": 0,
727735
}
728736
self.gc_frame_samples = 0

0 commit comments

Comments
 (0)