
Commit f281886

Amit Kapila committed
Fix LOCK_TIMEOUT handling in slotsync worker.
Previously, the slotsync worker relied on SIGINT for graceful shutdown during promotion. However, SIGINT is also used by the LOCK_TIMEOUT handler to cancel queries. Since the slotsync worker can lock catalog tables while parsing libpq tuples, this overlap caused it to ignore LOCK_TIMEOUT signals and potentially wait indefinitely on locks.

This patch replaces the slotsync worker's SIGINT handler with StatementCancelHandler to correctly process query-cancel interrupts. Additionally, the startup process now uses SIGUSR1 to signal the slotsync worker to stop during promotion. The worker exits after detecting that the shared memory flag stopSignaled is set.

Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, here it was introduced
Discussion: https://postgr.es/m/TY4PR01MB169078F33846E9568412D878C94A2A@TY4PR01MB16907.jpnprd01.prod.outlook.com
1 parent ca98d8b commit f281886

1 file changed: +10 -5 lines changed


src/backend/replication/logical/slotsync.c

Lines changed: 10 additions & 5 deletions
@@ -1156,10 +1156,10 @@ ProcessSlotSyncInterrupts(WalReceiverConn *wrconn)
 {
 	CHECK_FOR_INTERRUPTS();
 
-	if (ShutdownRequestPending)
+	if (SlotSyncCtx->stopSignaled)
 	{
 		ereport(LOG,
-				errmsg("replication slot synchronization worker is shutting down on receiving SIGINT"));
+				errmsg("replication slot synchronization worker is shutting down because promotion is triggered"));
 
 		proc_exit(0);
 	}
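The effect of this hunk can be sketched outside PostgreSQL: once SIGINT only requests a query cancel, the decision to exit has to come from a separate flag. The block below is a minimal illustration under stated assumptions, not the server's code. Every `demo_` name is hypothetical; `demo_cancel_pending` stands in for the cancel state that StatementCancelHandler would set, and `demo_stop_signaled` stands in for `SlotSyncCtx->stopSignaled`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL state; none of these names are real. */
static volatile sig_atomic_t demo_cancel_pending = 0;	/* set by SIGINT */
static volatile sig_atomic_t demo_stop_signaled = 0;	/* like stopSignaled */

typedef enum DemoAction
{
	DEMO_CONTINUE,
	DEMO_CANCEL_QUERY,
	DEMO_EXIT
} DemoAction;

/* SIGINT only requests a query cancel; it no longer means "shut down". */
static void
demo_cancel_handler(int signo)
{
	(void) signo;
	demo_cancel_pending = 1;
}

/* Install the cancel handler for SIGINT; returns 0 on success. */
static int
demo_install_sigint(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_handler = demo_cancel_handler;
	return sigaction(SIGINT, &sa, NULL);
}

/* Decide what the worker loop should do next: cancel first, then stop. */
static DemoAction
demo_process_interrupts(void)
{
	if (demo_cancel_pending)
	{
		demo_cancel_pending = 0;
		return DEMO_CANCEL_QUERY;
	}
	if (demo_stop_signaled)
		return DEMO_EXIT;
	return DEMO_CONTINUE;
}
```

After `demo_install_sigint()`, a `raise(SIGINT)` makes the next `demo_process_interrupts()` call report a cancel rather than an exit; only setting `demo_stop_signaled` leads to `DEMO_EXIT`.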
@@ -1390,7 +1390,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 
 	/* Setup signal handling */
 	pqsignal(SIGHUP, SignalHandlerForConfigReload);
-	pqsignal(SIGINT, SignalHandlerForShutdownRequest);
+	pqsignal(SIGINT, StatementCancelHandler);
 	pqsignal(SIGTERM, die);
 	pqsignal(SIGFPE, FloatExceptionHandler);
 	pqsignal(SIGUSR1, procsignal_sigusr1_handler);
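To see why the SIGINT handler matters for LOCK_TIMEOUT, here is a small standalone sketch of the flow the commit message describes: a timer expires, its handler sends SIGINT to the same process, and the SIGINT handler marks the wait as cancelled. It uses plain POSIX calls (`setitimer`, `sigaction`, `sigsuspend`) rather than PostgreSQL's timeout machinery, and all `demo_` names are hypothetical. If SIGINT were instead wired to a shutdown-request flag that the lock wait never consults, the loop below would have nothing to wake it up for.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t demo_cancel_pending = 0;

/* Timer expiry: signal ourselves with SIGINT, as a lock timeout would. */
static void
demo_lock_timeout_handler(int signo)
{
	(void) signo;
	kill(getpid(), SIGINT);
}

/* SIGINT marks the current wait as cancelled. */
static void
demo_cancel_handler(int signo)
{
	(void) signo;
	demo_cancel_pending = 1;
}

/*
 * Wait for a lock that never becomes free.  Returns 1 when the wait was
 * cancelled by the timeout, -1 on a setup error or a non-positive timeout.
 */
static int
demo_wait_for_lock(long timeout_ms)
{
	struct sigaction sa, old_int, old_alrm;
	struct itimerval timer;
	sigset_t	block, old_mask, wait_mask;

	if (timeout_ms <= 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_handler = demo_cancel_handler;
	if (sigaction(SIGINT, &sa, &old_int) != 0)
		return -1;
	sa.sa_handler = demo_lock_timeout_handler;
	if (sigaction(SIGALRM, &sa, &old_alrm) != 0)
		return -1;

	/* Block both signals so they are only delivered inside sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGINT);
	sigaddset(&block, SIGALRM);
	sigprocmask(SIG_BLOCK, &block, &old_mask);
	wait_mask = old_mask;
	sigdelset(&wait_mask, SIGINT);
	sigdelset(&wait_mask, SIGALRM);

	/* Arm a one-shot timer that plays the role of lock_timeout. */
	demo_cancel_pending = 0;
	memset(&timer, 0, sizeof(timer));
	timer.it_value.tv_sec = timeout_ms / 1000;
	timer.it_value.tv_usec = (timeout_ms % 1000) * 1000;
	setitimer(ITIMER_REAL, &timer, NULL);

	/* The "lock wait": sleep until the cancel flag is set. */
	while (!demo_cancel_pending)
		sigsuspend(&wait_mask);

	/* Restore the caller's signal mask and handlers. */
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	sigaction(SIGINT, &old_int, NULL);
	sigaction(SIGALRM, &old_alrm, NULL);
	return 1;
}
```

Calling `demo_wait_for_lock(50)` should return after roughly 50 ms, once the timer's SIGINT has set the cancel flag.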
@@ -1495,7 +1495,8 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 
 	/*
 	 * The slot sync worker can't get here because it will only stop when it
-	 * receives a SIGINT from the startup process, or when there is an error.
+	 * receives a stop request from the startup process, or when there is an
+	 * error.
 	 */
 	Assert(false);
 }
@@ -1582,8 +1583,12 @@ ShutDownSlotSync(void)
 
 	SpinLockRelease(&SlotSyncCtx->mutex);
 
+	/*
+	 * Signal slotsync worker if it was still running. The worker will stop
+	 * upon detecting that the stopSignaled flag is set to true.
+	 */
 	if (worker_pid != InvalidPid)
-		kill(worker_pid, SIGINT);
+		kill(worker_pid, SIGUSR1);
 
 	/* Wait for slot sync to end */
 	for (;;)
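The stop protocol in this hunk (set a shared flag, send SIGUSR1 as a wakeup, wait for the worker to exit) can be sketched with two ordinary processes. This is a rough illustration under several assumptions: an anonymous shared mapping stands in for PostgreSQL shared memory, a forked child stands in for the slotsync worker, the real code sets the flag under a spinlock and uses its own wait loop, and every `demo_` name is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared-memory SlotSyncCtx struct. */
typedef struct DemoSyncCtx
{
	volatile sig_atomic_t stopSignaled;
} DemoSyncCtx;

/* SIGUSR1 only interrupts the wait; the flag carries the actual request. */
static void
demo_wakeup_handler(int signo)
{
	(void) signo;
}

/* Worker side: sleep until woken, then re-check the shared flag. */
static void
demo_worker_main(DemoSyncCtx *ctx, const sigset_t *wait_mask)
{
	while (!ctx->stopSignaled)
		sigsuspend(wait_mask);
	_exit(0);
}

/* Startup-process side.  Returns 0 if the worker exited cleanly, else -1. */
static int
demo_shutdown_worker(void)
{
	DemoSyncCtx *ctx;
	struct sigaction sa, old_sa;
	sigset_t	block, old_mask, wait_mask;
	pid_t		pid;
	int			status = -1;

	ctx = mmap(NULL, sizeof(*ctx), PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (ctx == MAP_FAILED)
		return -1;
	ctx->stopSignaled = 0;

	/* Install the wakeup handler before fork() so the child inherits it. */
	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_handler = demo_wakeup_handler;
	sigaction(SIGUSR1, &sa, &old_sa);

	/* Block SIGUSR1 so the worker only receives it inside sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	sigprocmask(SIG_BLOCK, &block, &old_mask);
	wait_mask = old_mask;
	sigdelset(&wait_mask, SIGUSR1);

	pid = fork();
	if (pid == 0)
		demo_worker_main(ctx, &wait_mask);	/* never returns */

	if (pid > 0)
	{
		ctx->stopSignaled = 1;	/* set the flag first ... */
		kill(pid, SIGUSR1);		/* ... then wake the worker */

		/* Wait for the worker to end, retrying if interrupted. */
		while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
			;
	}

	/* Restore the caller's signal state and release the mapping. */
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	sigaction(SIGUSR1, &old_sa, NULL);
	munmap(ctx, sizeof(*ctx));

	return (pid > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Setting the flag before sending the signal matters in this sketch: the worker re-checks the flag each time it wakes, so a wakeup that arrives before the flag is visible would be wasted. Keeping SIGUSR1 blocked outside `sigsuspend()` closes the window in which the signal could arrive between the flag check and the sleep.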
