🌐 AI搜索 & 代理 主页
Skip to content

Commit 8e11859

Browse files
author
Richard Guo
committed
Implement Eager Aggregation
Eager aggregation is a query optimization technique that partially pushes aggregation past a join, and finalizes it once all the relations are joined. Eager aggregation may reduce the number of input rows to the join and thus could result in a better overall plan. In the current planner architecture, the separation between the scan/join planning phase and the post-scan/join phase means that aggregation steps are not visible when constructing the join tree, limiting the planner's ability to exploit aggregation-aware optimizations. To implement eager aggregation, we collect information about aggregate functions in the targetlist and HAVING clause, along with grouping expressions from the GROUP BY clause, and store it in the PlannerInfo node. During the scan/join planning phase, this information is used to evaluate each base or join relation to determine whether eager aggregation can be applied. If applicable, we create a separate RelOptInfo, referred to as a grouped relation, to represent the partially-aggregated version of the relation and generate grouped paths for it. Grouped relation paths can be generated in two ways. The first method involves adding sorted and hashed partial aggregation paths on top of the non-grouped paths. To limit planning time, we only consider the cheapest or suitably-sorted non-grouped paths in this step. Alternatively, grouped paths can be generated by joining a grouped relation with a non-grouped relation. Joining two grouped relations is currently not supported. To further limit planning time, we currently adopt a strategy where partial aggregation is pushed only to the lowest feasible level in the join tree where it provides a significant reduction in row count. This strategy also helps ensure that all grouped paths for the same grouped relation produce the same set of rows, which is important to support a fundamental assumption of the planner. For the partial aggregation that is pushed down to a non-aggregated relation, we need to consider all expressions from this relation that are involved in upper join clauses and include them in the grouping keys, using compatible operators. This is essential to ensure that an aggregated row from the partial aggregation matches the other side of the join if and only if each row in the partial group does. This ensures that all rows within the same partial group share the same "destiny", which is crucial for maintaining correctness. One restriction is that we cannot push partial aggregation down to a relation that is in the nullable side of an outer join, because the NULL-extended rows produced by the outer join would not be available when we perform the partial aggregation, while with a non-eager-aggregation plan these rows are available for the top-level aggregation. Pushing partial aggregation in this case may result in the rows being grouped differently than expected, or produce incorrect values from the aggregate functions. If we have generated a grouped relation for the topmost join relation, we finalize its paths at the end. The final paths will compete in the usual way with paths built from regular planning. The patch was originally proposed by Antonin Houska in 2017. This commit reworks various important aspects and rewrites most of the current code. However, the original patch and reviews were very useful. Author: Richard Guo <guofenglinux@gmail.com> Author: Antonin Houska <ah@cybertec.at> (in an older version) Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version) Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version) Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version) Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
1 parent 185e304 commit 8e11859

File tree

26 files changed

+4293
-76
lines changed

26 files changed

+4293
-76
lines changed

contrib/postgres_fdw/expected/postgres_fdw.out

Lines changed: 26 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -3701,30 +3701,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
37013701
-- Subquery in FROM clause having aggregate
37023702
explain (verbose, costs off)
37033703
select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
3704-
QUERY PLAN
3705-
-----------------------------------------------------------------------------------------------
3704+
QUERY PLAN
3705+
-----------------------------------------------------------------------------------------
37063706
Sort
3707-
Output: (count(*)), x.b
3708-
Sort Key: (count(*)), x.b
3709-
-> HashAggregate
3710-
Output: count(*), x.b
3711-
Group Key: x.b
3712-
-> Hash Join
3713-
Output: x.b
3714-
Inner Unique: true
3715-
Hash Cond: (ft1.c2 = x.a)
3716-
-> Foreign Scan on public.ft1
3717-
Output: ft1.c2
3718-
Remote SQL: SELECT c2 FROM "S 1"."T 1"
3719-
-> Hash
3720-
Output: x.b, x.a
3721-
-> Subquery Scan on x
3722-
Output: x.b, x.a
3723-
-> Foreign Scan
3724-
Output: ft1_1.c2, (sum(ft1_1.c1))
3725-
Relations: Aggregate on (public.ft1 ft1_1)
3726-
Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
3727-
(21 rows)
3707+
Output: (count(*)), (sum(ft1_1.c1))
3708+
Sort Key: (count(*)), (sum(ft1_1.c1))
3709+
-> Finalize GroupAggregate
3710+
Output: count(*), (sum(ft1_1.c1))
3711+
Group Key: (sum(ft1_1.c1))
3712+
-> Sort
3713+
Output: (sum(ft1_1.c1)), (PARTIAL count(*))
3714+
Sort Key: (sum(ft1_1.c1))
3715+
-> Hash Join
3716+
Output: (sum(ft1_1.c1)), (PARTIAL count(*))
3717+
Hash Cond: (ft1_1.c2 = ft1.c2)
3718+
-> Foreign Scan
3719+
Output: ft1_1.c2, (sum(ft1_1.c1))
3720+
Relations: Aggregate on (public.ft1 ft1_1)
3721+
Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
3722+
-> Hash
3723+
Output: ft1.c2, (PARTIAL count(*))
3724+
-> Partial HashAggregate
3725+
Output: ft1.c2, PARTIAL count(*)
3726+
Group Key: ft1.c2
3727+
-> Foreign Scan on public.ft1
3728+
Output: ft1.c2
3729+
Remote SQL: SELECT c2 FROM "S 1"."T 1"
3730+
(24 rows)
37283731

37293732
select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
37303733
count | b

doc/src/sgml/config.sgml

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5475,6 +5475,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
54755475
</listitem>
54765476
</varlistentry>
54775477

5478+
<varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
5479+
<term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
5480+
<indexterm>
5481+
<primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
5482+
</indexterm>
5483+
</term>
5484+
<listitem>
5485+
<para>
5486+
Enables or disables the query planner's ability to partially push
5487+
aggregation past a join, and finalize it once all the relations are
5488+
joined. The default is <literal>on</literal>.
5489+
</para>
5490+
</listitem>
5491+
</varlistentry>
5492+
54785493
<varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
54795494
<term><varname>enable_gathermerge</varname> (<type>boolean</type>)
54805495
<indexterm>
@@ -6095,6 +6110,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
60956110
</listitem>
60966111
</varlistentry>
60976112

6113+
<varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
6114+
<term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
6115+
<indexterm>
6116+
<primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
6117+
</indexterm>
6118+
</term>
6119+
<listitem>
6120+
<para>
6121+
Sets the minimum average group size required to consider applying
6122+
eager aggregation. This helps avoid the overhead of eager
6123+
aggregation when it does not offer significant row count reduction.
6124+
The default is <literal>8</literal>.
6125+
</para>
6126+
</listitem>
6127+
</varlistentry>
6128+
60986129
<varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
60996130
<term><varname>jit_above_cost</varname> (<type>floating point</type>)
61006131
<indexterm>

src/backend/optimizer/README

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1500,3 +1500,113 @@ breaking down aggregation or grouping over a partitioned relation into
15001500
aggregation or grouping over its partitions is called partitionwise
15011501
aggregation. Especially when the partition keys match the GROUP BY clause,
15021502
this can be significantly faster than the regular method.
1503+
1504+
Eager aggregation
1505+
-----------------
1506+
1507+
Eager aggregation is a query optimization technique that partially
1508+
pushes aggregation past a join, and finalizes it once all the
1509+
relations are joined. Eager aggregation may reduce the number of
1510+
input rows to the join and thus could result in a better overall plan.
1511+
1512+
To prove that the transformation is correct, let's first consider the
1513+
case where only inner joins are involved. In this case, we partition
1514+
the tables in the FROM clause into two groups: those that contain at
1515+
least one aggregation column, and those that do not contain any
1516+
aggregation columns. Each group can be treated as a single relation
1517+
formed by the Cartesian product of the tables within that group.
1518+
Therefore, without loss of generality, we can assume that the FROM
1519+
clause contains exactly two relations, R1 and R2, where R1 represents
1520+
the relation containing all aggregation columns, and R2 represents the
1521+
relation without any aggregation columns.
1522+
1523+
Let the query be of the form:
1524+
1525+
SELECT G, AGG(A)
1526+
FROM R1 JOIN R2 ON J
1527+
GROUP BY G;
1528+
1529+
where G is the set of grouping keys that may include columns from R1
1530+
and/or R2; AGG(A) is an aggregate function over columns A from R1; J
1531+
is the join condition between R1 and R2.
1532+
1533+
The transformation of eager aggregation is:
1534+
1535+
GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
1536+
=
1537+
GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
1538+
1539+
This equivalence holds under the following conditions:
1540+
1541+
1) AGG is decomposable, meaning that it can be computed in two stages:
1542+
a partial aggregation followed by a final aggregation;
1543+
2) The set G1 used in the pre-aggregation of R1 includes:
1544+
* all columns from R1 that are part of the grouping keys G, and
1545+
* all columns from R1 that appear in the join condition J.
1546+
3) The grouping operator for any column in G1 must be compatible with
1547+
the operator used for that column in the join condition J.
1548+
1549+
Since G1 includes all columns from R1 that appear in either the
1550+
grouping keys G or the join condition J, all rows within each partial
1551+
group have identical values for both the grouping keys and the
1552+
join-relevant columns from R1, assuming compatible operators are used.
1553+
As a result, the rows within a partial group are indistinguishable in
1554+
terms of their contribution to the aggregation and their behavior in
1555+
the join. This ensures that all rows in the same partial group share
1556+
the same "destiny": they either all match or all fail to match a given
1557+
row in R2. Because the aggregate function AGG is decomposable,
1558+
aggregating the partial results after the join yields the same final
1559+
result as aggregating after the full join, thereby preserving query
1560+
semantics. Q.E.D.
1561+
1562+
In the case where there are any outer joins, the situation becomes
1563+
more complex due to join order constraints and the semantics of
1564+
null-extension in outer joins. If the relations that contain at least
1565+
one aggregation column cannot be treated as a single relation because
1566+
of the join order constraints, partial aggregation paths will not be
1567+
generated, and thus the transformation is not applicable. Otherwise,
1568+
let R1 be the relation containing all aggregation columns, and R2, R3,
1569+
... be the remaining relations. From the inner join case, under the
1570+
aforementioned conditions, we have the equivalence:
1571+
1572+
GROUP BY G, AGG(A) on (R1 JOIN R2 JOIN R3 ...)
1573+
=
1574+
GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 JOIN R3 ...)
1575+
1576+
To preserve correctness when outer joins are involved, we require an
1577+
additional condition:
1578+
1579+
4) R1 must not be on the nullable side of any outer join.
1580+
1581+
This condition ensures that partial aggregation over R1 does not
1582+
suppress any null-extended rows that would be introduced by outer
1583+
joins. If R1 is on the nullable side of an outer join, the
1584+
NULL-extended rows produced by the outer join would not be available
1585+
when we perform the partial aggregation, while with a
1586+
non-eager-aggregation plan these rows are available for the top-level
1587+
aggregation. Pushing partial aggregation in this case may result in
1588+
the rows being grouped differently than expected, or produce incorrect
1589+
values from the aggregate functions.
1590+
1591+
During the construction of the join tree, we evaluate each base or
1592+
join relation to determine if eager aggregation can be applied. If
1593+
feasible, we create a separate RelOptInfo called a "grouped relation"
1594+
and generate grouped paths by adding sorted and hashed partial
1595+
aggregation paths on top of the non-grouped paths. To limit planning
1596+
time, we consider only the cheapest or suitably-sorted non-grouped
1597+
paths in this step.
1598+
1599+
Another way to generate grouped paths is to join a grouped relation
1600+
with a non-grouped relation. Joining two grouped relations is
1601+
currently not supported.
1602+
1603+
To further limit planning time, we currently adopt a strategy where
1604+
partial aggregation is pushed only to the lowest feasible level in the
1605+
join tree where it provides a significant reduction in row count.
1606+
This strategy also helps ensure that all grouped paths for the same
1607+
grouped relation produce the same set of rows, which is important to
1608+
support a fundamental assumption of the planner.
1609+
1610+
If we have generated a grouped relation for the topmost join relation,
1611+
we need to finalize its paths at the end. The final paths will
1612+
compete in the usual way with paths built from regular planning.

src/backend/optimizer/geqo/geqo_eval.c

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -264,6 +264,9 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
264264
/* Keep searching if join order is not valid */
265265
if (joinrel)
266266
{
267+
bool is_top_rel = bms_equal(joinrel->relids,
268+
root->all_query_rels);
269+
267270
/* Create paths for partitionwise joins. */
268271
generate_partitionwise_join_paths(root, joinrel);
269272

@@ -273,12 +276,28 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
273276
* rel once we know the final targetlist (see
274277
* grouping_planner).
275278
*/
276-
if (!bms_equal(joinrel->relids, root->all_query_rels))
279+
if (!is_top_rel)
277280
generate_useful_gather_paths(root, joinrel, false);
278281

279282
/* Find and save the cheapest paths for this joinrel */
280283
set_cheapest(joinrel);
281284

285+
/*
286+
* Except for the topmost scan/join rel, consider generating
287+
* partial aggregation paths for the grouped relation on top
288+
* of the paths of this rel. After that, we're done creating
289+
* paths for the grouped relation, so run set_cheapest().
290+
*/
291+
if (joinrel->grouped_rel != NULL && !is_top_rel)
292+
{
293+
RelOptInfo *grouped_rel = joinrel->grouped_rel;
294+
295+
Assert(IS_GROUPED_REL(grouped_rel));
296+
297+
generate_grouped_paths(root, grouped_rel, joinrel);
298+
set_cheapest(grouped_rel);
299+
}
300+
282301
/* Absorb new clump into old */
283302
old_clump->joinrel = joinrel;
284303
old_clump->size += new_clump->size;

0 commit comments

Comments
 (0)