澳门新葡8455手机版-澳门新葡8455最新网站

您的位置:澳门新葡8455手机版 > 新闻资讯 > planner中子函数reconsider,planner中子函数query

planner中子函数reconsider,planner中子函数query

2019-10-05 10:19

从前的章节已介绍了函数query_planner中子函数reconsider_outer_join_clauses和generate_base_implied_equalities的显要完毕逻辑,本节继续介绍query_planner中qp_callback、fix_placeholder_input_needed_levels函数的达成逻辑。

在此之前的章节已介绍了函数query_planner中子函数query_planner中qp_callback和fix_placeholder_input_needed_levels的首要性达成逻辑,本节继续介绍remove_useless_joins、reduce_unique_semijoins和add_placeholders_to_base_rels的兑现逻辑。

query_planner代码片段:

query_planner代码片段:

 //... /* * We have completed merging equivalence sets, so it's now possible to * generate pathkeys in canonical form; so compute query_pathkeys and * other pathkeys fields in PlannerInfo. */ (*qp_callback) (root, qp_extra);//调用回调函数,处理PathKeys /* * Examine any "placeholder" expressions generated during subquery pullup. * Make sure that the Vars they need are marked as needed at the relevant * join level. This must be done before join removal because it might * cause Vars or placeholders to be needed above a join when they weren't * so marked before. */ fix_placeholder_input_needed_levels;//检查在子查询上拉时生成的PH表达式,确保Vars是OK的 //...
 //... /* * Remove any useless outer joins. Ideally this would be done during * jointree preprocessing, but the necessary information isn't available * until we've built baserel data structures and classified qual clauses. */ joinlist = remove_useless_joins(root, joinlist);//清除无用的外连接 /* * Also, reduce any semijoins with unique inner rels to plain inner joins. * Likewise, this can't be done until now for lack of needed info. */ reduce_unique_semijoins;//消除半连接 /* * Now distribute "placeholders" to base rels as needed. This has to be * done after join removal because removal could change whether a * placeholder is evaluable at a base rel. */ add_placeholders_to_base_rels;//在"base rels"中添加PH //...

一、数据结构

PlannerInfo与RelOptInfo结构体贯彻逻辑优化和情理优化进度的始终.PlannerInfo

 /*---------- * PlannerInfo * Per-query information for planning/optimization * * This struct is conventionally called "root" in all the planner routines. * It holds links to all of the planner's working state, in addition to the * original Query. Note that at present the planner extensively modifies * the passed-in Query data structure; someday that should stop. *---------- */ struct AppendRelInfo; typedef struct PlannerInfo { NodeTag type;//Node标识 Query *parse; /* 查询树,the Query being planned */ PlannerGlobal *glob; /* 当前的planner全局信息,global info for current planner run */ Index query_level; /* 查询层次,1标识最高层,1 at the outermost Query */ struct PlannerInfo *parent_root; /* 如为子计划,则这里存储父计划器指针,NULL标识最高层,NULL at outermost Query */ /* * plan_params contains the expressions that this query level needs to * make available to a lower query level that is currently being planned. * outer_params contains the paramIds of PARAM_EXEC Params that outer * query levels will make available to this query level. */ List *plan_params; /* list of PlannerParamItems, see below */ Bitmapset *outer_params; /* * simple_rel_array holds pointers to "base rels" and "other rels" (see * comments for RelOptInfo for more info). It is indexed by rangetable * index (so entry 0 is always wasted). Entries can be NULL when an RTE * does not correspond to a base relation, such as a join RTE or an * unreferenced view RTE; or if the RelOptInfo hasn't been made yet. */ /* RelOptInfo数组,存储"base rels",比如基表/子查询等.该数组与RTE的顺序一一对应,而且是从1开始,因此[0]无用 */ struct RelOptInfo **simple_rel_array; /* All 1-rel RelOptInfos */ int simple_rel_array_size; /* 数组大小,allocated size of array */ /* * simple_rte_array is the same length as simple_rel_array and holds * pointers to the associated rangetable entries. This lets us avoid * rt_fetch(), which can be a bit slow once large inheritance sets have * been expanded. */ RangeTblEntry **simple_rte_array; /* RTE数组,rangetable as an array */ /* * append_rel_array is the same length as the above arrays, and holds * pointers to the corresponding AppendRelInfo entry indexed by * child_relid, or NULL if none. The array itself is not allocated if * append_rel_list is empty. */ struct AppendRelInfo **append_rel_array;//先前已介绍,在处理集合操作如UNION ALL时使用 /* * all_baserels is a Relids set of all base relids (but not "other" * relids) in the query; that is, the Relids identifier of the final join * we need to form. This is computed in make_one_rel, just before we * start making Paths. */ Relids all_baserels;//"base rels" /* * nullable_baserels is a Relids set of base relids that are nullable by * some outer join in the jointree; these are rels that are potentially * nullable below the WHERE clause, SELECT targetlist, etc. This is * computed in deconstruct_jointree. */ Relids nullable_baserels;//Nullable-side端的"base rels" /* * join_rel_list is a list of all join-relation RelOptInfos we have * considered in this planning run. For small problems we just scan the * list to do lookups, but when there are many join relations we build a * hash table for faster lookups. The hash table is present and valid * when join_rel_hash is not NULL. Note that we still maintain the list * even when using the hash table for lookups; this simplifies life for * GEQO. */ List *join_rel_list; /* 参与连接的Relation的RelOptInfo链表,list of join-relation RelOptInfos */ struct HTAB *join_rel_hash; /* 可加快链表访问的hash表,optional hashtable for join relations */ /* * When doing a dynamic-programming-style join search, join_rel_level[k] * is a list of all join-relation RelOptInfos of level k, and * join_cur_level is the current level. New join-relation RelOptInfos are * automatically added to the join_rel_level[join_cur_level] list. * join_rel_level is NULL if not in use. */ List **join_rel_level; /* RelOptInfo指针链表数组,k层的join存储在[k]中,lists of join-relation RelOptInfos */ int join_cur_level; /* 当前的join层次,index of list being extended */ List *init_plans; /* 查询的初始化计划链表,init SubPlans for query */ List *cte_plan_ids; /* CTE子计划ID链表,per-CTE-item list of subplan IDs */ List *multiexpr_params; /* List of Lists of Params for MULTIEXPR * subquery outputs */ List *eq_classes; /* 活动的等价类链表,list of active EquivalenceClasses */ List *canon_pathkeys; /* 规范化PathKey链表,list of "canonical" PathKeys */ List *left_join_clauses; /* 外连接约束条件链表,list of RestrictInfos for mergejoinable * outer join clauses w/nonnullable var on * left */ List *right_join_clauses; /* 外连接约束条件链表,list of RestrictInfos for mergejoinable * outer join clauses w/nonnullable var on * right */ List *full_join_clauses; /* 全连接约束条件链表,list of RestrictInfos for mergejoinable * full join clauses */ List *join_info_list; /* 特殊连接信息链表,list of SpecialJoinInfos */ List *append_rel_list; /* AppendRelInfo链表,list of AppendRelInfos */ List *rowMarks; /* list of PlanRowMarks */ List *placeholder_list; /* PHI链表,list of PlaceHolderInfos */ List *fkey_list; /* 外键信息链表,list of ForeignKeyOptInfos */ List *query_pathkeys; /* uery_planner()要求的PathKeys,desired pathkeys for query_planner() */ List *group_pathkeys; /* groupClause pathkeys, if any */ List *window_pathkeys; /* pathkeys of bottom window, if any */ List *distinct_pathkeys; /* distinctClause pathkeys, if any */ List *sort_pathkeys; /* sortClause pathkeys, if any */ List *part_schemes; /* 已规范化的分区Schema,Canonicalised partition schemes used in the * query. */ List *initial_rels; /* 尝试连接的RelOptInfo链表,RelOptInfos we are now trying to join */ /* Use fetch_upper_rel() to get any particular upper rel */ List *upper_rels[UPPERREL_FINAL + 1]; /* 上层的RelOptInfo链表, upper-rel RelOptInfos */ /* Result tlists chosen by grouping_planner for upper-stage processing */ struct PathTarget *upper_targets[UPPERREL_FINAL + 1];// /* * grouping_planner passes back its final processed targetlist here, for * use in relabeling the topmost tlist of the finished Plan. */ List *processed_tlist;//最后需处理的投影列 /* Fields filled during create_plan() for use in setrefs.c */ AttrNumber *grouping_map; /* for GroupingFunc fixup */ List *minmax_aggs; /* List of MinMaxAggInfos */ MemoryContext planner_cxt; /* 内存上下文,context holding PlannerInfo */ double total_table_pages; /* 所有的pages,# of pages in all tables of query */ double tuple_fraction; /* query_planner输入参数:元组处理比例,tuple_fraction passed to query_planner */ double limit_tuples; /* query_planner输入参数:limit_tuples passed to query_planner */ Index qual_security_level; /* 表达式的最新安全等级,minimum security_level for quals */ /* Note: qual_security_level is zero if there are no securityQuals */ InheritanceKind inhTargetKind; /* indicates if the target relation is an * inheritance child or partition or a * partitioned table */ bool hasJoinRTEs; /* 存在RTE_JOIN的RTE,true if any RTEs are RTE_JOIN kind */ bool hasLateralRTEs; /* 存在标记为LATERAL的RTE,true if any RTEs are marked LATERAL */ bool hasDeletedRTEs; /* 存在已在jointree删除的RTE,true if any RTE was deleted from jointree */ bool hasHavingQual; /* 存在Having子句,true if havingQual was non-null */ bool hasPseudoConstantQuals; /* true if any RestrictInfo has * pseudoconstant = true */ bool hasRecursion; /* 递归语句,true if planning a recursive WITH item */ /* These fields are used only when hasRecursion is true: */ int wt_param_id; /* PARAM_EXEC ID for the work table */ struct Path *non_recursive_path; /* a path for non-recursive term */ /* These fields are workspace for createplan.c */ Relids curOuterRels; /* outer rels above current node */ List *curOuterParams; /* not-yet-assigned NestLoopParams */ /* optional private data for join_search_hook, e.g., GEQO */ void *join_search_private; /* Does this query modify any partition key columns? */ bool partColsUpdated; } PlannerInfo; 

RelOptInfo

 typedef struct RelOptInfo { NodeTag type;//节点标识 RelOptKind reloptkind;//RelOpt类型 /* all relations included in this RelOptInfo */ Relids relids; /*Relids集合 set of base relids (rangetable indexes) */ /* size estimates generated by planner */ double rows; /*结果元组的估算数量 estimated number of result tuples */ /* per-relation planner control flags */ bool consider_startup; /*是否考虑启动成本?是,需要保留启动成本低的路径 keep cheap-startup-cost paths? */ bool consider_param_startup; /*是否考虑参数化?的路径 ditto, for parameterized paths? */ bool consider_parallel; /*是否考虑并行处理路径 consider parallel paths? */ /* default result targetlist for Paths scanning this relation */ struct PathTarget *reltarget; /*扫描该Relation时默认的结果 list of Vars/Exprs, cost, width */ /* materialization information */ List *pathlist; /*访问路径链表 Path structures */ List *ppilist; /*路径链表中使用参数化路径进行 ParamPathInfos used in pathlist */ List *partial_pathlist; /* partial Paths */ struct Path *cheapest_startup_path;//代价最低的启动路径 struct Path *cheapest_total_path;//代价最低的整体路径 struct Path *cheapest_unique_path;//代价最低的获取唯一值的路径 List *cheapest_parameterized_paths;//代价最低的参数化?路径链表 /* parameterization information needed for both base rels and join rels */ /* (see also lateral_vars and lateral_referencers) */ Relids direct_lateral_relids; /*使用lateral语法,需依赖的Relids rels directly laterally referenced */ Relids lateral_relids; /* minimum parameterization of rel */ /* information about a base rel (not set for join rels!) */ //reloptkind=RELOPT_BASEREL时使用的数据结构 Index relid; /* Relation ID */ Oid reltablespace; /* 表空间 containing tablespace */ RTEKind rtekind; /* 基表?子查询?还是函数等等?RELATION, SUBQUERY, FUNCTION, etc */ AttrNumber min_attr; /* 最小的属性编号 smallest attrno of rel (often <0) */ AttrNumber max_attr; /* 最大的属性编号 largest attrno of rel */ Relids *attr_needed; /* 数组 array indexed [min_attr .. max_attr] */ int32 *attr_widths; /* 属性宽度 array indexed [min_attr .. max_attr] */ List *lateral_vars; /* 关系依赖的Vars/PHVs LATERAL Vars and PHVs referenced by rel */ Relids lateral_referencers; /*依赖该关系的Relids rels that reference me laterally */ List *indexlist; /* 该关系的IndexOptInfo链表 list of IndexOptInfo */ List *statlist; /* 统计信息链表 list of StatisticExtInfo */ BlockNumber pages; /* 块数 size estimates derived from pg_class */ double tuples; /* 元组数 */ double allvisfrac; /* ? */ PlannerInfo *subroot; /* 如为子查询,存储子查询的root if subquery */ List *subplan_params; /* 如为子查询,存储子查询的参数 if subquery */ int rel_parallel_workers; /* 并行执行,需要多少个workers? wanted number of parallel workers */ /* Information about foreign tables and foreign joins */ //FWD相关信息 Oid serverid; /* identifies server for the table or join */ Oid userid; /* identifies user to check access as */ bool useridiscurrent; /* join is only valid for current user */ /* use "struct FdwRoutine" to avoid including fdwapi.h here */ struct FdwRoutine *fdwroutine; void *fdw_private; /* cache space for remembering if we have proven this relation unique */ //已知的,可保证唯一的Relids链表 List *unique_for_rels; /* known unique for these other relid * set */ List *non_unique_for_rels; /* 已知的,不唯一的Relids链表 known not unique for these set */ /* used by various scans and joins: */ List *baserestrictinfo; /* 如为基本关系,存储约束条件 RestrictInfo structures (if base rel) */ QualCost baserestrictcost; /* 解析约束表达式的成本? cost of evaluating the above */ Index baserestrict_min_security; /* 最低安全等级 min security_level found in * baserestrictinfo */ List *joininfo; /* 连接语句的约束条件信息 RestrictInfo structures for join clauses * involving this rel */ bool has_eclass_joins; /* 是否存在等价类连接? T means joininfo is incomplete */ /* used by partitionwise joins: */ bool consider_partitionwise_join; /* 分区? consider partitionwise * join paths? (if * partitioned rel) */ Relids top_parent_relids; /* Relids of topmost parents (if "other" * rel) */ /* used for partitioned relations */ //分区表使用 PartitionScheme part_scheme; /* 分区的schema Partitioning scheme. */ int nparts; /* 分区数 number of partitions */ struct PartitionBoundInfoData *boundinfo; /* 分区边界信息 Partition bounds */ List *partition_qual; /* 分区约束 partition constraint */ struct RelOptInfo **part_rels; /* 分区的RelOptInfo数组 Array of RelOptInfos of partitions, * stored in the same order of bounds */ List **partexprs; /* 非空分区键表达式 Non-nullable partition key expressions. */ List **nullable_partexprs; /* 可为空的分区键表达式 Nullable partition key expressions. */ List *partitioned_child_rels; /* RT Indexes链表 List of RT indexes. */ } RelOptInfo;

PlaceHolderInfo

/* * For each distinct placeholder expression generated during planning, we * store a PlaceHolderInfo node in the PlannerInfo node's placeholder_list. * This stores info that is needed centrally rather than in each copy of the * PlaceHolderVar. The phid fields identify which PlaceHolderInfo goes with * each PlaceHolderVar. Note that phid is unique throughout a planner run, * not just within a query level --- this is so that we need not reassign ID's * when pulling a subquery into its parent. * * The idea is to evaluate the expression at  the ph_eval_at join level, * then allow it to bubble up like a Var until the ph_needed join level. * ph_needed has the same definition as attr_needed for a regular Var. * * The PlaceHolderVar's expression might contain LATERAL references to vars * coming from outside its syntactic scope. If so, those rels are *not* * included in ph_eval_at, but they are recorded in ph_lateral. * * Notice that when ph_eval_at is a join rather than a single baserel, the * PlaceHolderInfo may create constraints on join order: the ph_eval_at join * has to be formed below any outer joins that should null the PlaceHolderVar. * * We create a PlaceHolderInfo only after determining that the PlaceHolderVar * is actually referenced in the plan tree, so that unreferenced placeholders * don't result in unnecessary constraints on join order. */ typedef struct PlaceHolderInfo { NodeTag type; Index phid; /* PH的ID,ID for PH (unique within planner run) */ PlaceHolderVar *ph_var; /* copy of PlaceHolderVar tree */ Relids ph_eval_at; /* lowest level we can evaluate value at */ Relids ph_lateral; /* relids of contained lateral refs, if any */ Relids ph_needed; /* highest level the value is needed at */ int32 ph_width; /* estimated attribute width */ } PlaceHolderInfo;

一、数据结构

PlaceHolderVar上一小节已介绍过PHInfo

 /* * Placeholder node for an expression to be evaluated below the top level * of a plan tree. This is used during planning to represent the contained * expression. At the end of the planning process it is replaced by either * the contained expression or a Var referring to a lower-level evaluation of * the contained expression. Typically the evaluation occurs below an outer * join, and Var references above the outer join might thereby yield NULL * instead of the expression value. * * Although the planner treats this as an expression node type, it is not * recognized by the parser or executor, so we declare it here rather than * in primnodes.h. */ typedef struct PlaceHolderVar { Expr xpr; Expr *phexpr; /* the represented expression */ Relids phrels; /* base relids syntactically within expr src */ Index phid; /* ID for PHV (unique within planner run) */ Index phlevelsup; /* > 0 if PHV belongs to outer query */ } PlaceHolderVar;

SpecialJoinInfo

 /* * "Special join" info. * * One-sided outer joins constrain the order of joining partially but not * completely. We flatten such joins into the planner's top-level list of * relations to join, but record information about each outer join in a * SpecialJoinInfo struct. These structs are kept in the PlannerInfo node's * join_info_list. * * Similarly, semijoins and antijoins created by flattening IN (subselect) * and EXISTS(subselect) clauses create partial constraints on join order. * These are likewise recorded in SpecialJoinInfo structs. * * We make SpecialJoinInfos for FULL JOINs even though there is no flexibility * of planning for them, because this simplifies make_join_rel()'s API. * * min_lefthand and min_righthand are the sets of base relids that must be * available on each side when performing the special join. lhs_strict is * true if the special join's condition cannot succeed when the LHS variables * are all NULL (this means that an outer join can commute with upper-level * outer joins even if it appears in their RHS). We don't bother to set * lhs_strict for FULL JOINs, however. * * It is not valid for either min_lefthand or min_righthand to be empty sets; * if they were, this would break the logic that enforces join order. * * syn_lefthand and syn_righthand are the sets of base relids that are * syntactically below this special join. (These are needed to help compute * min_lefthand and min_righthand for higher joins.) * * delay_upper_joins is set true if we detect a pushed-down clause that has * to be evaluated after this join is formed (because it references the RHS). * Any outer joins that have such a clause and this join in their RHS cannot * commute with this join, because that would leave noplace to check the * pushed-down clause. (We don't track this for FULL JOINs, either.) * * For a semijoin, we also extract the join operators and their RHS arguments * and set semi_operators, semi_rhs_exprs, semi_can_btree, and semi_can_hash. * This is done in support of possibly unique-ifying the RHS, so we don't * bother unless at least one of semi_can_btree and semi_can_hash can be set * true. (You might expect that this information would be computed during * join planning; but it's helpful to have it available during planning of * parameterized table scans, so we store it in the SpecialJoinInfo structs.) * * jointype is never JOIN_RIGHT; a RIGHT JOIN is handled by switching * the inputs to make it a LEFT JOIN. So the allowed values of jointype * in a join_info_list member are only LEFT, FULL, SEMI, or ANTI. * * For purposes of join selectivity estimation, we create transient * SpecialJoinInfo structures for regular inner joins; so it is possible * to have jointype == JOIN_INNER in such a structure, even though this is * not allowed within join_info_list. We also create transient * SpecialJoinInfos with jointype == JOIN_INNER for outer joins, since for * cost estimation purposes it is sometimes useful to know the join size under * plain innerjoin semantics. Note that lhs_strict, delay_upper_joins, and * of course the semi_xxx fields are not set meaningfully within such structs. */ typedef struct SpecialJoinInfo { NodeTag type; Relids min_lefthand; /* base relids in minimum LHS for join */ Relids min_righthand; /* base relids in minimum RHS for join */ Relids syn_lefthand; /* base relids syntactically within LHS */ Relids syn_righthand; /* base relids syntactically within RHS */ JoinType jointype; /* always INNER, LEFT, FULL, SEMI, or ANTI */ bool lhs_strict; /* joinclause is strict for some LHS rel */ bool delay_upper_joins; /* can't commute with upper RHS */ /* Remaining fields are set only for JOIN_SEMI jointype: */ bool semi_can_btree; /* true if semi_operators are all btree */ bool semi_can_hash; /* true if semi_operators are all hash */ List *semi_operators; /* OIDs of equality join operators */ List *semi_rhs_exprs; /* righthand-side expressions of these ops */ } SpecialJoinInfo;

二、源码解读

standard_qp_callback标准的query_planner回调函数,在更换布置的中间管理query_pathkeys和其他pathkeys

 /* * Compute query_pathkeys and other pathkeys during plan generation */ static void standard_qp_callback(PlannerInfo *root, void *extra) { Query *parse = root->parse;//查询树 standard_qp_extra *qp_extra = (standard_qp_extra *) extra;//参数 List *tlist = qp_extra->tlist; List *activeWindows = qp_extra->activeWindows; /* * Calculate pathkeys that represent grouping/ordering requirements. The * sortClause is certainly sort-able, but GROUP BY and DISTINCT might not * be, in which case we just leave their pathkeys empty. */ if (qp_extra->groupClause && grouping_is_sortable(qp_extra->groupClause))//group语句&要求排序 root->group_pathkeys = make_pathkeys_for_sortclauses(root, qp_extra->groupClause, tlist);//构建pathkeys else root->group_pathkeys = NIL; /* We consider only the first  window in pathkeys logic */ if (activeWindows != NIL)//窗口函数 { WindowClause *wc = linitial_node(WindowClause, activeWindows); root->window_pathkeys = make_pathkeys_for_window(root, wc, tlist); } else root->window_pathkeys = NIL; if (parse->distinctClause && grouping_is_sortable(parse->distinctClause))//存在distinct语句&按相关字段排序 root->distinct_pathkeys = make_pathkeys_for_sortclauses(root, parse->distinctClause, tlist);//构建pathkeys else root->distinct_pathkeys = NIL; root->sort_pathkeys = make_pathkeys_for_sortclauses(root, parse->sortClause, tlist);//构建常规的排序pathkeys /* * Figure out whether we want a sorted result from query_planner. * * If we have a sortable GROUP BY clause, then we want a result sorted * properly for grouping. Otherwise, if we have window functions to * evaluate, we try to sort for the first window. Otherwise, if there's a * sortable DISTINCT clause that's more rigorous than the ORDER BY clause, * we try to produce output that's sufficiently well sorted for the * DISTINCT. Otherwise, if there is an ORDER BY clause, we want to sort * by the ORDER BY clause. * * Note: if we have both ORDER BY and GROUP BY, and ORDER BY is a superset * of GROUP BY, it would be tempting to request sort by ORDER BY --- but * that might just leave us failing to exploit an available sort order at * all. Needs more thought. The choice for DISTINCT versus ORDER BY is * much easier, since we know that the parser ensured that one is a * superset of the other. */ if (root->groupremove_useless_joins_pathkeys) root->query_pathkeys = root->group_pathkeys; else if (root->window_pathkeys) root->query_pathkeys = root->window_pathkeys; else if (list_length(root->distinct_pathkeys) > list_length(root->sort_pathkeys)) root->query_pathkeys = root->distinct_pathkeys; else if (root->sort_pathkeys) root->query_pathkeys = root->sort_pathkeys; else root->query_pathkeys = NIL; }/* * make_pathkeys_for_sortclauses * Generate a pathkeys list that represents the sort order specified * by a list of SortGroupClauses * * The resulting PathKeys are always in canonical form. (Actually, there * is no longer any code anywhere that creates non-canonical PathKeys.) * * We assume that root->nullable_baserels is the set of base relids that could * have gone to NULL below the SortGroupClause expressions. This is okay if * the expressions came from the query's top level (ORDER BY, DISTINCT, etc) * and if this function is only invoked after deconstruct_jointree. In the * future we might have to make callers pass in the appropriate * nullable-relids set, but for now it seems unnecessary. * * 'sortclauses' is a list of SortGroupClause nodes * 'tlist' is the targetlist to find the referenced tlist entries in */ List * make_pathkeys_for_sortclauses(PlannerInfo *root, List *sortclauses, List *tlist) { List *pathkeys = NIL; ListCell *l; foreach(l, sortclauses) { SortGroupClause *sortcl = (SortGroupClause *) lfirst; Expr *sortkey; PathKey *pathkey; sortkey =  get_sortgroupclause_expr(sortcl, tlist); Assert(OidIsValid(sortcl->sortop)); pathkey = make_pathkey_from_sortop(root, sortkey, root->nullable_baserels, sortcl->sortop, sortcl->nulls_first, sortcl->tleSortGroupRef, true); /* Canonical form eliminates redundant ordering keys */ if (!pathkey_is_redundant(pathkey, pathkeys))//不是多余的Key的情况下,才保留 pathkeys = lappend(pathkeys, pathkey); } return pathkeys; }

测量检验脚本:

testdb=# select t1.dwbh,t2.grbhfrom t_dwxx t1 left join t_grxx t2 on t1.dwbh = t2.dwbhwhere t1.dwbh = '1001'order by t1.dwbh;

钉住分析,步向make_pathkeys_for_sortclauses函数:

... stepmake_pathkeys_for_sortclauses (root=0x1702958, sortclauses=0x170d068, tlist=0x1746758) at pathkeys.c:878878 List *pathkeys = NIL; p *(SortGroupClause *)sortclauses->head->data.ptr_value$5 = {type = T_SortGroupClause, tleSortGroupRef = 1, eqop = 98, sortop = 664, nulls_first = false, hashable = true}... n889 pathkey = make_pathkey_from_sortop(root, 898 if (!pathkey_is_redundant(pathkey, pathkeys)) p *pathkey$11 = {type = T_PathKey, pk_eclass = 0x17486c0, pk_opfamily = 1994, pk_strategy = 1, pk_nulls_first = false}

函数pathkey_is_redundant通过等价类推断排序是或不是多余(redundant),在本例中,已存在限制条件dwbh='1001',因而该排序是剩下的,重回NULL.

901 return pathkeys; p *pathkeysCannot access memory at address 0x0

fix_placeholder_input_needed_levels

 /* * fix_placeholder_input_needed_levels * Adjust the "needed at" levels for placeholder inputs * * This is called after we've finished determining the eval_at levels for * all placeholders. We need to make sure that all vars and placeholders * needed to evaluate each placeholder will be available at the scan or join * level where the evaluation will be done. (It might seem that scan-level * evaluations aren't interesting, but that's not so: a LATERAL reference * within a placeholder's expression needs to cause the referenced var or * placeholder to be marked as needed in the scan where it's evaluated.) * Note that this loop can have side-effects on the ph_needed sets of other * PlaceHolderInfos; that's okay because we don't examine ph_needed here, so * there are no ordering issues to worry about. */ void fix_placeholder_input_needed_levels(PlannerInfo *root) { ListCell *lc; foreach(lc, root->placeholder_list)//遍历链表 { PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst; List *vars = pull_var_clause phinfo->ph_var->phexpr, PVC_RECURSE_AGGREGATES | PVC_RECURSE_WINDOWFUNCS | PVC_INCLUDE_PLACEHOLDERS);//获取Vars add_vars_to_targetlist(root, vars, phinfo->ph_eval_at, false);//添加到投影列中 list_free; } }

观测上边包车型地铁SQL语句:

testdb=# explain verbose select t1.dwbh,t2.grbh,t2.constant_fieldfrom t_dwxx t1 left join (select a.dwbh,a.grbh,'TEST' as constant_field from t_grxx a) t2 on t1.dwbh = t2.dwbh where t1.dwbh = '1001'order by t1.dwbh; QUERY PLAN ------------------------------------------------------------------------ Nested Loop Left Join (cost=0.00..16.06 rows=2 width=108) Output: t1.dwbh, a.grbh, ('TEST'::text) -- PlaceHolderVar Join Filter: ::text = ::text) -> Seq Scan on public.t_dwxx t1 (cost=0.00..1.04 rows=1 width=38) Output: t1.dwmc, t1.dwbh, t1.dwdz Filter: ::text = '1001'::text) -> Seq Scan on public.t_grxx a (cost=0.00..15.00 rows=2 width=108) Output: a.grbh, a.dwbh, 'TEST'::text Filter: ::text = '1001'::text)

子查询上拉与t_dwxx举行一而再,上拉进度中,不能够简单的把手查询中的"'TEST' as constant_田野先生"作为上层查询的Var来对待(即使不作特殊管理,跟外连接就不等价了,因为该值有极大希望是NULL),PG因而引进了PlaceHolderVar这么一个Var来对这种变量实行超过常规规管理.

钉住剖判:

 b planmain.c:161Breakpoint 1 at 0x769602: file planmain.c, line 161. cContinuing.Breakpoint 1, query_planner (root=0x1702b08, tlist=0x1749c20, qp_callback=0x76e97d <standard_qp_callback>, qp_extra=0x7ffd35e059c0) at planmain.c:163163 reconsider_outer_join_clauses;

注意root中的placeholder_list

 p *root$1 = {type = T_PlannerInfo, ..., placeholder_list = 0x174bf00, ...}

翻看其内部存款和储蓄器结构:

#1个PHV p *root->placeholder_list$2 = {type = T_List, length = 1, head = 0x174bee0, tail = 0x174bee0} p *root->placeholder_list->head->data.ptr_value$3 = {type = T_PlaceHolderInfo} p *(PlaceHolderInfo *)root->placeholder_list->head->data.ptr_value$4 = {type = T_PlaceHolderInfo, phid = 1, ph_var = 0x174be18, ph_eval_at = 0x174bec8, ph_lateral = 0x0, ph_needed = 0x174bf30, ph_width = 32} set $phi=(PlaceHolderInfo *)root->placeholder_list->head->data.ptr_value p *$phi->ph_var$5 = {xpr = {type = T_PlaceHolderVar}, phexpr = 0x174be48, phrels = 0x174beb0, phid = 1, phlevelsup = 0}#该PHV位于编号为4的RTE中 p *$phi->ph_eval_at$6 = {nwords = 1, words = 0x174becc} p *$phi->ph_eval_at->words$7 = 16 p *$phi->ph_needed$8 = {nwords = 1, words = 0x174bf34}#该PHV在编号为1的RTE中需要用到 p *$phi->ph_needed->words$9 = 1

二、源码解读

remove_useless_joins取消无用的连年,举例以下的SQL语句:

select t1.dwbh from t_grxx t1 left join t_dwxx t2 on t1.dwbh = t2.dwbh;

左连接,而且t_dwxx.dwbh独一,那样的连接是不要求的三回九转,直接查询t_grxx就能够.从实行安顿来看,PG只对t_grxx举办围观:

testdb=# explain verbose select t1.dwbh from t_grxx t1 left join t_dwxx t2 on t1.dwbh = t2.dwbh; QUERY PLAN -------------------------------------------------------------------- Seq Scan on public.t_grxx t1 (cost=0.00..14.00 rows=400 width=38) Output: t1.dwbh

源代码如下:

 /* * remove_useless_joins * Check for relations that don't actually need to be joined at all, * and remove them from the query. * * We are passed the current joinlist and return the updated list. Other * data structures that have to be updated are accessible via "root". */ List * remove_useless_joins(PlannerInfo *root, List *joinlist) { ListCell *lc; /* * We are only interested in relations that are left-joined to, so we can * scan the join_info_list to find them easily. */ restart: foreach(lc, root->join_info_list)//遍历连接信息链表 { SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst; int innerrelid; int nremoved; /* Skip if not removable */ if (!join_is_removable(root, sjinfo))//判断是否可以清除连接 continue; /* * Currently, join_is_removable can only succeed when the sjinfo's * righthand is a single baserel. Remove that rel from the query and * joinlist. */ innerrelid = bms_singleton_member(sjinfo->min_righthand); remove_rel_from_query(root, innerrelid, bms_union(sjinfo->min_lefthand, sjinfo->min_righthand));//从查询中删除相应的Rel /* We verify that exactly one reference gets removed from joinlist */ nremoved = 0; joinlist = remove_rel_from_joinlist(joinlist, innerrelid, &nremoved); if (nremoved != 1) elog(ERROR, "failed to find relation %d in joinlist", innerrelid); /* * We can delete this SpecialJoinInfo from the list too, since it's no * longer of interest. */ //更新连接链表信息 root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo); /* * Restart the scan. This is necessary to ensure we find all * removable joins independently of ordering of the join_info_list * (note that removal of attr_needed bits may make a join appear * removable that did not before). Also, since we just deleted the * current list cell, we'd have to have some kluge to continue the * list scan anyway. */ goto restart; } return joinlist; }

reduce_unique_semijoins把能够简化的半连接转化为内连接.比方以下的SQL语句:

select t1.*from t_grxx t1 where dwbh IN (select t2.dwbh from t_dwxx t2);

是因为子查询"select t2.dwbh from t_dwxx t2"的dwbh是PK,子查询进步后,t_grxx的dwbh只对应t_dwxx独一的一条记下,因而得以把半一而再转变为内接连,实施陈设如下:

testdb=# explain verbose select t1.*from t_grxx t1 where dwbh IN (select t2.dwbh from t_dwxx t2); QUERY PLAN ----------------------------------------------------------------------------- Hash Join (cost=1.07..20.10 rows=6 width=176) Output: t1.dwbh, t1.grbh, t1.xm, t1.xb, t1.nl Inner Unique: true Hash Cond: ::text = ::text) -> Seq Scan on public.t_grxx t1 (cost=0.00..14.00 rows=400 width=176) Output: t1.dwbh, t1.grbh, t1.xm, t1.xb, t1.nl -> Hash (cost=1.03..1.03 rows=3 width=38) Output: t2.dwbh -> Seq Scan on public.t_dwxx t2 (cost=0.00..1.03 rows=3 width=38) Output: t2.dwbh

盯住解析:

 n199 reduce_unique_semijoins; stepreduce_unique_semijoins (root=0x1702968) at analyzejoins.c:520520 for (lc = list_head(root->join_info_list); lc != NULL; lc = next) n522 SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst 

翻开SpecialJoinInfo内部存款和储蓄器结构:

528 next = lnext p *sjinfo$1 = {type = T_SpecialJoinInfo, min_lefthand = 0x1749818, min_righthand = 0x1749830, syn_lefthand = 0x1749570, syn_righthand = 0x17495d0, jointype = JOIN_SEMI, lhs_strict = true, delay_upper_joins = false, semi_can_btree = true, semi_can_hash = true, semi_operators = 0x17496c8, semi_rhs_exprs = 0x17497b8}

内表(innerrel,即t_dwxx)如帮忙唯一性,则能够虚构把半接连转换为内接连

550 if (!rel_supports_distinctness(root, innerrel))...575 root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);...

源代码如下:

 /* * reduce_unique_semijoins * Check for semijoins that can be simplified to plain inner joins * because the inner relation is provably unique for the join clauses. * * Ideally this would happen during reduce_outer_joins, but we don't have * enough information at that point. * * To perform the strength reduction when applicable, we need only delete * the semijoin's SpecialJoinInfo from root->join_info_list. (We don't * bother fixing the join type attributed to it in the query jointree, * since that won't be consulted again.) */ void reduce_unique_semijoins(PlannerInfo *root) { ListCell *lc; ListCell *next; /* * Scan the join_info_list to find semijoins. We can't use foreach * because we may delete the current cell. */ for (lc = list_head(root->join_info_list); lc != NULL; lc = next) { SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst;//特殊连接信息,先前通过deconstruct函数生成 int innerrelid; RelOptInfo *innerrel; Relids joinrelids; List *restrictlist; next = lnext; /* * Must be a non-delaying semijoin to a single baserel, else we aren't * going to be able to do anything with it. (It's probably not * possible for delay_upper_joins to be set on a semijoin, but we * might as well check.) */ if (sjinfo->jointype != JOIN_SEMI || sjinfo->delay_upper_joins) continue; if (!bms_get_singleton_member(sjinfo->min_righthand, &innerrelid)) continue; innerrel = find_base_rel(root, innerrelid); /* * Before we trouble to run generate_join_implied_equalities, make a * quick check to eliminate cases in which we will surely be unable to * prove uniqueness of the innerrel. */ if (!rel_supports_distinctness(root, innerrel)) continue; /* Compute the relid set for the join we are considering */ joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand); /* * Since we're only considering a single-rel RHS, any join clauses it * has must be clauses linking it to the semijoin's min_lefthand. We * can also consider EC-derived join clauses. */ restrictlist = list_concat(generate_join_implied_equalities(root, joinrelids, sjinfo->min_lefthand, innerrel), innerrel->joininfo); /* Test whether the innerrel is unique for those clauses. */ if (!innerrel_is_unique(root, joinrelids, sjinfo->min_lefthand, innerrel, JOIN_SEMI, restrictlist, true)) continue; /* OK, remove the SpecialJoinInfo from the list. */ root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);//删除特殊连接信息 } }

add_placeholders_to_base_rels把PHV分发到base rels中,代码较为简单

 /* * add_placeholders_to_base_rels * Add any required PlaceHolderVars to base rels' targetlists. * * If any placeholder can be computed at a base rel and is needed above it, * add it to that rel's targetlist. This might look like it could be merged * with fix_placeholder_input_needed_levels, but it must be separate because * join removal happens in between, and can change the ph_eval_at sets. There * is essentially the same logic in add_placeholders_to_joinrel, but we can't * do that part until joinrels are formed. */ void add_placeholders_to_base_rels(PlannerInfo *root) { ListCell *lc; foreach(lc, root->placeholder_list)//遍历PH链表 { PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst; Relids eval_at = phinfo->ph_eval_at; int varno; if (bms_get_singleton_member(eval_at, &varno) && bms_nonempty_difference(phinfo->ph_needed, eval_at))//添加到需要的RelOptInfo中 { RelOptInfo *rel = find_base_rel(root, varno); rel->reltarget->exprs = lappend(rel->reltarget->exprs, copyObject(phinfo->ph_var)); /* reltarget's cost and width fields will be updated later */ } } }

三、参谋资料

planmain.cwhat exactly is a PlaceHolderVAr

三、参照他事他说加以考察资料

planmain.crelation.h

本文由澳门新葡8455手机版发布于新闻资讯,转载请注明出处:planner中子函数reconsider,planner中子函数query

关键词:

  • 上一篇:没有了
  • 下一篇:没有了