Phases

  • Parsing → Logical Planning & Optimization → Physical Planning → Execution
  • The SQL front-end is responsible for parsing, desugaring, free simplifications and semantic analysis. The SQL middle-end is responsible for logical planning and optimization. For efficiency and engineering reasons, the the front-end and middle-end are grouped together in the code
  • If the architecture calls out a thing called e.g. “query runner”, which takes as input a logical query plan (a data structure) and outputs result rows (another data structure), you’d usually expect a thing in the source code called “query runner” that looks like a class whose instances would carry the execution’s internal state providing some methods that take a logical plan as input, and returning result rows as results.
  • It is common for SQL engines to separate processing of a query into two phases: preparation and execution. This is especially valuable because the work of preparation can be performed just once for multiple executions.

Parsing

  • CockroachDB uses an LALR parser that is generated by goyacc from a Yacc-like grammar file. This file specifies CockroachDB’s SQL dialect, and was originally copied from PostgreSQL and then modified.

// Statement is the result of parsing a single statement. It contains the AST
// node along with other information.
type Statement[T any] struct {
	// AST is the root of the AST tree for the parsed statement.
	// Note that it is NOT SAFE to access this currently with statement execution,
	// as unfortunately the AST is not immutable.
	// See issue https://github.com/cockroachdb/cockroach/issues/22847 for more
	// details on this problem.
	AST T

	// Comments is the list of parsed SQL comments.
	Comments []string

	// SQL is the original SQL from which the statement was parsed. Note that this
	// is not appropriate for use in logging, as it may contain passwords and
	// other sensitive data.
	SQL string

	// NumPlaceholders indicates the number of arguments to the statement (which
	// are referenced through placeholders). This corresponds to the highest
	// argument position (i.e. the x in "$x") that appears in the query.
	//
	// Note: where there are "gaps" in the placeholder positions, this number is
	// based on the highest position encountered. For example, for `SELECT $3`,
	// NumPlaceholders is 3. These cases are malformed and will result in a
	// type-check error.
	NumPlaceholders int

	// NumAnnotations indicates the number of annotations in the tree. It is equal
	// to the maximum annotation index.
	NumAnnotations tree.AnnotationIdx
}
  • Note that the AST only represents the syntax of the query, and says nothing about how, or even if, it can be executed. For example, it has no idea whether a table exists or what datatype a column has. Figuring that out is the job of the planner.
  • mainly Parser.Parse() in sql/parser/parse.go.

Logical Planning and Optimization

  • Query planning is the process of converting the SQL query AST into a tree of relational algebra operators that will produce the desired result.
EXPLAIN
    SELECT companies.name AS company, countries.name AS country, employees
    FROM companies JOIN countries ON companies.country = countries.id
    WHERE employees > 1e6
    ORDER BY employees DESC;
  • Each node in the plan is a relational operator, and results flow upwards from one node to the next. We can see that the optimizer has decided to do a full table scan of the companies table, filter the rows on employees, perform a lookup join with the countries table, and finally sort the results by employees in descending order.
  • Scalar expressions (algebraic expressions such as tax / total * 100) are originally represented by nested tree.Expr AST nodes and implement a visitor pattern for analysis and transformation.

Planner


// planner is the centerpiece of SQL statement execution combining session
// state and database state with the logic for SQL execution. It is logically
// scoped to the execution of a single statement, and should not be used to
// execute multiple statements. It is not safe to use the same planner from
// multiple goroutines concurrently.
//
// planners are usually created by using the newPlanner method on a Session.
// If one needs to be created outside of a Session, use makeInternalPlanner().
type planner struct {
	txn *kv.Txn

	// isInternalPlanner is set to true when this planner is not bound to
	// a SQL session.
	isInternalPlanner bool

	// Corresponding Statement for this query.
	stmt Statement

	instrumentation instrumentationHelper

	// Contexts for different stages of planning and execution.
	semaCtx         tree.SemaContext
	extendedEvalCtx extendedEvalContext

	// sessionDataMutator is used to mutate the session variables. Read
	// access to them is provided through evalCtx.
	sessionDataMutator *sessionDataMutator

	// execCfg is used to access the server configuration for the Executor.
	execCfg *ExecutorConfig

	preparedStatements preparedStatementsAccessor

	// avoidCachedDescriptors, when true, instructs all code that
	// accesses table/view descriptors to force reading the descriptors
	// within the transaction. This is necessary to read descriptors
	// from the store for:
	// 1. Descriptors that are part of a schema change but are not
	// modified by the schema change. (reading a table in CREATE VIEW)
	// 2. Disable the use of the table cache in tests.
	avoidCachedDescriptors bool

	// If set, the planner should skip checking for the SELECT privilege when
	// initializing plans to read from a table. This should be used with care.
	skipSelectPrivilegeChecks bool

	// autoCommit indicates whether we're planning for an implicit transaction.
	// If autoCommit is true, the plan is allowed (but not required) to commit the
	// transaction along with other KV operations. Committing the txn might be
	// beneficial because it may enable the 1PC optimization.
	//
	// NOTE: plan node must be configured appropriately to actually perform an
	// auto-commit. This is dependent on information from the optimizer.
	autoCommit bool

	// cancelChecker is used by planNodes to check for cancellation of the associated
	// query.
	cancelChecker *cancelchecker.CancelChecker

	// isPreparing is true if this planner is currently preparing.
	isPreparing bool

	// curPlan collects the properties of the current plan being prepared. This state
	// is undefined at the beginning of the planning of each new statement, and cannot
	// be reused for an old prepared statement after a new statement has been prepared.
	curPlan planTop

	// Avoid allocations by embedding commonly used objects and visitors.
	txCtx                 transform.ExprTransformContext
	nameResolutionVisitor schemaexpr.NameResolutionVisitor
	tableName             tree.TableName

	// Use a common datum allocator across all the plan nodes. This separates the
	// plan lifetime from the lifetime of returned results allowing plan nodes to
	// be pool allocated.
	alloc *rowenc.DatumAlloc

	// optPlanningCtx stores the optimizer planning context, which contains
	// data structures that can be reused between queries (for efficiency).
	optPlanningCtx optPlanningCtx

	// noticeSender allows the sending of notices.
	// Do not use this object directly; use the BufferClientNotice() method
	// instead.
	noticeSender noticeSender

	queryCacheSession querycache.Session

	// contextDatabaseID is the ID of a database. It is set during some name
	// resolution processes to disallow cross database references. In particular,
	// the type resolution steps will disallow resolution of types that have a
	// parentID != contextDatabaseID when it is set.
	contextDatabaseID descpb.ID
}
  • These normalization rules are compiled into Go code along with a norm.Factory that builds normalized memo expressions.
  • Now that we’ve built a normalized memo, buildExecMemo() goes on to optimize the plan using xform.Optimizer. This explores possible query plans by applying Optgen transformation rules to generate equivalent memo groups, and tries to pick the best combination through cost-based optimization based on table statistics.
  • opt/optbuilder
    • in-order depth-first recursive traversal of the AST;
    • invokes semantics checks on the way;
    • constructs a tree of “relational expression nodes”. This tree is also called “the memo” because of the data structure it uses internally.
    • the resulting tree is the logical plan.

References