EducationSoftwareStrategy.com
StrategyCommunity

Knowledge Base

Product

Community

Knowledge Base

TopicsBrowse ArticlesDeveloper Zone

Product

Download SoftwareProduct DocumentationSecurity Hub

Education

Tutorial VideosSolution GalleryEducation courses

Community

GuidelinesGrandmastersEvents
x_social-icon_white.svglinkedin_social-icon_white.svg
Strategy logoCommunity

© Strategy Inc. All Rights Reserved.

LegalTerms of UsePrivacy Policy
  1. Home
  2. Topics
  3. Amazon Redshift Query Execution

Amazon Redshift Query Execution


Community Admin

• Strategy


This example contains commands in Amazon Redshift for processing a Data manipulation language (DML) query.

EXPLAIN Operators

Query Execution Steps

Description

SCAN

Sequential Scan

scan

Amazon Redshift relation scan or table scan operator or step. Scans whole table sequentially from beginning to end; also evaluates query constraints for every row (Filter) if specified with WHERE clause. Also used to run INSERT, UPDATE, and DELETE statements.

JOINS: Amazon Redshift uses different join operators based on the physical design of the tables being joined, the location of the data required for the join, and specific attributes of the query itself. Subquery Scan -- Subquery scan and append are used to run UNION queries.

Nested Loop

nloop

Least optimal join; mainly used for cross-joins (Cartesian products; without a join condition) and some inequality joins.

Hash Join

hjoin

Also used for inner joins and left and right outer joins and typically faster than a nested loop join. Hash Join reads the outer table, hashes the joining column, and finds matches in the inner hash table. Step can spill to disk. (Inner input of hjoin is hash step which can be disk-based.)

Merge Join

mjoin

Also used for inner joins and outer joins (for join tables that are both distributed and sorted on the joining columns). Typically the fastest Amazon Redshift join algorithm, not including other cost considerations.

AGGREGATION: Operators and steps used for queries that involve aggregate functions and GROUP BY operations.

Aggregate

aggr

Operator/step for scalar aggregate functions.

HashAggregate

aggr

Operator/step for grouped aggregate functions. Can operate from disk by virtue of hash table spilling to disk.

GroupAggregate

aggr

Operator sometimes chosen for grouped aggregate queries if the Amazon Redshift configuration setting for force_hash_grouping setting is off.

SORT: Operators and steps used when queries have to sort or merge result sets.

Sort

sort

Sort performs the sorting specified by the ORDER BY clause as well as other operations such as UNIONs and joins. Can operate from disk.

Merge

merge

Produces final sorted results of a query based on intermediate sorted results derived from operations performed in parallel.

EXCEPT, INTERSECT, and UNION operations:

SetOp Except [Distinct]

hjoin

Used for EXCEPT queries. Can operate from disk based on virtue of fact that input hash can be disk-based.

Hash Intersect [Distinct]

hjoin

Used for INTERSECT queries. Can operate from disk based on virtue of fact that input hash can be disk-based.

Append [All |Distinct]

save

Append used with Subquery Scan to implement UNION and UNION ALL queries. Can operate from disk based on virtue of "save".

Miscellaneous/Other:

Hash

hash

Used for inner joins and left and right outer joins (provides input to a hash join). The Hash operator creates the hash table for the inner table of a join. (The inner table is the table that is checked for matches and, in a join of two tables, is usually the smaller of the two.)

Limit

limit

Evaluates the LIMIT clause.

Materialize

save

Materialize rows for input to nested loop joins and some merge joins. Can operate from disk.

--

parse

Used to parse textual input data during a load.

--

project

Used to rearrange columns and compute expressions, that is, project data.

Result

--

Run scalar functions that do not involve any table access.

--

return

Return rows to the leader or client.

Subplan

--

Used for certain subqueries.

Unique

unique

Eliminates duplicates from SELECT DISTINCT and UNION queries.

Window

window

Compute aggregate and ranking window functions. Can operate from disk.

Network Operations:

Network (Broadcast)

bcast

Broadcast is also an attribute of Join Explain operators and steps.

Network (Distribute)

dist

Distribute rows to compute nodes for parallel processing by data warehouse cluster.

Network (Send to Leader)

return

Sends results back to the leader for further processing.

DML Operations (operators that modify data):

Insert (using Result)

insert

Inserts data.

Delete (Scan + Filter)

delete

Deletes data. Can operate from disk.

Update (Scan + Filter)

delete, insert

Implemented as delete and Insert.


Comment

0 comments

Details

Example

Published:

February 11, 2019

Last Updated:

January 31, 2024