EducationSoftwareStrategy.com
StrategyCommunity

Knowledge Base

Product

Community

Knowledge Base

TopicsBrowse ArticlesDeveloper Zone

Product

Download SoftwareProduct DocumentationSecurity Hub

Education

Tutorial VideosSolution GalleryEducation courses

Community

GuidelinesGrandmastersEvents
x_social-icon_white.svglinkedin_social-icon_white.svg
Strategy logoCommunity

© Strategy Inc. All Rights Reserved.

LegalTerms of UsePrivacy Policy
  1. Home
  2. Topics

Amazon Redshift Performance Tuning - Statistics


Norman Matos

Associate Scrum Master • Strategy


This article contains scripts that can be run to obtain statistics about tables and data blocks being stored, to provide insight for query planning.

Amazon Redshift, like other databases, requires statistics about tables and the composition of data blocks being stored in order to make good decisions when planning a query. Run one of those scripts to determine which tables are missing statistics, or contain outdated statistics:


SELECT database, schema || '.' || "table" AS "table", stats_off 
FROM info
WHERE stats_off > 5 
ORDER BY 2;

 


 


SELECT substring(trim(plannode),1,100) AS plannode,COUNT(*)
FROM dwk
WHERE plannode LIKE '%missing statistics%'
GROUP BY plannode
ORDER BY 2 DESC; 
Whenever you insert, delete, or update a significant number of rows, you should run a VACUUM command and then an ANALYZE command. Gather table statistics: 
vacuum t1;
analyze t1;

 

 

The entire database should be analyzed weekly. Also, you should plan for a daily stats collection job for most critical tables/volatile tables. To avoid a resource intensive VACUUM operation, you can load the data in sort key order, or design your table maintain data for a rolling time period, using time series tables.
 


Comment

0 comments

Details

Example

Published:

November 7, 2017

Last Updated:

November 7, 2017