EducationSoftwareStrategy.com
StrategyCommunity

Knowledge Base

Product

Community

Knowledge Base

TopicsBrowse ArticlesDeveloper Zone

Product

Download SoftwareProduct DocumentationSecurity Hub

Education

Tutorial VideosSolution GalleryEducation courses

Community

GuidelinesGrandmastersEvents
x_social-icon_white.svglinkedin_social-icon_white.svg
Strategy logoCommunity

© Strategy Inc. All Rights Reserved.

LegalTerms of UsePrivacy Policy
  1. Home
  2. Topics
  3. KB485086: Best practices for performance tuning based on PostgreSQL 13

KB485086: Best practices for performance tuning based on PostgreSQL 13


Junjie Shi

Quality Engineer, Senior • MicroStrategy


This article provides best practices for optimizing performance using PostgreSQL 13. This article covers which drivers retrieve data faster, how to tune the database to improve performance, authentication methods, and more.

Starting with the release of Strategy ONE (March 2024), dossiers are also known as dashboards.

Overview


This article describes best practices to be taken when connecting an instance of PostgreSQL in Strategy. Understanding your data characters and choosing the matched configuration could enhance the performance and accelerate the process to view the latest data. The following key areas are covered in this article:

  • Which driver can retrieve data faster?
  • How to tune database to improve performance
  • How to configure DSN/DB instance to improve performance
  • How does the new feature in Strategy 2021 work with PostgreSQL?
  • Strategy supported authentication methods on PostgreSQL

Introduction of PostgreSQL

PostgreSQL, also known as Postgres, is a free and open source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. It was originally named POSTGRES, referring to its origins as a successor to the Ingres database developed at the University of California, Berkeley.
The support level of on-premise PostgreSQL within Strategy is Diamond. For Diamond gateways, Strategy proactively runs end-to-end regression tests and performance tests to ensure that our gateways are optimized.
Strategy has certified PostgreSQL 13.x for the following components.
 

Strategy Component

PostgreSQL Version Support

Warehouse

13.x

Metadata

13.x

Platform Analytics

13.x

Collaboration Server

13.x

History List

13.x

Besides on-premise PostgreSQL, Strategy is also expanding our support to cloud based PostgreSQL. For example, Amazon Aurora PostgreSQL is already supported within Strategy.

Gateway Driver

ODBC vs JDBC


Strategy ships three PostgreSQL drivers on Strategy 2021 Update 2, as shown in the following table. 

Driver Type

Driver

Windows

Linux

Driver Version

Note

ODBC

MicroStrategy ODBC Driver for PostgreSQL Wire Protocol

<install path>\ODBC\mypsql64.dll

<install path>/install/lib/MYpsql64.so

7.10.06.381

Provided by the MicroStrategy driver vendor

ODBC

PostgreSQL Unicode (x64)

<install path>\PostgreSQL\1202\bin\psqlodbc35w.dll

<install path>/install/pgsql-12/lib/psqlodbcw.so

12.02.0000

Native PostgreSQL ODBC driver

JDBC

com.microstrategy.jdbc.postgresql.PostgreSQLDriver

<install path>\JDBC\mypostgresql.jar

<install path>/install/JDBC/mypostgresql.jar

5.1.4.000315

Provided by MicroStrategy driver vendor


*Shipped driver refers to Data Direct driver which is installed along with the MircroStrategy Intelligence server
*Native driver refers to PostgreSQL ODBC driver (PostgreSQL Unicode (x64))
 

Cube Publish Performance Test Result 


It's suggested to use Strategy ODBC Driver for PostgreSQL Wire Protocol and JDBC Driver (com.Strategy.jdbc.postgresql.PostgreSQLDriver) to get a more integrated and thorough support since we have a fully certified shipped driver from all aspects, including connectivity, functionality, security, and performance.
The following chart shows the throughput with different drivers. The PostgreSQL ODBC Driver has the best cube publish performance as it has the highest throughput. The test is for a single cube (4M rows) republish executed on the Strategy 2021 Update 2 release with platform as 32 CPU, 244 GB memory Amazon Linux machine. The ODBC connection uses 3.x ODBC calls by default. 

ka0PW0000001JllYAE_0EM4W0000028zIt.jpeg

Prior to Strategy 2021 Update 2, PostgreSQL JDBC has a limitation to fetch large size data set (>5 Gb). This limitation has been addressed in the Strategy 2021 Update 2 release. 
 

Database Instance Configuration 

ODBC Version 2 and 3

By default, Strategy uses ODBC call 3.x for PostgreSQL since Strategy 2021 Update 1. 
For a single cube publish, using 3.x ODBC calls have approximately 30% performance improvement than using 2.0 ODBC calls. 
The below test is based on a 4M rows dataset which shows the throughput improvement with ODBC call 3.5 compared to ODBC call 2.0.

ka0PW0000001JllYAE_0EM4W0000028zKG.jpeg

You can change the ODBC version in the Strategy Database Instance level, as described below.
Select the database instances, right-click it, and select the Use 3.x ODBC calls option. 

ka0PW0000001JllYAE_0EM4W00000293tA.jpeg

Tuning PostgreSQL Database


This section describes tuning PostgreSQL for an the Analytical workload based on internal tests in Strategy.
The tuning requires administrative privilege against the PostgreSQL database server.

Parameter & Operation

Initial Value

Changed To

Performance Gain

% Execution Time of Original Value 

shared_buffer

128 MB

1280 MB

Metadata garbage collection time reduces from 5724 sec to 47 sec.

0.82%

work_mem

4 MB

50 MB

Metadata garbage collection time reduces from 47 sec to 17 sec.

36.2%

maintenance_work_mem

64 MB

1536 MB

4 times performance gain on the index creation on platform_analytics_wh.access_transactions table.

25%

max_parallel_maintenance_workers

2

8

random_page_cost

4

1

Execution time of query "select tran_id from access_transactions as a join temp_lu_object as b on a.object_id=b.object_id limit 1000;" in Platform Analytics get reduced from 214185 ms to 2.3 ms. 

<0.01%

enable_nestloop

on

off

Table scan time reduces from 585 sec to 8 sec. 

1.36%

vacuum

NA

NA

Table size reduces from 110 GB to 1.5 MB after vacuuming the table platform_analytics_wh.access_transactions

<0.01%

  • shared_buffers
    • Sets the amount of memory the database server uses for shared memory buffers. The default is typically 128 megabytes (128MB), but might be less if your kernel settings do not support it. This setting must be at least 128 kilobytes. However, settings significantly higher than the minimum are usually needed for good performance.
    • If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system. There are some workloads where even larger settings for shared_buffers are effective, but because PostgreSQL also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount.
    • On systems with less than 1GB of RAM, a smaller percentage of RAM is appropriate, so as to leave adequate space for the operating system.
  • work_mem
    • Sets the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files. The default value is 4 megabytes (4MB).
    • Increase the value could improve the performance of sorting operation. Sort operations are used for ORDER BY, DISTINCT, and merge joins.
    • Related article: KB483745  
    • Example:
      By increasing work_mem from 4MB to 50 MB, Garbage Collection of the PostgreSQL metadata improves from 47 seconds to 17 seconds.
  • maintenance_work_mem
    • Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.
    • Example:
      By setting maintenance_work_mem='1536MB' and setting max_parallel_maintenance_workers=8, the result is a four times performance gain on the index creation on platform_analytics_wh.access_transactions table.
  • random_page_cost
    • Sets the planner's estimate of the cost of a non-sequentially-fetched disk page. The default is 4.0.
    • Reducing this value relative to seq_page_cost will cause the system to prefer index scans; raising it will make index scans look relatively more expensive.
    • Example:
      By setting rondom_page_cost from 4 to 1, we reduce the execution time of query “select tran_id from access_transactions as a join temp_lu_object as b on a.object_id=b.object_id limit 1000;” in Platform Analytics from 214185 ms to 2.3 ms.
  • nestloop
    • If your queries take a long time to execute, you can check the execution plan of the query to see if nested loops are used by executing the Explain analyze %detail query% in the dbquery tool.
    • Related article: KB483814
    • Example:
      In the example of KB483814, nestloop is used to scan the access transactions table 259 times, which takes 585 seconds. After disabling nestloop, the same query takes 8 seconds to execute.
  • VACUUM
    • VACUUM reclaims storage occupied by dead tuples. In a normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore, it's necessary to do a VACUUM periodically, especially on frequently-updated tables.
    • Example:
      The test table 'platform_analytics_wh.access_transactions' has only 621 rows, but the table size is 110GB, and the free percent of data file is 99+%. It will lead to unnecessary data IO and bad performance for all queries on this table. After shrinking this table (using command VACUUM full platform_analytics_wh.access_transactions;), the table size could be reduced to 1.5MB. The performance will improve significantly.
      • Select count(*) from platform_analytics_wh.access_transactions, takes less than 1 second. Before shrinking, it need 270 seconds.
      • CREATE INDEX access_transactions_update_ts2 ON platform_analytics_wh.access_transactions(update_ts); takes less than 1 second.

New Strategy Features

Parameterized Queries Since Strategy 2021


Parameterized queries are SQL queries that can use placeholders for data. Using placeholders allows these queries to be re-used.For details, see KB484512. The following is an example of a parameterized query:


select	distinct "a11"."CUSTOMER_ID"  "CUSTOMER_ID",
	CONCAT("a11"."CUST_LAST_NAME", "a11"."CUST_FIRST_NAME")  "CustCol_12"
from	"lu_customer"	"a11"
where	CONCAT("a11"."CUST_LAST_NAME", "a11"."CUST_FIRST_NAME") like ?
with parameters:
	%?%

 

How to Enable/Disable Parameterized Query

Starting in Strategy 2021, parameterized query is turned on by default. You can enable or disable this feature in the Database Connections dialog, as shown below.
Select the Use parameterized queries checkbox.

ka0PW0000001JllYAE_0EM4W00000295tg.jpeg

 

Data Engine Version Upgrade Since Strategy 2021


In our internal UB performance test, when using Data Engine version 2021 in Strategy 2021, there is an overall 35% performance gain compared to using Data Engine version 10.4.
Starting in Strategy 2021, the default Data Engine version is 2021. For detailed changes, see Data Engine changes in MicroStrategy 2021 release.
For example, with Data Engine version 2020 and above, Strategy pushes filters down to the database for execution to minimize the intermediate results set size. This pushdown can significantly improve the performance of dossier execution. For more details, see KB483573.
The following chart shows the throughput change with different Data Engine versions. The throughput of Data Engine version 2021 is higher than Data Engine 10.4. The test is executed with 50 concurrent connections.

ka0PW0000001JllYAE_0EM4W00000295u0.jpeg

Unified Quoting Identifier Since Strategy 2020


Starting in Strategy 2020, a new feature is introduced where the correct quotes is applied to all identifiers.
For PostgreSQL, double quotes are used in SQL generation. The following is an example SQL:


select	"a11"."badge_id"  "badge_id_bigint",
	"a11"."Badge_Name"  "Badge_Name"
from	"idm_intel2"."badge"	"a11"
where	(("a11"."badge_id")
 in	(select	"s23"."badge_id"
	from	"idm_intel2"."LU_BD_HISTORY"	"s22"
		join	"idm_intel2"."REL_BADGE_BDHISTORY"	"s23"
		  on 	("s22"."bd_history_id" = "s23"."bd_history_id")
	where	"s22"."directorate_id" in (40)))

In Strategy 2021, we applied a more visible entry in VLDBs for user to turn on/off the quoting feature in Strategy Developer (under Project Configuration > Advanced > Project-Level VLDB settings > Query Optimizations).

ka0PW0000001JllYAE_0EM4W00000296RY.jpeg

Other VLDB Settings That Affect Query Generation


The Strategy platform provides VLDB settings for all supported RDBMS platforms to generate optimized SQL that takes advantage of database-specific functionality.

Authentication Methods


The following table describes the authentication methods supported by PostgreSQL. For details, please refer to this PostgreSQL documentation.

Authentication Method

Note

MicroStrategy Supported? 

Password Authentication: 
  • scram-sha-256
  • md5
  • password

Requires that user sends a password.

Yes-
A known issue has been fixed in MicroStrategy 2021 Update 3 where SSL connection would fail with the scram-sha-256 method when using MicroStrategy ODBC driver for PostgreSQL.

GSSAPI Kerberos Authentication

Relies on a GSSAPI-compatible security library. Typically, this is used to access an authentication server, such as a Kerberos or Microsoft Active Directory server.

Yes-
Supported since MicroStrategy 2020 Update 3 and MicroStrategy 2021. For details, see KB484540 and KB484541.

Certificate Authentication

Requires an SSL connection and authenticates users by checking the SSL certificate they send.

Yes-
Supports only with native ODBC driver. For details, see KB484539.

LDAP Authentication 

Relies on an LDAP authentication server.

Yes

Other authentication modes

See this PostgreSQL documentation.

On the roadmap. 

 

Data Encryption: SSL/TLS


Transport Layer Security (TLS) protocol enables a client to validate the certificate of a server, and provides private communication between the client and server by use of encryption. TLS configuration requires both the server and client configuration. Your system/database administrator may have already configured on the server side, or you can refer to the PostgreSQL documentation for more details. For client configuration of Strategy as the PostgreSQL client, refer to KB44454. Strategy uses TLS v1.2 by default.

Examples of PostgreSQL SSL Connection Strings

  • Strategy ODBC Driver for PostgreSQL Wire Protocol:
    DRIVER={Strategy ODBC Driver for PostgreSQL Wire Protocol}; HostName={hostname}; PortNumber={port}; Database={database};AuthenticationMethod=1; EncryptionMethod=1; ValidateServerCertificate=0;

  • PostgreSQL Unicode (x64):
    DRIVER={PostgreSQL Unicode(x64)}; Servername={hostname}; Port={port}; Database={database}; DriverUnicodeType=1;UseDeclareFetch=1;Fetch=1000;SSLmode=require;

  • com.Strategy.jdbc.postgresql.PostgreSQLDriver:
    JDBC;DRIVER={com.Strategy.jdbc.postgresql.PostgreSQLDriver};URL={jdbc:Strategy:postgresql://{hostname}:{port};DATABASENAME={database};EncryptionMethod=SSL;ValidateServerCertificate=false;fetchTWFSasTime=TRUE};

Appendixes

Connection Options for PostgreSQL Driver

  • MicroStrategy ODBC Driver for PostgreSQL Wire Protocol
  • PostgreSQL Unicode (x64)
  • com.microstrategy.jdbc.postgresql.PostgreSQLDriver

Strategy Database Instance Configuration

Job Prioritization


Administrators can, based on priority, specify the number of warehouse connections that are required for efficient job processing. There are three possible priorities for a job: high, medium and low.
Administrators are not required to set medium and high connections, but must set at least one low connection, because low priority is the default job priority.
The optimal number of connections is dependent on several factors, however the main criterion to consider when setting the number of connections is the number of concurrent queries the warehouse database can support.

ka0PW0000001JllYAE_0EM4W00000297Ez.jpeg

Connection Management


The administrator could also manage database connection threads by some given fields. The settings which used to control database connection are mainly located on Database Instance level, see below screenshots:

ka0PW0000001JllYAE_0EM4W00000297F4.jpeg

How to Edit VLDB Settings

  • Project Level: Right click on the project name > Project Configuration > Advanced > Configure
ka0PW0000001JllYAE_0EM4W00000297F9.jpeg
  • Database Instance Level: Database Instance > VLDB Properties
ka0PW0000001JllYAE_0EM4W00000297FE.jpeg
  • Report Level: Data > VLDB Properties
ka0PW0000001JllYAE_0EM4W00000297Fn.jpeg
  • Unhide Parallel Options: Tools > Show Advanced Settings
ka0PW0000001JllYAE_0EM4W00000297Fx.jpeg

 


Comment

0 comments

Details

Knowledge Article

Published:

July 13, 2021

Last Updated:

March 21, 2024