Pivotal’s SQL Optimizer, GPORCA, does not handle GiST indexes, making any GPORCA generated plan extremely slow when the input grew large. In this blog post, we will look at what GiST indexes are, how we implemented them in GPORCA, and the resulting performance improvement.
GiST stands for Generalized Search Trees. These trees are a template structure that allows a user to create an index in a database on any complex data type, provided they define a set of seven methods. It is a balanced tree-structured access method that allows users to do more than just the standard less than “<” , equal to “=” or greater than “>” queries when doing an index scan. GiST indexes are particularly great for ranges as well as full text search. Furthermore, using the user-defined methods, GiST tries to cluster data in a way that creates as little overlap as possible.
In order to create a GiST index, the user must define 7 functions: the Consistent, Union, Compress, Decompress, Penalty, Picksplit and Same methods. Then GiST will do the rest of the underlying work required of an index, such as reindex-ing and vacuuming. More information about GiST indexes can be found here: http://gist.cs.berkeley.edu/ or at PostgreSQL 9.5 here: https://www.postgresql.org/docs/9.5/static/gist.html
The user must also define functions for the custom data type that would be used in the predicate. For example, PostGIS has a function called ST_DWithin that returns true given the two points are within a specified distance of each other. We could then use it in a query such as
SELECT * FROM foo, bar where ST_DWithin(foo.a, bar.b, 0.0005), which would give all the rows where point ‘a’ from foo and point ‘b’ from bar are within 0.0005 meter from each other.
Greenplum DB ships with operator classes for some data types (such as Point, Box or Polygon) that can use a GiST index but it is also possible to install extensions like PostGIS that include data types like geometry which can can be used in a GiST index.
In the Greenplum Database, there are two query optimizers: Planner and GPORCA (designed specifically for the MPP environment to help accelerate queries). Currently, Planner in Greenplum Database supports GiST indexes and can generate an optimal plan that efficiently uses the GiST index available. However, GPORCA - Pivotal’s SQL Optimizer - is not GiST aware and therefore selects a query plan that does not use any available GiST indexes. The result: a query plan that takes orders of magnitudes longer than a plan that uses the GiST index.
Say that we had two tables called foo and bar that each had a column called
geom of type geometry. Geometry is a GiST-indexable data type from PostGIS that is commonly used for spatial and geographical queries. We now want to find the number of points that are within 0.0005 meters of each other.
Since it is not GIST-aware, the optimal plan generated by GPORCA uses two Table Scans inside a nested loop join. This can be significantly slow in execution if the tables have a large number of rows.
EXPLAIN SELECT count(*) FROM foo, bar WHERE ST_DWITHIN(foo.geom, bar.geom, 0.0005); QUERY PLAN ---------------------------------------------------------------------------------------------------- Aggregate (cost=0.00..11647466.55 rows=1 width=8) -> Gather Motion 3:1 (slice2; segments: 3) (cost=0.00..11647466.55 rows=1 width=8) -> Aggregate (cost=0.00..11647466.55 rows=1 width=8) -> Nested Loop (cost=0.00..11647466.55 rows=26995466 width=1) Join Filter: foo.geom && st_expand(bar.geom, 0.0005::double precision) AND bar.geom && st_expand(foo.geom, 0.0005::double precision) AND _st_dwithin(foo.geom, bar.geom, 0.0005::double precision) -> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..431.18 rows=300 width=32) -> Table Scan on bar (cost=0.00..431.00 rows=100 width=32) -> Table Scan on foo (cost=0.00..443.11 rows=333654 width=32) Optimizer status: PQO version 2.65.1 Time: 226.372 ms SELECT count(*) FROM foo, bar WHERE ST_DWITHIN(foo.geom, bar.geom, 0.0005); count ------- 1921 (1 row) Time: 302930.381 ms
This plan generated by GPORCA takes a total of 303 seconds in execution, which is quite long for a simple nested loop join. In contrast, the same query run by the planner using the GiST index in its plan, produced the results in under a second.
EXPLAIN SELECT count(*) FROM foo, bar WHERE ST_DWITHIN(foo.geom, bar.geom, 0.0005) QUERY PLAN ---------------------------------------------------------------------------------------------------- Aggregate (cost=702871.96..702871.97 rows=1 width=8) -> Gather Motion 3:1 (slice2; segments: 3) (cost=702871.89..702871.94 rows=1 width=8) -> Aggregate (cost=702871.89..702871.90 rows=1 width=8) -> Nested Loop (cost=93.11..702871.64 rows=34 width=0) Join Filter: bar.geom && st_expand(foo.geom, 0.0005::double precision) AND _st_dwithin(foo.geom, bar.geom, 0.0005::double precision) -> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..18.00 rows=300 width=32) -> Seq Scan on bar h (cost=0.00..6.00 rows=100 width=32) -> Bitmap Heap Scan on foo (cost=93.11..753.92 rows=34 width=32) Recheck Cond: foo.geom && st_expand(bar.geom, 0.0005::double precision) -> Bitmap Index Scan on foo_geom (cost=0.00..93.09 rows=34 width=0) Index Cond: foo.geom && st_expand(bar.geom, 0.0005::double precision) Time: 1.988 ms SELECT count(*) FROM foo, bar WHERE ST_DWITHIN(foo.geom, bar.geom, 0.0005); count ------- 1921 (1 row) Time: 87.074 ms
As can be seen above, the plan generated by the planner is at least 3000x (87ms vs 302930ms) faster than GPORCA.
In order to find the fastest way to execute these SQL queries using GiST indexes, GPORCA needs to become GiST aware. To achieve this, GPORCA needs to first receive information regarding the GiST index and it needs to know how to generate plans using the index information.
When a SQL statement is given, the information is first translated into DXL (a system-independent XML representation of the query) and sent to GPORCA to be optimized. During this process, only the information necessary for the query and basic information about the involved tables are sent. This can include statistic information, whether or not the table contains an index, and what type of index it is. Since GPORCA did not implement support for GiST indexes, we did not send any GiST index information over at all. This meant that any table with a GiST index was sent to GPORCA as a table that contained no index at all.
The first step of this process: send information about GiST indexes to GPORCA. In Planner, GiST indexes are treated as a general index. What this means is that GiST indexes can follow either the Bitmap Index path or the B-Tree index path when creating a plan. That is, during the intermediate stages of planning, GIST indexes appear either as Bitmap Indexes or B-Tree indexes. But, when the plan is finally executed, , the executor recognizes (based off the index’s unique access method id) that the actual index to be used during execution is the GiST index and not the index type printed in the plan.
When sending index information, GPORCA requires a few key components: The index’s unique access method id, the index’s type and the columns the index is on. For Bitmap and B-Tree, which GPORCA is already aware of, the index type is, respectively, Bitmap and Btree.
Our next step was to determine whether a new index type was required for GiST indexes. We tried sending over GiST index information with the correct unique access method id, but with the index type as type Bitmap. We quickly realized that though this was feasible, there are certain conditions that are GiST specific. For example, Bitmap indexes can only be used if the predicate is a standard predicate. However, with GiST, standard predicates are almost never used. In order to make an ORCA generated plan using GiST while following the Bitmap Index path, we needed to either set the predicate type as a standard query (which is not ideal) or we needed to be able to differentiate when we were working with a GiST index versus a Bitmap Index. When sending the index over as a Bitmap type, we lost the ability to make such distinctions within GPORCA’s optimization process and the ability to generate a B-Tree path for GiST indexes. So, in order to deal with this, any solution to make GPORCA GiST aware would involve the creation of a new index type within GPORCA specifically for GiST so that a distinction could be made between different index types when necessary.
With the addition of the new GiST index type, we considered two implementation alternatives in GPORCA:
The first alternative is to mimic what the planner does. GPORCA could allow GiST indexes to take either the B-Tree or Bitmap path, generating alternatives for both before picking an optimal plan during costing.
|- Use of existing optimized paths||- Costing would be done based off path taken instead of a GiST specific cost model|
|- No additional changes necessary to be able to execute the plan generated||- GiST indexes would be disabled if bitmap and btree indexes are disabled.|
|- Similar to an implementation that has already proven to work (planner)|
|- Support for partitioned tables and Append-Only tables would alredy be implemented|
Instead of allowing GiST indexes to follow either the B-Tree or Bitmap path, GPORCA would have a separate path in the code base (much like how Bitmap and Btree do) that would be specific to GiST. This would allow a different alternative altogether separate from the B-Tree and Bitmap path with its own costing and transforms.
|- A GiST specific path that could be configurable via GUCs||- Duplication of existing code by creating new transforms/classes|
|- A cost model specific to GiST||- Addition of Executor nodes/or a translation back into existing nodes|
|- Adding support for partitioned tables and Append-Only tables would be slow and incremental|
When exploring the first alternative, we realized that the addition of the new index type and a few extra conditional checks, GiST would have full support in GPORCA. This includes partitioned tables as well as Append Only Row / Column Oriented tables. In contrast, research into the second alternative indicated that much of the Bitmap and B-Tree transforms would have been duplicated in the process of creating a GiST transform. An additional node would also need to be added to the executor for a GiST specific scan as well.
By choosing the first alternative we were able to take advantage of the existing paths for indexes in GPORCA, allowing for full GiST support while minimizing code duplication. Going back to our motivating PostGIS example, we see that plan generated by GPORCA now matches that created by the planner.
EXPLAIN SELECT count(*) FROM foo, bar WHERE ST_DWITHIN(foo.geom, bar.geom, 0.0005); QUERY PLAN ---------------------------------------------------------------------------------------------------- Aggregate (cost=0.00..344541.02 rows=1 width=8) -> Gather Motion 3:1 (slice2; segments: 3) (cost=0.00..344541.02 rows=1 width=8) -> Aggregate (cost=0.00..344541.02 rows=1 width=8) -> Nested Loop (cost=0.00..344541.02 rows=26995466 width=1) Join Filter: true -> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..431.18 rows=300 width=32) -> Table Scan on bar (cost=0.00..431.00 rows=100 width=32) -> Bitmap Table Scan on foo (cost=0.00..344015.35 rows=37963 width=1) Recheck Cond: foo.geom && st_expand(bar.geom, 0.0005::double precision) Filter: bar.geom && st_expand(foo.geom, 0.0005::double precision) AND _st_dwithin(foo.geom, bar.geom, 0.0005::double precision) -> Bitmap Index Scan on foo_geom (cost=0.00..0.00 rows=0 width=0) Index Cond: foo.geom && st_expand(bar.geom, 0.0005::double precision) Optimizer status: PQO version 2.65.1 Time: 265.687 ms SELECT count(*) FROM foo, bar WHERE ST_DWITHIN(foo.geom, bar.geom, 0.0005); count ------- 1921 (1 row) Time: 309.304 ms
Notice that GPORCA now uses a Bitmap Index Scan in the plan generated instead of a Table Scan. The use of a Bitmap Index Scan in the above plan indicates that the GiST index took the Bitmap path to create the plan. While the plan itself says Bitmap, when the query goes to execution, the actual index used is the GiST index.
The query execution time reduced to 309 milliseconds from 300 seconds, which is 1000x faster than what it was performing before GiST support. Meanwhile, GPORCA’s query optimization time stays the same (around 250 ms).
After an initial run of the
installcheck-good test suite for GPDB, we observed a clear performance improvement among the different test cases, even with the addition of 4 new tests.
|Test Name||Before||After||% Improvement|
|qp_gist_indexes2||196.23 sec||110.62 sec||44%|
|qp_gist_indexes3||19.83 sec||13.75 sec||33%|
|qp_gist_indexes4||67.67 sec||50.66 sec||25%|
While GiST is now supported in GPORCA, there is still more work to be done. In regards to GiST indexes themselves, they currently do not support partial indexes or index expressions (such as IS NULL or NOT). The cost model still follows that of the Bitmap/B-Tree indexes and further performance tests are necessary to determine the best cost model for GiST indexes.
Additionally, there are other indexes that are not yet supported in GPORCA such as GIN or Hash indexes. However, these can be implemented in a manner similar to GIST index.
GiST indexes are a versatile template index structure that allows for the creation of indexes on custom data types. In the Greenplum Database, GPORCA originally did not handle GiST indexes, making any GPORCA generated plan extremely slow when the input grew large. We compared two different alternatives and chose the path that avoided excessive code duplication. Our final fix took advantage of existing index paths in GPORCA to allow the creation of GiST index plans. This created no/minor differences in the time it took to optimize, but is 1000x faster to run than the original plan.
 Consistent returns false if, given a predicate on a tree page, the user query and predicate is not true, and returns maybe otherwise.
 Union consolidates information in the tree.
 Compress converts the entry into a suitable format for storage. This is usually what makes GiST indexes lossy.
 Reverse of compress.
 Penalty tells you the cost of inserting the entry into a path would be, it will pick the cheapest path.
 PickSplit helps decide which entries go to which page when an insert requires a page split.
 Same returns true if the two entries are the same.