Ever since Pivotal announced that Greenplum Database would become open source, and we started merging newer PostgreSQL versions, we have also supported the development of the upstream project. This happens in various ways: by developing and contributing new features, reviewing patches from other contributors, or sending bugfixes upstream.
Not only is the PostgreSQL project an awesome project with a thriving community that is fun to work with; in the long run this strategy also has several advantages for us.
Eventually the Greenplum Database merge process might catch up with PostgreSQL releases, and future versions might stay close to the upstream version. Every feature which is not only in our product but also available in upstream makes the merge process easier for us.
This strategy further closes the gap between the original project and the fork, and makes it easier for users to use both projects in parallel.
Last but not least, bringing features into upstream PostgreSQL “exposes” these features to a broader range of reviewing developers as well as users, and the feature is tested on a wider range of platforms, compared to what is supported by Greenplum Database.
With PostgreSQL 11 entering the home stretch, it is time to look back and see what we contributed, and where we might be able to improve in the future.
This chapter lists three different kinds of contributions:
That said, this list is not complete, because we do not count all the small fixes and patches we provided. Let’s just focus on the major items here.
This fixes a problem with alignment of memory chunks, which failed on several 32-bit systems.
A 64-bit atomic variable was used without first being initialized with pg_atomic_init_u64(), which leads to a failure.
This is a complex patch from multiple authors, which allows adding optional compression in SP-GiST leaf tuples. Originally this functionality was left out intentionally, but it looks like PostGIS can make good use of it.
When constants from a flattened subquery are used in a grouping set, the planner might merge these constants into outer expressions. The grouping set will then fail.
While we are not the actual authors of this patch, we provided the underlying framework. This feature allows the parallel execution infrastructure in PostgreSQL to be used to build or rebuild a btree index.
One could say that even Greenplum Database is no longer based on PostgreSQL 8.2, and therefore code which deals with pre-8.2 module handling can be removed. In reality, PostgreSQL 8.2 has long been EOL, and therefore such #ifdefs can be removed. We found this while merging a newer PostgreSQL version into Greenplum Database.
While the functionality of server-side CRL and CA files had been removed earlier, the documentation was not updated properly. This patch fixes that.
While earlier implementations used random directory names, and removed the test output files after the test, this patch moves to use static names and preserves the output of failing tests. This makes it easier to debug a failing TAP test.
A connection string can include spaces in items, not only between items. Therefore the items, or values, must be quoted properly. It is not enough to just quote the entire string.
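The per-value quoting rule can be sketched as follows. This is a minimal illustration in Python, not the code from the patch, and the helper names `quote_conninfo_value` and `build_conninfo` are hypothetical. It assumes the libpq keyword/value syntax, in which a value is wrapped in single quotes and any embedded single quote or backslash is escaped with a backslash.

```python
def quote_conninfo_value(value: str) -> str:
    """Quote one value for a libpq-style keyword=value connection string."""
    # Escape backslashes first, then single quotes, so the backslashes
    # added for the quotes are not escaped a second time.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    # Quoting every value is always valid and also covers empty values.
    return f"'{escaped}'"


def build_conninfo(params: dict[str, str]) -> str:
    """Join keyword=value pairs, quoting each value individually."""
    return " ".join(
        f"{key}={quote_conninfo_value(val)}" for key, val in params.items()
    )


# Hypothetical parameters; the space in "my app" stays inside one value.
conninfo = build_conninfo({"host": "localhost", "application_name": "my app"})
print(conninfo)
```

Quoting each value separately keeps a space inside a value from being read as a separator between items, which is exactly what quoting the whole string cannot achieve.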
Across the lexer code, sometimes a simple strcmp was used to match keywords, sometimes the more sophisticated pg_strcasecmp. The latter adds additional overhead, because the identifiers are already made lowercase. This patch changes the code base to always use strcmp to match identifiers in the lexer.
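The idea behind this change can be illustrated with a small Python sketch. It is not PostgreSQL code: the names `normalize_identifier`, `is_keyword`, and the keyword set are hypothetical, and Python's `str.lower()` only approximates the case folding PostgreSQL performs. The point is that once an identifier has been folded to lowercase in one place, every later comparison can be an exact match.

```python
KEYWORDS = frozenset({"select", "from", "where"})  # illustrative subset only


def normalize_identifier(raw: str) -> str:
    """Fold an unquoted identifier to lowercase once, when it is scanned."""
    return raw.lower()


def is_keyword(ident: str) -> bool:
    """Exact match suffices because ident is expected to be lowercased already."""
    return ident in KEYWORDS


token = normalize_identifier("SELECT")
print(is_keyword(token))
```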
Previously the isolationtester had a limit of 1024 bytes for each SQL query. That seems a bit low these days, and people already ran into this limit. This patch removes this limit and makes the buffer resizable.
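A resizable buffer of this kind can be sketched in a few lines. The following Python example is only an illustration of the general technique, not the isolationtester code (which is written in C); the class name `GrowableBuffer` and the doubling strategy are assumptions made for the sketch.

```python
class GrowableBuffer:
    """A byte buffer that grows on demand instead of rejecting long input."""

    def __init__(self, initial_capacity: int = 1024) -> None:
        if initial_capacity <= 0:
            raise ValueError("initial_capacity must be positive")
        self._data = bytearray(initial_capacity)
        self._length = 0

    @property
    def capacity(self) -> int:
        return len(self._data)

    def append(self, chunk: bytes) -> None:
        needed = self._length + len(chunk)
        new_capacity = self.capacity
        # Double until the chunk fits, so repeated appends stay cheap on average.
        while new_capacity < needed:
            new_capacity *= 2
        if new_capacity != self.capacity:
            self._data.extend(bytes(new_capacity - self.capacity))
        self._data[self._length:needed] = chunk
        self._length = needed

    def value(self) -> bytes:
        return bytes(self._data[:self._length])


# A query longer than the old 1024-byte limit now simply grows the buffer.
buf = GrowableBuffer()
query = b"SELECT " + b", ".join(b"col%d" % i for i in range(400)) + b" FROM t;"
buf.append(query)
print(len(query) > 1024, buf.value() == query)
```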
The documentation did not mention WaitForBackgroundWorkerShutdown(). This patch fixes that oversight and also updates the documentation for WaitForBackgroundWorkerStartup().
For WAL consistency checking, previously only the LSN was masked when pages were compared. This fix adds the page checksum to the exclude list as well, because the checksum covers the LSN: a changed LSN changes the checksum, and thereby the result of the consistency check, even though the data in the page itself is unchanged.
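Masking before comparison can be sketched as follows. This Python example is a simplified illustration rather than the actual PostgreSQL implementation: the function names are hypothetical, and the byte offsets assume that the LSN occupies the first 8 bytes of the page header and the checksum the 2 bytes after it.

```python
LSN_OFFSET, LSN_SIZE = 0, 8            # assumed position of the page LSN
CHECKSUM_OFFSET, CHECKSUM_SIZE = 8, 2  # assumed position of the page checksum


def mask_page(page: bytes) -> bytes:
    """Return a copy of the page with the LSN and checksum bytes zeroed."""
    masked = bytearray(page)
    masked[LSN_OFFSET:LSN_OFFSET + LSN_SIZE] = bytes(LSN_SIZE)
    masked[CHECKSUM_OFFSET:CHECKSUM_OFFSET + CHECKSUM_SIZE] = bytes(CHECKSUM_SIZE)
    return bytes(masked)


def pages_consistent(primary_page: bytes, replayed_page: bytes) -> bool:
    """Compare two page images while ignoring the masked header fields."""
    return mask_page(primary_page) == mask_page(replayed_page)


# Two made-up page images: same payload, different LSN and checksum.
payload = b"same tuple data" * 4
page_a = (1000).to_bytes(8, "little") + (0x1234).to_bytes(2, "little") + payload
page_b = (2000).to_bytes(8, "little") + (0xBEEF).to_bytes(2, "little") + payload
print(pages_consistent(page_a, page_b))
```

Without masking the checksum field, the two pages above would be reported as different although their payload is identical.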
When a standby is renamed, there is a race condition in which commands can be sent asynchronously to the old name. This patch fixes that problem.
PageGetLSN() is only supposed to be used when a process holds an exclusive lock on the buffer. If a process holds only a shared lock on the buffer, BufferGetLSNAtomic() must be used instead.
A newly introduced assertion uncovered some places which did not follow this rule. This patch fixes all but one of them, and the remaining call site was verified to be a non-issue.
Although we have a number of people working on all kinds of PostgreSQL-related projects, the following people contributed to the development of PostgreSQL 11:
We have some ideas for upcoming PostgreSQL versions.
Greenplum Database has had table partitioning for a long time, a feature which was only recently added to PostgreSQL. We want to share some of the experience we gained with this feature and enhance the functionality in PostgreSQL.
Another existing limitation which we would like to loosen is the maximum length of the text type. Currently that type can hold roughly 1 GB, but for some use cases that is not enough.