
Hello! We are the development team of the Greengage DB open-source project, and we would like to invite you to join us as a contributor. In this article, we will discuss how you can become a part of the Greengage community and how our process of external patch review works. Together, we will see what happens to a patch before and after it is submitted for review, as well as explore the basic steps and rules that will help speed up the acceptance of your pull requests by the architectural committee of the project.
Before we start talking about Greengage DB, here are a couple of interesting figures that speak for themselves:
According to GitHub, up to 70% of employers consider contributions to open-source projects to be a plus when hiring.
According to StackOverflow, 85% of participants in open-source projects improve their coding skills.
So, as a contributor, you gain at least a public portfolio and the opportunity to improve your skills by working on interesting tasks. Other possible advantages of participating in open source are mentorship and networking in the community, as well as a sense of belonging to a large project.
Greengage DB is an open-source massively parallel processing (MPP) DBMS based on Greenplum, the MPP fork of PostgreSQL designed for OLAP workloads. Greengage DB is suitable for building data warehouses (DWH) and processing large volumes of data, measured in terabytes.
The Greengage DB project was created as an open-source alternative to Greenplum, whose repository was archived by Broadcom in May 2024. A team of community members who have made the largest contributions to Greenplum united around the Greengage DB independent fork, available under the Apache 2.0 license. The new project was officially announced at the "New time — new Greenplum" conference on September 19, 2024.
The Greengage DB project was initially developed by the architectural committee and key developers of Greenplum. A CI/CD process for accepting patches from external contributors is now set up in the Greengage DB public repository, and the building and publishing of Docker images with Greengage DB is automated. This means that you can start participating in the development of Greengage DB, helping to improve our project and expanding your own open-source portfolio.
To work on Greengage DB, you will need a number of skills. The project has modules in various programming languages, including Python (used to create utilities for managing a cluster, its configuration, etc.). However, the main part of Greengage DB is C and C++ code, so we have prepared a short checklist of requirements specifically for C developers:
Git skills (the project is hosted on GitHub).
Experience in systems programming in C and familiarity with the ANSI C standard.
Basic debugging (gdb) and profiling skills (Valgrind, perf, eBPF tools).
Basics of working in Linux environments and using command line tools.
English sufficient for reading documentation and for describing and discussing pull requests.
Knowledge of SQL and skills in managing relational databases (PostgreSQL, MySQL/MariaDB, MS SQL, etc.).
If your experience roughly matches this description, it’s time to try your hand at it!
Select a task. It’s great if you already have an idea of what kind of assistance you would like to provide to the project. In the near future, we plan to create an open bug tracker and a special list of desired features, from which external contributors can choose tasks to work on. As soon as such a list and bug tracker become available, we will notify you about it on our LinkedIn page.
Check out the requirements for pull requests. Following the requirements for pull request (PR) descriptions will help your PRs get accepted into the project faster. Review them in advance to avoid mistakes when submitting your changes to the Greengage DB architectural committee. The file with requirements is available in the Contributing section of the project repository.
Accept the agreement for contributors. The rights of the contributor and the project are protected by an individual contributor license agreement (ICLA) or corporate contributor license agreement (CCLA). Accepting such an agreement is a standard practice in the open-source world. The agreements are available on our website. Send signed copies of agreements to secretary@greenagedb.org. In the future, we will use eSignature solutions to make the process easier for our contributors.
Read the Code of Conduct. The main principles of how the Greengage DB community develops and how its members communicate are set out in the Community Code of Conduct. We recommend that you read this document, which is available on our project website in both Russian and English.
Fork the Greengage DB repository and clone your fork locally.
Build the project from the source code as described in the corresponding guide of our documentation.
Make your patch by editing or adding required files. Note that the changes should be located in a branch other than the main branch of the project.
Run related tests locally. Note that all new functionality contributed to Greengage DB should be covered by regression tests contributed alongside it. Before adding test files, read the corresponding section of the Greengage Pull Request Submission Guidelines.
If all local test runs are successful, submit a pull request (PR) to the main repository. Once submitted, the PR appears in the list of pull requests on the Greengage side. After a cursory review on our side, the build and automated-testing processes are started. If the test runs are successful, code review begins.
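As a rough illustration of the branch rule from the steps above, the following Python sketch checks that your changes are not sitting on the main branch before you open a PR. The branch name `main` and both helper names are assumptions made for this example and are not part of the Greengage DB tooling; `git rev-parse --abbrev-ref HEAD` is a standard Git command that prints the current branch name.

```python
import subprocess

# Assumption: the project's main branch is called "main"; adjust if it differs.
MAIN_BRANCH = "main"


def is_patch_branch(branch: str, main_branch: str = MAIN_BRANCH) -> bool:
    """Return True if `branch` is a usable working branch, i.e. not the main one."""
    name = branch.strip()
    # Git prints "HEAD" in detached-HEAD state, so that value is rejected too.
    return bool(name) and name not in (main_branch, "HEAD")


def current_branch(repo_dir: str = ".") -> str:
    """Ask Git for the name of the branch checked out in `repo_dir`."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if the directory is not a Git repository
    )
    return result.stdout.strip()
```

Inside a clone of your fork, `is_patch_branch(current_branch())` returns `False` while you are still on the main branch, which is a hint to create a separate branch for the patch first.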
For each PR, once a reviewer has confirmed that it contains no malicious or suspicious changes, the build and automated-testing processes are started for the branch with the changes. The merge is blocked until these processes complete successfully, even if the reviewer has selected the auto-merge option.
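The gating rule described above can be modeled in a few lines of Python. This is only a sketch of the logic as stated in this article, not the actual CI configuration of the repository; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PullRequestState:
    reviewer_confirmed: bool  # reviewer found no malicious or suspicious changes
    build_passed: bool        # build finished successfully
    tests_passed: bool        # automated tests finished successfully
    auto_merge: bool = False  # reviewer selected the auto-merge option


def ci_may_start(pr: PullRequestState) -> bool:
    """Build and tests run only after a reviewer has confirmed the changes."""
    return pr.reviewer_confirmed


def merge_allowed(pr: PullRequestState) -> bool:
    """Merging stays blocked until build and tests succeed.

    The auto_merge flag is deliberately not consulted: it cannot bypass the checks.
    """
    return ci_may_start(pr) and pr.build_passed and pr.tests_passed
```

For example, `merge_allowed(PullRequestState(True, True, False, auto_merge=True))` evaluates to `False`, because the tests have not passed even though auto-merge is selected.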
The testing process can and should be monitored. The general test plan is available on the Summary tab, where you can see what exactly is being tested and how. If you are interested in the content of a particular test, you can get test execution artifacts, as well as see the progress of the test run.
Next, a member of the architectural committee reviews the PR.
The review time depends on the complexity and size of the patch:
Complex and/or large PR — up to 8 weeks.
PR of medium complexity — up to 4 weeks.
Simple PR — up to 1 week.
Irrelevant PRs are ignored.
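To estimate when you can expect feedback at the latest, the limits above can be turned into a small date calculation. The category keys and the function name below are made up for this sketch; only the week limits come from the list above.

```python
from datetime import date, timedelta

# Upper bounds on review time in weeks, taken from the list above.
MAX_REVIEW_WEEKS = {"simple": 1, "medium": 4, "complex": 8}


def latest_review_date(submitted: date, complexity: str) -> date:
    """Return the latest date by which a review of this complexity is expected."""
    return submitted + timedelta(weeks=MAX_REVIEW_WEEKS[complexity])
```

For a PR of medium complexity submitted on March 3, 2025, `latest_review_date(date(2025, 3, 3), "medium")` gives March 31, 2025.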
The reviewer checks the PR description and all modified files, evaluates the benefits of the changes and their potential impact on the safety of the project's users, and can leave comments on the proposed code during the review.
For example, the reviewer can send some questions and suggestions as shown here.
In this case, the author was asked to modify the patch so that it takes into account all the specifics of the CREATE TABLE command syntax.
During the review, it is important for the author of the code to explain their position and the methods they used wherever the reviewer has questions.
Once the patch author and the reviewer have agreed that the changes can be accepted into the project, another member of the architectural committee (usually a more experienced one) performs a final review.
So, your first patch has been successfully accepted into the project. This might be enough for your needs. However, if you are an active contributor and regularly devote your time to the project, you might want to influence its development direction.
In Greengage DB, active contributors will also have the opportunity to participate in the decision-making process and help shape the future of the project. Here is a list of roles we plan to introduce as the number of contributors grows:
Community member — any contributor whose code has been accepted into the project.
Technical lead — contributor responsible for monitoring and sorting incoming patches.
Member of the architectural committee — contributor who votes on important decisions.
For now, the Greengage DB team, as the maintainer of the project (repository owner), appoints members of the architectural committee. In the future, the architectural committee will be composed of active community members nominated by technical leads, who are in turn selected by the architectural committee. The image below illustrates this process.
As soon as we implement this practice and appoint the first community members to the architectural committee, we will notify you on our LinkedIn page.
We welcome your participation in the development of Greengage DB and your suggestions for improving the project and processes. If you would like to discuss your idea with our team before writing the code, describe it in detail in the Issues section of the Greengage repository. You can also leave feature requests there if you are already using Greengage DB. The path of a contributor always starts with personal experience of using a product — our tutorial video and Greengage DB documentation can help you get started. Happy contributing!