Thomas Koch | 16 Sep 11:53 2011

Improve ASF commit process / GIT

Hi,

I'm currently doing a university lab that involves contributing to the 
ZooKeeper project. For my report I summarized the process of getting a patch 
into an ASF project. You can skip the description and jump to my offer to 
help make GIT available at the ASF.

<How-It-Is-Today>
The workflow for making changes in the ZooKeeper project is very 
time-consuming. First a patch file needs to be created. This patch then has 
to be uploaded to the issue tracker as well as to the review system. However, 
one usually wants to upload the patch to the review system only after the 
continuous integration system has tested it successfully. Such a test run is 
triggered by a cron job that scans the issue tracker every 5 minutes for new 
patches. The test run itself takes about 20 minutes.
Uploading the patch to the review system is again a multi-step process: open 
the web interface, select the project, select the patch file in the browser's 
file dialog, select the reviewers group, select the corresponding issue, and 
enter a summary.
These steps of course need to be repeated if the reviewer demands changes. 
Care must be taken to keep the patch files on the issue tracker and in the 
review system in sync to avoid confusion.
But even if no updates to the patch are demanded, it may be necessary to 
update the patch because changes have been made to the project's trunk in the 
meantime and the current patch can no longer be applied.
As if that weren't complicated enough, the process does not work for binary 
files, since patch files cannot express them, and it is brittle in general. 
For example, it seems that the continuous integration system does not always 
apply patches to a clean checkout of the trunk, but sometimes to a dirty state.
</How-It-Is-Today>
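
To put a number on the waiting time described above, here is a rough 
back-of-the-envelope sketch. It uses only the two figures given (5-minute 
polling, about 20 minutes per test run) and assumes, for simplicity, that a 
test run starts as soon as the cron job picks up the patch, with no queueing. 
The function name is made up for illustration.

```python
from typing import Tuple

# Figures taken from the description above.
POLL_INTERVAL_MIN = 5   # cron job scans the issue tracker every 5 minutes
TEST_RUN_MIN = 20       # one test run takes about 20 minutes


def feedback_delay_minutes(revisions: int) -> Tuple[int, int]:
    """Return (best, worst) total minutes spent waiting for test results.

    Assumption: no queueing, so each revision waits between 0 and
    POLL_INTERVAL_MIN minutes to be picked up, plus one full test run.
    """
    best = revisions * TEST_RUN_MIN
    worst = revisions * (POLL_INTERVAL_MIN + TEST_RUN_MIN)
    return best, worst


if __name__ == "__main__":
    for n in (1, 3):
        print(n, "revision(s):", feedback_delay_minutes(n), "minutes")
```

Under these assumptions a patch that goes through three revisions spends 
between 60 and 75 minutes just waiting for test results, not counting the 
manual upload steps.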

To solve the headaches described above, I propose the introduction of the 
review system Gerrit[1] together with GIT. Then the patch process for the 
committer would be reduced to a simple invocation of "git push gerrit".

[1] http://en.wikipedia.org/wiki/Gerrit_%28software%29

In the background the following things happen:
 * Gerrit either creates a new change to be reviewed or detects an update to 
an existing review
 * The change description is taken from the commit message
 * Gerrit scans the commit message for an issue number and can post a comment 
to the issue tracker
 * The jenkins-gerrit-plugin _immediately_ starts to test the change and posts 
its result as a comment to Gerrit
 * After a successful review Gerrit automatically merges the change into the 
trunk, using the merge capabilities of GIT. Only in rare cases would it be 
necessary to update patches because the same file has changed in trunk.
 * Problems with dirty checkouts in Jenkins or binary files don't exist since 
Jenkins uses GIT instead of handling patch files. Issue tracker and review 
system syncing is not a problem because patches are only in the review system. 
Votes are recorded in Gerrit.
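
To illustrate the issue-number step in the list above, here is a minimal 
sketch of how a commit message could be scanned for issue keys. This is not 
Gerrit's actual implementation. The key format (project name in upper case, 
a dash, a number), the sample key "ZOOKEEPER-1234" and the function name are 
assumptions made for the example.

```python
import re
from typing import List

# Assumed key format: upper-case project name, a dash, a number,
# e.g. "ZOOKEEPER-1234". The pattern is illustrative only.
ISSUE_KEY = re.compile(r"\b([A-Z][A-Z0-9]+-\d+)\b")


def find_issue_keys(commit_message: str) -> List[str]:
    """Return the distinct issue keys in a commit message, in order of
    first appearance."""
    seen: List[str] = []
    for key in ISSUE_KEY.findall(commit_message):
        if key not in seen:
            seen.append(key)
    return seen


if __name__ == "__main__":
    message = (
        "ZOOKEEPER-1234. Fix session expiry\n"
        "\n"
        "Follow-up to ZOOKEEPER-1200 and ZOOKEEPER-1234.\n"
    )
    print(find_issue_keys(message))
```

Such a simple pattern also matches strings like "UTF-8", so a real setup 
would probably restrict it to known project names.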

I've set up a Gerrit-Jenkins combo on my own server for my work on ZooKeeper:
Gerrit: http://koch.ro:8081
Jenkins: http://koch.ro:8080

How can I help? I live in Switzerland. I'm not an Apache committer.

Best regards,

Thomas Koch, http://www.koch.ro

