Jan 21

Finding Problems in Commit Messages and Tickets When Migrating to GitHub

Sometimes, you commit a change, but then you realize that you did not mention the ticket ID in your commit message. Or, worse: you use the wrong ticket ID.

The usual remedy is adding this info to the ticket by hand. On the Agavi Trac, when a ticket ID was incorrect in a commit message, we would typically remove the corresponding entry from the wrong ticket and move it to the correct one.

When migrating to GitHub and replaying all commits and ticket changes, that becomes a problem of course: the commit message still contains text like “fixes #123” when in fact it was supposed to reference ticket #124.

It is therefore necessary to programmatically find such problem cases and address them before importing anything to GitHub. Failure to correct even a single one of these problems may break the whole import, for example by leaving tickets open even though they were supposed to be closed. Here is a list of situations that need to be addressed:

Commit Messages Not Showing Up in Their Tickets

Sometimes a reference to a ticket ID in a commit message does not show up in the corresponding ticket. There are several reasons why this may happen:

  1. The message was never supposed to reference a ticket (“Attempt #2 at this stupid merge”) - not a problem in Trac since there is no “refs” or other keyword in front of “#2”, but on GitHub, this commit would show up in ticket #2.
    => Such a commit message must be modified to prevent this situation.
  2. The message contains the wrong ticket ID and was removed from that wrong ticket (and inserted by hand on the correct ticket).
    => Such a commit message must be modified to prevent this situation.
  3. The Trac post-commit hook was not installed or was malfunctioning.
    => This usually does not require any action.
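
As a rough illustration of how such cases could be found, here is a minimal Python sketch. It assumes we already have the commit messages keyed by revision and a set of (ticket ID, revision) pairs for commits that actually show up in Trac tickets; both inputs and the function name are hypothetical, and how they are extracted from Trac is not shown.

```python
import re

# Matches any "#123"-style mention, with or without a keyword in front of it.
TICKET_REF = re.compile(r"#(\d+)")

def missing_references(commits, shown_in_ticket):
    """Return (revision, ticket_id) pairs mentioned in a commit message
    but absent from the corresponding Trac ticket.

    commits:         {revision: commit message}
    shown_in_ticket: set of (ticket_id, revision) pairs found in Trac
    """
    missing = []
    for rev, message in sorted(commits.items()):
        mentioned = {int(n) for n in TICKET_REF.findall(message)}
        for ticket_id in sorted(mentioned):
            if (ticket_id, rev) not in shown_in_ticket:
                missing.append((rev, ticket_id))
    return missing

commits = {
    101: "Attempt #2 at this stupid merge",
    102: "Add bacon, fixes #123",
}
print(missing_references(commits, {(123, 102)}))
```

Every pair it reports still needs a human decision, since the sketch cannot tell which of the three reasons above applies.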

Differences in Behavior Between Trac and GitHub

Aside from the first point above, there are other subtle but important differences between how Trac and GitHub parse commit messages:

  1. A commit closes several tickets at once without repeating the keyword, e.g. “Add bacon, closes #4 and #5”. This will close both tickets in Trac, but on GitHub, only ticket #4 is closed while ticket #5 is referenced.
    => The commit message must be changed so a “closes” or “fixes” keyword appears before every ticket ID.
  2. A commit shows up just fine in the Trac ticket, but the reference would not be detected by GitHub, e.g. due to a missing space character (“This closes #5for real” would close #5 in Trac but do nothing on GitHub).
    => A space character must be inserted.
  3. A commit closes a ticket that was already closed (this will cause the commit to not show up at all on GitHub).
    => The commit message must be changed, or a reopen command must be sent to the ticket before pushing the second commit.
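
The first two fixes lend themselves to automation. The following Python sketch shows one possible way to do it; the keyword list and both function names are assumptions made for illustration, and real commit messages will likely contain variants these patterns miss.

```python
import re

# Assumption: only these closing keywords are handled here.
KEYWORDS = r"(?:close[sd]?|fix(?:e[sd])?)"

# A keyword followed by a list of ticket IDs, e.g. "closes #4 and #5".
LIST_RE = re.compile(
    r"\b(%s)\s+#\d+(?:\s*(?:,|and|&)\s*#\d+)+" % KEYWORDS, re.IGNORECASE)

def repeat_keyword(message):
    """Repeat the closing keyword in front of every ticket ID in a list."""
    def expand(match):
        keyword = match.group(1)
        return re.sub(
            r"(,|and|&)\s*#",
            lambda m: "%s %s #" % (m.group(1), keyword),
            match.group(0))
    return LIST_RE.sub(expand, message)

def separate_ticket_id(message):
    """Insert a space when a letter directly follows a ticket ID."""
    return re.sub(r"(#\d+)(?=[A-Za-z])", r"\1 ", message)

print(repeat_keyword("Add bacon, closes #4 and #5"))
print(separate_ticket_id("This closes #5for real"))
```

The third case (closing an already closed ticket) depends on ticket state at the time of the commit, so it cannot be detected from the message alone.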

Human Interference

People do stupid things in their ticket database, so aside from the situation we already discussed where a commit referenced the wrong ticket number, we’ll have to watch out for:

  1. A commit referencing a ticket that, at the time the commit was made, did not exist yet (a mistake frequently made by time traveling contributors making commits from the future :>).
    => The easiest fix here is to simply make sure the ticket is created before the commit is pushed.
  2. A commit message that shows up in the right (or wrong) ticket, but has been altered (for whatever reason and by whatever means).
    => The commit message must be altered accordingly (or not if that’s not desirable). Optionally, punch the person who fiddled with the message.
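
Detecting the time travelers could look roughly like the Python sketch below. It assumes commit and ticket creation times are available as comparable values (plain integers here); the data layout and the function name are made up for this example.

```python
import re

def premature_references(commits, ticket_created):
    """Find commits that reference a ticket created after the commit
    (or a ticket that does not exist at all).

    commits:        list of (revision, committed_at, message)
    ticket_created: {ticket_id: created_at}
    """
    problems = []
    for rev, committed_at, message in commits:
        for n in re.findall(r"#(\d+)", message):
            created = ticket_created.get(int(n))
            if created is None or created > committed_at:
                problems.append((rev, int(n)))
    return problems

commits = [
    (7, 1000, "closes #12"),   # committed before ticket #12 existed
    (8, 3000, "refs #12"),
]
print(premature_references(commits, {12: 2000}))
```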

Largely Cosmetic Issues

Finally, a few things in commit messages or tickets may not be a problem for the import, but could still be nice to fix:

  1. Merges done using svnmerge have the nasty habit of repeating all merged commit messages and thus cluttering the history.
    => It might be nice to remove the clutter from the commit message so all the tickets mentioned in the merged revisions remain clean.
  2. A ticket was closed manually, but with a reference to the corresponding commit.
    => This could be fixed by putting the ticket ID into the commit message, which will take advantage of GitHub’s automatic referencing or closing of tickets in such a case.
Jan 03

Basic Preparations for GitHub Ticket Migration

Before modifying the ticket and comment contents themselves (to convert them from TracWiki syntax to GitHub Flavored Markdown), a few more fundamental changes are necessary:

  1. Find missing ticket IDs (the Agavi Trac has 83 “gaps” in the ticket ID sequence; those were all spam tickets that got deleted, but not in time for a legit ticket showing up with a later ID in the database)
  2. Compose a list of all ticket reporters, ticket change authors (e.g. people leaving a comment) and attachment authors and create a mapping to their GitHub user names or plain display names that will be used during import
  3. Migrate all milestones to the GitHub issue tracker
  4. Migrate all versions (as labels) to the GitHub issue tracker
  5. Migrate all components (as labels) to the GitHub issue tracker
  6. Migrate all ticket types (as labels) to the GitHub issue tracker
  7. Do something about Priorities and Severities

1. Finding Missing Ticket IDs

That one is fairly easy:

$ sqlite3 trac.db 'SELECT id FROM ticket;' | php -r '
$ids = file("php://stdin", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
echo implode("\n", array_diff(range(1, max($ids)), $ids)) . "\n";
'

The resulting list needs to be stored somewhere for later, when we replay all ticket changes - that script will produce empty dummy tickets for the missing IDs to make sure all ticket IDs line up between Trac and GitHub.

2. List Ticket Reporters/Authors

We only need SQL for this:

sqlite3 trac.db "
SELECT DISTINCT (user || ' = ') FROM (
    SELECT reporter AS user FROM ticket
    UNION ALL
    SELECT author AS user FROM ticket_change
    UNION ALL
    SELECT author AS user FROM attachment
) ORDER BY LOWER(user) ASC;
"

The result can be put into a file that is then filled by hand with references to people. In our case, we chose to use either a GitHub username (if known, requires some research in certain cases), half of the person’s e-mail address, or a nickname in quotes:

anonymous = "anonymous"
david = @dzuelke
steve@REDACTED.com = steve@…

“REDACTED.com” is obviously not what was really in there; the ellipsis (“…”) is correct however.

This map will be used by a script later when replaying ticket changes via the GitHub API in order to embed information about the original authors. By using GitHub usernames with the “@someone” notation, a link will be created; we just have to get things right on the first attempt or some people will complain about notification spam ;)
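
To give an idea of how a script might read that file, here is a small Python sketch. It assumes one "name = value" entry per line, exactly as in the sample above; the function name is hypothetical, and entries that have not been filled in yet are simply skipped.

```python
def parse_author_map(lines):
    """Parse 'trac name = display name' lines into a dictionary."""
    authors = {}
    for raw in lines:
        line = raw.rstrip("\n")
        trac_name, sep, display = line.partition(" = ")
        display = display.strip()
        if not sep or not display:
            continue  # blank line, or entry not yet filled in by hand
        if len(display) >= 2 and display.startswith('"') and display.endswith('"'):
            display = display[1:-1]  # nickname in quotes
        authors[trac_name] = display
    return authors

sample = [
    'anonymous = "anonymous"\n',
    "david = @dzuelke\n",
    "steve@REDACTED.com = steve@…\n",
]
print(parse_author_map(sample))
```

Values starting with "@" are kept verbatim so that GitHub turns them into links when they are embedded in a ticket or comment body.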

3. Migrate Milestones

This is very straightforward, as the GitHub issue tracker supports milestones. The existing milestones in the Trac database are simply ported over to GitHub; milestone “AgaviForge” will be dropped, however (and the corresponding tickets reassigned later).

4. Migrate Versions (as Labels)

Versions as registered in Trac will simply be imported as labels like “0.10.2” or “1.0.8”. For now, we will import all versions except for “HEAD”, “HEAD-1.0” and “HEAD-0.11” since those only cause clutter and thus far have rarely been used.

This will create a considerable number of labels just for the versions; we might eventually merge them all into series labels (like “1.0.x” or “1.1-latest”, similar to e.g. Ruby on Rails). When doing that, we might also drop old series labels that are not maintained anymore, simply to cut down on clutter.

One last thing to decide upon of course is the color codes for the versions. No great ideas yet, as the colors are shared with components and ticket types (see below).

5. Migrate Components (as Labels)

Components will be migrated to labels with identical names, with the exception of “documentation” (will be moved to a separate repository), “website” (never has been part of the SVN repos, but will be moved to its own GitHub repository eventually) and “OTHER” (no label will be assigned to such issues).

Again, color coding is something that’s left to be decided (suggestions welcome).

6. Migrate Types (as Labels)

There currently are just three issue types: “defect”, “enhancement” and “task”. Those will be migrated as-is, with defects getting a flashy red color, enhancements a nice green, and tasks… well… turquoise maybe?

7. Priorities and Severities

Priorities will simply be dropped, as they have never proven to be useful. The same goes for severities, although we might have a dedicated label for blockers, indicating that resolving such an issue may not be postponed until a later release.
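
Taken together, steps 3 to 7 amount to a mapping from Trac ticket fields to a GitHub milestone and a list of labels. The Python sketch below is one way to express the rules described above; the function name and the dictionary layout are assumptions, and the “blocker” label is only the possibility mentioned in step 7, not a decision.

```python
# Values named in the text above that are not carried over.
SKIPPED_VERSIONS = {"HEAD", "HEAD-1.0", "HEAD-0.11"}
SKIPPED_COMPONENTS = {"documentation", "website", "OTHER"}
SKIPPED_MILESTONES = {"AgaviForge"}

def github_fields(ticket):
    """Map a Trac ticket (as a dict of its columns) to milestone and labels."""
    labels = []
    version = ticket.get("version")
    if version and version not in SKIPPED_VERSIONS:
        labels.append(version)
    component = ticket.get("component")
    if component and component not in SKIPPED_COMPONENTS:
        labels.append(component)
    if ticket.get("type"):
        labels.append(ticket["type"])  # defect, enhancement or task
    # Assumption: a dedicated label for blockers; priorities are dropped.
    if ticket.get("severity") == "blocker":
        labels.append("blocker")
    milestone = ticket.get("milestone") or None
    if milestone in SKIPPED_MILESTONES:
        milestone = None
    return {"milestone": milestone, "labels": labels}

print(github_fields({
    "version": "1.0.8", "component": "OTHER", "type": "defect",
    "severity": "blocker", "priority": "high", "milestone": "AgaviForge",
}))
```

Tickets in the “documentation” and “website” components get no component label here; moving them to their own repositories is a separate step this sketch does not cover.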

Dec 31

Porting Trac Ticket Attachments to GitHub

Some of the tickets in the Agavi Trac have attachments; these attachment files need special treatment as the GitHub issue tracker itself has no support for attachments.

We’ve created a repository (agavi/trac-ticket-attachments) for this purpose and simply imported everything from trac/attachments/tickets, which contains just a bunch of folders named after the ticket ID, with the attachments of that ticket inside them. Some of the files have %20 in their names because the uploaded files had spaces in their names; we changed those to actual space characters so the URL escape does not end up in the file name in the Git repository.
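
The renaming step could be scripted along these lines. This is only a Python sketch: the function name is made up, it handles nothing but the %20 escape mentioned above, and it leaves a file alone if the decoded name already exists.

```python
import os

def rename_escaped_spaces(root):
    """Rename files below root so that '%20' becomes a real space."""
    renamed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in sorted(filenames):
            decoded = name.replace("%20", " ")
            if decoded == name:
                continue
            target = os.path.join(dirpath, decoded)
            if os.path.exists(target):
                continue  # do not overwrite an existing file
            os.rename(os.path.join(dirpath, name), target)
            renamed.append((name, decoded))
    return renamed
```

Calling rename_escaped_spaces("trac/attachments/tickets") on a copy of the attachment directory returns the list of (old name, new name) pairs, which is handy when the attachment links are rewritten later.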

We will later rewrite the ticket history so the attachment links point to the corresponding files in this repository.

Dec 30

Testing GitHub Tickets for the Migration

One thing we’ve always put a focus on is having tickets for every change we make, and using these tickets extensively in commit messages and from other tickets so a quick svn blame on a line not only shows a description of the change, but also gives a ticket number that can be looked up on Trac to follow a discussion of the change.

On numerous occasions, I looked at a bit of code, found it to be incredibly stupid, but then did an svn blame anyway to figure out when the change was done. Looking up the ticket on Trac then usually gives a list of all the commits for that change, and maybe even related tickets and discussions. Often enough, that line of code then turned out to not be so stupid after all, and the day was saved.

Good examples are tickets #502 or #1199, which link to related and duplicate tickets, have discussions in them and of course commit references.

All these tickets contain valuable information, so it is vital to preserve all this history. However, the GitHub ticket API does not allow a timestamp in updates, which means that any ticket and any comment will have the date and time of import, and not the old date and time. Commits from the repository, however, will contain the correct timestamps as git2svn preserves that information.

We tested how the GitHub issue tracker behaves with respect to timestamps, and found the following:

  • Commits referencing a ticket (via “refs #123” or “closes #456”) use the timestamp of the commit for the reference, not the timestamp of the push
  • Comments and commit references in tickets are in the order of their creation in the tracker, not in the order of their timestamps
  • Tickets that do not exist are not updated retroactively when referenced in a commit message

The result is that we can now build a database of commits and ticket changes that are in the order of their timestamps, and replay them step by step, meaning that we will convert the repos, then push a commit, then replay a ticket update or creation, then push another commit and so forth. The result will be that a ticket might contain a comment (with a timestamp of, say, January 2 2012 13:04:12 because that’s the time when our API client replayed that change) followed by a commit reference that closed the ticket (with a timestamp of, say, August 4 2007 08:09:12, because that’s when that change was committed).
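
The merged timeline described above can be sketched in a few lines of Python. The input format, a list of (timestamp, payload) pairs for each source, and the function name are assumptions made for this example.

```python
def replay_order(commits, ticket_changes):
    """Merge commits and ticket changes into one list ordered by timestamp.

    Ticket changes are listed first, and sorted() is stable, so a ticket
    change wins over a commit with the same timestamp. That keeps a ticket
    from being referenced before it has been created.
    """
    events = [(ts, "ticket", payload) for ts, payload in ticket_changes]
    events += [(ts, "commit", payload) for ts, payload in commits]
    return sorted(events, key=lambda event: event[0])

commits = [(30, "r2"), (10, "r1")]
ticket_changes = [(10, "create #1"), (20, "comment on #1")]
for event in replay_order(commits, ticket_changes):
    print(event)
```

A replay script would then walk this list and either push the commit or send the ticket change through the API, one event at a time.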

Ultimately, this means that we will be able to use GitHub’s native support for “refs #123” or “closes #456”, which will look much nicer than manually having to construct comments through the API about commits referencing the issue.

The remaining issue of course is that any comment and ticket will have the wrong date and time, so we’ll have to include that in the text (along with a notice saying that the ticket/comment was migrated from the old Trac, just so people know), but it simply seems like there is no way around that with the GitHub API.
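
Embedding the original date and the migration notice might look like the following Python sketch. The wording of the notice, the function name and the assumption that the original time is a Unix timestamp in seconds are all made up for illustration.

```python
from datetime import datetime, timezone

def migrated_body(author, created_at, text):
    """Prefix a ticket or comment body with its original author and time."""
    when = datetime.fromtimestamp(created_at, tz=timezone.utc)
    stamp = when.strftime("%Y-%m-%d %H:%M:%S UTC")
    notice = f"Originally posted by {author} on {stamp} (migrated from Trac)"
    return f"{notice}\n\n{text}"

print(migrated_body("@dzuelke", 0, "Hello"))
```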

Regarding severities and priorities, we might drop these altogether along with keywords; components can be done via labels, and so can ticket types (enhancement/task/defect) and versions. Milestones are supported natively by the GitHub issue tracker, so they won’t be a problem.

Certain changes in GitHub, such as changing milestones, labels or assignees, are not displayed in the web interface, even though the changes appear to be recorded internally and are available through the API, so the information itself is not lost.


Moving to GitHub!

We have just begun working on migrating trac.agavi.org to GitHub; there is a whole range of things to consider, from referencing users correctly, to rewriting commit messages so the correct Git commit references are included, to preserving the history of tickets in the right order.

We will add more blog posts as we move along to document the process in case other projects would like to do the same, and also make our migration scripts available if they are sufficiently generic to be useful to others.

Dec 22

Agavi 1.0.7 released!

Agavi 1.0.7 is now available for download at www.agavi.org and through the PEAR channel.

The release contains just two minor changes over RC2, so I’ll simply quote from the final release notes:

The getCredentials() method on an Action is no longer called unconditionally (i.e. whether or not the isSecure() method returned false) but only if the Action is “secure”.

It is now possible to manually call shutdown() on any database adapter to close the underlying connection; another call to getConnection() will cause a reconnect.

AgaviFormPopulationFilter will now populate multiple forms in the order specified in the “populate” request attribute (when populating via an array with form IDs as keys and parameter holders as values) in namespace “org.agavi.filter.FormPopulationFilter”; however, if the value for a key in the “populate” array is boolean true (to re-populate from request data), this form will always be handled first, so error messages are inserted on that form first.

Testing is now compatible with both PHPUnit 3.5 and 3.6. If you want to specify code coverage filters, AgaviTesting::getCodeCoverageFilter() returns the correct instance (singleton or not depending on the PHPUnit version) for you to use. A base constraint class named “AgaviBaseConstraintBecausePhpunitSucksAtBackwardsCompatibility” can be used for constraints that work with both PHPUnit 3.5 and 3.6; if you implement the new matches() method introduced in PHPUnit 3.6 instead of the old evaluate(), it will automatically be called in the proper fashion depending on the PHPUnit version.

AgaviTesting::dispatch() can now call exit() with the appropriate shell status code (the same as returned by a vanilla PHPUnit run) to indicate success or failures/errors to the calling process. This behavior is triggered when the new optional second argument, defaulting to false, is set to true. If set to false, it returns the PHPUnit result object that may be used by custom code to perform further analysis of the test run.

Several other minor changes and fixes are included in this release as well; most notably, AgaviBooleanValidator’s casting and exporting logic has been repaired, and AgaviConsoleRequest now properly creates an AgaviUploadedFile object with STDIN contents (when configured to read those) instead of a plain array. The PHPTAL renderer now supports configuration of character encoding via parameter “encoding”.

The timezone database was updated to version 2011n.

As always, check the CHANGELOG for the full list of enhancements, changes and fixes.

Dec 14

Agavi 1.0.7 RC2 released!

Agavi 1.0.7 RC2 is now available for download at www.agavi.org and through the PEAR channel.

A few minor fixes and changes have made it in since RC1; most notably, PHPUnit 3.6 should work fine now, and the Form Population Filter, when populating several forms (via an array as the value for “populate” with form IDs as keys) now gives precedence to boolean true values (so error messages are inserted on re-populated forms first) and processes forms in the order given in the “populate” array rather than in the order they appear in the document.

Please test this thoroughly and give feedback if necessary so we can roll a final version before the holidays.

Nov 20

Agavi 1.0.7 RC1 released!

Agavi 1.0.7-RC1 is now available for download at www.agavi.org and via the PEAR channel.

As always, the CHANGELOG has the full story, but here is a summary of the most important changes:

  • The getCredentials() method on an Action is no longer called unconditionally (i.e. whether or not the isSecure() method returned false) but only if the Action is “secure”.

  • The PHPTAL renderer now supports configuration of character encoding via parameter “encoding”.

  • It is now possible to manually call shutdown() on any database adapter to close the underlying connection; another call to getConnection() will cause a reconnect.

  • AgaviTesting::dispatch() can now call exit() with the appropriate shell status code (the same as returned by a vanilla PHPUnit run) to indicate success or failures/errors to the calling process. This behavior is triggered when the new optional second argument, defaulting to false, is set to true. If set to false, it returns the PHPUnit result object that may be used by custom code to perform further analysis of the test run.

  • Several other minor changes and fixes are included in this release as well; most notably, AgaviBooleanValidator’s casting and exporting logic has been repaired, and AgaviConsoleRequest now properly creates an AgaviUploadedFile object with STDIN contents (when configured to read those) instead of a plain array.

  • The timezone database was updated to version 2011n.

Jul 23

Agavi 1.0.6 released!

We’re thrilled to announce that the final version of Agavi 1.0.6 is now available for download at agavi.org and through the PEAR channel.

There were no changes in this release over the release candidate, so please refer to the RC1 release announcement and of course once again the CHANGELOG to see what great new features and changes are included in this new version.

Jul 11

Agavi 1.0.6 RC1 released!

We’re happy to announce that the first release candidate for version 1.0.6 is now available for download at agavi.org and through the PEAR channel.

This release adds a database adapter for Doctrine 2, a renderer for the Twig template engine, a native session storage for ext/sqlsrv and various enhancements such as support for Smarty 3 in the existing renderer and extended configuration abilities for the Doctrine 1 database adapter.

Doctrine 2 support was added, both for the ORM (AgaviDoctrine2ormDatabase) and the DBAL (AgaviDoctrine2dbalDatabase) libraries. The ORM adapter can optionally utilize a pre-configured DBAL connection instead of specifying connection details itself. Both adapters allow for easy extensibility with custom logic by exposing a prepareConfiguration() and a prepareEventManager() method that can be used to customize the configuration and the event manager, respectively. The API documentation for both classes contains a detailed list of all options; a typical ORM adapter configuration would look similar to this:

<database name="master" class="AgaviDoctrine2ormDatabase">
    <ae:parameter name="connection">
        <ae:parameter name="driver">pdo_mysql</ae:parameter>
        <ae:parameter name="host">127.0.0.1</ae:parameter>
        <ae:parameter name="user">root</ae:parameter>
        <ae:parameter name="dbname">test</ae:parameter>
    </ae:parameter>
    <ae:parameter name="configuration">
        <ae:parameter name="metadata_driver_impl_argument">%core.model_dir%/Entities</ae:parameter>
    </ae:parameter>
</database>

An example cli-config.php file demonstrating how to integrate Agavi with the Doctrine command line utilities can be found in etc/database/doctrine2.

The database adapter for Doctrine 1 now allows basic configuration of cache drivers for result and query caches, and adds the ability to specify the class to use as the connection event listener. This allows a custom implementation to be used, where more advanced logic can be executed in the preConnect() event handler method, which is useful for modifying the connection or manager objects in ways not supported through configuration options.

It is now possible to use ext/sqlsrv as a session storage via AgaviSqlsrvSessionStorage as an alternative to interfacing with Microsoft SQL Server via PDO. The session table’s data column must be of type “varbinary”.

The Twig template engine is supported through AgaviTwigRenderer; includes and inheritance lookups are performed on the current directory first by default before falling back to the main template directory. The lookup sequence is configurable. The renderer will also attempt to load a localized template from a locale subdirectory first before picking a generic one if the template that initiated the include or extension was a localized template.

AgaviSmartyRenderer received minor adjustments to support Smarty 3 without triggering deprecation notices.

The Sample App now uses the regular “web” context to serve the WSDL for the product service; the WSDL contents can now be retrieved through the URL /products.wsdl.

The timezone database was updated to version 2011h.

A complete list of changes in this release can be found in the CHANGELOG. Please take it for a spin and report any issues before we release a final version soon!