WTP/Build/Accounting for history during builds
By using p2 (and related tools) during builds, it is possible to better account for history when delivering builds of features and bundles.
What this means, in short, is that if a bundle has the exact same version as it did in some previous build, then the previously built bundle is supposed to be the same (or equivalent), so it can be reused and delivered in the current build instead of the recently built one. This is somewhat counterintuitive, but makes sense, since this is the way a p2 based install works: if someone already had a bundle installed with the exact same version, they would never get a new one installed. So one advantage of reusing the previously built bundle is that if it is not equivalent for some reason, the difference is more likely to be discovered in time to fix the issue, instead of being discovered perhaps months later, when someone does an install on top of a previous version.
Of course, it is well understood that sometimes the byte codes do change even though the source code doesn't change (and hence the version doesn't change, normally, if the source doesn't change). In general this happens because we version code based on our source, but other factors can affect the built code ... for example, a constant in a pre-req might change, the compiler or compiler settings might change, etc. For an overview and introduction to the general problem, see Andrew Niefer's blog entry.
In particular, for WTP, we have had a recurring problem where a feature version (with its build-time computed suffix) stays exactly the same even when the contents of the feature have changed (and, to make matters worse, this is not noticed until weeks or months later). The root problem is that the computed suffix does not carry enough significant digits to fully account for its contained bundles; see bug 208143. Since we do not want to make the suffixes indefinitely long, the only solution for this bug, or limitation, is to detect the problem early and take appropriate corrective action. While it'd be ideal to do that in "real time", during the build (and correct the suffix), the next best alternative is to at least detect it shortly after a build, and correct it before a re-build and, more importantly, before the code is delivered.
So, by using p2 (and related tools) during builds we now have the ability to do that early detection. This enhancement to WTP's build is tracked in bug 325181.
Process during WTP Builds
The following image will be used to explain the parts of this process as used by WTP builds.
The flow is basically left to right, over time or steps of the build.
- The build makes use of PDE's p2.gathering flag to create the buildrepository, which is literally the byte codes as built, in the form of a handy p2 repository. (Bundles are complete at this point: conditioned, and signed when doing a production build.)
- In a "create final repo" ant script in our builds, the p2.mirror operation is used to create the final repository. It mirrors content from the buildrepository, but if it happens to find an IU (bundle or feature jar) with the exact same version as in the reference repositories, then it uses the previously built one from the reference repositories (not the freshly built one from the buildrepository).
- The mirror operation, as a side effect, performs the comparator task, which makes use of the jar comparator. (The jar comparator is provided by the p2 project, but it is not related to p2, per se; it is more related to knowing how to examine significant parts of byte codes, bundles, and features.) If, while mirroring, two bundles are found with the same version, then in addition to mirroring the previously built one for the final repository, it also inspects the presumably identical bundles for significant differences, and if any are found, it prints that information to the comparator.log.
- In an ideal world, that'd be it; we could just examine the comparator.log to see if there were any significant messages, to discover if there were issues to correct. But the reality is there are actually lots of "differences" found ... most of which we do not care about. There are so many that, even if we wanted to, we'd miss important ones, overlooking the trees for the forest. Therefore, we run a releng test to separate expected messages from unexpected messages and throw an error if there are any unexpected messages.
- The repository is considered the final delivered build output (even though some of the original build might have been discarded and replaced with content from a previous build). In other words, distribution zips are created and unit tests are run with the final repository.
- The repository shows a dashed line back to the reference repositories, because once a build is declared, it becomes part of the reference repositories. The reference repositories are a composite repository, made up of a previous release or milestone and subsequent declared builds. It currently takes a "manual" step to create the initial reference repo and to add a declared-build repo to it. The build scripts assume this reference repository exists at a certain location; they do not actually create or manage it. Note: on the production build machine, where all builds are accessible via the file system, the composite repository simply uses the file system for the location of child repositories. If someone were to do a "local" test build, then first, they would have to "manually" create the reference repository in the expected file-system location, and second, they'd likely want to create the reference repository ahead of time on their local system, for efficiency during the build itself, instead of using http locations back to download sites. But, it is important to note, if there is no directory found at the expected reference repository location (such as for a local test build), then the operation continues and the final repository essentially consists of whatever was in the buildrepository, which is probably fine for most test builds.
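The selection logic described in the steps above can be sketched roughly as follows. This is only an illustration, not the p2.mirror implementation: the IU class, the create_final_repository function, and the byte-for-byte comparison are all hypothetical stand-ins (the real jar comparator examines significant parts of the content rather than raw bytes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IU:
    """Hypothetical stand-in for an installable unit (bundle or feature jar)."""
    id: str
    version: str
    content: bytes  # stand-in for the built jar


def create_final_repository(build_repo, reference_repo):
    """Return (final_repo, comparator_log) for the given lists of IUs."""
    reference_by_key = {(iu.id, iu.version): iu for iu in reference_repo}
    final_repo = []
    comparator_log = []
    for fresh in build_repo:
        previous = reference_by_key.get((fresh.id, fresh.version))
        if previous is None:
            # New id or new version: deliver the freshly built IU.
            final_repo.append(fresh)
            continue
        # Same id and version: deliver the previously built IU instead.
        final_repo.append(previous)
        # Assumption: a plain byte comparison replaces the real comparator.
        if previous.content != fresh.content:
            comparator_log.append(
                f"Difference found for {fresh.id},{fresh.version}")
    return final_repo, comparator_log
```

With an empty reference repository the function simply returns the build repository's content, which mirrors the behavior described for local test builds.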
Differences found for which no correction is needed.
Note: it will take some time and experience to fully understand all the "differences" found and whether or not they are significant, so some cases will be documented here to share knowledge and speed up our group learning.
All the "known cases" are codified in a property file called
comparatorfilter.properties that is used during the releng test to filter out some messages. If filtered out, the message is still printed in a file called "excludedMessages" on our test results page. This can be pretty much ignored ... but wouldn't hurt to examine it from time to time to see if anything is mistakenly being filtered out.
In about.mappings, the property "0" has different values: "<date1>" and "<date2>".
- One of the most common "differences" found is similar to the message above. The dates in about.mappings are literally the date of the build. This is a good example of a case where we would prefer the old bundle be used, to maintain the "original build" date, instead of changing it each time when the date is the only thing that changed.
IOException comparing ... Error opening zip file
- We have some cases where "test jars" are purposely invalid, to make sure our code handles invalid input ... but they cause a message to be logged, so we ignore those messages. (It is unknown whether the comparator successfully does a valid compare on all the other content.)
Binary file build.xml: sizes differ by ... [10 bytes]
- There are several little differences found in "doc" bundles. Honestly, these have not been examined in detail to know exactly what they mean ... but we are assuming, for now, that it's just dates or similar (especially for build.xml ... it seems that should not even be in a built bundle?). [Note: the Eclipse Platform automatically excludes all "doc" bundles from comparator examination, presumably since there are frequently small unimportant changes.]
How to add to the rules about "insignificant" differences
While it is possible to have a custom comparator to do some "tweaking" of what is considered a significant difference, and what is insignificant, the approach we currently have is to have a "releng test" that scans the list of all messages, and creates a new log of significant messages. It does this, by having a set of rules describing which messages are considered insignificant, so anything "left over" (not matching a rule) will be considered "unexpected". The test itself is in our usual org.eclipse.wtp.releng.tests project, but the test reads a stream specific file from releng/maps. The file is named comparatorfilter.properties.
Each rule is made up of three parts, defined by three properties, corresponding to the 3 parts of each comparator message: summary, comparison, reason. All the properties start with "comparator", and the next segment of the name is arbitrary ... just a "name" for the rule ... and is what groups the 3 parts into one rule. If any part of the rule is empty or null, it means "match anything". Otherwise, the property value is taken as a regular expression that is used to see if the message matches the rule. All three parts must match for a rule to match. "Matching", in this case, means the message is expected, and considered insignificant.
Given that the format of the comparator's messages is not "spec'd" or well-defined, some flexibility, and care, is needed in how the rules match the messages. In practice, so far, we usually use the "summary" line to determine which bundle is involved, we do nothing with the "comparison" line (that is, we use "match anything"), and the "reason" line is a description of why the comparator considers them "different".
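The rule format described above could be sketched as follows. This is a minimal illustration, not the code of the actual releng test: the function names, the dictionary-based message shape, and the use of a "search" style match (rather than a whole-string match) are assumptions.

```python
import re
from collections import defaultdict

PARTS = ("summary", "comparison", "reason")


def load_rules(properties):
    """Group comparator.<name>.<part> properties into {name: {part: regex}}."""
    rules = defaultdict(dict)
    for key, value in properties.items():
        segments = key.split(".")
        if (len(segments) == 3 and segments[0] == "comparator"
                and segments[2] in PARTS):
            rules[segments[1]][segments[2]] = (value or "").strip()
    return dict(rules)


def rule_matches(rule, message):
    """A rule matches only if all three parts match; empty means match anything."""
    for part in PARTS:
        pattern = rule.get(part, "")
        if pattern and not re.search(pattern, message[part]):
            return False
    return True


def split_messages(rules, messages):
    """Separate expected (insignificant) messages from unexpected ones."""
    expected, unexpected = [], []
    for message in messages:
        if any(rule_matches(rule, message) for rule in rules.values()):
            expected.append(message)
        else:
            unexpected.append(message)
    return expected, unexpected
```

In this sketch, a non-empty "unexpected" list is what would cause the test to fail, while the "expected" list corresponds to what gets written to the excluded-messages file.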
For example, one message might be the 3 lines:
canonical: osgi.bundle,org.eclipse.jst.common.frameworks.source,1.1.500.v201104081500
Difference found for canonical: osgi.bundle,org.eclipse.jst.common.frameworks.source,1.1.500.v201104081500 between file:/home/data/httpd/download.eclipse.org/webtools/downloads/drops/R3.3.0/R-3.3.0-20110607160810/repository/ and file:/shared/webtools/projects/wtp-R3.4.0-I/workdir/I-3.4.0-20110809003752/buildrepository/jst-sdk
In about.mappings, the property "0" has different values: "20110414085808" and "20110809003752".
Our rule to exclude those from the significant list, is
comparator.aboutmappings.summary =
comparator.aboutmappings.comparison =
comparator.aboutmappings.reason = ^In about\\.mappings, the property \"0\" has different values.*$
This means that, for any bundle, if the reason matches that regular expression, the message is considered insignificant.
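Note that the value is written with properties-file escaping, so the regular expression actually applied has one level of backslashes removed. The sketch below illustrates this with a deliberately simplified unescape function (hypothetical; it ignores special escapes such as \n, \t and \uXXXX that a real properties loader handles) and then checks the resulting pattern against the example reason line.

```python
import re

# The value exactly as written in comparatorfilter.properties.
written = r'^In about\\.mappings, the property \"0\" has different values.*$'


def unescape_properties_value(text):
    """Simplified: drop each escaping backslash and keep the next character."""
    return re.sub(r'\\(.)', r'\1', text)


# The pattern after loading: the dot stays escaped, the quotes are plain.
loaded = unescape_properties_value(written)

reason = ('In about.mappings, the property "0" has different values: '
          '"20110414085808" and "20110809003752".')

# The rule matches this reason line, so the message would be filtered out.
is_insignificant = re.search(loaded, reason) is not None
```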
Another rule we have, at the time of this writing, is
comparator.docbuildxml.summary = .*(\
org.eclipse.wst.xsl.sdk.documentation|\
org.eclipse.wst.xsl.doc|\
org.eclipse.wst.server.ui.doc.user|\
org.eclipse.jst.server.ui.doc.user|\
org.eclipse.jst.ws.jaxws.doc.user|\
org.eclipse.jst.jsf.doc.user|\
org.eclipse.wst.jsdt.doc|\
org.eclipse.wst.common.project.facet.doc.api\
).*$
comparator.docbuildxml.comparison =
This rule means to include all messages from these listed "doc" bundles (that is, include them as "insignificant" and exclude them from the "significant" list). This could be improved in the future, but initial findings were that these bundles often change, presumably due to 'date built', or something similar.
To test the rules, the releng test can be run from a workspace, against some local or explicit repositories, in which case the comparatorfilter.properties file in the project itself can be used to experiment locally.
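Because each line of that summary value ends in a backslash, the properties loader joins the lines into one long alternation. The following sketch shows the idea with a shortened list of bundles; the join_continuations helper is a hypothetical simplification (it treats any trailing backslash as a continuation and strips leading whitespace from continued lines), and the version numbers in the sample summaries are made up for illustration.

```python
import re

SAMPLE = r"""comparator.docbuildxml.summary = .*(\
    org.eclipse.wst.xsl.doc|\
    org.eclipse.wst.jsdt.doc\
    ).*$
comparator.docbuildxml.comparison ="""


def join_continuations(text):
    """Join physical lines ending in a backslash into logical lines."""
    logical, current, continuing = [], "", False
    for raw in text.splitlines():
        line = raw.lstrip() if continuing else raw
        if line.endswith("\\"):
            current += line[:-1]
            continuing = True
        else:
            logical.append(current + line)
            current, continuing = "", False
    if continuing:
        logical.append(current)
    return logical


def to_properties(text):
    """Split each logical line at the first '=' into a key and a value."""
    properties = {}
    for line in join_continuations(text):
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


properties = to_properties(SAMPLE)
summary_pattern = properties["comparator.docbuildxml.summary"]

# Hypothetical summary lines: one doc bundle, one non-doc bundle.
doc_summary = "canonical: osgi.bundle,org.eclipse.wst.xsl.doc,1.0.0.v2011"
other_summary = "canonical: osgi.bundle,org.eclipse.wst.xml.core,1.1.0.v2011"
doc_is_filtered = re.search(summary_pattern, doc_summary) is not None
other_is_filtered = re.search(summary_pattern, other_summary) is not None
```

Here the empty comparison value plays the "match anything" role, so only the summary line decides whether a message from one of the listed bundles is filtered.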
Significant, unexpected differences and what to do about them.
The "fix" for nearly any "real" problem found by the comparator is to re-tag the bundle or feature so it gets a different version number, even if the source did not actually change.
The hard part, and the reason for the long drawn-out explanation and graphic in a previous section of this page, is that if there is a difference to be investigated, then that bundle is no longer "in" the build distribution -- the "old" previously built one was put in our final repository and zips, so the "new" just-built bundle (with the exact same version) must be obtained from the literal 'buildrepository' on the build machine (and before it is deleted for subsequent builds ... though that normally wouldn't happen until a build is declared).
Differences in feature contents
Difference found for canonical: org.eclipse.update.feature,org.eclipse.wst.xml_sdk.feature,3.3.0.v201007311522-7A78-8DXJQUlJHRDD2LBB_qiiymz between file:/home/data/httpd/download.eclipse.org/webtools/downloads/drops/R3.3.0/S-3.3.0M2-20100923155521/repository/ and file:/shared/webtools/projects/wtp-R3.3.0-I/workdir/I-3.3.0-20101007023510/buildrepository/wst-sdk
The entry "Feature: org.eclipse.wst.xml_ui.feature 3.3.0.v201007311522-7H7DFYzDxumThWc9oigOk5b6p2Mb" is not present in both features.
The entry "Feature: org.eclipse.wst.xml_ui.feature.source 3.3.0.v201007311522-7H7DFYzDxumThWc9oigOk5b6p2Mb" is not present in both features.
This type of message occurs when a feature suffix doesn't get computed accurately enough to reflect the difference in content. See bug 327176 for a detailed explanation of this particular case.
Note the repositories referred to. In this case, one version was in M2 ... and that's the one that would have been "delivered" to the build's repository and zip files. So, to see the "differences", one has to get a copy of the "new" one in "buildrepository/wst-sdk". On the build machine, it is possible to drill down into a build's directories using a web browser, but it might be easier for some to use a shell account or scp to copy the file down.
Some files always excluded
Just to make a note: as of this writing, some bundles are always excluded from the comparison tests, due to bugs in the comparator. This means that if new builds of these bundles have the same qualifier as the old ones, then the old ones would be delivered, without any warning messages, even if there happened to be significant differences. See bug 325158 and bug 325311.
org.eclipse.jpt.eclipselink.ui
org.eclipse.jpt.ui
org.eclipse.jst.jsp.core.tests
Things to improve
- Currently uses buildrepository from "projects"
- I lied a little above, when saying the buildrepository would not be deleted until declared. Notice the file URL is
- These "projects" directories are deleted for each build. The buildrepository is copied/saved, temporarily, in a "committers" download directory, so it could be found by "translating" the URL to its new download (yet still temporary) location. Such as, for this example:
- This should be changed so the task uses the "committers" location to begin with (partially so the URL will be correct, but also so another build could, in theory, be started while this one finishes up).
- Some of the complications of having to follow the comparator log with a releng test could be avoided by having our own custom comparator task. It would, of course, use the current one as a starting point, if not directly inherit from it, but then fine tune the output as is now done in the separate releng test. If nothing else, the format of the comparator log is currently not "standardized", which means our tests can break if we happen to run into a message that is formatted differently than we expect. Tracked in bug 326018.