Platform-releng/How to do miscellaneous releng tasks
There are times when some quirky, out-of-the-ordinary thing has to be done. This page captures some of them. They are likely to get out of date easily, or the details may change from case to case, but I thought they would at least serve as "hints" in case others ever have to do these things (and they serve as reminders to me :). They are likely too quirky and volatile to be part of a FAQ, but might evolve into a "procedures document". I try to capture them whenever I do one that I find confusing or hard to keep straight, or when someone asks about one, so this is not a complete or exhaustive document, but again, hints. Note that most of these procedures require shell access to build.eclipse.org.
How to restart Hudson tests
[November 28, 2012]
Sometimes tests have to be restarted and rerun. This is especially true if something goes wrong with Hudson, say, and it is restarted in the middle of a test run.
Hudson tests can be rerun directly from the Hudson web interface; just provide the buildId and eclipseStream, and the scripts will figure out where to get the build from on "download.eclipse.org". This works because there is a cron job running that knows how to efficiently "look for results" and, if it finds any, collects them and summarizes them on the main download page.
This could be done programmatically as well. This example is specific to a Kepler I-build, but the idea would be the same for others. It assumes the build is complete and on "downloads", and that this is just to run the tests on Hudson. It might be easier to do programmatically if you had to restart all three tests, for example; otherwise, the web pages are pretty easy.
The file to start tests is in
To do the retest, a file named startTests.sh is executed from the command line.
But, this file needs two parameters, the buildId to test, and the eclipseStream that the build is from. You could edit the sh file directly, but best is to edit a file named buildParams.shsource. The contents of that file, for a Kepler re-test, would be something like
Then, when startTests.sh is executed, it will read the values from that file.
(I think startTests.sh should be run from a screen shell, or otherwise allowed to continue running even if you log off or lose the connection.)
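The steps above could be sketched roughly as follows. This is only an illustration, not the actual content of the real files: the "name=value" layout of buildParams.shsource, the helper function names, the use of nohup in place of screen, and the startTests.log file name are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: file layout and helper names are hypothetical,
# not copied from the real releng scripts.

# Write buildId and eclipseStream into buildParams.shsource in the given directory.
# Assumes the file is a simple list of name=value lines that a script can source.
write_build_params() {
    local dir="$1" build_id="$2" stream="$3"
    printf 'buildId=%s\neclipseStream=%s\n' "$build_id" "$stream" \
        > "$dir/buildParams.shsource"
}

# Start startTests.sh detached from the terminal, so it keeps running after
# a logoff or a dropped connection (nohup is used here instead of screen).
start_tests_detached() {
    local dir="$1"
    ( cd "$dir" && nohup ./startTests.sh > startTests.log 2>&1 & )
}
```

With made-up values, usage would look like `write_build_params "$scriptsDir" I20121128-0800 4.3.0` followed by `start_tests_detached "$scriptsDir"`, where `$scriptsDir` stands for the directory that holds startTests.sh.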
How to re-collect Hudson tests
[November 28, 2012]
Occasionally the tests run fine and you can "see" the results on Hudson, but they are not summarized and integrated with the build's download page. This can happen, for example, if Hudson is busy and returns a "502" error while the zip file of the results is being fetched.
There is a way to "manually" retry this "fetch and summarize" job. We currently try only once, because occasionally there are test failures that result in "infinite retries" for some types of errors.
In brief, look in
There are data files there named 'testjobdata<timestamp>.txt'. There would be three of them generated for each build ... one for each platform tested. They are prefixed according to their state (no prefix meaning "has not been processed yet"). For example,
testjobdata201211280639.txt
RAN_testjobdata201211280639.txt
RUNNING_testjobdata201211280639.txt
ERROR_testjobdata201211280639.txt
There are also files named 'collection-out.txt' and 'collection-err.txt', which hold some of the standard out and standard error output from the "collect and summarize" jobs that ran. If you see a recoverable error in there (e.g. we received a 502 from Hudson), you can retry the job just by renaming the file. For example, rename 'ERROR_testjobdata201211280639.txt' back to 'testjobdata201211280639.txt' (filenames matching "testjobdata*" are processed by the usual cron job, which runs every 10 minutes).
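The prefix convention and the rename-to-retry step could be sketched as a pair of small shell helpers. The function names and the state labels they print ("pending", "ran", and so on) are made up for illustration; only the file name prefixes come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and printed labels are hypothetical.

# Print the processing state implied by a data file's name prefix.
testjob_state() {
    case "$(basename "$1")" in
        RAN_testjobdata*.txt)     echo "ran" ;;
        RUNNING_testjobdata*.txt) echo "running" ;;
        ERROR_testjobdata*.txt)   echo "error" ;;
        testjobdata*.txt)         echo "pending" ;;
        *)                        echo "unknown" ;;
    esac
}

# Rename ERROR_testjobdata<timestamp>.txt back to testjobdata<timestamp>.txt
# so that the next cron run picks it up again. Refuses any other file name.
requeue_testjob() {
    local file="$1" dir name
    dir=$(dirname "$file")
    name=$(basename "$file")
    case "$name" in
        ERROR_testjobdata*.txt) mv -- "$file" "$dir/${name#ERROR_}" ;;
        *) echo "not an ERROR_ data file: $file" >&2; return 1 ;;
    esac
}
```

For example, calling `requeue_testjob ERROR_testjobdata201211280639.txt` from inside the data directory would leave 'testjobdata201211280639.txt' in place for the cron job to find. It is probably worth checking 'collection-err.txt' first to confirm the error really was a recoverable one.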
The cron job that runs is /shared/eclipse/sdk/testdataCronJob.sh. In theory, a releng committer could run this directly (instead of waiting for the cron job, or if the cron job itself is broken), but occasionally, in the past, I've been surprised to find that some permissions weren't right (running it directly is seldom tested).
The scripts in '/shared/eclipse/sdk' are stored in git in '/org.eclipse.releng.eclipsebuilder/scripts/sdk' but there is no "checkout/checkin" going on automatically ... they are stored there for safety and history.
[December 10, 2012]
Our jobs on Hudson are collected in the Eclipse and Equinox view. Tests based on 3.x builds are prefixed with ep3, and tests based on 4.x builds are prefixed with ep4. The view shows the status of the last job run (or the job currently running). To see history, you need to click on one job. To see which test job corresponds to which build, you need to "drill down" and look at the "Parameters" of each job.
Below are some example screen shots.
The first shows the history of one job.
- a progress bar shows a job in progress (and its icon will be blinking).
- a yellow dot icon means the job finished but there were test failures (normal for our current tests).
- a grey dot means the job started but was cancelled (it could have been cancelled on purpose, or Hudson might have been restarted).
- a red dot means there was an error that prevented the tests from running (such as they could not be installed).
The following screen shot shows the results of a normal job. You can see the tests ran and had the usual 100 or so failures. You can click on "TestResults" in the left nav bar to see the whole list of tests ... we have about 80,000.
To see exactly which build was tested, you need to click on the "Parameters" link on the left nav bar to see the buildId and eclipseStream.