Platform-releng/How to do miscellaneous releng tasks

Latest revision as of 12:49, 9 February 2022

There are times when some quirky, out-of-the-ordinary thing has to be done. This page captures some of them. The details are likely to get out of date or to change from case to case, but they should at least serve as "hints" in case others ever have to do these things (and they serve as reminders to me :). They are probably too quirky and volatile to be part of a FAQ, but might evolve into a "procedures document". I try to capture one every time I do something that I find confusing or hard to keep straight, or when someone asks about it, so this is not a complete or exhaustive document but, again, hints. Note that most of these procedures require shell access to build.eclipse.org.

Mailing list and bugzilla

It almost goes without saying, but for someone "starting fresh" in Platform Release Engineering, the Bugzilla "user" to watch is platform-releng-inbox@eclipse.org, and the mailing list to subscribe to is platform-releng-dev@eclipse.org. Ideally, you should also monitor the "cross project" Bugzilla component and mailing list: cross-project.inbox@eclipse.org and cross-project-issues-dev@eclipse.org, respectively.

How to restart Jenkins tests

Tests can reliably be re-run, even a "long time" after the initial run (assuming the build is still on "downloads"), because we save all the relevant data on "downloads", and the test's "input parameters", taken as a whole, specify exactly what to run and what to use to "publish" the results. Note: currently, we can only run "I-builds" with the two-part time stamp. We cannot run, for example, the tests from "S-4.5M4-201412151800". Normally, though, S-4.5M4-201412151800 corresponds exactly to I20141215-1800, so that would be the build to use to re-run tests from a milestone.
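The mapping from a milestone label to its I-build id can be sketched in shell. This is only an illustration based on the single example above: the function name milestone_to_ibuild is hypothetical, and the sketch assumes the label always ends in a 12-digit YYYYMMDDHHMM time stamp after the last hyphen.

```shell
# Hypothetical helper: derive the I-build id from a milestone label,
# e.g. S-4.5M4-201412151800 -> I20141215-1800.
# Assumption: the label ends in "-YYYYMMDDHHMM".
milestone_to_ibuild() {
    label=$1
    stamp=${label##*-}    # everything after the last hyphen
    case $stamp in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "unexpected time stamp in '$label'" >&2; return 1 ;;
    esac
    date_part=$(printf '%s' "$stamp" | cut -c1-8)     # YYYYMMDD
    time_part=$(printf '%s' "$stamp" | cut -c9-12)    # HHMM
    printf 'I%s-%s\n' "$date_part" "$time_part"
}

milestone_to_ibuild S-4.5M4-201412151800    # prints I20141215-1800
```

As noted above, the correspondence holds "normally", so it is worth confirming on "downloads" that the derived I-build actually exists before re-running tests against it.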

https://ci.eclipse.org/releng/view/Automated%20tests/

Hopefully the test "names" are self-explanatory: 'ep45I-unit-win32', for example, is the job that runs the unit tests on a 32-bit Windows machine for the Eclipse 4.5 I-builds. All these test jobs are pretty much identical, but they are run as different jobs for two reasons: 1) It improves the automatic history "bookkeeping", so that if the number of test failures increases or decreases, you are comparing "apples to apples". 2) There are times when "machine restrictions" apply; for example, on Windows we allow only one build to run at a time, whereas on the Mac we allow more than one.
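To make the naming pattern concrete, here is a small sketch that splits a job name into its parts. It is inferred from the one example name above, so treat the pattern (ep, a single-digit major version, the minor version, a build-type letter, then the test kind and platform separated by hyphens) as an assumption; describe_job is a hypothetical name.

```shell
# Hypothetical helper: split a test job name such as ep45I-unit-win32.
# Assumed pattern: ep<major><minor><type>-<kind>-<platform>
describe_job() {
    job=$1
    prefix=${job%%-*}      # ep45I
    rest=${job#*-}         # unit-win32
    kind=${rest%%-*}       # unit
    platform=${rest#*-}    # win32
    # First digit is taken as the major version, the remaining digits as the minor.
    version=$(printf '%s\n' "$prefix" | sed -n 's/^ep\([0-9]\)\([0-9][0-9]*\)[A-Z]$/\1.\2/p')
    build_type=$(printf '%s\n' "$prefix" | sed -n 's/^ep[0-9][0-9]*\([A-Z]\)$/\1/p')
    if [ -z "$version" ] || [ -z "$build_type" ]; then
        echo "'$job' does not match the assumed pattern" >&2
        return 1
    fi
    printf 'stream=%s type=%s tests=%s platform=%s\n' \
        "$version" "$build_type" "$kind" "$platform"
}

describe_job ep45I-unit-win32    # prints stream=4.5 type=I tests=unit platform=win32
```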

Preparation

To "re-run" a test, you need three pieces of information: the buildId, such as M20150204-1700; the three-part build stream, such as 4.4.2; and the "hash tag" (the Git commit hash) of the aggregator for that build, such as 115d147f542bfcfeeba452946993c2f2578e85a8.
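Before filling in the form, it can help to sanity-check the three values. The sketch below only checks their shape, using patterns inferred from the examples above (a capital letter plus a two-part time stamp, three dot-separated numbers, and a 40-character hexadecimal commit hash); check_test_params is a hypothetical name, and the checks say nothing about whether the build really exists.

```shell
# Hypothetical helper: check the shape of the three re-run values.
# Patterns are assumptions inferred from the examples in the text.
check_test_params() {
    build_id=$1
    stream=$2
    commit=$3
    # buildId: type letter, YYYYMMDD, hyphen, HHMM (e.g. M20150204-1700)
    printf '%s\n' "$build_id" | grep -Eq '^[A-Z][0-9]{8}-[0-9]{4}$' ||
        { echo "bad buildId: $build_id" >&2; return 1; }
    # build stream: three dot-separated numbers (e.g. 4.4.2)
    printf '%s\n' "$stream" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$' ||
        { echo "bad build stream: $stream" >&2; return 1; }
    # aggregator commit: full 40-character hexadecimal Git hash
    printf '%s\n' "$commit" | grep -Eq '^[0-9a-f]{40}$' ||
        { echo "bad commit hash: $commit" >&2; return 1; }
    echo "parameters look well-formed"
}

check_test_params M20150204-1700 4.4.2 115d147f542bfcfeeba452946993c2f2578e85a8
```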

If the tests for the build have run once (i.e. they are in the job "history"), these values are in the "Parameters" field of the existing test attempt. Even if that history has been lost, the values can be obtained from the download directory.

To Re-run

You need to log in to the "Releng Jenkins" (for that one, you use your committer ID and password, not your email as on JIPP instances). Click the "Build now" link, and you will be presented with a form to fill in with the three values from above. Click the "ok" button (labeled 'Build'), and check back to see if it's running! (You should at least see it "queued up" if it cannot run right away because the test machine is busy.)
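The same form could, in principle, be submitted from the command line through Jenkins' standard buildWithParameters endpoint. The following is a sketch under several assumptions, not a tested procedure for this instance: the parameter names eclipseStream and aggregatorHash, the environment variables JENKINS_USER and JENKINS_TOKEN, and the job URL layout under https://ci.eclipse.org/releng are all guesses that should be checked against the job's "Parameters" page. By default the function only prints what it would send.

```shell
# Sketch: queue a test job with the three values.
# Assumptions: parameter names, credentials variables, and base URL.
JENKINS_BASE=${JENKINS_BASE:-https://ci.eclipse.org/releng}

rerun_tests() {
    job=$1
    build_id=$2
    stream=$3
    commit=$4
    url="$JENKINS_BASE/job/$job/buildWithParameters"
    # Dry run unless DRY_RUN=0 is set explicitly.
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf 'POST %s buildId=%s eclipseStream=%s aggregatorHash=%s\n' \
            "$url" "$build_id" "$stream" "$commit"
        return 0
    fi
    # --data-urlencode makes curl send a POST request with form data.
    curl --fail --silent --show-error \
        --user "$JENKINS_USER:$JENKINS_TOKEN" \
        --data-urlencode "buildId=$build_id" \
        --data-urlencode "eclipseStream=$stream" \
        --data-urlencode "aggregatorHash=$commit" \
        "$url"
}

# Values copied from the text purely to show the shape of a call.
rerun_tests ep45I-unit-win32 M20150204-1700 4.4.2 115d147f542bfcfeeba452946993c2f2578e85a8
```

Keeping the dry run as the default means an accidental invocation cannot queue a job; the web form remains the documented route.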

How to see test results on Jenkins

Our jobs on Jenkins are collected in the Automated tests view. The view shows the status of the last job run (or of the job currently running). To see the history, you need to click on one job. To see which test job corresponds to which build, you need to "drill down" and look at the "Parameters" of each job.

How to change the web pages for builds or test results

This section is a brief outline of the files and steps involved in the web pages for "drops" (builds) and test results.

A key piece of the work is done by a custom Ant task, which is found in TestResultsGenerator.java in the eclipse.platform.releng.buildtools repository. That jar file (bundle), along with several others and a feature, is built on Hudson and put in a p2 repository on the build machine, under /shared/eclipse/buildtools. (Much of this is done and triggered manually. It is not part of the regular build and not completely automated, mostly because it is rarely done.)

For the main drop page, the file index.template.php is "run through" that Ant task, which fills in the specific artifacts to download, partially based on the testManifest.xml file.

For the Test Results page, a file named testResults.php "controls" what is displayed, but it does not act as a template for the Ant task. Instead, the Ant task creates files which testResults.php "includes" (if they exist).

The "drop page" (index.php) and the "compiler logs" (compilerSummary.html) are typically generated once, after a build is done. The Test Results summaries are re-computed several times, in response to receiving a "done" signal from Hudson when each platform is done with its testing. Each "regeneration" assumes that all "test results" files are still available, even if the summaries for that set were already generated previously.

The included files are mostly pure HTML with a minimum of "style" specified. Instead, the HTML elements inherit their style from a "static" file named resultsSection.css, which takes effect for content inside the "resultsSecion" div.

That resultsSection.css is "included" by DL.thin.header.php.html. That "thin header" is a special version of the "Solstice Theme" which provides a minimal amount of "extra" things. It is also useful because it can be displayed on a non-Eclipse.org downloads machine (such as the Eclipse build machine, or even a "local build" machine), whereas the full Solstice theme requires access to Eclipse.org databases.

The DL.thin.header.php.html file is a highly customized version of one that can be obtained from Eclipse.org. See the Solstice documentation for more information. In short, the thin header template can be obtained with

wget -O DL.thin.header.php.NEW.html 'https://eclipse.org/eclipse.org-common/themes/solstice/html_template/index.php?theme=solstice&layout=thin-header'

From time to time, that file should be fetched and compared with our customized version, to see if anything has changed. It is recommended to use WTP to format it, and also to format or edit CSS files such as resultsSection.css. In addition to providing consistent formatting, which makes comparisons easier, WTP has a built-in color editor that is handy for finding good values for our custom colors.
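The comparison itself can be done with a plain unified diff. A minimal sketch, assuming both files have already been formatted the same way; template_changed is a hypothetical helper name, and the file names in the usage comment are the ones used above.

```shell
# Hypothetical helper: compare our customized header with a freshly
# fetched template. Returns diff's exit status: 0 = identical,
# 1 = differences found, 2 = trouble (e.g. a missing file).
template_changed() {
    ours=$1
    upstream=$2
    diff -u "$ours" "$upstream" && status=0 || status=$?
    case $status in
        0) echo "identical: nothing to merge" ;;
        1) echo "differences found: review the diff above" ;;
        *) echo "diff failed (missing file?)" >&2 ;;
    esac
    return "$status"
}

# Usage, from the directory that holds both files:
#   template_changed DL.thin.header.php.html DL.thin.header.php.NEW.html
```

Because our copy is highly customized, some differences are always expected; the point of the diff is to spot changes on the upstream side that are worth carrying over.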

There are other, similar files used for (i.e. "included" in) our download pages, such as eclipseDownloadPage.css and eclipseDownloadPage.js, but the hope is that this bare outline will help someone get started when changes or fixes are needed in the future.

Other releng tasks

See Platform-releng/Platform_Build_Automated#Routine_release_engineering_tasks_for_builds for other, more routine releng tasks.
