
Difference between revisions of "Common Build Infrastructure/Managing Hudson"

(Restarting a specific slave)
(9 intermediate revisions by 3 users not shown)
Earlier revision:

==Restarting Hudson ==

Here's the process for authorized committers to restart Hudson. A restart is required if new / updated plugins have been installed or if a new version of Hudson is available.

If you're not authorized see https://bugs.eclipse.org/bugs/show_bug.cgi?id=257265#c34

* open Hudson in a browser, log in, then go here: https://build.eclipse.org/hudson/manage
* click Prepare for Shutdown: https://build.eclipse.org/hudson/quietDown
* ssh to build.eclipse.org

 $ sudo rchudson restart

If that fails and you need to manually kill and restart hudson

* sudo su - hudsonbuild
* find the running process:

 $ ps xu | grep hudson | egrep -v "su|grep"
 55011      962 21.0  0.8 644064 128764 ?      Sl  18:32  2:21 /shared/common/ibm-java2-ppc-50/bin/java -jar /opt/users/hudsonbuild/hudson.war --configure=/opt/users/hudsonbuild/

* kill that process

 $ kill 962

* start Hudson:

 $ ~/startup &

* log out
* open Hudson in a browser, log in, then go here: https://build.eclipse.org/hudson/manage to verify your changes
 
[[Category:Athena Common Build]]
 
 
[[Category:Hudson]]
 
 
[[Category:Releng]]
 

Revision as of 14:20, 18 July 2012

Coordinating Activity

In order to prevent multiple people from trying to restart the same slave, all such activity should be announced/coordinated on the cross-project mailing list.

Any specific 'tweaks' (or new jobs) should be documented in a bug (https://bugs.eclipse.org).


Restarting Hudson

Here's the process for authorized committers to restart Hudson. A restart is required if the master node is failing, or multiple slaves are acting up.

1) Log in to the web interface (https://hudson.eclipse.org/hudson). NOTE: if Hudson is so dead that the web interface isn't responding, get Webmaster involved.

2) Click the 'Manage Hudson' link in the left-hand menu.

3) Select 'Prepare Hudson for shutdown'. This lets you type a short message explaining why Hudson is being shut down, and prevents any new jobs from starting.

4) Either wait for any running jobs to finish or cancel them.

5) Under Manage Hudson, click 'Plugin Manager'.

6) Select the 'Installed Plugins' tab.

7) At the bottom of that page, click the 'Restart when no jobs are running' button.

8) Wait for Hudson to restart itself.
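
Steps 3 and 4 above can also be checked from a script. The following is a minimal sketch, not a tested procedure for this server. It assumes that the `quietDown` path sits directly under the Hudson base URL and that the server offers a Jenkins-style JSON listing of nodes in which each node has a `displayName` and an `idle` flag. That JSON shape, the function names and the sample node names are all assumptions for illustration.

```python
import json
from urllib.parse import urljoin


def quiet_down_url(base_url):
    """Build the 'prepare for shutdown' URL under the given Hudson base URL."""
    return urljoin(base_url.rstrip("/") + "/", "quietDown")


def busy_nodes(computer_json):
    """Return the names of nodes that still report a running job.

    Assumed JSON shape (not confirmed for this server):
    {"computer": [{"displayName": str, "idle": bool}, ...]}
    """
    data = json.loads(computer_json)
    return [c["displayName"] for c in data.get("computer", [])
            if not c.get("idle", True)]


def safe_to_restart(computer_json):
    """True once no node reports a running job (step 4)."""
    return not busy_nodes(computer_json)


if __name__ == "__main__":
    # Hypothetical sample data; the node names are made up.
    sample = json.dumps({"computer": [
        {"displayName": "master", "idle": True},
        {"displayName": "example-slave", "idle": False},
    ]})
    print(quiet_down_url("https://hudson.eclipse.org/hudson"))
    print(busy_nodes(sample))
    print(safe_to_restart(sample))
```

The functions only build a URL and inspect JSON text, so fetching the data (and authenticating) is left to whatever tool you already use.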

Restarting a specific slave

The Windows slave requires Webmaster assistance to restart. The unix slaves, however, can be 'soft' restarted as follows:

1) Log in to the Hudson web interface.

2) Select the node from the list of nodes on Hudson's main page.

3) Click the 'Mark node temporarily offline' button in the upper right corner of the page. Provide a short message about why you're about to restart the slave. (The button then changes to the reverse operation, "mark this node online".)

4) Wait for any jobs to finish or cancel them.

5) Click the 'disconnect' link in the left menu.

6) Once the node is disconnected, wait ~30s and click the 'Launch slave agent' button just under the node name in the main window.

[User experience (David Williams, circa 7/2012): I don't always see a "launch slave agent" button. Once I click "disconnect" (step 5) and confirm with a "reason message", the disconnect button goes away. I refresh occasionally and eventually the disconnect button comes back. At that point, I press "mark this node online" (the end result of step 3) and it all starts up. Another observation (perhaps it depends on platform?): sometimes the "disconnect" button does not come back (e.g. after 30 or 60 seconds), but if I then press "mark this node online", a "Launch slave agent" button appears, and I can click on that to start things up.]

7) Watch the login process and check that everything looks OK (i.e. no errors). The "logs" link in the right nav bar is a very interesting way to watch the login process.
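
The order of operations above, including the variation described in the user note, can be summarised as a small decision function. This is only an illustrative sketch: `NodeState`, its field names and `next_step` are hypothetical and do not correspond to anything in Hudson itself.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    marked_offline: bool  # 'Mark node temporarily offline' has been clicked
    idle: bool            # no jobs are running on the node
    connected: bool       # the slave agent is currently connected


def next_step(state, disconnected_once):
    """Return the next manual action for a soft restart of a unix slave.

    disconnected_once is True after 'disconnect' (step 5) has been clicked.
    """
    if not disconnected_once:
        if not state.marked_offline:
            return "mark node temporarily offline"
        if not state.idle:
            return "wait for jobs to finish or cancel them"
        return "disconnect"
    if not state.connected:
        return "launch slave agent"
    if state.marked_offline:
        # Matches the user note: the node may still need to be put back online.
        return "mark this node online"
    return "check the log for errors"


if __name__ == "__main__":
    print(next_step(NodeState(False, False, True), disconnected_once=False))
    print(next_step(NodeState(True, True, False), disconnected_once=True))
```

Calling `next_step` after each click gives the following action in the sequence, which makes it harder to skip the 'temporarily offline' step before disconnecting.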

Creating a new job

Before creating a new job you need:

  1. A job name
  2. The id of a committer who will 'own' the job

Optional:

  1. Extra committer ids
  2. A 'source' job to copy
  3. A specific job type
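
If job requests arrive by bug or e-mail, it can help to record them in a fixed shape so that nothing required is missing before you start. A minimal sketch follows. The `JobRequest` class and its field names are hypothetical and simply mirror the two lists above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobRequest:
    # Required items
    job_name: str
    owner_id: str
    # Optional items
    extra_committer_ids: List[str] = field(default_factory=list)
    source_job: Optional[str] = None
    job_type: Optional[str] = None

    def missing(self):
        """Return the names of required items that are blank."""
        return [name for name in ("job_name", "owner_id")
                if not getattr(self, name).strip()]


if __name__ == "__main__":
    # Hypothetical request with no job name supplied.
    request = JobRequest(job_name="", owner_id="jdoe")
    print(request.missing())
```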

1) Log in to Hudson's web interface.

2) Click the 'New job' link in the left-hand menu.

3) Provide the job name and, if the project did not specify a job type, select the 'default' project type (Build an open source project).

3.a) If the project has provided a source job, select 'Copy existing job' and paste the source job's name into the 'Copy from' text box.

4) Press OK.

5) Once the job config loads, scroll down to the 'Security' section.

6) Add the owning committer (and any extra committers, one at a time) via the 'user group to add' text box, clicking 'add' for each entry. If you press Enter on your keyboard instead, you'll be doing this again.

7) Set the permissions for each user. On most jobs that means everything except 'Extended read'.

8) Remove your id from the list.

9) Scroll to the bottom of the page and click 'Save'.
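
The two decisions in these steps (new job versus copy, and which ids to add under 'Security') can be written down as a short sketch. The function names are hypothetical. The labels are the ones quoted in the steps above, not values read from Hudson.

```python
DEFAULT_JOB_TYPE = "Build an open source project"  # label quoted in step 3


def creation_choice(source_job=None, job_type=None):
    """Pick between copying a source job (step 3.a) and a fresh job (step 3)."""
    if source_job:
        return ("Copy existing job", source_job)
    return ("New job", job_type or DEFAULT_JOB_TYPE)


def users_to_add(owner_id, extra_ids=()):
    """Owner first, then extra committers, one entry each, no repeats (step 6)."""
    ordered = []
    for uid in (owner_id, *extra_ids):
        if uid and uid not in ordered:
            ordered.append(uid)
    return ordered


if __name__ == "__main__":
    # Hypothetical ids and job name.
    print(creation_choice(source_job="example-source-job"))
    print(creation_choice())
    print(users_to_add("jdoe", ["asmith", "jdoe"]))
```

Each id returned by `users_to_add` corresponds to one 'add' click in step 6. Removing your own id (step 8) remains a manual step.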