Restart Within Hudson
Configuration changes that require a Hudson restart to take effect should provide a Restart link or button.
Unfortunately, the CLI restart and soft-restart methods, which call into Hudson (for example, Hudson.safeRestart), don't handle many of the necessary use cases by default. A Hudson restart link based on these methods would fail too often. It is one thing to leave an opening for plugins to implement, but quite another to depend on the kindness of plugins for basic Hudson operations. The default Lifecycle implementation needs to be in some sense universal, even if it requires the cooperation of system administrators.
In the following, "correct restart" means restart specifically tailored to the environment in which the Hudson instance is running. "System administrator" is a person who provisions and deploys the Hudson instance. "Hudson admin" is a person with admin privileges in Hudson. Sometimes these are the same people; sometimes not.
- The current Lifecycle API must be preserved, for compatibility with existing plugin extensions. The new default mechanism will only be invoked if the hudson.lifecycle system property is not specified.
- It must be possible for a system administrator to preconfigure Hudson for correct restart. This is particularly important for "no-admin" uses of Hudson.
- Restart must require Hudson admin privileges (ACL.SYSTEM).
Two designs are being actively pursued.
The first, Restart Command, allows a system administrator to specify a command or provide a script during Hudson startup. This is most suited to Hudson running in containers like Tomcat, etc. which provide easy application restart from the command line. The second, Soft Restart, is more general and more likely to cause problems; it involves re-invoking a subset of the startup sequence to create new instances of Hudson classes, plugins, etc. rooted in a different classloader.
Correct restart is a multi-dimensional problem, different for each OS, service implementation and container (Tomcat, Jetty, GlassFish, etc.). The Lifecycle extension point is not well suited for multi-dimensional invocation, e.g., the OS is X AND the container is Y AND (the container does not support single application restart OR the application name/war file location in the container is Z). While certain tricks, like replacing the Hudson WAR file, work to restart the application in many containers, to cover all the possibilities, Lifecycle would need an extension for every possible combination.
Yet every system administrator already knows, or can easily discover, a command or script to restart any running Hudson instance. The best restart mechanism, and the one most likely to be correct, would leverage those commands and scripts rather than try to replace them with Java code. The feature described below provides a generic Lifecycle that does so.
Two new ways are provided for the system administrator to configure correct restart:
- by providing a hudson-restart file in the $HUDSON_HOME provisioned for the Hudson instance, and
- by specifying a restart command on the command line when Hudson is invoked.
In a nutshell, restart within Hudson will invoke a system administrator-supplied command.
Since the command is presumed to restart Hudson, it may not return at all. If it does return, it is expected to return successfully. If the restart command fails, restart will fail.
If the restart command is not specified, the existing default Lifecycle mechanism will be invoked.
If the hudson.lifecycle system property is specified, the restart command, if any, will not be used.
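The contract for the restart command (it may never return, and if it does return it must return successfully) can be sketched in shell as follows. This is only an illustration of the semantics: run_restart_command is a hypothetical helper, not part of Hudson, and it assumes the command is passed as separate arguments.

```shell
# run_restart_command: hypothetical helper illustrating the restart contract.
#   $@: the restart command and its arguments
run_restart_command() {
    if [ "$#" -eq 0 ]; then
        echo "no restart command specified" >&2
        return 2
    fi
    # If the command really restarts Hudson, control may never come back here.
    if "$@"; then
        echo "restart command returned successfully"
        return 0
    else
        # $? still holds the exit status of the restart command at this point.
        local status=$?
        echo "restart failed: command exited with status $status" >&2
        return 1
    fi
}
```

A non-zero exit status is the only failure signal used here; a command that never returns is treated as a restart in progress.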
Default Restart Script
To allow system administrators to pre-configure Hudson for correct restart, if a restart script is present in HUDSON_HOME the initial value of the restart command will be set to invoke the script. The script must be named hudson-restart.
The script or program may have an extension, e.g., .bat on Windows or .py, etc. on Unix, but it doesn't matter what it is; in all cases, it will be invoked as a program.
The script must be executable.
The script should be specific to the environment in which the HUDSON_HOME is used.
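As a rough sketch of how a system administrator might provision such a script on Unix, the function below writes a small wrapper into HUDSON_HOME and marks it executable. install_restart_script is a hypothetical helper, and the restart command passed to it is whatever is appropriate for the local environment.

```shell
# install_restart_script: hypothetical provisioning helper, not part of Hudson.
#   $1: HUDSON_HOME directory
#   $2: environment-specific restart command
install_restart_script() {
    local hudson_home="$1"
    local restart_command="$2"
    local script="$hudson_home/hudson-restart"

    mkdir -p "$hudson_home" || return 1
    # Write a tiny wrapper that hands control to the environment-specific command.
    printf '#!/bin/bash\nexec %s\n' "$restart_command" > "$script" || return 1
    # The script must be executable, or it cannot be invoked as a program.
    chmod +x "$script"
}
```

For example, `install_restart_script "$HUDSON_HOME" "<command that restarts the server running Hudson>"` would be run once while provisioning the instance.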
Restart System Property
Even if a restart script is present in HUDSON_HOME, a system administrator may change the restart command to do something else, effectively ignoring the script. The script may be overridden with the -Dhudson.restart command line option.
The value of the hudson.restart system property will be used to initialize the restart command. The order of initialization is:
- If -Dhudson.restart is specified, use that.
- Otherwise, if a restart script is present, use a command that invokes the script.
- Otherwise, the restart command is not specified.
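The initialization order above can be sketched as a small shell function. This only illustrates the precedence; it is not the Hudson implementation. resolve_restart_command is a hypothetical name, and matching hudson-restart.* for scripts with an extension is an assumption based on the naming rule described earlier.

```shell
# resolve_restart_command: hypothetical helper showing the precedence only.
#   $1: value of the hudson.restart system property (empty if not specified)
#   $2: HUDSON_HOME directory
# Prints the restart command; prints nothing and returns 1 if none is specified.
resolve_restart_command() {
    local property_value="$1"
    local hudson_home="$2"
    local candidate

    # 1. -Dhudson.restart wins if specified.
    if [ -n "$property_value" ]; then
        printf '%s\n' "$property_value"
        return 0
    fi

    # 2. Otherwise use an executable restart script found in HUDSON_HOME.
    #    The glob also matches names with an extension, e.g. hudson-restart.py.
    for candidate in "$hudson_home"/hudson-restart "$hudson_home"/hudson-restart.*; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done

    # 3. Otherwise the restart command is not specified.
    return 1
}
```

When the function returns 1, the existing default Lifecycle mechanism would apply.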
A new lifecycle, hudson.lifecycle.RestartCommandLifecycle, will be added to Hudson. hudson.lifecycle.Lifecycle will be modified to use the RestartCommandLifecycle for restart if:
- a restart script or command has been specified, and
- the hudson.lifecycle system property is not defined.
The implementation will log a warning message if both the hudson.lifecycle system property and a restart script or command have been provided.
Another way of looking at the restart problem is to observe that most of it deals with the container, service or command that launched Hudson. These issues would be sidestepped if Hudson could be restarted in the JVM it is currently running in.
A major advantage of such an approach would be that the fact that Hudson was restarted would not be detectable from the outside. The PID, servlet connection, etc. would be unchanged. This would make life simpler for certain kinds of high availability environments.
The major disadvantage, of course, is it might cause massive memory leaks and thrown exceptions.
- For starters, if permgen cannot be successfully garbage collected, the JVM will likely fail after restart with a "java.lang.OutOfMemoryError: PermGen space" error. Or the influx of newly loaded classes and instances might force an OutOfMemoryError in the heap.
- The strategy depends on plugins:
- Not being able to find a way to lock themselves and their storage in memory,
- Obediently stopping whatever operations they have in progress and freeing resources like files and sockets.
This type of restart should not be attempted except during safe-restart.
The initial implementation will modify org.eclipse.hudson.init.InitialSetup.invokeHudson() to create an outer class loader used to create all Hudson classes and instances, not including those used by the initialization sequence up to that point. It will then use this class loader to create and invoke the thread that initializes the Hudson instance.
A soft-restart-plugin will be developed, with class org.hudsonci.plugins.SoftRestartLifecycle extends hudson.lifecycle.Lifecycle. The plugin will be installed like any other, and used only if the hudson.lifecycle system property is defined to name it prior to Hudson startup. Thus, the availability of soft restart will be entirely under the control of the system administrator. E.g.,
$ JAVA_OPTS="-Dhudson.lifecycle=org.hudsonci.plugins.SoftRestartLifecycle"
$ <launch Hudson, passing $JAVA_OPTS to the JVM>
This simple example shows how one might use the restart command to reliably restart Hudson no matter how it is started, with the added virtue that if Hudson exits for any reason, it will automatically restart. It involves four scripts, including hudson-restart, shown below in bash pseudo-code.
start-hudson.bash:
#!/bin/bash
<command to run hudson.war or a server that runs hudson.war>
# this script always fails!
exit 1

run-hudson:
#!/bin/bash
# relaunch Hudson every time start-hudson.bash returns
until ~/start-hudson.bash; do
    echo "Restarting hudson"
    sleep 1
done

Script that stops Hudson:
#!/bin/bash
<command to kill running hudson process or stop server running hudson>
Since the run-hudson script always restarts Hudson if it exits, simply causing Hudson to exit will restart it.
So add a hudson-restart script that makes Hudson exit to the HUDSON_HOME that Hudson will use when started by the run-hudson script.
This technique of having another process restart Hudson is more reliable than having a child process of Hudson (the hudson-restart script) kill Hudson and relaunch it. Also, although some servers allow you to restart a single application in the same JVM, it is risky to do so; unless all instances and classes of the previously running Hudson are garbage-collected, several restarts will exhaust permgen. It is more reliable to relaunch the entire server, and therefore it is a good practice to run only one application per server, so other applications aren't affected.
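One possible body for a hudson-restart script in this setup is sketched below. It assumes that whatever launches Hudson records the process ID in a PID file; the PID file, its location, and the stop_hudson name are all hypothetical and not part of the design described here.

```shell
# stop_hudson: hypothetical body for a hudson-restart script.
#   $1: PID file written by whatever launches Hudson (an assumption)
stop_hudson() {
    local pid_file="$1"
    local pid

    if [ ! -r "$pid_file" ]; then
        echo "no readable PID file at $pid_file" >&2
        return 1
    fi
    pid=$(cat "$pid_file")
    # Ask the process to exit (SIGTERM). Relaunching is left to the
    # supervising loop, not to this script.
    kill "$pid"
}
```

Because the script only causes Hudson to exit, it stays a simple child process of Hudson and never has to relaunch anything itself.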
Restart Links or Buttons
Restart links should not be shown unless
true. Otherwise, a message like "Restart required for changes to take effect" should be displayed.
A restart link should show the single word Restart and should call
The Plugin Manager will show a restart link if a) a plugin is updated or a plugin is loaded that requires restart, and b) if