Description of the headless booting sequence (Buckminster)
Below is a reasonably current version of an early design document for Buckminster Headless, describing why the boot process in particular is somewhat convoluted. Note that some details in the current implementation may differ from this description.
Goal: it must be possible to run Buckminster from the command line, a.k.a. headless
Effectively, there should be a reasonable set of commands to do many of the relevant things you can do with Buckminster for two main reasons:
- To allow automation of things that you would otherwise be required to do interactively in the Eclipse IDE.
- To give someone who absolutely does not wish to use the Eclipse IDE a way to at least set up and work with a workspace with the help of Buckminster, for further use with other tools (e.g. Emacs).
This mechanism will be supplied in two forms:
- An ordinary Buckminster feature/plugin set which, once installed into an Eclipse IDE instance, more or less automatically allows headless invocations using that instance.
- A self-contained package of Buckminster with just as much of the Eclipse infrastructure and plugins/features as is required to run, i.e. a 'product' in Eclipse parlance.
Apart from the UI aspects present in the first form, the two forms should perform identically in core functionality. However, as both forms are bona fide Eclipse instances, optional plugins/features can be installed and make behavior and available mechanisms differ.
Not surprisingly, there are a number of challenges to overcome in order to make it work smoothly for the end user in various settings (differing platforms etc.). Also, from a look-and-feel perspective, the commands should behave exactly like any other command-line tool: scriptable, with transparent stdin/stdout/stderr handling, etc.
Look and feel
Here we will focus on how the user perceives the command line.
A common pattern suggested for adoption is the 'launcher + command' pattern, as used by various toolsets such as 'cleartool', 'p4' and 'cvs', to name a few.
Basically, there is the 'launcher', which is the actual executable. The launcher can accept a number of option flags, which control the launch and/or provide settings for a common context that all installed commands can make use of. As the launcher parses its command line, it may eventually hit a non-option argument, which should be treated as the command name. The command can also have option flags and/or arguments. As these are still part of the full command line seen by the launcher, it is the launcher's responsibility to look up the implementation of the command internally and dispatch control to it, providing the rest of the command line in a form suitable for command parsing.
Thus, a format for using the launcher is something like this:
launcher [launcher-options] [command] [command-options]
Beyond the split between launcher-options and command-options, it should be essentially opaque from a user perspective which options are parsed/handled where. In the general case, options should not need to be specified in any particular order.
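The split described above can be sketched as follows. This is only an illustration, not the actual Buckminster code: the class and field names are hypothetical, and it assumes the simplest case where no launcher option takes a separate value argument (an option such as '--workspace <path>' would need extra handling so that the path is not mistaken for the command name).

```java
import java.util.ArrayList;
import java.util.List;

public class LauncherSplit {
    /** Result of splitting a raw command line (all names are illustrative). */
    public static final class Split {
        public final List<String> launcherOptions = new ArrayList<>();
        public String command; // stays null if no command name was given
        public final List<String> commandOptions = new ArrayList<>();
    }

    public static Split split(String[] args) {
        Split s = new Split();
        int i = 0;
        // Everything before the first non-dash argument belongs to the launcher.
        while (i < args.length && args[i].startsWith("-")) {
            s.launcherOptions.add(args[i++]);
        }
        // The first non-dash argument is the command name.
        if (i < args.length) {
            s.command = args[i++];
        }
        // The rest is handed to the command untouched.
        while (i < args.length) {
            s.commandOptions.add(args[i++]);
        }
        return s;
    }

    public static void main(String[] args) {
        Split s = split(new String[] {"--verbose", "build", "--clean", "target"});
        System.out.println(s.launcherOptions + " | " + s.command + " | " + s.commandOptions);
    }
}
```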
For Buckminster, there should be a launcher that is accessed by simply typing 'buckminster'. This should work on any supported platform. An important point is that it must also be accessible from scripting languages/environments without the need to 'know' more (e.g. it must not require the use of platform specifics).
The format for how to give options will benefit from following some form of common convention, and above all, be consistent (equally applies to both launcher-options and command-options).
A suggestion is to follow the convention of one-letter option names with one dash, and arbitrarily long (but abbreviatable) option names with two dashes. The two variants can be used interchangeably (or the developer may add recognition of only one of them). There is no implied correspondence between the short-form letter and the initial letter of the long form.
An example from 'ls':
C:\tmp>ls --help
Usage: ls [OPTION]... [PATH]...
  -A, --almost-all       do not list implied . and ..
  -a, --all              do not hide entries starting with .
  -B, --ignore-backups   do not list implied entries ending with ~
  -b, --escape           print octal escapes for nongraphic characters
  -C                     list entries by columns
  -c                     sort by change time; with -l: show ctime
  -D, --dired            generate output well suited to Emacs' dired mode
  -d, --directory        list directory entries instead of contents
  -F, --classify         append a character for typing each entry
  -f                     do not sort, enable -aU, disable -lst
      --format=WORD      across -x, commas -m, horizontal -x, long -l,
                         single-column -1, verbose -l, vertical -C
      --full-time        list both full date and full time
Option names are always case-sensitive though.
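One way to support abbreviatable long option names is to accept any prefix that matches exactly one known name. The sketch below is an assumption about how this could work, not the actual parser; the class and method names are hypothetical, and the option names are taken from elsewhere in this document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OptionAbbrev {
    /**
     * Resolves a possibly abbreviated long option (e.g. "--work") against the
     * known long names. Returns the full name, or throws if the prefix matches
     * no name or more than one name. Matching is case-sensitive.
     */
    public static String resolve(String given, List<String> known) {
        if (known.contains(given)) {
            return given; // an exact match always wins
        }
        List<String> hits = new ArrayList<>();
        for (String name : known) {
            if (name.startsWith(given)) {
                hits.add(name);
            }
        }
        if (hits.size() == 1) {
            return hits.get(0);
        }
        throw new IllegalArgumentException(hits.isEmpty()
                ? "Unknown option: " + given
                : "Ambiguous option " + given + ": " + hits);
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("--workspace", "--bootlog", "--boottrace");
        System.out.println(resolve("--work", known));
        System.out.println(resolve("--bootl", known));
    }
}
```

With this approach '--boot' would be rejected as ambiguous, since it is a prefix of both '--bootlog' and '--boottrace'.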
The ultimate objective is to reach the org.eclipse.buckminster.headless plugin, more specifically its 'application' extension point. It is this entry point that gains control and which will interpret and handle (many, but not all, see below) launcher-options and then recognize any command name and dispatch to that, presenting it with the command-options only.
There are a number of design limitations arising from the use of Eclipse as the environment to work in, making the final solution somewhat convoluted.
First, on Windows, any direct use of the 'eclipse.exe' is not possible. It has a few main flaws:
- It is linked as a 'windows' application. This means it starts as a GUI application, i.e. with no stdin/stdout, and returns control to the calling application (typically the shell) as soon as it is running under the control of the GUI. Also, it eventually spawns 'javaw' instead of 'java', which has similar problems. All this is obviously unacceptable for a command-line tool.
- Even if this flag is switched to being a 'console' application, it further exhibits issues in not properly managing stdin/out/err to any child process as well as not propagating a child process exit code to callers. These are also serious impediments to a command-line tool.
An alternative to using the Eclipse supplied executable is to call the same things it does. The really important thing it does is actually calling a java 'main' entry point in the supplied startup.jar. Thus, the same thing can easily be done directly, approximately like this:
java -jar startup.jar -application <the application extension point name>
On the surface, this should work. However, once again problems arise:
- While eclipse.exe would help manage any specific java options to send to the VM, this form would require a user to manage this. Small problem to be sure.
- Some (many) options will be acted upon by the startup.jar code before dispatching to the application. In particular, this applies to the '-data' option, which selects the workspace. While that might be ok, the option is arguably a bit badly named - '-workspace' would have felt more natural. On a related note, it is generally impossible for a plugin to later change the workspace once startup.jar has established it (not exactly true, but close enough). This restriction implies that, whatever the solution, the workspace must somehow be selected before actually running startup.jar.
- Worst of all, however, is that the startup.jar code appears to indiscriminately walk the command line, interpret any options it recognizes anywhere on it, and then remove them (ouch!). A typical example of this is the imaginary command
buckminster build -clean
If this is done directly through startup.jar, the end result is that '-clean' is acted on by startup.jar, and when the 'build' command gets control, there is no longer any '-clean' to see.
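The effect can be illustrated with a small simulation. This is not the startup.jar code; it only mimics the behavior described above, where a recognized option is consumed regardless of where it appears on the command line. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GreedyScanDemo {
    /** Mimics a launcher that consumes "-clean" wherever it appears. */
    public static List<String> greedyScan(List<String> args) {
        List<String> remaining = new ArrayList<>();
        for (String arg : args) {
            if (arg.equals("-clean")) {
                continue; // acted on and removed, regardless of position
            }
            remaining.add(arg);
        }
        return remaining;
    }

    public static void main(String[] args) {
        // The 'build' command never gets to see its own '-clean' option.
        System.out.println(greedyScan(Arrays.asList("build", "-clean")));
    }
}
```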
Hence, to make this work the way it needs to, Buckminster code must be in control all the way. A suggested implementation:
- A Buckminster boot startup class (BOOT)
  - This is a regular Java class that runs without relying on anything in Eclipse (except for startup.jar). Its objective is to interpret the raw command line, preprocess it much as startup.jar does (but with Buckminster semantics and awareness of the issues described above), and then run startup.jar. Specifically, this class and the internal APPEXT class below follow a common protocol to transfer various information without having it mangled by startup.jar.
  - It is started using the full incantation of 'java -classpath ...' etc. For convenience, a suitable 'buckminster' executable should be provided for the various platforms in order to give the user the experience of using 'just another tool'.
- The internal application extension class (APPEXT)
  - This is the regular Buckminster plugin to which startup.jar eventually transfers control. Again, it knows about the protocol BOOT uses and will do what is necessary to get the real data, eventually transferring control to a COMMAND.
- The set of installed commands (COMMAND)
  - A command is dynamically installed through the use of an extension point. APPEXT will present such commands with what seems to be a clean, nice command line for them to process in whatever manner they choose.
Each of these has its own set of options it will respond to, most likely modifying the initial raw command line as it goes. As BOOT will have to recognize all APPEXT options, and the parsing behavior should be the same, it is beneficial to ensure that BOOT and APPEXT share the relevant code.
As the BOOT class provides a parser for the raw command line, it can do anything with it and hence, Buckminster can define all of the option names as needed.
There are a few separate classes of options/information:
1. Options influencing the behavior of the BOOT code
2. Front ends that influence the behavior of whatever BOOT is starting
3. Options influencing the behavior of the APPEXT code
4. Data transparently passed on to COMMAND
Class 1 & 2
The current Java BOOT implementation understands only one class 1 option: '--forcenewvm'. BOOT defaults to running the startup.jar code in the same VM as itself, but if '--forcenewvm' is specified, it spawns a completely new process.
Future additions should be made here to support transparent customization of VM options (e.g. using '-vmargs -Xmx512M'); using such flags requires BOOT to internally force a new VM. This is especially necessary if the 'buckminster' executable is used; since that is merely an embedded command line starting java, it is difficult to insert special VM options there.
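Spawning a new VM while still behaving like a proper command-line tool (shared stdin/stdout/stderr, propagated exit code) can be sketched with the standard ProcessBuilder API. This is a minimal sketch under assumptions, not the actual BOOT code: the class and method names are hypothetical, and the arguments passed to startup.jar are only placeholders.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NewVmLauncher {
    /** Assembles: <vm> <vm args...> -jar <startup jar> <application args...> */
    public static List<String> buildCommand(String vm, List<String> vmArgs,
                                            String startupJar, List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(vm);
        cmd.addAll(vmArgs); // kept in the order the user gave them
        cmd.add("-jar");
        cmd.add(startupJar);
        cmd.addAll(appArgs);
        return cmd;
    }

    /** Runs the command with the parent's stdin/stdout/stderr and returns its exit code. */
    public static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("java", Arrays.asList("-Xmx512M"), "startup.jar",
                Arrays.asList("-application", "<the application extension point name>"));
        // Only print the command here; a real launcher would call System.exit(run(cmd)).
        System.out.println(String.join(" ", cmd));
    }
}
```

Calling System.exit(run(cmd)) in the parent is what makes the child's exit code visible to the calling shell or script.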
BOOT also currently understands one class 2 option: '--workspace <path>'. This is merely a more sensible (?) name for the '-data' option that startup.jar knows about; thus, '--workspace' is transformed to '-data' and inserted at the proper place.
|--bootlog||Causes the boot process to log what happens internally and then, via the internal protocol, make APPEXT aware of this log so that APPEXT can log it 'properly' using the core logging mechanisms. This behavior is the default, is hidden from the user, and is useful for after-the-fact debugging. It can be turned off with --nobootlog.|
|--boottrace||Causes the boot process to trace what it is doing directly to stdout. This is not the default; it is useful for debugging when the process never even gets to the point of starting APPEXT (i.e. not even the bootlog is present).|
|--forcenewvm||Regardless of what the decision would otherwise be, always force a new VM.|
Common to class 1 options is that they are removed from the command line as BOOT acts on them. Note that this only happens up to the first argument on the command line that doesn't start with a dash (signifying that it is actually a command name) - a COMMAND is free to use these option names if it so wishes.
|--workspace <path>||Will be transformed to -data when sent to startup.jar.|
|--vm <path to java executable>||Useful if the user specifically wants to run with a VM other than the one selected by using just 'java'. Will force a new VM regardless.|
|--vmarg <some regular VM arg> [--vmarg <another VM arg>] ...||Allows the user to customize the VM settings for the new process. These are order dependent in that they will be fed to the new VM (use of these flags forces a new VM regardless) in the order they are seen on the command line.|
Common to class 2 options is that they are removed from the command line as BOOT acts on them, generally being transformed into something else for whatever BOOT starts. Note that this only happens up to the first argument on the command line that doesn't start with a dash (signifying that it is actually a command name) - a COMMAND is free to use these option names if it so wishes.
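The handling of class 1 and class 2 options described above can be sketched roughly as follows. This is an illustration under assumptions rather than the actual BOOT implementation: the class, field and method names are hypothetical, the boot log options are left out, and unknown options before the command name are simply passed through. Because of that last simplification, a pass-through option that takes a value would end the scan early; a real implementation has to recognize such options as well.

```java
import java.util.ArrayList;
import java.util.List;

public class BootPreprocess {
    /** Outcome of preprocessing (all names are illustrative). */
    public static final class Plan {
        public boolean newVm;                                       // spawn a separate process?
        public String vm = "java";                                  // executable for a new VM
        public final List<String> vmArgs = new ArrayList<>();       // from --vmarg, in order
        public final List<String> startupArgs = new ArrayList<>();  // e.g. -data <path>
        public final List<String> passThrough = new ArrayList<>();  // left for APPEXT/COMMAND
    }

    public static Plan preprocess(String[] args) {
        Plan plan = new Plan();
        int i = 0;
        // Only options before the first non-dash argument (the command name) are examined.
        while (i < args.length && args[i].startsWith("-")) {
            String opt = args[i++];
            switch (opt) {
                case "--forcenewvm":            // class 1: consumed by BOOT
                    plan.newVm = true;
                    break;
                case "--workspace":             // class 2: renamed for startup.jar
                    plan.startupArgs.add("-data");
                    plan.startupArgs.add(value(args, i++, opt));
                    break;
                case "--vm":                    // class 2: implies a new VM
                    plan.vm = value(args, i++, opt);
                    plan.newVm = true;
                    break;
                case "--vmarg":                 // class 2: implies a new VM
                    plan.vmArgs.add(value(args, i++, opt));
                    plan.newVm = true;
                    break;
                default:
                    plan.passThrough.add(opt);  // not BOOT's business
            }
        }
        // The command name and everything after it are passed along unchanged.
        while (i < args.length) {
            plan.passThrough.add(args[i++]);
        }
        return plan;
    }

    private static String value(String[] args, int index, String opt) {
        if (index >= args.length) {
            throw new IllegalArgumentException("Missing value for " + opt);
        }
        return args[index];
    }
}
```

Note that an option such as '--forcenewvm' appearing after the command name is left alone, so a COMMAND can use the same name for its own purposes.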
The only reason for BOOT to recognize class 3 options (those intended for APPEXT) is to be able to correctly find the supposed COMMAND name, instead of mistaking an option argument for the command name.
BOOT never touches class 4 data; it is sent along unchanged to APPEXT.
The primary need here is to transmit an arbitrary command line from BOOT to APPEXT without startup.jar destroying it or inappropriately acting on something not intended for it. BOOT solves this by simply writing class 3 & 4 data (i.e. after removing class 1 & 2 data) to a temporary file, one line per command line argument. It then sends the name of the command line file to APPEXT.
APPEXT in turn knows it will receive a file name, and thus simply reads the file, and treats the result as if it was a regular command line.
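The file-based hand-over can be sketched as below. The class and method names and the temporary file prefix are hypothetical, and how the file name itself travels from BOOT to APPEXT is not shown. One caveat of a one-line-per-argument format is that an argument containing a line break would not survive the round trip.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class CmdLineFile {
    /** BOOT side: write one argument per line and return the file to hand over. */
    public static Path write(List<String> args) throws IOException {
        Path file = Files.createTempFile("bootcmdline", ".txt");
        Files.write(file, args, StandardCharsets.UTF_8);
        return file;
    }

    /** APPEXT side: read the file back and treat it as a regular command line. */
    public static String[] read(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        return lines.toArray(new String[0]);
    }

    public static void main(String[] args) throws IOException {
        Path file = write(Arrays.asList("--log", "build", "--clean"));
        System.out.println(Arrays.toString(read(file)));
        Files.delete(file);
    }
}
```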
However, there are some minor things APPEXT must handle:
- If BOOT has produced a boot log file (as noted, this is optional), it will send along that file name as well so that APPEXT can log it properly. Note that this happens before the command line file is read, so the information cannot be placed there.
- If the application extension point is invoked by the PDE, i.e. interactively through the Eclipse IDE, the PDE always sends along '-pdelaunch'. APPEXT must be able to parse this (and simply discard it or otherwise handle it).
APPEXT takes a number of options. Part of APPEXT's responsibility is to provide a common context for all commands, i.e. something like metadata settings applicable to all commands. Some settings set up things that a COMMAND can ignore, intentionally or not (e.g. a default progress monitor is set up depending on the settings, and a COMMAND is expected to use it if it needs a progress monitor), while others set internal things that all COMMANDs unwittingly become subject to (e.g. turning on logging so that System.out is trapped and logged).
Common to both APPEXT and COMMAND is that they are currently subjected to automatic recognition of '-?' and '-help', i.e. to print out help on their respective subjects.
Options in general
The APPEXT options can be loosely grouped into debugging, logging and context-setting options. The current options are given below; defaults are in bold:
|--displaystacktrace||To avoid frightening the average user with really ugly output for sometimes very harmless errors, APPEXT does not print full stack traces by default. This option turns them on.|
|By default makes available instances of a progress monitor implementation somewhat suitable for console use. If turned off, produces NullProgressMonitor instances. Individual commands must ask the context for an instance when needed.|
|--tmpdir <path>||Defaults to the value of java.io.tmpdir, but can be set by the user to direct temporary files to another location. Individual commands must ask the context for the value and use it.|
|--log||Shorthand for turning on all logging and sending it to stdout.|
|--logtofile||Similar to --log, but directs logging to a specific file.|
|--logconfiguration <properties or xml file name>||Allows the user to specify a log4j configuration instead of the defaults.|
|Only applicable when --logtofile is used - either appends to the file or overwrites it.|
|Turns on/off trapping of System.out|
|--trapoutloggername <name>||In case the user has some exotic log configuration set by --logconfiguration, the name used to log System.out traps can be customized.|
|Turns on/off trapping of System.err|
|--traperrloggername <name>||In case the user has some exotic log configuration set by --logconfiguration, the name used to log System.err traps can be customized.|
|Turns on/off trapping of the regular Eclipse log|
|--trapeclipselogloggername <name>||In case the user has some exotic log configuration set by --logconfiguration, the name used to log Eclipse log traps can be customized.|
|-pdelaunch||(as described above)|
Command names follow a namespace mechanism, as usual to avoid clashes in case more than one command wants to be known as 'foo'. The namespace for a given command is usually the same namespace as the enveloping plugin, but a plugin can also add levels to the namespace in case it wants to group commands inside it - this is all described in the extension point.
But also as usual, users dislike typing that much. Thus, APPEXT will attempt to match command names bottom up; as long as there is only one possible match, APPEXT is happy and will dispatch. If collisions arise, however, the user may need to be more specific. E.g. the commands com.domain_a.pluginX.foo and com.domain_b.pluginY.foo can't be reached as 'foo'; they have to be used as 'pluginX.foo' and 'pluginY.foo' respectively (or, obviously, by their full names).
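Bottom-up matching can be sketched as a suffix match on whole name segments. This is an assumed reading of the rule above, not the actual APPEXT lookup; the class and method names are hypothetical, and alias names are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommandLookup {
    /**
     * Finds the single fully qualified command whose trailing name segments
     * equal the given (possibly shortened) name. Throws if there is no match
     * or more than one.
     */
    public static String resolve(String given, List<String> fullNames) {
        List<String> hits = new ArrayList<>();
        for (String full : fullNames) {
            // Whole segments only: "foo" matches "a.b.foo" but not "a.b.xfoo".
            if (full.equals(given) || full.endsWith("." + given)) {
                hits.add(full);
            }
        }
        if (hits.size() == 1) {
            return hits.get(0);
        }
        throw new IllegalArgumentException(hits.isEmpty()
                ? "Unknown command: " + given
                : "Ambiguous command " + given + ": " + hits);
    }

    public static void main(String[] args) {
        List<String> installed = Arrays.asList(
                "com.domain_a.pluginX.foo", "com.domain_b.pluginY.foo");
        System.out.println(resolve("pluginX.foo", installed));
    }
}
```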
Finally, a command can in the extension-point define one or more alias names; such names are also subject to the namespace.