Update Site Optimization
The Eclipse Install/Update design is built around features: grouping artifacts that are published on an update site hosted on a remote server. A feature consists of the feature manifest file and other resources placed in a single JAR archive. When directed at an update site, the Eclipse Update Manager must download each of these JARs and parse its manifest in order to perform activities such as site browsing, searching and dependency checking.
This approach works reasonably well for moderately sized update sites, but does not scale to large sites like Callisto. Each feature JAR is small, but opening a connection to download each one is costly, and the cost adds up. Even worse, users pay this price BEFORE they even decide whether they want to install anything from the site. A solution is needed to reduce the number of connections required just to browse or search the update site.
Once the features to install have been selected, Update needs to physically download the plug-in JARs onto the user's machine. At this point, payload size ceases to be trivial: a full Callisto download is several hundred megabytes. A technique to reduce the payload size would benefit users who are downloading the full Callisto set.
The solution comes in two parts: the site digest and the use of Pack200. The site digest is produced by merging all the information needed for browsing and searching a site into one file, which is compressed to reduce its size and can be downloaded over a single connection instead of the many separate connections needed to download the features. Pack200 is a JAR compression utility, part of J2SE 5.0, that reduces the size of the jars significantly.
Both parts require enhancements to the Install/Update code so that Update can consume these artifacts. However, these performance enhancements are optional, and Install/Update should continue to work as normal in their absence.
Builds, Update Sites and the Site Optimizer
There are two sides to this solution: steps that must be taken during a component's build, and steps that are taken on the update site itself.
To ensure that the jars downloaded from an update site are the same as jars downloaded in a zip distribution, the jars need to be normalized (or repacked) during the build process (see the Pack200 wiki page). This is especially true if the jars will be signed. If the jars are being sent to the Eclipse Foundation to be signed, then this repacking will be done at that time. The actual build of the digest and packing of the jars can be considered a separate step and can be done on the update site itself.
The Site Optimizer
The org.eclipse.update.core bundle provides an application extension named org.eclipse.update.core.siteOptimizer which can be invoked from the command line.
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer [options]
The site optimizer application exposes the digest builder and the jar processor. The digest builder is the tool that creates the actual site digest. The jar processor is a tool that can repack, sign, pack or unpack a jar and, recursively, all of its nested jars.
The site optimizer can be used during a build to do the repacking of the jars. Exactly when it should be called depends on how the build is organized. If the build first builds update jars that are repackaged into the download zips, then the optimizer should be run on those update jars before they are repackaged. If the build produces the download zips first, then the optimizer should be run on the download zips. In both cases, we have either a zip full of jars, or a zip full of directories that contain jars. The site optimizer can take this zip as input and output a similarly shaped zip containing the repacked (and optionally signed) jars:
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor -repack -outputDir ./out sdk.zip
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor -repack -sign sign_script.sh -outputDir ./out sdk.zip
See the jar processor page for details on the options available for the jar processor.
The Update Site
If the update site is going to contain packed jars, then the site.xml file should specify that it supports pack200 by setting the pack200 attribute:
<site pack200="true">
This lets the Update Manager know that the site contains packed jars, and it will look for a .jar.pack.gz file beside the .jar file that it would normally download. If the .jar.pack.gz file is found, it will be downloaded and unpacked; otherwise the .jar file is downloaded as normal.
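The lookup order described above can be illustrated with a small shell sketch. It is not the Update Manager's actual code, which runs inside Eclipse and fetches over the network. The function name resolve_artifact and its three arguments are hypothetical, and a local directory stands in for the remote site.

```shell
#!/bin/sh
# Illustration only: choose which file a client would fetch for a given jar.
# resolve_artifact is a hypothetical helper, not part of Eclipse.
#   $1  root directory of a local copy of the update site
#   $2  path of the jar relative to the site root
#   $3  value of the site's pack200 attribute ("true" or anything else)
resolve_artifact() {
    site_root=$1
    jar_path=$2
    pack200=$3
    # Prefer the packed file only when the site declares pack200 support
    # and the .jar.pack.gz file really exists beside the jar.
    if [ "$pack200" = "true" ] && [ -f "$site_root/$jar_path.pack.gz" ]; then
        printf '%s\n' "$jar_path.pack.gz"
    else
        # Fall back to the plain jar, as a client without pack200 would.
        printf '%s\n' "$jar_path"
    fi
}
```

Calling resolve_artifact with a site root, a relative jar path and the pack200 flag prints the relative path that would be downloaded.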
The site optimizer is used on the update site to build the digest and do the actual packing of the jars:
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -digestBuilder -digestOutputDir=/eclipse/digest -siteXML=/eclipse/site/site.xml -jarProcessor -pack -outputDir /eclipse/site /eclipse/site
This command builds the digest, then traverses the /eclipse/site directory structure and packs every jar it finds. The output of a pack is a .pack.gz file, so the result is that beside each jar there will be a .jar.pack.gz file.
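After packing, it can be useful to confirm that every jar really has a packed sibling. The following is a minimal sketch of such a check. The function name list_unpacked_jars is hypothetical and not part of the site optimizer; it relies only on standard find and sort.

```shell
#!/bin/sh
# Illustration only: print every .jar under a site directory that has
# no .jar.pack.gz file beside it. list_unpacked_jars is a hypothetical
# helper, not an Eclipse tool.
list_unpacked_jars() {
    site_dir=$1
    # -name '*.jar' matches only names ending in .jar, so the
    # .jar.pack.gz files themselves are not visited.
    find "$site_dir" -type f -name '*.jar' | sort | while IFS= read -r jar; do
        if [ ! -f "$jar.pack.gz" ]; then
            printf '%s\n' "$jar"
        fi
    done
}
```

Running list_unpacked_jars /eclipse/site after the optimizer has finished would be expected to print nothing if every jar was packed.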
What if I don't have Java 5?
If the client being updated is not running Java 5.0 and the unpack200 executable cannot be found by other means, then the Update client will not attempt to retrieve the *.pack.gz files.
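This page does not specify how the Update client locates unpack200. As a rough sketch, and purely as an assumption, a search might check the PATH first and then the bin directory of JAVA_HOME. The function name find_unpack200 is hypothetical.

```shell
#!/bin/sh
# Illustration only: locate an unpack200 executable. The search order
# (PATH, then $JAVA_HOME/bin) is an assumption, not Eclipse's documented
# behaviour. find_unpack200 is a hypothetical helper.
find_unpack200() {
    if command -v unpack200 >/dev/null 2>&1; then
        # Found on the PATH: print its location.
        command -v unpack200
    elif [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/unpack200" ]; then
        # Found in the JDK or JRE that JAVA_HOME points at.
        printf '%s\n' "$JAVA_HOME/bin/unpack200"
    else
        # Not found: a client in this position would skip *.pack.gz files.
        return 1
    fi
}
```

A non-zero exit status corresponds to the case described above, where the client falls back to downloading the plain .jar files.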