Equinox/p2/Omni Version

Revision as of 13:42, 17 December 2008

W.I.P - Being Reworked after discussion Dec 15, 2008.

Introduction

This page describes a proposal for adding support for non-OSGi versions and version ranges in Equinox p2. This page was created as a result of the discussion on the p2 call on Dec 1, 2008. See bug 233699 for discussion.

Background

There are other versioning schemes in wide use that are not compatible with OSGi versions and version ranges. The problem is both syntactic and semantic.

Example of semantic issue

Many open source projects do their versioning in a fashion similar to OSGi but with one very significant difference. For two versions that are otherwise equal, a lack of qualifier signifies a higher version than when a qualifier is present. For example:

1.0.0.alpha 
1.0.0.beta
1.0.0.rc1
1.0.0

The 1.0.0 is the final release. The qualifier happens to be in alphabetical order here but that's not always true.

Example of syntax issue

Here are some examples of versions used in Red Hat Fedora distributions.

KDE Admin version 7:4.0.3-3.fc9
Compat libstdc version 33-3.2.3-63
Automake 1.4p6-15.fc7

These are not syntactically compatible with OSGi versions as they use colons and dashes as leading separators.

Some versioning schemes place the qualifier first.

Current implementation in p2

The current implementation in p2 uses the classes Version and VersionRange to describe the two concepts, and these implementations handle only the OSGi version type.

Proposed Solution

Equinox p2 should have one implementation of Version and one of VersionRange (let's call them OmniVersion and OmniVersionRange) capable of capturing the semantics of various version formats. The advantages over the previous proposal are that there is no need to dynamically plug in new implementations, and new formats can more easily be introduced.

Even if the finished solution only requires a single implementation (the OmniVersion), there are other factors to consider. The current p2 SimplePlanner uses the OSGi planner, and it can only understand OSGi versions. There is work being done on SAT4J to enable it to be used instead of the OSGi planner (work to handle "explanations" could also be used to handle "attachments", now being done with the OSGi planner). To be able to develop the OmniVersion solution, it needs to coexist with the current SimplePlanner, and thus the current direct use of the classes Version and VersionRange needs to be modified to use IVersion and IVersionRange. (Such an implementation is available in bug 233699 as a patch.)

  • The interfaces IVersion and IVersionRange should be used throughout the code instead of directly using the corresponding Version and VersionRange classes.
  • An IVersion is obtained by calling a factory method such as VersionFactory.create(String versionString)
  • An IVersionRange is obtained by a similar factory method
  • The version string and version range have a URI-scheme-like prefix to indicate the version type
  • The factory API can naturally contain some options where scheme and version strings are either separate or canonical
  • When a version or version range is present without the version type prefix, the default is to use OSGi version type (this preserves backwards compatibility).

It is now possible to replace the SimplePlanner's use of the OSGi planner with a similar planner that handles OmniVersion. This can be done by someone who needs support for different version formats before the SAT4J solution is available.

A note on terminology: although not technically correct, "the type of the version" or "version type" should be read as referring to the "version format".

OmniVersion

The OmniVersion implementation uses a segment Array of Object that is kept in order of significance. The Array can hold Integer, String, Date, and the special marker classes MaxString, and MaxDate.

The OmniVersion implements Comparable. Segments are compared with the following rules:

  • if types are not equal the result is based on the significance of the type; Integer > MaxString > String > MaxDate > Date
  • if the type is equal - they are compared and result is returned
  • If all segments are equal up to end of the shortest array, the shorter version is considered smaller
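
The comparison rules above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the names `MaxMarker` and `SegmentComparison` are hypothetical, segments are held in a plain `List`, and `java.util.Date` stands in for the Date segment type.

```java
import java.util.Date;
import java.util.List;

/** Hypothetical markers standing in for the MaxString and MaxDate classes. */
enum MaxMarker { MAX_STRING, MAX_DATE }

final class SegmentComparison {
    private SegmentComparison() {}

    /** Significance of a segment type: Integer > MaxString > String > MaxDate > Date. */
    static int typeRank(Object segment) {
        if (segment instanceof Integer) return 4;
        if (segment == MaxMarker.MAX_STRING) return 3;
        if (segment instanceof String) return 2;
        if (segment == MaxMarker.MAX_DATE) return 1;
        if (segment instanceof Date) return 0;
        throw new IllegalArgumentException("Unsupported segment: " + segment);
    }

    /** Compares two segments: by type significance first, then by value. */
    static int compareSegment(Object a, Object b) {
        int byType = Integer.compare(typeRank(a), typeRank(b));
        if (byType != 0) return byType;
        if (a instanceof Integer) return ((Integer) a).compareTo((Integer) b);
        if (a instanceof String) return ((String) a).compareTo((String) b);
        if (a instanceof Date) return ((Date) a).compareTo((Date) b);
        return 0; // two identical markers are equal
    }

    /** Compares two segment lists; if one is a prefix of the other, the shorter is smaller. */
    static int compare(List<?> left, List<?> right) {
        int shared = Math.min(left.size(), right.size());
        for (int i = 0; i < shared; i++) {
            int c = compareSegment(left.get(i), right.get(i));
            if (c != 0) return c;
        }
        return Integer.compare(left.size(), right.size());
    }
}
```

With these rules, the segments 1, 0, 0 compare as smaller than 1, 0, 0, 'rc1' because the shorter array is smaller. A format that wants "no qualifier" to sort last therefore has to supply a max-string default for the missing segment.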

In order to be able to present the version in human readable form, the original version string is also kept. An OmniVersion can be instantiated from a "version string" as described in the Version Format section below.

Version Formats

There are two basic forms: when no format has been specified, the version string is an OSGi version string (this ensures backwards compatibility); otherwise a format prefix fully specifies the format. A ':' separates the format from the version text.

Format string and description:
(none) The default OSGi version type, e.g. "1.0.0"
raw The raw canonical form consists of a period separated list containing the following entries:
  • <digits> - integer segment
  • '<characters>' - a string segment
  • maxn - the symbolic value MAX INTEGER
  • maxs - the symbolic value MAX STRING

Example: OSGi 1.0.0.r1234 is expressed as raw:1.0.0.'r1234'. When this format is used, both publisher and user must know the correct way to create the raw version.
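
As an illustration of the raw form, the following Java sketch turns a raw version string into a list of segments. The class name `RawVersionParser` and the `Symbol` enum are hypothetical. The sketch also assumes that a string segment cannot contain a single quote, since no escaping is described for the raw form.

```java
import java.util.ArrayList;
import java.util.List;

final class RawVersionParser {
    /** Hypothetical stand-ins for the symbolic values maxn and maxs. */
    enum Symbol { MAX_INTEGER, MAX_STRING }

    private RawVersionParser() {}

    /** Parses e.g. raw:1.0.0.'r1234' into [1, 0, 0, "r1234"]. */
    static List<Object> parse(String version) {
        if (!version.startsWith("raw:")) throw new IllegalArgumentException("Missing raw: prefix");
        String text = version.substring(4);
        List<Object> segments = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end;
            if (text.charAt(pos) == '\'') {
                // Quoted string segment: runs to the next single quote (assumption: no escapes).
                int close = text.indexOf('\'', pos + 1);
                if (close < 0) throw new IllegalArgumentException("Unterminated string segment");
                segments.add(text.substring(pos + 1, close));
                end = close + 1;
            } else {
                // Bare segment: digits, maxn or maxs, running to the next period.
                end = text.indexOf('.', pos);
                if (end < 0) end = text.length();
                segments.add(parseBare(text.substring(pos, end)));
            }
            if (end < text.length()) {
                // A period must follow, and another segment must come after it.
                if (text.charAt(end) != '.' || end + 1 == text.length())
                    throw new IllegalArgumentException("Bad separator at index " + end);
                end++;
            }
            pos = end;
        }
        if (segments.isEmpty()) throw new IllegalArgumentException("Empty version");
        return segments;
    }

    private static Object parseBare(String token) {
        if (token.equals("maxn")) return Symbol.MAX_INTEGER;
        if (token.equals("maxs")) return Symbol.MAX_STRING;
        if (!token.isEmpty() && token.chars().allMatch(c -> c >= '0' && c <= '9'))
            return Integer.valueOf(token);
        throw new IllegalArgumentException("Bad segment: '" + token + "'");
    }
}
```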

format(<version_format>) Specifies a version format consisting of a transformation pattern.

The transformation pattern can contain the following formats:

  • '<character(s)>' - matches a single character or sequence of characters - the matched result is not included in the resulting canonical vector (i.e. it is not a segment). A '\\' is needed to include a single '\'. The sequence of chars acts as one delimiter.
  • <non-alphanum character> - matches any non alpha-numerical character (including space) - the matched result is not included in the canonical vector (i.e. it is not a segment). A non alphanumerical character acts as a delimiter. Special characters must be escaped when wanted as delimiters.
  • a - auto - a sequence of digits creates a numeric segment, a sequence of alphabetical characters creates a string segment. Segments are delimited by any character not having the same character class as the first character in the sequence, by the start of the next format, or by range delimiters. A numerical sequence ignores leading zeros.
  • d - delimiter; matches any non alpha-numeric character (except terminating range delimiters).
  • s - a string group matching any character except terminating range delimiters, and any following explicit/optional delimiter
  • n - a numeric (integer) group. Leading zeros are ignored.
  • ( ) - indicates a group
  • ? - zero to one occurrence of the preceding format
  • * - zero to many occurrences of the preceding format
  • + - one to many occurrences of the preceding format
  • {min[,max]} - min to max occurrences of the preceding format. The 'max' part is optional. An empty max specification means the same number as specified by min. Max must be >= min.
  • [ ] - short hand notation for an optional group. Is equivalent to ()?
  • =<processing>; - an additional processing rule is applied to the preceding format. The processing part can be:
    • an integer - use this as a default value if input is missing
    • a string in single quotes - use this as default value if input is missing
    • max - max default value - can be applied to 's' and 'n' to mean MAX-STRING, and MAX-INT respectively.
    • maxs - max default string value - can be applied to 'a' and groups
    • maxn - max default integer value - can be applied to 'a' and groups
    • ignore - if input is present do not turn it into a segment (i.e. ignore what was matched)
  • \ - escape must be used to escape all special characters in s or a segments or to include these characters as delimiters. A '\\' is needed to include a '\'. Escaping a non-special character is superfluous but allowed.

Additional rules:

  • if the format produces null segments, they are not placed in the vector e.g. format(ndddn):10-/-12 => 10,12.
  • Processing (i.e. default values) applied to a group has higher precedence than individual processing inside the group if the group was not successfully matched.

Note about timestamps: the earlier proposed 't' format was deprecated because of its complexity. Instead, the creator of an IU should simply use 's' or 'n', and ensure comparability by using a fixed number of characters when choosing the 's' format.
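
To illustrate why a fixed number of characters matters for the 's' format, here is a small Java sketch. The class and method names are hypothetical. It zero-pads a numeric value so that plain string comparison agrees with numeric order.

```java
import java.util.Locale;

final class FixedWidthQualifier {
    private FixedWidthQualifier() {}

    /** Left-pads a non-negative value with zeros to exactly 'width' characters. */
    static String pad(long value, int width) {
        if (value < 0 || width < 1) throw new IllegalArgumentException("value must be >= 0 and width >= 1");
        String padded = String.format(Locale.ROOT, "%0" + width + "d", value);
        if (padded.length() > width)
            throw new IllegalArgumentException("value does not fit in " + width + " characters");
        return padded;
    }
}
```

Without padding, "9" sorts after "10" as a string. With a width of 2, "09" sorts before "10".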

Named Version Formats

Named version formats make it easier to enter version strings. There should be a number of predefined names as shown in the table below.

type name | pattern | comment
osgi | n[.n=0;[.n=0;[.s]]] | The default OSGi type, e.g. "format(n[.n=0;[.n=0;[.s]]]):1.0.0", "osgi:1.0.0" and "1.0.0" are all equivalent.
triplet | n[.n=0;[.n=0;[.s=max;]]] | A variation on OSGi, with the same syntax, but where a lack of qualifier > any qualifier.
tripletSnapshot | n[.n=0;[.n=0;[-n=max;.s=max;]]] | Format used when Maven transforms versions like 1.2.3-SNAPSHOT into 1.2.3-<buildnumber>.<timestamp>, ensuring that it is compatible with the triplet format if <buildnumber>.<timestamp> is missing at the end (the format produces max-int, max-string if they are missing).
rpm | [n:]a+[-n[ds=ignore;]] | RPM format matches [EPOCH:]VERSION-STRING[-PACKAGE-VERSION], where epoch is optional and numeric, version-string is auto matched to arbitrary depth >= 1, followed by a package-version, which consists of a build number separated by any separator from a trailing platform specification, or the string 'src' to indicate that the package is a source package. This format allows the platform and src part to be included in the version string, but if present it is not used in the comparisons. The platform type vs source is expected to be encoded elsewhere in such an IU. An example RPM version is "33:1.2.3a-23/i386", which creates the vector 33, 1, 2, 3, 'a', 23.
string | s | Perhaps superfluous, but makes this version format appear in a selectable list of formats.
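
The semantics of the triplet format can be illustrated with a hand-written Java sketch. It does not use the pattern engine. It hard-codes the triplet pattern as a regular expression and represents the max-string default as a null qualifier. The class name `Triplet` is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Triplet implements Comparable<Triplet> {
    // Hard-coded equivalent of n[.n=0;[.n=0;[.s=max;]]]
    private static final Pattern SYNTAX =
            Pattern.compile("(\\d+)(?:\\.(\\d+)(?:\\.(\\d+)(?:\\.(.+))?)?)?");

    final int major;
    final int minor;
    final int micro;
    final String qualifier; // null stands for the max-string default

    private Triplet(int major, int minor, int micro, String qualifier) {
        this.major = major;
        this.minor = minor;
        this.micro = micro;
        this.qualifier = qualifier;
    }

    static Triplet parse(String text) {
        Matcher m = SYNTAX.matcher(text);
        if (!m.matches()) throw new IllegalArgumentException("Not a triplet version: " + text);
        return new Triplet(Integer.parseInt(m.group(1)), intOrZero(m.group(2)),
                intOrZero(m.group(3)), m.group(4));
    }

    private static int intOrZero(String digits) {
        return digits == null ? 0 : Integer.parseInt(digits); // the n=0; default
    }

    @Override
    public int compareTo(Triplet other) {
        int c = Integer.compare(major, other.major);
        if (c == 0) c = Integer.compare(minor, other.minor);
        if (c == 0) c = Integer.compare(micro, other.micro);
        if (c != 0) return c;
        // A missing qualifier is the max string: larger than any present qualifier.
        if (qualifier == null) return other.qualifier == null ? 0 : 1;
        if (other.qualifier == null) return -1;
        return qualifier.compareTo(other.qualifier);
    }
}
```

Under this sketch, 1.0.0.alpha, 1.0.0.beta and 1.0.0.rc1 all sort before 1.0.0, which is the ordering described in the Background section.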

The version range delimiters are: '(', ')', '[', ']' and ',' (comma).

Defining named formats

W.I.P - the below does not handle conflicting format names from different repositories. Can probably use Java package names to fully qualify versions, and then handle a clash on the full name as a hard error when writing to a repo. The last part may be used as a shorthand if this uniquely identifies the format, etc. T.B.W

An IU can define new named formats. The named formats are defined by using a key/value list:

org.equinox.p2.version.formats={'formatname','formatstring', ...}

These named formats may then be used in the IU.

When an IU is stored in a repository, the following processing is done:

  • The defined formats are extracted from the IU
  • If the format name does not already exist in the repository, it is added to the repository's list of contained formats.
  • If the format name already exists in the contained formats list and the format pattern is the same - nothing needs to be done
  • If the format name already exists in the contained formats list and the format pattern for the contained name is different - the name of the new pattern is changed by concatenating a ".n" where n is an integer incremented until the format name is either matching a contained format, or the name is not contained. When the name is changed, the IU is re-factored to use the new names before being added to the repository.
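
The name-clash handling described in the list above could look roughly like the following Java sketch. `FormatRegistry` is a hypothetical name, the repository's list of contained formats is modelled as an in-memory map, and numbering is assumed to start at 1 (the text does not say where it starts). Re-factoring the IU to use the new name is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FormatRegistry {
    // Format name -> format pattern, standing in for the repository's contained formats.
    private final Map<String, String> formats = new LinkedHashMap<>();

    /**
     * Registers a named format and returns the name under which it is known
     * in the repository, which may differ from the requested name.
     */
    String register(String name, String pattern) {
        String candidate = name;
        int n = 0;
        while (true) {
            String existing = formats.get(candidate);
            if (existing == null) {
                formats.put(candidate, pattern); // name is free: add it
                return candidate;
            }
            if (existing.equals(pattern)) {
                return candidate; // same name, same pattern: nothing to do
            }
            n++; // clash: try name.1, name.2, ...
            candidate = name + "." + n;
        }
    }
}
```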

When using a non standard format name in an IU:

  • The used format must be stored in the IU before it can be used

Attempting to redefine pre-defined formats:

  • The pre-defined formats have higher precedence
  • A warning should be issued if a new format is introduced with an incompatible pattern. The same pattern is OK, as a popular format may become pre-defined.

The user interface can:

  • collect all defined formats from all known repositories and present them when the user is defining a version or range
  • have a function to define a new format which is stored in the current profile (and thus becomes available for use)

This scheme allows format names to spread virally. The possible downside is potential clashes.

Examples using format

A version range with format equivalent to OSGi

format(n[.n=0;[.n=0;[.s]]]):[1.0.0.r12345, 2.0.0]

At least one string, and max 5 strings

format(s[.s[.s[.s[.s]]]]):vivaldi.opus.spring.bar5
format(s(.s){0,4}):vivaldi.opus.spring.bar5  => 'vivaldi', 'opus', 'spring', 'bar5'

At least one alpha or numerical with auto format and delimiter

format(a+):vivaldi:opus23-spring.bar5  => 'vivaldi', 'opus', 23, 'spring', 'bar', 5
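
The behaviour of the auto format in this example can be sketched in Java as below. This is a simplification with a hypothetical class name: it treats every character that is neither an ASCII digit nor a letter as a delimiter, and does not handle escapes, range delimiters or numbers too large for an int.

```java
import java.util.ArrayList;
import java.util.List;

final class AutoSegments {
    private AutoSegments() {}

    /** Splits text into Integer segments (digit runs) and String segments (letter runs). */
    static List<Object> parse(String text) {
        List<Object> segments = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            char c = text.charAt(pos);
            int end = pos;
            if (isDigit(c)) {
                while (end < text.length() && isDigit(text.charAt(end))) end++;
                // Integer.valueOf drops leading zeros, e.g. "007" becomes 7.
                segments.add(Integer.valueOf(text.substring(pos, end)));
            } else if (Character.isLetter(c)) {
                while (end < text.length() && Character.isLetter(text.charAt(end))) end++;
                segments.add(text.substring(pos, end));
            } else {
                end = pos + 1; // delimiter: consumed, produces no segment
            }
            pos = end;
        }
        return segments;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```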

The texts 'opus' and 'bar' should not be included:

format(s['.opus'=ignore;n['.bar'=ignore;n]]):vivaldi.opus23.bar8   => 'vivaldi', 23, 8

Classic SCCS/RCS style:

format(n+):1.1.1.1.1.1.1.4.5.6.7.8

Max depth 8 of numerical segments (limited classic SCCS/RCS type versions):

format(n{1,8}):1.1.1.1.1.1.1.4

Numeric to optional depth 8, where missing input is set to 0, followed by optional string where 'empty > any'

format(n=0;{1,8}[a=maxs;]):1.1.1.4:beta   => 1,1,1,4,0,0,0,0,'beta'
format(n=0;{1,8}[a=maxs;]):1.1.1.4   => 1,1,1,4,0,0,0,0,MAXSTRING

Uninterpreted single string range

format(s):[andrea doria,titanic]

Internationalization

The proposed types using alphanumerical segments are assumed to use vanilla string comparison. This does not work so well if versions are expressed in a language where lexical ordering is different. Language specific collation could be supported by combining the version type name with an ISO 639 language code (see java.util.Locale), where the default would be English. The language could be encoded with a separating '-', e.g. 'format-pt' for collation in Portuguese.

This opens up another can of worms (decomposition strength, comparison of locale and non locale specified types, etc.), and it is probably best to implement just basic string comparison. It is also questionable if internationalization is wanted at all, as "known tools" do not support this, and "correct collation" would thus yield a different result.

Support for internationalized collation is not recommended.

Version Range

A version range uses the OSGi range syntax, prefixed with the version format.

Examples:

  • [1.0.0,2.0.0] equal to osgi:[1.0.0,2.0.0]
  • format(s):[andrea doria,titanic]
  • rpm:[7:4.0.3-3.fc9,8:1] - an example KDE Admin version 7:4.0.3-3.fc9 to 8:1
  • triplet:[1.0.0.RC1,1.0.0]
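
As a rough Java sketch of the range syntax, the class below parses the bracketed part of a range (after any format prefix has been removed) and checks inclusion. The name `SimpleRange` is hypothetical, the caller supplies the conversion from bound text to a comparable version, and the sketch assumes the bounds themselves contain no comma. Only the two-bound form is handled.

```java
import java.util.function.Function;

final class SimpleRange<T extends Comparable<T>> {
    final T min;
    final boolean minInclusive; // '[' is inclusive, '(' is exclusive
    final T max;
    final boolean maxInclusive; // ']' is inclusive, ')' is exclusive

    private SimpleRange(T min, boolean minInclusive, T max, boolean maxInclusive) {
        this.min = min;
        this.minInclusive = minInclusive;
        this.max = max;
        this.maxInclusive = maxInclusive;
    }

    /** Parses text such as "[1.0.0,2.0.0)" using toVersion to convert each bound. */
    static <T extends Comparable<T>> SimpleRange<T> parse(String text, Function<String, T> toVersion) {
        String s = text.trim();
        if (s.length() < 2) throw new IllegalArgumentException("Not a range: " + text);
        char open = s.charAt(0);
        char close = s.charAt(s.length() - 1);
        if ((open != '[' && open != '(') || (close != ']' && close != ')'))
            throw new IllegalArgumentException("Missing range delimiters: " + text);
        int comma = s.indexOf(',');
        if (comma < 0) throw new IllegalArgumentException("Missing ',' in range: " + text);
        T min = toVersion.apply(s.substring(1, comma).trim());
        T max = toVersion.apply(s.substring(comma + 1, s.length() - 1).trim());
        return new SimpleRange<>(min, open == '[', max, close == ']');
    }

    boolean includes(T version) {
        int low = version.compareTo(min);
        int high = version.compareTo(max);
        return (minInclusive ? low >= 0 : low > 0) && (maxInclusive ? high <= 0 : high < 0);
    }
}
```

For the format(s) example, the bounds are plain strings, so the conversion function can simply return its input.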

Factory API

The factory API could be something simple like this:

public interface VersionFactory
{
    // The version string may carry a version type prefix; OSGi is the default.
    IVersion createVersion(String versionString);
    // The version type is given separately from the version string.
    IVersion createVersion(String versionType, String versionString);
}
public interface VersionRangeFactory
{
    IVersionRange createVersionRange(String versionRangeString);
    IVersionRange createVersionRange(String versionType, String versionRangeString);
}

Hard to say how much indirection is required - methods could just be static to keep things simple.

If we want to support the pattern based type, the factory methods need the pattern as well. To make this generic, it could be seen as a parameter to the version type.

public interface VersionFactory
{
    IVersion createVersion(String versionString);
    IVersion createVersion(String versionType, String versionString);
    // versionTypeParameter carries the pattern for the pattern based type.
    IVersion createVersion(String versionType, String versionTypeParameter, String versionString);
}
public interface VersionRangeFactory
{
    IVersionRange createVersionRange(String versionRangeString);
    IVersionRange createVersionRange(String versionType, String versionRangeString);
    IVersionRange createVersionRange(String versionType, String versionTypeParameter, String versionRangeString);
}

When creating a pattern based version, the versionTypeParameter must be supplied. When creating a pattern based version range, the pattern is optional - the pattern of the individual candidates would then be used to create the canonical form of the upper and lower bounds.
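
A factory method that takes a single canonical string first has to separate the format prefix from the version text. The Java sketch below shows one possible way to do that. The class name `VersionStringSplitter` is hypothetical, and the sketch assumes that a named format starts with a letter and contains only letters and digits, so that an unprefixed string such as 7:4.0.3-3.fc9 is not mistaken for a prefixed one.

```java
final class VersionStringSplitter {
    final String format; // "osgi" when the input carries no prefix
    final String text;   // the version or range text after the ':'

    private VersionStringSplitter(String format, String text) {
        this.format = format;
        this.text = text;
    }

    static VersionStringSplitter split(String input) {
        if (input.startsWith("format(")) {
            int close = closingParen(input, "format(".length());
            if (close < 0 || close + 1 >= input.length() || input.charAt(close + 1) != ':')
                throw new IllegalArgumentException("Malformed format prefix: " + input);
            return new VersionStringSplitter(input.substring(0, close + 1), input.substring(close + 2));
        }
        int colon = input.indexOf(':');
        if (colon > 0 && isName(input.substring(0, colon)))
            return new VersionStringSplitter(input.substring(0, colon), input.substring(colon + 1));
        return new VersionStringSplitter("osgi", input); // no prefix: default type
    }

    /** Finds the ')' closing "format(", skipping escaped characters, quoted text and nested groups. */
    private static int closingParen(String s, int from) {
        int depth = 1;
        boolean quoted = false;
        for (int i = from; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') { i++; continue; }
            if (c == '\'') { quoted = !quoted; continue; }
            if (quoted) continue;
            if (c == '(') depth++;
            else if (c == ')' && --depth == 0) return i;
        }
        return -1;
    }

    private static boolean isName(String s) {
        if (s.isEmpty() || !Character.isLetter(s.charAt(0))) return false;
        for (int i = 1; i < s.length(); i++)
            if (!Character.isLetterOrDigit(s.charAt(i))) return false;
        return true;
    }
}
```

The resulting format and text could then be handed to the factory methods that take the version type separately.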

IVersion and IVersionRange API

Basically follows the current Version and VersionRange classes.

Applicability

The generalization of version type applies to objects that by nature may have a different versioning scheme than OSGi. This includes:

  • Installable Unit
  • Provided Capability
  • Required Capability

This needs to be discussed:

  • Artifact key versions

These (probably) do not need to be generalized:

  • File format version numbers (content.xml, artifact.xml, etc)
  • Update Descriptor
  • Touchpoint version numbers and touchpoint action versions
  • Publisher advice versions
  • Artifact key versions

Implementation Steps

  1. Enablement
    1. Change use of Version and VersionRange to use IVersion and IVersionRange
    2. Implement the VersionFactory and VersionRangeFactory by just creating instances of Version and VersionRange.
    3. Make sure UI validates human input based on factory result
  2. Basic Extensions
    1. Implement the non pattern based version types
    2. Modify the factories to create correct instances of the types
  3. Pattern based implementation
    1. Implement the pattern based version types (format and any)
    2. Modify the factory
    3. Modify the comparator(s)
    4. Adjustments to UI ? (should not be needed)
