COSMOS Design 238492
Support xml:base in SML validator
This is the design document for Bugzilla 238492.
|David Whiteman||January 14, 2009|
|John Arwe||January 16, 2009|
|Process||Sizing||Names of people doing the work|
|Design||1||John Arwe, David Whiteman|
|Code||2||John Arwe, David Whiteman|
|Build and infrastructure||0|
|Code review, etc.*||1|
* - includes other committer work (e.g. check-in, contribution tracking)
This enhancement will provide a basic implementation of xml:base per the SML-IF 1.1 specification. SML-IF requires producers to support xml:base, and allows consumers to support it, as a mechanism for transforming relative references into absolute URIs (RFC 3986).
Relative references may be used in many places in an SML-IF document, e.g. SML references (as sml:uri content), smlif:baseURI content (to establish model and/or document base URIs), rule bindings, and schema bindings.
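As a minimal illustration of the transformation described above, the following Java sketch resolves a reference against a base URI. The class and method names are hypothetical and are not taken from the COSMOS code. It relies on java.net.URI, which follows RFC 2396 rather than RFC 3986, so it should be read as an approximation that holds for simple cases like these.

```java
import java.net.URI;

// Hypothetical sketch; not part of the COSMOS sources.
class RelativeReferenceDemo {

    /** Resolves a reference against a base URI; absolute references are returned unchanged. */
    static URI toAbsolute(String baseUri, String reference) {
        return URI.create(baseUri).resolve(reference);
    }

    public static void main(String[] args) {
        // A relative reference is interpreted relative to the base URI.
        System.out.println(toAbsolute("http://www.example.org/model/", "foo/bar"));
        // An absolute reference ignores the base URI.
        System.out.println(toAbsolute("http://www.example.org/model/", "http://other.example.org/x"));
    }
}
```

The same relative reference yields different absolute URIs under different base URIs, which is why the base URI in effect at each element matters.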
- Calculate the base URI, at any element in the SML-IF document, that will be used if necessary to transform a relative reference into an absolute URI.
- A single document may use a combination of xml:base and smlif:baseURI to establish base URIs at various points in the document.
- SML-IF prescribes that when both mechanisms can be used to calculate a base URI for the same element, xml:base takes precedence.
- SML-IF allows a separate base URI to be established for each document, using smlif:baseURI, xml:base, or a combination of both.
- Be very careful about boundary cases, which are very easy to get wrong.
- xml:base might be specified on sml:uri (i.e. within an SML reference element), e.g.
<my:element sml:ref="true"> <!-- The SML reference element --> <sml:uri xml:base="http://www.example.org/a/different/base">foo/bar</sml:uri> </my:element>
- xml:base might be specified to establish a base URI for a relative reference found in smlif:baseURI at the document level, e.g.
<smlif:docInfo xml:base="http://www.example.org"> <smlif:baseURI>foo/bar</smlif:baseURI> </smlif:docInfo>
- Since aliases and SML references can contain relative references, they can no longer be compared "as-is" since doing so implicitly treats them as absolute URIs. Any relative reference must be transformed into an absolute URI prior to comparison.
- Calculating the base URI using xml:base requires that the entire ancestor axis, not only the element containing the relative reference, be available when the code needs to transform a relative reference.
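The two points above can be sketched in Java as follows. This is an illustration under stated assumptions, not the COSMOS implementation: the class and method names are hypothetical, the document base URI passed in stands for whatever smlif:baseURI processing produced, the `my` namespace URI is invented, and java.net.URI (RFC 2396) is used for resolution. The sketch walks the ancestor axis so that xml:base values are applied outermost first, then compares references only after both are absolute.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the COSMOS implementation.
class AncestorBaseUriSketch {
    static final String SML_NS = "http://www.w3.org/ns/sml";

    /** Parses XML text with namespace support so xml:base can be found by namespace. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Effective base URI at an element: ancestors are processed first, then the element itself. */
    static URI effectiveBase(Element element, URI fallbackBase) {
        Node parent = element.getParentNode();
        URI inherited = (parent instanceof Element)
                ? effectiveBase((Element) parent, fallbackBase)
                : fallbackBase;
        // getAttributeNS returns an empty string when the attribute is absent.
        String xmlBase = element.getAttributeNS(XMLConstants.XML_NS_URI, "base");
        return xmlBase.isEmpty() ? inherited : inherited.resolve(xmlBase);
    }

    /** Transforms the element's text content into an absolute URI. */
    static URI absoluteContent(Element element, URI fallbackBase) {
        return effectiveBase(element, fallbackBase).resolve(element.getTextContent().trim());
    }

    public static void main(String[] args) {
        URI documentBase = URI.create("http://www.example.org/model/");
        Document doc = parse(
            "<my:element xmlns:my='http://www.example.org/my' xmlns:sml='" + SML_NS + "' sml:ref='true'>"
            + "<sml:uri xml:base='http://www.example.org/a/different/base'>foo/bar</sml:uri>"
            + "</my:element>");
        Element smlUri = (Element) doc.getElementsByTagNameNS(SML_NS, "uri").item(0);

        URI viaXmlBase = absoluteContent(smlUri, documentBase);
        URI viaDocumentBase = documentBase.resolve("foo/bar");
        // The literal text "foo/bar" is identical, but the absolute URIs differ.
        System.out.println(viaXmlBase);
        System.out.println(viaDocumentBase);
        System.out.println(viaXmlBase.equals(viaDocumentBase));
    }
}
```

A recursive walk is used here only for brevity; a streaming parser cannot look back up the tree, so it has to carry the same information forward as it descends.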
- AbstractDataBuilder adds a stack to keep track of the effective base URI at any given point in the parsing process, so the existing databuilders logically inherit this code by calling the superclass operations when they override element methods.
- FoundationBuilder adds the code to manage calculation of the model base URI since the "final" answer cannot be known until after the model identity section has been fully processed. "Final" is quoted because, when xml:base is used, the value can be logically overridden within the descendant/sibling axes.
- URIReference adds substantial code to encapsulate the concepts of alias and model base URI. Since this code is papering over implementation differences between what SML needs (RFC 3986) and what java.net implements for URI parsing (RFC 2396), some of the code needs direct access to the URIReference components and internal operations. The new code also takes over some of the logging responsibilities that were previously distributed in the callers, and tries to be more precise in the error messages being issued.
- Concrete databuilders are updated to ensure they call the superclass operation when overriding startElement and endElement, to call the relevant URIReference method(s) before attempting to process potential relative references, and to make transformed (i.e. absolute) URIs available to downstream validators.
- Validators and their associated input artifacts are updated to use the absolute URIs now used in artifacts built by the databuilders.
- test-resources are updated to exercise some boundary conditions and additional cases related to xml:base.
- expected-results are updated to reflect changed error/warning messages and absolute URIs.
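The stack-based approach described for AbstractDataBuilder and the concrete databuilders can be sketched roughly as follows. All class and method names here are hypothetical stand-ins rather than the actual COSMOS classes, the initial base URI is assumed to have been determined elsewhere, and java.net.URI (RFC 2396) performs the resolution. The base handler pushes an effective base URI on every startElement and pops it on every endElement; the subclass calls the superclass operations so the stack stays in step with the parse.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; class names do not match the COSMOS sources.
class BaseUriStackSketch {
    static final String SML_NS = "http://www.w3.org/ns/sml";

    /** Plays the role of the abstract builder: maintains the stack of effective base URIs. */
    static class TrackingHandler extends DefaultHandler {
        private final Deque<URI> baseUris = new ArrayDeque<>();

        TrackingHandler(URI initialBase) {
            baseUris.push(initialBase);
        }

        protected URI currentBaseUri() {
            return baseUris.peek();
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            String xmlBase = atts.getValue(XMLConstants.XML_NS_URI, "base");
            URI inherited = baseUris.peek();
            // Every element pushes an entry, so endElement can always pop exactly one.
            baseUris.push(xmlBase == null ? inherited : inherited.resolve(xmlBase));
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            baseUris.pop();
        }
    }

    /** Plays the role of a concrete builder: collects sml:uri content as absolute URIs. */
    static class UriCollector extends TrackingHandler {
        final List<URI> absoluteUris = new ArrayList<>();
        private StringBuilder text;

        UriCollector(URI initialBase) {
            super(initialBase);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            super.startElement(uri, localName, qName, atts);
            if (SML_NS.equals(uri) && "uri".equals(localName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (text != null && SML_NS.equals(uri) && "uri".equals(localName)) {
                // Resolve before the superclass pops this element's base URI.
                absoluteUris.add(currentBaseUri().resolve(text.toString().trim()));
                text = null;
            }
            super.endElement(uri, localName, qName);
        }
    }

    /** Parses the XML text and returns every sml:uri value as an absolute URI. */
    static List<URI> collect(String xml, String initialBase) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            UriCollector handler = new UriCollector(URI.create(initialBase));
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.absoluteUris;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `BaseUriStackSketch.collect(xml, "http://www.example.org/default/")` on a document that mixes inherited and overridden xml:base values returns one absolute URI per sml:uri element. Forgetting either superclass call in a subclass would leave the stack out of step, which is why the concrete builders are required to make those calls.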
- Deref still assumes the model base URI is always used to transform relative references, as URIReference formerly did. None of the existing test cases utilize a different base URI, so this limitation was accepted as a way to limit the size of this (already substantial) patch.
- Rule evaluation may invoke deref, so it is subject to the same discussion.
- Identity constraint checking may invoke deref, so it is subject to the same discussion.
These limitations could probably be solved in the future by using the DOM cache, asserting a document URI for each cache entry based on the computed SML-IF document base URI (regardless of its markup origin), and then invoking getBaseURI() on the deref context node. Preliminary tests make this appear to be a viable approach.
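A rough Java sketch of that idea follows. It is not the COSMOS deref code: the class and method names are hypothetical, the document URI is an example value standing in for the computed SML-IF document base URI, and the sketch assumes the DOM implementation in use (here, the parser bundled with the JDK) honors xml:base when computing getBaseURI().

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the COSMOS deref implementation.
class DomBaseUriSketch {

    /** Parses XML text and asserts a document URI on the resulting DOM. */
    static Document parseWithDocumentUri(String xml, String documentUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            // Stands in for the computed SML-IF document base URI.
            doc.setDocumentURI(documentUri);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Resolves the context node's text content against the base URI reported by the DOM. */
    static URI resolveContent(Element context) {
        return URI.create(context.getBaseURI()).resolve(context.getTextContent().trim());
    }

    public static void main(String[] args) {
        Document doc = parseWithDocumentUri(
            "<a><b xml:base='http://www.example.org/a/different/base'>foo/bar</b><c>foo/bar</c></a>",
            "http://www.example.org/model/doc1");
        Element b = (Element) doc.getElementsByTagName("b").item(0);
        Element c = (Element) doc.getElementsByTagName("c").item(0);
        System.out.println(resolveContent(b)); // uses the xml:base on b
        System.out.println(resolveContent(c)); // falls back to the asserted document URI
    }
}
```

With this approach the DOM carries the base URI calculation, so deref would not need to assume that the model base URI always applies.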
Impacts of this enhancement
- The repository FileOperation class was updated to supply a base URI for SML model units (sets of model documents that lack an encapsulating SML-IF document instance) equal to the root directory of the SML model units on the local machine. This allows existing SMLModelUnit test cases to continue working unchanged.
- No updates were made to import/export metadata to handle document-level base URIs etc. That code path might well need other enhancements for spec compliance if full compliance is desirable there.