
Difference between revisions of "COSMOS Design 237921"


Revision as of 14:18, 30 July 2008

Support PSVI in SML Validation

Change History

Name             Date         Revised Sections
Ali Mehregani    06/23/2008   Initial creation
David Whiteman   07/30/2008   Added notes to Open Issues section

Workload Estimation

Rough workload estimate in person weeks

Process                    Sizing   Names of people doing the work
Design                     0.5      Ali Mehregani
Code                       6        Ali Mehregani / David Whiteman
Test                       2.5      Ali Mehregani
Documentation              0
Build and infrastructure   0
Code review, etc.*         0
TOTAL                      9

Terminologies/Acronyms

The terms and acronyms below are used throughout this document.

Term     Definition
SML      Service Modeling Language
SML-IF   Service Modeling Language - Interchange Format
PSVI     Post Schema Validation Infoset

Purpose

This document is associated with bugzilla 237921 and bugzilla 237872.

The purpose of this feature is to use PSVI when constructing the structures required for validating SML constraints. Currently the validator parses each definition document itself to determine element types, their derivation, and any associated SML constraints. This manual parsing increases the validator's run time and memory consumption, and it cannot cover every construct that a schema may legally contain. It is preferable to rely on an established interface rather than manually parsing each schema document.

What is PSVI?

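The Post Schema Validation Infoset (PSVI) is the XML infoset augmented with the schema-level information that becomes available when a document is schema-validated, such as the element declaration and type definition that apply to each element. Apache Xerces exposes this information through a set of interfaces that can be used with either DOM or SAX. The snippet below shows how the PSVI provider can be retrieved when SAX parsing a document: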

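import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.xerces.xs.PSVIProvider;

// Apache Xerces must be the JAXP parser in use; PSVIProvider is a Xerces interface
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
saxParserFactory.setNamespaceAware(true); // XML Schema validation requires namespace processing
saxParserFactory.setFeature("http://apache.org/xml/features/generate-synthetic-annotations", true);
saxParserFactory.setFeature("http://xml.org/sax/features/validation", true);

SAXParser newSaxParser = saxParserFactory.newSAXParser();
newSaxParser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");
newSaxParser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource", <LIST OF SCHEMA INPUT>);

// The XMLReader supplied by Xerces also implements PSVIProvider
PSVIProvider psviProvider = (PSVIProvider)newSaxParser.getXMLReader();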

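Given psviProvider, the PSVI item for the current element or for one of its attributes can be retrieved using psviProvider.getElementPSVI() or psviProvider.getAttributePSVI(...). These calls are meant to be made while the corresponding startElement or endElement event is being handled. The following snippet determines the base type of the current element: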

ElementPSVI elementDeclaration = psviProvider.getElementPSVI();
XSTypeDefinition typeDefinition = elementDeclaration.getTypeDefinition();
System.out.println("The base type of the current element is: " + typeDefinition.getBaseType().getName());
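The same idea, reading type information from inside a SAX callback, can be tried without Xerces on the classpath. The sketch below is not the PSVIProvider approach described above: it swaps in the standard JAXP TypeInfoProvider, which exposes only the type name and namespace rather than a full XSTypeDefinition. The class name, the inline schema, and the element names are all hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.TypeInfoProvider;
import javax.xml.validation.ValidatorHandler;
import org.w3c.dom.TypeInfo;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class ElementTypeSketch {
    // Hypothetical schema used only for this illustration
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:complexType name='OrderType'><xs:sequence>"
      + "<xs:element name='qty' type='xs:int'/>"
      + "</xs:sequence></xs:complexType>"
      + "<xs:element name='order' type='OrderType'/>"
      + "</xs:schema>";

    /** Maps each element's local name to the name of its schema type. */
    static Map<String, String> elementTypes(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            ValidatorHandler validator = schema.newValidatorHandler();
            TypeInfoProvider types = validator.getTypeInfoProvider();
            validator.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    // Type information is only available while the event is being handled
                    TypeInfo info = types.getElementTypeInfo();
                    result.put(local, info == null ? null : info.getTypeName());
                }
            });
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            // SAX events flow through the validator before reaching our handler
            reader.setContentHandler(validator);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(elementTypes("<order><qty>3</qty></order>"));
    }
}
```

Walking the derivation chain, as getBaseType() does in the snippet above, still requires the Xerces PSVI interfaces; TypeInfo only offers an isDerivedFrom(...) test.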

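The next snippet uses PSVI to determine the value of the sml:acyclic attribute. It retrieves the annotations of the type associated with the current element and checks whether sml:acyclic is set. With the generate-synthetic-annotations feature enabled as shown earlier, Xerces generates an annotation even for a schema component that has non-schema attributes but no annotation child: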

ElementPSVI elementDeclaration = psviProvider.getElementPSVI();
XSTypeDefinition typeDefinition = elementDeclaration.getTypeDefinition();

// sml:acyclic applies to complex types only, so guard the cast
if (typeDefinition instanceof XSComplexTypeDefinition)
{
   XSObjectList annotationList = ((XSComplexTypeDefinition)typeDefinition).getAnnotations();
   DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

   for (int i = 0, annotationCount = annotationList.getLength(); i < annotationCount; i++)
   {
      // A DOM document can hold only one root element, so use a fresh document per annotation
      Document domDocument = factory.newDocumentBuilder().newDocument();
      XSObject annotation = annotationList.item(i);
      ((XSAnnotation)annotation).writeAnnotation(domDocument, XSAnnotation.W3C_DOM_DOCUMENT);
      Node acyclicAttribute = domDocument.getFirstChild().getAttributes().getNamedItemNS(ISMLConstants.SML_URI, ISMLConstants.ACYCLIC_ATTRIBUTE);
      if (acyclicAttribute != null)
      {
         System.out.println(acyclicAttribute.getNodeValue());
      }
   }
}
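The attribute lookup at the end of that loop can be exercised on its own with plain DOM. The following is a minimal sketch that assumes the annotation has already been serialized to a string. The class name is hypothetical, and the namespace URI and attribute name are assumed stand-ins for ISMLConstants.SML_URI and ISMLConstants.ACYCLIC_ATTRIBUTE, whose actual values are not given in this document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class AcyclicLookupSketch {
    // Assumed values; the real ones come from ISMLConstants
    static final String SML_URI = "http://www.w3.org/ns/sml";
    static final String ACYCLIC_ATTRIBUTE = "acyclic";

    /** Returns true if the root element carries sml:acyclic with a true value. */
    static boolean isAcyclic(String annotationXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without namespace awareness getNamedItemNS cannot match the attribute
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(annotationXml)));
            Node attribute = document.getDocumentElement().getAttributes()
                .getNamedItemNS(SML_URI, ACYCLIC_ATTRIBUTE);
            if (attribute == null) {
                return false;
            }
            // xs:boolean allows both "true" and "1" as true values
            String value = attribute.getNodeValue().trim();
            return "true".equals(value) || "1".equals(value);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String annotation = "<xs:annotation xmlns:xs='http://www.w3.org/2001/XMLSchema'"
            + " xmlns:sml='" + SML_URI + "' sml:acyclic='true'/>";
        System.out.println(isAcyclic(annotation));
    }
}
```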

Implementation Detail

All data builders that construct structures from definition documents are expected to be replaced by PSVI-based lookups. There are currently three phases to the validation process:

  1. Constructing the data structures required by each validator
  2. Executing validators to verify SML constraints
  3. Checking schematron constraints

Each validator currently registers the set of data structures it requires for validating a constraint. The data builders associated with a validator are content handlers that are invoked while parsing an SML-IF document.

The first and second phases will be affected by this enhancement. The SMLMainValidator will be modified to parse through an SML-IF document using three different content handlers:

  1. HeaderContentHandler
  2. DefinitionContentHandler
  3. InstanceContentHandler

HeaderContentHandler determines the identity, rule bindings, and schema bindings of the SML-IF document. The structures built by this content handler are used later in the parsing process to bind definition documents to instance documents. DefinitionContentHandler gathers all schemas that are to be used when validating instance documents. InstanceContentHandler schema-validates each instance document while parsing it, in order to build the data structures required for validating the SML constraints. Once the document content has been parsed, SMLMainValidator invokes each validator to check the state of each constraint.
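How one SAX pass might be divided among the three handlers can be sketched as below. This is only an illustration, not the SMLMainValidator design: a single router handler records which phase each embedded document would be delegated to. The class name and phase labels are hypothetical, and the SML-IF namespace URI and the element names (definitions, instances, document) are assumptions taken from the SML-IF specification rather than from this document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SmlifSectionSketch {
    // Assumed SML-IF namespace; adjust to the version actually in use
    static final String SMLIF_NS = "http://www.w3.org/ns/sml-if";

    /** Returns, in document order, the phase that would handle each embedded document. */
    static List<String> classify(String smlif) {
        List<String> phases = new ArrayList<>();
        DefaultHandler router = new DefaultHandler() {
            // Everything before the definitions section belongs to the header
            private String phase = "header";

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (!SMLIF_NS.equals(uri)) {
                    return; // content of an embedded document, not SML-IF structure
                }
                if ("definitions".equals(local)) {
                    phase = "definition";   // DefinitionContentHandler takes over
                } else if ("instances".equals(local)) {
                    phase = "instance";     // InstanceContentHandler takes over
                } else if ("document".equals(local)) {
                    phases.add(phase);
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(smlif)), router);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return phases;
    }

    public static void main(String[] args) {
        String model = "<m:model xmlns:m='" + SMLIF_NS + "'>"
            + "<m:definitions><m:document/></m:definitions>"
            + "<m:instances><m:document/></m:instances></m:model>";
        System.out.println(classify(model));
    }
}
```

In a fuller implementation the router would forward events to the handler for the current phase instead of recording a label.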

Figure 1.1 depicts the validator's flow when processing an SML-IF document:

237921-0.png
Figure 1.1 - SML Validator's flow

Data Builders

The following data builders will need to be modified/removed:

  • AbstractDeclarationBuilder.java - Abstract class for classes such as GroupDeclarationBuilder
  • AcyclicDataTypesList.java - Extract complex types that have sml:acyclic set to true
  • ComplexTypeElementBuilder.java - Stores complex type declaration
  • ElementDeclarationBuilder.java - Stores global element declaration
  • ElementSchematronCacheBuilder.java - Stores schematron associated with an element/type declaration
  • ElementTypeMapDataBuilder.java - Stores the relationship between the element names and their associated type.
  • GroupDeclarationBuilder.java - Stores group declarations
  • IdentityConstraintDataBuilder.java - Stores the identity constraints associated with elements
  • SchemaBindingDataBuilder.java - Used for schema binding
  • ElementSourceBuilder.java - Stores the source for definition/instance documents
  • SMLValidatingBuilder.java - Stores elements
  • SubstitutionBuilder - Used for substitution groups
  • TargetSchemaBuilder.java - Element declarations with target* constraints
  • TargetSchemaBuilder.java - Stores type declarations
  • TypeInheritanceDataBuilderImpl.java - Keeps track of type inheritance

Task Breakdown

The following tasks are required to complete this enhancement:

  1. Modify SMLMainValidator to invoke the three content handlers
  2. Create HeaderContentHandler
  3. Build the structures for HeaderContentHandler
  4. Create DefinitionContentHandler
  5. Build the structures for DefinitionContentHandler
  6. Create InstanceContentHandler
  7. Use the structures created by HeaderContentHandler and DefinitionContentHandler to invoke the data builders associated with each validator
  8. Build the structures required for sml:acyclic
  9. Complete the validator for acyclic
  10. Build the structures required for target* constraints
  11. Complete the validator for target* constraints
  12. Build the structures required for identity constraints
  13. Complete the validator for identity constraints
  14. Test to make sure all existing test cases pass

References

Open Issues/Questions

  • How do we handle validating SML-IF documents that contain only schemas?
    After conferring with Sandy Gao of the SML working group, we decided on the following approach:
    • For SML-IF documents containing no instance documents, we need to create a dummy element to parse the schemas with.
    • Using PSVI, we can then retrieve an object of type XSModel, which represents all schemas in a validation set.
    • Validators can examine the XSModel object to ensure the schemas are syntactically correct.
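
For comparison only, a schema-only check can also be sketched with standard JAXP. This is a swapped-in technique, not the dummy-element and XSModel approach agreed above: it compiles the schemas directly and collects the problems reported while doing so. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class SchemaOnlyCheckSketch {
    /** Compiles the given schema documents together and returns any reported problems. */
    static List<String> schemaProblems(String... schemaDocuments) {
        List<String> problems = new ArrayList<>();
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // warnings are ignored in this sketch
            }

            @Override
            public void error(SAXParseException e) {
                problems.add(e.getMessage()); // keep going to collect further errors
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                problems.add(e.getMessage());
                throw e; // processing cannot continue after a fatal error
            }
        });
        Source[] sources = new Source[schemaDocuments.length];
        for (int i = 0; i < schemaDocuments.length; i++) {
            sources[i] = new StreamSource(new StringReader(schemaDocuments[i]));
        }
        try {
            factory.newSchema(sources); // no instance document is needed
        } catch (SAXException e) {
            if (problems.isEmpty()) {
                problems.add(String.valueOf(e.getMessage()));
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String schema = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='item' type='MissingType'/></xs:schema>";
        schemaProblems(schema).forEach(System.out::println);
    }
}
```

Unlike XSModel, this gives validators no schema component model to inspect; it only reports whether the schemas compile.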


All reviewer feedback should go in the Talk page for 237921.

