PsychoPathXPathProcessor/UserInterface

A user interface is generally understood as the means by which a user interacts with a program, usually a graphical user interface (GUI) or a command line interface. However, because PsychoPath has been designed as a library, the user interface in this context is the set of public methods the library exposes, which the user invokes in a defined manner to obtain a result. This user interface should be designed carefully and thoroughly for two primary reasons:

  1. The user interface is the only section of the library visible to the user under normal circumstances and should be designed in a logically grouped manner for ease of use.
  2. In a library, the user interface usually follows and reflects the architectural design of the underlying implementation. Thus, a poorly designed user interface is indicative of a poor underlying structure.

High level Overview

Processing an XPath 2.0 expression can be decomposed into the following set of sequential and largely uncoupled operations:

  1. Load the XML document.
  2. Optionally validate the XML document.
  3. Initialize the static and dynamic contexts with respect to the document root.
  4. Parse the XPath 2.0 expression.
  5. Statically verify the XPath 2.0 expression.
  6. Evaluate the XPath 2.0 expression with respect to the XML document.

This decomposability has allowed PsychoPath to be designed in a highly modular manner with almost no coupling between the packages. This leaves room for future extension, and even for the re-implementation of existing packages with few modifications needed in other packages.

We have chosen to use the external DOM package, Xerces, to handle the first two tasks, namely loading the XML document and performing the optional XML Schema verification. PsychoPath makes use of Xerces in a largely package independent manner; only the XML Schema extraction and manipulation is based on Xerces’ implementation. As Xerces is used as the default DOM package within Java 1.5, we believe this minimal coupling is acceptable. If the user nevertheless wants a different DOM package, a common wrapper can easily be implemented so that functionality in the other packages is not broken.

The initialization of the static and dynamic contexts is performed internally, entirely within the DynamicContext class. This class initializes the relevant properties to their default values, adds Post Schema Validation Information (PSVI) according to the XML Schema of the document, and handles the registration of data-type namespaces and function libraries.

We have used the external packages JFlex and CUP to generate an XPath 2.0 parser, which PsychoPath uses to parse the XPath expression and represent it as an Abstract Syntax Tree (AST). PsychoPath’s usage of this parser is entirely package independent, so a replacement parser designed specifically for XPath 2.0 can be used instead with no break in functionality elsewhere.

Static verification of the AST represented XPath expression is performed internally using a Visitor pattern to traverse the AST. Other Visitors (e.g. an Optimizer visitor to simplify and optimize the XPath expression) can be implemented for additional functionality.
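The Visitor-based traversal described above can be illustrated with a minimal, self-contained sketch. The node and visitor types here (AstNode, AstVisitor, VariableRef, AddExpr, UndefinedNameCollector) are hypothetical stand-ins invented for this example and are not PsychoPath classes; the point is only that each new check or optimization becomes another visitor, with no change to the node classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical miniature AST; none of these types are PsychoPath classes.
interface AstNode {
    <T> T accept(AstVisitor<T> visitor);
}

interface AstVisitor<T> {
    T visitVariable(VariableRef node);
    T visitAdd(AddExpr node);
}

final class VariableRef implements AstNode {
    final String name;
    VariableRef(String name) { this.name = name; }
    public <T> T accept(AstVisitor<T> visitor) { return visitor.visitVariable(this); }
}

final class AddExpr implements AstNode {
    final AstNode left;
    final AstNode right;
    AddExpr(AstNode left, AstNode right) { this.left = left; this.right = right; }
    public <T> T accept(AstVisitor<T> visitor) { return visitor.visitAdd(this); }
}

// A static check written as a visitor: records variable names that were never declared.
final class UndefinedNameCollector implements AstVisitor<Void> {
    private final Set<String> declared;
    final List<String> undefined = new ArrayList<>();

    UndefinedNameCollector(Set<String> declared) { this.declared = declared; }

    public Void visitVariable(VariableRef node) {
        if (!declared.contains(node.name)) {
            undefined.add(node.name);
        }
        return null;
    }

    public Void visitAdd(AddExpr node) {
        // Visit both operands, left to right.
        node.left.accept(this);
        node.right.accept(this);
        return null;
    }
}

final class VisitorSketch {
    static List<String> undefinedNames(AstNode root, Set<String> declared) {
        UndefinedNameCollector collector = new UndefinedNameCollector(declared);
        root.accept(collector);
        return collector.undefined;
    }

    public static void main(String[] args) {
        AstNode expr = new AddExpr(new VariableRef("x"), new VariableRef("y"));
        System.out.println(undefinedNames(expr, Collections.singleton("x"))); // prints [y]
    }
}
```

An optimizer would follow the same shape, implementing AstVisitor<AstNode> and returning a rewritten tree instead of collecting names.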

Finally, the evaluation of the XPath expression is also performed internally.


Example Usage

Loading the XML Document

The first step in using PsychoPath is to load the relevant XML document. PsychoPath has been designed for use with the DOM package Xerces, and although the XML Schema usage and manipulation is coupled to Xerces, a common wrapper can easily be implemented to allow the use of any specification-compliant external DOM package.

If using Xerces, this entire step is achieved by initially creating an InputStream from the XML document and initializing the Xerces DOM loader in the following manner:

// InputStream is abstract, so the file is opened through java.io.FileInputStream
InputStream is = new FileInputStream(XMLdocument);
DOMLoader domloader = new XercesLoader();

Optional XML Schema checking, which verifies the structure and integrity of the XML document, is enabled at this point by setting a flag within the DOMLoader object:

domloader.setvalidating(true);

Finally, the XML document is loaded and the root of its Document Object Model (DOM) is stored:

Document doc = domloader.load(is);
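For readers who want to see what a "common wrapper" around a different DOM package might look like, the following is a minimal sketch that uses only the JDK's JAXP API (javax.xml.parsers). The interface SimpleDomLoader and the class JaxpDomLoader are hypothetical names chosen for this example; PsychoPath's own DOMLoader interface may differ, and this sketch enables only DTD validation, not XML Schema validation.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical loader abstraction; PsychoPath's own DOMLoader interface may differ.
interface SimpleDomLoader {
    void setValidating(boolean validating);
    Document load(InputStream in);
}

// Wrapper backed by whichever JAXP parser the running JDK provides.
final class JaxpDomLoader implements SimpleDomLoader {
    private boolean validating;

    public void setValidating(boolean validating) {
        this.validating = validating;
    }

    public Document load(InputStream in) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // keep namespace information in the DOM
            factory.setValidating(validating); // DTD validation only in this sketch
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(in);
        } catch (Exception e) {
            // Wrap parser, SAX and I/O failures in one unchecked exception for brevity.
            throw new IllegalStateException("could not load XML document", e);
        }
    }
}

final class JaxpLoaderSketch {
    static Document loadString(String xml) {
        SimpleDomLoader loader = new JaxpDomLoader();
        loader.setValidating(false);
        return loader.load(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        Document doc = loadString("<catalog><book/></catalog>");
        System.out.println(doc.getDocumentElement().getNodeName()); // prints catalog
    }
}
```

Because callers see only the loader interface and an org.w3c.dom.Document, the parser behind it can be swapped without touching the later processing steps.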

Initializing static and dynamic contexts

The static context in PsychoPath is initialized automatically, so the user only needs to set up the dynamic context with respect to the schema information of the document (which may be null for schema-less documents). If Xerces was used to load the XML document, the schema must first be extracted from the DOM root of the XML document. This extraction and initialization is shown below:

ElementPSVI rootPSVI = (ElementPSVI)doc.getDocumentElement();
XSModel schema = rootPSVI.getSchemaInformation();
DynamicContext dc = new DefaultDynamicContext(schema, doc);

There are two other essential initializations within this step. The first is the registration of the namespaces of the XPath 2.0 predefined data-types (the ‘xs’ and ‘xdt’ namespaces) as shown below. Any user defined namespaces should also be registered at this point in the same manner. Automatic registration of the namespaces defined in the document and its Schema has not yet been implemented, but should be in the future. All code is namespace aware, so adding a routine to extract namespace information should be straightforward.


dc.addnamespace("xs","http://www.w3.org/2001/XMLSchema");
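Since automatic namespace registration is described as future work, the following is a rough sketch of how the prefix-to-URI pairs declared on a document's root element could be collected with the standard DOM API. The class and method names are hypothetical, only declarations on the root element are considered (nested declarations are ignored), and feeding the resulting pairs into the dynamic context is left to the caller.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical helper; not part of PsychoPath.
final class NamespaceExtractorSketch {

    // Returns prefix -> URI for every xmlns declaration on the element.
    // The default namespace, if declared, is stored under the empty prefix "".
    static Map<String, String> declaredNamespaces(Element element) {
        Map<String, String> result = new LinkedHashMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            String name = attr.getNodeName();
            if (name.equals("xmlns")) {
                result.put("", attr.getValue());
            } else if (name.startsWith("xmlns:")) {
                result.put(name.substring("xmlns:".length()), attr.getValue());
            }
        }
        return result;
    }

    // Parses an XML string with the JDK's namespace-aware parser.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<r xmlns:xs='http://www.w3.org/2001/XMLSchema'/>");
        // Each entry could then be registered with the dynamic context.
        declaredNamespaces(doc.getDocumentElement())
                .forEach((prefix, uri) -> System.out.println(prefix + " -> " + uri));
    }
}
```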

The second essential initialization is the registration of the predefined XPath 2.0 functions. PsychoPath groups these functions into a standard library to simplify this registration step. Any user defined functions should also be registered at this point in a similar manner.

// The default fn library
dc.addfunctionlibrary(new FnFunctionLibrary());

// Constructor functions for Schema types
dc.addfunctionlibrary(new XSCtrLibrary());

Parsing the XPath expression

The XPath 2.0 expression must now be parsed and represented as an Abstract Syntax Tree (AST) which is the internal format that PsychoPath uses. PsychoPath includes a parser created using JFlex and CUP that performs this step and its usage is as follows:

XPathParser xpp = new JFlexCupParser();
XPath path = xpp.parse(StringPath);

Static type checking

The XPath 2.0 expression obtained must be statically type checked to verify its structural validity, and check for possibly undefined names. PsychoPath uses a class implementing the Visitor pattern for traversing and checking the AST and is used as follows:

StaticChecker namecheck = new StaticNameResolver(dc);
namecheck.check(path);

Evaluating the XPath expression

Finally, the XPath 2.0 expression is evaluated. This is shown below; the result of the evaluation is stored in a ResultSequence:

Evaluator eval = new DefaultEvaluator(dc, doc);
ResultSequence rs = eval.evaluate(path);

Extracting the results

XPath 2.0 defines everything to be a sequence of items, including the arguments to expressions and the results of operations. Thus, the overall result of an XPath expression evaluation is also a sequence of items. PsychoPath uses the class ResultSequence as a Collection wrapper to store these sequences, so the result of an evaluation is also of this type.

Extraction of specific or successive items from the ResultSequence class is fully described in figure ??. However, all the items extracted will have the type of the base class AnyType. They then need to be cast to the correct concrete type in order to be used. Certain operations always return a particular type, and using this knowledge, the extracted item may be cast immediately. An example is the “if expression” which always returns a boolean type and can safely be cast as such:

XSBoolean xsbool = (XSBoolean)(rs.first());

The actual result can now be extracted from this XSBoolean in the following manner:

boolean bool = xsbool.value();

Alternatively, a String representation of the value can also be extracted from the XSBoolean as shown below:

String sbool = xsbool.stringvalue();

However, if the expected return type is unknown or multiple types are possible, the type hierarchy depicted in figure ?? may be traversed in a breadth-first manner, using the Java instanceof operator to ascertain the actual type. The first query determines whether the type is derived from NodeType or AnySimpleType:

AnyType at = rs.first();
if(at instanceof NodeType)
  checkNodeTypes(at);
else if(at instanceof AnySimpleType)
  checkSimpleTypes(at); 

The result of this query determines which subsequent queries should be performed, eventually reaching a leaf type satisfying these two criteria:

  1. The type is not abstract
  2. No other types are derived from this type

At this point, the actual type has been determined and the item can safely be cast to it in order to extract the final result. For example, if the type has been progressively narrowed down to NumericType, the next query will determine the type to be a leaf node, at which point it can safely be cast:

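if (at instanceof XSInteger) { // leaf type
  XSInteger result = (XSInteger) at;
  // use result here; a declaration needs an enclosing block to compile
} else if (at instanceof XSDecimal) {
  XSDecimal result = (XSDecimal) at;
  // use result here
} ...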
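The narrowing procedure above can be tried out in isolation with the following self-contained sketch. The classes Item, NodeItem, ElementItem, AtomicItem, IntegerItem and DecimalItem are hypothetical stand-ins for a type hierarchy and do not correspond one-to-one to PsychoPath's AnyType hierarchy; the sketch only shows the top-down sequence of instanceof queries ending at a concrete leaf type.

```java
import java.math.BigDecimal;

// Hypothetical stand-ins for an item type hierarchy; these are not PsychoPath's classes.
abstract class Item {}

abstract class NodeItem extends Item {}

final class ElementItem extends NodeItem {
    final String name;
    ElementItem(String name) { this.name = name; }
}

abstract class AtomicItem extends Item {}

final class IntegerItem extends AtomicItem {
    final long value;
    IntegerItem(long value) { this.value = value; }
}

final class DecimalItem extends AtomicItem {
    final BigDecimal value;
    DecimalItem(BigDecimal value) { this.value = value; }
}

final class TypeNarrowingSketch {

    // First query: which abstract branch of the hierarchy does the item belong to?
    static String describe(Item item) {
        if (item instanceof NodeItem) {
            return describeNode((NodeItem) item);
        } else if (item instanceof AtomicItem) {
            return describeAtomic((AtomicItem) item);
        }
        throw new IllegalArgumentException("unknown item type: " + item.getClass());
    }

    // Subsequent queries narrow down to a concrete leaf type, where the cast is safe.
    private static String describeNode(NodeItem node) {
        if (node instanceof ElementItem) {
            return "element " + ((ElementItem) node).name;
        }
        throw new IllegalArgumentException("unknown node type: " + node.getClass());
    }

    private static String describeAtomic(AtomicItem atomic) {
        if (atomic instanceof IntegerItem) {
            return "integer " + ((IntegerItem) atomic).value;
        } else if (atomic instanceof DecimalItem) {
            return "decimal " + ((DecimalItem) atomic).value.toPlainString();
        }
        throw new IllegalArgumentException("unknown atomic type: " + atomic.getClass());
    }

    public static void main(String[] args) {
        System.out.println(describe(new IntegerItem(42)));    // prints integer 42
        System.out.println(describe(new ElementItem("book"))); // prints element book
    }
}
```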

User Interface Evaluation

PsychoPath has a well defined and logically assembled user interface. This is due both to the easily decomposable and sequential nature of XPath 2.0 processing and to the extensive time we spent iteratively designing and evaluating our system architecture.

The final cast of the result extracted from the ResultSequence is the weakest aspect of the user interface, but it is unavoidable because Java 1.4.x has no support for generics (templates).
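To illustrate what generics would change, here is a small sketch, assuming Java 5 or later, of a typed accessor that performs the check and the cast in one place. The helper firstAs is hypothetical, operates on a plain java.util.List rather than on PsychoPath's ResultSequence, and is not part of the library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; it works on a plain List, not on PsychoPath's ResultSequence.
final class TypedAccessSketch {

    // Returns the first item as the requested type, or fails with a descriptive error.
    static <T> T firstAs(List<?> sequence, Class<T> type) {
        if (sequence.isEmpty()) {
            throw new IllegalArgumentException("empty sequence");
        }
        Object first = sequence.get(0);
        if (!type.isInstance(first)) {
            throw new ClassCastException(
                    "expected " + type.getName() + " but found " + first.getClass().getName());
        }
        return type.cast(first);
    }

    public static void main(String[] args) {
        List<Object> result = Arrays.<Object>asList(Boolean.TRUE);
        boolean value = firstAs(result, Boolean.class); // no cast at the call site
        System.out.println(value); // prints true
    }
}
```

The cast does not disappear, but it moves out of user code into a single checked helper.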
