
SMILA/Documentation/XML storage

Revision as of 05:23, 2 September 2008 by Unnamed Poltroon

Introduction

The XML Storage shall be used by several components within EILF. At this time they are:

  • IRM
  • BPEL
  • Queue
  • Blackboard Service Concept

The main use case shall be to store and retrieve XML documents, as well as to obtain a set of documents via an XPath/XQuery expression. The first API draft shall define the basic CRUD operations. In-place modifications of sub-nodes are not yet needed (Prio 2 or 3).

It is suggested to publish the needed functionality as an OSGi Service with the possibility to run multiple instances, which may or may not be running in the same VM. The latter case shall be covered by using SCA, which handles this matter transparently to the user but imposes a few constraints on the API, at least in the Tuscany implementation. These constraints are:

  • Return values and parameters of methods must be serializable
  • Overloading of methods is not allowed


Xml Storage Service

The intended usage of the XML Storage is very much that of a service or server (e.g. like a real DB server such as MySQL, Oracle, etc.) as opposed to a library-type implementation. Hence the implementation shall be done as an OSGi Service that is wired up with Declarative Services.

The service itself must support multiple requests at the same time and therefore needs to be multi-threaded. The intention is to use a connection-type approach, as is the case for SQL DBs. This means that multiple clients may connect to the service, and each client may open multiple connections that are used to query/store XML documents concurrently.
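
The connection-type approach could look roughly like the following sketch. It is not the planned implementation: `XmlStorageService`, `XmlStorageConnection` and their methods are hypothetical names, and a `ConcurrentHashMap` stands in for the actual XML database so that several connections can store and retrieve documents from different threads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical service: one shared, thread-safe store, many connections.
class XmlStorageService {
    private final Map<String, String> documents = new ConcurrentHashMap<>();
    private final AtomicInteger openConnections = new AtomicInteger();

    // Each client may call this several times to obtain several connections.
    XmlStorageConnection connect() {
        openConnections.incrementAndGet();
        return new XmlStorageConnection(this);
    }

    int openConnectionCount() {
        return openConnections.get();
    }

    // Helpers used by connections only.
    void store(String key, String xml) { documents.put(key, xml); }
    String load(String key) { return documents.get(key); }
    void release() { openConnections.decrementAndGet(); }
}

// Hypothetical connection handle; intended to be used by one thread at a time.
class XmlStorageConnection implements AutoCloseable {
    private final XmlStorageService service;
    private boolean closed;

    XmlStorageConnection(XmlStorageService service) {
        this.service = service;
    }

    void store(String key, String xml) {
        ensureOpen();
        service.store(key, xml);
    }

    String retrieve(String key) {
        ensureOpen();
        return service.load(key);
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("connection already closed");
        }
    }

    @Override
    public void close() {
        // Closing twice is harmless; the counter is decremented only once.
        if (!closed) {
            closed = true;
            service.release();
        }
    }
}
```

Because the connection is `AutoCloseable`, a client can use it in a try-with-resources block, which mirrors how JDBC connections are typically handled.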


An OSGi service is still run and called within the same JVM. This is in contrast to normal DB services, which typically run in their own process, so that communication is done via TCP/IP, pipes, etc. In the end we need to be able to access the Xml Storage Service remotely as well. This can and shall be done with SCA, which makes the matter transparent to the client and moves this aspect into the configuration of the setup/installation.

TODO :

  • Figure out how this is done
  • Determine the performance hit/overhead of SCA


Xml Storage Use

  • Retrieval of a doc may be done either by a string key or by formulating an XQuery, which returns a sequence of XML nodes (types) and as such may return whole documents or parts of a document

Because the storage scope is that of whole documents, we should also work with these as a whole. Although it is possible to convert an element node obtained via XQuery into a document (this involves extracting the element and all its content as text and then parsing this into a DOM), this process is obviously lengthy/costly. As such, we should store subsections of XML documents that we use often on their own (i.e. without their parent/containing document context) as separate entities. Obviously they need to be linked (internally in the Storage API?) so that we can clean them up properly.
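
To make the cost of that conversion concrete, the sketch below shows the serialize-and-reparse route using only JDK classes. Two assumptions apply: the JDK ships no XQuery engine, so an XPath expression is used as a stand-in for the query, and `SubDocumentExtractor` with its methods is a hypothetical helper, not part of any storage API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper illustrating query + element-to-document conversion.
class SubDocumentExtractor {

    // Parses XML text into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    // Evaluates an XPath expression (stand-in for XQuery) and returns the matching nodes.
    static NodeList select(Document doc, String xpath) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate " + xpath, e);
        }
    }

    // The costly route described above: serialize the element to text,
    // then parse that text again into an independent document.
    static Document toStandaloneDocument(Element element) {
        try {
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter text = new StringWriter();
            serializer.transform(new DOMSource(element), new StreamResult(text));
            return parse(text.toString());
        } catch (Exception e) {
            throw new IllegalStateException("cannot convert element", e);
        }
    }
}
```

Every call to `toStandaloneDocument` pays for a full serialization plus a full parse, which is the overhead that storing frequently used subsections as their own entities would avoid.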

  • Common API uses might be good to encapsulate in their own layer, so that each client doesn't have to perform all low-level functions itself
  • original documents with the EilfID
  • Store EilfID itself -> key for its retrieval could be an MD5 hash
    • Needs normative calculation
    • The MD5 hash could be cached at the key itself so it doesn't have to be recalculated each time
  • Other use cases/api needed?
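
The MD5-key idea from the list above can be sketched as follows. This is only an illustration: the class `IdKey` is hypothetical, the ID is treated as a plain string, and trimming plus UTF-8 encoding is an assumed placeholder for the normative calculation, which the text leaves undefined.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical key object that computes the MD5 hash once and caches it.
final class IdKey {
    private final String id;
    private final String md5;  // cached so it is not recalculated on every lookup

    IdKey(String id) {
        // Assumed normalization: surrounding whitespace is ignored.
        this.id = id.trim();
        this.md5 = md5Hex(this.id);
    }

    String id() { return id; }

    String md5() { return md5; }

    // Computes the lower-case hex MD5 digest of the UTF-8 bytes of the text.
    static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform implementation.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Whatever normalization is eventually chosen has to be identical for all clients; otherwise the same ID would map to different storage keys.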


Binary Storage

Although it is possible to save binary objects in Berkeley DB XML and possibly other XML DBs, it is better to provide separate OSGi Services for these distinctly different storage types. Apart from this, according to Ralf Schuman, who investigated this matter, the performance for larger binary objects does not seem to be good with BDB.


API
