Context Data Model Background

From Eclipsepedia


Motivation

Information fragmentation is a pervasive problem. Even seemingly simple activities depend on information from a number of heterogeneous sources. The information may be fragmented by physical location, device, application, middleware, data storage or platform. By providing a common data model (the Context Data Model), data from multiple locations and systems can be unified.

There is a great deal of interest among Web developers in solving interoperability problems and providing data portability. See, for example, http://DataPortability.org and many other related efforts. In this quest, the Context Data Model can provide a powerful enabler for interoperability of identity-related information across the "silos."

Why a Common Model?

There are other approaches to data unification than providing a common data model. However, every unification strategy involves choosing some kind of lowest common denominator. It is all a question of how low is low. The lower the level, the easier the unification, but the lossier the result. For example, consider raw text: it is easy to index, search, and copy/paste, but very lossy. Or consider XML, which offers a common syntax for describing a series of attributes of a given object and a value for each attribute, though still without any defined semantics.

Kinds of Data

The data model's focus is on the unification of identity-related data. We need to be able to create rich, contextualized representations of people, groups and organizations. These objects have attributes that range from simple literals (identification attributes, authentication data, names, email addresses and telephone numbers) to complex attributes that are essentially links to other objects (people, groups, documents, calendar events, music preferences, and so on). These relationship attributes might be "friend", "manager", "likes", "owns", etc.
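The distinction between literal attributes and relationship attributes can be sketched as follows. This is an illustrative model only; the names `Node`, `literals` and `relationships` are hypothetical and are not the actual Higgins IdAS API.

```python
from dataclasses import dataclass, field

# Illustrative sketch: a Node carries simple literal attributes (names,
# email addresses, etc.) plus relationship attributes that link to other
# Nodes (friend, manager, etc.).

@dataclass
class Node:
    node_id: str
    literals: dict = field(default_factory=dict)       # simple literal values
    relationships: dict = field(default_factory=dict)  # kind -> [Node, ...]

    def add_relationship(self, kind, other):
        self.relationships.setdefault(kind, []).append(other)

alice = Node("urn:example:alice",
             literals={"name": "Alice", "email": "alice@example.org"})
bob = Node("urn:example:bob", literals={"name": "Bob"})
alice.add_relationship("friend", bob)
alice.add_relationship("manager", bob)

print(sorted(alice.relationships))  # the relationship kinds on this node
```

The same node can thus be navigated both for its literal values and for the graph of objects it links to.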

A key innovation in the model is the Higgins correlation attribute. If object a has a correlation link to object b, this implies that both a and b are representations of the same person, organization, thing or concept that exists outside of the Higgins model.
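Because correlation asserts "same real-world entity," it behaves like an equivalence relation, and chains of correlation links group objects transitively. A minimal sketch of that behavior, assuming a simple union-find structure (this is not the Higgins implementation; all identifiers are made up):

```python
# Sketch: correlation links partition Nodes into groups that each represent
# one real-world entity. Union-find gives the transitive grouping cheaply.

class Correlations:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def correlate(self, a, b):
        # assert: a and b represent the same external entity
        self.parent[self.find(a)] = self.find(b)

    def same_entity(self, a, b):
        return self.find(a) == self.find(b)

c = Correlations()
c.correlate("ctx1:alice", "ctx2:a.smith")    # two views of one person
c.correlate("ctx2:a.smith", "ctx3:asmith7")
print(c.same_entity("ctx1:alice", "ctx3:asmith7"))  # True: transitive
```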

More about Interoperability

Saying we desire interoperability can mean many different things. At the least it should mean that we can navigate through and inspect data objects and their associated attributes/relationships within any Context through the Higgins API. This is part of what motivates Context Data Model Goals [2], [3] and [4]. At this level of interoperability we may not understand the meaning of the objects and the attributes, but we can know that they are there.

Moving further along the interoperability spectrum, if we add the requirement that every attribute/relationship is globally uniquely identifiable (see [5] in Context Data Model Goals) then we can use the Higgins IdAS API for more than a shallow syntactic parse of the data in various Contexts. We can, for example, assemble (join) attribute information about two Nodes held in two separate Contexts, and perhaps implemented by separate providers, without collision and data loss. Along these lines, Higgins itself needs to implement certain kinds of cross-Context attribute data flows for correlated Nodes.
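The collision-free join is the point of global attribute identifiers. A hedged sketch, assuming attributes are keyed by URIs (the URIs and the `join_attributes` helper below are illustrative, not part of IdAS):

```python
# Sketch: when each attribute is keyed by a globally unique URI, merging
# the attribute sets of two correlated Nodes from different Contexts cannot
# confuse two providers' different notions of, say, "employeeId".

ctx_a = {  # the Node as seen by Context A
    "http://xmlns.com/foaf/0.1/name": "Alice Smith",
    "http://example.org/hr/employeeId": "E-1001",
}
ctx_b = {  # the correlated Node as seen by Context B
    "http://xmlns.com/foaf/0.1/mbox": "alice@example.org",
    "http://example.org/crm/employeeId": "C-77",  # distinct URI, no collision
}

def join_attributes(*nodes):
    merged = {}
    for node in nodes:
        for uri, value in node.items():
            merged.setdefault(uri, set()).add(value)
    return merged

joined = join_attributes(ctx_a, ctx_b)
print(len(joined))  # 4 distinct attribute URIs survive the join
```

Had both providers used a bare local name like "employeeId", the two values would have silently collided; the URIs keep them apart.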

Beyond inspection and navigation, Higgins aspires to support applications that can also edit Context data. We envision Higgins-based applications with user interfaces that can manipulate data contained in any Context from any Context Provider bound into Higgins. This implies two things:

  1. We require that the semantics of the attributes of objects be defined in a single well-defined (unambiguous) manner. If the model has more degrees of freedom than the absolute minimum necessary, ambiguity will arise where different Context Providers express the same semantic in different ways. For more about this see [6] in Data Model Goals.
  2. The specific schema of a Context's use of the abstract Higgins data model must be exposed at the Context Provider (SPI) and IdAS (API) levels. This exposure allows an application to know what the valid degrees of freedom in the structure of the data are, and what values its data fields may assume. The application can learn from the schema what datatypes are used to describe the value of a given attribute (e.g. a string, a non-zero number, a date, etc.). It can learn what kinds of attributes may optionally be added to an object (and which may not). And it can learn what kinds of required and/or optional relationships are allowed with objects of various kinds. For more about the need for a common schema language see [7] in Data Model Goals.
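A generic editing application could use such an exposed schema to validate input before writing it back. A minimal sketch, assuming a toy schema description (the `SCHEMA` shape and `validate` helper are hypothetical, not the actual Higgins schema language):

```python
from datetime import date

# Toy schema: per attribute, its datatype and whether it is required,
# plus the set of relationship kinds this object type allows.
SCHEMA = {
    "attributes": {
        "name":  {"type": str,  "required": True},
        "age":   {"type": int,  "required": False},
        "hired": {"type": date, "required": False},
    },
    "relationships": {"manager", "friend"},
}

def validate(obj, schema=SCHEMA):
    errors = []
    for attr, rule in schema["attributes"].items():
        if attr not in obj:
            if rule["required"]:
                errors.append(f"missing required attribute: {attr}")
        elif not isinstance(obj[attr], rule["type"]):
            errors.append(f"bad datatype for {attr}")
    for rel in obj.get("relationships", {}):
        if rel not in schema["relationships"]:
            errors.append(f"relationship kind not allowed: {rel}")
    return errors

print(validate({"name": "Alice", "age": "forty"}))  # ['bad datatype for age']
```

A UI driven by such a schema can gray out disallowed relationship kinds and reject ill-typed field values without any per-Context code.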

See Also

  • Context Data Model Goals provides an enumeration of the top-level design goals that ultimately led to the decision to use an RDF/OWL-based metamodel.