Context Data Model Goals

Note: in the following the terms attribute and relationship are used generically, not with any specialized meaning. We could have said field instead of either one.

[1] The model is extensible; attributes/relationships can be added later

It must be possible to define new data fields (attributes, relationships, etc.) without breaking existing parsers or APIs. Allowing attributes/relationships to be added later implies that all attributes are uniquely named (which implies [5] below). The data models used in each Context are implementer-defined and thus open-ended; therefore the model must be extraordinarily abstract and fundamentally extensible.
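As a rough illustration of this goal, the sketch below assumes a hypothetical dictionary-based representation (it is not the Higgins API). A consumer written against an early set of attributes keeps working after a provider adds a new, uniquely named attribute. The example.org URIs and the function name read_known are assumptions made for the example.

```python
# Hypothetical attribute identifiers; the example.org URIs are placeholders.
NAME = "http://example.org/attr/name"
EMAIL = "http://example.org/attr/email"
NICKNAME = "http://example.org/attr/nickname"  # added later by a provider


def read_known(obj, known):
    """Return only the attributes this consumer understands, ignoring the rest."""
    return {attr: values for attr, values in obj.items() if attr in known}


# An object as an older consumer expects it.
old_obj = {NAME: ["Alice"], EMAIL: ["alice@example.org"]}
# The same object after the model was extended with a new attribute.
new_obj = {**old_obj, NICKNAME: ["Al"]}

known_to_old_consumer = {NAME, EMAIL}
# The older consumer sees identical data before and after the extension.
same_view = (read_known(old_obj, known_to_old_consumer)
             == read_known(new_obj, known_to_old_consumer))
```

Because every attribute has its own unique name, the new field cannot be mistaken for an existing one, and consumers that do not know it can skip it safely.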

[2] All objects can be identified uniquely.

All objects can be identified either by:

an absolute identifier or
a relative identifier that is unique within a given Context.
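One way to picture the two kinds of identifier is sketched below. The class name ObjectRef, the "ctx:" identifier format, and the use of "#" to join the two parts are all assumptions for illustration, not something the model prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRef:
    """A relative identifier paired with the Context that scopes it (hypothetical)."""
    context_id: str  # identifier of the Context
    local_id: str    # identifier that is unique only within that Context

    def absolute(self) -> str:
        # Joining with '#' is an assumed convention for this sketch.
        return f"{self.context_id}#{self.local_id}"


# The same relative identifier in two Contexts names two different objects.
a = ObjectRef("ctx:hr", "42")
b = ObjectRef("ctx:club", "42")
```

Here a.local_id equals b.local_id, yet a.absolute() and b.absolute() differ, which is what makes the relative form safe to use inside a single Context.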

[3] Objects have attributes and/or relationships with other objects. Attributes and/or relationships may be grouped into sets or sequences.

Objects have attributes whose value is another object. Such attributes are called complex-valued attributes.

[4] All objects and their attributes/relationships are addressable, navigable.

Context objects and their associated attributes/relationships can be addressed using a simple, consistent indexing/navigation scheme.
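A minimal sketch of such an indexing/navigation scheme follows, assuming objects are modeled as nested dictionaries whose attribute values are lists. The path syntax (a list of attribute names and list indexes) and the function name navigate are hypothetical.

```python
def navigate(obj, path):
    """Follow a sequence of attribute names and list indexes, starting at obj."""
    current = obj
    for step in path:
        current = current[step]
    return current


# 'address' is a complex-valued attribute: its value is another object.
person = {
    "name": ["Alice"],
    "address": [{"city": ["Boston"], "country": ["USA"]}],
}

# Address the first city of the first address of the person.
city = navigate(person, ["address", 0, "city", 0])
```

The same call works at any depth, so simple attributes, complex-valued attributes, and their individual values are all reachable in one consistent way.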

[5] Attributes/Relationships are identified by globally unique URIs.

NOTE: This goal has been removed for CDM 2.0.

This makes it possible to assemble (join) attribute information about two Entities held in separate Contexts, and perhaps implemented by separate providers, without attribute collisions and/or data loss. Along these lines, Higgins internally needs to implement certain kinds of attribute data flows between correlated Entities across Contexts.
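The join described above can be sketched as follows, assuming each Entity is a dictionary from attribute URI to a list of values. The function name join_attributes and the example.org URIs are hypothetical.

```python
def join_attributes(*entities):
    """Merge attribute values from several Entities, keyed by attribute URI."""
    merged = {}
    for entity in entities:
        for uri, values in entity.items():
            bucket = merged.setdefault(uri, [])
            for value in values:
                if value not in bucket:  # keep each distinct value once
                    bucket.append(value)
    return merged


# Both providers have a 'title', but the URIs keep the two meanings apart.
hr = {
    "http://example.org/hr/title": ["Engineer"],
    "http://example.org/attr/email": ["alice@example.org"],
}
club = {
    "http://example.org/club/title": ["Treasurer"],
    "http://example.org/attr/email": ["alice@example.org", "alice@club.example"],
}
joined = join_attributes(hr, club)
```

With short local names, the two title attributes would collide and one value would be lost. With globally unique URIs both survive, and the shared email attribute accumulates the distinct values from both Contexts.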

Goal [5], in combination with [2] above, means that an application using the Identity Attribute Service API can inspect each attribute/relationship of any object.

[6] There is a single, well-defined way to express the semantics of attributes/relationships.

At the implementation level a Context Provider can choose to represent things as either attributes (e.g. member slots on an object instance), or relationships (e.g. pointers to other objects) or a combination of both. That is its prerogative.

At the data model level what's required is that there be a single canonical language to express the semantic intent behind the data structure.

[7] Common schema descriptions. These schemas must describe the fine-grained constraints on the structure and values of data objects. The schema must describe the range of allowed values, cardinality, etc. for each attribute/relationship of an instance of a class, as well as allowed inter-object relationships including instances, classes and sub-classes.

Any given object may be governed by any schema descriptor. At the attribute/relationship level, schema descriptors can be used to govern certain aspects of the data. For example, an SSN may hold only one value, while a surname may hold multiple values. These two levels are independent: an xyz://foo/bar/country attribute behaves the same whether it is held by a person object or a device object.
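A minimal sketch of an attribute-level cardinality check is given below. The table MAX_VALUES, the function name violations, and the ssn and surname URIs are assumptions for illustration; a real schema language would express far more than a maximum count.

```python
# Hypothetical attribute-level rules: maximum number of values (None = unbounded).
MAX_VALUES = {
    "xyz://foo/bar/ssn": 1,
    "xyz://foo/bar/surname": None,
}


def violations(obj, max_values):
    """List the attributes of obj that hold more values than the schema allows."""
    bad = []
    for attr, values in obj.items():
        limit = max_values.get(attr)
        if limit is not None and len(values) > limit:
            bad.append(attr)
    return bad


ok_person = {"xyz://foo/bar/ssn": ["123"], "xyz://foo/bar/surname": ["Smith", "Jones"]}
bad_person = {"xyz://foo/bar/ssn": ["123", "456"]}
```

The check looks only at the attribute identifier and never at the class of the object holding it, which mirrors the independence of the two levels described above.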

Given access via a Context Provider to data described in the Higgins data model as well as access to the schema(s) used, an application can, without a priori knowledge, understand a data structure well enough to display, transform, search, filter and even perform some kinds of edits. [Note: whether the edit is allowable under the security policy of the Context and/or whether the update is ultimately rejected is another story, of course]

Context Providers that implement a Context are responsible for returning the schema description in a data stream in response to a schema 'get' operation on that Context.

Note: The Higgins demo app demonstrates the need for a common schema description. In the app, the ProfileShare Context Provider declares a simple "vCard"-like schema for its Entities. This declarative, processable schema description enables the app to dynamically generate a fully functioning user interface to view and/or edit the vCard data without any prior knowledge of the underlying data structure or schema. No logic related to any specific class of data object is coded into the app. With this approach the app can manage and edit identity data within Contexts that are dynamically bound into the framework.
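The idea of a schema-driven user interface can be sketched roughly as below. This is not the demo app's code. The schema layout (a list of entries with name, label, and multi keys), the widget names, and the function build_form are assumptions made for the example.

```python
def build_form(schema, entity):
    """Derive a list of form fields from a schema, with no class-specific logic."""
    fields = []
    for attr in schema:
        fields.append({
            "label": attr["label"],
            # Multi-valued attributes get a list editor, others a text box.
            "widget": "list" if attr["multi"] else "text",
            "values": entity.get(attr["name"], []),
        })
    return fields


# A hypothetical vCard-like schema declared by a provider.
vcard_schema = [
    {"name": "fn", "label": "Full name", "multi": False},
    {"name": "tel", "label": "Telephone", "multi": True},
]
entity = {"fn": ["Alice Smith"], "tel": ["555-0100", "555-0101"]}
form = build_form(vcard_schema, entity)
```

Nothing in build_form mentions vCard. Handing it a different schema yields a different form, which is the property that lets an app work with Contexts bound in at run time.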

[7b] Contexts must declare the schema(s) used to define their use of the Higgins data model. Schemas must be composable (nestable).

Since one Context may be fabricated from the composition of disparate data sources or other Contexts, it must be able to use any number of individual or even nested schemas (schemas that include other schemas).
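Schema composition can be pictured with the sketch below, which assumes a hypothetical registry in which each schema lists its own attributes and the names of the schemas it includes. The registry layout and the function name flatten are illustrative only.

```python
def flatten(schema_name, registry, seen=None):
    """Collect the attributes of a schema and of every schema it includes."""
    seen = set() if seen is None else seen
    if schema_name in seen:  # guard against schemas that include each other
        return {}
    seen.add(schema_name)
    schema = registry[schema_name]
    attrs = {}
    for included in schema.get("includes", []):
        attrs.update(flatten(included, registry, seen))
    # A schema's own declarations are applied after those it includes.
    attrs.update(schema.get("attributes", {}))
    return attrs


registry = {
    "vcard": {"attributes": {"fn": "string", "tel": "string"}},
    "employee": {"includes": ["vcard"], "attributes": {"title": "string"}},
}
employee_attrs = flatten("employee", registry)
```

Letting a schema's own declarations override included ones is a design choice of this sketch, not a rule stated by the model.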

The schema governing an object's data elements (attributes/relationships) is discoverable given the identifier of that object and/or attribute.

[7c] Context Providers may choose to support the ability to update one or more of their Contexts' schemas

It is anticipated that almost all Context Providers will choose not to support this functionality. Most provider implementations involve a complex intertwining of logic and data structure such that external updates to the schema are impossible to support.

[8] Multiple Contexts.

Contexts are (not necessarily disjoint) sets of objects.

[9] Contexts are uniquely identifiable.

Note: In practice we may have to qualify this by adding the words "...known to any particular instance of Higgins..." after the word "Contexts". Contexts may or may not be discoverable.

[10] Contexts may be directly associated 1:1 or 1:M with other Contexts.

These 1:1 or 1:M relationships represent direct relationships between Contexts (as opposed to implicit relationships between Contexts that are a side effect of relationships between objects across context boundaries as described in [11] below).

Whether these Context-to-Context relationships are used hierarchically depends on the semantics of the consuming application and applicable policies. For example, should we characterize Higgins as a sub-Context of Eclipse in an organizational sense? If so, does this mean that all policies applicable to Eclipse are also applicable to Higgins? Does this apply to membership? Access lists? The answer is that the strict hierarchy probably doesn't apply to everything.

Context relationships are a kind of relationship and thus, according to [5] above, are also uniquely identified by a URI. Some of these kinds of Context-to-Context relationships do involve hierarchy, for example organizational structure or geographic containment (NC, UT, and MA are states within the USA). The model allows for any number of hierarchies or graphs to be concurrently modeled. One could (potentially) have some access control policy applied to one hierarchy, and membership applied to another.

[11] An object may have direct, unidirectional, 1:1 or 1:M associations with objects in other Contexts.

This is necessary to support Entity correlation and aggregation. As an example, the same person may be represented as N Entities in N different Contexts. The same person may be represented as yet another Entity in one final Context. The Entity in this final Context would have a set of Entity Relations to the other N Entities. This final Entity could act as an archetypal source of Attributes for the other N, and Higgins may support Attribute propagation along the directed reference links to "push" copies of Attribute values to the N subordinate Entities.
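The "push" style of propagation described above might look like the following sketch, assuming Entities are dictionaries and the directed links are given as a plain list of target Entities. The function name push and the attribute name are hypothetical, and this is not how Higgins implements it.

```python
def push(source, targets, attr):
    """Copy the values of one attribute from a source Entity to each target."""
    if attr not in source:
        return
    for target in targets:
        # Each target receives its own copy, not a shared reference.
        target[attr] = list(source[attr])


# The final (archetypal) Entity and two of the N subordinate Entities.
archetype = {"email": ["alice@example.org"]}
subordinates = [{}, {"email": ["old@example.org"]}]
push(archetype, subordinates, "email")
```

After the call, both subordinate Entities hold their own copy of the archetype's email values, so a later change to one of them does not silently alter the others.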

In the future Higgins may also support Attribute flows in the reverse direction, where the final Entity that is acting as parent to the N children can effectively inherit attribute values from its children. In addition to up and down attribute flow, there are use cases that involve side-to-side flow. For example, when Higgins mediates the opening of a target Context from a base Entity, it may search every Entity reachable from the base Entity for the Attributes necessary to authenticate in accordance with the security policy of the target Context.
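The search over reachable Entities can be sketched as a breadth-first traversal, as below. The data layout (a dictionary of Entities and a dictionary of directed relations), the function name find_attributes, and the attribute names are assumptions for illustration.

```python
from collections import deque


def find_attributes(start, entities, relations, needed):
    """Breadth-first search from start, collecting the first value set found
    for each needed attribute among the reachable Entities."""
    found = {}
    seen = {start}
    queue = deque([start])
    while queue and len(found) < len(needed):
        entity_id = queue.popleft()
        for attr in needed:
            if attr not in found and attr in entities[entity_id]:
                found[attr] = entities[entity_id][attr]
        for neighbor in relations.get(entity_id, []):
            if neighbor not in seen:  # visit each Entity once, even in cycles
                seen.add(neighbor)
                queue.append(neighbor)
    return found


entities = {
    "base": {"name": ["Alice"]},
    "work": {"employee_id": ["E1"]},
    "bank": {"pin": ["0000"]},
}
relations = {"base": ["work"], "work": ["bank"], "bank": ["base"]}
credentials = find_attributes("base", entities, relations, ["employee_id", "pin"])
```

The traversal stops as soon as every needed attribute has been found, and the seen set keeps it from looping when relations form a cycle.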

[12] The schema description must be decidable.

For a particular task, a logic is decidable if it is possible to design an algorithm that will terminate in a finite number of steps (i.e., the algorithm is guaranteed not to run forever). This means, for example, that the schema language must be chosen so that it does not allow the construction of logical contradictions (which could cause automated reasoning to loop indefinitely).

This goal is being reconsidered.
