Context Data Model Goals

Note: in the following the terms attribute and relationship are used generically, not with any specialized meaning. We could have said field instead of either one.

[1] The model is extensible; attributes/relationships can be added later

Any Context Provider can define its own data fields (attributes, relationships, etc.) without breaking existing parsers or APIs. Allowing attributes/relationships to be added later implies that all attributes are uniquely named (which implies [5] below). Since Higgins is extensible through Attribute Providers and the specific data models used by each are implementer-defined and thus open-ended, the Higgins model must be extraordinarily abstract and fundamentally extensible.

[2] All objects can be identified uniquely.

Note: we have discussed achieving this by combining a context identifier (at the container level) with a Contextually Unique Id (CUID) within a context.
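The two-part identifier discussed in the note can be sketched as follows. This is only an illustration: the `ObjectId` class, the `#` separator and the example URIs are assumptions, not part of any Higgins API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectId:
    """Hypothetical global identifier: a context identifier plus a CUID."""

    context_id: str  # identifies the Context (assumed here to be a URI)
    cuid: str        # unique only within that Context

    def __str__(self) -> str:
        # The '#' separator is an arbitrary choice for this sketch.
        return f"{self.context_id}#{self.cuid}"


# The same CUID in two different Contexts names two different objects.
a = ObjectId("xyz://example/contexts/work", "42")
b = ObjectId("xyz://example/contexts/home", "42")
```

Because the pair is compared as a whole, `a` and `b` are distinct even though their CUIDs match.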

[3] Objects have attributes and/or relationships with other objects. Attributes and/or relationships may be grouped into sets or sequences.

[4] All objects and their attributes/relationships are addressable, navigable.

Context objects and their associated attributes/relationships can be addressed using a simple, consistent indexing/navigation scheme.

[5] Attributes/Relationships are identified by globally unique URIs.

This makes it possible to assemble (join) attribute information about two Digital Subjects held in separate contexts, and perhaps implemented by separate providers, without attribute collisions and/or data loss. Along these lines, Higgins internally needs to implement certain kinds of attribute data flows between correlated DigitalSubjects in different contexts.

This [5] in combination with [2] above means that an application using the Higgins API can inspect each attribute/relationship of any object.
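A minimal sketch of why globally unique attribute URIs make such a join safe. The `join_attributes` helper and all URIs below are hypothetical; the point is only that two providers' "title" fields cannot collide once each is named by its own URI.

```python
from collections import defaultdict


def join_attributes(*subjects):
    """Merge attribute maps keyed by globally unique attribute URI.

    Values for the same URI are unioned; different URIs never collide,
    so nothing is overwritten or lost.
    """
    merged = defaultdict(set)
    for attrs in subjects:
        for uri, values in attrs.items():
            merged[uri].update(values)
    return dict(merged)


# Two Digital Subjects held in separate contexts (illustrative data).
work = {
    "xyz://foo/bar/surname": {"Smith"},
    "xyz://work/schema/title": {"Engineer"},
}
home = {
    "xyz://foo/bar/surname": {"Smith"},
    "xyz://home/schema/title": {"Dr."},
}

joined = join_attributes(work, home)
```

Had both providers used the bare name "title", the join would have had to guess whether "Engineer" and "Dr." describe the same thing.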

[6] There is a single, well-defined way to express the semantics of attributes/relationships.

We have found this out the hard way. Our current model has both Attributes and SubjectRelationships. As a consequence, Jim and I went back and forth, each trying to argue the merits of expressing relationships as just another kind of attribute (where the value is an object reference) vs. using an IdentityRelationship object. We both argued aesthetics and pragmatics. In a sense we are both right. And that is the problem: there is more than one way to express the same intention.

We need to be able to make statements like "Jim isInterestedIn 'B-movies'" and all know what we mean by "Jim", "isInterestedIn" and "B-movies" without arguing (as has happened on this list) about how this should be represented (e.g. Jim is an object that has an "attribute" whose value is the string "B-movies" vs. Jim is an object that has a "relationship" with a "B-movies" object.)

Of course, any specific data schema (e.g. as implemented by a Context Provider) can and does choose to represent things as either attributes (e.g. member slots on an object instance), or relationships (e.g. pointers to other objects) or a combination of both. That is its prerogative. What's required is that there be a single canonical language to express the semantic intent behind the data structure.
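One possible shape for such a canonical form is a subject/predicate/value statement. The goals above do not prescribe this; the `Statement` and `Ref` types and the predicate URI are assumptions made for illustration.

```python
from typing import NamedTuple, Union


class Ref(NamedTuple):
    """Hypothetical reference to another object, by identifier."""

    object_id: str


class Statement(NamedTuple):
    """One canonical statement: subject, predicate URI, value."""

    subject: str
    predicate: str            # a globally unique URI, per [5]
    value: Union[str, Ref]    # a literal or an object reference


IS_INTERESTED_IN = "xyz://example/isInterestedIn"  # illustrative URI

# A provider that stores a string and one that stores a pointer
# both express the same predicate in the same form.
as_literal = Statement("Jim", IS_INTERESTED_IN, "B-movies")
as_reference = Statement("Jim", IS_INTERESTED_IN, Ref("B-movies"))


def is_relationship(statement: Statement) -> bool:
    """True when the value points at another object."""
    return isinstance(statement.value, Ref)
```

The storage choice (literal or pointer) remains the provider's, while the statement form and the predicate URI stay the same for every consumer.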

[7] Common schema descriptions. These schemas must describe the fine-grained constraints on the structure and values of data objects. The schema must describe the range of allowed values, cardinality, etc. for each attribute/relationship of an instance of a class, as well as allowed inter-object relationships including instances, classes and sub-classes.

Any given object may be governed by any schema descriptor. At the attribute/relationship level, schema descriptors can also be used to govern certain aspects of data. For example, an SSN may hold only one value, while a surname may hold multiple values. These two levels are independent: an xyz://foo/bar/country attribute behaves the same whether it's held by a person object or a device object.
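The attribute-level rule can be sketched as a cardinality check keyed only by attribute URI. The `AttributeSchema` class, the `validate` function and the ssn/surname URIs are hypothetical; only the single-valued SSN and multi-valued surname come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeSchema:
    """Hypothetical attribute-level descriptor."""

    uri: str
    max_values: Optional[int] = None  # None means unbounded


SCHEMA = {
    "xyz://foo/bar/ssn": AttributeSchema("xyz://foo/bar/ssn", max_values=1),
    "xyz://foo/bar/surname": AttributeSchema("xyz://foo/bar/surname"),
}


def validate(attributes, schema=SCHEMA):
    """Return one message per cardinality violation.

    The check looks only at the attribute URI, never at the kind of
    object holding it, so person and device objects are treated alike.
    """
    errors = []
    for uri, values in attributes.items():
        rule = schema.get(uri)
        if rule is None:
            continue  # this sketch ignores attributes with no descriptor
        if rule.max_values is not None and len(values) > rule.max_values:
            errors.append(f"{uri}: at most {rule.max_values} value(s) allowed")
    return errors
```

For instance, `validate({"xyz://foo/bar/ssn": ["1", "2"]})` reports one violation, while two surnames pass.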

Given access via a Context Provider to data described in the Higgins data model as well as access to the schema(s) used, an application can, without a priori knowledge, understand a data structure well enough to display, transform, search, filter and even perform some kinds of edits. [Note: whether the edit is allowable under the security policy of the Context and/or whether the update is ultimately rejected is another story, of course]

Context Providers that implement a Context are responsible for returning the schema description in a data stream in response to a schema 'get' operation on that Context.

The Higgins demo app demonstrates the need for a common schema description. In the app the ProfileShare Context Provider declares a simple "vCard"-like schema for its DigitalSubjects. This declarative, processable schema description enables the app to dynamically generate a fully functioning user interface to view and/or edit the vCard data without any prior knowledge of the underlying data structure or schema. No logic related to any specific class of data object is coded into the app. With this approach the app can manage and edit identity data within contexts that are dynamically bound into the framework.
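The idea of a schema-driven view can be sketched as below. This is not the demo app's code: the schema layout (`uri` and `label` keys), the `render_form` function and the vCard-like URIs are assumptions chosen to keep the example small.

```python
def render_form(schema, subject):
    """Build a text view of a subject using only the schema.

    No field name is hard-coded here; the schema supplies both the
    attribute URIs to look up and the labels to display.
    """
    lines = []
    for field in schema:
        values = subject.get(field["uri"], [])
        shown = ", ".join(values) if values else "(empty)"
        lines.append(f'{field["label"]}: {shown}')
    return "\n".join(lines)


# A hypothetical vCard-like schema as a provider might declare it.
VCARD_SCHEMA = [
    {"uri": "xyz://vcard/fn", "label": "Full name"},
    {"uri": "xyz://vcard/email", "label": "Email"},
]

subject = {"xyz://vcard/fn": ["Jim"]}
form = render_form(VCARD_SCHEMA, subject)
```

Swapping in a different schema changes the rendered fields without touching `render_form`, which is the property the demo app relies on.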

[7b] Contexts must declare the schema(s) used to define their use of the Higgins data model. Schemas must be composable (nestable).

Since one Context may be fabricated from the conglomeration of disparate data sources or other Contexts, it must be able to use any number of individual or even nested schemas (schemas that include other schemas).

The schema governing an object's data elements (attribute/relationships) is discoverable given that object and/or attribute's identifier.
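Nested schemas can be sketched as dictionaries that include other dictionaries. The `id`, `attributes` and `includes` keys, the `flatten` function and the URIs are all assumptions for illustration, not a Higgins schema format.

```python
def flatten(schema, seen=None):
    """Collect attribute URIs from a schema and every schema it includes.

    `seen` tracks schema ids already visited so that a schema included
    twice (or in a cycle) is only processed once.
    """
    seen = set() if seen is None else seen
    if schema["id"] in seen:
        return set()
    seen.add(schema["id"])
    uris = set(schema.get("attributes", []))
    for included in schema.get("includes", []):
        uris |= flatten(included, seen)
    return uris


VCARD = {
    "id": "xyz://example/schema/vcard",
    "attributes": ["xyz://foo/bar/surname"],
}
ADDRESS = {
    "id": "xyz://example/schema/address",
    "attributes": ["xyz://foo/bar/country"],
}
# A Context assembled from two sources declares a schema that nests both.
PROFILE = {
    "id": "xyz://example/schema/profile",
    "attributes": ["xyz://example/nickname"],
    "includes": [VCARD, ADDRESS],
}

all_uris = flatten(PROFILE)
```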

[7c] Context Providers may choose to support the ability to update one or more of their Contexts' schemas.

It is anticipated that almost all Context Providers will choose not to support this functionality. Most provider implementations involve a complex intertwining of logic and data structure such that external updates to the schema are impossible to support.

[8] Multiple Contexts.

Object space is subdivided into separate Contexts.

[9] Contexts are uniquely identifiable.

Note: In practice we may have to qualify this by adding the words "...known to any particular instance of Higgins..." after the word "Contexts". Contexts may or may not be discoverable.

[10] Contexts may be directly associated 1:1 or 1:M with other Contexts.

These 1:1 or 1:M relationships represent direct relationships between Contexts (as opposed to implicit relationships between Contexts that are a side effect of relationships between objects across context boundaries as described in [11] below).

Whether these Context-to-Context relationships are used hierarchically depends on the semantics of the consuming application and applicable policies. For example, should we characterize Higgins as a sub-Context of Eclipse in an organizational sense? If so, does this mean that all policies applicable to Eclipse are also applicable to Higgins? Does this apply to membership? Access lists? The answer is that the strict hierarchy probably doesn't apply to everything.

Context relationships are a kind of relationship and thus, according to [5] above, are also uniquely identified by a URI. Some of these kinds of Context-to-Context relationships do involve hierarchy. For example, organizational structure or geographic containment (NC, UT, MA are states within the USA), and so on. The model allows for any number of hierarchies or graphs to be concurrently modeled. One could (potentially) have some access control policy applied to one hierarchy, and membership applied to another.
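A small sketch of how several hierarchies can coexist when each Context-to-Context link is tagged with a relationship URI. The relationship URIs, the `PARENT` table and the `ancestors` function are hypothetical; the Context names come from the examples in the text.

```python
GEO = "xyz://example/rel/locatedIn"  # illustrative relationship URIs
ORG = "xyz://example/rel/partOf"

# (child context, relationship URI) -> parent context
PARENT = {
    ("NC", GEO): "USA",
    ("UT", GEO): "USA",
    ("MA", GEO): "USA",
    ("Higgins", ORG): "Eclipse",
}


def ancestors(context, relation_uri, parent=PARENT):
    """Walk upward along one kind of relationship only."""
    chain = []
    seen = {context}
    current = context
    while (current, relation_uri) in parent:
        current = parent[(current, relation_uri)]
        if current in seen:
            break  # guard against cyclic data
        seen.add(current)
        chain.append(current)
    return chain
```

A policy tied to the organizational hierarchy would call `ancestors("Higgins", ORG)`, while the same Context has no ancestors along `GEO`, so the two hierarchies never interfere.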

[11] An object may have direct, unidirectional, 1:1 or 1:M associations with an object(s) in other Contexts.

This is necessary to support Digital Subject correlation and aggregation. As an example, the same person Entity may be represented as N Digital Subjects in N different Contexts. The same Entity may be represented as yet another Digital Subject in one final Context. The Digital Subject in this final Context would have a set of references to the other N Digital Subjects. This final Digital Subject can act as an archetypal source of attributes for the other N, and Higgins may support attribute propagation along the directed reference links to "push" copies of attribute values to the N subordinate Digital Subjects. Higgins may also support attribute flows in the reverse direction, where the final Digital Subject that is acting as parent to the N children can effectively inherit attribute values from its children. In addition to up and down attribute flow, there are use cases that involve side-to-side flow. For example, when Higgins mediates the opening of a target context from a base Digital Subject, it may search every Digital Subject reachable from the base Digital Subject for the attributes/claims necessary to authenticate in accordance with the security policy of the target Context.
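The search over reachable Digital Subjects described in the last example can be sketched as a breadth-first traversal of the directed reference links. The function name, the `ds:` identifiers and the attribute URIs are assumptions; how Higgins would actually perform this search is not specified here.

```python
from collections import deque


def find_attributes(base, required, links, attributes):
    """Search Digital Subjects reachable from `base` for required attributes.

    `links` maps a subject id to the ids it references (unidirectional);
    `attributes` maps a subject id to its {attribute URI: value} map.
    The first value found for each required URI wins.
    """
    found = {}
    visited = {base}
    queue = deque([base])
    while queue and len(found) < len(required):
        subject = queue.popleft()
        held = attributes.get(subject, {})
        for uri in required:
            if uri not in found and uri in held:
                found[uri] = held[uri]
        for neighbor in links.get(subject, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return found


# The "final" Digital Subject references two others (illustrative data).
LINKS = {"ds:final": ["ds:work", "ds:home"]}
ATTRIBUTES = {
    "ds:work": {"xyz://foo/bar/email": "jim@work.example"},
    "ds:home": {"xyz://foo/bar/country": "US"},
}
REQUIRED = {"xyz://foo/bar/email", "xyz://foo/bar/country"}

claims = find_attributes("ds:final", REQUIRED, LINKS, ATTRIBUTES)
```

Starting from `ds:work` instead yields only the email attribute, since the links are unidirectional and `ds:work` references nothing.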

[12] The schema description must be decidable.

For a particular task, a logic is decidable if it is possible to design an algorithm that will terminate in a finite number of steps (i.e., the algorithm is guaranteed not to run forever). This means, for example, that the schema language must be chosen so that it does not allow the construction of logical contradictions (which could cause automated reasoning to loop forever).


Comparison to earlier "preliminary" goals

The following compares the goals above to the "existing" goals listed in Preliminary Data Model Goals.

  1. Multiple Contexts: captured in #8 above. The existing goal #1 introduced the notion of a Digital Subject. In the above goals Digital Subjects are not described in the model; they would be handled as a defined class. In this model a Context contains many kinds of objects, only some of which may be of the Digital Subject type/schema. This allows, for example, a Digital Subject to have a "has" relationship with an event (say, their birthday), where the event is just another object that in turn has a type/schema.
  2. Context Relationships: captured in #10 above
  3. ContextRelationships are 1:1 or 1:M: captured in #10 above
  4. Digital Subjects have type: Since every object is uniquely identifiable (#2 above), an object's identity is its type
  5. Digital Subjects have attributes: captured in #1 above but generalized to all objects in a context
  6. Attributes are typed...: captured in #5 above
  7. Attributes values can be literals or compound objects...: subsumed by the more general #3
  8. Attributes have Source/metadata: As described in #3, all objects have attributes/relationships. Specific kinds of metadata (e.g. our notion of Source) can be relegated to specific schemas.
  9. Digital Subject Relationships: As described in #3, all objects have attributes/relationships. Specific kinds of relationships can be relegated to schemas.
  10. SubjectRelationships are typed: Subsumed by the more general #5.
  11. SubjectRelationships have arbitrary metadata: As described in #3, all objects have attributes/relationships. Specific kinds of metadata can be relegated to specific schemas.
  12. SubjectRelationships are 1:1 or 1:M: Subsumed by #3 and by #11
  13. Addressing: subsumed by #3 above
  14. Digital Subject indirect referencing: this requirement need not strictly be in the model, it can be handled as a convention on how these references are handled.