Difference between revisions of "EntityId Requirements"

(#4: Multi-Part Keys)
(Definitions for Higgins 1.1)
 
(48 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{#eclipseproject:technology.higgins}}
+
{{#eclipseproject:technology.higgins|eclipse_custom_style.css}}
 
[[Image:Higgins_logo_76Wx100H.jpg|right]]
 
  
 
== About ==
 
This page is for working out the requirements and design decisions for any changes to Higgins [[EntityId]]s in the migration from the [[Context Data Model 1.0]] to the [[Context Data Model 1.1]].
+
This page is for working out the requirements and design decisions for any changes to Higgins [[EntityId]]s in the migration from the [[Context Data Model 1.0]] to the [[Context Data Model 1.1]]. Some background discussion is here: [[IdAS EntityId Requirements Discussion Summary]].
 
+
== Context Data Model Requirements ==
+
In terms of the underlying graph model, following is a summary of the abstract requirements based on recent threads on the email list (~2008-09). '''Please post a note if you disagree with any of the following:'''
+
 
+
# An [[Entity]] is a node in the graph described by the Higgins [[Context Data Model]]. The CDM needs a consistent way of representing arcs referencing that node.
+
# There MAY be 0..n such arcs referencing the node.
+
# An arc MAY theoretically be represented as either:
+
## A unique identifier (single-part key).
+
## A set of [[Attribute]]s of that [[Entity]] (multi-part key) - none of which itself is required to be a unique identifier. --[[User:Paul.socialphysics.org|Paul.socialphysics.org]] 14:16, 17 September 2008 (UTC): Even if we do move to having 0..1 (canonical) EntityId plus 0..n synonyms I disagree that we can support this second option as a first class "arc" --doing so would place impossible burdens on future versions of IdAS that will be capable of "deep" (graph) operations instead of today's shallow queries. In other words IdAS itself will be able to "walk the arcs" to respond to a query. So whereas WITHIN a Context developers may think of arcs implemented by multi-part keys they should expect no support for these links from IdAS itself.
+
# If the arc is represented as a unique identifier:
+
## It MUST be a Contextually Unique ID (CUID), i.e., locally unique within the [[Context]].
+
## It MAY be a globally unique identifier (GUID). Note that all GUIDs are by definition CUIDs, provided that the Context recognizes them as IDs.
+
# With regard to mutability:
+
## At least one identifier for an [[Entity]] in a [[Context]] SHOULD be immutable, i.e., serve as a persistent reference to the Entity within that Context (forever).
+
## However because Higgins does not control Contexts or Context policies, the CDM must be prepared that an identifier for an Entity MAY be mutable, i.e., may be reassigned in that Context to reference a different Entity.
+
 
+
== IdAS API Requirements ==
+
Following are the key design decisions we need to make. ''We are posting votes as they are made in email. Feel free to post your votes/comments directly (with your wiki signature).''
+
 
+
=== Q1: Unique Identifier vs. Attribute Set ===
+
Must a Higgins [[EntityId]] be a single-part CUID or GUID, or could it be a multi-part key consisting of a set of [[Attribute]]s?
+
 
+
* Jim: Yes - it must be a CUID or GUID.
+
* David: No - ''I prefer a multi-part key where the parts of the key might also be unique in a context. An example is an EntityId made up of a uniqueName, uniqueId, nativeName, nativeId. Any part of the EntityId could be used to identify the object.''
+
* Drummond: Abstain - ''Single-part IDs are easier, but multi-part keys are useful too.''
+
* Tony: No.
+
* Tom: Yes - it must be a CUID or GUID.
+
* Paul: Yes - it must be a CUID or a GUID. With the synonyms proposal (see below) we can give David the multi-part keys he needs (each key-part is a synonym)
+
 
+
=== Q2: Representation of an EntityId as a Unique Identifier ===
+
If an [[EntityId]] is a unique identifier, should this be represented as:
+
# A type of [[Attribute]]?
+
# An inherent property of an [[Entity]] that MAY be exposed as an [[Attribute]]?
+
 
+
* Jim: #2
+
* David: #2
+
* Drummond: #2
+
* Tony: #2
+
* Tom: #2
+
* Paul: roughly #2
+
 
+
=== Q3: Cardinality ===
+
What is the cardinality of [[EntityId]]? (The answer may depend on the answer to #2.)
+
# 0..n?
+
# 0..1?
+
# 1 (whose value may be null)?
+
# None of the above?
+
 
+
* Jim: Abstain - ''I tend to want simple.''
+
* David: #1 or #2 - ''0..1 if the EntityId is multipart as in Q1. 0..n if it is a string, and then it needs a type.''
+
* Drummond: #2 or #3 - ''For comparison's sake, you need to always get the same identifier value. But there should also be a way to get all synonyms.''
+
* Tony: #1
+
* Tom: Abstain - ''+1 to Jim's feedback.''
+
* Paul: #2 unless I see a real world use case that requires #1. Presuming such a use-case exists, I can't see any alternative to having 0..1 "canonical" EntityId AND 0..n synonyms. There must be a way to link the synonyms together--the (preferably immutable) canonical EntityId is the way to do this. I don't see how you can have a data model 0..n ids that are all perfectly equal. The most natural thing in the CDM model would be to have 0..1 EntityId and then define an Attribute type in CDM called "synonym" and have all these "other" ids be higgins:synonyms or Context-defined sub-attributes of this
+
 
+
=== Q4: Mutability ===
+
Is the EntityID of an Entity immutable?
+
# Yes?
+
# No?
+
# Depends?
+
 
+
* Jim: Yes - ''I believe it must be as soon as we start tying policy to EntityIDs.  Either that, or we need to require a way to ensure referential integrity for places where EntityIDs are stored in policy statements.''
+
* David: Depends - ''My vote on Q1 was multipart where the decomposition could contain both mutable (uniqueName) and immutable (uniqueId) parts. They both have their use cases. If the EntityID is a string, then 1..n is needed to accommodate mutable and immutable types, and cases where the id can be used in other protocols (compatibility with legacy systems).''
+
* Drummond: Depends - ''Both immutable and mutable SHOULD be possible. Best practice is to assign 1 immutable in any context and then allow 0..n synonyms (mutable or immutable). But Higgins does not control contexts, so it seems like it must be open to either. However, there should be a way to ask for an immutable identifier, or ask if a given identifier is mutable.''
+
* Tony: No position yet.
+
* Tom: Yes - ''+1 to Jim's position.''
+
* Paul: In the case where we have 0..1 EntityId I'd say #1 (yes). If we have 0..1 EntityId plus 0..n synonyms then I'd say only the entityID must be immutable, the 0..n synonyms may be mutable
+
  
 
== Current EntityId Definition in Context Data Model 1.0 ==
 
Line 78: Line 11:
 
# Is always exposed as an [[Attribute]].
 
 
# Exposes no information about mutability.
 
 +
 +
== Proposed Definitions for Higgins 1.1 ==
 +
 +
Entity:
 +
# An [[Entity]] is a node in the graph described by the Higgins [[Context Data Model]].
 +
# An [[Entity]] is identified by 0..n [[EntityId]]s (vs. 0..1 in Higgins 1.0)
 +
# At least one EntityId of an [[Entity]] SHOULD be immutable, i.e., serve as a persistent reference to the Entity within that Context (forever). However, because Higgins does not control Contexts or Context policies, the CDM must be prepared for the possibility that an identifier for an Entity MAY be mutable, i.e., may be reassigned in that Context to reference a different Entity.
 +
 +
EntityId:
 +
# An EntityId is of type String if it is not an Attribute of the Entity; otherwise (the EntityId is also an Attribute of the Entity) it is of type IAttribute.
 +
# An EntityId MUST be locally unique within the [[Context]].
 +
# An EntityId MAY be globally unique (GUID)
 +
# An EntityId MAY be exposed as an [[Attribute]]. If it is, the Attribute Type MUST be marked as a higgins:synonym.
 +
# An Entity MAY have a single ''canonical'' EntityId that MUST be immutable.
  
 
== Proposed Changes in Context Data Model 1.1 ==
 
Line 96: Line 43:
  
 
[well, in practice the range would likely be a syntax restriction on xsd:string, not a plain old xsd:string, but fixing that would complicate the example]
 
 +
 +
=== #3: Changes to EntityID definition ===
 +
* The canonical EntityId (if it exists) is immutable
  
 
== Proposed Changes to IdAS API for Higgins 1.1 ==
 
  
=== #1: Add hasMutableEntityId() Method ===
+
=== public Object[] IEntity.getEntityIds(); ===
The proposed change is to add a '''hasMutableEntityId()''' method to IContext that returns a Boolean indicating whether [[EntityId]]s in that [[Context]] are mutable or not. True = mutable.
+
This method returns an array of EntityIds that uniquely identify the Entity within the Context. Each Object is either
 +
* a String (if the EntityId is not an Attribute of the Entity)
 +
* an IAttribute (if the EntityId is also an Attribute of the Entity)
  
=== #2: Add getSynonyms() Method ===
+
=== public Object IEntity.getCanonicalEntityId()===
The proposed change is to add a '''getSynonyms()''' method to IEntity that returns all Attributes that are sub-properties of higgins:synonym.  
+
This method returns the "canonical" EntityId, i.e. the preferred one. The returned object is either a String or an IAttribute. The context provider guarantees that this EntityId is immutable. Returns null if this Entity has no canonical EntityId.
  
These synonym Attributes can also be accessed as "regular" Attribute (e.g. using getAttribute(), etc.)
+
=== public IEntity IContext.getEntity(String); ===
 +
This method already exists today. There is no change to it. It looks up an IEntity based on a String which is not an Attribute of the Entity.
  
== Example using proposed changes ==
+
=== public Iterator IContext.getEntities(IFilter); ===
The following diagram shows three Entities: two ordinary Entities and one Class Entity (higgins:Person).
+
This method already exists today. There is no change to it. It looks up IEntitys based on an IFilter, which can select them by Attribute Values.
[[Image:Multiple-identifiers2.png]]
+
  
In this case the Context Provider developer has defined three Attributes: SSN, mobile, and shoe-size. Although it is not shown in the above diagram, the developer has defined SSN and mobile attributes as being Synonym Attributes. Also, the developer has chosen the option to "repeat" the EntityId value as the value of the "SSN" Attribute.  
+
=== public IAttributeModel.isEntityId(); ===
 +
Returns true if IAttributes that use this IAttributeModel also act as EntityIds. These IAttributes may be returned by the above IEntity.getEntityIds() method.
  
=== Using IdAS... ===
+
=== public IAttributeModel.isMutable(); ===
* Calling IEntity.getEntityId() on the Entity at the left will return 033561186.  
+
Returns true if IAttributes that use this IAttributeModel are mutable, i.e. if their IAttributeValues can be changed/added/removed.
* Calling IEntity.getEntityId() on the Entity at the top will return 034898786.
+
* Calling IEntity.getSynonyms() on the Entity at the left will return a list of these two attributes:
+
** SSN with value 033561186
+
** mobile with value +16175137924
+
  
== Still Under Discussion ==
+
== Example Using Proposed Changes ==
 +
The following diagram shows three Entities: two ordinary Entities and one Entity Class (higgins:Person):
  
=== Multi-Part Keys ===
+
[[Image:Multiple-identifiers5.png]]
The proposal is to keep it simple by requiring multi-part keys to be serialized into a composite identifier, which can then be used as an inherent EntityId or exposed as an Identifiers value.
+
  
--[[User:Paul.socialphysics.org|Paul]] 14:10, 17 September 2008 (UTC): I don't see why each key-part can't be a synonym Attribute and thus we don't have to complicate our model by adding "multi-part" literal values
+
* The Context Provider developer has defined three simple Attributes: "ssn", "mobile", and "shoe-size".  
 +
* The developer chose to use the SSN as the canonical EntityId
 +
* The developer chose the option to "repeat" the canonical EntityId value as the value of the "ssn" Attribute.  
 +
* Although it is not shown in the above diagram, the developer has defined two of these (SSN & mobile) as being Synonym Attributes.
 +
* Just to show off to his boss, the developer defined a complex Attribute called "knows" and used it to link the entity on the left with the entity on the right by referring to the right-most entity's canonical (and hopefully immutable) EntityId.
 +
 
 +
=== Using IdAS... ===
 +
* Calling IEntity.getCanonicalEntityId() on the Entity at the left will return the (canonical) EntityId value 033568888.
 +
* Calling IEntity.getCanonicalEntityId() on the Entity at the right will return the (canonical) EntityId value 034898786.
 +
* Calling IEntity.getEntityIds() on the Entity at the left will return a list of these two IAttributes:
 +
** mobile with value +16175137924
 +
** ssn with value 033568888
 +
* Calling IEntity.getAttribute(<knows>) on the Entity at the left will return the Entity on the right [some liberties taken here for brevity]

Latest revision as of 15:13, 14 May 2009

About

This page is for working out the requirements and design decisions for any changes to Higgins EntityIds in the migration from the Context Data Model 1.0 to the Context Data Model 1.1. Some background discussion is here: IdAS EntityId Requirements Discussion Summary.

Current EntityId Definition in Context Data Model 1.0

  1. Is of type [need info here].
  2. Has cardinality 0..1
  3. MUST be Context-unique; MAY be globally unique.
  4. Is always exposed as an Attribute.
  5. Exposes no information about mutability.

Proposed Definitions for Higgins 1.1

Entity:

  1. An Entity is a node in the graph described by the Higgins Context Data Model.
  2. An Entity is identified by 0..n EntityIds (vs. 0..1 in Higgins 1.0)
  3. At least one EntityId of an Entity SHOULD be immutable, i.e., serve as a persistent reference to the Entity within that Context (forever). However, because Higgins does not control Contexts or Context policies, the CDM must be prepared for the possibility that an identifier for an Entity MAY be mutable, i.e., may be reassigned in that Context to reference a different Entity.

EntityId:

  1. An EntityId is of type String if it is not an Attribute of the Entity; otherwise (the EntityId is also an Attribute of the Entity) it is of type IAttribute.
  2. An EntityId MUST be locally unique within the Context.
  3. An EntityId MAY be globally unique (GUID)
  4. An EntityId MAY be exposed as an Attribute. If it is, the Attribute Type MUST be marked as a higgins:synonym.
  5. An Entity MAY have a single canonical EntityId that MUST be immutable.
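
The definitions above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the names EntityIdModelSketch, SketchEntity, and SketchContext are hypothetical and not part of IdAS, and every EntityId is modeled as a plain String for brevity (the proposal also allows an IAttribute).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these classes are part of IdAS.
public class EntityIdModelSketch {

    /** An Entity with 0..n EntityIds, at most one of which is canonical. */
    static final class SketchEntity {
        // Canonical EntityId: optional (may be null) and immutable (final).
        private final String canonicalId;
        // Further EntityIds; these are not guaranteed to be stable.
        private final List<String> otherIds = new ArrayList<>();

        SketchEntity(String canonicalId) {
            this.canonicalId = canonicalId;
        }

        String canonicalId() {
            return canonicalId;
        }

        /** All EntityIds of this Entity, canonical first if present. */
        List<String> entityIds() {
            List<String> all = new ArrayList<>();
            if (canonicalId != null) {
                all.add(canonicalId);
            }
            all.addAll(otherIds);
            return Collections.unmodifiableList(all);
        }
    }

    /** A Context that enforces "locally unique within the Context". */
    static final class SketchContext {
        private final Map<String, SketchEntity> byId = new HashMap<>();

        SketchEntity create(String canonicalId) {
            SketchEntity entity = new SketchEntity(canonicalId);
            if (canonicalId != null) {
                register(canonicalId, entity);
            }
            return entity;
        }

        void addEntityId(SketchEntity entity, String id) {
            register(id, entity);
            entity.otherIds.add(id);
        }

        SketchEntity lookup(String id) {
            return byId.get(id);
        }

        private void register(String id, SketchEntity entity) {
            if (byId.containsKey(id)) {
                throw new IllegalArgumentException(
                        "EntityId already used in this Context: " + id);
            }
            byId.put(id, entity);
        }
    }

    public static void main(String[] args) {
        SketchContext context = new SketchContext();
        SketchEntity entity = context.create("033568888");
        context.addEntityId(entity, "+16175137924");
        System.out.println(entity.entityIds());
    }
}
```

Making the canonical field final is one way to express the MUST-be-immutable rule in code; a real Context Provider would have to guarantee the same property in its backing store.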

Proposed Changes in Context Data Model 1.1

#1: Not Require EntityId to be Exposed as an Attribute

The proposed change is to make it OPTIONAL to expose EntityId as some kind of Attribute. Contexts that do not want to expose the EntityId can omit it from the list of Attributes for an Entity. Note: if the EntityId is mutable, it SHOULD be exposed as an Attribute so it can be modified.

#2: Add a higgins:synonym Attribute to higgins.owl

This change is for Context Provider developers who wish to explicitly tag certain Attributes as usable as an alternative identifier for an Entity (i.e., the Attribute uniquely identifies the Entity at LEAST within the containing Context).

For example, if the developer wished to declare a "mobile" telephone number attribute as a synonym for whatever kind of identifier getEntityId() returns, they would define the new mobile attribute in their Attribute Definition as a sub-property of higgins:synonym:

:mobile
     a       owl:DatatypeProperty ;
     rdfs:range xsd:string ;
     rdfs:subPropertyOf higgins:synonym .

[In practice the range would likely be a syntax restriction on xsd:string rather than a plain xsd:string, but showing that would complicate the example.]

#3: Changes to EntityID definition

  • The canonical EntityId (if it exists) is immutable

Proposed Changes to IdAS API for Higgins 1.1

public Object[] IEntity.getEntityIds();

This method returns an array of EntityIds that uniquely identify the Entity within the Context. Each Object is either:

  • a String (if the EntityId is not an Attribute of the Entity)
  • an IAttribute (if the EntityId is also an Attribute of the Entity)
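
Because the returned array mixes two types, a caller has to check each element before using it. A minimal sketch of that check follows. The nested IAttribute interface here is a hypothetical stand-in with two assumed accessors (getName and getValue), not the real IdAS interface, and describe() and attribute() are illustrative helper names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of consuming the Object[] from getEntityIds().
public class EntityIdsCallerSketch {

    /** Stand-in for the IdAS IAttribute; both accessors are assumptions. */
    interface IAttribute {
        String getName();
        String getValue();
    }

    /** Convenience factory for the stand-in interface. */
    static IAttribute attribute(final String name, final String value) {
        return new IAttribute() {
            public String getName() { return name; }
            public String getValue() { return value; }
        };
    }

    /** Turns the mixed String/IAttribute array into printable identifiers. */
    static List<String> describe(Object[] entityIds) {
        List<String> out = new ArrayList<>();
        for (Object id : entityIds) {
            if (id instanceof String) {
                // EntityId that is not an Attribute of the Entity.
                out.add((String) id);
            } else if (id instanceof IAttribute) {
                // EntityId that is also an Attribute of the Entity.
                IAttribute attr = (IAttribute) id;
                out.add(attr.getName() + "=" + attr.getValue());
            } else {
                throw new IllegalArgumentException("Unexpected EntityId type: " + id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Object[] ids = { "033568888", attribute("mobile", "+16175137924") };
        System.out.println(describe(ids));
    }
}
```

The Object[] return type pushes this type check onto every caller, which is the trade-off of letting an EntityId be either a String or an IAttribute.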

public Object IEntity.getCanonicalEntityId();

This method returns the "canonical" EntityId, i.e. the preferred one. The returned object is either a String or an IAttribute. The context provider guarantees that this EntityId is immutable. Returns null if this Entity has no canonical EntityId.

public IEntity IContext.getEntity(String);

This method already exists today. There is no change to it. It looks up an IEntity based on a String which is not an Attribute of the Entity.

public Iterator IContext.getEntities(IFilter);

This method already exists today. There is no change to it. It looks up IEntitys based on an IFilter, which can select them by Attribute Values.

public boolean IAttributeModel.isEntityId();

Returns true if IAttributes that use this IAttributeModel also act as EntityIds. These IAttributes may be returned by the above IEntity.getEntityIds() method.

public boolean IAttributeModel.isMutable();

Returns true if IAttributes that use this IAttributeModel are mutable, i.e. if their IAttributeValues can be changed/added/removed.
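
Taken together, the two flags let a caller pick out Attributes that can serve as stable references. The sketch below assumes a hypothetical stand-in for IAttributeModel with only the two proposed methods plus an assumed getName() accessor. The mutability values chosen for "ssn", "mobile", and "shoe-size" are illustrative assumptions, not something the proposal specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of combining isEntityId() and isMutable().
public class AttributeModelSketch {

    /** Stand-in for IAttributeModel; getName() is an assumed accessor. */
    interface IAttributeModel {
        String getName();
        boolean isEntityId();
        boolean isMutable();
    }

    static IAttributeModel model(final String name,
                                 final boolean entityId,
                                 final boolean mutable) {
        return new IAttributeModel() {
            public String getName() { return name; }
            public boolean isEntityId() { return entityId; }
            public boolean isMutable() { return mutable; }
        };
    }

    /** Names of Attributes that act as EntityIds and cannot change. */
    static List<String> stableIdentifierAttributes(List<IAttributeModel> models) {
        List<String> names = new ArrayList<>();
        for (IAttributeModel m : models) {
            if (m.isEntityId() && !m.isMutable()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<IAttributeModel> models = Arrays.asList(
                model("ssn", true, false),        // assumed immutable identifier
                model("mobile", true, true),      // assumed mutable identifier
                model("shoe-size", false, true)); // not an identifier
        System.out.println(stableIdentifierAttributes(models));
    }
}
```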

Example Using Proposed Changes

The following diagram shows three Entities: two ordinary Entities and one Entity Class (higgins:Person):

[Diagram: Multiple-identifiers5.png]

  • The Context Provider developer has defined three simple Attributes: "ssn", "mobile", and "shoe-size".
  • The developer chose to use the SSN as the canonical EntityId.
  • The developer chose the option to "repeat" the canonical EntityId value as the value of the "ssn" Attribute.
  • Although it is not shown in the above diagram, the developer has defined two of these (ssn and mobile) as being Synonym Attributes.
  • Just to show off to his boss, the developer defined a complex Attribute called "knows" and used it to link the entity on the left with the entity on the right by referring to the right-most entity's canonical (and hopefully immutable) EntityId.

Using IdAS...

  • Calling IEntity.getCanonicalEntityId() on the Entity at the left will return the (canonical) EntityId value 033568888.
  • Calling IEntity.getCanonicalEntityId() on the Entity at the right will return the (canonical) EntityId value 034898786.
  • Calling IEntity.getEntityIds() on the Entity at the left will return a list of these two IAttributes:
    • mobile with value +16175137924
    • ssn with value 033568888
  • Calling IEntity.getAttribute(<knows>) on the Entity at the left will return the Entity on the right [some liberties taken here for brevity]
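
The calls above can be mimicked with a small in-memory stand-in. This is a sketch under stated assumptions, not IdAS code: WorkedExampleSketch, SketchEntity, and buildContext() are hypothetical names, the Context is reduced to a map keyed by canonical EntityId, and the "shoe-size" Attribute is left out because the example gives no value for it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in; real code would go through IContext and IEntity.
public class WorkedExampleSketch {

    static final class SketchEntity {
        final String canonicalEntityId;
        // Attributes tagged as synonyms, i.e. what getEntityIds() returns in the example.
        final Map<String, String> synonymAttributes = new LinkedHashMap<>();
        // Value of the "knows" Attribute: the canonical EntityId of another Entity.
        String knows;

        SketchEntity(String canonicalEntityId) {
            this.canonicalEntityId = canonicalEntityId;
        }
    }

    /** Builds the two ordinary Entities, keyed by canonical EntityId. */
    static Map<String, SketchEntity> buildContext() {
        SketchEntity left = new SketchEntity("033568888");
        left.synonymAttributes.put("mobile", "+16175137924");
        left.synonymAttributes.put("ssn", "033568888"); // repeats the canonical EntityId
        left.knows = "034898786";

        SketchEntity right = new SketchEntity("034898786");

        Map<String, SketchEntity> context = new LinkedHashMap<>();
        context.put(left.canonicalEntityId, left);
        context.put(right.canonicalEntityId, right);
        return context;
    }

    public static void main(String[] args) {
        Map<String, SketchEntity> context = buildContext();
        SketchEntity left = context.get("033568888");
        System.out.println("canonical: " + left.canonicalEntityId);
        System.out.println("entityIds: " + left.synonymAttributes);
        // Resolving "knows" is a lookup by the right Entity's canonical EntityId.
        SketchEntity known = context.get(left.knows);
        System.out.println("knows: " + known.canonicalEntityId);
    }
}
```

Storing the canonical EntityId as the value of "knows" is what makes its immutability matter: if that identifier were reassigned, the link would silently point at a different Entity.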
