Persona Data Model 2.0
A data model for people and their relationships with other people and businesses. Builds on Higgins Data Model 2.0.
Contents
- 1 Person entities, attributes, links and contexts
- 2 Vocabularies
- 3 Proxies
- 4 Context Issuer/Authority and Access Control
- 5 Connection Context Pairs
- 6 Website Facade Connections
- 7 Supporting Contexts
- 8 Social Graphs
- 9 Inbox Context
- 10 Naming Conventions
- 11 Examples
- 12 Attribute Metadata
- 13 Open Issues
Person entities, attributes, links and contexts
A natural, human person is represented as a graph of p:Person entities (nodes, or vertices) interconnected by links (edges). Each node represents a different facet of the user (person). Each of these facets is held in a separate (graph) container called a Context, shown below as a round-cornered rectangle.
Each Person entity node is a set of attributes and values. These attributes may be simple literals (e.g. the user's first name) or they may be other entities (which we call complex attributes). These latter attributes are shown in diagrams as links to other entity nodes.
Typically each node in the person graph is located in its own context. The root node lies in a special context (for each user) called the root context.
All of the Person entities can be reached by traversing links of the following kinds (although other links, e.g. foaf:knows, may also exist):
h:correlation
- A link from an entity representing person A to (i) an entity that also represents person A or (ii) an interstitial Proxy whose p:resource link points to an entity that also represents person A
h:relation
- A link from an entity representing person A to (i) an entity that represents a person other than person A or (ii) an interstitial Proxy whose p:resource link points to an entity that represents a person other than person A
h:indeterminate
- A link from an entity representing person A to (i) an entity that may or may not represent person A or (ii) an interstitial Proxy whose p:resource link points to an entity that may or may not represent person A
proxy:resource
- A link from a Proxy to an entity in another context.
Vocabularies
Vocabularies for Describing People
Contexts describe their contents (i.e. person entity attributes) using the Persona vocabulary, which in turn imports the following well-known vocabularies (aka ontologies):
...and the following Higgins-defined vocabularies:
Not imported by the Persona vocabulary but recommended where relevant to the developer's problem space:
- OpenSocial2 vocabulary - additional social Person attributes, Messages, Organization etc.
- SchemaOrg vocabulary - additional attributes for Person, Organization, Place, Event
- Payment vocabulary - credit cards, products purchased, etc.
- Interest vocabulary - general interests - subclasses of online-behavior:InterestTopic
- I-Card vocabulary - OASIS IMI InfoCard cards
- Places vocabulary - a database of cities, regions, countries
Supporting Vocabularies
The following vocabularies are used to support the PDS application itself:
- Flat Persona vocabulary - a flattened, simplified subset useful for querying persona.owl-based data stores
- Template vocabulary - for describing template contexts that are instantiated as regular contexts. Also uses these vocabularies:
- View-builder vocabulary - for describing how to hierarchically organize the contents of a context for presentation (e.g. in a UI)
- App-data vocabulary - for describing active, JavaScript content that is either stored in a template or fetched from an external service
- Mapping vocabulary - a set of rules used to map between persona.owl and vocabularies used by external sites and services
- Template-meta vocabulary - metadata about connection templates; used to create a registry of templates
- Event vocabulary - for describing attribute changed and attribute disclosure events
Proxies
A Proxy is an object that contains a link (proxy:resource) to an entity (usually a Person) in another context. A proxy allows lazy loading (e.g. by user interfaces) of the entity to which it points. The UI code can rapidly load cards and display them visually. Loading of the resource's context can be delayed and/or happen in a background process.
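The lazy-loading behavior described above can be illustrated with a minimal sketch. This is not the Higgins implementation: the Entity and Proxy classes, the load_entity function, the in-memory store and the firstName attribute are all hypothetical stand-ins. Only the idea of a proxy holding a proxy:resource link that is resolved on demand comes from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Entity:
    """A node (e.g. a p:Person) with simple attribute values."""
    entity_id: str
    attributes: Dict[str, str] = field(default_factory=dict)


class Proxy:
    """Holds a proxy:resource link and loads its target only on first use."""

    def __init__(self, resource: str, loader: Callable[[str], Entity]):
        self.resource = resource      # value of the proxy:resource link
        self._loader = loader         # hypothetical context-loading function
        self._target: Optional[Entity] = None

    @property
    def loaded(self) -> bool:
        return self._target is not None

    def resolve(self) -> Entity:
        # Load the target entity once, then reuse the cached copy.
        if self._target is None:
            self._target = self._loader(self.resource)
        return self._target


# Stand-in for fetching an entity from another context (hypothetical data).
_store = {
    "http://my.azigo.com/ptrevithick/staples.com#me": Entity(
        "http://my.azigo.com/ptrevithick/staples.com#me", {"firstName": "Paul"}
    )
}
load_calls: List[str] = []


def load_entity(entity_id: str) -> Entity:
    load_calls.append(entity_id)
    return _store[entity_id]


# Creating the proxy is cheap; nothing is loaded until resolve() is called.
proxy = Proxy("http://my.azigo.com/ptrevithick/staples.com#me", load_entity)
```

A UI could display the proxy immediately and call resolve() later, possibly from a background task, when the target entity's attributes are actually needed.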
To simplify diagrams of the persona data model we can hide card/proxies by using the following shorthands:
For details about proxies see Proxy vocabulary.
Context Issuer/Authority and Access Control
As we've described above, contexts contain person entities, each of which comprises a set of attributes. Each context has an issuer attribute that indicates who is authoritative over the entire contents of the context. If the user is named as the issuer of the context then the access control policy allows the user to edit and update the entire contents of the context as they see fit. Contexts for which the user is the issuer are physically located within the PDS (the ADS, to be precise). The access control policy is contained within a special control context associated with each (regular) context. For more information about control contexts see the section below on supporting contexts.
Connection Context Pairs
A connection is a relationship between the PDS user and an external site/business or a friend's account on their PDS. There are two sides to these relationships, but not in the usual sense of things. One side is the face that the user wishes to present to the other party. The other side is what the other party says about the person. Each "side" is represented as a p:Person entity. Each p:Person entity lives in its own connection context. Since both p:Person entities are about the same person, the two person entities are interconnected with h:correlation links.
We refer to one of these connection contexts as the definer and the other as participant. In every relationship one party is defining the ground rules for the relationship, and the other is consenting to play within these rules. In a person-to-business relationship the user plays the role of participant, and the business plays the role of definer. In a person-to-person relationship the user could play either role.
The definer-created template that governs the connection relationship identifies which attributes the definer provides (i.e. is authoritative over) vs. which attributes it requests from the participant (i.e. the participant is authoritative over). However, the actor playing the definer role writes to the definer context and the actor playing the participant role writes to the participant context. As a consequence, any given attribute (whether definer-authoritative or participant-authoritative) may be written to either context, or to both.
If the user is playing the role of participant, the identifier of the person entity in the participant context is "<contextid>#me" by convention (see the Naming Conventions section below for more details). The id of the person entity in the definer context is a globally unique identifier of the form "<contextid>#localentityid" where localentityid is usually a URI-friendly normalization of the user's username on the external system.
At this point an example might be helpful. Let's take the example of a relationship between the user and the New York Times:
The attributes of the person entity in the participant context are the set of statements that Alice makes about herself in the context of her relationship with the NYTimes. It is the face or persona that she wishes to present to that business. Examples might include her first name, last name, email address, home delivery address, etc. Alice can make these statements by directly editing them in the participant context using her PDS client. However, she could also express the same intent by interacting with the NYTimes website directly. If she did so the NYTimes agent would write the updated values of these attributes into the definer context.
The attributes of the Person entity in the definer context are the set of statements that the NYTimes wishes to make about Alice in the context of her relationship with the NYTimes. Examples might include Alice's subscriber id. These two Person entities are bi-directionally linked with h:correlation links.
The access control policy of the participant context allows Alice to read and write attributes, and the NYTimes to read them. The access control policy of the definer context allows Alice to read attributes, and the NYTimes to read and write them.
In the user interface (in the Higgins portal) these twin contexts are integrated together and displayed as a single semi-editable view. We discuss attribute integration further in a separate section below.
Attribute Integration
Both the definer and the participant contexts contain p:Person entities with a set of attributes. These two attribute sets are not necessarily disjoint (i.e. there may be one or more attributes that are common to both p:Persons). The integration algorithm is as follows:
- For attributes that exist only on one or the other (but not both) of the two interlinked persons, take their values from whichever person entity they are found.
- For attributes that exist on both persons, take the values from the person whose containing context's modified date-time is more recent.
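The two rules above can be sketched as follows. This is only an illustration under stated assumptions: the ContextSide structure, the attribute names and the dates are hypothetical, and the tie-breaking rule (participant wins when both modified date-times are equal) is not specified by the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class ContextSide:
    """One side of a connection context pair (hypothetical structure)."""
    attributes: Dict[str, List[str]]  # attributes of this side's p:Person
    modified: datetime                # the containing context's modified date-time


def integrate(participant: ContextSide, definer: ContextSide) -> Dict[str, List[str]]:
    # Pick the more recently modified context. On a tie the participant
    # is treated as newer; that choice is an assumption of this sketch.
    if definer.modified > participant.modified:
        newer, older = definer, participant
    else:
        newer, older = participant, definer
    merged = dict(older.attributes)  # attributes found on only one side survive
    merged.update(newer.attributes)  # shared attributes take the newer values
    return merged


# Hypothetical data: the definer context was modified more recently.
participant = ContextSide(
    {"firstName": ["Alice"], "email": ["alice@example.com"]},
    datetime(2012, 1, 1),
)
definer = ContextSide(
    {"subscriberId": ["12345"], "email": ["alice@new.example.com"]},
    datetime(2012, 2, 1),
)
view = integrate(participant, definer)
```

Here firstName and subscriberId each exist on only one side and are taken as found, while email exists on both and is taken from the definer context because that context was modified later.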
Let's examine this algorithm using an example of Alice's connection to the NYTimes website. The parameters of this connection were defined by NYTimes, specifically, by a NYTimes-minted ConnectionTemplate. The relationship involves two disjoint sets of attributes: the set of attributes for which the definer is authoritative, and the set for which the participant is authoritative. In this example Alice is authoritative over three: her first name, last name, and email address. The NYTimes is authoritative over one: Alice's subscriber id.
Alice plays the role of participant. Alice's PDS's connection viewer/editor reads attributes from both contexts, integrates them according to the algorithm above, and displays a UI showing all four of these attributes. Since Alice is authoritative over first name, last name and email address, these are displayed using editable UI widgets. Since the NYTimes is authoritative over her subscriber id, this is displayed in a non-editable widget. If Alice updates the values of any of the three editable attributes, these updated values are written into the participant context (and the context's 'modified' timestamp is updated). As described in the next paragraph, the definer context may contain updated values for none, some or all of the attributes over which Alice is authoritative. Thus these attributes may ultimately exist in both contexts. Per the integration algorithm, the UI takes the values of the common attributes from the most recently updated context. If the definer context has been more recently updated, then the UI reads these Alice-authoritative attributes from the definer context and writes them into the participant context.
The NYTimes plays the role of definer. We ignore here the technical details (e.g. network protocols and/or APIs) of how this data connection works, and just look at the attribute integration logic. The NYTimes has read/write access to the definer context and read access to the participant context. It can also read the modified date-time values of each. The NYTimes is authoritative over the subscriber id value and under no circumstance (either with the PDS or on the NYTimes site) can Alice update or change this value. The NYTimes writes the subscriber id value into the definer context. However, for the other three attributes over which Alice is authoritative, Alice may update their values on the NYTimes site. If she does, the NYTimes writes the updated values of these three attributes into the definer context (and its modified value is updated).
Website Facade Connections
Until the day when businesses natively support bi-directional data connection APIs and open protocols (e.g. perhaps things built on top of OpenID Connect, etc.) we can create a connection another way. The Higgins PDS project includes an optional browser extension (aka HBX) that can fill attributes from the PDS to the site, and scrape data from the web pages of the site into the user's PDS.
The data model to implement this involves only one half of the participant/definer context pair described in the previous section. In this case we instantiate a single participant context of a special kind called a WebsiteFacade. The template for this website facade includes scripts, mapping rules and sometimes custom JavaScript to allow the HBX to read/write attributes from/to the site and update them in the user's ADS account. In addition to being editable using the PDS web client UI, the HBX can execute JavaScript that edits it. See Website Facade Connection Example for more details.
Supporting Contexts
Each regular context (e.g. each of the contexts shown above) has the following links:
- 0..1 ctxt:template
- 0..1 h:control
- 1..1 h:vocabulary
Template Context
A template context acts as a template for a (non-template) context. It contains information common to all instances instantiated from it. Each non-template context may have up to one associated template context (pointed to by p:template attribute).
ConnectorTemplates are templates that describe and govern the relationship between a user and an external party such as a business or a friend's PDS. A ConnectorTemplate describes:
- The set of attributes that each "end" of the relationship (e.g. participant vs. definer) agrees to provide
- Vocabulary/schema mapping rules to transform the "other" party's attributes into and out of the persona data model
- In the case of connections to websites (as opposed to web services or other PDSes) it may include scripts (e.g. JavaScript) to read/write to/from the site
- Future: a legal contract (agreed to by both parties) that governs how each party's attributes may be used.
For more information about templates see Template vocabulary.
AppTemplates are templates for instantiated applets (PDS add-ons) that have read (and potentially write) access to a specific set of attributes within the PDS.
Control Context
Each regular context is associated with one "control" context (linked to by h:control). A control context is associated with one regular context. The control context contains meta information including:
- date-time when the regular context was created and modified
- access control lists:
  - list of parties (currently PDS account ids) that may read the regular context
  - list of parties that may write the regular context
  - list of parties that may append to the regular context
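As a rough sketch, the contents of a control context and a permission check against it might look like the following. The ControlContext class, its field names, the party ids and the dates are hypothetical. The sketch also assumes the three lists are independent, i.e. write permission does not imply append permission, which the text does not state either way.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Set


@dataclass
class ControlContext:
    """Meta information about one regular context (hypothetical structure)."""
    created: datetime
    modified: datetime
    readers: Set[str] = field(default_factory=set)    # may read
    writers: Set[str] = field(default_factory=set)    # may write
    appenders: Set[str] = field(default_factory=set)  # may append

    def allows(self, party: str, operation: str) -> bool:
        acl = {"read": self.readers, "write": self.writers, "append": self.appenders}
        if operation not in acl:
            raise ValueError(f"unknown operation: {operation}")
        return party in acl[operation]


# Policy of the participant context from the NYTimes example:
# Alice may read and write, the NYTimes may only read.
participant_control = ControlContext(
    created=datetime(2012, 1, 1),
    modified=datetime(2012, 1, 1),
    readers={"alice", "nytimes"},
    writers={"alice"},
)
```

The definer context's control context would mirror this, with the NYTimes in both lists and Alice only in the readers list.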
Vocabulary Context
Each regular context has an h:vocabulary link to a context holding the vocabulary it uses to describe its contents. Multiple regular contexts may share the same vocabulary context. The value of this link is usually a reference to the context holding persona.owl (see Persona vocabulary).
Social Graphs
h:relation
HDM defines an h:relation complex attribute that is used in PDM to link one Person node to another where each Person node represents a different person. No symmetry is implied in this; thus the statement (A h:relation B) is akin to saying person A "knows of" person B.
Shown below are two social graph examples. One uses foaf:knows links and (unrelated to this) shows each node in its own context. The other uses h:relation links and (also unrelated) shows all person nodes in a single context. In the Work context we see that the user knows three colleagues but doesn't know how they know one another. In the Home & Family context we see that the user knows two people and that everyone knows one another. The foaf:knows links are shown in both directions although logically this is redundant since foaf:knows is what is called a symmetric relation.
Entities that represent the user are shown in purple. Nodes representing a person other than the user are shown in red.
foaf:knows
To indicate that a person A "knows" person B where some level of reciprocated interaction between the parties is implied, we use foaf:knows.
Since foaf:knows is a broader concept than h:relation, foaf:knows is not a sub-attribute of h:relation. Thus if we had the statement "A h:relation B" then we might later add a second statement "A foaf:knows B" to add the stronger, broader (and symmetric) concept of "knowing."
h:indeterminate
HDM also defines an h:indeterminate link attribute on node A to indicate that its value(s) may or may not represent the same thing as is represented by A.
Implementation Note
Consumers of the HDM may traverse h:relation, h:correlation and h:indeterminate attribute links and (despite ignoring all other links) traverse the entire graph of Person nodes.
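The traversal described in this note can be sketched as a breadth-first search that follows only the three HDM link kinds. The adjacency-list representation, the shortened node ids and the sample graph are hypothetical. For brevity the sketch treats links as pointing directly at Person nodes, so interstitial proxies are assumed to have been resolved already.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# The only link kinds this traversal follows.
PERSON_LINKS = {"h:relation", "h:correlation", "h:indeterminate"}

# node id -> list of (link kind, target node id)
Graph = Dict[str, List[Tuple[str, str]]]


def reachable_persons(graph: Graph, start: str) -> Set[str]:
    """Return every Person node reachable from start via the HDM links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for link, target in graph.get(node, []):
            # All other links (e.g. foaf:knows) are ignored.
            if link in PERSON_LINKS and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical sample graph with shortened ids.
graph: Graph = {
    "root#me": [("h:correlation", "awp#me"), ("h:correlation", "work#me")],
    "work#me": [("h:relation", "work#bob"), ("h:correlation", "root#me")],
    "awp#me": [("foaf:knows", "elsewhere#carol")],
}
persons = reachable_persons(graph, "root#me")
```

In this sample the node elsewhere#carol is not visited, because it is linked only by foaf:knows and this particular graph carries no h:relation link to it.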
Inbox Context
In order to bootstrap sharing, each PDS user has an inbox context that is globally append-able. This allows users to append invites to other users. See the Data Sharing With Alice And Bob scenario.
Naming Conventions
Context Naming
User Context Naming
User contexts inside an ADS are named according to the following pattern:
http://<servername>/<username>/<context-name>
If the context is part of a connection context pair then the context-name uniquely identifies the "other" party in the connection. If the other party is a website then context-name is the domain name of the site (e.g. "staples.com").
Examples wherein servername (PDS/ADS operator) is my.azigo.com:
http://my.azigo.com/ptrevithick/awp
- anonymous web profile
http://my.azigo.com/ptrevithick/staples.com
- paul's profile at staples.com
http://my.azigo.com/ptrevithick/browsing
- browsing history
Reserved Usernames
Any username with 4 or fewer characters is reserved. Examples of reserved usernames:
- sys
- root
- blog
If the username has 4 or fewer characters, the context id identifies a system context (see next section).
System Context Naming
http://<servername>/<reserved-username>/<meta-type>/<context-name>
The <meta-type> may be one of these values:
- template
- ontology
- data
Example
http://my.azigo.com/sys/template/awp
- the template for a user's regular "awp" context
http://my.azigo.com/sys/ontology/tracker-catalog
http://my.azigo.com/sys/data/trackers
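The user and system context naming patterns can be told apart mechanically, as in this minimal sketch. The ContextName class and parse_context_id function are hypothetical names, and the strict validation (exactly two path segments for user contexts, exactly three for system contexts) is an assumption drawn from the patterns shown rather than a documented rule.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

META_TYPES = {"template", "ontology", "data"}


@dataclass
class ContextName:
    server: str
    username: str
    context_name: str
    meta_type: Optional[str] = None  # set only for system contexts

    @property
    def is_system(self) -> bool:
        return self.meta_type is not None


def parse_context_id(context_id: str) -> ContextName:
    parsed = urlparse(context_id)
    parts = [p for p in parsed.path.split("/") if p]
    if not parts:
        raise ValueError(f"no username in context id: {context_id}")
    username = parts[0]
    if len(username) <= 4:
        # Reserved username: <reserved-username>/<meta-type>/<context-name>
        if len(parts) != 3 or parts[1] not in META_TYPES:
            raise ValueError(f"malformed system context id: {context_id}")
        return ContextName(parsed.netloc, username, parts[2], parts[1])
    # Regular user context: <username>/<context-name>
    if len(parts) != 2:
        raise ValueError(f"malformed user context id: {context_id}")
    return ContextName(parsed.netloc, username, parts[1])


user_ctx = parse_context_id("http://my.azigo.com/ptrevithick/staples.com")
system_ctx = parse_context_id("http://my.azigo.com/sys/template/awp")
```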
Entity Naming
The entity representing the user in most contexts has a local name of "me".
Example:
If the contextId is http://my.azigo.com/ptrevithick/awp and the local entityId is "me" then the fully qualified entityId is: http://my.azigo.com/ptrevithick/awp#me
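Building and splitting fully qualified entity ids is a matter of handling the URI fragment. A small sketch follows, in which the function names are hypothetical:

```python
from typing import Tuple
from urllib.parse import urldefrag


def entity_id(context_id: str, local_id: str = "me") -> str:
    """Join a context id and a local entity id with '#'."""
    return f"{context_id}#{local_id}"


def split_entity_id(full_id: str) -> Tuple[str, str]:
    """Split a fully qualified entity id into (context id, local entity id)."""
    context_id, local_id = urldefrag(full_id)
    return context_id, local_id


full = entity_id("http://my.azigo.com/ptrevithick/awp")
```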
Examples
Imagine a root context containing a p:Person entity locally named "me". This root node could have h:correlation links pointing to the "me" entities in two contexts, a web profile context and an alice-staples context.
The web profile context might look like this:
Attribute Metadata
To construct a data-driven presentation of the contents of contexts whose data is described using the Persona data model, metadata about the attributes within each context is needed. See View-builder vocabulary#Cascading_Metadata for a discussion of where these metadata attributes are stored (i.e. which context) and how metadata attributes are evaluated when mapping rules are involved.
For a given attribute, A, the following metadata attributes (as described in Higgins Data Model 2.0#Attribute_Definitions (with the exception of categories which are not used in PDM 2.0)) comprise A's definition:
- UI widget label
- This is stored in an internationalized string value of the skos:prefLabel metadata attribute. An example of a UI label might be the string "Zipcode" for the person's postal-code attribute.
- Example value
- The example value is the value of the skos:example attribute. For example "name@domain.com" might be an example of an email value.
- Hover/Tooltip text
- The string description of the attribute is the value of the skos:description attribute.
- Type
- The type of an attribute is the value of the rdf:type attribute
- Allowed values
- The allowed values of an attribute are defined by the value of its rdfs:range metadata attribute. An rdfs:range may be an XML Schema datatype such as xsd:nonNegativeInteger, or it may be object valued, in which case the value of the rdfs:range attribute is the name of an entity class. If this class is a subclass of p:DiscreteRange, then the allowed values are the rdfs:label values of all instances/members of the class.
- Cardinality
- The min..max (inclusive) cardinality of an attribute is specified using owl:minCardinality and owl:maxCardinality. These two meta attributes are properties of a specific class of entity that is the domain of the attribute, not the attribute's own definition. In other words cardinality is expressed within the context of a class/set of individuals.
- Syntax restrictions
- We follow the latest OWL2 conventions. The value of the rdfs:range attribute may be an rdfs:Datatype augmented with owl:withRestrictions that include XML Schema facets (e.g. rdf:langRange, xsd:length, xsd:maxExclusive, xsd:maxInclusive, xsd:maxLength, xsd:minExclusive, xsd:minInclusive, xsd:minLength, xsd:pattern) as described here.
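To show how a UI might use this metadata, here is a minimal validation sketch. It is not part of PDM: the AttributeMetadata class flattens cardinality onto the attribute itself (the text places it on the domain class), the field names are hypothetical, and Python's re module only approximates XML Schema regular expressions for xsd:pattern.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class AttributeMetadata:
    label: str                                      # skos:prefLabel
    example: Optional[str] = None                   # skos:example
    min_cardinality: int = 0                        # owl:minCardinality
    max_cardinality: Optional[int] = None           # owl:maxCardinality, None = unbounded
    pattern: Optional[str] = None                   # xsd:pattern facet
    max_length: Optional[int] = None                # xsd:maxLength facet
    allowed_values: Optional[Sequence[str]] = None  # labels of a p:DiscreteRange class


def validate(values: List[str], meta: AttributeMetadata) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if len(values) < meta.min_cardinality:
        errors.append(f"{meta.label}: at least {meta.min_cardinality} value(s) required")
    if meta.max_cardinality is not None and len(values) > meta.max_cardinality:
        errors.append(f"{meta.label}: at most {meta.max_cardinality} value(s) allowed")
    for value in values:
        # The pattern must match the whole value, hence fullmatch.
        if meta.pattern is not None and re.fullmatch(meta.pattern, value) is None:
            errors.append(f"{meta.label}: '{value}' does not match the pattern")
        if meta.max_length is not None and len(value) > meta.max_length:
            errors.append(f"{meta.label}: '{value}' is too long")
        if meta.allowed_values is not None and value not in meta.allowed_values:
            errors.append(f"{meta.label}: '{value}' is not an allowed value")
    return errors


# Hypothetical metadata for a postal-code attribute labeled "Zipcode".
postal_code = AttributeMetadata(
    label="Zipcode", example="02139", max_cardinality=1, pattern=r"\d{5}"
)
ok = validate(["02139"], postal_code)
problems = validate(["2139", "02139"], postal_code)
```

In this example ok is empty, while problems reports two issues: too many values, and one value that fails the five-digit pattern.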
We have recently introduced a convention that the context id of metadata attribute M must be the same as the context id of A. If the CURIE form of A is ctxt:attname then the CURIE form of M must have a prefix (i.e. namespace) of ctxt. For example, if the attribute is fp:postalCode then metadata statements about fp:postalCode must be in the Flat Persona vocabulary context (fp being a prefix for this vocabulary) along with the definition of fp:postalCode itself. See also View-builder vocabulary.
Open Issues
- To support connector contexts for which a WebsiteFacade is used for the definer side along with its associated JavaScript, it may be useful to add a "date-time-modified" timestamp to every context. This would allow sync operations via a set of N WebsiteFacade JavaScript programs to be decoupled from (and asynchronous to) real-time edit operations by the user. A more sophisticated approach would involve caching as a set of commands (transactions) the changes made to any context and allowing other contexts (well, their associated JavaScript) to subscribe to these transactions.