Difference between revisions of "Data Models 1.X"

(Motivation)
 
(57 intermediate revisions by 6 users not shown)
Line 1: Line 1:
==Overview==
+
{{#eclipseproject:technology.higgins|eclipse_custom_style.css}}
The Higgins data model provides a common representation for identity, profile and relationship data to enable interoperability and data portability across heterogeneous sites and systems.
+
[[Image:Higgins_logo_76Wx100H.jpg|right]]
  
==Motivation==
 
  
Information fragmentation is a pervasive problem. Even seemingly simple activities depend on information from a number of heterogeneous sources. The information may be fragmented by physical location, device, application, middleware, data storage or platform. By providing a common data model, data from multiple locations and systems can be unified.
+
The [[Data Model]] provides a common representation for identity, profile and relationship data to enable interoperability and data portability across heterogeneous sites and systems. The model is described in these sections:
  
There is a great deal of interest among Web developers in solving interoperability and providing data portability. See, for example, http://DataPortability.org and many other related efforts. In this quest, the Higgins data model can be a powerful enabler for interoperability of identity-related information across the "silos."
+
=== Information Cards ===
 +
The Information Card (aka I-Card) metaphor includes the end-user concept of [[I-Card]]s and an [[Identity Selector]] to manage them.
  
There are approaches to data unification other than providing a common data model. However, every unification strategy involves choosing some kind of lowest common denominator. It is all a question of how low is low. The lower the level, the easier the unification, but the more lossy the result. For example, consider raw text. It's easy to index, search, and copy/paste but very lossy. Or consider XML, which offers a common syntax for describing a series of attributes of a given object and values for each of the attributes, although still without any defined semantics.
+
=== Tokens and Claims ===
 +
Higgins supports identity service concepts such as Claim, Digital Identity, Security Token and other objects used by Identity Providers, Relying Parties, Service Providers and Identity Selectors.
  
The kinds of data we wish to unify are very roughly classified as ''identity'', ''profile'' and ''relationship'' data. ''Identity'' information is related to identification, authentication, etc. ''Profile'' information can be preferences, interests, and associated objects such as events, things and wishlists. ''Relationships'' are links to other [[Digital Subject]]s--they can be used to represent friends and other kinds of associations with other [[Digital Subject]]s. A key kind of relation introduced in the model is the Higgins ''correlation''--a link between different representations of the same real world object (e.g. you) in different contexts.
+
=== Context Data Model ===
  
==Kinds of interoperability==
+
The [[Context Data Model 1.0]] describes a data model that makes data from heterogeneous sources, such as enterprise directories, databases, communications networks, and social networks, portable and interoperable.
  
Saying we desire interoperability can mean many different things. At the least it should mean that we can navigate through and inspect data objects and their associated attributes/relationships within any [[Context]] through the Higgins API. This is part of what motivates [[Data Model Goals]] [2], [3] and [4]. At this level of interoperability we may not understand the meaning of the objects and the attributes, but we can know that they are there.
+
[[Category:Higgins Data Model]]
 
+
Moving further along the interoperability spectrum, if we add the requirement that every attribute/relationship is globally uniquely identifiable (see [5] in [[Data Model Goals]]), then we can use the Higgins IdAS API for more than a shallow syntactic parse of the data in various Contexts. We can, for example, assemble (join) attribute information about two Digital Subjects held in two separate Contexts, and perhaps implemented by separate providers, without collision or data loss. Along these lines, Higgins itself needs to implement certain kinds of cross-Context attribute data flows for correlated [[Digital Subject|Digital Subjects]].
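
To illustrate why globally unique attribute identifiers matter for such a join, here is a minimal Python sketch. It is not the IdAS API (which is Java); the two Contexts, the <code>example.org</code> URIs and the <code>join_attributes</code> helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical attribute identifiers; a real Context would take these
# from its own ontology rather than from example.org.
HR_NAME = "http://example.org/hr#name"        # a person's full name
CRM_NAME = "http://example.org/crm#name"      # an account nickname
CRM_EMAIL = "http://example.org/crm#email"

# Two Contexts, each holding one Digital Subject under a local subject id.
hr_context = {"emp-17": {HR_NAME: ["Alice Smith"]}}
crm_context = {"cust-42": {CRM_NAME: ["asmith"], CRM_EMAIL: ["alice@example.org"]}}


def join_attributes(*subjects):
    """Merge attribute maps that are keyed by globally unique identifiers."""
    merged = defaultdict(list)
    for attributes in subjects:
        for attr_id, values in attributes.items():
            merged[attr_id].extend(values)
    return dict(merged)


joined = join_attributes(hr_context["emp-17"], crm_context["cust-42"])
```

Because the two "name" attributes carry different identifiers, both survive the join as separate attributes. Had each Context used the bare key <code>name</code>, the two unrelated values would have been merged into one attribute.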
+
 
+
Beyond inspection and navigation, Higgins aspires to support applications that can also edit [[Context]] data. We envision Higgins-based applications with user interfaces that can manipulate data contained in any [[Context]] from any [[Context Provider]] bound into Higgins. This implies a number of things. First, we require that the semantics of the attributes of objects be defined in a single well-defined (unambiguous) manner. If the model has more degrees of freedom than the absolute minimum necessary, ambiguity will arise where different [[Context Provider|Context Providers]] express the same semantic in different ways. For more about this see [6] in [[Data Model Goals]].
+
 
+
Second, the ''specific'' schema of a [[Context|Context's]] use of the abstract Higgins data model must be exposed at the CPI and API levels. This exposure allows an application to know the valid degrees of freedom in the structure of the data and the values its data fields may assume. The application can learn from the schema what datatypes are used to describe the value of a given attribute (e.g. a string, a non-zero number or a date). It can learn what kinds of attributes may optionally be added to an object (and which may not). And it can learn what kinds of required and/or optional relationships are allowed with objects of various kinds. For more about the need for a common schema language see [7] in [[Data Model Goals]].
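
The kind of information an application could draw from an exposed schema can be sketched as follows. This is a Python illustration under stated assumptions, not the Higgins schema language: <code>AttributeSchema</code>, <code>PERSON_SCHEMA</code> and <code>validate</code> are hypothetical names, and Python types stand in for schema datatypes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AttributeSchema:
    datatype: type          # Python type standing in for a schema datatype
    required: bool = False  # optional unless stated otherwise


# Hypothetical schema that a Context might expose for its person objects.
PERSON_SCHEMA = {
    "fullName": AttributeSchema(str, required=True),
    "birthDate": AttributeSchema(date),
    "loginCount": AttributeSchema(int),
}


def validate(obj, schema):
    """Return a list of problems; an empty list means the object conforms."""
    problems = []
    # Required attributes must be present.
    for name, rule in schema.items():
        if rule.required and name not in obj:
            problems.append(f"missing required attribute: {name}")
    # Present attributes must be allowed and of the declared datatype.
    for name, value in obj.items():
        if name not in schema:
            problems.append(f"attribute not allowed: {name}")
        elif not isinstance(value, schema[name].datatype):
            problems.append(f"wrong datatype for: {name}")
    return problems
```

An editing application could run such a check before writing to a Context, so that it only offers the user the changes the Context's schema permits.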
+
 
+
==Design Goals==
+
[[Data Model Goals]] enumerates the top-level design goals that ultimately led to the decision to use an RDF/OWL-based metamodel.
+
 
+
== Higgins Data Model Definition ==
+
 
+
Rather than invent a new metamodel from scratch, the model is based on the W3C's Resource Description Framework (RDF) and Web Ontology Language (OWL 1.0). We used RDF and OWL to express a very abstract base ontology called higgins.owl (aka HOWL), which in turn describes the domain of identity information. The "Lexicon" project within the Identity Gang defined a set of identity domain concepts/terms that have been directly formalized in HOWL. These domain concepts include:
+
# [[Context]] 
+
# [[ContextId]]
+
# [[SubjectId]]
+
# [[Digital Subject]]
+
# [[Entity]]
+
# [[Identity Attribute]]
+
# [[Relation]]
+
 
+
Their semantics (with the exception of [[Entity]], which is not modeled) are expressed in higgins.owl, which is summarized on the [[Higgins Ontology]] page. The [[Higgins Ontology]] pages define the semantics of HOWL.
+
 
+
== Extending HOWL ==
+
HOWL is a base ontology. To be useful in real-world applications, it must be extended: developers create specialized ontologies based on HOWL that describe a specific concrete domain.
+
 
+
For example, if a developer wanted to describe a CRM database, she would create an OWL ontology that would describe the data objects in the CRM database. This CRM database is called a [[Context]] in Higgins. If, for example, the database contained records about customers and those customers had full-names and email addresses, then the developer would define "Customer" as a sub-class of [[Digital Subject]] and "full-name" and "email" as kinds of [[Identity Attributes]].
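
The CRM example can be sketched as a handful of subject/predicate/object statements. The Python below is only an illustration: the <code>example.org</code> namespaces are placeholders rather than the real higgins.owl identifiers, modelling "full-name" and "email" as sub-properties of a generic attribute property is an assumption about how HOWL is extended, and <code>specializes</code> is a hypothetical helper.

```python
# Placeholder namespaces; the real terms are defined in higgins.owl.
HIGGINS = "http://example.org/higgins#"
CRM = "http://example.org/crm#"
SUBCLASS_OF = "rdfs:subClassOf"
SUBPROPERTY_OF = "rdfs:subPropertyOf"

# The CRM ontology as (subject, predicate, object) triples.
crm_ontology = {
    (CRM + "Customer", SUBCLASS_OF, HIGGINS + "DigitalSubject"),
    (CRM + "full-name", SUBPROPERTY_OF, HIGGINS + "attribute"),  # assumed modelling
    (CRM + "email", SUBPROPERTY_OF, HIGGINS + "attribute"),      # assumed modelling
}


def specializes(term, ancestor, predicate, triples):
    """True if `term` reaches `ancestor` by following `predicate` links."""
    seen, frontier = set(), [term]
    while frontier:
        current = frontier.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(o for s, p, o in triples if s == current and p == predicate)
    return False
```

With these statements, generic code that knows only the base terms can still recognize a "Customer" as a kind of Digital Subject, which is the point of deriving the CRM ontology from HOWL.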
+
 
+
Here are some HOWL-based Ontologies:
+
* [[test-person Example Context Ontology]]
+
* [[Person-with-address Example Context Ontology]]
+
* [[Person-with-friend Example Context Ontology]]
+
 
+
== HOWL and IdAS ==
+
 
+
The [[Identity Attribute Service]] (IdAS) provides a Java API that exposes read/write-able data from a wide variety of external data sources in the common Higgins model. The IdAS API implements but does not define the semantics of the Higgins data model.
+
 
+
[[Context Provider]] plug-ins to IdAS are used to adapt an external system, site, database or other data source to the IdAS API. These [[Context Provider]]s are responsible for data transformation between the Higgins model and their own internal data model. Higgins does not constrain the [[Context Provider|Context Provider's]] choice of data representation; it could be XML-based, object-oriented, relational, or anything else.
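
The adapter role described above can be sketched as follows. This is a Python illustration, not the actual IdAS plug-in interface (which is Java); the <code>ContextProvider</code> base class, the <code>LdapLikeProvider</code> class, its field map and the <code>example.org</code> identifiers are all hypothetical.

```python
from abc import ABC, abstractmethod


class ContextProvider(ABC):
    """Hypothetical stand-in for a provider plug-in; not the IdAS interface."""

    @abstractmethod
    def get_attributes(self, subject_id):
        """Return {attribute_id: [values]} expressed in the common model."""


class LdapLikeProvider(ContextProvider):
    # Maps the store's native field names to common-model identifiers.
    FIELD_MAP = {
        "cn": "http://example.org/common#fullName",
        "mail": "http://example.org/common#email",
    }

    def __init__(self, entries):
        # Native representation: entry name -> {field: [values]}
        self._entries = entries

    def get_attributes(self, subject_id):
        entry = self._entries[subject_id]
        return {
            self.FIELD_MAP[field]: list(values)
            for field, values in entry.items()
            if field in self.FIELD_MAP  # unmapped native fields are dropped
        }
```

The caller sees only common-model identifiers. How the provider stores its data internally stays private to the provider, which is why the choice of representation is unconstrained.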
+
 
+
[[Context Provider]]s can be used to adapt data stores/sources such as:
+
* Directories: LDAP stores like eDirectory, Active Directory, OpenLDAP, etc.
+
* Relational databases used by enterprise apps to store identity/profile information.
+
* Digital social networks (node-edge graphs): data behind Facebook, MySpace, LinkedIn, etc; or the graphs created by mining email traffic
+
* Email/IM/collaboration client account data: email and IM client accounts, contact/buddy lists
+
* Identity/profile data stored in website "silos": personal information stored in sites like eBay, Amazon, Google Groups, Yahoo Groups
+
 
+
==Open Issues==
+
* [[Data Model Open Issues]]
+
** [[LDAP Issues and To-Dos]] --open issues specifically related to LDAP schema
+
 
+
== Scope ==
+
The data model addresses "The need for interoperability" described here: [http://www.eclipse.org/higgins/goals.php Higgins Goals]. In addition, items #3 and #5 of the [http://www.eclipse.org/higgins/higgins-charter.php charter] state or imply the need for a robust identity and social networking data model:
+
: '''Scope item 3.''' Provide an API and data model for the virtual integration and federation of identity and security information from a wide variety of sources.
+
: '''Scope item 5.''' Provide a social relationship data integration framework that enables these relationships to be persistent and reusable across application boundaries.
+
 
+
== References ==
+
===RDF/OWL Related Resources===
+
* OWL
+
** W3C OWL working group: http://www.w3.org/2007/OWL/wiki/OWL_Working_Group
+
** OWL 1.1 at Google Code: http://code.google.com/p/owl1-1/
+
** OWL 1.1 WD 8: http://www.w3.org/TR/owl11-syntax/
+
* Intro to RDF/OWL: [[RDF-OWL Data Model]]
+
* Semantic Web (RDF/OWL) Resources
+
** Toolkit: [http://www.wiwiss.fu-berlin.de/suhl/bizer/toolkits/ Developers Guide to Semantic Web Toolkits]
+
** Reference documents: [http://www.w3.org/2001/sw/WebOnt/#Current W3C Web Ontology Working Group]
+
** Tutorial: http://www.cs.man.ac.uk/~horrocks/ISWC2003/Tutorial/
+
* Normalization to OWL/RDF
+
** [http://www.ldap.com/1/spec/schema/ont.shtml Schemat]
+
** Sebastian Dietzold, Generating RDF Models from LDAP Directories (PDF) , [http://www.semanticscripting.org/SFSW2006/ 2nd Workshop on Scripting for the Semantic Web] co-located with the [http://www.eswc2006.org/ 3rd European Semantic Web Conference], June 12, 2006
+
 
+
===Misc Resources===
+
* http://identityschemas.org
+
* "D3.2: Models" FIDIS, October 2005 ([http://www.fidis.net/fileadmin/fidis/deliverables/fidis-wp2-del2.3.models.pdf PDF], 74 pages). Summary: "The objective of this document is to present in a synthetic way different models of representation of a person ("person schema") that can be used in different application domains."
+
* [http://www.nmi-edit.org/eduPerson/internet2-mace-dir-eduperson-200604.html eduPerson spex]
+
 
+
== Links ==
+
* [http://eclipse.org/higgins Higgins Home]
+

Latest revision as of 11:23, 25 June 2010



The Data Model provides a common representation for identity, profile and relationship data to enable interoperability and data portability across heterogeneous sites and systems. The model is described in these sections:

Information Cards

The Information Card (aka I-Card) metaphor includes the end-user concept of I-Cards and an Identity Selector to manage them.

Tokens and Claims

Higgins supports identity service concepts such as Claim, Digital Identity, Security Token and other objects used by Identity Providers, Relying Parties, Service Providers and Identity Selectors.

Context Data Model

The Context Data Model 1.0 describes a data model that makes data from heterogeneous sources, such as enterprise directories, databases, communications networks, and social networks, portable and interoperable.
