Difference between revisions of "PDS Architecture"

(Data Models)
(HBX)
 
(39 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
{{#eclipseproject:technology.higgins|eclipse_custom_style.css}} [[Image:Higgins.funnell.PNG|right]]  
 
  
A PDS is a cloud-based service that works on behalf of you, the individual. It gives you a central point of control for personal information about you: your interests, contact information, addresses, profiles, affiliations, friends, and so on. A PDS is a place where you establish bi-directional data flows between external businesses and your PDS, or between your friends' PDSes and your PDS.
+
This document describes the top level Higgins 2.0 PDS components under active development. Here are the bugzilla component names:
 +
* H2-Client
 +
* H2-HBX
 +
* H2-PDS
 +
* H2-PDS Support
 +
* H2-ADS
 +
* H2-Data Model
  
== Long Term Goals ==
+
== Front End  ==
  
This section is aspirational. It describes the vision we have for Higgins 2.0. Other sections describe actual code and progress we're making towards this vision.  
+
There are two front end components: a web client, and a browser extension.  
  
'''Data Management.''' In some cases the data itself flows directly between the data provider and the data consumer, while in others the data flows through the PDS intermediary. In some cases the source of the data is the PDS's local storage. In cases where data flows from or through the PDS, we have the opportunity to map it into a normalized data model, provide the ability to see the data values, and in some cases be able to edit and update it.  
+
[[Image:Higgins client 2.0.222.png|center]]
  
'''Discovery'''. A PDS supports a discovery API that allows the user to be discoverable by other people, organizations, apps and exchanges when the incoming inquiries meet criteria the user specifies.
+
=== Client ===
  
'''Interoperability'''. Each PDS is a peer that can exchange personal data with other PDS peers within a distributed network operated by multiple organizations. Each PDS would be hosted by a trusted organization that acts on behalf of the individual, or would be self-hosted. An individual's PDS would typically include links to objects stored in friends' PDSes. These links, taken together, form a social graph that is distributed across the PDSes.
+
The client is written in HTML and JavaScript and runs in any desktop browser (e.g. IE, FF, Safari, Chrome). In the future we also plan to make it display well on the limited screen size of smartphone browsers (e.g. iPhone, Android).
  
[[Image:Pds 2.0.200.png|center]]
+
* [[Org.eclipse.higgins.js.pds.client | .js.pds.client]]
  
=== PDS ===
+
=== HBX ===
  
Information from a variety of data sources (e.g. social networks, telco and health data sources) is virtually integrated by the PDS and presented in a "dashboard" application in a browser or in desktop and mobile clients. The PDS gives you control over your own information by allowing you to share selected subsets of it with other people and organizations that you trust.  
+
The Higgins browser extension makes possible functionality like browser-side integration with other web APIs and sites, scraping and form filling.
  
* Is a service that enables the user to participate as a peer within a distributed personal data ecosystem
+
* .chrome.bx - Chrome-only Higgins Browser Extension
* Provides an online profile manager web app that offers an integrated view of the user’s data, the ability to update self-asserted data, a way to manage authorizations (e.g. using something like an UMA Authorization Manager) and a way to set policies under which 3rd parties (e.g. apps) gain access to portions of the user’s information
+
* .js.pds.cde - Connection Data Engine 1. Loads CDE1-compatible JSON Scripts (See [[App-data vocabulary]]) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
* Implements a Discovery API that allows the user to be discoverable by other people, organizations, apps and exchanges whose inquiries meet user-defined criteria 
+
* [[org.eclipse.higgins.js.pds.cde2|.js.pds.cde2]] - Connection Data Engine 2. Loads CDE2-compatible JSON Scripts (See [[App-data vocabulary]]) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
* Provides an identity provider (IdP) endpoint (e.g. OpenID OP, SAML, Infocard)
+
* .js.pds.connector.common
* Implements two factor authentication
+
* Provides a run-time environment for Kynetx-like apps that run within the PDS itself
+
* Decrypts data from the user's personal data stores (using a local key) to allow their attributes to be managed in the PDS's dashboard UI.
+
  
===Attribute Data Service===
+
====Functionality====
Provides a data abstraction layer over both personal and managed data stores, mapping them into a common data model.
+
  
'''Personal data storage'''
+
=====Browser interactions=====
* Manages a set of locally stored contexts each of which holds a different, contextualized person object
+
When the user's browser lands on a new webpage it:
* Provides an encrypted "lock box" in the cloud such that many kinds of data in the store (e.g.  persona definitions) cannot be read by the store's operator
+
* Determines if the current PDS user is currently logged in.
* Backs up personal data stored on your desktop and mobile devices
+
** This requires there be a template for the current site (domain) and that it contains an IsLoggedIn script
* Synchronizes personal data to other devices and computers owned by the person using a variety of network protocols.
+
** It is possible that a different PDS user (not the current PDS user) is currently logged in.
* Links information from contexts to accounts (profiles) that the user has at services providers, websites, social networking sites, etc. and over which the user has joint control and rights
+
* If the user is not logged in then
* Links information from the user's contexts with the contexts of the user's friends and colleagues
+
** It automatically logs the user in (or should it just auto-fill in the userid/password and wait for the user to click?)
 +
* Looks for every appropriate form on the page
 +
** Automatically fills in each form as best it can. This requires that there be a template for the current site (domain) and that it contains a Fill script for this form. (Is there one Fill script container with lots of per-form-submit-URL scripts? Or are there lots of Fill scripts, each with a for-this-form-submit-URL attribute?)
 +
* Waits for the user to submit a form (including a login form with or without a custom template?)
 +
** Scrapes the form submit data and writes it into the PDS. If it is a login form then it writes into the proxy object, else the corresponding context
  
'''Managed data storage'''
+
=====Web client interactions=====
* Each external, managed data service is represented as a context container within which is a person object and its attributes. For example the user's profile on Facebook could be represented as a person object within a Facebook context. The user's friends would be represented as Proxy objects. Each proxy object is a link to a person object in its own context container.
+
When the user opens a connection editor page (e.g. to edit the nytimes.com connection):
 +
* The BX immediately starts a background process to log in and scrape the latest data values from the site.  
 +
** This is necessary because the user may have gone to the site directly (not using the PDS) and updated data values. A progress bar shows this background process.
 +
* If the user edits an attribute it writes the updated attribute value to the site.  
 +
** If this "write" operation happens before the background sync completes, there is some possibility of sync collisions and confusion.
  
=== 3rd Party Apps ===
+
== Back End Components  ==
 
+
These include:
+
* '''Exchange.''' A kind of PDS App that is involved in creating personal data exchanges analogous to a stock exchange. An exchange itself is a platform that supports yet another layer of apps above it [this is not shown above].
+
* '''Data Refinery.''' A kind of PDS App that reads datasets from the PDS, refines them, and writes them back to the PDS user. The refinery process includes analytics, inferencing, segmentation, etc. Refineries generally create higher value, more refined data from the more raw forms of data, while often also making the data sets less personally identifying.
+
 
+
=== Active Clients & HBX  ===
+
 
+
An optional Higgins Browser Extension (HBX) can be downloaded from the Web-based PDS portal. It converts a passive browser into an "active client" that has additional capabilities:
+
 
+
* '''Data capture.''' Since the client is integrated with the browser it can capture information about the user (e.g. data entered into Web forms, etc.) as they browse the Web.
+
* '''Web augmentation.''' It can also augment the user's web experience via web augmentation (overlaying context-specific information within the browser) and automatic form filling (e.g. filling in passwords).
+
* '''Security.''' The client can add a measure of anti-phishing protection from malicious websites.
+
* '''Privacy.''' Personal data is encrypted on the client before transmission to the cloud-based personal data store using a key that is unknown to the cloud-based personal data store operator.
+
 
+
== Components  ==
+
 
+
This section describes the Higgins 2.0 PDS components under active development.
+
 
+
=== Front End Components  ===
+
 
+
There are two front end components: a web app client, and a browser extension.
+
 
+
[[Image:Client 2.0.212.png|center]]
+
 
+
;Client 
+
:The client is written in HTML and JavaScript and runs in any desktop browser (e.g. IE, FF, Safari, Chrome). In the future we also plan to make it display well on the limited screen size of smartphone browsers (e.g. iPhone, Android).
+
;HBX
+
:The Higgins browser extension makes possible functionality that isn't possible in a pure web app architecture. One kind of functionality is browser-side integration with other web APIs and sites. Shown above is a connector that imports the user's advertising preferences from Google's Ad Preference page (http://www.google.com/ads/preferences).
+
 
+
=== Back End Components  ===
+
  
 
There are three back end components mostly written in Java and running in the cloud (e.g. Amazon AWS):  
 
Line 79: Line 59:
 
*ADS
 
  
<br> [[Image:Server 2.0.216.png|center]]  
+
[[Image:Higgins server 2.0.230.png|center]]  
  
 +
===PDS===
 
PDS Subcomponents:  
 
  
 
*.pds.usermanager.ws - simple web service to manage user accounts, change password, etc.
 
  
 +
===PDS Support===
 
PDS Support Subcomponents:  
 
  
 
*.pds.client - wrapper around Open Anzo java client
 
  
 +
===Attribute Data Storage===
 
ADS Subcomponents:  
 
  
*PLANNED: .ads.ld&nbsp; - Linked Data endpoint
+
*PLANNED: .ads.ld - Linked Data endpoint
 
+
 
+
 
+
=== Data Model  ===
+
 
+
==== A Common Vocabulary ====
+
 
+
Data that is either created by the user and stored on the PDS or passes through the PDS intermediary on its way from the data source to the data consuming service can, in many cases, be mapped into a rich, common data model. This allows it to be consistently displayed (and in some cases edited) to the user irrespective of its original source. The common data model being developed for the purpose of representing people and their social networks is called the [[Persona Data Model 2.0]].
+
 
+
People play different roles and share different subsets of their social graphs and attributes depending on who they are interacting with. For this reason a single person is represented as a set of partial identities that are used in different situations. The heart of the model used by the personal data store and managed data stores is based on a set of containers called ''contexts.'' Each context holds a partial digital identity called a ''persona''. Each persona instance has a set of attributes and values. The contexts, personas and attributes adhere to the Higgins [[Persona Data Model 2.0]].
+
 
+
These contexts are usually displayed as digital card metaphors in a user interface. A context/card could hold the attributes of a person's driver's license, home address, credit card. They might simply hold a verified assertion that a person is over 21 years of age. Contexts may also be about the user's friends and colleagues.
+
 
+
The user can choose to collect sets of these cards (partial identities) into a ''persona-set''. For example the user could group together a home address card, an AMEX credit card, a proof of age-over-21 and a card holding a set of "shopping friends" into an "eCommerce" persona. This is done by tagging each of these cards with the "eCommerce" label. When the user goes to a new eCommerce site, it can "project" (either by form filling or something more sophisticated!) the minimal set of required attributes from these "eCommerce" cards to the site without tedious data entry.
+
 
+
If the user desires, they can give a semi-permanent (revocable) permission to the relying site, app or system to be able to access an approved set of attributes. The user can basically send a "pointer" to these cards to the relying site. The relying site can dereference the pointer and read (and in some cases update) selected attributes.
+
  
The [[Persona Data Model 2.0]] mentioned above builds on the [[Higgins Data Model 2.0]] which defines a small set of fairly abstract attributes.
+
== Data Model ==
  
==== Naming: Entity and Context Ids ====
+
Data attributes, whether created by the user or imported from an external service, are stored in a common data model. This allows them to be consistently displayed to, and in some cases edited by, the user irrespective of their original source. We call this the [[Persona Data Model 2.0]].  
@@@TODO: write a section describing the notion of globally unique graph of UDIs; the fact that in Higgins 2.0 we are only allowing UDIs to be Linked Data URIs; the fact that there will be TWO separate APIs that clients can be built on; the first API provides data access to the local contents of the ADS; the second API provides access to the open web of data defined by (Linked Data) UDIs; the fact that the entire contents of the ADS can be treated as a cache of a (tiny) portion of the UDI web of data; need to describe the component that maintains the map from the URI of the external information resource vs. the URI of its data description (these are the WRONG terms) and a time-to-live; refresh algorithms; etc.
+
  
 
[[Category:Higgins 2]]
 

Latest revision as of 12:12, 4 January 2012

Higgins.funnell.PNG

This document describes the top level Higgins 2.0 PDS components under active development. Here are the bugzilla component names:

  • H2-Client
  • H2-HBX
  • H2-PDS
  • H2-PDS Support
  • H2-ADS
  • H2-Data Model

Front End

There are two front end components: a web client, and a browser extension.

Higgins client 2.0.222.png

Client

The client is written in HTML and JavaScript and runs in any desktop browser (e.g. IE, FF, Safari, Chrome). In the future we also plan to make it display well on the limited screen size of smartphone browsers (e.g. iPhone, Android).

HBX

The Higgins browser extension enables functionality such as browser-side integration with other web APIs and sites, scraping, and form filling.

  • .chrome.bx - Chrome-only Higgins Browser Extension
  • .js.pds.cde - Connection Data Engine 1. Loads CDE1-compatible JSON Scripts (See App-data vocabulary) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
  • .js.pds.cde2 - Connection Data Engine 2. Loads CDE2-compatible JSON Scripts (See App-data vocabulary) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
  • .js.pds.connector.common

Functionality

Browser interactions

When the user's browser lands on a new webpage it:

  • Determines if the current PDS user is currently logged in.
    • This requires there be a template for the current site (domain) and that it contains an IsLoggedIn script
    • It is possible that a different PDS user (not the current PDS user) is currently logged in.
  • If the user is not logged in then
    • It automatically logs the user in (or should it just auto-fill in the userid/password and wait for the user to click?)
  • Looks for every appropriate form on the page
    • Automatically fills in each form as best it can. This requires that there be a template for the current site (domain) and that it contains a Fill script for this form. (Is there one Fill script container with lots of per-form-submit-URL scripts? Or are there lots of Fill scripts, each with a for-this-form-submit-URL attribute?)
  • Waits for the user to submit a form (including a login form with or without a custom template?)
    • Scrapes the form submit data and writes it into the PDS. If it is a login form then it writes into the proxy object, else the corresponding context
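
The page-load steps above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the HBX implementation: the Template and PdsStore classes, the dictionary shape of a page, and the choice of one Fill mapping per form submit URL are all assumptions made for the example. The open questions in the list (auto-login versus fill-only, and how Fill scripts are organized) are left unresolved; the sketch only records that a login is needed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Template:
    """Hypothetical per-domain template; the real template format is not shown here."""
    is_logged_in: Callable[[dict], bool]  # stands in for the IsLoggedIn script
    # Assumed layout: one Fill mapping per form submit URL,
    # mapping form field name -> PDS attribute name.
    fill_scripts: Dict[str, Dict[str, str]] = field(default_factory=dict)


@dataclass
class PdsStore:
    """Hypothetical stand-in for the user's data held in the PDS."""
    context: Dict[str, str] = field(default_factory=dict)  # ordinary attributes
    proxy: Dict[str, str] = field(default_factory=dict)    # login data for the site


def on_page_load(page: dict, template: Optional[Template], pds: PdsStore) -> List[str]:
    """Run the page-load steps and return a log of the actions taken."""
    actions: List[str] = []
    if template is None:
        # Without a template there is no IsLoggedIn or Fill script to run.
        return actions
    if not template.is_logged_in(page):
        actions.append("login")
    for form in page["forms"]:
        mapping = template.fill_scripts.get(form["submit_url"])
        if mapping is None:
            continue  # no Fill script for this form
        for field_name, attribute in mapping.items():
            if attribute in pds.context:
                form["fields"][field_name] = pds.context[attribute]
        actions.append("filled " + form["submit_url"])
    return actions


def on_form_submit(form: dict, is_login_form: bool, pds: PdsStore) -> None:
    """Write submitted values to the proxy object (login form) or the context."""
    target = pds.proxy if is_login_form else pds.context
    target.update(form["fields"])


page = {"user": None, "forms": [{"submit_url": "/profile", "fields": {"mail": ""}}]}
template = Template(
    is_logged_in=lambda p: p["user"] is not None,
    fill_scripts={"/profile": {"mail": "email"}},
)
pds = PdsStore(context={"email": "alice@example.org"})
print(on_page_load(page, template, pds))  # ['login', 'filled /profile']
```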

Web client interactions

When the user opens a connection editor page (e.g. to edit the nytimes.com connection):

  • The BX immediately starts a background process to log in and scrape the latest data values from the site.
    • This is necessary because the user may have gone to the site directly (not using the PDS) and updated data values. A progress bar shows this background process.
  • If the user edits an attribute it writes the updated attribute value to the site.
    • If this "write" operation happens before the background sync completes, there is some possibility of sync collisions and confusion.
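
One possible way to avoid the collision described above is to queue user edits until the background sync has finished and then replay them, so that the user's edits win over scraped values. The sketch below illustrates that idea only; the ConnectionEditor class and the write_to_site callback are hypothetical names, and this page does not say how (or whether) Higgins resolves such collisions.

```python
from typing import Callable, Dict, List, Tuple


class ConnectionEditor:
    """Hypothetical model of a connection editor page; not the Higgins API."""

    def __init__(self, write_to_site: Callable[[str, str], None]) -> None:
        self.values: Dict[str, str] = {}
        self.sync_complete = False
        self._write_to_site = write_to_site
        self._queued: List[Tuple[str, str]] = []

    def edit(self, name: str, value: str) -> None:
        """Record a user edit; defer the site write while the sync is running."""
        self.values[name] = value
        if self.sync_complete:
            self._write_to_site(name, value)
        else:
            self._queued.append((name, value))

    def finish_sync(self, scraped: Dict[str, str]) -> None:
        """Merge scraped values, then replay queued edits so the user's edits win."""
        self.values.update(scraped)
        self.sync_complete = True
        for name, value in self._queued:
            self.values[name] = value
            self._write_to_site(name, value)
        self._queued.clear()
```

With this policy, an edit made while the progress bar is still running is not lost when the scrape completes, and the site receives the write only once the scraped state is known.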

Back End Components

There are three back end components mostly written in Java and running in the cloud (e.g. Amazon AWS):

  • PDS
  • PDS Support
  • ADS

Higgins server 2.0.230.png

PDS

PDS Subcomponents:

  • .pds.usermanager.ws - simple web service to manage user accounts, change password, etc.

PDS Support

PDS Support Subcomponents:

  • .pds.client - wrapper around Open Anzo java client

Attribute Data Storage

ADS Subcomponents:

  • PLANNED: .ads.ld - Linked Data endpoint

Data Model

Data attributes, whether created by the user or imported from an external service, are stored in a common data model. This allows them to be consistently displayed to, and in some cases edited by, the user irrespective of their original source. We call this the Persona Data Model 2.0.
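
As a rough illustration of what mapping into a common data model means, the sketch below renames source-specific fields to shared attribute names. The source names, field names, and attribute names are invented for the example and are not taken from the Persona Data Model 2.0.

```python
from typing import Dict

# Hypothetical per-source mappings: source field name -> common attribute name.
# These names are illustrative and are not taken from the Persona Data Model 2.0.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "user": {"email": "email", "name": "fullName"},
    "example-service": {"mail_addr": "email", "display_name": "fullName"},
}


def to_common_model(source: str, record: Dict[str, str]) -> Dict[str, str]:
    """Rename the fields of one source record to the common attribute names."""
    mapping = MAPPINGS[source]
    # Fields without a mapping are dropped rather than guessed at.
    return {mapping[key]: value for key, value in record.items() if key in mapping}
```

Once records from both sources use the same attribute names, a single display or edit view can handle them without knowing where each value came from.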
