EclipseLink/DesignDocs/Extensibility

From Eclipsepedia

Bug

See Bug 335601.

Extensible Entity Types

The goal of this feature is to allow a user to start with a predesigned persistence unit and add mappings to the entities in that persistence unit without the need for redeployment of the persistence unit.

  • Static extensions are defined for the extensible entity types before the application starts.
  • Dynamic extensions are defined while the application is running.

The set of features we build for Extensible Entity Types will provide a foundation that users building multi-tenant applications can build on to specify tenant-specific extensions.

Requirements

  • Provide a mechanism so that developers can develop and map a persistent entity type that supports static and/or dynamic extensions where the extension is an additional property on the entity type with a Java type declared for the property.
  • Persist Extension Definitions
    • Extension metadata must be stored
  • Support Static Extensions
    • At application deployment the extension definitions must be merged into the run-time's mappings based on externally provided information (XML or Relational)
  • Support Dynamic Extensions
    • Provide an API to support the application declaring extensions dynamically
    • Concurrent application instances must become aware of newly defined dynamic extension definitions

High Level Use Case

This feature set is designed to support a scenario where the bulk of an application is developed by one party and provided to a second party who can make certain types of modifications to that application.

Actors

  • Application Provider:
    • Provides the base application.
    • Has a strong understanding of the general domain
    • Has access to Java Development Skills
    • Has access to database development skills
  • Application User
    • Uses the base application
    • Strong understanding of general domain
    • Needs to extend the amount of data stored in the general domain
    • No Java Development Skills needed
    • No DBA skills needed

The Application Provider uses their understanding of the general problem domain to design an application that covers the majority of the functionality required by Application Users. That application contains a persistence component that maps the data required for the application. That persistence component is mapped in a JPA persistence unit. The Application Provider provides not only the application but also a management interface that allows the Application User to extend the data in the persistence unit.

The Application User makes use of the application provided by the Application Provider. When the Application User wants to expand the data set stored by the application beyond what the Application Provider has provided, they use the management interface to add to the mappings in the persistence unit. They are not required to write any Java code or SQL, and are not required to redeploy the application on the application server.

Application Development using Extensibility Feature Set

When we have completed our extensibility work, it will be possible for a persistence unit to be designed in two phases.

Application Provider Phase

The first phase will be nearly identical to the way a persistence unit is designed in EclipseLink today. In this phase of design the base schema, mappings, and properties of the persistence unit are defined. These parts of the persistence unit will exist for all running applications.

There will be some differences from typical persistence unit design, and they will potentially include:

  • Creation of additional tables in the schema
  • Adding additional columns to tables in the schema
  • Providing additional privileges for schema alteration
  • Specification of which Entities are extensible

Example

The user designs a system for processing orders. They define the following Entities:

  • Customer
    • @Id id
    • @Basic name
    • @OneToMany Orders
  • Order
    • @Id id
    • @OneToOne Item
    • @Basic quantity
    • @ManyToOne customer
  • Item
    • @Id id
    • @Basic name
    • @Basic unitPrice

The mappings are made using JPA annotations, orm.xml or a combination, in the same way as other JPA applications. A persistence.xml is provided to configure the persistence unit.

The user designs a database schema to hold their Entities. The schema includes the following tables:

  • CUSTOMER
    • INTEGER ID
    • VARCHAR NAME
  • ORDER
    • INTEGER ID
    • INTEGER ITEM_ID
    • INTEGER QUANTITY
    • INTEGER CUST_ID
  • ITEM
    • INTEGER ID
    • VARCHAR NAME
    • FLOAT PRICE

Depending on their extensibility strategy, they may define other columns in those tables, or other tables. This is discussed further in the Schema Design for Extensibility section below.

The persistence unit is assembled and deployed. Not only is an interface provided for actual application use, but a management interface is provided to allow extension of the persistence unit by the user.

Application User Phase

The second phase of design involves customization of the persistence unit. With the persistence unit already deployed, the user doing customization adds mappings to the persistence unit. Because the schema has been designed to accommodate the new mappings, EclipseLink can either make use of predefined tables and columns to store the data for the mappings, or alter the schema to add new columns to existing tables.

Example

The user of the order processing system above decides to extend it. They wish to add an "address" attribute to Customer.

The user calls the management interface. The management interface calls EclipseLink API to add a mapping to the metadata. EclipseLink adjusts the user's underlying session to make use of the new mapping without the need for redeployment on the application server. Future retrievals using Customer make use of the new "address" mapping.

Additional Use Case Support

The following use cases should also be addressed, but are not required to be supported without application shut-down.

  • Removal of mappings
  • Updating mappings

Schema Design for Extensibility

As mentioned above, one of the main differences in how an extensible application is designed as opposed to a typical application is in how the database schema is designed. There are several strategies that can be employed.

Flex Columns

The schema is designed to include preallocated columns that can be used to map additional data. In this example, the CUSTOMER table might look like this:

  • CUSTOMER
    • INTEGER ID
    • VARCHAR NAME
    • VARCHAR FLEX_COL1
    • VARCHAR FLEX_COL2
    • VARCHAR FLEX_COL3

An arbitrary number of the columns prefixed with "FLEX_COL" could be defined. A user mapping the "address" property of Customer could simply map it to FLEX_COL1.

In order to support multiple data types, all flex columns are defined as strings. Metadata about the expected type of each field is stored in the database. For each flex column, the following items are stored in a database table. The names of the tables and columns that store these items could be chosen by the Application Provider, so the items could all reside in one table or be spread across several tables.

  • Label (e.g. address)
  • Type (e.g. java.lang.String)

For instance, for the above example, an Application Provider might choose to define a single table containing the columns

  • CUSTOMER_METADATA
    • FLEX_COL1_LABEL
    • FLEX_COL1_TYPE
    • FLEX_COL2_LABEL
    • FLEX_COL2_TYPE
    • FLEX_COL3_LABEL
    • FLEX_COL3_TYPE

Alternatively, they might choose to define three tables:

  • CUSTOMER_COL1_METADATA
    • FLEX_COL_LABEL
    • FLEX_COL_TYPE
  • CUSTOMER_COL2_METADATA
    • FLEX_COL_LABEL
    • FLEX_COL_TYPE
  • CUSTOMER_COL3_METADATA
    • FLEX_COL_LABEL
    • FLEX_COL_TYPE

Initially these tables would be quite sparse, but with the addition of multi-tenant features they would become better populated.
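The bookkeeping behind flex columns can be sketched in plain Java. The following is a minimal in-memory illustration, not EclipseLink API: the class FlexColumnRegistry and its methods are hypothetical names, and it assumes the label and type of each flex column are held as they would be in a table such as CUSTOMER_METADATA.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory view of the metadata rows describing the flex columns. */
class FlexColumnRegistry {

    /** Label and Java type recorded for one flex column. */
    static final class Entry {
        final String label;
        final String javaType;

        Entry(String label, String javaType) {
            this.label = label;
            this.javaType = javaType;
        }
    }

    // Column name -> metadata; a null value means the column is still unused.
    private final Map<String, Entry> columns = new LinkedHashMap<String, Entry>();

    FlexColumnRegistry(int columnCount) {
        for (int i = 1; i <= columnCount; i++) {
            columns.put("FLEX_COL" + i, null);
        }
    }

    /** Assigns the first unused flex column to the extension and returns the column name. */
    String allocate(String label, String javaType) {
        if (columnFor(label) != null) {
            throw new IllegalArgumentException("Extension already defined: " + label);
        }
        for (Map.Entry<String, Entry> e : columns.entrySet()) {
            if (e.getValue() == null) {
                e.setValue(new Entry(label, javaType));
                return e.getKey();
            }
        }
        throw new IllegalStateException("No unused flex column left for " + label);
    }

    /** Returns the column mapped to the label, or null if no such extension exists. */
    String columnFor(String label) {
        for (Map.Entry<String, Entry> e : columns.entrySet()) {
            if (e.getValue() != null && e.getValue().label.equals(label)) {
                return e.getKey();
            }
        }
        return null;
    }
}
```

With three preallocated columns, allocating "address" would claim FLEX_COL1, and a fourth extension would fail because the columns are exhausted. That fixed capacity is the main trade-off of this strategy.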

Flex Tables

Extensions use EclipseLink's secondary table feature. Additional tables are provided that can be used for extensions.

For instance, as above, CUSTOMER table is designed as follows:

  • CUSTOMER
    • INTEGER ID
    • VARCHAR NAME

An additional table is provided in the database

  • CUSTOMER_EXT1
    • INTEGER CUST_ID
    • VARCHAR FLEX_COL1
    • VARCHAR FLEX_COL2
    • VARCHAR FLEX_COL3

When the first extended mapping is added, CUSTOMER_EXT1 is added as a secondary table for Customer. The address field can then be mapped to FLEX_COL1.

As in the above example an arbitrary number of columns prefixed with "FLEX_COL" could be provided.

In addition, each extended application could optionally provide its own extension table. As a result, applications would not be forced to share unused data with other applications using the same base application, nor would metadata about types have to be held, since each EXT table could have the appropriate types.

This is also supported by EclipseLink today, but we would want to consider adding DDL generation for this kind of scenario.

Custom Columns

The schema is designed with only the tables and columns used by the predefined mappings. A mechanism is built into EclipseLink to detect the metadata from the tables and alter those tables to add columns for the mappings that are added as extensions. EclipseLink must have permission to call ALTER TABLE.

e.g. Customer table is designed as follows:

  • CUSTOMER
    • INTEGER ID
    • VARCHAR NAME

When the user adds an "address" mapping for Customer, EclipseLink inspects the metadata from the CUSTOMER table and calls an ALTER TABLE statement to add an ADDRESS field to the CUSTOMER table.

This design requires some new capabilities in EclipseLink

  1. A way to detect existing metadata, preferably one that works with multiple DBs
  2. A way to construct and issue ALTER TABLE statements
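The two capabilities above can be sketched together. This is a minimal illustration under stated assumptions, not an EclipseLink implementation: the class CustomColumnDdl and its method are hypothetical, the set of existing column names is assumed to have been read beforehand (for example through the standard java.sql.DatabaseMetaData.getColumns call), and the exact ALTER TABLE syntax and type names vary between databases.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper that decides whether a column must be added for a new mapping. */
class CustomColumnDdl {

    /**
     * Returns the ALTER TABLE statement that adds the column, or an empty
     * Optional if the table already has a column with that name.
     * The sqlType is assumed to come from trusted mapping code, not end users.
     */
    static Optional<String> addColumnIfMissing(String table, Set<String> existingColumns,
                                               String column, String sqlType) {
        requireIdentifier(table);
        requireIdentifier(column);
        String wanted = column.toUpperCase(Locale.ROOT);
        for (String existing : existingColumns) {
            if (existing.toUpperCase(Locale.ROOT).equals(wanted)) {
                return Optional.empty(); // column already present, nothing to do
            }
        }
        return Optional.of("ALTER TABLE " + table + " ADD " + wanted + " " + sqlType);
    }

    // Identifiers cannot be bound as statement parameters, so restrict them
    // to a conservative character set before building the statement.
    private static void requireIdentifier(String name) {
        if (name == null || !name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Not a simple SQL identifier: " + name);
        }
    }
}
```

For the "address" example, a CUSTOMER table holding only ID and NAME would yield a statement adding an ADDRESS column, while a second request for the same column would yield nothing.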

Value Rows

Schema is designed with a table that represents a map structure for mappings. One Map table is used for all additional mappings.

e.g. Customer table is designed as follows:

  • CUSTOMER
    • INTEGER ID
    • VARCHAR NAME

An additional table is defined:

  • CUST_ATTR
    • NAME
    • VALUE
    • CUST_ID

We would likely also need a way of determining what metadata to expect. That could be represented in yet another table

  • CUST_METADATA
    • ATTR_NAME
    • ATTR_TYPE

If an "address" field is added to CUSTOMER, a new row could be added to the CUST_METADATA table with ATTR_NAME=address and ATTR_TYPE=VARCHAR. Any value for that attribute will be added to the CUST_ATTR table with NAME="address", CUST_ID set to the foreign key to the customer, and VALUE set to the value.

Queries for customer involve querying the CUST_METADATA table for a list of extensions, then the CUSTOMER table for the predefined mappings, and finally either multiple queries or multiple joins to the CUST_ATTR table for the extended mappings.

This design requires some new capabilities in EclipseLink:

  1. A mapping that allows multiple attributes of the same entity to be mapped to the same map.
  2. A way to store and retrieve the metadata we are storing
  3. We would want to consider DDL generation for the new tables
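The read path for Value Rows can be illustrated with an in-memory sketch. The class ValueRows, its nested AttrRow type, and the conversion rule are hypothetical and only assume that the CUST_ATTR and CUST_METADATA rows have already been fetched; they do not represent an existing EclipseLink mapping.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical assembly of extended attributes from CUST_ATTR rows. */
class ValueRows {

    /** One row of the CUST_ATTR table. */
    static final class AttrRow {
        final int custId;
        final String name;
        final String value;

        AttrRow(int custId, String name, String value) {
            this.custId = custId;
            this.name = name;
            this.value = value;
        }
    }

    /**
     * Collects the extended attributes of one customer. Only attributes
     * declared in CUST_METADATA (ATTR_NAME -> ATTR_TYPE) are returned.
     */
    static Map<String, Object> extensionsFor(int custId, List<AttrRow> custAttr,
                                             Map<String, String> custMetadata) {
        Map<String, Object> result = new LinkedHashMap<String, Object>();
        for (AttrRow row : custAttr) {
            if (row.custId != custId) {
                continue; // row belongs to another customer
            }
            String attrType = custMetadata.get(row.name);
            if (attrType == null) {
                continue; // no metadata for this name, so it is not a known extension
            }
            result.put(row.name, convert(row.value, attrType));
        }
        return result;
    }

    // Assumed conversion rule: INTEGER values are parsed, everything else stays a String.
    private static Object convert(String raw, String attrType) {
        if ("INTEGER".equals(attrType) && raw != null) {
            return Integer.valueOf(raw);
        }
        return raw;
    }
}
```

The sketch makes the cost of this strategy visible: every extended attribute is a separate row, so loading one Customer touches the metadata, the base row, and one CUST_ATTR row per extension.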

Extensible Descriptors

The next component of an extensible application is the ability to actually add new mappings. New mappings must be persistent, i.e. they must be stored in a persistent medium.

File-based-extension

It would be possible to enable extension by allowing the user to provide an additional orm.xml or eclipselink-orm.xml file that contains the additional mappings. That file could be allowed through a persistence unit property passed in when either the EntityManagerFactory or the EntityManager was created.

e.g. eclipselink.extended-mappings = <URL>

At processing-time this URL would be examined for an orm.xml file containing additional mappings and those additional mappings would be appended to existing descriptors.

A great deal of this strategy is already supported in EclipseLink. To allow this method of extension we would have to implement:

  1. Support for the property that points to the XML file.
  2. A mechanism to add mappings at either EntityManagerFactory deployment time, or EntityManager creation time

Persistent Extensions

Extended metadata information is stored in the database and retrieved immediately after login. A table structure like the one in the Value Rows section could be used to store the metadata.

To allow this method of extension we would have to implement

  1. A mechanism of storing and getting metadata information from a database table
  2. A mechanism to extend mappings in a descriptor based on the information in that table
  3. We might want a way to DDL generate that table

Entity classes

Given a descriptor for dynamic mappings, the entity it refers to needs fields that can store the actual data. We need to keep in mind the following two restrictions:

  1. Extension must occur without redeploy. This means any weaving we do, must occur before we know exactly what mappings will be added
  2. We want to avoid requiring the user to have any references to EclipseLink-specific classes in their domain classes.

Predefined attributes

Just as the database could hold predefined FLEX_COLUMNS, the Entities could hold predefined fields to hold the data. These fields would have a one-to-one correspondence to the FLEX_COLUMNS. The weakness of this approach is that we would have to predefine the Object type for each field. As a result, it would have to hold a composite data structure that could read a String value and apply conversion based on the expected type stored in the database.
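One possible shape for such a composite data structure is sketched below. The class name FlexValue and the list of supported types are assumptions made for illustration. Note that placing a class like this in the domain model would have to be reconciled with the restriction on EclipseLink-specific classes listed above.

```java
import java.math.BigDecimal;

/** Hypothetical holder for the content of one predefined flex attribute. */
class FlexValue {

    private final String raw;      // value as stored in the VARCHAR flex column
    private final String javaType; // expected type, as recorded in the metadata table

    FlexValue(String raw, String javaType) {
        this.raw = raw;
        this.javaType = javaType;
    }

    /** Converts the stored String to the expected Java type. */
    Object typedValue() {
        if (raw == null) {
            return null;
        }
        switch (javaType) {
            case "java.lang.String":
                return raw;
            case "java.lang.Integer":
                return Integer.valueOf(raw);
            case "java.lang.Long":
                return Long.valueOf(raw);
            case "java.lang.Boolean":
                return Boolean.valueOf(raw);
            case "java.math.BigDecimal":
                return new BigDecimal(raw);
            default:
                throw new IllegalArgumentException("Unsupported flex type: " + javaType);
        }
    }
}
```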

Attributes as a Map

The most obvious way to store a list of dynamically added attributes is in a Map. The map could be provided in the Entity class by the Application Provider and described in the metadata for the Entity, or it could be woven into the class for any Entity that allows dynamic mappings.

Name - Value Map

Here the key of the map is the name of the attribute and the value is its value.
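A minimal sketch of the name-value variant follows, assuming the map is written by the Application Provider rather than woven in. The accessor names get and set are illustrative and not a defined EclipseLink contract.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Sketch of an extensible entity that keeps its extended attributes in a map. */
class Customer {

    private int id;
    private String name;

    // Attribute name -> attribute value for all dynamically added attributes.
    private final Map<String, Object> extensions = new HashMap<String, Object>();

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() {
        return id;
    }

    String getName() {
        return name;
    }

    /** Returns the value of an extended attribute, or null if it is not set. */
    Object get(String attributeName) {
        return extensions.get(attributeName);
    }

    /** Sets the value of an extended attribute. */
    void set(String attributeName, Object value) {
        extensions.put(attributeName, value);
    }

    /** Read-only view of all extended attributes. */
    Map<String, Object> getExtensions() {
        return Collections.unmodifiableMap(extensions);
    }
}
```

The class contains no EclipseLink types, which keeps it in line with the second restriction listed under Entity classes.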

Metadata Holder - Value Map

Here the key of the map is a data structure that holds metadata about the attribute including its name, its data type and any other required information.
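The metadata-holder variant could be sketched as follows. The classes ExtensionAttribute and ExtensionValues are hypothetical. The sketch assumes the key carries only a name and a Java type and uses the type to reject values that do not match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical map key holding the metadata of one extended attribute. */
final class ExtensionAttribute {

    private final String name;
    private final Class<?> type;

    ExtensionAttribute(String name, Class<?> type) {
        this.name = Objects.requireNonNull(name);
        this.type = Objects.requireNonNull(type);
    }

    String getName() {
        return name;
    }

    Class<?> getType() {
        return type;
    }

    // Keys are value objects: two holders with the same name and type are equal,
    // so a freshly built key finds the value stored under an earlier one.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ExtensionAttribute)) {
            return false;
        }
        ExtensionAttribute that = (ExtensionAttribute) other;
        return name.equals(that.name) && type.equals(that.type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, type);
    }
}

/** Hypothetical container an entity could delegate to for its extended values. */
class ExtensionValues {

    private final Map<ExtensionAttribute, Object> values = new HashMap<ExtensionAttribute, Object>();

    /** Stores a value after checking it against the type declared in the key. */
    void put(ExtensionAttribute attribute, Object value) {
        if (value != null && !attribute.getType().isInstance(value)) {
            throw new IllegalArgumentException(
                attribute.getName() + " expects " + attribute.getType().getName());
        }
        values.put(attribute, value);
    }

    Object get(ExtensionAttribute attribute) {
        return values.get(attribute);
    }
}
```

Compared with the name-value map, this variant lets the entity validate values without a separate metadata lookup, at the cost of callers needing the key object rather than just a name.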

API

The majority of the API documentation should come in design documents for the individual features we choose from this document. This section will capture any general comments about the API

Descriptor Extension

For Persistent Extensions above, we could provide a sample persistence unit that maps to our metadata tables, and the API for changes would simply be the JPA API. Application Providers could choose to use that persistence unit, or write their own.

Descriptors would need to be extended to know about the data structure of the tables and could even use that same persistence unit to read extended mappings.

Metadata Access

We should provide an API to access any metadata for the extensions. This API should maintain the EclipseLink policy of not requiring the user to use any EclipseLink-specific classes in their implementation.

Perhaps the user could furnish an interface.

Persistence Unit Properties

Extensible Application Architectures

Note: This section starts to cross the line between our Extensibility feature set and any feature set we might develop for Multi-tenancy. It is included since we will have to keep in mind how applications are deployed when we figure out the limitations for adding mappings.

Deployment per client

In this type of deployment each client has their own application deployment. Database tables could be shared or in separate schemas.

EntityManagerFactory per Client

In this type of deployment, the same persistence unit is used by multiple clients. This persistence unit can be used by the same application, or by separate applications. Database schema can be shared or specified by each client.

EntityManager per Client

In this type of deployment, extensions are chosen at EntityManager creation time. An EntityManager contains constructs that let it make use of extended mappings that are not accessible to its EntityManagerFactory. Database schema can be shared or specified at EntityManager creation time.

Initial Feature Set And Limitations

This section will be updated after some discussion. At the moment, I am using it as a whiteboard and therefore it changes frequently.

The document above specifies a number of options for providing extensible functionality.

Feature List - initial proposal

  • DirectToMapMapping
    • Similar to the way attributes are mapped in Dynamic JPA
    • preexisting map in the Entity
    • map holds propertyName -> propertyValue mapping
  • Extensions to DirectToMapMapping to allow the DB side of the mapping to be a key-value structure as well
  • Extensible descriptor that can have mappings added post-deployment
  • Extensible descriptor that retrieves additional mappings from a given table in the database and adds them
  • Weaving of map-structure for DirectToMapMapping into objects with extensible descriptors
  • Metadata storage and retrieval strategy for mappings that are stored in the database that exposes the metadata to the customer
  • DDL Generation for Value Rows
  • Support for removal and alteration of extended mappings
  • Support for detection of current schema and calls to ALTER TABLE to amend the schema to fit mappings
  • DDL Generation for Flex Columns and Flex Tables

Limitations

  • Initial Target for all features: Deployment-per-Client Application architecture
  • Initial feature list will only consider adding Basic mappings

Multi-tenancy

Multi-tenancy is not covered by this document, but the features defined in this document could be used in Multi-tenant systems.

The following document predates this document and discusses some of the multi-tenant options in EclipseLink. It will eventually be altered to discuss any multi-tenant features we hope to implement.