Difference between revisions of "EclipseLink/DesignDocs/328404"

Revision as of 10:36, 4 November 2010

Design Specification: Persistence Unit Composition

This feature adds support for composing persistence units. A user can define multiple persistence units, each with a unique set of entity types, and expose them through a composite persistence unit that combines the classes across multiple data sources into a single persistence context.

ER 328404

Document History

Date        Committer        Description
2010/10/22  Andrei Ilitchev  Initial Version
2010/10/26  Doug Clarke      Updated Requirements and Open Issues

Project overview

The goal is to allow for different classes contained in a single persistence context to be mapped to different databases.

 (I do not think that is the goal, the goal is to expose two or more "existing databases" as a "single" persistence unit)
 James.sutherland.oracle.com 14:36, 4 November 2010 (UTC)
em.persist(new A(..));
em.persist(new Z(..));
// On flush, A is inserted into db1 and Z into db2;
// the two databases may come from different vendors.
em.flush();

Concepts

Terminology

  • Data Source: For the purposes of this discussion, a data source includes JTA and non-JTA data sources as well as EclipseLink's native JDBC connection pools.
  • Split PU: A PU whose database elements are located in multiple data sources but whose persistence.xml defines all entities together.
  • Aggregate PU: A PU that combines two or more other PUs to provide a persistence context that allows queries and transactions (with limitations) across the combined set of entity types.

Background

  • Currently each persistence unit has a single ServerSession.
    • A ServerSession is typically connected to a single database.
      • Even though a ServerSession may use connection pools that are connected to different databases, these databases must all share the same database platform.
        • That means it is impossible to use a ServerSession to map different classes to different types of databases (say, Oracle and MySQL).
  • EclipseLink core defines the SessionBroker class, which aggregates several ServerSessions.
    • A SessionBroker maps each class to a single ServerSession.
    • Each ServerSession is connected to a different database.
      • These ServerSessions are not required to share the same database platform.
      • Therefore different classes can be mapped to different databases, too.
  • This feature will define a persistence unit that has a SessionBroker instead of a ServerSession.
    • Let's call it a Container persistence unit (or SessionBroker persistence unit).
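
The class-to-session routing described in the list above can be pictured with a small model. This is only a sketch under stated assumptions, not EclipseLink code: `BrokerRoutingSketch`, `register`, and `sessionFor` are hypothetical names, and plain session names stand in for real ServerSession instances.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the class-to-session routing described above.
// It does not use or reproduce the real SessionBroker API.
final class BrokerRoutingSketch {
    // Each class name is owned by exactly one member session name.
    private final Map<String, String> sessionByClass = new HashMap<>();

    // Registers a class with a member session; a class may belong to only one session.
    void register(String className, String sessionName) {
        String previous = sessionByClass.putIfAbsent(className, sessionName);
        if (previous != null && !previous.equals(sessionName)) {
            throw new IllegalStateException(
                className + " is already mapped to session " + previous);
        }
    }

    // Returns the member session that owns the class, or fails if the class is unmapped.
    String sessionFor(String className) {
        String sessionName = sessionByClass.get(className);
        if (sessionName == null) {
            throw new IllegalArgumentException("No session maps " + className);
        }
        return sessionName;
    }

    public static void main(String[] args) {
        BrokerRoutingSketch broker = new BrokerRoutingSketch();
        broker.register("A", "oracleSession"); // illustrative session names
        broker.register("Z", "mysqlSession");
        System.out.println(broker.sessionFor("A"));
        System.out.println(broker.sessionFor("Z"));
    }
}
```

Because every class resolves to exactly one member session, the member sessions are free to use different database platforms.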

Requirements

# Description
1 Add support for a single persistence context accessing entities stored in different data sources (schemas/table-space/database) where application developers can perform queries and transactions across the complete set of entities transparently.
2 Support mapping relationships between entities in different data sources.
3 Provide easy to use JPA configuration.
4 Clearly capture the usage limitations and expected exceptions in the Javadoc and user documentation. The most basic example is the inability to join across tables in different data sources. This limitation affects mapping, query execution, and optimizations. These issues need to be identified and documented.
5 Support container managed and application bootstrap JPA usage with dynamic, static, and no weaving.

User Bugs

The following potentially related bugs should be taken into consideration in this feature work.

  • JPA enhancement requests.
    • bug 260258 - Add JPA EntityManagerFactoryBroker support
  • Core bugs
    • bug 269213 - SessionBroker handling of ExceptionHandler broken because stale copy/paste
    • bug 281569 - Integration of JPA using transaction-type="JTA" but without EJB container, set-up incomplete commit cleanup
    • bug 326649 - DatabaseSessionImpl.finalize() leads to ClassCastException

Design/Dev Requirements

Doug: I believe this is more implementation focused and should be moved into design section.

  • A SessionBroker persistence unit must be defined in persistence.xml (like any other persistence unit).
  • The two main use cases:
    • Unify multiple persistence units (with mutually independent object models).
      • The user starts with several independent persistence units and creates a new SessionBroker persistence unit that aggregates the ServerSessions defined in the original persistence units.
      • The original persistence units remain unchanged and can still be used individually.
      • Limitation: the original persistence units and the SessionBroker persistence unit must all be defined in a single jar file.
    • Split a single persistence unit.
      • The user starts with a single persistence unit and alters it to become a SessionBroker persistence unit.

Design Constraints

Design / Functionality

  • ServerSessions that are aggregated by a SessionBroker persistence unit are "invisible" outside of it:
    • they don't have EntityManagerFactories associated with them;
    • the EntityManagerSetupImpl instances used for their creation are not kept;
    • these sessions are not stored in SessionManager.
  • Implementation of the two main use cases:
    • Unify multiple persistence units (with mutually independent object models).
      • The SessionBroker is constructed from the existing member ServerSessions, and the member descriptors are copied to the SessionBroker.
        • This already works.
    • Split a single persistence unit.
      • The existing SessionBroker is split into new member ServerSessions, and descriptors and sequences are copied into the member ServerSessions.
        • This is new.
  • DatabaseSessions.
    • To accommodate SessionBroker, all the code in and around EntityManagerSetupImpl (EntityManagerFactoryImpl, EntityManagerImpl, etc.) needs to be changed:
      • the type of the session should be changed from ServerSession to DatabaseSessionImpl (already done in the prototype).
    • That opens the door to using DatabaseSessionImpl instead of ServerSession if desired (in both "simple" and SessionBroker persistence units).
    • A new property (eclipselink.jdbc.single-connection ?) with a boolean value (default false) is needed to support DatabaseSessionImpl.

Testing

API

Tooling

The Dali team should be made aware of this feature work and track an enhancement request to uptake it.

Config files

  • Proposed configuration of SessionBroker persistence unit is done entirely through Persistence Unit properties.
    • Properties extension usage and naming patterns.
// Property with set values
eclipselink.define.my-set-property.value1 -> ""
eclipselink.define.my-set-property.value2 -> ""
...
eclipselink.define.my-set-property.valueN -> ""
 
// Converted to a Set:
my-set-property = {value1, value2, ..., valueN};
 
// Suggested naming convention:
// let's use "eclipselink.define" prefix for these "value-less" properties.
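
The conversion from "value-less" properties to a set, as outlined above, could look roughly like the following. This is a minimal sketch assuming the proposed "eclipselink.define" naming convention; `SetPropertySketch` and `collect` are hypothetical names and not part of any EclipseLink API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper illustrating the proposed naming convention only.
final class SetPropertySketch {
    static final String PREFIX = "eclipselink.define.";

    // Collects value1..valueN from keys shaped like
    // "eclipselink.define.<setName>.<value>"; the property values are ignored.
    static Set<String> collect(Map<String, String> properties, String setName) {
        String keyPrefix = PREFIX + setName + ".";
        Set<String> values = new TreeSet<>();
        for (String key : properties.keySet()) {
            if (key.startsWith(keyPrefix) && key.length() > keyPrefix.length()) {
                values.add(key.substring(keyPrefix.length()));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of(
            "eclipselink.define.add-contained-pu.puABC", "",
            "eclipselink.define.add-contained-pu.puXYZ", "",
            "eclipselink.target-server", "WebLogic_10");
        // Prints the names of the contained persistence units.
        System.out.println(collect(props, "add-contained-pu"));
    }
}
```

Encoding the set member in the key rather than in the value keeps each entry a valid, unique persistence unit property.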


  • The two main use cases:
    • Unify multiple persistence units (with mutually independent object models).

There are two predefined independent persistence units: puABC (which maps classes A, B, C) and puXYZ (which maps classes X, Y, Z). None of the classes A, B, C references any of the classes X, Y, Z, and vice versa. Below are several examples of persistence.xml for a SessionBroker persistence unit.

<!--Scenario 1 - simply use puABC and puXYZ exactly as they were originally defined-->
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_1_0.xsd" version="1.0">
    <persistence-unit name="broker" transaction-type="RESOURCE_LOCAL">
        <provider>
            org.eclipse.persistence.jpa.PersistenceProvider
        </provider>
        <properties>
	    <!--Note that puABC must exist and be defined in the same jar file with this persistence.xml-->
	    <property name="eclipselink.define.add-contained-pu.puABC" value=""/>
	    <!--Note that puXYZ must exist and be defined in the same jar file with this persistence.xml-->
	    <property name="eclipselink.define.add-contained-pu.puXYZ" value=""/>
        </properties>
    </persistence-unit>
</persistence>

The result is a SessionBroker persistence unit named "broker" that aggregates two ServerSessions: one maps classes A, B, C; the other maps X, Y, Z.

<!--Scenario 2 - override some of the properties of puABC and puXYZ -->
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_1_0.xsd" version="1.0">
    <!-- note that there is no need to pass transaction type down to contained sessions - that will be done automatically -->
    <persistence-unit name="broker" transaction-type="JTA">
        <provider>
            org.eclipse.persistence.jpa.PersistenceProvider
        </provider>
        <properties>
            <!-- note that there is no need to pass server platform down to contained sessions - that will be done automatically -->
            <property name="eclipselink.target-server" value="WebLogic_10"/>
 
	    <!--Note that puABC must exist and be defined in the same jar file with this persistence.xml-->
	    <property name="eclipselink.define.add-contained-pu.puABC" value=""/>
	    <property name="javax.persistence.jtaDataSource.puABC" value="ElOracleJTA"/>
	    <property name="javax.persistence.nonJtaDataSource.puABC" value="ElOracle"/>
            <!-- Switching to data sources from user/password - have to remove user name-->
	    <property name="javax.persistence.jdbc.user.puABC" value=""/>
 
	    <!--Note that puXYZ must exist and be defined in the same jar file with this persistence.xml-->
	    <property name="eclipselink.define.add-contained-pu.puXYZ" value=""/>
	    <property name="javax.persistence.jtaDataSource.puXYZ" value="ElMySQLJTA"/>
	    <property name="javax.persistence.nonJtaDataSource.puXYZ" value="ElMySQL"/>
            <!-- Switching to data sources from user/password - have to remove user name-->
	    <property name="javax.persistence.jdbc.user.puXYZ" value=""/>
        </properties>
    </persistence-unit>
</persistence>

The result is a SessionBroker persistence unit named "broker" that aggregates two ServerSessions: one maps classes A, B, C; the other maps X, Y, Z. The original connection parameters in both member sessions are replaced with the ones provided in the SessionBroker properties.
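
One way to picture how suffixed broker properties replace a member's connection parameters is sketched below. It is illustrative only and rests on assumptions: `MemberOverrideSketch` and `effective` are hypothetical names, and treating an empty override value as "remove the property" is inferred from the "have to remove user name" comments in Scenario 2 rather than from any confirmed behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-member overrides; not EclipseLink code.
final class MemberOverrideSketch {
    // Builds the effective properties of one member persistence unit:
    // start from its own properties, then apply broker properties whose key
    // ends with ".<puName>". ASSUMPTION: an empty override value removes the
    // property instead of setting it to an empty string.
    static Map<String, String> effective(Map<String, String> memberProps,
                                         Map<String, String> brokerProps,
                                         String puName) {
        Map<String, String> result = new HashMap<>(memberProps);
        String suffix = "." + puName;
        for (Map.Entry<String, String> e : brokerProps.entrySet()) {
            String key = e.getKey();
            // "eclipselink.define.*" keys declare membership; they are not overrides.
            if (!key.endsWith(suffix) || key.startsWith("eclipselink.define.")) {
                continue;
            }
            String baseKey = key.substring(0, key.length() - suffix.length());
            if (e.getValue().isEmpty()) {
                result.remove(baseKey);
            } else {
                result.put(baseKey, e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> puABC = Map.of(
            "javax.persistence.jdbc.user", "MyUser");
        Map<String, String> broker = Map.of(
            "eclipselink.define.add-contained-pu.puABC", "",
            "javax.persistence.jtaDataSource.puABC", "ElOracleJTA",
            "javax.persistence.jdbc.user.puABC", "");
        System.out.println(effective(puABC, broker, "puABC"));
    }
}
```

Properties suffixed with a different member name are ignored, so each member sees only its own overrides.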

    • Split a single persistence unit.

There is a single predefined persistence unit: puABCXYZ (which maps classes A, B, C, X, Y, Z). There can be any sort of references between the classes. Below is an example that contains the persistence.xml of the original persistence unit and of the SessionBroker persistence unit.

<!--Original persistence unit -->
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_1_0.xsd" version="1.0">
    <persistence-unit name="puABCXYZ" transaction-type="RESOURCE_LOCAL">
        <provider>
            org.eclipse.persistence.jpa.PersistenceProvider
        </provider>
        <exclude-unlisted-classes>false</exclude-unlisted-classes>
        <properties>
	    <property name="javax.persistence.jdbc.user" value="MyUser"/>
	    <property name="javax.persistence.jdbc.password" value="MyPassword"/>
            <property name="javax.persistence.jdbc.driver" value="oracle.jdbc.OracleDriver"/>
            <property name="javax.persistence.jdbc.url" value="jdbc:oracle:thin:@qaott11.ca.oracle.com:1521:toplink"/>
        </properties>
    </persistence-unit>
</persistence>
<!--SessionBroker persistence unit -->
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_1_0.xsd" version="1.0">
    <persistence-unit name="puABCXYZ" transaction-type="RESOURCE_LOCAL">
        <provider>
            org.eclipse.persistence.jpa.PersistenceProvider
        </provider>
        <exclude-unlisted-classes>false</exclude-unlisted-classes>
        <properties>
	    <property name="javax.persistence.jdbc.user" value="MyUser"/>
	    <property name="javax.persistence.jdbc.password" value="MyPassword"/>
            <property name="javax.persistence.jdbc.driver" value="oracle.jdbc.OracleDriver"/>
            <property name="javax.persistence.jdbc.url" value="jdbc:oracle:thin:@qaott11.ca.oracle.com:1521:toplink"/>
 
	    <!--Note that puXYZ does not exist - it's used as an alias for the new ServerSession to be created-->
            <!--either provide full class name-->
	    <property name="eclipselink.split-contained-pu.package.X" value="puXYZ"/>
	    <property name="eclipselink.split-contained-pu.package.Y" value="puXYZ"/>
            <!--or entity name-->
	    <property name="eclipselink.split-contained-pu.Z" value="puXYZ"/>
	    <property name="javax.persistence.jdbc.user.puXYZ" value="MyUser2"/>
	    <property name="javax.persistence.jdbc.password.puXYZ" value="MyPassword2"/>
            <property name="javax.persistence.jdbc.driver.puXYZ" value="com.mysql.jdbc.Driver"/>
            <property name="javax.persistence.jdbc.url.puXYZ" value="jdbc:mysql://qaott51.ca.oracle.com:3306/MyUser2"/>
        </properties>
    </persistence-unit>
</persistence>

The result is a SessionBroker persistence unit that aggregates two ServerSessions: one maps classes A, B, C; the other maps X, Y, Z. The ServerSession that maps classes X, Y, Z is created using all the properties defined in the persistence unit, overridden by the values designated for puXYZ. For all the remaining classes (in this case A, B, C) the ServerSession is created with all the original properties.
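
The grouping of classes into member sessions implied by the split properties above can be sketched as follows. This is a hedged illustration: `SplitGroupingSketch`, `group`, and the `<default>` label are hypothetical, and the sketch looks each class up by one identifier only, leaving out the resolution of entity names versus full class names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of how split properties partition the mapped classes.
final class SplitGroupingSketch {
    static final String SPLIT_PREFIX = "eclipselink.split-contained-pu.";
    // Hypothetical label for the session that keeps every class not split out.
    static final String DEFAULT_GROUP = "<default>";

    // Assigns each class to the alias named by its split property,
    // or to the default group when no split property mentions it.
    static Map<String, List<String>> group(List<String> classes,
                                           Map<String, String> properties) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String cls : classes) {
            String alias = properties.getOrDefault(SPLIT_PREFIX + cls, DEFAULT_GROUP);
            groups.computeIfAbsent(alias, k -> new ArrayList<>()).add(cls);
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of(
            "eclipselink.split-contained-pu.X", "puXYZ",
            "eclipselink.split-contained-pu.Y", "puXYZ",
            "eclipselink.split-contained-pu.Z", "puXYZ");
        System.out.println(group(List.of("A", "B", "C", "X", "Y", "Z"), props));
    }
}
```

Each resulting group corresponds to one member ServerSession; the alias group would then pick up its connection settings from the properties carrying the same alias as a suffix.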

  • Mix the two main use cases.

Let's create a SessionBroker persistence unit from the original puABC by splitting off class C and adding puXYZ. Below is an example that contains the persistence.xml of the original persistence unit and of the SessionBroker persistence unit.

<!--Original persistence unit -->
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_1_0.xsd" version="1.0">
    <persistence-unit name="puABC" transaction-type="RESOURCE_LOCAL">
        <provider>
            org.eclipse.persistence.jpa.PersistenceProvider
        </provider>
        <class>A</class>
        <class>B</class>
        <class>C</class>
        <properties>
	    <property name="javax.persistence.jdbc.user" value="MyUser"/>
	    <property name="javax.persistence.jdbc.password" value="MyPassword"/>
            <property name="javax.persistence.jdbc.driver" value="oracle.jdbc.OracleDriver"/>
            <property name="javax.persistence.jdbc.url" value="jdbc:oracle:thin:@qaott11.ca.oracle.com:1521:toplink"/>
        </properties>
    </persistence-unit>
</persistence>
<!--SessionBroker persistence unit -->
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_1_0.xsd" version="1.0">
    <persistence-unit name="puABCXYZ" transaction-type="RESOURCE_LOCAL">
        <provider>
            org.eclipse.persistence.jpa.PersistenceProvider
        </provider>
        <class>A</class>
        <class>B</class>
        <class>C</class>
        <properties>
	    <property name="javax.persistence.jdbc.user" value="MyUser"/>
	    <property name="javax.persistence.jdbc.password" value="MyPassword"/>
            <property name="javax.persistence.jdbc.driver" value="oracle.jdbc.OracleDriver"/>
            <property name="javax.persistence.jdbc.url" value="jdbc:oracle:thin:@qaott11.ca.oracle.com:1521:toplink"/>
 
	    <!--Note that puC does not exist - it's used as an alias for the new ServerSession to be created-->
            <!--either provide full class name or entity name-->
	    <property name="eclipselink.split-contained-pu.C" value="puC"/>
	    <property name="javax.persistence.jdbc.user.puC" value="MyUser2"/>
	    <property name="javax.persistence.jdbc.password.puC" value="MyPassword2"/>
            <property name="javax.persistence.jdbc.driver.puC" value="com.mysql.jdbc.Driver"/>
            <property name="javax.persistence.jdbc.url.puC" value="jdbc:mysql://qaott51.ca.oracle.com:3306/MyUser2"/>
 
	    <!--Note that puXYZ must exist and be defined in the same jar file with this persistence.xml-->
	    <property name="eclipselink.define.add-contained-pu.puXYZ" value=""/>
        </properties>
    </persistence-unit>
</persistence>

The result is a SessionBroker persistence unit that aggregates three ServerSessions: one maps classes A, B; another maps C; the third maps X, Y, Z. The ServerSession that maps classes X, Y, Z is created using puXYZ. The ServerSession that maps class C is created using all the properties defined in the persistence unit, overridden by the values designated for puC. For all the remaining classes (in this case A and B) the ServerSession is created with all the original properties.

Documentation

Open Issues

This section lists the open issues that are still pending that must be decided prior to fully implementing this project's requirements.

Issue # Description / Notes
1 Should this feature include both split and aggregated persistence units?
2 In the case of split persistence units where should the additional data sources be defined? PU properties in persistence.xml, eclipselink-orm.xml, both? Since it is possible for different data sources to be different database vendors or versions it should be possible to configure all JDBC/connection/pool level options on each data source.
3 Should this feature include support for customizing additional data sources with persistence unit properties?
4 In the case of aggregate persistence units, should the member persistence units be usable on their own?

Decisions

This section lists decisions made. These are intended to document the resolution of open issues or constraints added to the project that are important.

Issue # Description / Notes Decision

Future Considerations

During the research for this project the following items were identified as out of scope but are captured here as potential future enhancements. If agreed upon during the review process these should be logged in the bug system.

1. It looks like this project only considers enabling the existing session broker support via JPA, i.e. enabling the application to maintain different tables in different database instances. Neither the concept overview nor the requirements give any hint of enabling the application to define strategies for saving different sets of data in different database instances, so that each database instance contains the complete data matching a particular criterion. These days the more commonly practised database partitioning method is the latter, popularly called sharding.

It would be better to consider supporting this feature too, since an attempt is being made to overhaul the SessionBroker support in EclipseLink. In this method, every entity belongs to all the persistence units; however, EclipseLink has to consult the defined criteria (range based or list based) while saving or retrieving entities from any of the configured database instances. One of the complexities that may need to be addressed is how to handle the changes when a new database instance is added to participate in the partitioning, especially if the data needs to be reordered across the database instances because the newly added instance falls in an intermediate range.
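
To make the range-based criterion and the reordering concern concrete, here is a minimal sketch. It is purely illustrative and assumes numeric entity ids; `RangeShardSketch`, `addShard`, and `shardFor` are hypothetical names, and nothing here reflects an existing or planned EclipseLink API.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical range-based routing of entity ids to database instances.
final class RangeShardSketch {
    // Maps the inclusive lower bound of each id range to a database name.
    private final TreeMap<Long, String> shardByLowerBound = new TreeMap<>();

    // Adds a database instance that owns ids from lowerBound up to the next bound.
    void addShard(long lowerBound, String databaseName) {
        shardByLowerBound.put(lowerBound, databaseName);
    }

    // Finds the instance whose range contains the id.
    String shardFor(long id) {
        Map.Entry<Long, String> entry = shardByLowerBound.floorEntry(id);
        if (entry == null) {
            throw new IllegalArgumentException("No shard covers id " + id);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RangeShardSketch shards = new RangeShardSketch();
        shards.addShard(0L, "db1");
        shards.addShard(1000L, "db2");
        System.out.println(shards.shardFor(500L));
        // Adding an instance in an intermediate range changes the owner of
        // existing ids, so rows already stored elsewhere would have to move.
        shards.addShard(400L, "db3");
        System.out.println(shards.shardFor(500L));
    }
}
```

The second lookup shows the complexity mentioned above: once an intermediate range is introduced, ids that previously resolved to one instance resolve to the new one, and the stored data has to be redistributed to match.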
