Jump to: navigation, search

Difference between revisions of "EclipseLink/Examples/JPA/Partitioning"

(Scaling the Database with Data Partitioning)
 
(3 intermediate revisions by the same user not shown)
Line 18: Line 18:
 
* [http://www.eclipse.org/eclipselink/api/2.2/org/eclipse/persistence/annotations/RoundRobinPartitioning.html @RoundRobinPartitioning] - load-balance requests to a set of connection pools/database nodes.
 
* [http://www.eclipse.org/eclipselink/api/2.2/org/eclipse/persistence/annotations/RoundRobinPartitioning.html @RoundRobinPartitioning] - load-balance requests to a set of connection pools/database nodes.
 
* [http://www.eclipse.org/eclipselink/api/2.2/org/eclipse/persistence/annotations/UnionPartitioning.html @UnionPartitioning] - union the results of a request to a set of connection pools/database nodes.
 
* [http://www.eclipse.org/eclipselink/api/2.2/org/eclipse/persistence/annotations/UnionPartitioning.html @UnionPartitioning] - union the results of a request to a set of connection pools/database nodes.
 +
* [http://www.eclipse.org/eclipselink/api/2.2/org/eclipse/persistence/annotations/Partitioning.html @Partitioning] - provide a custom partitioning policy.
 +
 +
Partitioning can be set at the session, entity, relationship and query level.
 +
Partitioning is supported on any database, include both clustered databases such as Oracle RAC, and standard databases such as MySQL.
 +
Fail-over is supported through EclipseLink's connection pooling support by allowing a set of fail-over connection pools for each connection pool.
 +
 +
==Example: Enabling Partitioning==
 +
This example shows Order being partitioned using hash partitioning by its Id.
 +
 +
<source lang="java">
 +
@Entity
 +
@Table(name="PART_ORDER")
 +
@HashPartitioning(name="HashPartitionByOrderId", partitionColumn=@Column(name="ORDER_ID"), connectionPools={"default","node2"})
 +
@Partitioned("HashPartitionByOrderId")
 +
public class Order {
 +
    @Id
 +
    @GeneratedValue(strategy=GenerationType.TABLE)
 +
    @Column(name="ORDER_ID")
 +
    private long id;
 +
    @Basic
 +
    private String description;
 +
    @Basic
 +
    private BigDecimal totalCost = BigDecimal.valueOf(0);
 +
    @OneToMany(mappedBy="order", cascade=CascadeType.ALL, orphanRemoval=true)
 +
    @OrderBy("lineNumber")
 +
    private List<OrderLine> orderLines = new ArrayList<OrderLine>();
 +
    @ManyToOne(fetch=FetchType.LAZY)
 +
    private Customer customer;
 +
}
 +
</source>
 +
 +
To enable partitioning the "node2" connection pool also must be defined.
 +
This is done in the persistence.xml or persistence unit properties.
 +
 +
<source lang="xml">
 +
<persistence-unit name="order" transaction-type="RESOURCE_LOCAL">
 +
        <class>model.Order</class>
 +
        <class>model.OrderLine</class>
 +
        <class>model.Customer</class>
 +
        <properties>
 +
            <property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver" />
 +
            <property name="javax.persistence.jdbc.url" value="jdbc:mysql://node1:3306/mysql" />
 +
            <property name="javax.persistence.jdbc.user" value="root" />
 +
            <property name="javax.persistence.jdbc.password" value="password" />
 +
 +
            <!-- Configure default connection pool. -->
 +
            <property name="eclipselink.connection-pool.default.initial" value="1" />
 +
            <property name="eclipselink.connection-pool.default.min" value="64" />
 +
            <property name="eclipselink.connection-pool.default.max" value="64" />
 +
           
 +
            <!-- Configure 2nd database connection pool. -->
 +
            <property name="eclipselink.connection-pool.node2.url" value="jdbc:mysql://node2:3306/mysql" />
 +
            <property name="eclipselink.connection-pool.node2.user" value="root" />
 +
            <property name="eclipselink.connection-pool.node2.password" value="password" />
 +
            <property name="eclipselink.connection-pool.node2.initial" value="1" />
 +
            <property name="eclipselink.connection-pool.node2.min" value="64" />
 +
            <property name="eclipselink.connection-pool.node2.max" value="64" />
 +
           
 +
            <!-- Default partioning to replication to allow DDL to be sent to all nodes -->
 +
            <property name="eclipselink.partitioning" value="Replicate" />
 +
        </properties>
 +
</persistence-unit>
 +
</source>
 +
 +
Note that partitioning does not require RESOURCE_LOCAL and can be used with a DataSource.  Just specify the "eclipselink.connection-pool.node2.jtaDataSource", or "eclipselink.connection-pool.node2.nonJtaDataSource" instead of the url.  When using a DataSource the user, password, initial, min, max are also not required nor relevant.
 +
 +
When using partitioning is is normally a good idea to set a default partitioning policy for the persistence unit.  This is set in the above example using "eclipselink.partitioning".  By setting the default policy to a replication policy (defined through annotations) this will enable schema generation and other non-partitioned operation to be applied to all databases, which is normally convenient.
 +
 +
For the full example see, [http://dev.eclipse.org/svnroot/rt/org.eclipse.persistence/trunk/examples/org.eclipse.persistence.example.jpa.partitioned/org.eclipse.persistence.example.jpa.partitioned.zip download link] - [http://dev.eclipse.org/svnroot/rt/org.eclipse.persistence/trunk/examples/org.eclipse.persistence.example.jpa.partitioned/ SVN]
 +
 +
See also,
 +
* [http://java-persistence-performance.blogspot.com/2011/05/data-partitioning-scaling-database.html Data Partitioning - Scaling the Database (Blog)].
  
 
[[Category:EclipseLink/Example/JPA]]
 
[[Category:EclipseLink/Example/JPA]]

Latest revision as of 12:34, 4 May 2011

Scaling the Database with Data Partitioning

Data partitioning can be used to scale an application across multiple database machines, or with a clustered database such as Oracle RAC.

Partitioning splits your data across each of the database nodes. There is horizontal partitioning, and vertical partitioning. Vertical partitioning splits your data by class or table across multiple database nodes.

For horizontal partitioning each database node will have the same tables, but each node's table will only store part of the data. You can partition the data by the data values, such as range partitioning, value partitioning, hash partitioning or a custom partitioning policy.

Data replication can be used to backup data, for fault tolerance and fail-over, or for load balancing and scaling the database.

EclipseLink supports data partitioning, replication, load-balancing and fail-over. Several different partitioning policies are supported:

  • @HashPartitioning - partition data by hashing a field value into a set of connection pools/database nodes.
  • @ValuePartitioning - partition data by mapping the value of a field to a different connection pool/database nodes.
  • @RangePartitioning - partition data by mapping a range of values of a field to a different connection pool/database node.
  • @PinnedPartitioning - pin data to a specific connection pool/database node (vertical partitioning).
  • @ReplicationPartitioning - replicate data to a set of connection pools/database nodes.
  • @RoundRobinPartitioning - load-balance requests to a set of connection pools/database nodes.
  • @UnionPartitioning - union the results of a request to a set of connection pools/database nodes.
  • @Partitioning - provide a custom partitioning policy.

Partitioning can be set at the session, entity, relationship and query level. Partitioning is supported on any database, include both clustered databases such as Oracle RAC, and standard databases such as MySQL. Fail-over is supported through EclipseLink's connection pooling support by allowing a set of fail-over connection pools for each connection pool.

Example: Enabling Partitioning

This example shows Order being partitioned using hash partitioning by its Id.

@Entity
@Table(name="PART_ORDER")
@HashPartitioning(name="HashPartitionByOrderId", partitionColumn=@Column(name="ORDER_ID"), connectionPools={"default","node2"})
@Partitioned("HashPartitionByOrderId")
public class Order {
    @Id
    @GeneratedValue(strategy=GenerationType.TABLE)
    @Column(name="ORDER_ID")
    private long id;
    @Basic
    private String description;
    @Basic
    private BigDecimal totalCost = BigDecimal.valueOf(0);
    @OneToMany(mappedBy="order", cascade=CascadeType.ALL, orphanRemoval=true)
    @OrderBy("lineNumber")
    private List<OrderLine> orderLines = new ArrayList<OrderLine>();
    @ManyToOne(fetch=FetchType.LAZY)
    private Customer customer;
}

To enable partitioning the "node2" connection pool also must be defined. This is done in the persistence.xml or persistence unit properties.

<persistence-unit name="order" transaction-type="RESOURCE_LOCAL">
        <class>model.Order</class>
        <class>model.OrderLine</class>
        <class>model.Customer</class>
        <properties>
            <property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver" />
            <property name="javax.persistence.jdbc.url" value="jdbc:mysql://node1:3306/mysql" />
            <property name="javax.persistence.jdbc.user" value="root" />
            <property name="javax.persistence.jdbc.password" value="password" />
 
            <!-- Configure default connection pool. -->
            <property name="eclipselink.connection-pool.default.initial" value="1" />
            <property name="eclipselink.connection-pool.default.min" value="64" />
            <property name="eclipselink.connection-pool.default.max" value="64" />
 
            <!-- Configure 2nd database connection pool. -->
            <property name="eclipselink.connection-pool.node2.url" value="jdbc:mysql://node2:3306/mysql" />
            <property name="eclipselink.connection-pool.node2.user" value="root" />
            <property name="eclipselink.connection-pool.node2.password" value="password" />
            <property name="eclipselink.connection-pool.node2.initial" value="1" />
            <property name="eclipselink.connection-pool.node2.min" value="64" />
            <property name="eclipselink.connection-pool.node2.max" value="64" />
 
            <!-- Default partioning to replication to allow DDL to be sent to all nodes -->
            <property name="eclipselink.partitioning" value="Replicate" />
        </properties>
</persistence-unit>

Note that partitioning does not require RESOURCE_LOCAL and can be used with a DataSource. Just specify the "eclipselink.connection-pool.node2.jtaDataSource", or "eclipselink.connection-pool.node2.nonJtaDataSource" instead of the url. When using a DataSource the user, password, initial, min, max are also not required nor relevant.

When using partitioning is is normally a good idea to set a default partitioning policy for the persistence unit. This is set in the above example using "eclipselink.partitioning". By setting the default policy to a replication policy (defined through annotations) this will enable schema generation and other non-partitioned operation to be applied to all databases, which is normally convenient.

For the full example see, download link - SVN

See also,