Scaling the Database with Data Partitioning

Data partitioning can be used to scale an application across multiple database machines, or with a clustered database such as Oracle RAC.

Partitioning splits your data across each of the database nodes. There is horizontal partitioning, and vertical partitioning. Vertical partitioning splits your data by class or table across multiple database nodes.

For horizontal partitioning each database node will have the same tables, but each node's table will only store part of the data. You can partition the data by the data values, such as range partitioning, value partitioning, hash partitioning or a custom partitioning policy.

Data replication can be used to backup data, for fault tolerance and fail-over, or for load balancing and scaling the database.

EclipseLink supports data partitioning, replication, load-balancing and fail-over. Several different partitioning policies are supported:

@HashPartitioning - partition data by hashing a field value into a set of connection pools/database nodes.
@ValuePartitioning - partition data by mapping the value of a field to a different connection pool/database nodes.
@RangePartitioning - partition data by mapping a range of values of a field to a different connection pool/database node.
@PinnedPartitioning - pin data to a specific connection pool/database node (vertical partitioning).
@ReplicationPartitioning - replicate data to a set of connection pools/database nodes.
@RoundRobinPartitioning - load-balance requests to a set of connection pools/database nodes.
@UnionPartitioning - union the results of a request to a set of connection pools/database nodes.
@Partitioning - provide a custom partitioning policy.

Partitioning can be set at the session, entity, relationship and query level. Partitioning is supported on any database, include both clustered databases such as Oracle RAC, and standard databases such as MySQL. Fail-over is supported through EclipseLink's connection pooling support by allowing a set of fail-over connection pools for each connection pool.

Example: Enabling Partitioning

This example shows Order being partitioned using hash partitioning by its Id.

@Entity
@Table(name="PART_ORDER")
@HashPartitioning(name="HashPartitionByOrderId", partitionColumn=@Column(name="ORDER_ID"), connectionPools={"default","node2"})
@Partitioned("HashPartitionByOrderId")
public class Order {
    @Id
    @GeneratedValue(strategy=GenerationType.TABLE)
    @Column(name="ORDER_ID")
    private long id;
    @Basic
    private String description;
    @Basic
    private BigDecimal totalCost = BigDecimal.valueOf(0);
    @OneToMany(mappedBy="order", cascade=CascadeType.ALL, orphanRemoval=true)
    @OrderBy("lineNumber")
    private List<OrderLine> orderLines = new ArrayList<OrderLine>();
    @ManyToOne(fetch=FetchType.LAZY)
    private Customer customer;
}

To enable partitioning the "node2" connection pool also must be defined. This is done in the persistence.xml or persistence unit properties.

<persistence-unit name="order" transaction-type="RESOURCE_LOCAL">
        <class>model.Order</class>
        <class>model.OrderLine</class>
        <class>model.Customer</class>
        <properties>
            <property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver" />
            <property name="javax.persistence.jdbc.url" value="jdbc:mysql://node1:3306/mysql" />
            <property name="javax.persistence.jdbc.user" value="root" />
            <property name="javax.persistence.jdbc.password" value="password" />
 
            <!-- Configure default connection pool. -->
            <property name="eclipselink.connection-pool.default.initial" value="1" />
            <property name="eclipselink.connection-pool.default.min" value="64" />
            <property name="eclipselink.connection-pool.default.max" value="64" />
 
            <!-- Configure 2nd database connection pool. -->
            <property name="eclipselink.connection-pool.node2.url" value="jdbc:mysql://node2:3306/mysql" />
            <property name="eclipselink.connection-pool.node2.user" value="root" />
            <property name="eclipselink.connection-pool.node2.password" value="password" />
            <property name="eclipselink.connection-pool.node2.initial" value="1" />
            <property name="eclipselink.connection-pool.node2.min" value="64" />
            <property name="eclipselink.connection-pool.node2.max" value="64" />
 
            <!-- Default partioning to replication to allow DDL to be sent to all nodes -->
            <property name="eclipselink.partitioning" value="Replicate" />
        </properties>
</persistence-unit>

Note that partitioning does not require RESOURCE_LOCAL and can be used with a DataSource. Just specify the "eclipselink.connection-pool.node2.jtaDataSource", or "eclipselink.connection-pool.node2.nonJtaDataSource" instead of the url. When using a DataSource the user, password, initial, min, max are also not required nor relevant.

When using partitioning is is normally a good idea to set a default partitioning policy for the persistence unit. This is set in the above example using "eclipselink.partitioning". By setting the default policy to a replication policy (defined through annotations) this will enable schema generation and other non-partitioned operation to be applied to all databases, which is normally convenient.

For the full example see, download link - SVN

Notice: This Wiki is now read only and edits are no longer possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.

EclipseLink/Examples/JPA/Partitioning

Scaling the Database with Data Partitioning

Example: Enabling Partitioning

Breadcrumbs

Notice: This Wiki is now read only and edits are no longer possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.

EclipseLink/Examples/JPA/Partitioning

Scaling the Database with Data Partitioning

Example: Enabling Partitioning