
Difference between revisions of "EclipseLink/UserGuide/JPA/Advanced JPA Development/Data Partitioning"

|api=y
|apis=
* [http://www.eclipse.org/eclipselink/api/latest/org/eclipse/persistence/descriptors/partitioning/PartitioningPolicy.html PartitioningPolicy]
* [http://www.eclipse.org/eclipselink/api/latest/org/eclipse/persistence/descriptors/partitioning/package-summary.html Package partitioning]
* [http://www.eclipse.org/eclipselink/api/latest/org/eclipse/persistence/sessions/Session.html Session]
* [http://www.eclipse.org/eclipselink/api/latest/org/eclipse/persistence/descriptors/ClassDescriptor.html ClassDescriptor]
* [http://www.eclipse.org/eclipselink/api/latest/org/eclipse/persistence/queries/DatabaseQuery.html DatabaseQuery]

Revision as of 16:11, 25 January 2011

EclipseLink JPA


Data Partitioning

This section is in progress...


With data partitioning, you can subdivide a database table, index or index-organized table into smaller units. That makes it possible to manage and access those objects at a finer level of granularity, thereby improving manageability, performance, and availability. For example, data partitioning facilitates load-balancing and replicating data across multiple different databases or across a database cluster.

You configure data partitioning using partitioning policies. The different kinds of policies are:

  • CustomPartitioningPolicy - Defines a user-defined partitioning policy. Used by metadata to defer class loading until initialization.
  • FieldPartitioningPolicy - Partitions access to a database cluster by a field value from the object, such as the object's ID, location, or tenant. All write or read requests for objects with that value are sent to the same server. If a query does not include the field as a parameter, it can either be sent to all servers and unioned or left to the session's default behavior.
  • HashPartitionPolicy - Partitions access to a database cluster by the hash of a field value from the object, such as the object's location or tenant. The hash indexes into the list of connection pools. All write or read requests for objects with that hash value are sent to the same server. If a query does not include the field as a parameter, it can either be sent to all servers and unioned or left to the session's default behavior.


  • PartitioningPolicy - Partitions the data for a class across multiple different databases or across a database cluster such as Oracle RAC. Partitioning can provide improved scalability by allowing multiple database machines to service requests. (If multiple partitions are used to process a single transaction, JTA should be used for proper XA transaction support.)
  • PinnedPartitioningPolicy - Pins requests to a single connection pool.
  • RangePartitionPolicy - Partitions access to a database cluster by a field value from the object, such as the object's ID, location, or tenant. Each server is assigned a range of values. All write or read requests for objects with a value in that range are sent to that server. If a query does not include the field as a parameter, it can either be sent to all servers and unioned or left to the session's default behavior.
  • ReplicationPartitioningPolicy - Sends requests to a set of connection pools. This policy is for replicating data across a cluster of database machines. Only modification queries are replicated.
  • RoundRobinPolicy - Sends requests in a round-robin fashion to the set of connection pools. It is for load balancing read queries across a cluster of database machines. It requires that the full database be replicated on each machine, so it does not support partitioning. The data should either be read-only, or writes should be replicated on the database.
  • UnionPartitionPolicy - Sends queries to all connection pools and unions the results. This is for queries or relationships that span partitions when partitioning is used, such as on a ManyToMany cross partition relationship.
  • ValuePartitionPolicy - Partitions access to a database cluster by a field value from the object, such as the object's location or tenant. Each value is assigned a specific server. All write or read requests for objects with that value are sent to the server. If a query does not include the field as a parameter, then it can be sent to all servers and unioned, or it can be left to the session's default behavior.
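
The routing rules described in the list above can be illustrated with a small, self-contained sketch. This is not EclipseLink code: the class PartitionRoutingSketch and its Range, hashPool, rangePool, and RoundRobin members are hypothetical names invented for illustration, and the treatment of range bounds as inclusive is an assumption. It only shows how a field value could be mapped to a connection pool name by hash or by range, and how a round-robin policy could cycle through pools.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

/** Conceptual sketch only: hypothetical helpers, not EclipseLink classes. */
public class PartitionRoutingSketch {

    /** A range of field values mapped to a pool (bounds assumed inclusive); a null end means no upper bound. */
    public static final class Range {
        final String pool;
        final int start;
        final Integer end;

        public Range(String pool, int start, Integer end) {
            this.pool = pool;
            this.start = start;
            this.end = end;
        }

        boolean contains(int value) {
            return value >= start && (end == null || value <= end);
        }
    }

    /** Hash routing: the hash of the field value indexes into the list of pools. */
    public static String hashPool(Object fieldValue, List<String> pools) {
        return pools.get(Math.floorMod(fieldValue.hashCode(), pools.size()));
    }

    /** Range routing: the first range containing the value wins; empty means "not partitionable". */
    public static Optional<String> rangePool(int fieldValue, List<Range> ranges) {
        return ranges.stream()
                .filter(r -> r.contains(fieldValue))
                .map(r -> r.pool)
                .findFirst();
    }

    /** Round-robin routing: each call returns the next pool in the list, wrapping around. */
    public static final class RoundRobin {
        private final List<String> pools;
        private final AtomicInteger next = new AtomicInteger();

        public RoundRobin(List<String> pools) {
            this.pools = List.copyOf(pools);
        }

        public String nextPool() {
            return pools.get(Math.floorMod(next.getAndIncrement(), pools.size()));
        }
    }

    public static void main(String[] args) {
        List<String> pools = List.of("node1", "node2");
        List<Range> ranges = List.of(
                new Range("node1", 0, 100000),
                new Range("node2", 100001, 200000),
                new Range("node3", 200001, null));

        System.out.println(hashPool(5, pools));        // node2, because 5 mod 2 = 1
        System.out.println(rangePool(150000, ranges)); // Optional[node2]
        System.out.println(rangePool(-1, ranges));     // Optional.empty

        RoundRobin roundRobin = new RoundRobin(pools);
        System.out.println(roundRobin.nextPool() + " " + roundRobin.nextPool()
                + " " + roundRobin.nextPool());        // node1 node2 node1
    }
}
```

An empty result from rangePool corresponds to the case where a query cannot be routed by the field value; in the policies above, such a query is either sent to all servers and unioned or left to the session's default behavior.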


Configuration Files

orm.xml

<!-- User-defined policy, identified by its class name -->
<partitioning-policy class="org.acme.MyPolicy"/>
<!-- Round-robin across node1 and node2, with writes replicated -->
<round-robin-policy replicate-writes="true">
  <connection-pool>node1</connection-pool>
  <connection-pool>node2</connection-pool>
</round-robin-policy>
<!-- random-policy: same pool list and replicate-writes option as above -->
<random-policy replicate-writes="true">
  <connection-pool>node1</connection-pool>
  <connection-pool>node2</connection-pool>
</random-policy>
<!-- Replication: requests are sent to every listed pool -->
<replication-policy>
  <connection-pool>node1</connection-pool>
  <connection-pool>node2</connection-pool>
</replication-policy>
<!-- Range partitioning on the "id" parameter; the last range has no end-value -->
<range-partitioning-policy parameter-name="id" exclusive-connection="true" union-unpartitionable-queries="true">
  <range-partition connection-pool="node1" start-value="0" end-value="100000" value-type="java.lang.Integer"/>
  <range-partition connection-pool="node2" start-value="100001" end-value="200000" value-type="java.lang.Integer"/>
  <range-partition connection-pool="node3" start-value="200001" value-type="java.lang.Integer"/>
</range-partitioning-policy>

persistence.xml

 

Examples

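import static javax.persistence.CascadeType.MERGE;
import static javax.persistence.CascadeType.PERSIST;

import java.util.Collection;
import javax.persistence.*;
import org.eclipse.persistence.annotations.*;

// Employee is partitioned by the value of its LOCATION column.
@Entity
@IdClass(EmployeePK.class)
// Named union policy, referenced by the projects relationship below.
@UnionPartitioning(
        name="UnionPartitioningAllNodes",
        replicateWrites=true)
// Named value policy: each listed LOCATION value maps to one connection pool;
// unmapped values go to the "default" pool.
@ValuePartitioning(
        name="ValuePartitioningByLOCATION",
        partitionColumn=@Column(name="LOCATION"),
        unionUnpartitionableQueries=true,
        defaultConnectionPool="default",
        partitions={
            @ValuePartition(connectionPool="node2", value="Ottawa"),
            @ValuePartition(connectionPool="node3", value="Toronto")
        })
// Selects which of the named policies applies to this class.
@Partitioned("ValuePartitioningByLOCATION")
public class Employee {
    @Id
    @Column(name = "EMP_ID")
    private Integer id;
 
    @Id
    private String location;
    ...
 
    // The relationship spans partitions, so its queries are sent to all pools and unioned.
    @ManyToMany(cascade = { PERSIST, MERGE })
    @Partitioned("UnionPartitioningAllNodes")
    private Collection<Project> projects;
    ...
}

// Project is partitioned by ranges of its PROJ_ID column.
@Entity
@RangePartitioning(
        name="RangePartitioningByPROJ_ID",
        partitionColumn=@Column(name="PROJ_ID"),
        partitionValueType=Integer.class,
        unionUnpartitionableQueries=true,
        partitions={
            @RangePartition(connectionPool="default", startValue="0", endValue="1000"),
            @RangePartition(connectionPool="node2", startValue="1000", endValue="2000"),
            @RangePartition(connectionPool="node3", startValue="2000")
        })
@Partitioned("RangePartitioningByPROJ_ID")
public class Project {
    @Id
    @Column(name="PROJ_ID")
    private Integer id;
    ...
}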
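
The examples above refer to connection pools named default, node2, and node3, which have to be defined somewhere in the persistence unit. As a hedged sketch, the following plain Java builds a property map for the two extra pools. It assumes that EclipseLink reads per-pool settings from properties of the form eclipselink.connection-pool.<name>.url; verify this against the documentation for your version, since this page describes a draft. The class ConnectionPoolPropertiesSketch and the JDBC URLs are hypothetical placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: builds persistence unit properties for named connection pools. */
public class ConnectionPoolPropertiesSketch {

    /** Assumed prefix of EclipseLink's per-pool properties; check it for your version. */
    public static final String POOL_PREFIX = "eclipselink.connection-pool.";

    /** Adds the URL property for one named pool to the given map. */
    public static void addPool(Map<String, String> props, String poolName, String jdbcUrl) {
        props.put(POOL_PREFIX + poolName + ".url", jdbcUrl);
    }

    /** Properties for the node2 and node3 pools used in the examples (URLs are placeholders). */
    public static Map<String, String> allPools() {
        Map<String, String> props = new LinkedHashMap<>();
        addPool(props, "node2", "jdbc:example://node2/employees");
        addPool(props, "node3", "jdbc:example://node3/employees");
        return props;
    }

    public static void main(String[] args) {
        // Print each property as it would appear in a name=value listing.
        allPools().forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

A map like this could be passed as the second argument to Persistence.createEntityManagerFactory(String, Map), or the same names and values could be written as property elements in persistence.xml. The default pool is assumed here to come from the persistence unit's ordinary connection settings.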

Version: 2.2.0 DRAFT