Revision as of 12:03, 28 August 2008

1 Design Specification: Performance and Concurrency
2 Document History
3 Project overview
4 Concepts
5 Requirements
6 Design Constraints
7 Functionality
- 7.1 Avoiding ChangeSets for New Objects
- 7.2 Deferring Resume
8 Testing
9 API
10 Config files
- 10.1 persistence.xml
11 GUI
12 Documentation
13 Open Issues
14 Decisions
15 Future Considerations

Design Specification: Performance and Concurrency

Document History

Date	Author	Version Description & Notes
2008-07-17	James	0.1 Draft

Project overview

This project groups several smaller performance-related bug fixes and enhancements into a single unit. Its goal is to improve the performance, concurrency and scalability of the product.

Concepts

Performance is concerned with reducing CPU usage and finding more efficient ways of processing operations.

Concurrency is concerned with reducing contention and improving multi-threaded and multi-CPU performance.

Scalability is concerned with clustering, large workloads and data.

Requirements

The goal of this project is to ensure that our product remains the leading high-performance persistence solution. Areas of improvement are determined through performance comparison with other persistence products and benchmarking.

Specific performance investigations desired for this release:

JPA performance comparison with EclipseLink 1.0
core performance comparison with EclipseLink 1.0
core concurrency comparison with EclipseLink 1.0
JPA cache coordination comparison in clustered environment
JPA performance comparison with Hibernate
JPA performance comparison with OpenJPA
JPA concurrency comparison with Hibernate
JPA provider and app server comparison through SPECjAppServer ® benchmark.

Design Constraints

The goal of the project is to improve performance of common usage patterns. Fringe features and usage patterns will not be specifically targeted unless found to be highly deficient.

Any optimization must also be weighed against its impact on usability and spec compliance. Optimizations that may have a large negative impact on usability may need to be enabled only through specific configuration.

Functionality

Each specific performance improvement is discussed separately below.

Avoiding ChangeSets for New Objects

In EclipseLink 1.0, change sets are created for both new objects and existing objects that changed. The object itself is used to insert into the database, but for updates the change set is used. To merge into the cache (if caching), the change set is used for both updates and inserts. If the merge needs to merge a reference to, or an update of, an existing object, and the original object is not in the cache (transactional read, garbage collection), the merge uses the object instead of the change set. Change sets are also serialized for cache coordination if it is enabled; however, new objects are not sent by cache coordination by default.

This optimization will avoid creating ChangeRecords in the ChangeSets for new objects. The change sets will still be created, as they have many dependencies in the commit and merge, and are used to cache certain artifacts such as the CacheKey to optimize the merge and commit. There is one ChangeSet for each new object, and there used to be one ChangeRecord for each attribute; the ChangeRecords will no longer be populated. This improves performance, as these ChangeRecords are not normally required. The commit already uses objects, so it will not change. The merge will be changed to use objects, but this is something that was already supported.

If the descriptor uses cache coordination with new objects, then the ChangeRecords will still be created, and the old merge will be used. This also serves as something of a backdoor to get the old merge functionality. There is also a backdoor static, ClassDescriptor.shouldUseFullChangeSetsForNewObjects, to allow the old functionality in case of unforeseen issues.

If a new object's ChangeSet is referenced through cache coordination from an existing object's ChangeSet, then its ChangeRecords will be filled in and written during serialization.
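The idea can be illustrated with a small, self-contained toy model. This is only a sketch and not EclipseLink code: the ChangeSet and ChangeRecord classes below are simplified stand-ins, objects are modeled as attribute maps, and the method names and the fullChangeSets parameter are assumptions made for illustration (the parameter loosely plays the role of the backdoor flag described above).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NewObjectChangeSetSketch {

    /** Hypothetical stand-in for a ChangeRecord: one attribute-level change. */
    static final class ChangeRecord {
        final String attribute;
        final Object newValue;

        ChangeRecord(String attribute, Object newValue) {
            this.attribute = attribute;
            this.newValue = newValue;
        }
    }

    /** Hypothetical stand-in for a ChangeSet: one per new or changed object. */
    static final class ChangeSet {
        final Object cacheKey;          // artifact cached for reuse by commit and merge
        final boolean hasFullRecords;   // true only on the "old" full path
        final List<ChangeRecord> records = new ArrayList<>();

        ChangeSet(Object cacheKey, boolean hasFullRecords) {
            this.cacheKey = cacheKey;
            this.hasFullRecords = hasFullRecords;
        }
    }

    /** Always creates the change set; populates per-attribute records only when asked to. */
    static ChangeSet changeSetForNewObject(Object id, Map<String, Object> attributes,
                                           boolean fullChangeSets) {
        ChangeSet changeSet = new ChangeSet(id, fullChangeSets);
        if (fullChangeSets) {
            for (Map.Entry<String, Object> entry : attributes.entrySet()) {
                changeSet.records.add(new ChangeRecord(entry.getKey(), entry.getValue()));
            }
        }
        return changeSet;
    }

    /** Builds the cached copy from the records if present, otherwise from the object itself. */
    static Map<String, Object> mergeNewObject(ChangeSet changeSet, Map<String, Object> workingCopy) {
        Map<String, Object> cached = new LinkedHashMap<>();
        if (changeSet.hasFullRecords) {
            for (ChangeRecord record : changeSet.records) {
                cached.put(record.attribute, record.newValue);
            }
        } else {
            cached.putAll(workingCopy); // object-based merge, no records needed
        }
        return cached;
    }

    public static void main(String[] args) {
        Map<String, Object> employee = new LinkedHashMap<>();
        employee.put("name", "Bob");
        employee.put("salary", 100);

        ChangeSet light = changeSetForNewObject(1, employee, false);
        ChangeSet full = changeSetForNewObject(1, employee, true);

        System.out.println("records (light): " + light.records.size());
        System.out.println("records (full): " + full.records.size());
        System.out.println("same merge result: "
                + mergeNewObject(light, employee).equals(mergeNewObject(full, employee)));
    }
}
```

In this toy model both paths leave the same state in the cache; the light path simply skips allocating one record per attribute, which is the saving the optimization is after.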

Deferring Resume

In EclipseLink 1.0, after a commit() or flush() operation all managed objects had their change tracking reset. This could mean rebuilding the backup clones or clearing their change listeners. Some bookkeeping in the UnitOfWork is also required for a resume.

Commonly in JPA, the EntityManager is closed after a commit. For a managed EntityManager this is always the case (unless extended). So the resume cost commonly serves no purpose. JPA states that changes made before a call to Transaction.begin() are undetermined, so another option is to defer the resume, or possibly even avoid change tracking, until the begin().

A persistence unit option will be added to defer the resume until the begin() (typically deferred forever, as the EntityManager is normally closed after commit).
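The effect of deferring the resume can be sketched with a toy unit of work. This is an illustrative model only, not the EclipseLink UnitOfWork: the class, its fields, the deferResume flag and the resumeCount counter are all hypothetical names, and "resume" is reduced to rebuilding a map of backup copies.

```java
import java.util.HashMap;
import java.util.Map;

public class DeferredResumeSketch {

    /** Toy unit of work; every name here is a hypothetical stand-in. */
    static final class UnitOfWork {
        private final boolean deferResume;
        private final Map<String, Object> managed = new HashMap<>(); // working copies by id
        private Map<String, Object> backup = new HashMap<>();        // baseline for change detection
        private boolean resumePending;
        int resumeCount; // how many times the (costly) resume actually ran

        UnitOfWork(boolean deferResume) {
            this.deferResume = deferResume;
        }

        void register(String id, Object value) {
            managed.put(id, value);
            backup.put(id, value);
        }

        /** Starting a transaction performs any resume that was postponed at commit. */
        void begin() {
            if (resumePending) {
                resume();
            }
        }

        /** Writing to the database is omitted; only the resume decision is modeled. */
        void commit() {
            if (deferResume) {
                resumePending = true; // pay the cost later, and only if begin() is called again
            } else {
                resume();
            }
        }

        /** Closing discards all state, so a pending resume is never paid for. */
        void close() {
            managed.clear();
            backup.clear();
            resumePending = false;
        }

        /** Resets change tracking by rebuilding the backup copies. */
        private void resume() {
            backup = new HashMap<>(managed);
            resumeCount++;
            resumePending = false;
        }
    }

    public static void main(String[] args) {
        UnitOfWork eager = new UnitOfWork(false);
        eager.register("emp-1", "Bob");
        eager.begin();
        eager.commit();
        eager.close();

        UnitOfWork deferred = new UnitOfWork(true);
        deferred.register("emp-1", "Bob");
        deferred.begin();
        deferred.commit();
        deferred.close();

        System.out.println("eager resumes: " + eager.resumeCount);
        System.out.println("deferred resumes: " + deferred.resumeCount);
    }
}
```

In the common commit-then-close pattern the deferred variant never runs the resume at all; it only runs if begin() is called again on the same unit of work.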

Testing

Both the existing performance and concurrency tests and public benchmarks will be used to monitor and evaluate performance improvements.

Specific performance testing desired for this release:

JPA performance comparison with EclipseLink 1.0
core performance comparison with EclipseLink 1.0
core concurrency comparison with EclipseLink 1.0
JPA cache coordination comparison in clustered environment
JPA performance comparison with Hibernate
JPA performance comparison with OpenJPA
JPA concurrency comparison with Hibernate
JPA provider and app server comparison through SPECjAppServer ® benchmark.

API

Config files

persistence.xml

GUI

Documentation

Open Issues

Issue #	Owner	Description / Notes
1	Group	Should weaving.eager be true or false by default?

Decisions

Issue #	Description / Notes	Decision

Future Considerations

Continually improve performance.

Difference between revisions of "EclipseLink/DesignDocs/221546(1.1)"
