EclipseLink/DesignDocs/221546(1.1)
Design Specification: Performance and Concurrency
Document History
Date | Author | Version Description & Notes |
---|---|---|
2008-07-17 | James | 0.1 Draft |
Project overview
This project groups several smaller performance-related bug fixes and enhancements into a single unit. Its goal is to improve the performance, concurrency and scalability of the product.
Concepts
Performance is concerned with reducing CPU usage and finding more efficient methods of processing operations.
Concurrency is concerned with reducing contention and improving multi-threaded and multi-CPU performance.
Scalability is concerned with clustering, large workloads and data.
Requirements
The goal of this project is to ensure that our product remains the leading high-performance persistence solution. Areas of improvement are determined through performance comparison with other persistence products and benchmarking.
Specific performance investigations desired for this release:
- JPA performance comparison with EclipseLink 1.0
- core performance comparison with EclipseLink 1.0
- core concurrency comparison with EclipseLink 1.0
- JPA cache coordination comparison in clustered environment
- JPA performance comparison with Hibernate
- JPA performance comparison with OpenJPA
- JPA concurrency comparison with Hibernate
- JPA provider and app server comparison through the SPECjAppServer® benchmark.
Design Constraints
The goal of the project is to improve performance of common usage patterns. Fringe features and usage patterns will not be specifically targeted unless found to be highly deficient.
Any optimization must also be weighed against its impact on usability and spec compliance. Optimizations that may have a large negative impact on usability may need to be enabled only through specific configuration.
Functionality
Each specific performance improvement is discussed separately below.
Avoiding ChangeSets for New Objects
In EclipseLink 1.0, change sets are created for both new objects and existing objects that changed. For inserts, the object itself is used to write to the database; for updates, the change set is used. To merge into the cache (if caching is enabled), the change set is used for both updates and inserts. If the merge needs to merge a reference to an existing object, or an update to an existing object, and the original object is not in the cache (transactional read, garbage collection), the merge uses the object instead of the change set. Change sets are also serialized for cache coordination if it is enabled; however, new objects are not sent by cache coordination by default.
This optimization will avoid creating ChangeRecords in the ChangeSets for new objects. The change sets will still be created, as they have many dependencies in the commit and merge, and are used to cache certain artifacts, such as the CacheKey, to optimize the merge and commit. There is still one ChangeSet for each new object. Previously there was also one ChangeRecord for each attribute; these ChangeRecords will no longer be populated. This improves performance, as the ChangeRecords are not normally required. The commit already uses objects, so it will not change. The merge will be changed to use objects, which is something that was already supported.
If the descriptor uses cache coordination with new objects, then the ChangeRecords will still be created and the old merge will be used. This also serves as something of a backdoor to the old merge functionality. There is also a backdoor static, ClassDescriptor.shouldUseFullChangeSetsForNewObjects, to allow the old functionality in case of unforeseen issues.
If a new object's ChangeSet is referenced through cache coordination from an existing object's ChangeSet, then its ChangeRecords will be filled in and written during serialization.
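The decision described in this section can be sketched with a toy model. This is only an illustration under stated assumptions, not EclipseLink's implementation: `NewObjectChangeSetSketch`, its nested `ChangeSet` and `ChangeRecord` types, and the methods `buildForNewObject` and `fillRecords` are hypothetical names, and the two boolean parameters merely stand in for the descriptor's cache coordination setting and the backdoor static mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; none of these types are EclipseLink classes.
class NewObjectChangeSetSketch {

    // Hypothetical stand-in for a per-attribute change record.
    static final class ChangeRecord {
        final String attribute;
        final Object newValue;

        ChangeRecord(String attribute, Object newValue) {
            this.attribute = attribute;
            this.newValue = newValue;
        }
    }

    // Hypothetical stand-in for the per-object change set.
    static final class ChangeSet {
        final Object source; // the new object itself, used by commit and merge
        final List<ChangeRecord> records = new ArrayList<>();

        ChangeSet(Object source) {
            this.source = source;
        }
    }

    // Always creates the change set, but populates records only when
    // new objects are coordinated or the old behaviour is forced.
    static ChangeSet buildForNewObject(Object source, Map<String, Object> attributes,
                                       boolean coordinatesNewObjects, boolean useFullChangeSets) {
        ChangeSet changeSet = new ChangeSet(source);
        if (coordinatesNewObjects || useFullChangeSets) {
            fillRecords(changeSet, attributes);
        }
        return changeSet;
    }

    // Fills in the records on demand, for example just before serialization.
    // Does nothing if the records were already populated.
    static void fillRecords(ChangeSet changeSet, Map<String, Object> attributes) {
        if (!changeSet.records.isEmpty()) {
            return;
        }
        for (Map.Entry<String, Object> entry : attributes.entrySet()) {
            changeSet.records.add(new ChangeRecord(entry.getKey(), entry.getValue()));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new LinkedHashMap<>();
        attributes.put("id", 1);
        attributes.put("name", "example");
        Object entity = new Object();

        ChangeSet lean = buildForNewObject(entity, attributes, false, false);
        ChangeSet full = buildForNewObject(entity, attributes, true, false);
        System.out.println("lean records: " + lean.records.size());
        System.out.println("full records: " + full.records.size());
    }
}
```

In this sketch the lean change set still carries the source object, which is what lets the commit and merge work from the object directly, while `fillRecords` models the lazy fill-in during serialization.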
Deferring Resume
In EclipseLink 1.0, after a commit() or flush() operation, all managed objects had their change tracking reset. This could mean rebuilding the backup clones or clearing their change listeners. Some bookkeeping in the UnitOfWork is also required for a resume.
Commonly in JPA, the EntityManager is closed after a commit. For a managed EntityManager this is always the case (unless extended). So, the resume cost commonly has no purpose. JPA states that changes made before a call to Transaction.begin() are undetermined, so another option is to defer the resume, or possibly even avoid change tracking until the begin().
A persistence unit option will be added to defer the resume until the begin() (typically deferred forever, as the EntityManager is normally closed after commit).
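The deferral can be pictured with a small toy model. This is a sketch under assumptions, not the actual UnitOfWork: the class `DeferredResumeSketch`, its nested `UnitOfWork`, the `deferResume` flag and the `resumeCount` counter are all hypothetical names introduced for illustration, and the real name of the persistence unit option is not given in this document.

```java
// Toy model only; this is not EclipseLink's UnitOfWork.
class DeferredResumeSketch {

    static final class UnitOfWork {
        private final boolean deferResume; // hypothetical stand-in for the new option
        private boolean resumePending;
        private int resumeCount;           // how many times the costly reset actually ran

        UnitOfWork(boolean deferResume) {
            this.deferResume = deferResume;
        }

        // Starts a transaction; performs a resume only if one was deferred.
        void begin() {
            if (resumePending) {
                resume();
            }
        }

        // Commits; either resumes immediately (old behaviour) or
        // records that a resume is owed to the next begin().
        void commit() {
            // ... writing changes to the database would happen here ...
            if (deferResume) {
                resumePending = true;
            } else {
                resume();
            }
        }

        // Stand-in for resetting change tracking (backup clones, listeners).
        private void resume() {
            resumePending = false;
            resumeCount++;
        }

        int resumeCount() {
            return resumeCount;
        }
    }

    public static void main(String[] args) {
        UnitOfWork eager = new UnitOfWork(false);
        eager.begin();
        eager.commit();

        UnitOfWork deferred = new UnitOfWork(true);
        deferred.begin();
        deferred.commit(); // closed here in the common case, so no resume ever runs

        System.out.println("eager resumes: " + eager.resumeCount());
        System.out.println("deferred resumes: " + deferred.resumeCount());
    }
}
```

If the deferred unit of work is reused, the next begin() pays the resume cost once; if it is closed after commit, the cost is never paid.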
Testing
Both the existing performance and concurrency tests and public benchmarks will be used to monitor and evaluate performance improvements.
Specific performance testing desired for this release:
- JPA performance comparison with EclipseLink 1.0
- core performance comparison with EclipseLink 1.0
- core concurrency comparison with EclipseLink 1.0
- JPA cache coordination comparison in clustered environment
- JPA performance comparison with Hibernate
- JPA performance comparison with OpenJPA
- JPA concurrency comparison with Hibernate
- JPA provider and app server comparison through the SPECjAppServer® benchmark.
API
Config files
persistence.xml
GUI
Documentation
Open Issues
Issue # | Owner | Description / Notes |
---|---|---|
1 | Group | Should weaving.eager be true or false by default? |
Decisions
Issue # | Description / Notes | Decision |
---|---|---|
Future Considerations
Continually improve performance.