EclipseLink/DesignDocs/298985
Revision as of 12:10, 26 January 2010
Design Specification: Performance and Concurrency
Document History
Date | Author | Version Description & Notes |
---|---|---|
2010-01-06 | James | 0.1 Draft |
2010-01-26 | James | 0.2 Updated CacheId, batch reading |
Project overview
This project groups several smaller performance-related bug fixes and enhancements into a single unit. Its goal is to improve the performance, concurrency and scalability of the product.
Concepts
Performance is concerned with reducing CPU usage and finding more efficient ways of processing operations.
Concurrency is concerned with reducing contention and improving multi-threaded and multi-CPU performance.
Scalability is concerned with clustering, large workloads and data.
Requirements
The goal of this project is to ensure that our product remains the leading high-performance persistence solution. Areas of improvement are determined through performance comparison with other persistence products and benchmarking.
Specific performance investigations desired for this release:
- JPA performance comparison with EclipseLink 2.0
- core performance comparison with EclipseLink 2.0
- JPA concurrency comparison with EclipseLink 2.0
- JPA provider and app server comparison through SPECjAppServer ® benchmark.
Design Constraints
The goal of the project is to improve performance of common usage patterns. Fringe features and usage patterns will not be specifically targeted unless found to be highly deficient.
Any optimization must also be weighed against its impact on usability and spec compliance. Optimizations that may have a large negative impact on usability may need to be enabled only through specific configuration.
Functionality
Each specific performance improvement is discussed separately below.
Building objects from ResultSets
There is currently an old prototype of building objects directly from ResultSets. The goal of the feature is to allow "simple" objects and queries to bypass the intermediate DatabaseRow objects that are built from the JDBC results and then used to build objects. It should also avoid many of the checks for non-core features and simplify the object building process.
This will introduce a second path through queries and object building for these optimized queries, which should avoid much of the general overhead required to support advanced features. The feature will only be used on a query, or perhaps a class, either through configuration or when the class or query is determined to be "simple".
Initially, "simple" will only include direct mappings, but it will hopefully be expanded to include single primary key relationships, and perhaps composite primary keys. It will not include inheritance, events, complex queries, fetch groups, etc.
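The difference between the two paths can be sketched roughly as follows. This is an illustration only, not the prototype: Employee, EmployeeBuilderSketch and both method names are hypothetical, a HashMap stands in for the DatabaseRow, and the query is assumed to select ID and then NAME with the cursor already positioned on a row.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical "simple" class: direct mappings only, single primary key. */
final class Employee {
    final long id;
    final String name;

    Employee(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Illustrative only; these are not EclipseLink classes or methods. */
final class EmployeeBuilderSketch {

    /** General path: copy the columns into an intermediate row, then map the row. */
    static Employee buildViaRow(ResultSet rs) throws SQLException {
        Map<String, Object> row = new HashMap<>(); // stand-in for a DatabaseRow
        row.put("ID", rs.getObject(1));            // assumes column 1 is ID
        row.put("NAME", rs.getObject(2));          // assumes column 2 is NAME
        long id = ((Number) row.get("ID")).longValue();
        return new Employee(id, (String) row.get("NAME"));
    }

    /** Optimized path: read typed values straight from the ResultSet. */
    static Employee buildDirect(ResultSet rs) throws SQLException {
        return new Employee(rs.getLong(1), rs.getString(2));
    }
}
```

Both methods produce the same object. The direct path skips the intermediate row, the boxing of each column value and the later lookups by column name.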
Singleton cache keys and cache refactoring
Currently cache access can be an expensive operation. This will be improved by simplifying the CacheKey. For singleton primary key objects, the Id value (Integer, Long, String) will instead be used as the cache key. This will have a very broad impact, as it changes the type of the primary key from Vector to Object. A new CacheId object will be used for composite or complex primary keys. The CacheId will be a basic wrapper for an Object array, adding equals and hashCode implementations. A CacheKey will still be used as the cache value.
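A minimal sketch of the idea, assuming nothing about the final implementation: CacheIdSketch and CacheKeys.keyFor are hypothetical names used only for illustration. A single Id value is used as the key directly, and only composite keys are wrapped.

```java
import java.util.Arrays;

/**
 * Illustrative sketch only, not the actual CacheId class: a wrapper around
 * an Object array that adds value-based equals() and hashCode().
 */
final class CacheIdSketch {
    private final Object[] primaryKey;
    private final int hash; // computed once, since keys are hashed on every lookup

    CacheIdSketch(Object... primaryKey) {
        this.primaryKey = primaryKey.clone(); // defensive copy of the key values
        this.hash = Arrays.deepHashCode(this.primaryKey);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CacheIdSketch)) {
            return false;
        }
        CacheIdSketch that = (CacheIdSketch) other;
        // Cheap hash comparison first, then element-by-element comparison.
        return hash == that.hash && Arrays.deepEquals(primaryKey, that.primaryKey);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

/** Hypothetical helper showing which kind of key each case would use. */
final class CacheKeys {
    static Object keyFor(Object... idValues) {
        // Singleton primary key: the Id value itself is the cache key.
        // Composite primary key: wrap the values.
        return idValues.length == 1 ? idValues[0] : new CacheIdSketch(idValues);
    }
}
```

With this shape, CacheKeys.keyFor(42L) returns the Long itself, while CacheKeys.keyFor(42L, "EU") returns a wrapper that compares equal to any other wrapper holding the same values.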
The cacheKeyType will be configurable on a ClassDescriptor or through the existing @PrimaryKey annotation.
This would affect a lot of internal API, as well as some external API that currently is typed to Vector. The public API taking Vector could still be supported, but the API returning Vector would either need to be changed, or new methods added and old ones deprecated.
The purpose of this change is to improve performance. It also removes our usage of the legacy Vector API. For JPA classes that use a single simple Id value, the JPA Id value itself becomes the cache key. For JPA IdClass or EmbeddedId, the cache key will not match the JPA Id, but the cache key is mainly an internal value and should reflect what is optimal for cache usage. Because this work removes the casting of the primary key to Vector, it would make it easy to support using the JPA IdClass as the cache key if desired (as a separate feature unrelated to performance). Extreme caution should be used in doing this, however, as it requires that the user implement equals() and hashCode() correctly in their IdClass, which is quite easy to omit or get wrong. It would also have a negative performance impact, as building the IdClass in our internal cache usage will be much less efficient than using the CacheId, and the user's equals() and hashCode() implementations are most likely not optimal.
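The equals() and hashCode() risk can be shown with a small sketch. The class names below are hypothetical, a plain HashMap stands in for the cache, and the classes leave out the other things JPA requires of an id class (such as a public no-arg constructor) to keep the focus on key equality.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical id class that forgets to override equals() and hashCode(). */
final class BrokenOrderLineId {
    final long orderId;
    final int lineNumber;

    BrokenOrderLineId(long orderId, int lineNumber) {
        this.orderId = orderId;
        this.lineNumber = lineNumber;
    }
}

/** Hypothetical id class with value-based equals() and hashCode(). */
final class OrderLineId {
    final long orderId;
    final int lineNumber;

    OrderLineId(long orderId, int lineNumber) {
        this.orderId = orderId;
        this.lineNumber = lineNumber;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof OrderLineId)) {
            return false;
        }
        OrderLineId that = (OrderLineId) other;
        return orderId == that.orderId && lineNumber == that.lineNumber;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(orderId) + lineNumber;
    }
}

final class IdClassLookupDemo {
    /** An equal-valued key built later does not find the entry: identity equality. */
    static boolean foundWithBrokenId() {
        Map<Object, String> cache = new HashMap<>();
        cache.put(new BrokenOrderLineId(10L, 1), "cached line");
        return cache.containsKey(new BrokenOrderLineId(10L, 1));
    }

    /** With value-based equality the same lookup succeeds. */
    static boolean foundWithCorrectId() {
        Map<Object, String> cache = new HashMap<>();
        cache.put(new OrderLineId(10L, 1), "cached line");
        return cache.containsKey(new OrderLineId(10L, 1));
    }
}
```

With the broken class every lookup misses, so the cache would silently stop returning hits. That kind of failure is hard for a user to diagnose.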
The existing API on IdentityMapAccessor, ReadObjectQuery and ReportQuery currently uses Vector for the primary key. This API will still be supported, but deprecated. New API will be added that takes Object for the primary key.
The JPA Cache interface will be extended in the same pattern as our JpaEntityManager to expose our additional cache API using the JPA Id. This will make our internal cache key type transparent to JPA users.
Testing
Both the existing performance and concurrency tests and public benchmarks will be used to monitor and evaluate performance improvements.
Specific performance testing desired for this release:
- JPA performance comparison with EclipseLink 2.0
- core performance comparison with EclipseLink 2.0
- JPA concurrency comparison with EclipseLink 2.0
- JPA provider and app server comparison through SPECjAppServer ® benchmark.
API
Singleton cache keys and cache refactoring
(old API is still supported, but deprecated)
- IdentityMapAccessor
- *(Vector, Class) -> *(Object, Class)
- Session
- keyFromObject(Object) -> getId(Object)
- ReadObjectQuery
- get/setSelectionKey(List) -> get/setSelectionId(Object)
- ReportQueryResult
- getPrimaryKeyValues() -> getId()
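One way the deprecated Vector methods could keep working is by delegating to the new Object-typed methods. The sketch below assumes that approach. AccessorSketch and toId are hypothetical and not the real IdentityMapAccessor, and a List stands in for the CacheId when the key is composite.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Vector;

/** Hypothetical accessor used only to illustrate the migration pattern. */
final class AccessorSketch {
    private final Map<Class<?>, Map<Object, Object>> identityMaps = new HashMap<>();

    /** New-style API: the primary key is typed as Object. */
    Object getFromIdentityMap(Object id, Class<?> type) {
        Map<Object, Object> map = identityMaps.get(type);
        return map == null ? null : map.get(id);
    }

    void putInIdentityMap(Object id, Class<?> type, Object entity) {
        identityMaps.computeIfAbsent(type, t -> new HashMap<>()).put(id, entity);
    }

    /** Old-style API, kept for compatibility: converts the Vector and delegates. */
    @Deprecated
    Object getFromIdentityMap(Vector<?> primaryKey, Class<?> type) {
        return getFromIdentityMap(toId(primaryKey), type);
    }

    /** Singleton keys unwrap to the Id value; composite keys become a value-equal list. */
    static Object toId(Vector<?> primaryKey) {
        return primaryKey.size() == 1 ? primaryKey.get(0) : new ArrayList<Object>(primaryKey);
    }
}
```

Existing callers that pass a Vector resolve to the deprecated overload and still find the object, while new callers pass the Id value directly.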
Config files
Documentation
Open Issues
Issue # | Owner | Description / Notes |
---|---|---|
1 | Group | What is the impact of the cache refactoring on Cache interceptors integration? |
2 | Group | What is the impact of the cache refactoring on backward compatibility? |
Decisions
Issue # | Description / Notes | Decision |
---|---|---|
Future Considerations
Continually improve performance.