EclipseLink Metadata Cache
This feature looks at caching the metadata project so that setup can avoid the cost of reading multiple orm.xml files and processing annotations on the entities in a persistence unit when rebuilding the project is unnecessary.
The persistence unit in eclipselink-annotation-model.jar from JPA testing was chosen for investigation as it is the largest catch-all unit in testing. Gathering accurate numbers to determine the costs and benefits is difficult, as serialization is not yet possible and it is not yet understood what can be shared. The org.eclipse.persistence.sessions.Project object is used as the starting point, as it is the underlying object that gets built from metadata processing and should contain all mapping information for a session/EntityManagerFactory. Caching this object should be sufficient to avoid reprocessing the entire persistence unit from scratch.
The project cannot be serialized as is, and the cost of serializing to a file would depend entirely on file I/O. Initial numbers indicate that creating a session from an existing project, adding it to the SessionManager, and then building an EntityManagerFactory/EntityManager from it takes 1/10 of the time needed to build the initial persistence unit. This number is misleading though, as the test had to build the project by accessing the default persistence unit, which caused the agent to load and much of the static initialization to be done. Comparing the time to load the default persistence unit with the time to load a subsequent unit within the same persistence.xml, the subsequent unit took 1/3 of the time. So up to 2/3 of the initial cost could be due to work that metadata caching might not be able to avoid. Further testing is required.
The next step is to modify org.eclipse.persistence.sessions.Project and the objects it references so that the project can be reliably serialized and reused after being deserialized.
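As an illustration of what a reliable round trip involves, the following is a minimal sketch using plain Java serialization. FakeProject, its fields, and the helper methods are hypothetical stand-ins, not the EclipseLink Project class. The sketch shows that a transient field is not restored when the object is read back, which is the kind of gap that has to be closed before a cached project can be reused.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class RoundTripSketch {

    // Hypothetical stand-in for a project-like object; NOT the EclipseLink class.
    static class FakeProject implements Serializable {
        private static final long serialVersionUID = 1L;

        String name;
        // Transient fields are skipped by Java serialization, and field
        // initializers are not re-run when the object is deserialized.
        transient List<String> queries = new ArrayList<>();

        FakeProject(String name) {
            this.name = name;
        }
    }

    // Serializes any Serializable value to an in-memory byte array.
    static byte[] serialize(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Reads back a single object from a byte array produced by serialize().
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        FakeProject original = new FakeProject("annotation-model");
        original.queries.add("findAll");

        FakeProject copy = (FakeProject) deserialize(serialize(original));

        System.out.println("name survives: " + copy.name);
        System.out.println("transient queries after round trip: " + copy.queries);
    }
}
```

In this sketch the non-transient name field survives the round trip while the transient queries list comes back as null, so any code that assumes the list exists would fail on the restored copy.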
Problems and resolutions
1) A few classes are not serializable.
r) Add the Serializable interface to them when encountered.
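One way such classes could be encountered is by attempting to serialize the object graph and reading the class name out of the resulting exception. The sketch below assumes plain Java serialization; Holder, PlainPolicy, FixedPolicy, and firstNonSerializableClass are hypothetical names made up for illustration and are not part of EclipseLink.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableCheckSketch {

    // Hypothetical helper type that was never marked Serializable.
    static class PlainPolicy {
        int size = 10;
    }

    // The same helper after the fix: it now implements Serializable.
    static class FixedPolicy implements Serializable {
        private static final long serialVersionUID = 1L;
        int size = 10;
    }

    // Hypothetical holder standing in for an object graph rooted at a project.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        Object policy;

        Holder(Object policy) {
            this.policy = policy;
        }
    }

    // Tries to serialize the graph and reports the exception message (by default
    // the name of the offending class), or null if the whole graph serializes.
    static String firstNonSerializableClass(Object root) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(root);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("before fix: " + firstNonSerializableClass(new Holder(new PlainPolicy())));
        System.out.println("after fix:  " + firstNonSerializableClass(new Holder(new FixedPolicy())));
    }
}
```

Running a check like this repeatedly, fixing one class at a time, is one possible way to work through the graph; it only reports the first blocking class per attempt.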
Problems with deserialization and initialization
1) Project assumes it has a collection of queries when creating a session, but this collection is marked transient (an ongoing theme).
r) Remove the transient marker (consequences?). Attributes holding user objects will remain transient.
2) Deploy calls convertClassNamesToClasses on the project.
2a) This results in an NPE, since most queries are transient (queries held in DescriptorQueryManager are almost entirely transient).
2a r) Do not call convertClassNamesToClasses on a serialized project, since the classloader is likely to be the application loader anyway.
r) (?) The loader will need to be looked at to make sure the correct one is used, avoiding the need for this method call.
Queries other than existence checks are not serialized; presumably this is because they are not needed on projects used by remote sessions. Changing this will affect the usage and performance of remote sessions, which needs further investigation.
3) Customizers are called as they are processed and are not stored on the project/session (see processCustomizers on MetadataProcessor). This will require refactoring so that a string representation of each customizer can be added to the project and the customizers can be called after the project is serialized, so that they are not called twice on the same project.
4)
- How this should interact with extensibility and RCM refresh commands. A user might not wish to get the cached project when signalling that the metadata has been refreshed, so there needs to be a way to override the cache; but once the refreshed metadata has been read in, others on the server might want to use the cached version. The timing of caching the project might be a factor with the current setup if the project isn't built and cached before the RCM refresh command goes out.
- Dynamic classes are currently built using the MetadataDescriptors, not the Project/Descriptor classes that will be cached. Supporting them will require changes to how dynamic entities are created, which is left outside the scope of this feature.
- Serialization could be a problem if some nodes are running an EclipseLink version different from the one the project was initially serialized with, i.e. if one node in a cluster has been patched while the others are still in the process of being patched.
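One possible way to guard against the mixed-version case is to store the writer's version next to the serialized project and to rebuild from metadata when it does not match the reader's version. The following is only a sketch under that assumption: Envelope, write, and readIfSameVersion are hypothetical names, the version strings are made up, and how a node obtains its own version string is left open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class VersionedCacheSketch {

    // Hypothetical wrapper: the project is kept as opaque bytes so the envelope
    // itself can always be read, even if the project classes changed shape.
    static final class Envelope implements Serializable {
        private static final long serialVersionUID = 1L;
        final String writerVersion;
        final byte[] payload;

        Envelope(String writerVersion, byte[] payload) {
            this.writerVersion = writerVersion;
            this.payload = payload;
        }
    }

    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Wraps the serialized project together with the version that produced it.
    static byte[] write(String writerVersion, Serializable project) throws IOException {
        return toBytes(new Envelope(writerVersion, toBytes(project)));
    }

    // Returns the cached object only when versions match; null tells the caller
    // to fall back to rebuilding the project from metadata.
    static Object readIfSameVersion(String readerVersion, byte[] cached)
            throws IOException, ClassNotFoundException {
        Envelope envelope = (Envelope) fromBytes(cached);
        if (!envelope.writerVersion.equals(readerVersion)) {
            return null;
        }
        return fromBytes(envelope.payload);
    }

    public static void main(String[] args) throws Exception {
        // A String stands in for the project; the version labels are made up.
        byte[] cached = write("version-A", "stand-in project");
        System.out.println("same version:      " + readIfSameVersion("version-A", cached));
        System.out.println("different version: " + readIfSameVersion("version-B", cached));
    }
}
```

Keeping the payload as a byte array means the version check happens before any project class is deserialized, so a patched node never attempts to read an incompatible object graph. An exact string match is the strictest policy; a looser compatibility rule would be a separate design decision.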