EclipseLink/Bugs/259993

Revision as of 11:08, 19 January 2009 by Michael.obrien.oracle.com (Talk | contribs) (=Source)

Bug Analysis Document: 259993: Deadlock on em.find() on WebSphere 7.0.0.1

Document History

Date | Author | Version | Description & Notes
20080115 | Michael O'Brien | 1.0 | Initial reproduction use cases

Overview

This bug describes the behavior of, and the fix for, an issue where an em.find() call on WebSphere 7.0.0.1 hangs indefinitely while attempting to acquire the lock on the concurrencyManager of a cacheKey.

Exception

There is no exception until the tests are stopped or the server is restarted.

StackTrace

Query: DataModifyQuery(sql="CREATE TABLE CMP3_ENTITYD_SEQ (SEQ_NAME VARCHAR2(50) NOT NULL, SEQ_COUNT NUMBER(38) NULL, PRIMARY KEY (SEQ_NAME))")
[1/15/09 17:48:15:936 EST] 0000000e ThreadMonitor W   WSVR0605W: Thread "ORB.thread.pool : 1" (00000022) has been active for 670586 milliseconds and may be hung.  There is/are 2 thread(s) in total in the server that may be hung.
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:196)
	at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:146)
	at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:110)
	at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.checkCacheForObject(ExpressionQueryMechanism.java:730)
	at org.eclipse.persistence.queries.ReadObjectQuery.checkEarlyReturnImpl(ReadObjectQuery.java:229)
	at org.eclipse.persistence.queries.ObjectLevelReadQuery.checkEarlyReturn(ObjectLevelReadQuery.java:627)
	at org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:613)
	at org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:891)
	at org.eclipse.persistence.queries.ReadObjectQuery.execute(ReadObjectQuery.java:397)
	at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:954)
	at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2651)
	at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1181)
	at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1165)
	at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1125)
	at org.eclipse.persistence.internal.jpa.EntityManagerImpl.executeQuery(EntityManagerImpl.java:571)
	at org.eclipse.persistence.internal.jpa.EntityManagerImpl.findInternal(EntityManagerImpl.java:516)
	at org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:431)
	at org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:339)
	at com.ibm.ws.jpa.management.JPATxEmInvocation.find(JPATxEmInvocation.java:211)
	at com.ibm.ws.jpa.management.JPAEntityManager.find(JPAEntityManager.java:175)
	at org.eclipse.persistence.testing.tests.jpa.relationships.VirtualAttributeTestSuite.testReadVirtualAttribute(VirtualAttributeTestSuite.java:76)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:45)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
	at java.lang.reflect.Method.invoke(Method.java:599)
	at junit.framework.TestCase.runTest(TestCase.java:154)
	at junit.framework.TestCase.runBare(TestCase.java:127)
	at org.eclipse.persistence.testing.framework.junit.JUnitTestCase.runBareServer(JUnitTestCase.java:391)
	at org.eclipse.persistence.testing.framework.server.TestRunnerBean.runTest(TestRunnerBean.java:87)
	at org.eclipse.persistence.testing.framework.server.EJSRemote0SLTestRunner_ed643271.runTest(EJSRemote0SLTestRunner_ed643271.java)
	at org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie.runTest(_EJSRemote0SLTestRunner_ed643271_Tie.java)
	at org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie._invoke(_EJSRemote0SLTestRunner_ed643271_Tie.java)
	at com.ibm.CORBA.iiop.ServerDelegate.dispatchInvokeHandler(ServerDelegate.java:622)
	at com.ibm.CORBA.iiop.ServerDelegate.dispatch(ServerDelegate.java:475)
	at com.ibm.rmi.iiop.ORB.process(ORB.java:504)
	at com.ibm.CORBA.iiop.ORB.process(ORB.java:1571)
	at com.ibm.rmi.iiop.Connection.respondTo(Connection.java:2771)
	at com.ibm.rmi.iiop.Connection.doWork(Connection.java:2640)
	at com.ibm.rmi.iiop.WorkUnitImpl.doWork(WorkUnitImpl.java:63)
	at com.ibm.ejs.oa.pool.PooledThread.run(ThreadPool.java:118)
	at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1527)

Concepts

JPA Specification Notes

JPA 1.0 Specification

JPA 2.0 Specification

Reproduction

Prerequisites

Data Model

Use Case 1: Persist unmanaged entity (with @PrePersist query) referenced by managed entity

StackTrace

Logs

Analysis

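Before debugging this issue, it looks like the following code never exits its while loop. The loop runs while the cache key is still acquired and the object read from it is null. Because objectFromCache is read once before the loop and is not re-read inside it, the loop can end only when the lock on the cache key is released (or the waiting thread is interrupted).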

package org.eclipse.persistence.internal.sessions;
public class UnitOfWorkIdentityMapAccessor extends IdentityMapAccessor {
    /**
     * INTERNAL:
     * This method will return the object from the parent and clone it.
     */
    protected Object getAndCloneCacheKeyFromParent(Vector primaryKey, Class theClass, boolean shouldReturnInvalidatedObjects, ClassDescriptor descriptor) {
        // Note: Objects returned from the parent's identity map should include invalidated
        // objects. This is important because this internal method is used in the existence
        // check in the UnitOfWork.
        UnitOfWorkImpl unitOfWork = (UnitOfWorkImpl)getSession();
        org.eclipse.persistence.internal.sessions.IdentityMapAccessor parentIdentityMapAccessor = unitOfWork.getParent().getIdentityMapAccessorInstance();
        CacheKey cacheKey = parentIdentityMapAccessor.getCacheKeyForObject(primaryKey, theClass, descriptor);
        if ((cacheKey == null) && unitOfWork.getParent().isUnitOfWork()) {
            //for nested unit of work
            //make parent clone and register object
            ((UnitOfWorkIdentityMapAccessor)parentIdentityMapAccessor).getAndCloneCacheKeyFromParent(primaryKey, theClass, shouldReturnInvalidatedObjects, descriptor);
            //get the cachekey that was created in the parent.
            cacheKey = parentIdentityMapAccessor.getCacheKeyForObject(primaryKey, theClass, descriptor);
        }
 
        Object objectFromCache = null;
        // this check could be simplfied to one line but would create a window
        // in which GC could remove the object and we would end up with a null pointer
        // as well we must inspect the cacheKey without locking on it.
        if ((cacheKey != null) && (shouldReturnInvalidatedObjects || !descriptor.getCacheInvalidationPolicy().isInvalidated(cacheKey))) {
            synchronized (cacheKey.getMutex()) {
                //if the object in the cachekey is null but the key is acquired then
                //someone must be rebuilding it or creating a new one.  Sleep until
                // it's finished. A plain wait here would be more efficient but we may not
                // get notified for quite some time (ie deadlock) if the other thread
                //is building the object.  Must wait and not sleep in order for the monitor to be released
                objectFromCache = cacheKey.getObject();
                try {
                    while (cacheKey.isAcquired() && (objectFromCache == null)) {
                        // WebSphere 7.0.0.1 hangs indefinitely here
                        cacheKey.getMutex().wait(5); // in milliseconds
                    }
                } catch (InterruptedException ex) {
                }
                if (objectFromCache == null) {
                    return null;
                }
            }
        } else {
            return null;
        }
 
        // Consider read-only class CR#4094
        if (getSession().isClassReadOnly(theClass, descriptor)) {
            // PERF: Just return the original object.
            return objectFromCache;
        }
 
        if(getSession() instanceof RepeatableWriteUnitOfWork ) {
            Object unregisteredDeletedClone = ((RepeatableWriteUnitOfWork)getSession()).getUnregisteredDeletedCloneForOriginal(objectFromCache);
            if(unregisteredDeletedClone != null) {
                return unregisteredDeletedClone;
            }
        }
 
        return unitOfWork.cloneAndRegisterObject(objectFromCache, cacheKey, descriptor);
    }
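The comment in the excerpt above notes that the code must wait rather than sleep so that the monitor is released. The following is a minimal, self-contained sketch of that behaviour using only the JDK. The class and method names are hypothetical and unrelated to the EclipseLink code base; it only illustrates that a timed Object.wait() lets another thread enter the same monitor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class; unrelated to the EclipseLink code base. */
final class WaitReleasesMonitorSketch {

    /**
     * Returns true if a second thread entered the monitor while this
     * thread was inside a timed wait() on the same lock object.
     */
    static boolean otherThreadEntersDuringWait() {
        final Object lock = new Object();
        final boolean[] entered = {false};          // guarded by lock
        final CountDownLatch lockHeld = new CountDownLatch(1);

        Thread other = new Thread(() -> {
            try {
                lockHeld.await();                   // start only once the caller owns the monitor
            } catch (InterruptedException e) {
                return;
            }
            synchronized (lock) {                   // can enter only after the caller releases the monitor
                entered[0] = true;
                lock.notifyAll();
            }
        });
        other.start();

        try {
            synchronized (lock) {
                lockHeld.countDown();
                // Safety deadline so the sketch itself can never hang.
                long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
                while (!entered[0] && System.nanoTime() < deadline) {
                    lock.wait(5);                   // timed wait: the monitor is released while waiting
                }
            }
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // preserve the interrupt status for callers
            return false;
        }
        synchronized (lock) {
            return entered[0];
        }
    }

    public static void main(String[] args) {
        System.out.println("other thread entered during wait: " + otherThreadEntersDuringWait());
    }
}
```

Replacing lock.wait(5) with Thread.sleep(5) in this sketch would keep the monitor held, so the second thread could not enter until the safety deadline expired.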

Constraints

Concurrency and Thread Safety

Design / Functionality

Alternative 1: Short Fix: Do Not Wait Indefinitely

This change will affect all platforms.

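We will cap the number of wait cycles spent waiting for the mutex on the cache key to be released. With the constants in the source below, the total wait is bounded at roughly 2000 x 5 ms, that is, on the order of 10 seconds.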

Source

    private static final int MAXIMUM_MUTEX_STILL_ACQUIRED_WAIT_ITERATIONS = 2000;
    private static final int MUTEX_STILL_ACQUIRED_WAIT_MILLISECONDS_TIME = 5;

                objectFromCache = cacheKey.getObject();
                try {
                    int iterations = 0; // added: counts completed wait cycles
                    while (cacheKey.isAcquired() && (objectFromCache == null)
                            && (iterations++ < MAXIMUM_MUTEX_STILL_ACQUIRED_WAIT_ITERATIONS)) { // added: bound the wait
                        cacheKey.getMutex().wait(MUTEX_STILL_ACQUIRED_WAIT_MILLISECONDS_TIME);
                    }
                } catch (InterruptedException ex) {
                    // we should log a finest message here
                }
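The bounded wait proposed above can be tried in isolation with the following sketch. SketchCacheKey is a hypothetical stand-in written for this example, not the EclipseLink CacheKey class, and the iteration limit and wait time are passed as parameters so that the behaviour can be checked quickly. Restoring the interrupt status in the catch block is a choice made for this sketch; the snippet above only notes that a finest-level message should be logged.

```java
/** Hypothetical stand-in for a cache key; not the EclipseLink CacheKey class. */
final class SketchCacheKey {
    private boolean acquired;
    private Object object;

    synchronized void acquire() { acquired = true; }

    synchronized void release(Object built) {
        object = built;
        acquired = false;
        notifyAll();
    }

    synchronized boolean isAcquired() { return acquired; }

    synchronized Object getObject() { return object; }
}

final class BoundedWaitSketch {

    /**
     * Mirrors the proposed loop: reads the object once, then waits while the
     * key is still acquired and the object was null, for at most maxIterations
     * timed waits. Returns the object read before the loop, which may be null.
     */
    static Object waitForObject(SketchCacheKey key, int maxIterations, long waitMillis) {
        synchronized (key) {
            Object objectFromCache = key.getObject();
            try {
                int iterations = 0;
                while (key.isAcquired() && (objectFromCache == null)
                        && (iterations++ < maxIterations)) {
                    key.wait(waitMillis); // releases the monitor for up to waitMillis
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // sketch-only choice, see lead-in
            }
            return objectFromCache;
        }
    }

    public static void main(String[] args) {
        SketchCacheKey stuck = new SketchCacheKey();
        stuck.acquire(); // never released, as in the reported hang
        System.out.println("result for stuck key: " + waitForObject(stuck, 3, 5));
    }
}
```

With a key that is acquired and never released, the call returns null after the bounded number of waits instead of blocking forever, which matches the existing null check that follows the loop in getAndCloneCacheKeyFromParent.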

Logs

server-run-test:
     [echo] Running test eclipselink-relationships-model on websphere
Warning: Reference compile.server.path has not been set at runtime, but was found during
build file parsing, attempting to resolve. Future versions of Ant may support
 referencing ids defined in non-executed targets.
Warning: Reference compile.path has not been set at runtime, but was found during
build file parsing, attempting to resolve. Future versions of Ant may support
 referencing ids defined in non-executed targets.
    [junit] Running org.eclipse.persistence.testing.tests.jpa.relationships.ServerTestSuite
    [junit]     org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Source (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target
 (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62  vmcid: IBM  minor code: 896  completed: No)
    [junit]     org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Source (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target
 (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62  vmcid: IBM  minor code: 896  completed: No)
    [junit] Tests run: 30, Failures: 0, Errors: 2, Time elapsed: 69.799 sec

Alternative 2: Long Fix: WebSphere 7.0 Platform Change for Alternative 1

This alternative applies the change from Alternative 1 to the WebSphere 7 platform only.
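Purely as an illustration of what platform targeting could look like, the sketch below selects the iteration limit by platform and leaves other platforms unbounded. Everything in it is an assumption made for this example: SketchPlatform, PlatformWaitLimitSketch and their methods are hypothetical names, not EclipseLink types, and this is not the implementation chosen for this bug. Only the limit of 2000 is taken from Alternative 1.

```java
/** Hypothetical platform identifiers for this sketch; not EclipseLink types. */
enum SketchPlatform { WEBSPHERE_7, OTHER }

final class PlatformWaitLimitSketch {
    /** Marker meaning "do not bound the wait" (assumption of this sketch). */
    static final int NO_LIMIT = -1;
    /** Iteration limit taken from Alternative 1. */
    static final int WEBSPHERE_7_LIMIT = 2000;

    /** Only the WebSphere 7 platform gets a bounded wait in this sketch. */
    static int maxWaitIterations(SketchPlatform platform) {
        return platform == SketchPlatform.WEBSPHERE_7 ? WEBSPHERE_7_LIMIT : NO_LIMIT;
    }

    /** Loop condition of the wait, with the platform-specific bound applied. */
    static boolean keepWaiting(boolean keyAcquired, Object objectFromCache,
                               int iterationsSoFar, SketchPlatform platform) {
        int limit = maxWaitIterations(platform);
        boolean underLimit = (limit == NO_LIMIT) || (iterationsSoFar < limit);
        return keyAcquired && (objectFromCache == null) && underLimit;
    }

    public static void main(String[] args) {
        System.out.println(keepWaiting(true, null, 2000, SketchPlatform.WEBSPHERE_7));
        System.out.println(keepWaiting(true, null, 2000, SketchPlatform.OTHER));
    }
}
```

Keeping the decision in one small method would leave the wait loop itself unchanged on other platforms, at the cost of an extra lookup per iteration.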

Source

Alternative 3: Longer Fix:

Source

 

Logs

Implementation

Logging

Testing

API

GUI

Config files

Documentation

Open Issues

Issue # | Owner | Description / Notes
I1 | mobrien | -

Decisions

Issue # | Description / Notes | Decision

Future Considerations

During the research for this bug, the following items were identified as out of scope but are captured here as potential future enhancements. If agreed upon during the review process, these should be logged in the bug system.

References
