- February 29, 2008
- Paul Slauenwhite
- Duwayne Morris
- Kent Siefkes
- Joe Toomey
- Todd Merriweather (part-time cameo appearance)
- Demo of TPTP parallel execution.
- Review code changes made to fix parallel execution.
- AG review of enhancement 162605 (Test execution should support parallel execution of tests).
Todd was busy helping a customer with some issues and thus missed most of the meeting.
The bulk of the discussion revolved around the processing of Remote Method Invocation (RMI) return data in Marshaller.java when initializing multiple agents in parallel.
In a consuming product, a separate thread is used to initialize each agent, and many agents may be employed to execute tests concurrently. Code exists to mark each RMI CallData with a unique identifier that is passed to the agent and returned within the ReturnData object. Errors in the existing code that matches ReturnData to the calling threads resulted in exceptions being thrown.
The changes implemented thus far are working properly and are considered an acceptable solution. Each calling thread looking for its return data polls, sleeping 20 milliseconds between checks, until the matching ReturnData object is found or a timeout (3 minutes) occurs. There was discussion of determining the actual number of RMI calls made for each agent and running a few experiments to optimize the sleep time without consuming excessive CPU.
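The polling approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in Marshaller.java: the class name PollingReturnTable, the methods deliver and awaitReturn, and the use of a map keyed by call identifier are all assumptions made for the example. Only the 20 millisecond sleep and the 3 minute timeout come from the minutes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the polling fix; names are illustrative only.
final class PollingReturnTable {
    static final long POLL_INTERVAL_MS = 20;            // sleep between checks
    static final long DEFAULT_TIMEOUT_MS = 3 * 60_000L;  // three-minute limit

    // Return data keyed by the unique identifier carried in each call.
    private final Map<Long, Object> returned = new ConcurrentHashMap<>();

    // Called by whichever thread receives return data from an agent.
    void deliver(long callId, Object returnData) {
        returned.put(callId, returnData);
    }

    // Called by the thread that issued the call. Returns the matching
    // return data, or null if the timeout expires or the thread is interrupted.
    Object awaitReturn(long callId, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            Object data = returned.remove(callId);
            if (data != null) {
                return data;
            }
            if (System.nanoTime() - deadline >= 0) {
                return null; // timed out without seeing a match
            }
            try {
                Thread.sleep(POLL_INTERVAL_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return null;
            }
        }
    }
}
```

With this shape, each waiting thread wakes up to 50 times per second regardless of whether any data has arrived, which is the CPU cost the sleep-time experiments were meant to measure.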
There was a lengthy discussion of an optimized design that would provide synchronization, in which the calling thread is notified when the expected return data is available rather than operating in a polling loop. The author stated that this was indeed tried with a CallDataQueue object array, but he was unable to make it work properly: notify was not waking up the desired thread. Thus, the current solution was implemented to get things working properly.
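One way a notification-based design could look is sketched below. The minutes do not record why notify failed to wake the desired thread. One known behavior of Java monitors that can produce that symptom is that Object.notify() wakes a single, arbitrarily chosen waiter, so when several calling threads wait on a shared monitor the woken thread may not be the one whose data arrived. The sketch sidesteps this by giving each outstanding call its own monitor object. The names NotifyingReturnTable, Slot, deliver, and awaitReturn are hypothetical and do not reflect the CallDataQueue code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical non-polling alternative; names are illustrative only.
final class NotifyingReturnTable {
    // One slot per outstanding call; each slot serves as its own monitor.
    private static final class Slot {
        Object data;
        boolean ready;
    }

    private final Map<Long, Slot> slots = new ConcurrentHashMap<>();

    private Slot slotFor(long callId) {
        return slots.computeIfAbsent(callId, id -> new Slot());
    }

    // Called by the receiving thread when return data for callId arrives.
    void deliver(long callId, Object returnData) {
        Slot slot = slotFor(callId);
        synchronized (slot) {
            slot.data = returnData;
            slot.ready = true;
            slot.notifyAll(); // only threads waiting on this call's slot wake up
        }
    }

    // Blocks until data for callId arrives. Returns null on timeout or interrupt.
    // Note: data delivered after a timeout leaves an unused slot behind;
    // a real implementation would need to clean that up.
    Object awaitReturn(long callId, long timeoutMs) {
        Slot slot = slotFor(callId);
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        try {
            synchronized (slot) {
                // Re-check the condition in a loop to tolerate spurious wakeups.
                while (!slot.ready) {
                    long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                    if (remainingMs <= 0) {
                        return null;
                    }
                    slot.wait(remainingMs);
                }
                return slot.data;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return null;
        } finally {
            slots.remove(callId);
        }
    }
}
```

Because the waiting thread sleeps inside wait until its own slot is signalled, it consumes no CPU between the call and the return, and it handles the case where the data arrives before the caller starts waiting.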
It was decided that it is worth spending an additional day or two of effort to see whether this optimal design can be made to work, perhaps with independent assistance from Paul and/or Joe.
Finally, there was some discussion of miscellaneous action items and testing, including the fact that many existing hard-coded strings in hyades.execution should be externalized.
Attach a patch to the bugzilla for the fix that has been developed.
Work on a non-polling solution for an extra day or two and see if the thread notification problem can be resolved.
Write test cases for TPTP that exercise running multiple tests in parallel. It was concluded that consuming product testing would be adequate for maintaining a check on parallel agent startup.
File a bugzilla against hyades.execution to externalize hard-coded strings, and try to externalize the ones considered most likely to occur by the 3/14 i6 cut-off date.