The new binary format is intended as a size-optimized alternative to the XML data format currently used by the TPTP Java Profiler, with the goal of increasing scalability and performance.
The new binary stream format consists of:
Data stream descriptor: Basic attributes describing the data stream, such as ID, version, encoding, and endianness.
Header: Describes the message including unique ID and message length.
Message attributes: Ordered integer, long, double, and null-terminated string attributes describing the message. Each message contains the CPU frequency (CPU ticks) for calculating the time stamp on the client side.
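The header-plus-attributes layout above can be sketched as follows. This is an illustration only, not the actual TPTP wire format: the class name, the field order (message ID, body length, tick count, thread ID, name), and the idea that the ticks-per-second value is known to the client are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical layout (NOT the real TPTP format):
// header: [int messageId][int bodyLength]
// body:   [long cpuTicks][int threadId][null-terminated UTF-8 name]
class BinaryMessageSketch {

    static final class Message {
        final int id; final long cpuTicks; final int threadId; final String name;
        Message(int id, long cpuTicks, int threadId, String name) {
            this.id = id; this.cpuTicks = cpuTicks; this.threadId = threadId; this.name = name;
        }
    }

    // Writes the header followed by the ordered attributes, in the byte order
    // that the stream descriptor would announce.
    static byte[] encode(Message m, ByteOrder order) {
        byte[] nameBytes = m.name.getBytes(StandardCharsets.UTF_8);
        int bodyLength = Long.BYTES + Integer.BYTES + nameBytes.length + 1;
        ByteBuffer buf = ByteBuffer.allocate(2 * Integer.BYTES + bodyLength).order(order);
        buf.putInt(m.id);
        buf.putInt(bodyLength);
        buf.putLong(m.cpuTicks);
        buf.putInt(m.threadId);
        buf.put(nameBytes);
        buf.put((byte) 0); // string terminator
        return buf.array();
    }

    static Message decode(byte[] wire, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(wire).order(order);
        int id = buf.getInt();
        int bodyLength = buf.getInt();
        if (bodyLength != buf.remaining()) {
            throw new IllegalArgumentException("header length does not match body");
        }
        long ticks = buf.getLong();
        int threadId = buf.getInt();
        // Scan forward to the terminator, then read the string bytes before it.
        int start = buf.position();
        int end = start;
        while (buf.get(end) != 0) end++;
        byte[] nameBytes = new byte[end - start];
        buf.get(nameBytes);
        buf.get(); // consume the terminator
        return new Message(id, ticks, threadId, new String(nameBytes, StandardCharsets.UTF_8));
    }

    // Assumption: the client knows the ticks-per-second value and converts
    // the raw tick count into seconds itself.
    static double ticksToSeconds(long ticks, long ticksPerSecond) {
        return (double) ticks / ticksPerSecond;
    }

    public static void main(String[] args) {
        Message in = new Message(7, 3_000_000L, 42, "main");
        byte[] wire = encode(in, ByteOrder.LITTLE_ENDIAN);
        Message out = decode(wire, ByteOrder.LITTLE_ENDIAN);
        System.out.println(wire.length + " bytes, name=" + out.name
                + ", t=" + ticksToSeconds(out.cpuTicks, 1_000_000L) + "s");
    }
}
```

Carrying the body length in the header lets a reader skip messages whose ID it does not recognize, which is one reason a length field is commonly placed there.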
CPU time measurements in stand-alone mode show a performance improvement of only 30%, since most of the time is spent on I/O.
A buffering or caching strategy could be implemented, but the implementation time and complexity considerably outweigh the benefit for a peripheral use case.
Scalability:
Compression ratios:
Thread Profiling: ~56%
Call Graph Profiling: ~25%
The ratios are low because the binary message is a fixed-length structure, so many unused fields (~50% of all fields) are still being encoded (see Java Specification for Java Profiling).
Because the format currently uses a fixed-length structure, removing unused fields would require a bitmap or field identifiers.
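The bitmap option mentioned above could look roughly like this. It is a sketch under assumptions, not the proposed design: the class name is hypothetical, every field is treated as a long, and a 32-bit presence bitmap is assumed to be large enough for one message.

```java
import java.nio.ByteBuffer;

// Presence-bitmap encoding: a 32-bit mask says which fields follow,
// and only the fields that are actually set are written.
class PresenceBitmapSketch {

    // A null entry means "field unused"; it costs one bit instead of 8 bytes.
    static byte[] encode(Long[] fields) {
        if (fields.length > 32) {
            throw new IllegalArgumentException("this sketch supports at most 32 fields");
        }
        int bitmap = 0;
        int present = 0;
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] != null) {
                bitmap |= 1 << i;
                present++;
            }
        }
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + present * Long.BYTES);
        buf.putInt(bitmap);
        for (Long f : fields) {
            if (f != null) buf.putLong(f);
        }
        return buf.array();
    }

    // The reader must know the field count (for example from the message ID).
    static Long[] decode(byte[] wire, int fieldCount) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        int bitmap = buf.getInt();
        Long[] fields = new Long[fieldCount];
        for (int i = 0; i < fieldCount; i++) {
            if ((bitmap & (1 << i)) != 0) fields[i] = buf.getLong();
        }
        return fields;
    }

    public static void main(String[] args) {
        Long[] fields = {1L, null, 3L, null, null, 6L, null, 8L};
        byte[] wire = encode(fields);
        System.out.println("fixed-length: " + fields.length * Long.BYTES
                + " bytes, with bitmap: " + wire.length + " bytes");
    }
}
```

The trade-off is that each message pays a fixed bitmap overhead and the loader needs a branch per field, which is the kind of complexity the cost-benefit analysis would have to weigh against the size reduction.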
Stanislav will do a cost benefit analysis.
Capability:
The handshake algorithm for backward compatibility is still outstanding, so we are unable to deal with Java Profiler Agents from TPTP 4.4 and below.
Currently defaults to XML for controlled/enabled modes but binary for stand-alone mode.
For controlled/enabled modes, the Java Profiler sends XML data, but if the client responds, the binary format is used. This passive approach needs to be replaced with a handshake algorithm.
Needs to default to XML for all modes since this is the existing format and users cannot convert binary data to XML data.
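One possible shape for the format decision in such a handshake is sketched below. The handshake has not been designed yet, so everything here is hypothetical: the class and enum names, and the rule that a peer which advertises nothing (for example an older agent that does not know about the handshake) is treated as XML-only.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical format negotiation: binary is used only when both sides
// explicitly advertise it; every other case falls back to XML.
class FormatHandshakeSketch {

    enum Format { XML, BINARY }

    // A null set models a peer that never answered the handshake.
    static Format negotiate(Set<Format> agentFormats, Set<Format> clientFormats) {
        if (agentFormats == null || clientFormats == null) {
            return Format.XML;
        }
        if (agentFormats.contains(Format.BINARY) && clientFormats.contains(Format.BINARY)) {
            return Format.BINARY;
        }
        return Format.XML;
    }

    public static void main(String[] args) {
        Set<Format> both = EnumSet.of(Format.XML, Format.BINARY);
        Set<Format> xmlOnly = EnumSet.of(Format.XML);
        System.out.println(negotiate(both, both));    // both sides support binary
        System.out.println(negotiate(both, xmlOnly)); // client is XML-only
        System.out.println(negotiate(null, both));    // agent did not take part in the handshake
    }
}
```

Making XML the fallback in every ambiguous case matches the requirement that XML remain the default for all modes.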
Implementation:
Loader and import wizard are implemented.
However, the loader does not handle custom format.
The most inefficient part of the implementation is the loaders: they reuse the existing XML loaders to populate the name and value of each attribute, but use a new interface to convert primitive values to strings and set them based on the defined name.
Stanislav will do a cost benefit analysis of the loading costs.
No UI changes are needed other than a file extension filter for the import wizard.
Questions/Answers
Q: Is there a utility to convert binary to XML format (e.g. if the user generates binary data by mistake, would they need to rerun the trace)?
A: No. Setting the default mode to XML would solve this problem.
Q: For peer agent discovery, who controls defining the format? How do we handle mixed-modes?
A: This use case has not been considered.
Action Items
Stanislav to provide cost benefit analysis for:
Compression benefits and complexity costs for removing unused fields.
Performance costs for current binary data loaders.
Stanislav will implement the handshake algorithm for backward compatibility.
Stanislav will set the default mode to XML for stand-alone Java Profiling.