TPTP-AG-20080229

Date

  • February 29, 2008

Attendees

  • Present:
    • Paul Slauenwhite
    • Joanna Kubasta
    • Eugene Chan
    • Alexander Alexeev
    • Stanislav Polevic
    • Chris Elford

Topic

Minutes

  • The new binary format is intended to be a size-optimized alternative to the XML data format currently used by the TPTP Java Profiler, in order to increase scalability and performance.
  • The new binary stream format consists of:
  1. Data stream descriptor: Basic attributes describing the data stream, such as ID, version, encoding, and endianness.
  2. Messages: Individual binary messages consisting of:
    1. Header: Describes the message including unique ID and message length.
    2. Message attributes: Ordered integer, long, double, and null-terminated string attributes describing the message. Each message contains the CPU frequency (CPU ticks) for calculating the time stamp on the client side.
  • Performance
    • CPU time measurements in stand-alone mode show a performance improvement of only 30%, since most of the time is spent on I/O.
    • A buffering or caching strategy could be implemented, but the implementation time and complexity considerably outweigh the benefit for a peripheral use case.
  • Scalability:
    • Compression ratios:
      • Thread Profiling: ~56%
      • Call Graph Profiling: ~25%
        • Low because the binary message is a fixed-length structure, so many unused fields (~50% of all fields) are still encoded (see Java Specification for Java Profiling).
        • Because the structure is currently fixed-length, removing unused fields would require a bitmap or field identifiers.
        • Stanislav will do a cost benefit analysis.
  • Capability:
    • The handshake algorithm for backward compatibility is still outstanding, so Java Profiler agents from TPTP 4.4 and earlier cannot be handled yet.
    • Currently defaults to XML for controlled/enabled modes but binary for stand-alone mode.
    • For controlled/enabled modes, the Java Profiler will send XML data but if the client responds, the binary format will be used. This passive approach needs to be changed to a handshake algorithm.
    • Needs to default to XML for all modes since this is the existing format and users cannot convert binary data to XML data.
  • Implementation:
    • Loader and import wizard are implemented.
    • However, the loader does not handle custom formats.
    • The most inefficient part of the implementation is the loaders: they reuse the existing XML loaders to populate the name and value of each attribute, but use a new interface to convert primitive values to strings and set them based on the defined name.
      • Stanislav will do a cost benefit analysis of the loading costs.
    • No UI changes are needed other than a file extension filter for the import wizard.
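
To make the message layout described in these minutes more concrete, the following is a minimal sketch of one message being written and read back. It is not the TPTP wire format: the field widths (2-byte message ID, 4-byte total length, 4-byte integer, 8-byte long, 8-byte double), the UTF-8 string encoding, and the names `encode_message` and `decode_message` are all assumptions made for illustration. The byte order is passed in explicitly, standing in for the endianness recorded in the data stream descriptor.

```python
import struct

# Assumed widths for illustration only; the minutes do not specify them.
HEADER_FORMAT = "HI"          # 2-byte message ID, 4-byte total message length
HEADER_SIZE = 6
ATTR_FORMATS = {"int": "i", "long": "q", "double": "d"}


def encode_message(msg_id, attrs, byte_order=">"):
    """Encode an ordered list of (kind, value) attributes behind a header.

    kind is one of "int", "long", "double", "str"; byte_order is ">" (big
    endian) or "<" (little endian), as the stream descriptor would declare.
    """
    body = bytearray()
    for kind, value in attrs:
        if kind == "str":
            # Strings are written as bytes followed by a NUL terminator.
            body += value.encode("utf-8") + b"\x00"
        elif kind in ATTR_FORMATS:
            body += struct.pack(byte_order + ATTR_FORMATS[kind], value)
        else:
            raise ValueError("unknown attribute kind: %r" % kind)
    header = struct.pack(byte_order + HEADER_FORMAT, msg_id, HEADER_SIZE + len(body))
    return header + bytes(body)


def decode_message(data, schema, byte_order=">"):
    """Decode one message; schema is the ordered list of attribute kinds."""
    msg_id, length = struct.unpack_from(byte_order + HEADER_FORMAT, data, 0)
    offset = HEADER_SIZE
    values = []
    for kind in schema:
        if kind == "str":
            end = data.index(b"\x00", offset)   # find the NUL terminator
            values.append(data[offset:end].decode("utf-8"))
            offset = end + 1
        else:
            fmt = byte_order + ATTR_FORMATS[kind]
            (value,) = struct.unpack_from(fmt, data, offset)
            values.append(value)
            offset += struct.calcsize(fmt)
    if offset != length:
        raise ValueError("decoded %d bytes, header declared %d" % (offset, length))
    return msg_id, values


if __name__ == "__main__":
    message = encode_message(
        3, [("int", 7), ("long", 123456789012), ("double", 1.5), ("str", "main")]
    )
    print(len(message))  # 31 = 6 header + 4 + 8 + 8 + 5 ("main" plus NUL)
    print(decode_message(message, ["int", "long", "double", "str"]))
```

Because the attributes carry no type tags, the reader has to know the attribute order for each message ID in advance, which is what makes the encoding compact but also what makes removing unused fields non-trivial.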

Questions/Answers

  • Q: Is there a utility to convert binary to XML format (e.g. if a user generates binary data by mistake, would they need to rerun the trace)?
  • A: No. Setting the default mode to XML would solve this problem.
  • Q: For peer agent discovery, who controls defining the format? How do we handle mixed-modes?
  • A: This use case has not been considered.

Action Items

  • Stanislav to provide cost benefit analysis for:
    • Compression benefits and complexity costs for removing unused fields.
    • Performance costs for current binary data loaders.
  • Stanislav will implement the handshake algorithm for backward compatibility.
  • Stanislav will set the default mode to XML for stand-alone Java Profiling.
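
As background for the cost benefit analysis on removing unused fields, the sketch below compares a fixed-length record with a presence-bitmap variant, one of the two options raised in the minutes. The record shape (eight 8-byte fields), the use of `None` for an unused field, and the function names are hypothetical; the actual Java Profiler messages mix attribute types and are not modelled here.

```python
import struct


def encode_fixed(fields):
    """Fixed-length layout: every slot is written, unused ones as 0.

    Note that a reader cannot tell an unused field from a real 0 here.
    """
    return struct.pack(">%dq" % len(fields), *[0 if f is None else f for f in fields])


def encode_with_bitmap(fields):
    """Bitmap layout: one presence bit per field, then only the used values."""
    bitmap = 0
    present = []
    for index, value in enumerate(fields):
        if value is not None:
            bitmap |= 1 << index
            present.append(value)
    bitmap_size = (len(fields) + 7) // 8      # round up to whole bytes
    return bitmap.to_bytes(bitmap_size, "big") + struct.pack(">%dq" % len(present), *present)


def decode_with_bitmap(data, field_count):
    """Rebuild the full field list, restoring None for absent fields."""
    bitmap_size = (field_count + 7) // 8
    bitmap = int.from_bytes(data[:bitmap_size], "big")
    present_count = bin(bitmap).count("1")
    values = iter(struct.unpack_from(">%dq" % present_count, data, bitmap_size))
    return [next(values) if bitmap & (1 << i) else None for i in range(field_count)]


if __name__ == "__main__":
    # Half of the fields unused, mirroring the ~50% figure in the minutes.
    record = [10, None, 20, None, None, 30, None, 40]
    print(len(encode_fixed(record)))        # 64 bytes
    print(len(encode_with_bitmap(record)))  # 33 bytes: 1 bitmap byte + 4 values
    print(decode_with_bitmap(encode_with_bitmap(record), len(record)) == record)
```

The size saving comes at the cost of variable-length records and an extra decoding step per field, which is the complexity the analysis would need to weigh against the compression gain.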