Difference between revisions of "PTP/designs/rm framework"


Revision as of 12:32, 19 November 2010

Overview

This page describes the new design for the PTP resource manager monitoring/control framework.

The purpose of this framework is to:

  • Collect and display monitoring information relating to the operation of a target system in a scalable manner
  • Provide job submission, termination, and other job-related operations
  • Support debugger launch and attach
  • Enable the collection and display of stdout and the transmission of stdin for running jobs (where supported by the target system)

Monitoring information will comprise:

  • The status and queue position of users' jobs
  • Job attribute information
  • Target system status and health information for arbitrary configurations
  • The physical/logical location of jobs on the target system
  • Predictive information about job execution

Key attributes of the framework include:

  • Support for arbitrary system configurations
  • Support for all existing resource managers
  • The ability to scale to petascale system sizes and beyond
  • Support for both user-installable and system-installable modes of operation
  • Automated installation for user-installable operation
  • Simple addition of support for new resource managers

Design

Outstanding Issues