Difference between revisions of "COSMOS Design 209227"

Revision as of 11:49, 18 December 2007

Overview

Comments and discussion on the talk page

The scope of this document is to define exception and error handling consistently across the COSMOS project. COSMOS is expected to have different adoption patterns. For example, the Management Data Repository framework does not have to require the COSMOS user interface. Therefore, it is important for the logging and exception strategy of COSMOS to be compatible with existing management infrastructure. Because COSMOS does not have any existing exception or logging facilities, it makes sense to look towards existing standards for guidelines and ideas.


  • [update this section with conversation from Jimmy. Elaborate on adoption scenarios and how exceptions matter in here]
  • show this

WS-Soap Fault -> Convenience API -> DataManager API -> Native API

  • The network boundary is the only place where SOAP faults come into play
  • The convenience API must be able to catch exceptions because it may need to return a fault that is defined by a specification, e.g. CMDBf
  • Data Managers should be able to catch exceptions when wrapping native/existing data access APIs
  • We do not want to expose SOAP faults to: a) the user b) the adopter
  • What is the difference between the convenience API and the Data Manager API? (The convenience API will have spec-defined faults, which map to exceptions.)
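
The layering in the bullets above can be sketched as follows. This is only an illustration of the intended translation at each layer, not COSMOS code: every class name here is hypothetical, "ExampleFault" is a placeholder rather than a fault taken from any specification, and a plain string stands in for the SOAP fault that would be produced at the network boundary.

```java
// All names below are hypothetical and exist only for this sketch.

/** Stand-in for an exception thrown by a native/existing data access API. */
class NativeStoreException extends RuntimeException {
    NativeStoreException(String message) { super(message); }
}

/** Exception surfaced by the Data Manager API; keeps the native cause. */
class DataManagerException extends Exception {
    DataManagerException(String message, Throwable cause) { super(message, cause); }
}

/** Exception surfaced by the convenience API; carries a spec-defined fault name. */
class SpecFaultException extends Exception {
    private final String faultName;

    SpecFaultException(String faultName, String message, Throwable cause) {
        super(message, cause);
        this.faultName = faultName;
    }

    String getFaultName() { return faultName; }
}

/** Native layer: always fails here so the translation path is exercised. */
class NativeStore {
    String read(String id) {
        throw new NativeStoreException("connection refused for " + id);
    }
}

/** Data Manager layer: wraps native exceptions. */
class DataManager {
    private final NativeStore store = new NativeStore();

    String fetch(String id) throws DataManagerException {
        try {
            return store.read(id);
        } catch (NativeStoreException e) {
            throw new DataManagerException("Data manager could not read " + id, e);
        }
    }
}

/** Convenience layer: maps Data Manager exceptions to a spec-defined fault. */
class ConvenienceApi {
    private final DataManager manager = new DataManager();

    String query(String id) throws SpecFaultException {
        try {
            return manager.fetch(id);
        } catch (DataManagerException e) {
            // "ExampleFault" is a placeholder, not a name from CMDBf or WSDM.
            throw new SpecFaultException("ExampleFault", "Query failed for " + id, e);
        }
    }
}

/** Network boundary: the only place where an exception is rendered as a fault. */
class NetworkBoundary {
    static String handle(String id) {
        try {
            return new ConvenienceApi().query(id);
        } catch (SpecFaultException e) {
            return "Fault[" + e.getFaultName() + "]: " + e.getMessage();
        }
    }
}
```

Each layer catches only the exception type of the layer directly beneath it and keeps the original as the cause, so neither the user nor the adopter sees a fault type, while the full chain remains available for logging.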


Exception Definition

The OASIS Web Services Distributed Management (WSDM) specification defines a general purpose event format and a set of situations that will be used as the logging and event structure for COSMOS. The WSDM Event Format (WEF) is derived from the Common Base Event (CBE) structure found in the TPTP project. In fact, CBE was the initial submission to OASIS and served as the starting point for WEF.


One of the goals of COSMOS is consistency between the exceptions in the code, the entries that are logged, and the management events that are raised. Consistency between the logs and exceptions brings the management aspects closer to the point of origin in the code and improves manageability. A key aspect of this consistency is situations, and more specifically, situation category.


The WSDM specification defines a “situation” element that helps classify an event. The situations were derived by a “thorough analysis of event types” [MUWS, Part 2, 2.5.1]. Situations add another dimension of classification for events and facilitate consistent analysis across heterogeneous components, including COSMOS. All COSMOS developers should be familiar with the WSDM specification, specifically WSDM 1.1, MUWS Part 1, section 4, as well as WSDM 1.1, MUWS Part 2, section 2.5 and Appendix F. These sections define the situation format and present guidelines for its usage.

  • [Is the burden of understanding situations too much for the developer? They would need to know which situation applies.]


Not all fields defined by the WSDM specification for the situation type are necessary for exceptions. For example, because exceptions are thrown when bad things happen, SuccessDisposition will always be “Unsuccessful”. Likewise, the “Message” field is logically the same as the private field “detailMessage” in java.lang.Throwable. COSMOS can take guidance from the WSDM specification by creating a common exception class that captures the extra detail, which can then be placed directly into the logs as part of the situation:

  • SituationCategory (required)
  • SituationTime (optional, defaults to System time)
  • Priority (optional)
  • Severity (optional)


There will be a root level exception defined as part of COSMOS (org.eclipse.cosmos.common.exceptions.CosmosException). This will subclass java.lang.Exception and define protected variables for each of the four additional fields defined above. An enumerated list of values will be provided for SituationCategory and Severity. Exceptions are considered part of the COSMOS API and will conform to the API guidelines specified by Eclipse.
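
A minimal sketch of such a root exception is shown below, assuming the four fields listed above. The constructors, accessors, and the Severity values are assumptions made for illustration; the SituationCategory names are modelled on the categories in MUWS Part 2 and should be checked against the specification before use.

```java
import java.util.Date;
import java.util.Objects;

/** Situation categories; names modelled on MUWS Part 2 (verify against the spec). */
enum SituationCategory {
    AVAILABILITY, CAPABILITY, CONFIGURE, STOP, START, REQUEST,
    DESTROY, CREATE, DEPENDENCY, CONNECT, REPORT, OTHER
}

/** Hypothetical severity levels; the real enumeration is still to be defined. */
enum Severity { INFORMATION, WARNING, MINOR, MAJOR, CRITICAL, FATAL }

/** Sketch of the root COSMOS exception carrying situation detail. */
class CosmosException extends Exception {
    protected final SituationCategory situationCategory; // required
    protected Date situationTime;                        // optional, defaults to system time
    protected Integer priority;                          // optional, null when unset
    protected Severity severity;                         // optional, null when unset

    CosmosException(String message, SituationCategory category) {
        this(message, null, category);
    }

    CosmosException(String message, Throwable cause, SituationCategory category) {
        super(message, cause);
        this.situationCategory =
            Objects.requireNonNull(category, "situationCategory is required");
        this.situationTime = new Date(); // default: time at which the exception is created
    }

    SituationCategory getSituationCategory() { return situationCategory; }
    Date getSituationTime() { return situationTime; }
    Integer getPriority() { return priority; }
    Severity getSeverity() { return severity; }

    void setSituationTime(Date time) { this.situationTime = time; }
    void setPriority(Integer priority) { this.priority = priority; }
    void setSeverity(Severity severity) { this.severity = severity; }
}
```

A caller would then write something like `throw new CosmosException("repository unreachable", SituationCategory.AVAILABILITY);` and set priority or severity only where that classification is known.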


The use of the additional fields added by COSMOSException is strongly encouraged throughout the project. However, there are circumstances within the code where it may be difficult to determine additional classification via situation. Thus, the use of these additional fields is optional. Further, where third-party users are extending the framework, we will encourage, but not require, the adoption of these fields.

Exception Hierarchy

The main exception class in COSMOS is: org.eclipse.cosmos.common.COSMOSException. It is the intent of the COSMOS project to keep the exception hierarchy shallow, and introduce child exceptions only when necessary.

One situation where it is necessary to introduce a new subclass of exception is when a management standard defines a set of faults. An example of this is the CMDBf specification. In these circumstances, it is appropriate to map a Java exception onto the fault defined by the specification. For an example, see org.eclipse.cosmos.dc.mdr.exception.CMDBfException.
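
As a rough illustration of such a mapping, the sketch below shows a subclass that records which spec-defined fault it corresponds to. It is not the actual CMDBfException from the COSMOS code base: the CmdbfFault enum and its two values are illustrative assumptions, and the root exception is reduced to a stand-in so that the example compiles on its own.

```java
/** Minimal stand-in for the COSMOS root exception so this sketch compiles alone. */
class CosmosException extends Exception {
    CosmosException(String message, Throwable cause) { super(message, cause); }
}

/** Illustrative fault names only; the authoritative list is in the CMDBf specification. */
enum CmdbfFault { QUERY_ERROR, REGISTRATION_ERROR }

/** Sketch of a spec-specific child exception that maps onto a defined fault. */
class CMDBfException extends CosmosException {
    private final CmdbfFault fault;

    CMDBfException(CmdbfFault fault, String message, Throwable cause) {
        super(message, cause);
        this.fault = fault;
    }

    /** The fault that the network boundary should emit for this exception. */
    CmdbfFault getFault() { return fault; }
}
```

Keeping the fault identity as a field rather than as one subclass per fault is one way to keep the hierarchy shallow, in line with the intent stated above.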

Logging an Exception

  • Joel volunteered to take first pass here.
    • Some initial thoughts....

All WSDM faults implemented in Muse derive from BaseFault. BaseFault contains an 'origin' field, for designating the party responsible for raising the fault. We may want to add this as an optional field in our CosmosException base class. When not present, a value can be inferred from the current handler of an inbound request (this will require a bit of work to support, but is doable).

The current CosmosFault implementation appears to sit somewhere between a WEF event and a BaseFault. It contains situation info (WEF), but has none of the referential fields (such as Reporter).

The relationship between exceptions and logging: Exceptions are created and thrown at the origin of the problem. Logging is (usually) done at the receiving end of the exception, e.g. in a catch block. Logging can also be done in situations where no exception is involved. Logs are used for debugging and tracing purposes, whereas exceptions are programming constructs for handling and recovering from error situations.

We appear to be settling on Log4j as an implementation, which is reasonable considering our relationship to Apache Muse. Log4j supports a logging delegation model based on appenders. By implementing our own appender for logged COSMOS events, we can do things like advertise the existence of a log file or forward exceptions as WEF events to an external listener.
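
The delegation idea can be sketched without any external dependency by using java.util.logging, whose Handler plays roughly the role that an appender plays in Log4j. This is a stand-in for illustration, not a Log4j appender: the CosmosEventListener interface and the string it receives are assumptions, and a real implementation would build a WEF event rather than a text line.

```java
import java.util.logging.Handler;
import java.util.logging.LogRecord;

/** Hypothetical receiver for forwarded events (e.g. an external WEF consumer). */
interface CosmosEventListener {
    void onEvent(String event);
}

/**
 * Forwards log records that carry an exception to a listener.
 * java.util.logging analogue of a custom Log4j appender.
 */
class ForwardingHandler extends Handler {
    private final CosmosEventListener listener;

    ForwardingHandler(CosmosEventListener listener) {
        this.listener = listener;
    }

    @Override
    public void publish(LogRecord record) {
        if (!isLoggable(record)) {
            return; // respects this handler's level and filter
        }
        Throwable thrown = record.getThrown();
        if (thrown == null) {
            return; // only records logged with an exception are forwarded
        }
        // A real implementation would populate a WEF event here.
        listener.onEvent(record.getLevel() + " " + record.getLoggerName() + ": "
                + record.getMessage() + " ("
                + thrown.getClass().getSimpleName() + ": " + thrown.getMessage() + ")");
    }

    @Override
    public void flush() {
        // nothing is buffered in this sketch
    }

    @Override
    public void close() {
        // no resources to release in this sketch
    }
}
```

The handler is attached with `logger.addHandler(new ForwardingHandler(listener))`, after which a catch block only needs `logger.log(Level.SEVERE, "Request failed", e)`; the code at the catch site does not need to know where the event is forwarded.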


Components:

  • Domain
  • Broker
  • MDR/Data manager

Each component will keep its own log file; tooling (TPTP) can be used to merge them when doing analysis.

Log format:

  • Java logging
  • muse logging (?)
  • any transaction ID to inject into log record for correlation

We'll require our own logging formatter to enable things like transaction ID injection. This argues (again) for implementing a COSMOS Log4j appender.
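
One possible shape for transaction ID injection is sketched below, again using java.util.logging as a dependency-free stand-in rather than Log4j. TransactionContext, the "[txn=...]" prefix, and the "-" placeholder for a missing ID are all assumptions made for this example.

```java
import java.util.logging.Formatter;
import java.util.logging.LogRecord;

/** Hypothetical per-thread holder for the current transaction ID. */
final class TransactionContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TransactionContext() { }

    static void begin(String transactionId) { CURRENT.set(transactionId); }

    static void end() { CURRENT.remove(); }

    /** Returns the current ID, or "-" when no transaction is active. */
    static String currentId() {
        String id = CURRENT.get();
        return id == null ? "-" : id;
    }
}

/** Formatter that prefixes every log line with the current transaction ID. */
class TransactionFormatter extends Formatter {
    @Override
    public String format(LogRecord record) {
        return "[txn=" + TransactionContext.currentId() + "] "
                + record.getLevel() + " "
                + formatMessage(record)
                + System.lineSeparator();
    }
}
```

Because the ID is read from a thread-local at format time, this only works when formatting happens on the thread that handled the request. An asynchronous handler would need to capture the ID when the record is created instead.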

Error code

Translation

  • Message catalogs


  • Map Operational Status on the MDR
    • If the Data Manager is available but the application it wraps is not, what is the operational status of the Data Manager?
  • logfile registry & viewer
  • error/warning event pub/sub
  • CBE vs. WEF
  • msg formats & IDs (Valentina to help here)




Additional Topics to cover...

  • stratify the conversation
    • What do we do at the exact point of exception
    • What do we do where we catch / recover the exception
    • What is the interface between local exception handling and WSDM
    • How do we interact with the logging system - it must accept these WEF events?
    • How do we integrate with non-COSMOS exceptions?
    • Which component will build these features?


