The COSMOS team is holding a face-to-face meeting that bookends the TPTP face-to-face meeting in Toronto, Canada, Jan 8 - 12, 2006.
Daily Conference Calls
We will hold daily conference calls at 11:00am and 5:00pm EST.
Call in number 888 241 8547 pc: 999451
Mark Weitzel (Jan 8 - Jan 12)
Harm Sluiman (Jan 8 - Jan 12)
Valentina Popescu (Jan 8 - Jan 12; not available from 1-4PM Jan 8)
Don Ebright (Jan 8 - Jan 9 morning and early afternoon)
Joel Hawkins (Jan 8 - Jan 9 morning and early afternoon)
Oliver Cole (??)
Marius Slavescu (Jan 8 - Jan 10, Jan 12)
Joseph P Toomey (Jan 8 - Jan 10)
The agenda contains a section for questions/discussions/issues we want to work through with the TPTP team.
The main goal is to establish a project plan, resource assignments, and the set of deliverables necessary to realize the initial COSMOS prototype.
Monday Jan 8
I (Don) wrote this agenda but feel free to make changes
- 8:00 Review current status and plans for TPTP AC and agents
- 8:30 Long term requirements for TPTP and COSMOS agent infrastructure
- 9:00 Data collection query interface
- 11:00 Conference call
- 11:30 Lunch / Demo of portal based log viewer (tentative)
- 12:00 OSGi based agent control framework
- 2:00 Data normalization
- TPTP APIs, new and existing
- 4:00 Anything else
- 5:00 Conference call
Mark Weitzel, Harm Sluiman, Joel Hawkins, Don Ebright, Hubert Leung, Marius Slavescu, Valentina Popescu, Zulah Eckert (for OSGi discussion), Joe Toomey
We spent the first part of the meeting reviewing the history and current state of the TPTP architecture, in particular the RAC and the existing models that it uses.
There are five models in TPTP: Hierarchy, Statistical, Log, Trace, and Test. Of immediate relevance to COSMOS are the Hierarchy, Statistical, and Log models.
In TPTP, the existing models reside in memory and are created using EMF. In several areas, the use of EMF bleeds through to the API layer. It was agreed that we did not want EMF to be surfaced in the API layers in the data collection work. That applied to both the writing of data and the reading of data.
- The working assumption is that the existing TPTP query interface (RDB -> EMF) will be reworked.
- The queries are tightly coupled to the requirements of the UI implementation.
- COSMOS would not rely on EMF in the views, primarily b/c these would be web based reports/UI.
- The existing EMF models may continue to be used by TPTP for the existing views. However, this was determined to be a TPTP decision. See using existing TPTP views with new data store for an overview.
- For Log data, there is a data store already that is used in the large log support for TPTP.
EMF & Metadata: One of the advantages of using EMF is that it provides a mechanism to derive meta information about the models that are being worked with. There will be multiple levels of metadata in the system that we will need to work with. Ideally, a mechanism should be in place to provide a consistent strategy.
- EMF may be appropriate for a RCP/Eclipse UI, but this must not bleed through to any other level.
- Will SML (or CML) be a mechanism that we can leverage?
High Level Architecture: The second part of the day was focused on diagramming the high level architecture of COSMOS and where we would share (be dependent on) parts of TPTP. In later sessions, these diagrams were refined to focus on what could be delivered by the March milestone.
During this discussion, it was observed that there is an opportunity to formalize the control interface of the RAC. The formalization would be done using WSDM interfaces. This would provide a standards based way of controlling the data collection mechanism. This approach has several advantages:
- COSMOS could be data collection agnostic and still control the collection of data
- Just as the RAC would be enabled, other agents could as well, e.g. Nagios, or commercial vendors
- The existing commands that flow across the command channel of the RAC would be explicitly expressed
- A standard API would allow data collection to participate in higher order processes, e.g. BPEL based workflows
- It may be possible to combine SML-IF or CML documents in the API as a way to specify information to collect (or the kind of information to collect). A rough initial pass at this is represented in the demo scenario where an SML-IF document is updated to indicate that log and statistical information is now available.
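The idea of a formal control interface can be sketched in plain Java. This is a minimal illustration only: `DataCollectionControl`, `InMemoryAgent`, and `CollectionManager` are hypothetical names invented for this sketch and are not taken from WSDM, Muse, or TPTP. It shows how a manager that sees only a control contract can stay agnostic about which agent technology sits behind it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical control contract; names are illustrative, not from WSDM or TPTP.
interface DataCollectionControl {
    void start();
    void stop();
    boolean isCollecting();
}

// Stand-in for an adapter that would forward calls to a real agent
// (for example over the RAC command channel). Here it only tracks state.
class InMemoryAgent implements DataCollectionControl {
    private boolean collecting;

    public void start() { collecting = true; }
    public void stop() { collecting = false; }
    public boolean isCollecting() { return collecting; }
}

// The manager depends only on the control contract, so any agent that
// implements it can be registered and controlled the same way.
class CollectionManager {
    private final Map<String, DataCollectionControl> agents = new LinkedHashMap<>();

    void register(String id, DataCollectionControl agent) {
        agents.put(id, agent);
    }

    void startAll() {
        for (DataCollectionControl agent : agents.values()) {
            agent.start();
        }
    }

    void stopAll() {
        for (DataCollectionControl agent : agents.values()) {
            agent.stop();
        }
    }

    int activeCount() {
        int count = 0;
        for (DataCollectionControl agent : agents.values()) {
            if (agent.isCollecting()) {
                count++;
            }
        }
        return count;
    }
}
```

In a real implementation the contract would be expressed as a WSDM manageability interface rather than a Java interface; the sketch only illustrates the decoupling.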
An interesting observation that emerged from this conversation is that using WSDM in this capacity would provide an opportunity to develop a new standard for the industry. This led to a discussion of whether COSMOS should actively try to lead the industry in defining new standards for systems management or simply use management standards where they exist.
- It was agreed that the role of COSMOS should be to facilitate the development of new standards, and to provide reference implementations of existing standards where appropriate.
The other advantage to taking a standards based approach is that each part of the infrastructure could automatically discover and bind to the other parts. This would be similar to taking a SOA based approach where SCA could be used to “wire” together a systems management infrastructure.
This approach would also allow an optimization that short circuits any remote connections. The use of Tuscany was brought up, and this was generally viewed positively, although the full implications of adding a dependency on this project were unclear. Everyone agreed this should be explored in more detail.
Longer term, this would lead to decoupling the runtime monitoring from the development time instrumentation by allowing the application to discover the available monitoring infrastructure.
We also discussed rehosting the TPTP AC in process. This discussion, combined with the discussion about applying/wrapping standards around the RAC, led to the realization that this could transform what COSMOS delivers into a basic Java based (OSGi) agent framework. This would also allow COSMOS to be self hosting. A question was raised as to whether this was outside the initial scope proposed in the project creation and, if so, whether the participating companies wanted to pursue it.
This is an open item.
The final part of the afternoon was spent discussing the query interface that will be shared between TPTP and COSMOS. The detailed notes on the query interface are captured here.
Tuesday Jan 9
- More Data Modeling re: data collection, loaders, persistence, etc...
- Look at the Hierarchy and Statistical models
- Joe T. to look at the Test model
- Attempt to identify small API
- Query interface
Mark Weitzel, Joel Hawkins, Don Ebright, Hubert Leung, Marius Slavescu, Joe Toomey
We began by looking at some existing work performed by IBM to create a portal based log viewer. This work was motivated by the fact that the existing CBE models use the default XMI storage on the file system, which results in scalability problems. The large log file support allows this model to be persisted in a database. The inability to search and index the logs was a further motivation for this support.
- The viewer is JSR 168 compliant
- It uses a persistent data store for log information. This is the same data store that is used in the TPTP large log file support.
- The portlet GUIs use the EMF model
- A new, separate controller was written for the TPTP log import because the existing logic was very entangled with the Eclipse UI.
- Logs were the only TPTP model that was put into this environment
An important decision was reached: the role of the normalization layer is the transformation of similar data types, not an overall any to any transformation. This also implies that there are distinct “kinds” of models, e.g. the way TPTP has things laid out now.
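To make the scope of normalization concrete, here is a small Java sketch, assuming a single "kind" of model (statistical) and two hypothetical raw reading types that differ only in units. All type names (`StatSample`, `StatNormalizer`, `KilobyteReading`, `ByteReading`) are invented for illustration and do not come from TPTP or COSMOS. Each normalizer maps a similar source type to one common shape; there is no attempt at a general any to any transformation.

```java
// Common target shape for the statistical "kind" of model (hypothetical).
final class StatSample {
    final String metric;
    final long bytes;

    StatSample(String metric, long bytes) {
        this.metric = metric;
        this.bytes = bytes;
    }
}

// One normalizer per source type, all producing the same target shape.
interface StatNormalizer<S> {
    StatSample normalize(S raw);
}

// Hypothetical source that reports memory in kilobytes.
final class KilobyteReading {
    final long kilobytes;

    KilobyteReading(long kilobytes) {
        this.kilobytes = kilobytes;
    }
}

// Hypothetical source that reports memory in bytes.
final class ByteReading {
    final long bytes;

    ByteReading(long bytes) {
        this.bytes = bytes;
    }
}

final class KilobyteNormalizer implements StatNormalizer<KilobyteReading> {
    public StatSample normalize(KilobyteReading raw) {
        // Convert to the common unit (bytes) used by StatSample.
        return new StatSample("memory.used", raw.kilobytes * 1024L);
    }
}

final class ByteNormalizer implements StatNormalizer<ByteReading> {
    public StatSample normalize(ByteReading raw) {
        // Already in the common unit; only the shape changes.
        return new StatSample("memory.used", raw.bytes);
    }
}
```

A log model would get its own target shape and its own set of normalizers rather than sharing these.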
Extensibility was raised as an issue both on how you would add new models and how a user would add to existing models.
This is an open item.
It was agreed that for the initial delivery in March, we would provide a very basic servlet based web interface. However, there was a discussion of what the proper user interface should be long term, e.g. portal, web2.0, etc…
- We will need to think through what UI and user experience we wish to provide.
- We also need to think through how an automated log import and monitoring facility will work. This may relate to the control API that was discussed on Monday.
- We have not specified an event filtering and propagation mechanism.
This should fall into the domain of the Reporting UI component.
WSDM and TPTP Models
Currently, the log model in TPTP supports the Common Base Event. A question was raised as to how we would support the industry standard WSDM Event Format.
In addition, the WSDM specification presents the idea of normalization for several areas of management. In the WSDM specification, these are referred to as capabilities. For example, the metrics capability defines what statistical information should look like. Another example of the kind of information that is normalized in WSDM is Operational Status.
- The data collection team needs to review these standard capabilities with the existing models to determine the kind of API and storage mechanism that we need to produce
- The advantage of using this set of normalized information in the APIs would be that it would be based upon an industry ratified standard
Eclipse Con Demo Outline
We spent the afternoon taking an initial cut at the Eclipse Con Demo. The intent was to use the definition of a demo to drive the concrete deliverables for March.
- Show BIRT reports for uptime and statistical data.
- Show normalization of data, emphasizing the collection of data from multiple sources.
- Show management of the agent infrastructure through WSDM.
System Under Test
- We can use the TPTP perfmon agent for Windows and a Tomcat instance.
- We can use JMX for the Tomcat application.
- We may be able to use a WSDM agent from Tivoli BTM??
- MUSE 2.2 (fallback position is 2.1).
- J2SE 5 (Sun JVM)
- TPTP 4.4 M1 driver (fallback is 4.3).
Need to assemble this stack and build a WSDM API to start and stop agents. We expect it to run inside the TPTP RCP JVM and use the IAC (RAC for fallback). (Joel)
- We decided that Marius's prototype query API will be the upper (consumer) layer of the query API and his persistence APIs will be the lower layer API interfaces to the data store.
- We decided to implement loaders and persistence and query APIs for statistical and log models.
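The two layer split described above can be illustrated with a short Java sketch. This is not Marius's prototype; `LogRecord`, `LogStore`, `InMemoryLogStore`, and `LogQuery` are hypothetical names chosen for this example. The lower layer is the persistence interface to the data store, the upper layer is what consumers call, and both use plain Java types so that no EMF types surface in the API.

```java
import java.util.ArrayList;
import java.util.List;

// Plain data holder; no EMF types appear in the API.
final class LogRecord {
    final long timestamp;
    final int severity;
    final String message;

    LogRecord(long timestamp, int severity, String message) {
        this.timestamp = timestamp;
        this.severity = severity;
        this.message = message;
    }
}

// Lower layer: hypothetical persistence interface to the data store.
interface LogStore {
    void append(LogRecord record);
    List<LogRecord> readAll();
}

// Trivial in-memory store used only to make the sketch runnable.
class InMemoryLogStore implements LogStore {
    private final List<LogRecord> records = new ArrayList<>();

    public void append(LogRecord record) {
        records.add(record);
    }

    public List<LogRecord> readAll() {
        return new ArrayList<>(records);
    }
}

// Upper (consumer) layer: queries are expressed against the store interface,
// so the backing store can be swapped without changing consumers.
class LogQuery {
    private final LogStore store;

    LogQuery(LogStore store) {
        this.store = store;
    }

    List<LogRecord> withMinSeverity(int minSeverity) {
        List<LogRecord> matches = new ArrayList<>();
        for (LogRecord record : store.readAll()) {
            if (record.severity >= minSeverity) {
                matches.add(record);
            }
        }
        return matches;
    }
}
```

A database backed store would push the filter down into the store rather than reading everything; the in-memory loop is only for illustration.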
For the collection of the statistical data, we can collect the memory usage of the OS (via perfmon), Tomcat (via JMX), an application deployed in Tomcat (via JMX), and the TPTP agent controller (via WSDM).
There were several other observations and discussions on the High_level_architecture, e.g. optimization, 'matched set of components', that have been worked into the notes on that page.
The rest of the afternoon was spent discussing the query API and is captured here.
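As a rough illustration of the JMX part of this, the standard `java.lang.management` API exposes heap usage through the platform `MemoryMXBean`. The sketch below reads the heap usage of the JVM it runs in; `MemoryProbe` is a hypothetical helper name. Reading the values from a separate Tomcat process would additionally need a remote JMX connection, which is not shown here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper that samples heap usage of the current JVM via JMX.
final class MemoryProbe {
    private MemoryProbe() {
    }

    // Returns the number of heap bytes currently in use.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }
}
```

A collector could call `MemoryProbe.usedHeapBytes()` on a timer and hand each value to the statistical loader.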
Wednesday Jan 10
- 8:00am - 10:00am Review/present work so far to TPTP team
- BtM / ME
Mark Weitzel, Marius Slavescu, Oliver Cole, Vsevolod Sandomirskiy (Seva)
On Wednesday we spent time working through issues surrounding the Management Enablement (ME) component of COSMOS. Oliver Cole led this discussion as a way to position the ME work. Several key points were raised that served as an outline to the discussion.
- Have we articulated the true value of COSMOS to the consumers? (Is the project proposal enough?)
- What is the key differentiating value of COSMOS above and beyond other efforts?
- What is the exact use case and key targets that COSMOS will accomplish?
One concern is that without some presence in the market (mindshare), the feedback and its value will be limited. Therefore, the March deliverable should be viewed as a stepping stone, an initial pass at providing something tangible where the community can see demonstrable progress. Additional goals for the March deliverable are to:
- Use the work to garner feedback from the community.
- Garner support and participation from the community
- Re-engage key vendors that we would like to participate and review with them the deliverables
Using the above as guidelines, and focusing on the March deliverable, we concentrated on:
- articulating the value of COSMOS
- describing the benefit available in the March prototype
- defining a single key message for the demo
- refining the scenario to focus on the most effective way to deliver this message
Value & Differentiation
We spent some time discussing other efforts available to consumers of systems management software. We determined that the keys to COSMOS’ value and differentiation are being able to address management throughout the entire lifecycle, from development to production, and using a consistent (soon to be standard) language to share information between the stages of the lifecycle.
Based on the above, we focused on what we would show in the demo. This is captured in the current Eclipse con demo script.
Thursday Jan 11
- 9:00am - 11:00am EST: Work on query APIs
- 11:00am - 3:00pm EST
- Using SML in Management Enablement
- Interaction interfaces with validator
- Discuss and agree first deliverables
- Initial assumptions:
March: SML Validator & samples (not just basic ones, but more complex ones that can form a 'bootstrap' for editor validation). Second release: SML GUI editor.
Mark Weitzel, Oliver Cole, Seva, Harm Sluiman, Valentina Popescu, Steve Jerman (via phone), Zulah Eckert (via phone)
Thursday morning was spent discussing how we intend to use SML and SML-IF with Steve Jerman (via phone). Zulah Eckert also joined via phone.
We would like COSMOS to have synergy with the SML & CML workgroups. Specifically, Steve J., (with support from Harm and Valentina) agreed to:
- “Advertise” COSMOS on the SML/CML workgroup.
- Promote the cross fertilization b/t COSMOS and the SML/CML workgroup. This is because COSMOS will be an early consumer of the work produced by the SML/CML team.
- Be a direct feedback link from COSMOS to the SML/CML workgroup
It was suggested that COSMOS (as a project) could pursue membership in the CML workgroup. Although this was thought of as a good idea, it’s likely that the legal agreements that need to be signed (by the Eclipse foundation) would be a significant barrier.
The afternoon was spent drilling down into the Eclipse con demo script and defining an initial project plan.
Specifically, it was decided that Management Enablement was ready to show as a concept, but it was premature to implement ME capabilities. The data collection and reporting capabilities could be defined with a good level of confidence, but Management Enablement is not so well defined. While it is clear that Management Enablement is supposed to "Match Impedances" between data collection and reporting/monitoring and that SML is a key differentiator, it is not yet clear how that should work. The decision was made to have the OCS resources (0.5 of email@example.com) support the data collection software effort for the March demo.
Friday Jan 12
11:00 - 12pm EST Wrap up.
Initial Development Plan Jan 12 - March 1
- Jan 12
Draft of Demo Script (Mark W.) Submit demo to Eclipse Con (Mark W.)
- Jan 18
Each team must review the demo script and determine their work items.
- Don E: Data Collection
- Craig T: Reporting
- Oliver C: Management Enablement
- Steve J: Resource Modeling
February 14: Demo Script Final
March 1: Code Complete
- RAC roadmap and plans for a manageability interface
Resulting Action Items
- Don Ebright to document the decisions on the data query API. This will need to be done with the help of Marius and Sleyva.
- Can we use the existing data store model for Logs that is used in the TPTP large log file support, at least in the short term? (Marius/Mark/Don (or Joel?))
- Research the use of Tuscany in the shared architecture (Joel w/Mark)
- The data collection team needs to review these standard capabilities with the existing models to determine the kind of API and storage mechanism that we need to produce (Mark/Joel/Marius/Don)
- Since many of the key partners that we would like to participate in COSMOS are in Northern California, or will be at Eclipse con, we should use this as an opportunity to re-engage and review with them the deliverables and progress (Toni et al.)
- Need to establish timeframe so we can get on top of the travel arrangements