
Application Health Summary

Business Goal

Reduce the mean time to root cause for a performance problem with an application middleware resource.

Technical Goals

  1. Data providers will update, in real time, the health and performance of an application's middleware resource components.
  2. A central registry will contain a reconciled and current view of the application components. The middleware resources that support the application are reconciled across multiple data providers.
  3. The consumers of the application's health are able to dynamically determine the health of the components. Any time a user hovers over the icon representing the middleware resource, an OSLC client will dynamically determine the current providers and federate their data about the resource into a UI preview.
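
The federation in goal 3 can be sketched as a merge of per-provider reports into one view that remembers which provider supplied each value. This is a minimal sketch under stated assumptions: the `ProviderReport` shape, the provider names, and the metric names are hypothetical and do not come from any OSLC specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderReport:
    """One provider's view of a resource (hypothetical shape)."""
    provider: str   # name of the data provider
    metrics: dict   # metric name -> latest reported value


def federate(reports):
    """Merge per-provider metrics into one view, keeping the source of each value."""
    view = {}
    for report in reports:
        for name, value in report.metrics.items():
            # Keep every provider's value; the preview decides how to show overlaps.
            view.setdefault(name, []).append((report.provider, value))
    return view


# Hypothetical reports from two providers that both monitor the same resource.
reports = [
    ProviderReport("os-agent", {"cpu_percent": 71.0, "disk_free_gb": 12.5}),
    ProviderReport("app-server-agent", {"heap_used_mb": 812.0, "cpu_percent": 69.5}),
]
preview_data = federate(reports)
print(preview_data["cpu_percent"])
```

Keeping the provider name next to each value lets the preview show overlapping metrics side by side instead of silently picking one.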

Preconditions

Postconditions

Steps

  1. An existing resource (e.g. web application server) is used to host an application.
  2. An application health consumer queries for monitoring information about the resource.
  3. The application health provider responds with a set of Best Practices metrics that summarize the health of the web application server and its applications.
  4. The end user visualizes the current health of the application server and its applications through the UI preview and uses that information to determine if the performance problem is a trend or an anomaly.
  5. Based on their determination, the user opens a ticket, launches into a deep-dive monitoring tool, or runs some automation to quickly resolve a known issue.
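
Step 3 can be sketched from the provider's side as a function that turns a resource's metrics into a small HTML summary for the preview. The metric names, the sample values, and the `render_health_page` helper are illustrative assumptions; this page does not define which Best Practices metrics a provider returns.

```python
from html import escape


def render_health_page(resource_name, metrics):
    """Render a metric summary as a small HTML page for a UI preview."""
    rows = "".join(
        f"<tr><th>{escape(str(label))}</th><td>{escape(str(value))}</td></tr>"
        for label, value in metrics.items()
    )
    return (
        "<html><body>"
        f"<h1>{escape(resource_name)}</h1>"
        f"<table>{rows}</table>"
        "</body></html>"
    )


# Hypothetical sample values for a web application server.
sample_metrics = {
    "Status": "Active",
    "Heap used (MB)": 812,
    "Active sessions": 134,
}
page = render_health_page("appserver1", sample_metrics)
print(page)
```

Escaping every label and value keeps metric text that happens to contain markup from breaking the preview page.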

Examples

| UI Preview Shows | Implication(s) |
| From the UI preview, the app owner can see that the number of users connecting to a component and doing work is trending up | This might point to a capacity problem under peak usage or point to a tuning problem during normal usage of the application. |
| The number of outstanding connections between components is trending up over time | This points to a connection leak in the application. The app owner should open a defect against the app. |
| The heap usage of a software server is trending up over time | This points to a memory leak. The app owner should open a defect against the app. |
| If the resource is an operating system/computer system, the user is presented with a list of running agents and can see that a monitor that should be running is not running |  |
| The user can see that available disk space is trending down or is simply lower than expected | This points to an application not cleaning up logs or files or a capacity problem beginning to form |
| The user can see the Top 5 processes in terms of CPU utilization and one is pegging a CPU |  |
| The user can see the Top 5 processes in terms of Memory utilization |  |
| The user can see the database server's operational status | If non-Active, may require administration |
| The user can see that the percentage of used buffer pool is getting close to 100, and/or the number of connection entries is higher than normal | May point to a connection leak in an application or a capacity problem |
| The user can see that a reorganization of a table is needed |  |
| The user sees that a particular database has a high number of failed SQL statements | Could point to a poorly written application |
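
Several of the examples above depend on telling a sustained trend from a one-off spike. One simple way to illustrate the distinction is the least-squares slope of recent samples: a steady climb gives a clearly positive slope, while a single spike in the middle of the window largely cancels out. This is a sketch under stated assumptions, not a method taken from this page; the sample values and thresholds are made up, and samples are assumed to be equally spaced in time.

```python
def slope(samples):
    """Least-squares slope of equally spaced samples (units per sample interval)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def classify(samples, threshold):
    """Label a series by comparing its slope with a caller-chosen threshold."""
    s = slope(samples)
    if s > threshold:
        return "trending up"
    if s < -threshold:
        return "trending down"
    return "flat"


# Hypothetical samples: heap usage in MB, free disk in GB, CPU percent with one spike.
heap_used = [500, 520, 545, 560, 590]
disk_free = [40, 38, 35, 33, 30]
cpu_spike = [50, 50, 95, 50, 50]

print(classify(heap_used, threshold=5))
print(classify(disk_free, threshold=1))
print(classify(cpu_spike, threshold=5))
```

A spike near the start or end of the window would still tilt the slope, so a real monitor would likely combine this with a longer window or a separate outlier check.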

Variations

1. This scenario does not preclude resource types other than applications (e.g. computer systems or network switches).

Detailed Steps

Assumptions: The consumer and provider have shared knowledge about a target resource

1. Consumer queries the resource registry for a monitoring service provider URL for the selected resource

  1. Resource registry looks up resource record for resource and determines if any monitoring service provider URLs have been registered for it
  2. Resource registry finds monitoring service provider URL and returns the URL to the consumer as an RDF response

2. Consumer invokes a GET method on monitoring service provider URL that was returned to it by the resource registry for the selected resource

  1. Consumer requests compact XML (application/x-oslc-compact+xml) in the Accept header because the receiving client is a UI preview window/iframe
  2. Consumer connects to monitoring service provider and issues a GET request on its URL for the target resource

3. Monitoring Service provider responds to the UI preview with an HTML page embedded in a compact XML document

  1. Monitoring Service provider maps OSLC resource to an internal resource name
  2. Monitoring Service provider gets Best Practices health metrics data for resource
  3. Monitoring Service provider builds an HTML page, embedding the data and any UI elements (e.g. chart, labels) needed to display it
  4. Monitoring Service provider encodes a response document as compact-XML and returns it to the requesting consumer

4. Consumer displays HTML page in UI preview window/iframe

5. Based on the content returned, user takes appropriate action
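
The consumer side of steps 1 and 2 can be sketched with the Python standard library alone. The registry response below is a hypothetical RDF/XML document, and using `oslc:serviceProvider` as the property that links a resource to its monitoring service provider URL is an assumption; this page does not name the property. The request asks for compact XML through the Accept header, since a GET carries no body for a Content-Type header to describe. The sketch only builds the request and does not send it.

```python
import urllib.request
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OSLC_NS = "http://open-services.net/ns/core#"
COMPACT_XML = "application/x-oslc-compact+xml"

# Hypothetical registry response; all URLs are placeholders.
REGISTRY_RESPONSE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:oslc="http://open-services.net/ns/core#">
  <rdf:Description rdf:about="http://registry.example.org/resources/appserver1">
    <oslc:serviceProvider rdf:resource="http://monitor.example.org/resources/appserver1"/>
  </rdf:Description>
</rdf:RDF>
"""


def provider_urls(rdf_xml):
    """Return every URL referenced by an oslc:serviceProvider element."""
    root = ET.fromstring(rdf_xml)
    tag = f"{{{OSLC_NS}}}serviceProvider"
    attr = f"{{{RDF_NS}}}resource"
    return [el.attrib[attr] for el in root.iter(tag) if attr in el.attrib]


def preview_request(url):
    """Build (but do not send) a GET request that asks for compact XML."""
    return urllib.request.Request(url, headers={"Accept": COMPACT_XML}, method="GET")


urls = provider_urls(REGISTRY_RESPONSE)
request = preview_request(urls[0])
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Reading the response as plain XML only works for this particular RDF/XML layout; a real client would more likely use an RDF library so that other serializations of the same graph are handled too.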

Topic revision: r14 - 12 Jun 2012 - 21:44:25 - JohnArwe
Main.PmAppMonitoring moved from Main.AppMonitoring on 14 May 2012 - 11:47 by JohnArwe - put it back
 