Application Monitoring

7 Aug 2012

The Northwestern Mutual Life Insurance Company


Milwaukee, WI

Application Monitoring

Jeremy Kalsow


Why Application Monitoring


Majority of all corporations


Northwestern Mutual


Total 1,000+ servers


Team is 6 people


Team uses 16 servers


Average 50 applications per server


Need a way to know status fast


What is it?


The ability to monitor performance and
availability


Gather metrics


Show trends


Pretty pictures for management




Why?


Trends predict future problems


Solve application issues faster


Uptime relates directly to profit for many
companies


View all applications, servers, databases
and other items being monitored with a
single dashboard.

Types of Monitoring


Fault


Performance


Configuration


Security


Accounting

Fault


Detects major errors


Easy to implement


Examples


Network loss


Database Connectivity


Very Important
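A fault check of this kind can be very small. The sketch below treats a database as reachable when a TCP connection to its host and port succeeds. The function name `is_reachable` and the timeout value are illustrative assumptions, not part of the original slides, and a real check would probably also run a trivial query.

```python
import socket

def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; it only proves the port accepts
    connections, not that the database behind it answers queries.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

A scheduler could call `is_reachable` once a minute for each database host and raise a fault as soon as it returns `False`.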


Fault

Type of Monitoring        What to Monitor         When to Monitor

Hardware
  CPU utilization         CPU load                Load > 99% for x minutes
  Memory utilization      Memory load             Load > 99% for x minutes
  Storage system          Available space         System out of space

Applications
  Application available   Application working     Working or error
  Application logs        Error log monitoring    If an error occurred

Databases
  Database online         Database is online      Database is up/down

Network
  Latency                 Latency                 Latency > acceptable range
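The "Load > 99% for x minutes" rule can be sketched as a check for a sustained breach rather than a single spike. This is a minimal sketch under assumptions: the function name `sustained_breach`, the sample format and the default values are hypothetical and do not come from the slides.

```python
from typing import Iterable, Tuple

def sustained_breach(samples: Iterable[Tuple[float, float]],
                     threshold: float = 99.0,
                     min_duration_s: float = 300.0) -> bool:
    """Return True if the value stayed above threshold for min_duration_s.

    samples are (timestamp_seconds, percent) pairs in chronological order.
    The defaults (99%, 5 minutes) are illustrative assumptions.
    """
    breach_start = None  # timestamp of the first sample in the current breach
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts
            if ts - breach_start >= min_duration_s:
                return True
        else:
            # Any sample at or below the threshold resets the run.
            breach_start = None
    return False
```

Resetting on every sample at or below the threshold keeps short spikes from raising a fault; only an unbroken run of the configured length does.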

Performance


Slow Performance


Service Level Agreements


Metrics


Old and New Metrics


Visual Display

Performance

http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html

Configuration


Configuration variables


Connectivity


Speed


Performance


Proactive



Servers and Applications

Configuration


Why would the configuration change?


Hardware


Storage


Service packs


Hot fixes


Windows Updates

Security


Attempts to access the system


Open ports


Inventories


Firewall


Packets


System events


Blocked Exploits

Accounting


Monitors Usage


Generally used for fees


Profit/Loss



Example


Electric Company


Northwestern Mutual


Types of Monitoring Recap


Fault


Performance


Configuration


Security


Accounting


Types of Monitoring Recap


Historical data


Baseline test


Current test


Performance disagreements
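Comparing a baseline test with a current test can settle a performance disagreement with numbers. The sketch below compares mean response times and flags a regression. The function name `regression_report` and the 10% tolerance are assumptions made for illustration only.

```python
from statistics import mean

def regression_report(baseline_ms, current_ms, tolerance=0.10):
    """Compare mean response times from a baseline run and a current run.

    Both arguments are non-empty lists of response times in milliseconds,
    and the baseline mean is assumed to be greater than zero.
    tolerance is a hypothetical threshold: 0.10 means 10% slower is accepted.
    """
    base = mean(baseline_ms)
    cur = mean(current_ms)
    change = (cur - base) / base  # relative change, e.g. 0.25 means 25% slower
    return {
        "baseline_ms": base,
        "current_ms": cur,
        "change": change,
        "regressed": change > tolerance,
    }
```

Keeping the historical baseline alongside each new result means the comparison can be repeated later instead of argued from memory.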

Types of Monitoring Recap


Allows for trends to be seen


Modifications can be made


Trends over multiple releases
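One way to make a trend over multiple releases concrete is to fit a straight line to one metric per release and estimate when it will cross a limit. This is a minimal sketch using an ordinary least-squares fit. The function names and the idea of a fixed limit are assumptions, not something the slides specify.

```python
def linear_trend(values):
    """Least-squares slope and intercept for values sampled at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def samples_until(values, limit):
    """Estimate how many more samples until the trend reaches limit.

    Returns None when the trend is flat or falling. Hypothetical helper:
    one sample could be one release or one day, depending on the metric.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    x_at_limit = (limit - intercept) / slope
    return max(0.0, x_at_limit - (len(values) - 1))
```

A straight line is the simplest possible model; it is enough to show the direction of a metric, but it should not be read as a precise forecast.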

Types of Monitoring Recap


Monitoring is important


Not enough time is given


Implemented after discovery of an issue


Monitoring only in areas of known problems


Adding monitoring requires time and money

Challenges of application monitoring


Various types of systems


Shared


Clustered


Virtualized


Production logging


Shared Systems


1 server / Multiple applications


System resources are shared


Tracking individual usage is difficult


Many applications may be impacted


Server without access (production)

Clustered Systems



Applications on more than one server


Avoid single point of failure


May be hard to target the issue

Production Logging


Generally Limited


Most errors are reproduced in test


Application downtime


Use of company resources

Implement Application Monitoring



Plan Early


Monitor Proactively


Create a Recovery Plan


Create and use SLAs

Plan Early


Planning stage


Add monitoring during development


Late additions cover known issues

Monitor Proactively


Harder to implement


Issues are dealt with before end user knows

Monitor Proactively


Tools based approach


Easy and relatively fast setup


No code


Multiple applications

Monitor Proactively


Logging is directly in the code


Less efficient


More specific


Developers have less time
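Logging directly in the code can be kept compact with a decorator, so developers do not repeat the same statements in every function. The sketch below uses Python's standard `logging` module. The logger name, the decorator name `monitored` and the sample function `lookup_policy` are hypothetical and exist only for illustration.

```python
import functools
import logging
import time

logger = logging.getLogger("app.monitoring")  # hypothetical logger name

def monitored(func):
    """Log the duration of each call and any exception it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception logs at ERROR level and includes the traceback.
            logger.exception("%s failed", func.__name__)
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s took %.1f ms", func.__name__, elapsed_ms)
    return wrapper

@monitored
def lookup_policy(policy_id):
    """Hypothetical application function used only for illustration."""
    if policy_id < 0:
        raise ValueError("policy_id must be non-negative")
    return {"policy_id": policy_id, "status": "active"}
```

This is more specific than a tools-based approach because it sees individual calls, at the cost of a small overhead on every call and extra work for developers.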

Create a Recovery Plan


Fast resolution


Knowledge management



Recovery Plan Template

Service Level Agreements


The percentage of time that the services will
be up (uptime)


How many people can use the application at
once without performance issues


Performance metrics and benchmarks to be
used with performance monitoring alerts


The rules for notification announcements


What statistics will be monitored and when
and where they will be available


Acceptable response time
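An uptime percentage is easier to agree on once it is turned into minutes of allowed downtime. The sketch below does that arithmetic in both directions. The function names and the 30-day period are assumptions chosen for illustration; an actual SLA defines its own measurement period.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted by an uptime target over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)

def measured_uptime_percent(downtime_minutes: float, period_days: int = 30) -> float:
    """Uptime percentage achieved, given the observed downtime in minutes."""
    total_minutes = period_days * 24 * 60
    return 100 * (1 - downtime_minutes / total_minutes)
```

For example, a 99.9% target over 30 days leaves about 43.2 minutes of downtime (43,200 minutes in the period times 0.001), which gives the monitoring alerts a concrete budget to track.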


Service Level Agreements

Using the Statistics



Visual display


Alerts


Tickets

Visual (Dashboard)


Easily view statistics


Comparison results


Trend comparison


Cross Platform


Auto-generated management reports

Dashboard

Alerts and Tickets


Auto-generated alerts


Tickets for queue system


Vital information in each

Alerts and Tickets


Most common: Email


Text, popup, printout, recording and more


Tickets: auto-generated


Knowledge databases


Common fixes and resolutions
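An auto-generated ticket that carries the vital information, plus a common fix from a knowledge database, could look like the sketch below. Everything named here is hypothetical: the `Ticket` fields, the `KNOWN_FIXES` entries and the `open_ticket` function are assumptions for illustration and do not describe any particular ticketing system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Ticket:
    """Vital information for the queue system (illustrative field set)."""
    component: str
    severity: str
    summary: str
    suggested_fix: str
    opened_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# Hypothetical knowledge database mapping a failure type to a common fix.
KNOWN_FIXES = {
    "db_down": "Restart the database service and verify connectivity.",
    "disk_full": "Archive old logs and extend the volume.",
}

def open_ticket(component: str, failure: str) -> Ticket:
    """Build a ticket for a failed check, attaching a known fix if one exists."""
    fix = KNOWN_FIXES.get(
        failure, "No known fix; escalate to the on-call engineer.")
    return Ticket(
        component=component,
        severity="high",  # assumption: every failed check is high severity
        summary=f"{failure} detected on {component}",
        suggested_fix=fix,
    )
```

Attaching the suggested fix at creation time is what ties the ticket queue to the knowledge database: whoever picks up the ticket starts from the last known resolution.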

Application Monitoring


Maximize application uptime


Higher end user satisfaction


Higher Profit

References


Polozoff, A. (2003, April 9). Proactive Application Monitoring. IBM - United States. Retrieved October 20, 2011, from http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html

Choice. (2009, December 20). Application Monitoring. Adminschoice - Unix Made Easy. Retrieved October 31, 2011, from http://adminschoice.com/application-monitoring

Application Monitoring Software - uptime software. (n.d.). Server Monitoring Software - IT Systems Management, Capacity Planning, Application and Server Monitoring Tool by uptime software. Retrieved October 31, 2011, from http://www.uptimesoftware.com/application-monitoring.php

Marko, K. (2005, December 30). Proactive Application Monitoring. Processor.com: Data Center IT Equipment at Processor, Routers, Storage, Rackmount Servers, Computer Room Cabling and Flooring. Retrieved October 29, 2011, from http://www.processor.com/editorial/article.asp?article=articles%2Fp2752%2F43p52%2F43p52.asp

IT Service Level Agreement Templates | ContinuityPlanTemplates. (n.d.). ContinuityPlanTemplates | Free Business Continuity Plan (BCP) Templates. Retrieved October 30, 2011, from http://www.continuityplantemplates.com/it-service-level-agreement-templates

XML

Upcoming events with Dashboard


Ability to display visualized graphs and other pertinent
information



Ability to click a failed component and have the system
auto-generate a ticket



Ability to alert others of the issue found



Performance monitoring as well as fault