Mining Gold from the RMF Data Mountain



Ivan L. Gelb

Gelb Information Systems Corp.

Email: ivan@gelbis.com

Phone: 732-303-1333


CMG ‘007


San Diego, CA


© 2007 Gelb Information Systems Corp. - www.gelbis.com

Think Faster with Gelb Information

TRADEMARKS



The following are trade or service marks of the IBM
Corporation: CICS, CICS TS, CICSPlex, DB2, IBM, MVS,
OS/390, z/OS, Sysplex, Parallel Sysplex. Any omissions are
purely unintended.


© 2007 Gelb Information Systems Corp.
URL: www.gelbis.com
Phone: 732-303-1333

No part of this material can be reproduced by any means
without prior written permission from the author and with
proper attribution displayed.


MOTHER OF ALL DISCLAIMERS (MOAD)

All of the information in this document is tried and true.
However, this fact alone cannot guarantee that you can get
the same results at your place and with your skills. In fact,
some of this advice can be hurtful if it is misused or
misunderstood. As with all kinds of analysis, anything you
may hear or read can be understood and misunderstood in
many ways that may seem contradictory to you. Gelb
Information Systems Corporation, Ivan Gelb, and anyone
found anywhere assume no responsibility for this
information’s accuracy, completeness, or suitability for any
purpose. Anyone attempting to adapt these techniques to
their own environment does so completely at their
own risk.



Agenda


Your Questions…Now


SMF & RMF Introduction


RMF Reports Overview


CPU Reports


LPAR Reports


5 More Reports


Drawing for attendee prizes


Note:

A symbol flags recommendations

PLUS: Rewards for most questions?


SMF & RMF Introduction


SMF & RMF Data Collection



RMF Record Types



RMF Reports Overview


RMF Report Types



Monitor I Reports



Monitor II Reports



Monitor III Reports


SMF & RMF Data Collection - 1

ERBRMF00 or 02 member for Monitor I options. Examples:

  CYCLE(250)         /* Sample every 250 msec.
  SYNCH(SMF)         /* Use SMFPRMxx time values

SMFPRMxx member for SMF options. Examples:

  INTVAL(mm)         /* recording interval 01-60 (30)
  SYNCVAL(mm)        /* recording synchronization 00-59 (00)
  INTERVAL(hhmmss)   /* NOINTERVAL is default for SMF 30s
  SMF,SYNC           /* type 30s sync'd based on SYNCVAL
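
To make the interplay of CYCLE, INTVAL, and SYNCVAL concrete, here is a minimal Python sketch. The function names are illustrative, not part of RMF or SMF, and the sketch assumes an INTVAL that divides 60 evenly so that interval boundaries repeat every hour.

```python
def interval_boundaries(intval: int, syncval: int) -> list[int]:
    """Minutes past the hour at which a recording interval ends.

    Assumption of this sketch: INTVAL divides 60 evenly.
    """
    if not 1 <= intval <= 60:
        raise ValueError("INTVAL must be 01-60")
    if not 0 <= syncval <= 59:
        raise ValueError("SYNCVAL must be 00-59")
    if 60 % intval != 0:
        raise ValueError("sketch only handles INTVAL values that divide 60")
    return sorted((syncval + k * intval) % 60 for k in range(60 // intval))


def samples_per_interval(cycle_msec: int, intval_min: int) -> int:
    """Number of Monitor I sampling cycles that fit in one interval."""
    return intval_min * 60 * 1000 // cycle_msec


print(interval_boundaries(30, 0))     # defaults INTVAL(30), SYNCVAL(00)
print(interval_boundaries(15, 5))     # 15-minute intervals synced to minute 05
print(samples_per_interval(250, 30))  # CYCLE(250) over a 30-minute interval
```

With the defaults, intervals end on the hour and half hour, and CYCLE(250) yields 7,200 samples per 30-minute interval.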


SMF & RMF Data Collection - 2

Processor overhead for record collection increases as the
CYCLE value is decreased.

A shorter INTVAL produces more SMF and RMF records and
higher collection-related overhead.




Recommended service definition coefficients:


MSO = 0.0





CPU = 1.0


SRB = 1.0


IOC = 1.0 or less by orders of 10 (0.1 or 0.01; IBM recommends
0.5)




Note potential impact on chargeback algorithms if they use
service units in their calculations.
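
The chargeback warning can be illustrated with a small Python sketch. It assumes total service is the coefficient-weighted sum of the four raw service unit categories. The raw unit counts and the "legacy" coefficient set are made-up values for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Coefficients:
    """Service definition coefficients; defaults follow the slide."""
    cpu: float = 1.0
    srb: float = 1.0
    ioc: float = 0.5
    mso: float = 0.0


def weighted_service(raw: dict[str, float], coeff: Coefficients) -> float:
    """Total service as the coefficient-weighted sum of raw service units."""
    return (coeff.cpu * raw["cpu"] + coeff.srb * raw["srb"]
            + coeff.ioc * raw["ioc"] + coeff.mso * raw["mso"])


# Hypothetical raw service units for one job.
job = {"cpu": 1000.0, "srb": 100.0, "ioc": 200.0, "mso": 5000.0}

recommended = weighted_service(job, Coefficients())
legacy = weighted_service(job, Coefficients(cpu=10.0, srb=10.0, ioc=5.0, mso=3.0))
print(recommended, legacy)
```

The same job reports very different service totals under the two coefficient sets, which is why a chargeback algorithm based on service units needs review before the coefficients change.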


RMF Record Types Summary

  70-1   Processor
  70-2   Crypto processor
  71     Paging activity
  72-1   Workload PGNs (compat. mode)
  72-3   Workload service classes (goal mode)
  73     Channel path activity
  74-1   Device activity
  74-2   XCF activity
  74-5   Cache activity
  74-7   FICON director activity
  75     Page data set activity
  77     Enqueue activity
  78-2   Virtual storage activity


RMF Report Types

Monitor I
  20+ real-time reports and long-term data collection

Monitor II
  20+ activity snapshot reports

Monitor III
  50+ interactive performance analysis reports and
  long-term data collection

Other RMF data-based reporting tools (downloads):
  Spreadsheet Reporter
  RMF PA (Performance Analyzer)


Monitor I Reports

  CACHE     Cache Subsystem
  CF        Coupling Facility Activity
  CHAN      Channel Path Activity
  CPU       CPU Activity
  CRYPTO    Crypto Hardware Activity
  DEVICE    Device Activity
  DOMINO    Lotus Domino Server
  ENQ       Enqueue Activity
  FCD       FICON Director Activity
  HFS       Hierarchical File System
  HTTP      HTTP Server
  IOQ       IO Queuing Activity
  OMVS      OMVS Kernel Activity
  PAGESP    Page/Swap Data Set Activity
  PAGING    Paging Activity
  SDEVICE   Shared Device Activity
  TRACE     Trace Activity
  VSTOR     Virtual Storage Activity
  WKLD      Workload Activity (compat mode)
  WLMGL     Workload Activity (goal mode)
  XCF       Cross-system Coupling Activity

Monitor I can produce these reports at the end of each collection interval, or they can
be produced by the Postprocessor component at a later time.


Monitor II Reports

  ARD / ARDJ     Address Space Resource Data
  ASD / ASDJ     Address Space Data
  ASRM / ASRMJ   Address Space SRM Data
  CHANNEL        Channel Path Activity
  DDMN           Domain Activity
  DEV / DEVV     Device Activity
  HFS            Hierarchical File System
  ILOCK          IRLM Long Lock Detect
  IOQUEUE        IO Queuing Activity
  LLI            Library List
  PGSP           Page/Swap Data Set Activity
  SDS            Sysplex Data Server
  SENQ           Systems ENQ Contention
  SENQR          System ENQ Reserve
  SPAG           Paging Activity
  SRCS           Central Storage / Processor / SRM
  TRX            Transaction Activity

Monitor II can produce snapshot reports on demand or at definable intervals.


Monitor III Reports

Monitor III can produce sysplex-wide or single-system
reports of the delays experienced by a job, group of jobs,
service class, TSO, OMVS, enclaves, etc. We will present
just 7 of the more than 50 available reports:

1. Delay Report
2. Processor Delays
3. CF Overview
4. CF Systems
5. Device Delays
6. VSAM LRU Overview
7. VSAM RLS Activity by Storage Class and by Data Set


Who, What, How Much, & Analysis


RMF Delay Report


CPU Activity Report & Analysis


LPAR Activity Report & Analysis


CF Activity Report and Analysis


Workload Activity Report & Analysis


I/O Device Activity Report & Analysis


File I/O Activity Report & Analysis


M3 - Which Resources Cause Delays



Which System Resources Cause Delays… NOTES

Use to quickly establish which system resources are delaying the work. The “%
Delayed for” indicators are:

  PRC = in and ready, but work is not being dispatched on a CPU
  DEV = delayed for disk or tape
  STR = delayed for storage, such as COMM, LOCAL, SWAP, XMEM, or
        found on the OUT & READY queue
  SUB = delayed by JES, HSM, XCF
  OPR = delayed by an operator message, a mount request, or a quiesce
        command by the operator
  ENQ = delayed waiting for any enqueued resource

Address space type column CX:

  A = ASCH
  B = Batch
  E = Enclave
  O = OMVS
  S = Started Task
  T = TSO
  ? = invalid/missing data
  O as second character indicates an OMVS process for this task

The Cr column indicates the CPU-critical or storage-critical attribute for this address space.


CPU, LPAR & CF Activity Reports


CPU Activity Reports


Processor Delays


What is Your LPAR’s Guaranteed Capacity?


LPAR Partition Data Report


Coupling Facility Activity (CF) Report


PP - CPU Activity Report - Part 1


PP - CPU Activity Report - Part 2




CPU Activity Report…. NOTES

Provides 100% accurate CPU utilization figures for all LPARs and for each LPAR
individually. Use it in conjunction with workload activity measurements to
establish CPU utilization capture ratios.

Observe and consider:

1. ONLINE TIME - less than 100% indicates a CPU being varied online or offline.
   IRD or a manual process may cause this.
2. LPAR BUSY % - what % of each allocated CPU this LPAR utilized. Less than
   100% indicates possible capacity issues.
3. MVS BUSY % - LPAR’s % CPU utilization. 100% should cause performance
   and capacity concerns if (a) anyone complains, and (b) critical workloads +
   SYSTEM make up 90-95%+ of the utilization.
4. QUEUE LENGTHS (%) - indicates how many others you may have to wait
   behind for CPU access.
5. IN READY - address spaces ready to run but CPU not available.
6. OUT READY - even worse than IN READY if the OUTs are workloads you
   care about. See workload activity reports to determine the victims.
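
A capture ratio compares the CPU time attributed to workloads with the total CPU time the LPAR consumed. The Python sketch below shows one way to compute it. The function name and all input values are hypothetical, and which busy figure to use as the denominator is an assumption you should check against your own reports.

```python
def capture_ratio(workload_cpu_seconds: list[float], busy_pct: float,
                  online_cpus: int, interval_seconds: float) -> float:
    """Captured workload CPU time divided by total CPU time consumed.

    busy_pct is the busy percentage taken from the CPU Activity report
    (assumed here to be an average across the online CPUs).
    """
    total_cpu_seconds = busy_pct * online_cpus * interval_seconds / 100.0
    if total_cpu_seconds <= 0:
        raise ValueError("no CPU time consumed in the interval")
    return sum(workload_cpu_seconds) / total_cpu_seconds


# Hypothetical 15-minute interval: 4 CPUs at 80% busy, three workloads.
ratio = capture_ratio([1500.0, 800.0, 292.0], busy_pct=80.0,
                      online_cpus=4, interval_seconds=900.0)
print(round(ratio, 3))
```

In this made-up case the workloads account for 2,592 of the 2,880 CPU seconds consumed, a capture ratio of about 0.9.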


Monitor III (M3) Processor Delays - 1








Monitor III Processor Delays - 1... NOTES

The Processor Delays report identifies who is delayed and by ABOUT how much.

1. DLY % = (# of Delay Samples / # of Samples) * 100 is the % of time the task is
   delayed from getting CPU time
2. USG % = (# of Using Samples / # of Samples) * 100 is the % of time the task is
   receiving CPU service
3. Holding Job(s) - up to three tasks that contributed most to the delay

Note that delays are collected via statistical sampling!

The MVS reduced preemption approach is the cause of the always-present
CPU delay.
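
The DLY % and USG % formulas translate directly into code. This Python sketch uses made-up sample counts, and the function name is illustrative.

```python
def delay_and_using_pct(delay_samples: int, using_samples: int,
                        total_samples: int) -> tuple[float, float]:
    """Return (DLY %, USG %) from Monitor III style sample counts."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    dly = 100.0 * delay_samples / total_samples
    usg = 100.0 * using_samples / total_samples
    return dly, usg


# Hypothetical task: 100 samples, delayed in 25 of them, using CPU in 40.
dly_pct, usg_pct = delay_and_using_pct(25, 40, 100)
print(dly_pct, usg_pct)
```

Because the inputs are samples, small differences between intervals are within sampling noise, which is why the report tells you by ABOUT how much a task is delayed.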




What is Your LPAR’s Guaranteed Capacity?

An LPAR’s share determines its physical CP capacity

LPAR weights & # of logical CPUs determine share

Share = (LCPU / Tot-PCPU) * (LPAR weight / ∑ LPAR weights)

Example: two LPARs, PRODA with weight 700 and PRODB with
weight 300, each with access to all 10 physical CPs:

PRODA Capacity = 10/10 * 700/1000 = 0.7 share of 10 CPs = 7.0 CPs

PRODB Capacity = 10/10 * 300/1000 = 0.3 share of 10 CPs = 3.0 CPs

LPAR weights are ONLY enforced if Physical CP BUSY = 100% or if LPAR is capped by PR/SM


If PRODA only utilizes 2.0 CPs most of the time, PRODB
could get the other 8.0 CPs if it needs them! When PRODA
gets busy using its maximum share, PRODB will be

!
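
The share arithmetic from this slide can be sketched in Python as follows. The function name is illustrative, and the sketch follows the slide's formula as written, multiplying the share by the total physical CP count to express capacity in CPs.

```python
def guaranteed_cps(weights: dict[str, int], logical_cpus: dict[str, int],
                   physical_cps: int) -> dict[str, float]:
    """Guaranteed capacity in CPs per LPAR, following the slide's formula."""
    total_weight = sum(weights.values())
    capacity = {}
    for lpar, weight in weights.items():
        # Share = LCPU / Tot-PCPU * LPAR weight / sum of LPAR weights
        share = (logical_cpus[lpar] / physical_cps) * (weight / total_weight)
        capacity[lpar] = share * physical_cps
    return capacity


caps = guaranteed_cps(weights={"PRODA": 700, "PRODB": 300},
                      logical_cpus={"PRODA": 10, "PRODB": 10},
                      physical_cps=10)
print({lpar: round(cps, 1) for lpar, cps in caps.items()})
```

For the PRODA/PRODB example this gives 7.0 and 3.0 CPs. Remember that these are guarantees, not limits, unless the CPs are 100% busy or the LPAR is capped.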


PP - LPAR Partition Data Report


LPAR Partition Data Report… NOTES

The Partition Data Report is from the RMF postprocessor. It is the most
useful single place to see both defined and actual LPAR capacity.

1. WGT - LPAR’s weight / total defined weight is the % SHARE for which this
   LPAR will be dispatched by PR/SM if it needs CPU service
2. MSU DEF and ACT - defined and actual LPAR MSUs
3. CAPPING DEF - partition’s capping option
4. CAPPING WLM% - % of time WLM capped this LPAR
5. LPAR MGT - LPAR management overhead

Type = AAP for zAAP processors
Type = IIP for zIIP processors


CF Activity Reports and Analysis


Data collection controlled by ERBRMFxx option of CFDETAIL
or NOCFDETAIL



CFDETAIL collects a lot of SMF data!



To reduce system overhead, data collection is done only on
one member of Sysplex as decided automatically by RMF
Sysplex Data Server


M3 - CF Activity - 1


If PROCESSOR UTIL% is high (95%+???):


Under PR/SM, dedicate CPs or add CPs to partition


Rebalance by moving structures to lower utilized CF if available


Buy more or faster CFs


M3 - CF Activity - 2

AVG SERV is in microseconds! Do compare Async. Serv. to Disk Serv.!

CHNG% - percent of requests changed from sync to async

DEL% - percent of requests delayed by subchannel contention or
dump serialization


Workload Activity Reports & Analysis


RMF Workload Measurements


RMF Workload Activity - 1 CICS Service


RMF Workload Activity - 2 TSO Service


RMF Workload Activity Report Analysis


Source: Chris Baker, IBM

RMF Workload Measurements


Source: Chris Baker, IBM

PP - RMF Workload Activity - 1 CICS

REPORT BY: POLICY=HPTSPOL1 WORKLOAD=PRODWKLD SERVICE CLASS=CICSHR RESOURCE GROUP=*NONE PERIOD=1 IMPORTANCE=HIGH

-TRANSACTIONS--   TRANSACTION TIME     HHH.MM.SS.TTT
AVG        0.00   ACTUAL               000.00.00.114
MPL        0.00   QUEUED               000.00.00.036
ENDED       216   EXECUTION            000.00.00.078
END/SEC    0.24   STANDARD DEVIATION   000.00.00.270
#SWAPS        0
EXECUTD     216

------------------------------RESPONSE TIME BREAKDOWN IN PERCENTAGE------------------------------  ------STATE------
SUB   P    TOTAL ACTIVE READY  IDLE  ------------------------WAITING FOR-------------------------  SWITCHED TIME (%)
TYPE                                   LOCK   I/O  CONV  DIST LOCAL SYSPL REMOT TIMER  PROD  MISC   LOCAL SYSPL REMOT
CICS  BTE   93.4   10.2   0.0   0.0     0.0   0.0  83.3   0.0   0.0   0.0   0.0   0.0   0.0   0.0    83.3   0.0   0.0
CICS  EXE   67.0   13.2   7.1   0.0     0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0  46.7   0.0     0.0   0.0   0.0




This is a sample RMF post processor (ERBRMFPP) output with option SYSRPTS(WLMGL(SCPER))

PP = RMF Post processor Report
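
A quick sanity check on the sample report: ACTUAL transaction time should equal QUEUED plus EXECUTION, and ENDED divided by END/SEC should give the reporting interval length. The Python sketch below parses the HHH.MM.SS.TTT values from the sample. The helper name `to_msec` is illustrative.

```python
def to_msec(stamp: str) -> int:
    """Convert an RMF 'HHH.MM.SS.TTT' time value to milliseconds."""
    hours, minutes, seconds, millis = (int(part) for part in stamp.split("."))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


actual = to_msec("000.00.00.114")
queued = to_msec("000.00.00.036")
execution = to_msec("000.00.00.078")
print(actual, queued + execution)

# ENDED / END/SEC approximates the interval length in seconds.
interval_seconds = round(216 / 0.24)
print(interval_seconds)
```

Both checks hold for the sample: 36 + 78 = 114 msec, and 216 transactions at 0.24 per second imply a 900-second (15-minute) interval.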


PP - RMF Workload Activity - 2 TSO Part 1

[Report image with numbered callouts 1-8; see the NOTES slide]


PP - RMF Workload Activity - 2 TSO Part 2


RMF Workload Activity - 2…. NOTES

1. CPU and STORAGE service class attributes
2. TRANSACTIONS - number of transactions and related statistics
3. TRANS. TIME - various transaction time measures
4. DASD I/O - rate and response time components
5. SERVICE RATES
6. PAGE-IN RATES - monitor them carefully
7. MSO coefficient should be 0! Other values will produce unstable performance on zSeries
   processors!
8. APPL% can be greater than 100%. If a single CICS region is in the report, it can track CICS
   TCB saturation risk. APPL% includes AAPCP and IIPCP time!
9. AAPCP & IIPCP are the time that zAAP- and zIIP-eligible work spent on standard CPs.
10. PROJECTCPU option in SYS1.PARMLIB member IEAOPTxx is needed for AAPCP and IIPCP
11. APPL% calculation:
    IIT = I/O interrupt
    HST = Hiperspace
    RCT = Region Control
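
The APPL% formula itself did not survive on this slide, so the Python sketch below is an assumption rather than the author's calculation: it takes APPL% as the sum of the service times (CPU, SRB, RCT, IIT, HST), minus time that actually ran on zAAPs and zIIPs, divided by the interval length. All input values are hypothetical.

```python
def appl_pct(cpu: float, srb: float, rct: float, iit: float, hst: float,
             aap: float, iip: float, interval_seconds: float) -> float:
    """APPL% on standard CPs (assumed formula, see lead-in).

    All times are in seconds. aap and iip are time executed on zAAP and
    zIIP engines, assumed to be included in cpu and therefore subtracted.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    cp_seconds = cpu + srb + rct + iit + hst - aap - iip
    return 100.0 * cp_seconds / interval_seconds


# Hypothetical service class over a 900-second interval.
pct = appl_pct(cpu=1000.0, srb=80.0, rct=5.0, iit=20.0, hst=15.0,
               aap=150.0, iip=70.0, interval_seconds=900.0)
print(round(pct, 1))
```

A result of 100% means the work consumed the equivalent of one full standard CP for the whole interval, which is why values above 100% are possible for multi-tasking workloads.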


RMF Workload Activity Report Analysis


The response time distribution report is the best, and usually the
lowest-overhead, source for designing response time goals.

The workload activity response time distribution report can be
produced for a variety of report classes in support of service
policy development activities.

Quick and low-overhead source of service and utilization
data.

Watch out for “funny” samples in STATE SAMPLE
BREAKDOWN (%) - WAITING FOR. Each state sample
category’s value, except OTHR, is based on the last 14
non-zero values.


I/O Activity Reports & Analysis



Device Activity Components



I/O Device Activity - 1



I/O Device Activity - 2



Device Delays



Device Activity Tuning



VSAM File I/O Activity





Device Activity Components



CONN = time connected to the channel, due to data transfer

DISC = time disconnected from the channel, consisting of
SEEK and SET SECTOR, latency (wait for the record to be
under the head), and RPS (obsolete with ESS "Sharks")

PEND = I/O delays in the access path. May include delays
caused by channel, control unit, or director port. Often
caused by shared DASD!

IOSQ = wait for another task on the same system to finish
using this device.

What I/O response time is too high? WARNING: this is a trick
question.

Analyze response time components to decide what to do.
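
Device response time is the sum of the four components, so the first analysis step is simply to find the dominant one. The Python sketch below does that. The component values are hypothetical and the function name is illustrative.

```python
def analyze_response(iosq: float, pend: float, disc: float,
                     conn: float) -> tuple[float, str]:
    """Return (total response time in msec, name of largest component)."""
    components = {"IOSQ": iosq, "PEND": pend, "DISC": disc, "CONN": conn}
    total = sum(components.values())
    dominant = max(components, key=components.get)
    return total, dominant


# Hypothetical device, times in msec.
total_ms, worst = analyze_response(iosq=0.5, pend=0.3, disc=2.0, conn=1.2)
print(round(total_ms, 1), worst)
```

Here a 4.0 msec response time is dominated by DISC, which points at cache or back-end disk behavior rather than at queuing or path contention.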



I/O Device Activity (RMF PP Report)
























M3 - Device Delays












Device Activity Tuning - 1



I/O priority ON (check for APAR OW47667)


CONN = due to data transfer time


DISC, IOSQ, PEND are I/O delays


Enable Parallel Access Volumes (PAV) to reduce / eliminate
IOSQ


Manage static PAVs to minimize IOSQ


Manage number of dynamic PAVs via policy to minimize
IOSQ


ESS (Shark) multiple allegiance support reduces contention
reported as PEND time.


Track cache performance and manage it as needed





Device Activity Tuning - 2

DISC > 2-5 msec with cache may indicate problem(s):

  Not enough Non-Volatile Storage (NVS), or NVS gets filled.

  Poor cache hit ratio on IBM ESS.

  High physical disk utilization. May need to move data to
  balance the activity between available resources.

  Very high disk-to-cache transfer activity rate.

DISC > 13 msec may indicate RPS misses due to path
contention. This should not occur on IBM ESS.

If %DEV UTIL > 35%, work to reduce the activity rate on the device:

  Balance activity better across available resources

  Isolate, or do not cache, files and volumes that are BAD cache
  candidates

  Tune based on analysis of caching activity
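
These rules of thumb are easy to turn into a screening pass over device statistics. The Python sketch below is one hypothetical way to do it. The default cached-DISC threshold of 5.0 msec is a choice within the slide's 2-5 msec range, not a fixed rule.

```python
def tuning_flags(disc_ms: float, dev_util_pct: float, cached: bool = True,
                 disc_cache_limit_ms: float = 5.0) -> list[str]:
    """Return warnings for one device based on the slide's thresholds."""
    flags = []
    if disc_ms > 13.0:
        flags.append("DISC > 13 msec: possible RPS misses due to path contention")
    elif cached and disc_ms > disc_cache_limit_ms:
        flags.append("DISC high for cached device: check NVS, cache hit "
                     "ratio, physical disk utilization")
    if dev_util_pct > 35.0:
        flags.append("%DEV UTIL > 35%: reduce activity rate on device")
    return flags


# Hypothetical devices: (DISC msec, %DEV UTIL).
for disc, util in [(1.0, 10.0), (6.0, 40.0), (14.0, 10.0)]:
    print(disc, util, tuning_flags(disc, util))
```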



M3 - File I/O Tuning - VSAM LRU - 1

Buffer goal limit defaults to 100 MB; can be 1.5 GB max; see IGDSMSxx in
your PARMLIB for details

“Accel %” - % of time when LRU aging algorithms were accelerated

“Reclaim %” - % of time when aging algorithms were bypassed to reclaim buffers

“Read BMF%” - data found in local buffers

“Read CF%” - data found in Coupling Facility (CF) cache

“Read DASD%” - data read from DASD

Monitor average CPU time used by BMF LRU


M3 - File I/O Tuning - VSAM RLS - 1

VSAM RLS activity by data set.

Also available by Storage Class.


File I/O Tuning - VSAM RLS…NOTES

“LRU Status” - status of local buffers under Buffer Manager
Facility (BMF) control

  GOOD = BMF at or below goal

  ACCELERATED = buffer aging algorithms accelerated
  because BMF is over goal

  RECLAIMED = buffer aging algorithms bypassed
  because BMF is over goal

“BMF Valid %” - percent of BMF reads that were valid

NOTE: BMF read hits is the sum of valid and invalid hits.
Buffers can be invalid because (A) the data was altered, or (B) the CF
lost track of the buffer status.

BMF READ HIT% = BMF READ% / BMF VALID% * 100

BMF INVALID READ HIT% = BMF READ HIT% - BMF READ%
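
The two BMF formulas can be applied as in the Python sketch below. The input percentages are hypothetical and the function name is illustrative.

```python
def bmf_hit_breakdown(bmf_read_pct: float,
                      bmf_valid_pct: float) -> tuple[float, float]:
    """Return (BMF READ HIT%, BMF INVALID READ HIT%) per the slide's formulas."""
    if bmf_valid_pct <= 0:
        raise ValueError("BMF VALID% must be positive")
    read_hit_pct = bmf_read_pct / bmf_valid_pct * 100.0
    invalid_hit_pct = read_hit_pct - bmf_read_pct
    return read_hit_pct, invalid_hit_pct


# Hypothetical data set: Read BMF% = 45, BMF Valid % = 90.
hit_pct, invalid_pct = bmf_hit_breakdown(45.0, 90.0)
print(hit_pct, invalid_pct)
```

In this example half of the reads hit a local buffer, but 5 percentage points of those hits found an invalid buffer and had to be satisfied elsewhere.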



Summary


We examined just 7 main types of reports out of the 90+
available from RMF in real time or via the postprocessor. They
are:


RMF Delay Report


CPU Activity Report


LPAR Activity Report


CF Activity Report


Workload Activity Report


I/O Device Activity Report


VSAM File I/O Activity Report


With practice, you should be able to find the “gold” and solve
performance “mysteries” by looking at just 1-3 RMF reports.



















Need/Want to Know More…and


Start at www.ibm.com/servers/eserver/zseries/

Documentation:

  SC33-7991 RMF Report Analysis

  SC33-7992 RMF Performance Management Guide

  SA22-7602 z/OS MVS Planning: Workload Management

RMF Newsletters

IBM and SHARE presentations

WWW.CMG.ORG - Computer Measurement Group

Large Systems Performance Reference:
http://www-1.ibm.com/servers/eserver/zseries/lspr/