guts
guts is a tool for measuring and analyzing performance. In the default mode, it prints one line every second and counts the number of operations or bytes processed in one-second intervals. guts is an internal utility and is subject to change without notice.
guts provides information on entities such as:
- CPU
  - Indicates whether the CPU is idle or busy
- RPCs
  - Number of RPCs, RPCs-in, RPCs-out, bytes-in, and bytes-out
- MFS
  - Number of local-writes, local-reads, and other operations (such as lookup, create and remove)
- Log
  - Number of log writes, log flushes, and log force-flushes
- IO
  - Number of disk-operations/second (read/write), and the disk-io/second in MB (read/write)
Syntax
/opt/mapr/bin/guts
guts -help
instance:<id> time:unix time:all time:none (add timestamp to output)
key:5660 (is server port)
shmid:<shared memory id> (client's shared memory id, check ipcs)
threadcpu:core threadcpu:all
cpu:none cpu:all
net:sum net:msum net:ksum net:all net:none
disk:none disk:ops disk:mb disk:all
diskMajor:major# of disk
ssd:none ssd:all
cache:none cache:small cache:med cache:all
cleaner:small cleaner:all cleaner:none
fs:rw fs:all fs:none
kv:all kv:none
btree:all btree:none
allocator:all allocator:none
rpc:none rpc:op rpc:all rpc:debug
db:none db:op db:get db:put db:scan db:all
dbrepl:none dbrepl:op dbrepl:all
streams:none streams:op streams:all
dsec:infinity (run time in sec.)
period:n (output every n sec.)
cache:small cache:med cache:all cache:none
log:all log:none
btree:all btree:none
resync:all resync:none
io:all io:small
hb:all io:none
gateway:all gateway:op gateway:lc gateway:none
mastgateway:all mastgateway:tier mastgateway:db mastgateway:mfsops mastgateway:none
fstier:all fstier:none
nfs:all nfs:none
moss:all moss:basic moss:none
client:none client:db client:fs client:all (requires shmid parameter)
clientpid:<process id of a running client process>
nfs4client:all
fuse:all shmid:<shared memory id> (posix client's shared memory id, check fuse logs)
header:all header:none (doesn't seem to work)
flush:none flush:line (if line, then output is flushed on every output line)
defaults: time:none net:none disk:none rpc:op db:op db:put dbrepl:none streams:none fs:rw cache:small kv:none
cleaner:short log:none btree:none resync:none period:1
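The options in the help output are all key:value pairs, and a key can be repeated (the defaults list both db:op and db:put). As a minimal sketch, assuming you want to assemble a command line programmatically, the hypothetical helper build_guts_command below joins such pairs into an argument list. It only builds the list; it does not run guts or validate the options.

```python
GUTS_PATH = "/opt/mapr/bin/guts"  # path shown in the Syntax section


def build_guts_command(options):
    """Build an argument list from (key, value) pairs.

    A list of pairs is used instead of a dict so that a key such as
    "db" can appear more than once. This helper is illustrative only
    and is not part of guts.
    """
    args = [GUTS_PATH]
    for key, value in options:
        args.append(f"{key}:{value}")
    return args


command = build_guts_command(
    [("time", "all"), ("db", "op"), ("db", "put"), ("period", 5)]
)
print(" ".join(command))
# prints: /opt/mapr/bin/guts time:all db:op db:put period:5
```

The resulting list can be passed to subprocess.Popen on a node where guts is installed.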
Interpreting Output
The prefix c identifies client metrics. The suffix P refers to the number of pending RPCs. The suffix C denotes the number of completed RPCs.
The pending metrics are a snapshot of pending RPCs when the output is printed. The completed metrics are the increase that happened in the last print interval.
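The naming rule above can be sketched as a small decoder. This is an illustration rather than part of guts: the function name describe_client_metric is hypothetical, and the operation letters are taken from the client metric names listed in this document (for example crP, cwC, cgP). Only the P and C suffixes are handled.

```python
# Middle letter of a three-letter client metric name -> operation.
# Collected from the client metrics listed in this document.
OPERATIONS = {
    "m": "all", "g": "get", "p": "put", "s": "scan", "i": "increment",
    "a": "append", "r": "read", "w": "write", "c": "create", "u": "unlink",
}

# Suffix -> meaning, as described under Interpreting Output.
SUFFIXES = {
    "P": "pending (snapshot at print time)",
    "C": "completed (increase over the last print interval)",
}


def describe_client_metric(name):
    """Explain a client metric name such as 'crP' or 'cwC'."""
    if len(name) != 3 or name[0] != "c":
        raise ValueError(f"not a three-letter client metric: {name!r}")
    operation = OPERATIONS.get(name[1])
    kind = SUFFIXES.get(name[2])
    if operation is None or kind is None:
        raise ValueError(f"unrecognized client metric: {name!r}")
    return f"client {operation} RPCs, {kind}"


print(describe_client_metric("crP"))
# prints: client read RPCs, pending (snapshot at print time)
```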
Parameters and Output
- CPU
  - cpu:all - Percentage of idle time of each CPU on the system in the last second.
- IO
  The metrics are ior and iow, which are displayed by default.
  - ior - The first number reports the number of I/O reads for a machine in the last second. The second number reports the amount of I/O reads in MB in the last second.
  - iow - The first number reports the number of I/O writes for a machine in the last second. The second number reports the amount of I/O writes in MB in the last second.
- Disk
  - disk:ops - Number of I/O requests (read+write) for each disk in the last second.
  - disk:mb - Amount of I/O in MB (read+write) for each disk in the last second.
  - disk:all - Both of the preceding numbers for each disk in the last second. The first number is from disk:ops, the second number is from disk:mb.
- File System
  - fs:rw - Reports MFS file system activities. Reported metrics are:
    - read - The first number reports the number of remote reads in the last second. The second number reports the amount of data read in MB in the last second.
    - write - The first number reports the number of remote writes in the last second. The second number reports the amount of data written in MB in the last second.
    - lread/lwrite - Similar to the read and write metrics, but applicable to local reads and writes.
  guts displays the following file system metrics:
  - crP - Total pending read RPCs in the last second.
  - crC - Total completed read RPCs in the last second.
  - cwP - Total pending write RPCs in the last second.
  - cwC - Total completed write RPCs in the last second.
  - ccP - Total pending create RPCs in the last second.
  - ccC - Total completed create RPCs in the last second.
  - cuP - Total pending unlink RPCs in the last second.
  - cuC - Total completed unlink RPCs in the last second.
- RPC
  Reports the following metrics:
  - rpc:none - Does not display any RPC-related metrics.
  - rpc:op - Displays the rpc metric.
  - rpc:all - Displays the rpc, im, and om metrics.
  - rpc - Number of RPC calls received in the last second.
  - im - Amount of RPC data received in MB in the last second.
  - om - Amount of RPC data sent in MB in the last second.
- Cache
  - cache:small - Metrics on the inode and dentry caches, which are displayed by default. The metrics reported are:
    - icache (inode cache) - The first number reports the number of inode cache lookups in the last second. The second number reports the number of inode cache lookup misses in the last second.
    - dcache (dentry cache) - The first and second numbers report dcache lookups and lookup misses in the last second, respectively.
- MOSS (Multithreaded Object Store Server)
  Reports the following metrics:
  - moss:none - Does not display any MOSS-related metrics.
  - moss:basic - Displays only MOSS metrics related to the number of gets and puts.
  - moss:all - Displays all MOSS-related metrics.
  - s3bc - Number of buckets created in the last second.
  - s3bcd - Number of buckets deleted in the last second.
  - s3bi - Number of bucket infos in the last second.
  - s3bl - Number of bucket lists in the last second.
  - tp - Number of tiny puts in the last second.
  - sp - Number of small puts in the last second.
  - lp - Number of large puts in the last second.
  - jp - Number of jumbo puts in the last second.
  - tg - Number of tiny gets in the last second.
  - sg - Number of small gets in the last second.
  - lg - Number of large gets in the last second.
  - jg - Number of jumbo gets in the last second.
  - oi - Number of object infos in the last second.
  - ols - Number of object lists in the last second.
  - otag - Number of object tags modified in the last second.
  - oput - Total number of object puts (the sum of tiny, small, large, and jumbo) in the last second.
  - opm - Total size of data put in MB in the last second.
  - oget - Total number of object gets (the sum of tiny, small, large, and jumbo) in the last second.
  - ogm - Total size of data retrieved by gets in MB in the last second.
- Network
  - net:sum - Total network traffic in bytes received and transmitted from all network interfaces for a machine.
  - net:msum - Total network traffic in megabytes.
  - net:ksum - Total network traffic in kilobytes.
  - net:all - Not yet implemented.
  - net:none - Does not display any network-related metrics.
  Metrics returned are:
  - nI - Total amount of network traffic received in bytes in the last second. This is the sum of network traffic from all network interfaces for a machine.
  - nO - Total amount of network traffic sent in bytes in the last second. This is the sum of network traffic from all network interfaces for a machine.
- Database
  - db:get - Metrics related to gets. The output columns are as follows, where OP is a placeholder for the operation (for example, get):
    - rOP - Number of RPCs completed for type OP in the last second.
    - rOPR - Number of rows processed from all RPCs of type OP in the last second.
    - tOPR - Number of rows processed from all RPCs in the last second.
    - cOP - Number of in-progress RPCs for the OP (not differential).
- Cleaner Metrics
  guts displays the following cleaner metrics:
  - di - Number of inodes dirtied by update operations in the last second.
  - ic - Number of inodes cleaned by the drainer in the last second.
  - dd - Number of data blocks dirtied by update operations in the last second.
  - dc - Number of data blocks cleaned by the drainer in the last second.
- Operational Metrics
  guts displays the following operational metrics:
  - rput - Number of put RPCs completed in the last second.
  - rputR - Sum of put rows completed in the last second, from all put RPCs.
  - tputR - Sum of put rows completed in the last second, from all RPCs (put, increment, checkAndPut, append, ...).
  - cput - Number of put RPCs currently in progress. This is not a differential, but displays the number of outstanding put RPCs at that particular instant.
  - rget - Number of get RPCs completed in the last second.
  - rgetR - Sum of get rows completed in the last second, from all get RPCs.
  - tgetR - Sum of get rows completed in the last second, from all RPCs (get, increment, checkAndPut, append, ...).
  - cget - Number of get RPCs currently in progress. This is not a differential, but displays the number of outstanding get RPCs at that particular instant.
  - rsc - Number of scan RPCs completed in the last second.
  - rscR - Sum of scan rows returned in the last second, from all scan RPCs.
  - csc - Number of scan RPCs currently in progress. This is not a differential, but shows the number of outstanding scan RPCs at that particular instant.
  - rinc - Number of increment RPCs completed in the last second.
  - cinc - Number of increment RPCs currently in progress. This is not a differential, but shows the number of outstanding increment RPCs at that particular instant.
  - rchk - Number of checkAndPut/checkAndDelete RPCs completed in the last second.
  - rapp - Number of append RPCs completed in the last second.
  - rtlk - Number of tablet lookup RPCs completed in the last second.
  - ctlk - Number of tablet lookup RPCs currently in progress. This is not a differential, but shows the number of outstanding lookup RPCs at that particular instant.
  - rbulkb - Number of bulk-import-bucket RPCs completed in the last second.
  - rbulks - Number of bulk-import-segment RPCs completed in the last second.
- Put Metrics
  guts displays the following put metrics:
  - rput - Number of put RPCs completed in the last second.
  - rputR - Sum of put rows completed in the last second, from all put RPCs.
  - tputR - Sum of put rows completed in the last second, from all RPCs (put, increment, checkAndPut, append, ...).
  - cput - Number of put RPCs currently in progress. This value is not a differential, but displays the number of outstanding put RPCs at that particular instant.
  - rsf - Reserved free memory in MemIndex in MB. If this value falls very low, put RPCs can get throttled. This value is not a differential.
  - bucketWR:
    - Column 1: Number of bucket writes (calls to MFS) in the last second.
    - Column 2: Amount of bucket writes in MB in the last second.
  - fl - Number of bucket flushes fired in the last second.
  - ffl - Number of force-flushes of buckets in the last second. If the bucket was flushed before it reached its optimal size, then the flush is counted as a force-flush.
  - sfl - Number of segments touched by the bucket flushes in the last second.
  - mcom - Number of segments mini-packed in the last second.
  - fcom - Number of segments packed fully in the last second.
  - ccom - Number of segment packs currently running. This value is not a differential.
  - scr - Number of segment creates in the last second.
  - spcr - Number of spill creates in the last second.
- Get Metrics
  guts displays the following get metrics:
  - rget - Number of get RPCs completed in the last second.
  - rgetR - Sum of get rows completed in the last second, from all get RPCs.
  - tgetR - Sum of get rows completed in the last second, from all RPCs (get, increment, checkAndPut, append, ...).
  - cget - Number of get RPCs currently in progress. This is not a differential, but displays the number of outstanding get RPCs at that particular instant.
  - vcM - Size of the value-cache in MB. This value is not a differential.
  - cL - Number of value-cache lookups in the last second.
  - vcH - Number of value-cache hits in the last second.
  - bget - Number of bucket gets in the last second. This is 0 if there are no active buckets.
  - sg - Number of segment gets in the last second. This is normally equal to tgetR minus the number of value-cache hits.
  - spg - Number of spill gets in the last second. This value is calculated as sigma(segments * spill-per-segment) - bloomFilterSkips.
  - bskp - Number of spill gets that were avoided by the bloom filter in the last second.
- Scan Metrics
  guts displays the following scan metrics:
  - rsc - Number of scan RPCs completed in the last second.
  - rscR - Sum of scan rows returned in the last second, from all scan RPCs.
  - csc - Number of scan RPCs currently in progress. This is not a differential, but shows the number of outstanding scan RPCs at that particular instant.
  - bsc - Number of buckets scanned in the last second.
  - ssc - Number of segments scanned in the last second.
  - spsc - Number of spills scanned in the last second.
  - spscR - Number of rows scanned from spills in the last second.
  - ldbr - Number of ldb blocks read in the last second.
  - blkr - Number of data blocks read in the last second (over spills, buckets, ...).
  - raSg - Number of segments for which read-ahead was done in the last second.
  - raSp - Number of spills for which read-ahead was done in the last second.
  - nAdv - Number of fadvise calls made to MFS for scan read-ahead in the last second.
  - raBl - Sum of blocks in the fadvise calls made to MFS for scan read-ahead in the last second.
- Cumulative Metrics
  guts displays the following cumulative metrics:
  - cmP - Total pending RPCs from the client in the last second.
  - cmC - Total completed RPCs from the client in the last second.
- DB Metrics
  guts displays the following database metrics:
  - cgP - Total pending get RPCs.
  - cgC - Total completed get RPCs.
  - cpP - Total pending put RPCs.
  - cpC - Total completed put RPCs.
  - csP - Total pending scan RPCs.
  - csC - Total completed scan RPCs.
  - ciP - Total pending increment RPCs.
  - ciC - Total completed increment RPCs.
  - caP - Total pending append RPCs.
  - caC - Total completed append RPCs.
  - cgR - Total client get rows.
  - cpR - Total client put rows.
  - csR - Total client scan rows.
  - ciR - Total client increment rows.
  - caR - Total client append rows.
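Several of the counters described above come in pairs from which a ratio can be derived: icache and dcache report lookups and misses, and the get metrics report value-cache lookups and hits. The following is a minimal sketch of those derivations. The function names are hypothetical and the input numbers are illustrative values, not measurements.

```python
def hit_ratio_from_misses(lookups, misses):
    """Hit ratio for a lookups/misses pair such as icache or dcache.

    Returns None when there were no lookups in the interval.
    """
    if lookups == 0:
        return None
    return (lookups - misses) / lookups


def hit_ratio_from_hits(lookups, hits):
    """Hit ratio for a lookups/hits pair such as the value-cache counters."""
    if lookups == 0:
        return None
    return hits / lookups


def expected_segment_gets(tget_rows, value_cache_hits):
    """Segment gets normally equal tgetR minus the value-cache hits."""
    return tget_rows - value_cache_hits


# Illustrative values: 163 dentry cache lookups with 1 miss.
print(f"{hit_ratio_from_misses(163, 1):.3f}")  # prints: 0.994
# Illustrative values: 50 value-cache lookups with 40 hits.
print(hit_ratio_from_hits(50, 40))             # prints: 0.8
print(expected_segment_gets(100, 40))          # prints: 60
```

Returning None for an idle interval avoids a division by zero and keeps "no lookups" distinct from "all lookups missed".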
Example Usage
- Find the process ID of the client program.
- Find all the shared memory segments (shmem) for this program:
    ipcs -mp | grep <pid>
    998080521 root 30030 21850
    998113290 root 30030 30030
    ^^^^^^^^^ shmem ID
  Here, there are two shared memory segments: one between the client and MFS, and the other between the client and guts.
- Identify the correct shmem segment for guts:
    ipcs | grep 998113290
    0x00000000 998113290 root 666 2288 1 dest
    ipcs | grep 998080521
    0x00000000 998080521 root 660 20971520 1 dest
                                  ^^^^^^^^ size
  The shmem with size 20M is between client and MFS. Here, we select shmem with ID 998113290.
- Run guts:
    /opt/mapr/bin/guts client:all shmid:998113290
    Printing only client statistics
    cmP cmC cgP cgC cpP cpC csP csC ciP ciC caP caC crP crC cwP cwC ccP ccC cuP cuC
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  Pass the shmem ID and one of the client options. Client options are one of:
  - none - Used when printing MFS/dbserver statistics
  - db - Prints client statistics for DB operations
  - fs - Prints client statistics for file system operations
  - all - Prints all client statistics
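The segment selection in the steps above can be sketched in code. This is an illustration under stated assumptions: the column order (key, shmid, owner, perms, bytes, nattch, status) is read off the ipcs lines in the example, the 20971520-byte size of the client-to-MFS segment is taken from the same example, and the function names are hypothetical.

```python
# Size of the client-to-MFS segment in the example above ("20M").
CLIENT_MFS_SEGMENT_BYTES = 20 * 1024 * 1024  # 20971520


def parse_ipcs_line(line):
    """Return (shmid, size_in_bytes) from one ipcs shared-memory row.

    Assumes the column order seen in the example:
    key shmid owner perms bytes nattch status
    """
    fields = line.split()
    return int(fields[1]), int(fields[4])


def pick_guts_shmid(ipcs_lines):
    """Pick the segment that is not the 20M client-to-MFS segment."""
    segments = [parse_ipcs_line(line) for line in ipcs_lines if line.strip()]
    candidates = [
        shmid for shmid, size in segments if size != CLIENT_MFS_SEGMENT_BYTES
    ]
    if len(candidates) != 1:
        raise ValueError(f"expected exactly one candidate, got {candidates}")
    return candidates[0]


example_lines = [
    "0x00000000 998113290 root 666 2288 1 dest",
    "0x00000000 998080521 root 660 20971520 1 dest",
]
print(pick_guts_shmid(example_lines))
# prints: 998113290
```

The helper raises an error instead of guessing when the input does not contain exactly one non-20M segment, since passing the wrong shmid to guts would report on the wrong segment.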
CLDB Guts
The cldbguts utility prints information about active container reports, full container reports, registration requests, MapR-FS heartbeats, NFS server heartbeats, and containers. For more information, see cldbguts.
NFS Guts
guts displays the following NFS metrics:
- req - Number of requests received from all the NFS clients to this NFS server in the last second.
- dpC - Number of calls from NFS clients that were dropped because the server ran out of ONC handles (the cluster is probably responding slowly, or an NFS client is flooding the NFS server).
- inReadReq - Number of incoming read requests from NFS clients.
- outReadResp - Number of outgoing read request responses to NFS clients.
- inReadDataReq - Size/length of incoming read requests (buffer size) from NFS clients.
- outReadDataResp - Size/length of outgoing read request responses (buffer size) to NFS clients.
- inWriteReq - Number of incoming write requests from NFS clients.
- outWriteResp - Number of outgoing write request responses to NFS clients.
- inWriteDataReq - Size/length of incoming write requests (buffer size) from NFS clients.
- outWriteDataResp - Size/length of outgoing write request responses (buffer size) to NFS clients.
Running Guts
Run guts on the node for which you need to collect metrics:
/opt/mapr/bin/guts
00 01 02 03 04 05 06 07 rpc lpc write lwrite bwrite read lread icache dcache di ic dd dc ior iow rput rputR cput tputR rget rgetR cget tgetR rsc rscR csc
86 90 84 84 87 93 81 84 5 6 0 0 1 0 0 0 0 0 3 0 8 0 163 1 337 22 13 16 1 0 73 4 0 0 0 1 0 0 0 0 0 0 0
62 77 70 82 93 61 50 84 12 20 0 0 3 0 0 0 0 0 10 0 27 0 41 0 6 0 3 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0
63 78 59 56 84 64 32 86 4 5 0 0 5 0 0 0 0 0 0 0 5 0 27 0 8 0 22 0 0 0 0 0 3 1506 0 1506 0 0 0 0 0 0 0
83 76 77 82 68 69 82 67 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
94 49 91 56 75 48 57 92 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
97 96 99 89 93 94 82 95 2 0 0 0 1 0 0 0 0 0 0 0 1 0 8 0 2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
99 99 96 97 99 98 99 82 19 6 0 0 1 0 0 0 0 0 3 0 186 0 18 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
To stop collecting metrics, press ^C.
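In the sample output above, the header has fewer columns than each data row because some metrics print two numbers. The following is a minimal parsing sketch, not part of guts. The set of two-number columns is an assumption: write, read, lread, lwrite, icache, dcache, ior, and iow are described in this document as reporting two numbers, and bwrite is inferred only from the value count in the sample. The function name parse_guts_row is hypothetical.

```python
# Columns assumed to print two numbers per interval (see the note above).
TWO_VALUE_COLUMNS = {
    "write", "lwrite", "bwrite", "read", "lread",
    "icache", "dcache", "ior", "iow",
}


def parse_guts_row(header, row):
    """Map one data row onto the header names.

    Two-number columns become tuples; all other columns become ints.
    Raises ValueError if the row does not match the header.
    """
    names = header.split()
    values = [int(value) for value in row.split()]
    parsed = {}
    position = 0
    for name in names:
        width = 2 if name in TWO_VALUE_COLUMNS else 1
        chunk = values[position:position + width]
        if len(chunk) != width:
            raise ValueError("row has fewer values than the header expects")
        parsed[name] = tuple(chunk) if width == 2 else chunk[0]
        position += width
    if position != len(values):
        raise ValueError("row has more values than the header expects")
    return parsed


HEADER = (
    "00 01 02 03 04 05 06 07 rpc lpc write lwrite bwrite read lread "
    "icache dcache di ic dd dc ior iow rput rputR cput tputR "
    "rget rgetR cget tgetR rsc rscR csc"
)
ROW = (
    "63 78 59 56 84 64 32 86 4 5 0 0 5 0 0 0 0 0 0 0 5 0 27 0 "
    "8 0 22 0 0 0 0 0 3 1506 0 1506 0 0 0 0 0 0 0"
)

metrics = parse_guts_row(HEADER, ROW)
print(metrics["rput"], metrics["rputR"], metrics["tputR"])
# prints: 3 1506 1506
```

Under these assumptions, the third sample row reads as 3 completed put RPCs carrying 1506 rows, which matches the rput, rputR, and tputR descriptions.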