20 Apr 23:19
Re: Is there any GC log reader?
From: John Coomes <John.Coomes@...>
Subject: Re: Is there any GC log reader?
Newsgroups: gmane.comp.java.openjdk.hotspot.gc.devel
Date: 2007-04-20 21:19:40 GMT
Subject: Re: Is there any GC log reader?
Newsgroups: gmane.comp.java.openjdk.hotspot.gc.devel
Date: 2007-04-20 21:19:40 GMT
Neo Jia (neojia@...) wrote: > On 4/19/07, John Coomes <John.Coomes@...> wrote: > > Neo Jia (neojia@...) wrote: > > > hi, > > > > > > Are there any scripts to read and parse the GC log dumped out by > > > PrintHeapAtGC? > > > > Hi Neo, > > > > I don't know of anything that parses the output from > > -XX:+PrintHeapAtGC. However, I currently maintain an aging awk script > > that parses the output from -XX:+PrintGCDetails and produces > > statistics (min, avg, stddev) for gc pauses, heap sizes, etc. If > > you'd like a copy, let me know. > > John, > > I would like to get a copy of your script. ... Hi Neo, I've attached three awk scripts: PrintGCFixup, PrintGCStats and CompareGCStats. Here's a brief write-up on each: PrintGCFixup Cleans up certain types of GC output to make analysis more accurate. This is not always necessary, but is generally a good idea. The main things it cleans up are the output from * -XX:+PrintVMOptions (enabled by default in debug builds) * -XX:+TraceClassUnloading (enabled by PrintGCDetails, and which obscures Full GC output) * -XX:+PrintHeapAtGC (which obscures almost everything prior to 1.6.0) * -XX:+PrintTenuringDistribution * -XX:+PrintParallelOldGCPhaseTimes (specific to parallel compaction) One thing to note is that it still has a hard time with the output from both -XX:+PrintHeapAtGC and -XX:+PrintTenuringDistribution, particularly with releases prior to 1.6.0. If this bites you, please submit a patch. The comments at the top have a bit more info. PrintGCStats Scans GC logs and prints statistics about GC and allocation. Works best if the logs were created with -XX:+PrintGCDetails and -XX:+PrintGCTimeStamps. Also has some basic support for creating data files for plotting w/gnuplot or StarOffice. See the comments at the top for usage. CompareGCStats Compares the output from two runs of PrintGCStats (presumably before and after) and shows the percent change for the various statistics. Basic Examples Here's how I typically use these, given 2 log files with -XX:+PrintGCDetails -XX:+PrintGCTimeStamps output (say ref.log is from the reference VM, new.log is from a modified VM that I'm evaluating): $ PrintGCFixup ref.log > ref.fix $ PrintGCFixup new.log > new.fix $ PrintGCStats -v cpus=4 ref.fix > ref.stats $ PrintGCStats -v cpus=4 new.fix > new.stats $ CompareGCStats ref.stats new.stats > new.comp Below is an example of the result, new.comp. In the pairs of lines that have the same label, the first is from ref.stats and the second is from new.stats; the unlabeled line that follows shows the %change. E.g., this run shows that the average 'gen1t(s)' (Full GC time in seconds), improved by 14% and the max improved by 24%. ref.stats vs. new.stats what count total mean max stddev gen0t(s) 1479 167.596 0.11332 0.452 0.0906 gen0t(s) 1487 169.604 0.11406 0.463 0.0910 0.541% 1.198% 0.65302% 2.434% 0.4415% gen1t(s) 80 95.982 1.19977 1.903 0.1945 gen1t(s) 90 92.594 1.02882 1.430 0.1206 12.500% -3.530% -14.24856% -24.855% -37.9949% GC(s) 1559 263.577 0.16907 1.903 0.2592 GC(s) 1577 262.198 0.16626 1.430 0.2317 1.155% -0.523% -1.66203% -24.855% -10.6096% alloc(MB) 1479 96679.169 65.36793 66.000 2.5605 alloc(MB) 1487 97085.999 65.28984 66.000 2.7041 0.541% 0.421% -0.11946% 0.000% 5.6083% promo(MB) 1479 3235.751 2.18780 34.328 4.3802 promo(MB) 1487 3471.243 2.33439 34.305 4.4127 0.541% 7.278% 6.70034% -0.067% 0.7420% used(MB) 1479 235279.202 159.07992 244.425 66.0007 used(MB) 1487 236180.818 158.83041 243.941 65.8061 0.541% 0.383% -0.15685% -0.198% -0.2948% commit0(MB) 1479 113883.000 77.00000 77.000 0.0000 commit0(MB) 1487 114499.000 77.00000 77.000 0.0000 0.541% 0.541% 0.00000% 0.000% 0.0000% commit1(MB) 1479 254388.000 172.00000 172.000 0.0000 commit1(MB) 1487 255764.000 172.00000 172.000 0.0000 0.541% 0.541% 0.00000% 0.000% 0.0000% commit(MB) 1479 368271.000 249.00000 249.000 0.0000 commit(MB) 1487 370263.000 249.00000 249.000 0.0000 0.541% 0.541% 0.00000% 0.000% 0.0000% alloc/elapsed_time = 96679.169 MB / 4920.718 s = 19.647 MB/s alloc/elapsed_time = 97085.999 MB / 4920.171 s = 19.732 MB/s 0.433% alloc/tot_cpu_time = 96679.169 MB / 19682.872 s = 4.912 MB/s alloc/tot_cpu_time = 97085.999 MB / 19680.684 s = 4.933 MB/s 0.428% alloc/mut_cpu_time = 96679.169 MB / 18628.563 s = 5.190 MB/s alloc/mut_cpu_time = 97085.999 MB / 18631.892 s = 5.211 MB/s 0.405% promo/elapsed_time = 3235.751 MB / 4920.718 s = 0.658 MB/s promo/elapsed_time = 3471.243 MB / 4920.171 s = 0.706 MB/s 7.295% promo/gc0_time = 3235.751 MB / 167.596 s = 19.307 MB/s promo/gc0_time = 3471.243 MB / 169.604 s = 20.467 MB/s 6.008% gc_seq_load = 1054.309 s / 19682.872 s = 5.356% gc_seq_load = 1048.792 s / 19680.684 s = 5.329% -0.027% gc_conc_load = 0.000 s / 19682.872 s = 0.000% gc_conc_load = 0.000 s / 19680.684 s = 0.000% 0.000% gc_tot_load = 1054.309 s / 19682.872 s = 5.356% gc_tot_load = 1048.792 s / 19680.684 s = 5.329% -0.027% Note that the alloc (allocation) and promo (promotion) statistics are not exact; they're useful only as a general indication of how much allocation/promotion was done. You can aggregate the statistics from multiple runs by including additional file names on the command line. $ PrintGCStats -v cpus=4 ref-1.fix ref-2.fix ref-3.fix > ref.stats $ PrintGCStats -v cpus=4 new-1.fix new-2.fix new-3.fix > new.stats > Maybe the openJDK project > needs this kind of script to generate some visual display, or > something else similar to the GCSpy > (http://www.cs.kent.ac.uk/pubs/2002/1426/) used in JikesRVM project. Yes, we're definitely looking for/working toward something to easily enable visual display. IIRC, the overhead of GCSpy was higher than many would be willing to tolerate in production, at least for some of the features. Tony may be able to provide more detail. -John
#! /usr/xpg4/bin/awk -f
#
# Usage: PrintGCFixup.pc [file ...]
#
# Cleanup HotSpot GC logs so they are more readable and amenable to analysis
# with PrintGCStats. The following cleanups are performed:
#
# 1) Strip -XX:+PrintHeapAtGCOutput.
#
# 2) Move -XX:+PrintVMOptions output so it doesn't interfere with GC output.
#
# 3) Strip -XX:+TraceClassUnloading output.
#
# 4) Move the output from -XX:+PrintParallelOldGCPhaseTimes from the middle of
# -XX:+PrintGCDetails output to just before it.
#
# The 1.5.0 updates that contain the parallel old collector use the label
# 'post compact' for the phase that updates interior oops in objects that
# were deferred because they cross compaction boundaries. Also, no timers
# cover the phase in which data structures are reset after compaction.
# Later VMs use the label 'deferred updates' for the phase in which deferred
# objects are updated, and label the time resetting data structures as 'post
# compact.' If 1.5.0 output is detected, the label 'post compact' is
# converted to 'deferred updates' to allow comparison of 1.5.0 and later VMs.
#
# 5) Move -XX:+PrintTenuringDistribution output from the middle of the GC
# output to just before it.
#
# 6) Move g1 'evac failed' messages from the middle of the GC output to just
# before it.
#
# 7) Fix misleading 'Full GC' messages from JNI critical sections w/CMS.
#
# 8) Change cms concurrent mode failure output so the concurrent phase line is
# before the full gc output instead of in the middle.
#
# 9) Move the output from -XX:+TraceParallelOldGCSummaryPhase from the middle of
# GC output.
# ------------------------------------------------------------------------------
BEGIN {
vm_option_prefix = "";
unloading_class_prefix = "";
# Regular expressions to match PrintHeapAtGC header lines.
ph_hdr_beg_ns_re = "\\{Heap before (GC|gc) invocations=";
ph_hdr_end_ns_re = "\\{?Heap after (GC|gc) invocations=";
# Some VM versions print a space at the start of the PrintHeapAtGC
# header line and some don't. These variables are set to the correct
# regular expression (including the space or not) once it is known.
ph_hdr_beg_re = "";
ph_hdr_end_re = "";
# Regular expression to match a 'generation' line in -XX:+PrintHeapAtGC
# output.
ph_gen_re = "^ +";
ph_gen_re = ph_gen_re "(((AS)?PS|Par)(Young|Old|Perm)Gen";
ph_gen_re = ph_gen_re "|(par|def) new generation";
ph_gen_re = ph_gen_re "|concurrent[ -]mark-sweep (generation|perm gen)";
ph_gen_re = ph_gen_re "|tenured generation|compacting perm gen";
ph_gen_re = ph_gen_re ") +total [0-9]+[KMG], used [0-9]+[KMG]";
# Regular expression to match a 'space' line in -XX:+PrintHeapAtGC
# output.
ph_spc_re = "^ +";
ph_spc_re = ph_spc_re "(eden|from|to |object|the|r[ow]) space";
ph_spc_re = ph_spc_re " +[0-9]+[KMG], +[0-9]+% used";
full_gc_re = "\\[Full GC( \\(System\\))? ";
full_gc_ns_re = substr(full_gc_re, 1, length(full_gc_re) - 1);
heap_size_re = "[0-9]+[KM]"; # 8K
heap_size_paren_re = "\\(" heap_size_re "\\)"; # (8K)
heap_size_change_re = heap_size_re "->" heap_size_re; # 8K->4K
# 8K->4K(96K), or 8K->4K (96K)
heap_size_status_re = heap_size_change_re " ?" heap_size_paren_re;
gc_time_re = "[0-9]+\\.[0-9]+";
gc_time_secs_re = gc_time_re " secs";
gc_time_ms_re = gc_time_re " ms";
timestamp_re = "(" gc_time_re ": *)?";
timestamp_range_re = "(" gc_time_re "-" gc_time_re ": *)?";
# Heap size status plus elapsed time: 8K->4K(96K), 0.0517089 secs
heap_report_re = heap_size_status_re ", " gc_time_secs_re;
# Size printed at CMS initial mark and remark.
cms_heap_size_re = heap_size_re heap_size_paren_re; # 6K(9K)
cms_heap_report_re = cms_heap_size_re ", " gc_time_secs_re;
cms_concurrent_phase_re = "(AS)?CMS-concurrent-(mark|(abortable-)?preclean|sweep|reset)";
# Generations which print optional messages.
promo_failed_re = "( \\(promotion failed\\))?"
cms_gen_re = "(AS)?CMS( \\(concurrent mode failure\\))?";
parnew_gen_re = "(AS)?ParNew";
# 'Framework' GCs: DefNew, ParNew, Tenured, CMS
fw_yng_gen_re = "(DefNew|" parnew_gen_re ")" promo_failed_re;
fw_old_gen_re = "(Tenured|" cms_gen_re ")";
# Garbage First (G1) pauses:
# [GC pause (young), 0.0082 secs]
# or [GC pause (partial), 0.082 secs]
# or [GC pause (young) (initial mark), 0.082 secs]
# or [GC remark, 0.082 secs]
# or [GC cleanup 11M->11M(25M), 0.126 secs]
g1_cleanup_re = "cleanup " heap_size_status_re;
g1_pause_re = "pause \\((young|partial)\\)";
g1_pause_re = g1_pause_re "( \\((initial-mark|evacuation failed)\\))?";
g1_stw_re = "\\[GC (" g1_pause_re "|remark|" g1_cleanup_re "), " \
gc_time_secs_re "\\]";
# PrintTenuringDistribution (ptd)
ptd_initial_re = "^Desired survivor size [0-9]+ [a-zA-Z]+, new threshold [0-9]+";
ptd_age_re = "^- age +[0-9]+: +[0-9]+ [a-zA-Z]+, +[0-9]+ total$";
# Used to determine if the label 'post compact' should be converted to
# 'deferred updates'
deferred_updates_line = -1;
}
# ------------------------------------------------------------------------------
# Strip -XX:+PrintHeapAtGCOutput
# ------------------------------------------------------------------------------
function is_print_heap_at_gc_hdr(string) {
if (ph_hdr_beg_re != "") {
if (match(string, ph_hdr_beg_re)) return 1;
if (match(string, ph_hdr_end_re)) return 2;
return 0;
}
# Try to determine whether or not this VM prints a space at the start
# of the header line or not.
if (match(string, "^ " ph_hdr_beg_ns_re) || \
match(string, " " ph_hdr_beg_ns_re)) {
# This VM prints a space before the header.
ph_hdr_beg_re = " " ph_hdr_beg_ns_re;
ph_hdr_end_re = " " ph_hdr_end_ns_re;
match(string, ph_hdr_beg_re); # Set RSTART, RLENGTH correctly.
return 1;
}
if (match(string, "^ " ph_hdr_end_ns_re) || \
match(string, " " ph_hdr_end_ns_re)) {
# This VM prints a space before the header.
ph_hdr_beg_re = " " ph_hdr_beg_ns_re;
ph_hdr_end_re = " " ph_hdr_end_ns_re;
match(string, ph_hdr_end_re); # Set RSTART, RLENGTH correctly.
return 2;
}
if (match(string, ph_hdr_beg_ns_re)) {
# This VM does not print a space before the header.
ph_hdr_beg_re = ph_hdr_beg_ns_re;
ph_hdr_end_re = ph_hdr_end_ns_re;
return 1;
}
if (match(string, ph_hdr_end_ns_re)) {
# This VM does not print a space before the header.
ph_hdr_beg_re = " " ph_hdr_beg_ns_re;
ph_hdr_end_re = " " ph_hdr_end_ns_re;
return 2;
}
return 0;
}
function is_print_heap_at_gc_line(string) {
if (string == "Heap" || string == "}") return 1;
if (match(string, ph_gen_re)) return 1;
if (match(string, ph_spc_re)) return 1;
if (string == "No shared spaces configured.") return 1; # Ugh.
return 0;
}
function skip_print_heap_at_gc_lines() {
while (is_print_heap_at_gc_line($0)) {
getline;
}
if (match($0, "^} ")) {
$0 = substr($0, 3);
}
}
{
rc = is_print_heap_at_gc_hdr($0);
while (rc) {
if (RSTART > 1) {
ph_saved_text = substr($0, 1, RSTART - 1);
} else {
ph_saved_text = "";
}
getline;
skip_print_heap_at_gc_lines();
$0 = ph_saved_text $0;
rc = is_print_heap_at_gc_hdr($0);
}
}
# ------------------------------------------------------------------------------
# Strip gc overhead messages.
# ------------------------------------------------------------------------------
match($0, "\tGC (overhead|time) would exceed GC(Overhead|Time)Limit of [0-9]+%") && RSTART > 1 {
gcol_saved_txt = substr($0, 1, RSTART - 1);
print substr($0, RSTART + 1, RLENGTH);
getline;
$0 = gcol_saved_txt $0;
}
# ------------------------------------------------------------------------------
# Strip i-cms duty cycle output.
#
# 23.136: [GC 23.137: [ParNew: 65408K->0K(65472K), 0.0737562 secs] 72847K->13616K(1048512K)
icms_dc=5 , 0.0741636 secs]
# By default, icms duty cycle output is stripped. Allow
# '-v keep_icms_dc=1' on the command line to override that.
keep_icms_dc == 0 && match($0, heap_size_status_re " icms_dc=[0-9]+ ") {
sub(" icms_dc=[0-9]+ ", "");
# Note: no 'next' here; need to allow other patterns see the input.
}
# ------------------------------------------------------------------------------
# Fix -XX:+PrintVMOptions output.
#
# Change this:
#
# [GCVM option '+UseParallelOldGC'
# VM option '+UseParallelGC'
# VM option '+PrintGCDetails'
# VM option 'ParallelGCThreads=4'
# [PSYoungGen: 3004K->503K(3584K)] 3004K->2850K(7680K), 0.0191534 secs]
#
# to this:
#
# VM option '+UseParallelOldGC'
# VM option '+UseParallelGC'
# VM option '+PrintGCDetails'
# VM option 'ParallelGCThreads=4'
# [GC [PSYoungGen: 3004K->503K(3584K)] 3004K->2850K(7680K), 0.0191534 secs]
# ------------------------------------------------------------------------------
vm_option_prefix == "" && match($0, "VM option '.+'") {
if (RSTART > 1) {
vm_option_prefix = substr($0, 1, RSTART - 1);
# Print the VM option.
print substr($0, RSTART);
next;
}
}
vm_option_prefix != "" {
if (match($0, "^VM option '.+'")) {
print;
next;
} else {
$0 = vm_option_prefix $0;
vm_option_prefix = "";
}
}
# ------------------------------------------------------------------------------
# Strip -XX:+TraceClassUnloading output which, unfortunately, is enabled as a
# side-effect of -XX:+PrintGCDetails.
# ------------------------------------------------------------------------------
unloading_class_prefix == "" && match($0, "\\[Unloading class [^]]+\\]") {
if (RSTART > 1) {
unloading_class_prefix = substr($0, 1, RSTART - 1);
# print substr($0, RSTART, length($0) - RSTART + 1);
next;
}
}
unloading_class_prefix != "" {
if (match($0, "^\\[Unloading class [^]]+\\]")) {
# print;
next;
} else {
$0 = unloading_class_prefix $0;
unloading_class_prefix = "";
}
}
# ------------------------------------------------------------------------------
# Move the output from -XX:+PrintParallelOldGCPhaseTimes from the middle of
# -XX:+PrintGCDetails output to just before the PrintGCDetails output. For
# example, change this:
#
# 0.012: [pre compact, 0.0000074 secs]
# 0.012: [Full GC0.012: [par marking phase0.012: [par marking main, 0.0071655 secs]
# 0.019: [reference processing, 0.0000417 secs]
# 0.019: [class unloading, 0.0010241 secs]
# , 0.0083858 secs]
# 0.021: [summary phase, 0.0014581 secs]
# 0.022: [adjust roots, 0.0014760 secs]
# 0.024: [compact perm gen, 0.0109628 secs]
# 0.035: [drain task setup, 0.0012706 secs]
# 0.036: [dense prefix task setup, 0.0000058 secs]
# 0.036: [steal task setup, 0.0000013 secs]
# 0.036: [par compact, 0.0059447 secs]
# 0.042: [deferred updates, 0.0026215 secs]
# 0.045: [post compact, 0.0008886 secs]
# [PSYoungGen: 503K->0K(3584K)] [ParOldGen: 2347K->2838K(7744K)] 2850K->2838K(11328K)
[PSPermGen: 1457K->1456K(16384K)], 0.0340510 secs]
#
# to this:
#
# 0.012: [pre compact, 0.0000074 secs]
# 0.012: [par marking main, 0.0071655 secs]
# 0.019: [reference processing, 0.0000417 secs]
# 0.019: [class unloading, 0.0010241 secs]
# 0.012: [par marking phase, 0.0083858 secs]
# 0.021: [summary phase, 0.0014581 secs]
# 0.022: [adjust roots, 0.0014760 secs]
# 0.024: [compact perm gen, 0.0109628 secs]
# 0.035: [drain task setup, 0.0012706 secs]
# 0.036: [dense prefix task setup, 0.0000058 secs]
# 0.036: [steal task setup, 0.0000013 secs]
# 0.036: [par compact, 0.0059447 secs]
# 0.042: [deferred updates, 0.0026215 secs]
# 0.045: [post compact, 0.0008886 secs]
# 0.012: [Full GC [PSYoungGen: 503K->0K(3584K)] [ParOldGen: 2347K->2838K(7744K)]
2850K->2838K(11328K) [PSPermGen: 1457K->1456K(16384K)], 0.0340510 secs]
# ------------------------------------------------------------------------------
# The marking phase may have nested items; fix them up.
/\[(par )?marking phase.*\[par mark *[,:]/ {
match($0, timestamp_re "\\[par mark *[,:] " gc_time_secs_re "\\]");
print substr($0, RSTART, RLENGTH); # [par marking main, ...]
# Reunite the first (outer) 'marking phase' text with its time.
tmp1 = substr($0, 1, RSTART - 1);
tmp2 = substr($0, RSTART + RLENGTH);
while (getline tmp3 > 0 && !match(tmp3, "^[,:] " gc_time_secs_re "\\]")) {
print tmp3;
}
$0 = tmp1 tmp2 tmp3;
}
match($0, timestamp_re "\\[(par )?marking phase *[,:] " gc_time_secs_re "\\]") {
# print NR, RSTART, RLENGTH ":" $0;
pc_saved_text = substr($0, 1, RSTART - 1);
print substr($0, RSTART, length($0) - RSTART + 1);
next;
}
# The compaction phase may have nested items; fix them up.
/\[compaction phase.*\[drain task setup *[,:]/ {
match($0, timestamp_re "\\[drain task setup *[,:] " gc_time_secs_re "\\]");
print substr($0, RSTART, RLENGTH); # [par compact, ...]
# Reunite the 'compaction phase' text with the associated time.
tmp1 = substr($0, 1, RSTART - 1);
tmp2 = substr($0, RSTART + RLENGTH);
while (getline tmp3 > 0 && !match(tmp3, "^[,:] " gc_time_secs_re "\\]")) {
if (match(tmp3, "\\[deferred updates *[,:] " gc_time_secs_re "\\]")) {
deferred_updates_line = NR;
}
print tmp3;
}
$0 = tmp1 tmp2 tmp3;
}
/\[deferred updates *[,:] [0-9][0-9.]* secs\]/ {
print;
deferred_updates_line = NR;
next;
}
/\[post compact *[,:] [0-9][0-9.]* secs\]/ {
# If the deferred updates line has not been seen, this is an older VM
# that labels the deferred updates phase as 'post compact.' Fix it up.
if (deferred_updates_line < 0) {
sub("post compact", "deferred updates");
} else {
# Reset deferred_updates_line, since we may process multiple
# files, some from older from VMs.
deferred_updates_line = -1;
}
print;
printf("%s", pc_saved_text); # Note: no newline.
next;
}
# ------------------------------------------------------------------------------
# Strip -XX:+PrintTenuringDistribution output.
# ------------------------------------------------------------------------------
# DefNew/Parnew before:
#
# [GC [DefNew
# Desired survivor size 32768 bytes, new threshold 1 (max 31)
# - age 1: 5280 bytes, 65280 total
# - age 2: 9820 bytes, 65280 total
# - age 3: 50180 bytes, 65280 total
# : 2111K->63K(2112K), 0.0003662 secs] 3415K->1383K(3520K), 0.0004358 secs]
#
# After:
#
# [GC [DefNew: 2111K->63K(2112K), 0.0003662 secs] 3415K->1383K(3520K), 0.0004358 secs]
#
# Parallel scavenge before:
#
# [GC
# Desired survivor size 1572864 bytes, new threshold 7 (max 15)
# [PSYoungGen: 8571K->1299K(10752K)] 8571K->8331K(35328K), 0.0257617 secs]
#
# After:
#
# [GC [PSYoungGen: 8571K->1299K(10752K)] 8571K->8331K(35328K), 0.0257617 secs]
match($0, "^" timestamp_re "\\[GC( " timestamp_re "\\[" fw_yng_gen_re ")?$") {
td_saved_text = $0;
getline;
if (match ($0, ptd_initial_re)) {
print; getline;
while (match($0, ptd_age_re)) {
print; getline;
}
}
# This may not be the correct result if the ptd_initial_re line
# ("Desired survivor size") was not matched. However, since we've done
# a getline it's too late to do anything else.
$0 = td_saved_text $0;
}
# ------------------------------------------------------------------------------
# G1 evac failed messages.
# ------------------------------------------------------------------------------
# [GC pause (young)evac failed in heap region 01051e8 [0xf7400000,0xf7500000), (F: age 0) first obj = 0xf74d1440
# evac failed in heap region 0104cc8 [0xf7300000,0xf7400000), (F: age 0) first obj = 0xf73db610
# ...
# (evacuation failed), 0.16025130 secs]
match($0, "\\[GC " g1_pause_re "evac failed in heap region") {
g1_saved_len = RLENGTH - 26;
g1_saved_txt = substr($0, RSTART, g1_saved_len);
print substr($0, RSTART + g1_saved_len);
getline;
while (match($0, "^evac failed in heap region")) {
print;
getline;
}
$0 = g1_saved_txt $0;
}
# ------------------------------------------------------------------------------
# Fix misleading 'Full GC' messages from JNI critical sections w/CMS.
# ------------------------------------------------------------------------------
# Get rid of the 'Full' in the following:
#
# 5806.397: [Full GC 5806.397: [ParNew: 21817K->9930K(183680K), 0.1656553 secs]
246236K->240893K(1022336K), 0.1660009 secs]
match($0, full_gc_re timestamp_re "\\[ParNew: " heap_report_re "\\] " heap_report_re "\\]$") {
t1 = substr($0, 1, RSTART);
t2 = substr($0, RSTART + 6);
$0 = t1 t2;
}
# ------------------------------------------------------------------------------
# Change cms concurrent mode failure output so the concurrent phase line is
# before the full gc output instead of in the middle.
# ------------------------------------------------------------------------------
# Before:
#
# 2.139: [GC 2.139: [ParNew: 7814K->7814K(8128K), 0.0000347 secs]2.139: [CMS2.139:
[CMS-concurrent-abortable-preclean: 0.050/1.184 secs]
# (concurrent mode failure): 248160K->251282K(251400K), 0.3042057 secs]
255974K->255969K(259528K), 0.3045270 secs]
#
# After:
#
# 2.139: [CMS-concurrent-abortable-preclean: 0.050/1.184 secs]
# 2.139: [GC 2.139: [ParNew: 7814K->7814K(8128K), 0.0000347 secs]2.139: [CMS (concurrent mode
failure): 248160K->251282K(251400K), 0.3042057 secs] 255974K->255969K(259528K), 0.3045270 secs]
match($0, timestamp_re "\\[CMS" timestamp_re "\\[" cms_concurrent_phase_re ": +" gc_time_re "/"
gc_time_secs_re "\\]$") {
match($0, timestamp_re "\\[" cms_concurrent_phase_re ": +");
cms_cmf_saved_txt = substr($0, 1, RSTART - 1);
print substr($0, RSTART);
getline;
$0 = cms_cmf_saved_txt $0;
}
# ------------------------------------------------------------------------------
# Change 'Perm :' to 'Perm:'
# ------------------------------------------------------------------------------
# Remove the space before the colon.
match($0, "\\[(CMS )?Perm :") {
$0 = substr($0, 1, RSTART + RLENGTH - 3) \
substr($0, RSTART + RLENGTH - 1);
}
# ------------------------------------------------------------------------------
# Clean up the output from -XX:+TraceParallelOldGCSummaryPhase
# ------------------------------------------------------------------------------
match($0, "^" timestamp_re full_gc_ns_re "sb=") {
pc_saved_txt = substr($0, RSTART, RLENGTH - 3);
$0 = substr($0, RSTART + RLENGTH - 3);
}
match($0, "^ \\[PSYoungGen: " heap_size_status_re "\\]") {
$0 = pc_saved_txt $0;
}
# ------------------------------------------------------------------------------
{ print; }
#! /usr/xpg4/bin/awk -f
# Note: /bin/nawk on Solaris 9 seems to have a bug; the RSTART and RLENGTH
# vars are not set correctly during the last call to match() in
# recordGCPauseEVM.
# PrintGCStats - summarize statistics about garbage collection, in particular gc
# pause time totals, averages, maximum and standard deviations.
#
# Attribution: written by John Coomes, based on earlier work by Peter Kessler,
# Ross Knippel and Jon Masamitsu.
#
# The input to this script should be the output from the HotSpot(TM)
# Virtual Machine when run with one or more of the following flags:
#
# -verbose:gc # produces minimal output so statistics are
# # limited, but available in all VMs
#
# -XX:+PrintGCTimeStamps # enables time-based statistics (e.g.,
# # allocation rates, intervals), but only
# # available in JDK 1.4.0 and later.
#
# -XX:+PrintGCDetails # enables more detailed statistics gathering,
# # but only available in JDK 1.4.1 and later.
#
# -XX:-TraceClassUnloading # [1.5.0 and later] disable messages about class
# # unloading, which are enabled as a side-effect
# # by -XX:+PrintGCDetails. The class unloading
# # messages confuse this script and will cause
# # some GC information in the log to be ignored.
# #
# # Note: This option only has an effect in 1.5.0
# # and later. Prior to 1.5.0, the option is
# # accepted, but is overridden by
# # -XX:+PrintGCDetails. In 1.4.2 and earlier
# # releases, use -XX:-ClassUnloading instead (see
# # below).
#
# -XX:-ClassUnloading # disable class unloading, since PrintGCDetails
# # turns on TraceClassUnloading, which cannot be
# # overridden from the command line until 1.5.0.
#
# Recommended command-line with JDK 1.5.0 and later:
#
# java -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails \
# -XX:-TraceClassUnloading ...
#
# Recommended command-line with JDK 1.4.1 and 1.4.2:
#
# java -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails \
# -XX:-ClassUnloading ...
#
# ------------------------------------------------------------------------------
#
# Usage:
#
# PrintGCStats -v cpus=<n> [-v interval=<seconds>] [-v verbose=1] [file ...]
# PrintGCStats -v plot=name [-v plotcolumns=<n>] [-v verbose=1] [file ...]
#
# cpus - number of cpus on the machine where java was run, used to
# compute cpu time available and gc 'load' factors. No default;
# must be specified on the command line (defaulting to 1 is too
# error prone).
#
# ncpu - synonym for cpus, accepted for backward compatibility
#
# interval - print statistics at the end of each interval; requires
# output from -XX:+PrintGCTimeStamps. Default is 0 (disabled).
#
# plot - generate data points useful for plotting one of the collected
# statistics instead of the normal statistics summary. The name
# argument is the name of one of the output statistics, e.g.,
# "gen0t(s)", "cmsRM(s)", "commit0(MB)", etc.
#
# The default output format for time-based statistics such as
# "gen0t(s)" includes four columns, described below. The
# default output format for size-based statistics such as
# "commit0(MB)" includes just the first two columns. The
# number of columns in the output can be set on the command
# line with -v plotcolumns=<N>.
#
# The output columns are:
#
# 1) the starting timestamp if timestamps are present, or a
# simple counter if not
#
# 2) the value of the desired statistic (e.g., the length of a
# cms remark pause).
#
# 3) the ending timestamp (or counter)
#
# 4) the value of the desired statistic (again)
#
# The last column is to make plotting start & stop events
# easier.
#
# plotcolumns - the number of columns to include in the plot data.
#
# verbose - if non-zero, print each item on a separate line in addition
# to the summary statistics
#
# Typical usage:
#
# PrintGCStats -v cpus=4 gc.log > gc.stats
#
# ------------------------------------------------------------------------------
#
# Basic Output statistics:
#
# gen0(s) - young gen collection time, excluding gc_prologue & gc_epilogue.
# gen0t(s) - young gen collection time, including gc_prologue & gc_epilogue
# gen1i(s) - train generation incremental collection
# gen1t(s) - old generation collection/full GC
# cmsIM(s) - CMS initial mark pause
# cmsRM(s) - CMS remark pause
# cmsRS(s) - CMS resize pause
# GC(s) - all stop-the-world GC pauses
# cmsCM(s) - CMS concurrent mark phase
# cmsCP(s) - CMS concurrent preclean phase
# cmsCS(s) - CMS concurrent sweep phase
# cmsCR(s) - CMS concurrent reset phase
# alloc(MB) - object allocation in MB (approximate***)
# promo(MB) - object promotion in MB (approximate***)
# used0(MB) - young gen used memory size (before gc)
# used1(MB) - old gen used memory size (before gc)
# used(MB) - heap space used memory size (before gc) (excludes perm gen)
# commit0(MB) - young gen committed memory size (after gc)
# commit1(MB) - old gen committed memory size (after gc)
# commit(MB) - heap committed memory size (after gc) (excludes perm gen)
# apptime(s) - amount of time application threads were running
# safept(s) - amount of time the VM spent at safepoints (app threads stopped)
#
# *** - these values are approximate because there is no way to track
# allocations that occur directly into older generations.
#
# Some definitions:
#
# 'mutator' or 'mutator thread': a gc-centric term referring to a non-GC
# thread that modifies or 'mutates' the heap by allocating memory and/or
# updating object fields.
#
# promotion: when an object that was allocated in the young generation has
# survived long enough, it is copied, or promoted, into the old generation.
#
# Time-based Output Statistics (require -XX:+PrintGCTimeStamps):
#
# alloc/elapsed_time - allocation rate, based on elapsed time
# alloc/tot_cpu_time - allocation rate, based on total cpu time
# alloc/mut_cpu_time - allocation rate, based on cpu time available to mutators
# promo/elapsed_time - promotion rate, based on elapsed time
# promo/gc0_time - promotion rate, based on young gen gc time
# gc_seq_load - the percentage of cpu cycles used by gc 'serially'
# (i.e., while java application threads are stopped)
# gc_conc_load - the percentage of cpu cycles used by gc 'concurrently'
# (i.e., while java application threads are also running)
# gc_tot_load - the percentage of cpu cycles spent in gc
BEGIN {
usage_msg = "PrintGCStats -v cpus=<n> [-v interval=<seconds>] " \
"[-v verbose=1] [file ...]\n" \
"PrintGCStats -v plot=name [-v plotcolumns=<n>] "\
"[-v verbose=1] [file ...]";
# Seconds between printing per-interval statistics; a negative value disables
# intervals (the default). Allow command line to override.
timeStampDelta = interval == 0 ? -1 : interval;
# Number of cpus. Require this on the command line since defaulting to 1 is
# too error prone. Older versions used ncpu as the var name; accept it for
# compatibility.
if (cpus == 0) cpus = ncpu;
if (cpus == 0 && plot == "") {
print usage_msg;
exit(1);
}
# A note on time stamps: the firstTimeStamp is not always assumed to be 0 so
# that we can get accurate elapsed time measurement for a partial log (e.g.,
# from the steady-state portion of a log from a long running server). This
# means that the elapsed time measurement can be wrong if a program runs for a
# significant amount of time before the first gc time stamp is reported. The
# best way to fix this is to have the VM emit a time stamp when heap
# initialization is complete.
firstTimeStamp = -1.0; # sentinel
prevTimeStamp = lastTimeStamp = firstTimeStamp;
lastFileName = ""; # Used to detect when the input file has changed.
# This value is added to time stamps so that input from multiple files appears
# to have monotonically increasing timestamps.
timeStampOffset = 0.0;
i = -1;
gen0c_idx = ++i; # With PrintGCDetails, DefNew collection time only.
gen0t_idx = ++i; # Includes gc_prologue() & gc_epilogue().
gen1i_idx = ++i; # Train incremental collection time.
gen1t_idx = ++i; # Full GCs or Tenured GCs
cmsIM_idx = ++i; # CMS Initial Mark
cmsRM_idx = ++i; # CMS Remark
cmsRS_idx = ++i; # CMS Resize (evm only)
totgc_idx = ++i; # Total gc pause time
# These must be greater than the totgc_idx.
cmsCM_idx = ++i; # CMS Concurrent Mark
cmsCP_idx = ++i; # CMS Concurrent Preclean
cmsCS_idx = ++i; # CMS Concurrent Sweep
cmsCR_idx = ++i; # CMS Concurrent Reset
MB_a_idx = ++i; # MB allocated
MB_p_idx = ++i; # MB promoted
MB_used0_idx = ++i; # MB used in young gen
MB_used1_idx = ++i; # MB used in old gen
MB_usedh_idx = ++i; # MB used in heap (occupancy)
MB_c0_idx = ++i; # MB committed for gen0
MB_c1_idx = ++i; # MB committed for gen1
MB_ch_idx = ++i; # MB committed for entire heap
safept_idx = ++i; # Time application threads were stopped at a safepoint,
# from -XX:+TraceGCApplicationStoppedTime
apptime_idx = ++i; # Time application threads were running, from
# -XX:+PrintGCApplicationConcurrentTime
# Parallel old phases from PrintParallelOldGCPhaseTimes.
PO_precomp_idx = ++i;
PO_marking_idx = ++i;
PO_parmark_idx = ++i;
PO_mark_flush_idx = ++i;
PO_summary_idx = ++i;
PO_adjroots_idx = ++i;
PO_permgen_idx = ++i;
PO_compact_idx = ++i;
PO_drain_ts_idx = ++i;
PO_dpre_ts_idx = ++i;
PO_steal_ts_idx = ++i;
PO_parcomp_idx = ++i;
PO_deferred_idx = ++i;
PO_postcomp_idx = ++i;
last_idx = ++i; # This is just the last *named* index; a corresponding
# delta_* array item exists for each of the above items
# starting at this point in the array.
plot_cnt = -1; # Used to identify plot lines if timestamps are not
# available.
# Init arrays.
name_v[gen0c_idx] = "gen0(s)";
name_v[gen0t_idx] = "gen0t(s)";
name_v[gen1i_idx] = "gen1i(s)";
name_v[gen1t_idx] = "gen1t(s)";
name_v[cmsIM_idx] = "cmsIM(s)";
name_v[cmsRM_idx] = "cmsRM(s)";
name_v[cmsRS_idx] = "cmsRS(s)";
name_v[totgc_idx] = "GC(s)";
name_v[cmsCM_idx] = "cmsCM(s)";
name_v[cmsCP_idx] = "cmsCP(s)";
name_v[cmsCS_idx] = "cmsCS(s)";
name_v[cmsCR_idx] = "cmsCR(s)";
name_v[MB_a_idx] = "alloc(MB)";
name_v[MB_p_idx] = "promo(MB)";
name_v[MB_used0_idx] = "used0(MB)";
name_v[MB_used1_idx] = "used1(MB)";
name_v[MB_usedh_idx] = "used(MB)";
name_v[MB_c0_idx] = "commit0(MB)";
name_v[MB_c1_idx] = "commit1(MB)";
name_v[MB_ch_idx] = "commit(MB)";
name_v[safept_idx] = "safept(s)";
name_v[apptime_idx] = "apptime(s)";
name_v[PO_precomp_idx] = "precomp(s)";
name_v[PO_marking_idx] = "marking(s)";
name_v[PO_parmark_idx] = "parmark(s)";
name_v[PO_mark_flush_idx] = "mark-f(s)";
name_v[PO_summary_idx] = "summary(s)";
name_v[PO_adjroots_idx] = "adjroots(s)";
name_v[PO_permgen_idx] = "permgen(s)";
name_v[PO_compact_idx] = "compact(s)";
name_v[PO_drain_ts_idx] = "drain_ts(s)";
name_v[PO_dpre_ts_idx] = "dpre_ts(s)";
name_v[PO_steal_ts_idx] = "steal_ts(s)";
name_v[PO_parcomp_idx] = "parcomp(s)";
name_v[PO_deferred_idx] = "deferred(s)";
name_v[PO_postcomp_idx] = "postcomp(s)";
for (i = 0; i < last_idx; ++i) {
count_v[i] = 0;
sum_v[i] = 0.0;
max_v[i] = 0.0;
sum_of_sq_v[i] = 0.0;
name_v[last_idx + i] = name_v[i]; # Copy names.
}
plot_idx = -1;
if (plot != "") {
# Convert the plot=name value to a plot_idx. The default is no plotting,
# which occurs when plot_idx < 0.
for (i = 0; plot_idx < 0 && i < last_idx; ++i) {
if (plot == name_v[i]) {
plot_idx = i;
}
}
if (plot_idx < 0) {
print "PrintGCStats: unrecognized plot name plot=" plot ".";
print usage_msg;
exit(1);
}
}
# If plotting, set plotcolumns based on the statistic being plotted (unless
# plotcolumns was set on the command line).
if (plot_idx >= 0 && plotcolumns == 0) {
if (plot_idx >= MB_a_idx && plot_idx <= MB_ch_idx) {
# Use 2 columns for size-based statistics.
plotcolumns = 2;
} else {
# Use 4 columns for time-based statistics.
plotcolumns = 4;
}
}
# Heap sizes at the start & end of the last gen0 collection.
gen0_sizes[0] = 0.0;
gen0_sizes[1] = 0.0;
gen0_sizes[2] = 0.0;
initIntervalVars();
last_cmsRcount = 0;
printFirstHeader = 1;
# Six columns: name, count, total, mean, max, standard deviation
headfmt = "%-11s" " %7s" " %13s" " %12s" " %12s" " %9s" "\n";
datafmt = "%-11s" " %7d" " %13.3f" " %12.5f" " %12.3f" " %9.4f" "\n";
# Frequently-used regular expressions.
# These are replicated in PrintGCFixup; keep them in sync.
full_gc_re = "\\[Full GC (\\(System\\) )?";
heap_size_re = "[0-9]+[KM]"; # 8K
heap_size_paren_re = "\\(" heap_size_re "\\)"; # (8K)
heap_size_change_re = heap_size_re "->" heap_size_re; # 8K->4K
# 8K->4K(96K), or 8K->4K (96K)
heap_size_status_re = heap_size_change_re " ?" heap_size_paren_re;
gc_time_re = "[0-9]+\\.[0-9]+";
gc_time_secs_re = gc_time_re " secs";
gc_time_ms_re = gc_time_re " ms";
timestamp_re = "(" gc_time_re ": *)?";
timestamp_range_re = "(" gc_time_re "-" gc_time_re ": *)?";
# Heap size status plus elapsed time: 8K->4K(96K), 0.0517089 secs
heap_report_re = heap_size_status_re ", " gc_time_secs_re;
# Size printed at CMS initial mark and remark.
cms_heap_size_re = heap_size_re heap_size_paren_re; # 6K(9K)
cms_heap_report_re = cms_heap_size_re ", " gc_time_secs_re;
cms_concurrent_phase_re = "(AS)?CMS-concurrent-(mark|(abortable-)?preclean|sweep|reset)";
# Generations which print optional messages.
promo_failed_re = "( \\(promotion failed\\))?"
cms_gen_re = "(AS)?CMS( \\(concurrent mode failure\\))?";
parnew_gen_re = "(AS)?ParNew";
# 'Framework' GCs: DefNew, ParNew, Tenured, CMS
fw_yng_gen_re = "(DefNew|" parnew_gen_re ")" promo_failed_re;
fw_old_gen_re = "(Tenured|" cms_gen_re ")";
# Garbage First (G1) pauses:
# [GC pause (young), 0.0082 secs]
# or [GC pause (partial), 0.082 secs]
# or [GC pause (young) (initial mark), 0.082 secs]
# or [GC remark, 0.082 secs]
# or [GC cleanup 11M->11M(25M), 0.126 secs]
g1_cleanup_re = "cleanup " heap_size_status_re;
g1_pause_re = "pause \\((young|partial)\\)";
g1_pause_re = g1_pause_re "( \\((initial-mark|evacuation failed)\\))?";
g1_stw_re = "\\[GC (" g1_pause_re "|remark|" g1_cleanup_re "), " \
gc_time_secs_re "\\]";
}
function initIntervalVars() {
for (i = 0; i < last_idx; ++i) {
count_v[last_idx + i] = 0;
sum_v[last_idx + i] = 0.0;
max_v[last_idx + i] = 0.0;
sum_of_sq_v[last_idx + i] = 0.0;
}
}
function ratio(dividend, divisor) {
result = 0.0;
if (divisor != 0.0) {
result = dividend / divisor;
}
return result;
}
function stddev(count, sum, sum_of_squares) {
if (count < 2) return 0.0;
sum_squared_over_count = (sum * sum) / count;
# This has happened on occasion--not sure why--but only for total gc time,
# which includes samples from different populations.
if (sum_of_squares < sum_squared_over_count) return -1.0;
# print "stddev", count, sum, sum_of_squares, sum_squared_over_count;
return sqrt((sum_of_squares - sum_squared_over_count) / (count - 1));
}
function printHeader() {
printf(headfmt, "what", "count", "total", "mean", "max", "stddev");
}
function printData(idx) {
cnt = count_v[idx];
sec = sum_v[idx];
sd = stddev(cnt, sec, sum_of_sq_v[idx]);
printf(datafmt, name_v[idx], cnt, sec, ratio(sec, cnt), max_v[idx], sd);
}
function printRate(name, tot, tot_units, period, period_units) {
printf("%-21s = %10.3f %-2s / %10.3f %-2s = %7.3f %s/%s\n",
name, tot, tot_units, period, period_units, ratio(tot, period),
tot_units, period_units);
}
function printPercent(name, tot, tot_units, period, period_units) {
printf("%-21s = %10.3f %-2s / %10.3f %-2s = %7.3f%%\n",
name, tot, tot_units, period, period_units, ratio(tot * 100.0, period));
}
function getTimeStamp() {
gts_tmp_str = $0;
# Note: want to match the time stamp just before the '[GC' or '[Full GC' or
# '[CMS-' string on the line, and there may be time stamps that appear
# earlier. Thus there is no beginning-of-line anchor ('^') in the regexp used
# with match().
if (sub(/:? ? ?\[((Full )?GC|(AS)?CMS-).*/, "", gts_tmp_str) != 1) return -1.0;
if (! match(gts_tmp_str, "[0-9]+\\.[0-9]+(e[+-][0-9]+)?$")) return -1.0;
gts_tmp = substr(gts_tmp_str, RSTART, RLENGTH) + 0.0;
return gts_tmp;
}
function recordStatsInternal(idx, seconds) {
count_v[idx] += 1;
sum_v[idx] += seconds;
sum_of_sq_v[idx] += seconds * seconds;
if (seconds > max_v[idx]) max_v[idx] = seconds;
}
function writePlotData(tstamp, value) {
if (plotcolumns == 4) {
# Column 1 = start time, 2 = duration, 3 = end time, 4 = duration.
printf("%9.7f %9.7f %9.7f %9.7f\n", tstamp, value, tstamp + value, value);
return;
}
if (plotcolumns == 2) {
# Column 1 = start time, 2 = value.
printf("%9.7f %9.7f\n", tstamp, value);
return;
}
if (plotcolumns == 1) {
# Column 1 = start time.
printf("%9.7f\n", tstamp);
return;
}
if (plotcolumns == 3) {
# Column 1 = start time, 2 = duration, 3 = end time.
printf("%9.7f %9.7f %9.7f\n", tstamp, value, tstamp + value);
return;
}
}
function recordStats(idx, value) {
if (verbose) print name_v[idx] ":" NR ":" value;
if (plot_idx < 0) {
# Plotting disabled; record statistics.
recordStatsInternal(idx, value);
recordStatsInternal(idx + last_idx, value);
if (idx < totgc_idx) recordStatsInternal(totgc_idx, value);
} else if (idx == plot_idx || plot_idx == totgc_idx && idx < totgc_idx) {
# Plotting enabled; skip statistics and just print a plot line.
rs_tstamp = getTimeStamp();
if (rs_tstamp < 0.0) rs_tstamp = ++plot_cnt;
writePlotData(rs_tstamp, value);
}
}
function parseHeapSizes(sizes, str) {
sizes[0] = sizes[1] = 0.0;
if (!match(str, heap_size_re "->")) return -1;
sizes[0] = substr(str, RSTART, RLENGTH - 3) + 0.0;
if (substr(str, RSTART + RLENGTH - 3, 1) == "K") {
sizes[0] = sizes[0] / 1024.0;
}
if (!match(str, "[KM]->" heap_size_re)) return -1;
sizes[1] = substr(str, RSTART + 3, RLENGTH - 4) + 0.0;
if (substr(str, RSTART, 1) == "K") {
sizes[1] = sizes[1] / 1024.0;
}
if (!match(str, heap_size_paren_re)) return -1;
sizes[2] = substr(str, RSTART + 1, RLENGTH - 3) + 0.0;
if (substr(str, RSTART + RLENGTH - 2, 1) == "K") {
sizes[2] = sizes[2] / 1024.0;
}
return 0;
}
function recordHeapKb(str) {
if (parseHeapSizes(tmp_mb, str) < 0) return;
recordStats(MB_a_idx, tmp_mb[0] - gen0_sizes[1]);
# Occupancy (the before gc value is used).
recordStats(MB_usedh_idx, tmp_mb[0]);
# Total heap committed size.
recordStats(MB_ch_idx, tmp_mb[2]);
gen0_sizes[0] = tmp_mb[0];
gen0_sizes[1] = tmp_mb[1];
gen0_sizes[2] = tmp_mb[2];
}
function recordGen0Kb(str, allow_3_sizes) {
# Allocation info.
if (parseHeapSizes(tmp_mb, str) < 0) { return; }
str = substr(str, RSTART + RLENGTH);
# print $0;
# print tmp_mb[0],tmp_mb[1],gen0_sizes[0],gen0_sizes[1];
recordStats(MB_used0_idx, tmp_mb[0]);
recordStats(MB_a_idx, tmp_mb[0] - gen0_sizes[1]);
# Gen0 committed size.
recordStats(MB_c0_idx, tmp_mb[2]);
gen0_sizes[0] = tmp_mb[0];
gen0_sizes[1] = tmp_mb[1];
gen0_sizes[2] = tmp_mb[2];
# If there isn't a second heap size figure (4096K->1024K) on the line,
# promotion and occupancy info aren't available.
if (parseHeapSizes(tmp_mb, str) < 0) return;
# Promotion info. Amount promoted is inferred from the last nnnK->nnnK
# on the line, taking into account the amount collected:
#
# promoted = change-in-overall-heap-occupancy - change-in-gen0-occupancy -
# change-in-gen1-occupancy
#
str = substr(str, RSTART + RLENGTH);
if (match(str, heap_size_change_re) && allow_3_sizes) {
# There is a 3rd heap size on the line; the 2nd one just parsed is assumed
# to be from the old gen. Get the 3rd one and use that for the overall
# heap.
gen1_sizes[0] = tmp_mb[0];
gen1_sizes[1] = tmp_mb[1];
gen1_sizes[2] = tmp_mb[2];
parseHeapSizes(tmp_mb, str);
mb_promo = tmp_mb[1] - tmp_mb[0] - (gen0_sizes[1] - gen0_sizes[0]);
mb_promo -= (gen1_sizes[1] - gen1_sizes[0]);
} else {
# Only gen0 was collected.
mb_promo = tmp_mb[1] - tmp_mb[0] - (gen0_sizes[1] - gen0_sizes[0]);
}
recordStats(MB_p_idx, mb_promo);
# Occupancy (the before gc value is used).
recordStats(MB_usedh_idx, tmp_mb[0]);
# Total heap committed size.
recordStats(MB_ch_idx, tmp_mb[2]);
# Gen1 committed size.
recordStats(MB_c1_idx, tmp_mb[2] - gen0_sizes[2]);
}
function recordParOldPhaseTime(str, phase_label, idx) {
sub(".*" phase_label " *[,:] *", "", str);
sub(" secs\\].*", "", str);
recordStats(idx, str + 0.0);
next;
}
function printInterval() {
# No intervals if plotting.
if (plot_idx >= 0) return;
# Check for a time stamp.
pi_tmp = getTimeStamp();
if (pi_tmp < 0.0) return;
# Update the global time stamp vars.
if (lastFileName == FILENAME) {
lastTimeStamp = timeStampOffset + pi_tmp;
} else {
if (firstTimeStamp < 0) {
# First call of the run; initialize.
lastTimeStamp = pi_tmp;
firstTimeStamp = prevTimeStamp = lastTimeStamp;
} else {
# First call after the input file changed.
timeStampOffset = lastTimeStamp;
lastTimeStamp = timeStampOffset + pi_tmp;
}
lastFileName = FILENAME;
# printf("%10.3f %10.6f %s %s\n", timeStampOffset, pi_tmp,
# pi_tmp_str, FILENAME);
}
# Print out the statistics every timeStampDelta seconds.
if (timeStampDelta < 0) return;
if ((lastTimeStamp - prevTimeStamp) > timeStampDelta) {
prevTimeStamp = lastTimeStamp;
if ((printFirstHeader == 1) ||
((last_cmsRcount == 0) && (count_v[cmsRM_idx] != 0))) {
printf("Incremental statistics at %d second intervals\n", timeStampDelta);
printHeader();
last_cmsRcount = count_v[cmsRM_idx];
printFirstHeader = 0;
}
printf("interval=%d, time=%5.3f secs, line=%d\n",
int((lastTimeStamp - firstTimeStamp) / timeStampDelta),
lastTimeStamp, NR);
for (i = 0; i < last_idx; ++i) {
if (count_v[last_idx + i] > 0) {
printData(last_idx + i);
}
}
initIntervalVars();
}
}
# Match CMS initial mark output from PrintGCDetails.
#
# [GC [1 CMS-initial-mark: 14136K(23568K)] 14216K(25680K), 0.0062443 secs]
#
#/\[GC \[1 (AS)?CMS-initial-mark: [0-9]+K\([0-9]+K\)\] [0-9]+K\([0-9]+K\), [0-9][0-9.]* secs\]/ {
# /\[1 (AS)?CMS-initial-mark: [0-9]+K\([0-9]+K\)\] [0-9]+K\([0-9]+K\), [0-9][0-9.]* secs\]/ {
#
match($0, "\\[1 (AS)?CMS-initial-mark: " cms_heap_size_re "\\] " cms_heap_report_re "\\]") {
tString = substr($0, RSTART, RLENGTH);
match(tString, gc_time_secs_re);
secs = substr(tString, RSTART, RLENGTH - 5) + 0;
recordStats(cmsIM_idx, secs);
next;
}
# Match cms remark output from PrintGCDetails.
#[GC[dirty card accumulation, 0.0006214 secs][dirty card rescan, 0.1919700 secs] [1 CMS-remark:
10044K(16744K)] 10412K(18856K), 0.2095526 secs]
#
# /\[GC.*CMS-remark.*, [0-9][0-9.]*\ secs\]/ {
# /\[1 CMS-remark.*, [0-9][0-9.]*\ secs\]/ {
# /\[1 (AS)?CMS-remark: [0-9]+K\([0-9]+K\)\] [0-9]+K\([0-9]+K\), [0-9][0-9.]* secs\]/ {
match($0, "\\[1 (AS)?CMS-remark: " cms_heap_size_re "\\] " cms_heap_report_re "\\]") {
tString = substr($0, RSTART, RLENGTH);
match(tString, gc_time_secs_re);
secs = substr(tString, RSTART, RLENGTH - 5) + 0;
recordStats(cmsRM_idx, secs);
# recordStats incremented the total gc count; undo that.
count_v[totgc_idx] -= 1;
next;
}
# Match CMS initial mark or remark output from -verbose:gc.
#
# [GC 43466K(68920K), 0.0002577 secs]
match($0, "\\[GC " cms_heap_report_re "\\]") {
match($0, gc_time_secs_re);
secs = substr($0, RSTART, RLENGTH - 5) + 0;
recordStats(gen1t_idx, secs);
# XXX - this updates the count of gen1 collections for both initial mark and
# remark. Would like to update it only once per cms cycle (i.e., for initial
# mark only). Doing the increment every other time would be more accurate,
# but still subject to error because of aborted CMS cycles.
next;
}
# Match cms concurrent phase output
# [CMS-concurrent-mark: 6.422/9.360 secs]
# 10820.4: [CMS-concurrent-mark: 6.422/9.360 secs]
# /\[(AS)?CMS-concurrent-(mark|preclean|sweep|reset): [0-9.]+\/[0-9.]+ secs\]/ {
$0 ~ "\\[" cms_concurrent_phase_re ": " gc_time_re "/" gc_time_secs_re "\\]" {
match($0, cms_concurrent_phase_re ": ");
t_time_idx = RSTART + RLENGTH;
tString = substr($0, RSTART, RLENGTH);
if (match(tString, "-mark:")) {
tIdx = cmsCM_idx;
} else if (match(tString, "-sweep:")) {
tIdx = cmsCS_idx;
} else if (match(tString, "-preclean:")) {
tIdx = cmsCP_idx;
} else {
tIdx = cmsCR_idx;
}
tString = substr($0, t_time_idx);
match(tString, "/" gc_time_secs_re);
secs = substr(tString, 1, RSTART - 1) + 0.0;
recordStats(tIdx, secs);
printInterval(); # Must do this before the getline below.
if (match($0, "\\[(AS)?CMS" timestamp_re "\\[" cms_concurrent_phase_re)) {
# If CMS is in the middle of a concurrent phase when System.gc() is called
# or concurrent mode failure causes a bail out to mark-sweep,
# the output is split across 2 lines, e.g.:
#
# 164.092: [Full GC 164.093: [CMS164.341: [CMS-concurrent-mark: 0.302/0.304 secs]
# : 26221K->24397K(43704K), 0.8347158 secs] 26285K->24397K(64952K), [CMS Perm :
2794K->2794K(16384K)], 0.8350998 secs]
#
getline;
tString = $0;
tInt = sub(".*" heap_size_status_re "\\]?, ", "", tString);
tInt += sub(" secs.*", "", tString);
if (tInt == 2) {
secs = tString + 0.0;
recordStats(gen1t_idx, secs);
}
}
next;
}
# Match PrintGCDetails output for Tenured or CMS full GC
# [GC [DefNew: 2048K->64K(2112K), 0.1517089 secs][Tenured: 1859K->1912K(1920K), 0.1184458 secs]
2048K->1923K(4032K), 0.2710333 secs]
# or with time stamps
# 0.177656: [GC 0.177728: [DefNew: 2112K->0K(2176K), 0.1006331 secs]0.278442: [Tenured:
4092K->4092K(4208K), 0.1372500 secs] 4096K->4092K(6384K), 0.2385750 secs]
# 549.281: [GC 549.281: [ParNew: 14783K->14783K(14784K), 0.0000680 secs]549.281: [CMS:
275188K->136280K(290816K), 3.7791360 secs] 289972K->136280K(305600K), 3.7795440 secs]
#
# /\[GC.*\[(DefNew|(AS)?ParNew): [0-9]+K->[0-9]+K\([0-9]+K\),.*secs\].*\[((AS)?CMS|Tenured):
[0-9]+K->[0-9]+K\([0-9]+K\),.*secs\]/ {
#
# [GC [ParNew: 7812K->7812K(8128K), 0.0000310 secs][CMS (concurrent mode failure):
382515K->384858K(385024K), 0.0657172 secs] 390327K->390327K(393152K), 0.0658860 secs]
$0 ~ "\\[GC.*\\[" fw_yng_gen_re ": " heap_report_re "\\].*\\[" fw_old_gen_re ": " heap_report_re "\\]" {
tString = $0;
tInt = sub(".*" heap_size_status_re ", ", "", tString);
tInt = sub(" secs.*", "", tString);
secs = tString + 0.0;
recordStats(gen1t_idx, secs);
# Old gen occupancy before GC.
tString = $0;
tInt = sub(".*\\[" fw_old_gen_re ": ", "", tString);
parseHeapSizes(tmp_mb, tString);
recordStats(MB_used1_idx, tmp_mb[0]);
# Heap occupancy before GC.
tInt = sub(heap_report_re "\\] ", "", tString);
parseHeapSizes(tmp_mb, tString);
recordStats(MB_usedh_idx, tmp_mb[0]);
printInterval();
next;
}
# [Full GC [Tenured: 43464K->43462K(43712K), 0.0614658 secs] 63777K->63775K(64896K), [Perm :
1460K->1459K(16384K)], 0.0615839 secs]
#
# 0.502: [Full GC 0.502: [Tenured: 43464K->43462K(43712K), 0.0724391 secs] 63777K->63775K(64896K),
[Perm : 1460K->1459K(16384K)], 0.0725792 secs]
$0 ~ full_gc_re timestamp_re ".*\\[" fw_old_gen_re ": " heap_report_re "\\] " heap_size_status_re ",
\\[((AS)?CMS )?Perm *: " heap_size_status_re "\\], " gc_time_secs_re "\\]" {
tString = $0;
tInt = sub(".*" heap_size_status_re "\\], ", "", tString);
tInt = sub(" secs.*", "", tString);
secs = tString + 0.0;
recordStats(gen1t_idx, secs);
# Old gen occupancy before GC.
tString = $0;
tInt = sub(".*\\[" fw_old_gen_re ": ", "", tString);
parseHeapSizes(tmp_mb, tString);
recordStats(MB_used1_idx, tmp_mb[0]);
# Heap occupancy before GC.
tInt = sub(heap_report_re "\\] ", "", tString);
parseHeapSizes(tmp_mb, tString);
recordStats(MB_usedh_idx, tmp_mb[0]);
printInterval();
next;
}
# Full GC (System.gc()) w/CMS when the perm gen is *not* being collected.
#
# [Full GC [ParNew: 161K->238K(12288K), 0.0014651 secs] 161K->238K(61440K), 0.0015972 secs]
# 0.178: [Full GC 0.178: [ParNew: 161K->238K(12288K), 0.0014651 secs] 161K->238K(61440K), 0.0015972 secs]
$0 ~ full_gc_re timestamp_re ".*\\[" fw_yng_gen_re ": " heap_report_re "\\] " heap_report_re "\\]" {
match($0, ".*" heap_size_status_re ", ");
tString = substr($0, RSTART + RLENGTH);
tInt = sub(" secs.*", "", tString);
secs = tString + 0.0;
recordStats(gen1t_idx, secs);
printInterval();
next;
}
# Match PrintGCDetails output for Train incremental GCs.
/\[GC.*\[(Def|Par)New: [0-9]+K->[0-9]+K\([0-9]+K\),.*secs\].*\[Train:
[0-9]+K->[0-9]+K\([0-9]+K\),.*secs\]/ {
# Young gen part.
tString = $0;
tInt = sub(/ secs.*/, "", tString);
tInt = sub(".*" heap_size_status_re "\\], ", "", tString);
secs = tString + 0.0;
recordStats(gen0t_idx, secs);
# Train incremental part.
tString = $0;
tInt = sub(".*Train: [^,]+, ", "", tString);
tInt = sub(" secs.*", "", tString);
secs = tString + 0.0;
if (plot_idx < 0) {
# Skip the update of the totgc numbers, that was handled above.
recordStatsInternal(gen1i_idx, secs);
recordStatsInternal(gen1i_idx + last_idx, secs);
} else if (plot_idx == gen1i_idx) {
recordStats(gen1i_idx, secs);
}
recordGen0Kb($0, 1);
printInterval();
next;
}
# Match PrintGCDetails output for Train Full GC.
/.*\[Train MSC: [0-9]+K->[0-9]+K\([0-9]+K\),.*secs\]/ {
# Get the last number of seconds on the line.
tString = $0;
tInt = sub(".*" heap_size_status_re ", ", "", tString);
tInt = sub(" secs.*", "", tString);
secs = tString + 0.0;
recordStats(gen1t_idx, secs);
printInterval();
next;
}
# Match PrintGCDetails output for DefNew or ParNew
#[GC [DefNew: 2880K->63K(2944K), 0.4626167 secs] 16999K->16999K(26480K), 0.4627703 secs]
# or with time stamps
# 0.431984: [GC 0.432051: [DefNew: 2112K->0K(2176K), 0.0911555 secs] 6204K->6201K(9000K), 0.0912899 secs]
# /\[GC.*\[(DefNew|(AS)?ParNew): [0-9]+K->[0-9]+K\([0-9]+K\),.*secs\]
[0-9]+K->[0-9]+K\([0-9]+K\),.*secs\]/ {
$0 ~ "\\[GC " timestamp_re "\\[" fw_yng_gen_re ": " heap_report_re "\\] " heap_report_re "\\]" {
# The first time on the line is for DefNew/ParNew gc work only.
match($0, gc_time_secs_re);
secs = substr($0, RSTART, RLENGTH - 5) + 0;
tString = substr($0, RSTART + RLENGTH);
if (plot_idx < 0) {
# Skip the update of the totgc numbers, that will be handled below.
recordStatsInternal(gen0c_idx, secs);
recordStatsInternal(gen0c_idx + last_idx, secs);
} else if (plot_idx == gen0c_idx) {
recordStats(gen0c_idx, secs);
}
# The second time is the total time, which includes prologue, epilogue and
# safepoint time.
match(tString, gc_time_secs_re);
secs = substr(tString, RSTART, RLENGTH - 5) + 0;
recordStats(gen0t_idx, secs);
recordGen0Kb($0, 0);
printInterval();
next;
}
# 17.438: [Full GC [PSYoungGen: 48K->0K(9536K)] [PSOldGen: 173K->179K(87424K)] 221K->179K(96960K)
[PSPermGen: 1928K->1928K(16384K)], 0.0824100 secs]
# 17.438: [Full GC [PSYoungGen: 48K->0K(9536K)] [ParOldGen: 173K->179K(87424K)] 221K->179K(96960K)
[PSPermGen: 1928K->1928K(16384K)], 0.0824100 secs]
$0 ~ full_gc_re "\\[PSYoungGen: +" heap_size_status_re "\\] \\[(PS|Par)OldGen: +"
heap_size_status_re "\\] " heap_size_status_re " \\[PSPermGen: +" heap_size_status_re "\\], "
gc_time_secs_re "\\]" {
match($0, gc_time_secs_re);
secs = substr($0, RSTART, RLENGTH - 5) + 0;
recordStats(gen1t_idx, secs);
# Old gen occupancy before GC.
tString = $0;
tInt = sub(".*\\[(PS|Par)OldGen: +", "", tString);
parseHeapSizes(tmp_mb, tString);
recordStats(MB_used1_idx, tmp_mb[0]);
# Heap occupancy before GC.
tInt = sub(heap_size_status_re "\\] ", "", tString);
parseHeapSizes(tmp_mb, tString);
recordStats(MB_usedh_idx, tmp_mb[0]);
printInterval();
next;
}
# [GC [PSYoungGen: 1070K->78K(1091K)] 3513K->2521K(4612K), 0.2177698 secs]
$0 ~ "\\[GC(--)? \\[PSYoungGen: +" heap_size_status_re "\\] " heap_report_re "\\]" {
match($0, gc_time_secs_re);
secs = substr($0, RSTART, RLENGTH - 5) + 0;
recordStats(gen0t_idx, secs);
recordGen0Kb($0, 0);
printInterval();
next;
}
# [GC[0: 511K->228K(1984K)], 0.0087278 secs]
# or [GC[1: 308K->230K(1984K)], 0.0212333 secs]
# or [GC[0: 8313K->8313K(8944K)][1: 8313K->8313K(8944K)], 0.2044176 secs]
# but this only handles generations 0 and 1.
#/\[GC\[.*\], [0-9][0-9.]* secs\]/ {
/\[GC.*\[.*\], [0-9][0-9.]* secs\]/ {
tString = $0;
tInt = sub(".*, ", "", tString);
tInt = sub(" secs.*", "", tString);
secs = tString + 0.0;
# If a line is for generation 1, we give it all the time.
# If it is just for generation 0, we give that generation the time.
if ($0 ~ /\[1: /) {
recordStats(gen1t_idx, secs);
} else if ($0 ~ /\[0: /) {
recordStats(gen0c_idx, secs);
recordGen0Kb($0, 0);
}
printInterval();
next;
}
# Match Garbage-First output:
# [GC pause (young), 0.0082 secs]
# or [GC pause (partial), 0.082 secs]
# or [GC pause (young) (initial mark), 0.082 secs]
# or [GC remark, 0.082 secs]
# or [GC cleanup 11M->11M(25M), 0.126 secs]
# /\[GC.*, [0-9][0-9.]* secs\]/ {
# $0 ~ g1_stw_re {
$0 ~ g1_stw_re {
match($0, gc_time_secs_re);
secs = substr($0, RSTART, RLENGTH - 5) + 0;
recordStats(gen1t_idx, secs);
printInterval();
next;
}
# Match -verbose:gc and pre-GC-interface output
# [GC 17648K->12496K(31744K), 0.0800696 secs]
# [GC-- 17648K->12496K(31744K), 0.0800696 secs]
# [ParNew 17648K->12496K(31744K), 0.0800696 secs]
# /\[(GC(--)?|(AS)?ParNew) [0-9]+K->[0-9]+K\([0-9]+K\), [0-9][0-9.]* secs\]/ {
$0 ~ "\\[(GC(--)?|" parnew_gen_re ") " heap_report_re "\\]" {
match($0, gc_time_secs_re);
secs = substr($0, RSTART, RLENGTH - 5) + 0;
recordStats(gen0c_idx, secs);
recordHeapKb($0);
printInterval();
next;
}
# Match -verbose:gc and pre-GC-interface output
# [Full GC 14538K->535K(31744K), 0.1335093 secs]
$0 ~ full_gc_re heap_report_re "\\]" {
match($0, gc_time_secs_re);
secs = substr($0, RSTART, RLENGTH - 5) + 0;
recordStats(gen1t_idx, secs);
recordHeapKb($0);
printInterval();
next;
}
# Parallel Old Gen phases.
# 0.547: [par marking phase, 0.0400133 secs]
# 0.547: [par marking main, 0.0400133 secs]
# 0.547: [par marking flush, 0.0400133 secs]
# 0.587: [summary phase, 0.0022902 secs]
# 0.590: [adjust roots, 0.0056697 secs]
# 0.596: [compact perm gen, 0.1242983 secs]
# 0.720: [draining task setup , 0.0031352 secs]
# -or-
# 0.720: [drain task setup, 0.0031352 secs]
# 0.724: [dense prefix task setup , 0.0000029 secs]
# 0.724: [steal task setup , 0.0000227 secs]
# 0.724: [par compact, 0.0154057 secs]
# 0.739: [post compact, 0.0009566 secs]
/\[pre compact *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "pre compact", PO_precomp_idx);
}
/\[(par )?marking phase *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "(par )?marking phase", PO_marking_idx);
}
/\[par mark *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "par mark", PO_parmark_idx);
}
/\[marking flush *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "marking flush", PO_mark_flush_idx);
}
/\[summary phase *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "summary phase", PO_summary_idx);
}
/\[adjust roots *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "adjust roots", PO_adjroots_idx);
}
/\[compact perm gen *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "compact perm gen", PO_permgen_idx);
}
/\[compaction phase *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "compaction phase", PO_compact_idx);
}
/\[drain(ing)? task setup *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "drain(ing)? task setup", PO_drain_ts_idx);
}
/\[dense prefix task setup *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "dense prefix task setup", PO_dpre_ts_idx);
}
/\[steal task setup *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "steal task setup", PO_steal_ts_idx);
}
/\[par compact *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "par compact", PO_parcomp_idx);
}
/\[deferred updates *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "deferred updates", PO_deferred_idx);
}
/\[post compact *[,:] *[0-9][0-9.]* secs\]/ {
recordParOldPhaseTime($0, "post compact", PO_postcomp_idx);
}
# Match output from -XX:+TraceGCApplicationStoppedTime.
/Total time for which application threads were stopped: [0-9][0-9.]* seconds/ {
match($0, "were stopped: [0-9][0-9.]* seconds");
secs = substr($0, RSTART + 14, RLENGTH - 14 - 8) + 0;
recordStats(safept_idx, secs);
next;
}
# Match output from -XX:+TraceGCApplicationConcurrentTime.
/Application time:[ ]+[0-9][0-9.]* seconds/ {
match($0, "Application time:[ ]+[0-9][0-9.]* seconds");
secs = substr($0, RSTART + 18, RLENGTH - 18 - 8) + 0;
recordStats(apptime_idx, secs);
next;
}
# Match +TraceGen*Time output
# $1 $2 $3 $4 $5 $6 $7
/\[Accumulated GC generation [0-9]+ time [0-9.]+ secs\]/ {
if ($4 == 0) {
gc0caccum = $6 + 0;
} else if ($4 == 1) {
gc1taccum = $6 + 0;
} else {
printf("Accumulated GC generation out of bounds\n");
}
next;
}
# BEA JRockit GC times (very basic).
# Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_04-b05)
# BEA JRockit(R) (build R26.1.0-22-54592-1.5.0_04-20051213-1629-solaris-sparcv9, )
#
# [memory ] <s>-<end>: GC <before>K-><after>K (<heap>K), <pause> ms
# [memory ] <s/start> - start time of collection (seconds since jvm start)
# [memory ] <end> - end time of collection (seconds since jvm start)
# [memory ] <before> - memory used by objects before collection (KB)
# [memory ] <after> - memory used by objects after collection (KB)
# [memory ] <heap> - size of heap after collection (KB)
# [memory ] <pause> - total pause time during collection (milliseconds)
#
# [memory ] 14.229-16.647: GC 1572864K->718K (10485760K), 2418.000 ms
# [memory ] 133.299-133.568: GC 10485760K->762425K (10485760K), 269.000 ms
match($0, "\\[memory ?\\] " timestamp_range_re "GC " heap_size_status_re ", " gc_time_ms_re) {
match($0, gc_time_ms_re);
secs = substr($0, RSTART, RLENGTH - 3) / 1000.0;
recordStats(gen1t_idx, secs);
recordHeapKb($0);
# printInterval();
next;
}
# EVM allocation info.
#
# Starting GC at Tue Nov 12 10:44:23 2002; suspending threads.
# Gen[0](semi-spaces): size=12Mb(50% overhead), free=0kb, maxAlloc=0kb.
/Starting GC at .*; suspending threads/ {
evm_last_was_starting_gc = 1;
next;
}
evm_last_was_starting_gc && \
$0 ~ /Gen\[0\]\([a-z-]+\): size=[0-9]+Mb.*, free=[0-9]+kb,/ {
evm_last_was_starting_gc = 0;
match($0, "size=[0-9]+Mb");
tString = substr($0, RSTART + 5, RLENGTH - 7);
tmp_size = tString * 512; # * 1024 / 2: the size includes both semispaces
match($0, "free=[0-9]+kb");
tString = substr($0, RSTART + 5, RLENGTH - 7);
tmp_free = tString + 0;
recordStats(MB_a_idx, (tmp_size - tmp_free) / 1024);
next;
}
# EVM promotion info
#
# Gen0(semi-spaces)-GC #2030 tenure-thresh=0 42ms 0%->100% free, promoted 46712 obj's/1978kb
# /Gen0\([a-z-]+\)-GC.* promoted / {
/% free, promoted [0-9]+ obj.s\/[0-9]+kb/ {
tString = $0;
match($0, "obj.s/[0-9]+kb");
tString = substr($0, RSTART + 6, RLENGTH - 8);
kb_promo = tString + 0;
recordStats(MB_p_idx, kb_promo / 1024.0);
next;
}
function recordGCPauseEVM(idx) {
evm_last_was_starting_gc = 0;
match($0, "in [0-9]+ ms:");
tString = substr($0, RSTART + 3, RLENGTH - 7);
recordStats(idx, tString / 1000.0);
# Record committed and used stats for young gen collections.
if (idx == gen0t_idx && match($0, "ms: \\([0-9]+[MmKk][Bb],")) {
tString = substr($0, RSTART + 5, RLENGTH - 8);
evm_units = tolower(substr($0, RSTART + RLENGTH - 3, 2));
evm_factor = evm_units == "kb" ? 1024 : 1;
evm_val = tString + 0;
recordStats(MB_ch_idx, tString / evm_factor);
if (match($0, ", [0-9]+% free\\)")) {
tString = substr($0, RSTART + 2, RLENGTH - 7);
evm_val = evm_val * (100 - tString) / 100;
recordStats(MB_usedh_idx, evm_val / evm_factor);
}
}
# Application time.
if (match($0, "\\[application [0-9]+ ms")) {
tString = substr($0, RSTART + 13, RLENGTH - 16);
recordStats(apptime_idx, tString / 1000.0);
}
}
# EVM output with -verbose:gc -verbose:gc
#
# GC[0] in 50 ms: (656Mb, 80% free) -> (656Mb, 80% free) [application 353 ms requested 6 words]
# GC[1] in 3963 ms: (656Mb, 0% free) -> (656Mb, 81% free) [application 380 ms requested 28 words]
# GC[1: initial mark] in 84 ms: (656Mb, 97% free) -> (656Mb, 97% free) [application 65 ms requested 0 words]
# GC[1: remark] in 36 ms: (656Mb, 93% free) -> (656Mb, 93% free) [application 241 ms requested 0 words]
# GC[1: resize heap] in 0 ms: (656Mb, 97% free) -> (656Mb, 97% free) [application 213 ms requested 0 words]
/GC\[0\] in [0-9]+ ms: / {
recordGCPauseEVM(gen0t_idx);
printInterval();
next;
}
/GC\[1\] in [0-9]+ ms: / {
recordGCPauseEVM(gen1t_idx);
next;
}
/GC\[1: initial mark\] in [0-9]+ ms: / {
recordGCPauseEVM(cmsIM_idx);
next;
}
/GC\[1: remark\] in [0-9]+ ms: / {
recordGCPauseEVM(cmsRM_idx);
next;
}
/GC\[1: resize heap\] in [0-9]+ ms: / {
recordGCPauseEVM(cmsRS_idx);
next;
}
END {
# No summary stats if plotting.
if (plot_idx >= 0) exit(0);
if (interval >= 0) print "";
printHeader();
for (i = 0; i < last_idx; ++i) {
if (count_v[i] > 0) {
printData(i);
}
}
if (lastTimeStamp != firstTimeStamp) {
# Elapsed time.
tot_time = lastTimeStamp - firstTimeStamp;
# Elapsed time scaled by cpus.
tot_cpu_time = tot_time * cpus;
# Sequential gc time scaled by cpus.
seq_gc_cpu_time = sum_v[totgc_idx] * cpus;
# Concurrent gc time (scaling not necessary).
cms_gc_cpu_time = sum_v[cmsCM_idx] + sum_v[cmsCP_idx] + \
sum_v[cmsCS_idx] + sum_v[cmsCR_idx];
# Cpu time available to mutators.
mut_cpu_time = tot_cpu_time - seq_gc_cpu_time - cms_gc_cpu_time;
print "";
printRate("alloc/elapsed_time",
sum_v[MB_a_idx], "MB", tot_time, "s");
printRate("alloc/tot_cpu_time",
sum_v[MB_a_idx], "MB", tot_cpu_time, "s");
printRate("alloc/mut_cpu_time",
sum_v[MB_a_idx], "MB", mut_cpu_time, "s");
printRate("promo/elapsed_time",
sum_v[MB_p_idx], "MB", tot_time, "s");
printRate("promo/gc0_time",
sum_v[MB_p_idx], "MB", sum_v[gen0t_idx], "s");
printPercent("gc_seq_load",
seq_gc_cpu_time, "s", tot_cpu_time, "s");
printPercent("gc_conc_load",
cms_gc_cpu_time, "s", tot_cpu_time, "s");
printPercent("gc_tot_load",
seq_gc_cpu_time + cms_gc_cpu_time, "s", tot_cpu_time, "s");
}
if (gc0caccum != 0 || gc1taccum != 0) {
genAccum = gc0caccum + gc1taccum;
printf("Accum\t%7.3f\t\t\t%7.3f\t\t\t%7.3f\n",
gc0caccum, gc1taccum, genAccum);
gc0cdelta = gc0cseconds - gc0caccum;
gc1tdelta = gc1tseconds - gc1taccum;
gcDelta = gcSeconds - genAccum;
printf("Delta\t%7.3f\t\t\t%7.3f\t\t\t%7.3f\n",
gc0cdelta, gc1tdelta, gcDelta);
}
}
#! /bin/nawk -f
# Usage: CompareGCStats file1 file2
#
# where file1 and file2 are the output from PrintGCstats runs.
BEGIN {
# Keep these in sync with PrintGCStats.
i = 0;
name_v[i++] = "gen0(s)";
name_v[i++] = "gen0t(s)";
name_v[i++] = "gen1i(s)";
name_v[i++] = "gen1t(s)";
name_v[i++] = "cmsIM(s)";
name_v[i++] = "cmsRM(s)";
name_v[i++] = "cmsRS(s)";
name_v[i++] = "GC(s)";
name_v[i++] = "cmsCM(s)";
name_v[i++] = "cmsCP(s)";
name_v[i++] = "cmsCS(s)";
name_v[i++] = "cmsCR(s)";
name_v[i++] = "alloc(MB)";
name_v[i++] = "promo(MB)";
name_v[i++] = "used0(MB)";
name_v[i++] = "used1(MB)";
name_v[i++] = "used(MB)";
name_v[i++] = "commit0(MB)";
name_v[i++] = "commit1(MB)";
name_v[i++] = "commit(MB)";
name_v[i++] = "safept(s)";
name_v[i++] = "apptime(s)";
# Par Old phases from PrintParallelOldGCPhaseTimes.
name_v[i++] = "precomp(s)";
name_v[i++] = "marking(s)";
name_v[i++] = "parmark(s)";
name_v[i++] = "mark-f(s)";
name_v[i++] = "summary(s)";
name_v[i++] = "adjroots(s)";
name_v[i++] = "permgen(s)";
name_v[i++] = "compact(s)";
name_v[i++] = "drain_ts(s)";
name_v[i++] = "dpre_ts(s)";
name_v[i++] = "steal_ts(s)";
name_v[i++] = "parcomp(s)";
name_v[i++] = "deferred(s)";
name_v[i++] = "postcomp(s)";
last_basic_idx = i;
name_v[i++] = "alloc/elapsed_time";
name_v[i++] = "alloc/tot_cpu_time";
name_v[i++] = "alloc/mut_cpu_time";
name_v[i++] = "promo/elapsed_time";
name_v[i++] = "promo/gc0_time";
last_rate_idx = i;
name_v[i++] = "gc_seq_load";
name_v[i++] = "gc_conc_load";
name_v[i++] = "gc_tot_load";
last_idx = i;
headfmt = "%-11s" " %7s" " %13s" " %12s" " %12s" " %9s" "\n";
datafmt = "%-11s" " %7.3f%%" " %13.3f%%" " %12.5f%%" " %12.3f%%" " %9.4f%%""\n";
ratefmt = " %7.3f%%\n"
file_num = -1;
prev_line_idx = last_idx;
initarrays(0);
initarrays(1);
}
function initarrays(fileno) {
for (ia_idx = 0; ia_idx < last_idx; ++ia_idx) {
filename_v[fileno] = FILENAME;
line_v[fileno, ia_idx] = "";
count_v[fileno, ia_idx] = 0;
sum_v[fileno, ia_idx] = 0.0;
mean_v[fileno, ia_idx] = 0.0;
max_v[fileno, ia_idx] = 0.0;
std_dev_v[fileno, ia_idx] = 0.0;
}
}
function percent_change(orig, new) {
if (orig == 0.0 || new == 0.0) return 0.0;
return (new - orig) * 100.0 / orig;
}
function compare_basic(idx) {
printf(datafmt, " ",
percent_change(count_v[0, idx], count_v[1, idx]),
percent_change(sum_v[0, idx], sum_v[1, idx]),
percent_change(mean_v[0, idx], mean_v[1, idx]),
percent_change(max_v[0, idx], max_v[1, idx]),
percent_change(std_dev_v[0, idx], std_dev_v[1, idx]));
}
function record_basic_line(line, idx, pos) {
line_v[file_num, idx] = line;
count_v[file_num, idx] = $2 + 0.0;
sum_v[file_num, idx] = $3 + 0.0;
mean_v[file_num, idx] = $4 + 0.0;
max_v[file_num, idx] = $5 + 0.0;
std_dev_v[file_num, idx] = $6 + 0.0;
}
function record_rate_line(line, idx, pos) {
line_v[file_num, idx] = line;
sum_v[file_num, idx] = $(NF - 1) + 0.0;
}
function record_load_line(line, idx, pos) {
line_v[file_num, idx] = line;
tmpstr = $NF;
sub("%", "", tmpstr);
sum_v[file_num, idx] = tmpstr + 0.0;
}
function make_zero_line(str) {
mzl_label = str;
sub(" .*$", "", mzl_label);
mzl_tmp = str;
sub("^[^ ]+", "", mzl_tmp);
gsub("[0-9]", "0", mzl_tmp);
return mzl_label mzl_tmp;
}
function compare() {
printf("%s vs. %s\n", filename_v[0], filename_v[1]);
printf(headfmt, "what", "count", "total", "mean", "max", "stddev");
for (c_idx = 0; c_idx < last_idx; ++c_idx) {
if (line_v[0, c_idx] != "" || line_v[1, c_idx] != "") {
if (c_idx == last_basic_idx) print "";
if (line_v[0, c_idx] == "") {
line_v[0, c_idx] = \
make_zero_line(line_v[1, c_idx]);
} else if (line_v[1, c_idx] == "") {
line_v[1, c_idx] = \
make_zero_line(line_v[0, c_idx]);
}
print line_v[0, c_idx];
printf("%s", line_v[1, c_idx]); # No newline.
if (c_idx < last_basic_idx) {
print "";
compare_basic(c_idx);
} else if (c_idx < last_rate_idx) {
printf(ratefmt,
percent_change(sum_v[0, c_idx],
sum_v[1, c_idx]));
} else {
printf(ratefmt,
sum_v[1, c_idx] - sum_v[0, c_idx]);
}
}
}
}
{
for (cur_line_idx = 0; cur_line_idx < last_idx; ++cur_line_idx) {
pos = index($0, name_v[cur_line_idx]);
if (pos > 0) {
if (cur_line_idx <= prev_line_idx) {
# Starting a new input set.
if (++file_num == 2) {
compare();
print "";
--file_num;
}
initarrays(file_num);
}
if (cur_line_idx < last_basic_idx) {
record_basic_line($0, cur_line_idx, pos);
} else if (cur_line_idx < last_rate_idx) {
record_rate_line($0, cur_line_idx, pos);
} else {
record_load_line($0, cur_line_idx, pos);
}
prev_line_idx = cur_line_idx;
continue;
}
}
}
END { compare(); }
.
The comments at the top have a bit more info.
PrintGCStats
Scans GC logs and prints statistics about GC and allocation. Works
best if the logs were created with -XX:+PrintGCDetails and
-XX:+PrintGCTimeStamps. Also has some basic support for creating data
files for plotting w/gnuplot or StarOffice. See the comments at the
top for usage.
CompareGCStats
Compares the output from two runs of PrintGCStats (presumably before
and after) and shows the percent change for the various statistics.
Basic Examples
Here's how I typically use these, given 2 log files with
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps output (say ref.log is from
the reference VM, new.log is from a modified VM that I'm evaluating):
$ PrintGCFixup ref.log > ref.fix
$ PrintGCFixup new.log > new.fix
$ PrintGCStats -v cpus=4 ref.fix > ref.stats
$ PrintGCStats -v cpus=4 new.fix > new.stats
$ CompareGCStats ref.stats new.stats > new.comp
Below is an example of the result, new.comp. In the pairs of lines that
have the same label, the first is from ref.stats and the second is from
new.stats; the unlabeled line that follows shows the %change. E.g.,
this run shows that the average 'gen1t(s)' (Full GC time in seconds),
improved by 14% and the max improved by 24%.
ref.stats vs. new.stats
what count total mean max stddev
gen0t(s) 1479 167.596 0.11332 0.452 0.0906
gen0t(s) 1487 169.604 0.11406 0.463 0.0910
0.541% 1.198% 0.65302% 2.434% 0.4415%
gen1t(s) 80 95.982 1.19977 1.903 0.1945
gen1t(s) 90 92.594 1.02882 1.430 0.1206
12.500% -3.530% -14.24856% -24.855% -37.9949%
GC(s) 1559 263.577 0.16907 1.903 0.2592
GC(s) 1577 262.198 0.16626 1.430 0.2317
1.155% -0.523% -1.66203% -24.855% -10.6096%
alloc(MB) 1479 96679.169 65.36793 66.000 2.5605
alloc(MB) 1487 97085.999 65.28984 66.000 2.7041
0.541% 0.421% -0.11946% 0.000% 5.6083%
promo(MB) 1479 3235.751 2.18780 34.328 4.3802
promo(MB) 1487 3471.243 2.33439 34.305 4.4127
0.541% 7.278% 6.70034% -0.067% 0.7420%
used(MB) 1479 235279.202 159.07992 244.425 66.0007
used(MB) 1487 236180.818 158.83041 243.941 65.8061
0.541% 0.383% -0.15685% -0.198% -0.2948%
commit0(MB) 1479 113883.000 77.00000 77.000 0.0000
commit0(MB) 1487 114499.000 77.00000 77.000 0.0000
0.541% 0.541% 0.00000% 0.000% 0.0000%
commit1(MB) 1479 254388.000 172.00000 172.000 0.0000
commit1(MB) 1487 255764.000 172.00000 172.000 0.0000
0.541% 0.541% 0.00000% 0.000% 0.0000%
commit(MB) 1479 368271.000 249.00000 249.000 0.0000
commit(MB) 1487 370263.000 249.00000 249.000 0.0000
0.541% 0.541% 0.00000% 0.000% 0.0000%
alloc/elapsed_time = 96679.169 MB / 4920.718 s = 19.647 MB/s
alloc/elapsed_time = 97085.999 MB / 4920.171 s = 19.732 MB/s 0.433%
alloc/tot_cpu_time = 96679.169 MB / 19682.872 s = 4.912 MB/s
alloc/tot_cpu_time = 97085.999 MB / 19680.684 s = 4.933 MB/s 0.428%
alloc/mut_cpu_time = 96679.169 MB / 18628.563 s = 5.190 MB/s
alloc/mut_cpu_time = 97085.999 MB / 18631.892 s = 5.211 MB/s 0.405%
promo/elapsed_time = 3235.751 MB / 4920.718 s = 0.658 MB/s
promo/elapsed_time = 3471.243 MB / 4920.171 s = 0.706 MB/s 7.295%
promo/gc0_time = 3235.751 MB / 167.596 s = 19.307 MB/s
promo/gc0_time = 3471.243 MB / 169.604 s = 20.467 MB/s 6.008%
gc_seq_load = 1054.309 s / 19682.872 s = 5.356%
gc_seq_load = 1048.792 s / 19680.684 s = 5.329% -0.027%
gc_conc_load = 0.000 s / 19682.872 s = 0.000%
gc_conc_load = 0.000 s / 19680.684 s = 0.000% 0.000%
gc_tot_load = 1054.309 s / 19682.872 s = 5.356%
gc_tot_load = 1048.792 s / 19680.684 s = 5.329% -0.027%
Note that the alloc (allocation) and promo (promotion) statistics are
not exact; they're useful only as a general indication of how much
allocation/promotion was done.
You can aggregate the statistics from multiple runs by including
additional file names on the command line.
$ PrintGCStats -v cpus=4 ref-1.fix ref-2.fix ref-3.fix > ref.stats
$ PrintGCStats -v cpus=4 new-1.fix new-2.fix new-3.fix > new.stats
> Maybe the openJDK project
> needs this kind of script to generate some visual display, or
> something else similar to the GCSpy
> (
RSS Feed