Java - CMS collector not keeping pace with Old Gen


On a moderately busy production server (50 app threads, 30% CPU utilisation), we're seeing a scenario where the CMS collector doesn't keep pace with the objects being promoted to the old generation.

My initial thought was that these objects were still referenced and so not eligible for collection, but when the old gen fills up and prompts a serial collection, 5.5 GiB of 6 GiB is recovered.

The eden space is sized at 3 GiB, and takes around 20-30 seconds to fill enough to prompt a young collection. Survivor space usage fluctuates between 800 and 1250 MiB, with a 1.5 GiB maximum (each).

With the objects in old gen being eligible for collection, and the server having plenty of (apparent) resources, I don't understand why the CMS collector isn't keeping on top of the old gen size:

[Image: VisualVM graph of old generation usage]

What could cause this scenario, and are there any solutions?

I'm aware of the occupancy fraction, but I don't understand the implications of CMSIncrementalSafetyFactor. I've read the Oracle documentation, but I don't know what "add[ing] conservatism when computing the duty cycle" means.

Alternatives

Switching to the parallel / throughput collector yields low GC overhead (1.8%) but leaves occasional (50 times per day) long pauses of around 20 seconds for each full GC. Even with tuning, this isn't going to meet our max pause target.

In an ideal world, we'd be able to experiment with the G1 collector, but for various reasons we're stuck on a Java 6 JVM.

When the CMS collector doesn't keep pace with the object promotion rate, you should see "concurrent mode failures" in the GC logs. This happens when the CMS collector "loses the race" and you run out of memory before it completes.

2014-02-27T01:09:52.408-0600: 847.004: [GC 847.005: [ParNew (promotion failed)
Desired survivor size 78512128 bytes, new threshold 2 (max 15)
- age   1:   60284680 bytes,   60284680 total
- age   2:   32342648 bytes,   92627328 total
: 1380096K->1380096K(1380096K), 0.7375510 secs]847.743: [CMS2014-02-27T01:09:54.133-0600: 848.729: [CMS-concurrent-sweep: 5.467/6.765 secs] [Times: user=21.59 sys=0.73, real=6.76 secs]
 (concurrent mode failure): 2363866K->1763900K(4409856K), 10.6658960 secs] 3697627K->1763900K(5789952K), [CMS Perm : 118666K->117980K(125596K)], 11.4061610 secs] [Times: user=11.34 sys=0.02, real=11.57 secs]
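To check how often this happens, one option is to scan the GC log for the two markers that appear in the entry above. The following is only a minimal sketch: `GcLogScan` and `countLinesContaining` are hypothetical names, not part of any JDK tool, and the sample lines are shortened fragments of the log entry quoted above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not a JDK tool): counts CMS failure markers in GC log lines. */
final class GcLogScan {
    // Marker text as it appears in the log entry quoted above.
    static final String CONCURRENT_MODE_FAILURE = "concurrent mode failure";
    static final String PROMOTION_FAILED = "promotion failed";

    /** Returns how many of the given lines contain the marker text. */
    static int countLinesContaining(List<String> lines, String marker) {
        int count = 0;
        for (String line : lines) {
            if (line.contains(marker)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Shortened fragments of the log entry above, used only as sample input.
        List<String> sample = Arrays.asList(
            "[ParNew (promotion failed)",
            "[CMS-concurrent-sweep: 5.467/6.765 secs]",
            " (concurrent mode failure): 2363866K->1763900K(4409856K), 10.6658960 secs]");

        System.out.println("concurrent mode failures: "
            + countLinesContaining(sample, CONCURRENT_MODE_FAILURE));
        System.out.println("promotion failures: "
            + countLinesContaining(sample, PROMOTION_FAILED));
    }
}
```

In practice the lines would be read from the file that the JVM writes its GC log to, rather than from a hard-coded list.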

By default, the CMS collector will trigger at about 92% occupancy in the old generation. Judging by the memory growth rate in your graph of old generation usage, it is growing at about 500 MB every 5 minutes. 92% of 6 GB leaves roughly 500 MB of headroom, which means CMS has to win the race in less than 5 minutes, which it normally will. Unless...

...you have something happening behind the scenes other than the smooth traffic profile you see in the graph. For example, do you have any background processes that refresh in-memory data structures such as caches? These types of activities can create a sudden, huge amount of new, long-lived objects that need to be promoted to old gen. This can make your smooth graph go vertical all of a sudden, and can exhaust the available memory. The CMS collector is good at handling smooth, steady traffic, but it is vulnerable to quick bursts of activity. It is good at responding to gradual changes in the garbage generation rate, but it cannot anticipate "bursty" behavior, and I have seen many cases where this causes it to lose the race.

Apart from avoiding background processes that produce sudden bursts of new objects, you can give the CMS collector a head start by decreasing the CMSInitiatingOccupancyFraction parameter to somewhere between 60 and 80, rather than the default of 92%.

http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms.starting_a_cycle
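The headroom arithmetic above can be written out as a small calculation. This is a rough sketch, not a model of how CMS actually schedules its cycles: `CmsHeadroom` and its methods are hypothetical names, the 6 GB old generation and the 500 MB per 5 minutes growth rate are the figures quoted above, and the promotion rate is assumed to be perfectly steady.

```java
/** Hypothetical helper: back-of-the-envelope CMS headroom arithmetic. */
final class CmsHeadroom {
    /** Old gen space still free when occupancy reaches the initiating percentage. */
    static double headroomMb(double oldGenCapacityMb, double initiatingOccupancyPercent) {
        return oldGenCapacityMb * (100.0 - initiatingOccupancyPercent) / 100.0;
    }

    /** Seconds until that headroom is consumed at a steady promotion rate. */
    static double secondsToFill(double headroomMb, double growthMbPerSecond) {
        return headroomMb / growthMbPerSecond;
    }

    public static void main(String[] args) {
        double oldGenMb = 6 * 1024;             // 6 GB old generation, as quoted above
        double growthMbPerSec = 500.0 / 300.0;  // about 500 MB every 5 minutes

        // Compare the default trigger point with the lower values suggested above.
        for (int percent : new int[] {92, 80, 70, 60}) {
            double headroom = headroomMb(oldGenMb, percent);
            System.out.printf("%d%% trigger: %.0f MB headroom, about %.0f s to fill%n",
                percent, headroom, secondsToFill(headroom, growthMbPerSec));
        }
    }
}
```

A burst of promotions shortens the time to fill in proportion to the burst rate, which is why a lower trigger percentage buys more margin than the steady-state numbers alone suggest.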

Also, keep an eye on your PermGen space. Unlike the parallel throughput collector, the CMS collector does not collect PermGen by default, so if it ever fills up, you will end up with a stop-the-world full GC. This parameter makes the CMS collector collect the PermGen space as well: -XX:+CMSClassUnloadingEnabled.
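One way to keep an eye on both spaces from inside the application is the standard `java.lang.management` API. The sketch below is an illustration only: `PoolOccupancy` is a hypothetical class name, and the pool-name matching assumes a HotSpot JVM running CMS, where the pools are typically named "CMS Old Gen" and "CMS Perm Gen". Other collectors and JVM versions use different names, in which case nothing is printed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: prints occupancy of the old gen and perm gen memory pools. */
final class PoolOccupancy {
    /** Percent of max in use, or -1 when the pool reports no defined maximum. */
    static double percentUsed(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return 100.0 * usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: pool names contain "Old Gen" / "Perm Gen", as with CMS on HotSpot.
            if (!name.contains("Old Gen") && !name.contains("Perm Gen")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%s: %d MB used, %.1f%% of max%n",
                name, usage.getUsed() / (1024 * 1024),
                percentUsed(usage.getUsed(), usage.getMax()));
        }
    }
}
```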

Other than that, I recommend turning on GC logging and setting -XX:+PrintGCDetails, which prints details for every minor and major garbage collection.

This is a great parameter that lets you see every JVM setting at startup: -XX:+PrintFlagsFinal, which prints the value of all JVM configuration options at startup.
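To confirm from inside a running application which of these options were actually passed on the command line, the standard `RuntimeMXBean.getInputArguments()` call can be used. A minimal sketch follows. `StartupFlags` and `hasFlag` are hypothetical names, and the check only sees options given explicitly at startup, not defaults (for those, -XX:+PrintFlagsFinal remains the tool).

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper: checks which JVM options were passed explicitly at startup. */
final class StartupFlags {
    /** True if the argument list contains the flag itself or the flag followed by "=value". */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        for (String arg : jvmArgs) {
            if (arg.equals(flag) || arg.startsWith(flag + "=")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Only explicitly passed options appear here; defaults do not.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        String[] flagsOfInterest = {
            "-XX:+PrintGCDetails",
            "-XX:+CMSClassUnloadingEnabled",
            "-XX:CMSInitiatingOccupancyFraction"
        };
        for (String flag : flagsOfInterest) {
            System.out.println(flag + " set explicitly: " + hasFlag(jvmArgs, flag));
        }
    }
}
```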

