Can jmap -histo trigger full garbage collection?

We know that jmap -histo:live triggers a full GC in order to determine live objects:
Does jmap force garbage collection when the live option is used?
Since jmap -histo considers all objects in the heap (those in both the young and the old generation), my thinking is that jmap -histo could also trigger a full GC. However, I could not find solid documentation on whether jmap -histo may trigger a full GC or not.
Can jmap -histo trigger full garbage collection?

jmap -histo will not trigger a full gc, but jmap -histo:live will.
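One way to sanity-check this is to watch the JVM's collector counts while running jmap against the process from another terminal. Below is a minimal sketch of such a harness (GcCountWatcher is a hypothetical name of mine, not part of any JDK tooling); the old-generation collector's count should stay flat under jmap -histo and increase under jmap -histo:live:
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCountWatcher {
    public static void main(String[] args) throws InterruptedException {
        // Prints "pid@host" so you know which pid to pass to jmap.
        System.out.println(ManagementFactory.getRuntimeMXBean().getName());
        while (true) {
            // One bean per collector; the old-generation bean's count
            // only increases when a full (major) collection runs.
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %d collections%n",
                        gc.getName(), gc.getCollectionCount());
            }
            Thread.sleep(1000);
        }
    }
}
Run jmap -histo <pid> and then jmap -histo:live <pid> against the printed pid and compare the counts before and after each run.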

Someone with more JDK experience should verify this, but I'm fairly confident that -histo:live does trigger a full GC, at least in OpenJDK 1.7. Start with jdk/src/share/classes/sun/tools/jmap/JMap.java:
public class JMap {
    ...
    private static String LIVE_HISTO_OPTION = "-histo:live";
    ...
    ...
        } else if (option.equals(LIVE_HISTO_OPTION)) {
            histo(pid, true);
    ...
    private static final String LIVE_OBJECTS_OPTION = "-live";
    private static final String ALL_OBJECTS_OPTION = "-all";

    private static void histo(String pid, boolean live) throws IOException {
        VirtualMachine vm = attach(pid);
        InputStream in = ((HotSpotVirtualMachine)vm).
            heapHisto(live ? LIVE_OBJECTS_OPTION : ALL_OBJECTS_OPTION);
        drain(vm, in);
    }
When live is true, the ternary in JMap.histo() passes the -live argument to heapHisto() in jdk/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java:
// Heap histogram (heap inspection in HotSpot)
public InputStream heapHisto(Object ... args) throws IOException {
    return executeCommand("inspectheap", args);
}
And if we look at inspectheap itself, in hotspot/src/share/vm/services/attachListener.cpp:
// Implementation of "inspectheap" command
//
// Input arguments :-
//   arg0: "-live" or "-all"
static jint heap_inspection(AttachOperation* op, outputStream* out) {
  bool live_objects_only = true;   // default is true to retain the behavior before this change is made
  const char* arg0 = op->arg(0);
  if (arg0 != NULL && (strlen(arg0) > 0)) {
    if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) {
      out->print_cr("Invalid argument to inspectheap operation: %s", arg0);
      return JNI_ERR;
    }
    live_objects_only = strcmp(arg0, "-live") == 0;
  }
  VM_GC_HeapInspection heapop(out, live_objects_only /* request full gc */, true /* need_prologue */);
  VMThread::execute(&heapop);
  return JNI_OK;
}
Note, in particular, the live_objects_only strcmp and the resulting heapop call two lines later. If inspectheap gets the -live argument via any avenue, it requests a full gc.
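If you want to drive this code path without the jmap binary, the attach API it uses can be called directly. Here is a minimal sketch mirroring the JMap.histo() code quoted above; HistoDemo and its argument handling are mine, and on JDK 7 it needs tools.jar on the classpath. Passing "-live" is exactly what makes inspectheap request the full gc, while "-all" does not:
import java.io.InputStream;
import com.sun.tools.attach.VirtualMachine;
import sun.tools.attach.HotSpotVirtualMachine;

public class HistoDemo {
    // args[0] = target pid, args[1] = "-all" or "-live"
    public static void main(String[] args) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(args[0]);
        try {
            // Same call JMap.histo() makes; "-live" takes the
            // request-full-gc branch in heap_inspection() above.
            InputStream in = ((HotSpotVirtualMachine) vm).heapHisto(args[1]);
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                System.out.write(buf, 0, n);
            }
            in.close();
        } finally {
            vm.detach();
        }
    }
}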

No, jmap -histo will not trigger a full GC. I print histograms quite regularly and do not see any full GCs in my GC logs.
I do not know how it is implemented in the VM, but you do not need to worry about full GCs.

In my experience: yes, it can. You can run the experiment yourself with this command:
sudo -u tomcat jstat -gcutil {PID} 1000 1000
Description:
The first 1000 after the pid is the print interval in milliseconds.
The second 1000 is the loop count.
With this command you can monitor the JVM's GC activity; you can see the full GC count and time like below:
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 18.45 13.12 84.23 47.64 206149 5781.308 83 115.479 5896.786
0.00 21.84 5.64 84.24 47.64 206151 5781.358 83 115.479 5896.837
0.00 32.27 1.66 84.24 47.64 206153 5781.409 83 115.479 5896.888
0.00 13.96 53.54 84.24 47.64 206155 5781.450 83 115.479 5896.929
0.00 21.56 91.77 84.24 47.64 206157 5781.496 83 115.479 5896.974
Now execute the jmap command in another terminal. First run it without the :live parameter, then run it again with :live: you should see full GC activity when the command is executed with the :live parameter; in other words, the full GC count (FGC) will increment.
The second command may look like this:
sudo -u tomcat /home/path/to/jmap -histo:live {pid} | head -n 40
By the way, my JDK version is JDK 7.

Related

gtk_css_value_inherit_free: code should not be reached

I am writing a Rhythmbox plugin in Python.
I randomly get an error when the plugin starts. If I restart Rhythmbox after a few seconds, the plugin runs OK.
What could be causing the error?
Error:
gtkcssinheritvalue.c:33:gtk_css_value_inherit_free: code should not be reached
gtkcssinheritvalue.c:
29 static void
30 gtk_css_value_inherit_free (GtkCssValue *value)
31 {
32   /* Can only happen if the unique value gets unreffed too often */
33   g_assert_not_reached ();
34 }
https://github.com/GNOME/gtk/blob/gtk-3-6/gtk/gtkcssinheritvalue.c
All suggestions are welcome. Thanks.

DocumentDB performance issues

When running DocumentDB queries from C# code on my local computer, a simple DocumentDB query takes about 0.5 seconds on average. As another example, getting a reference to a document collection takes about 0.7 seconds on average. Is this to be expected? Below is my code for checking if a collection exists; it is pretty straightforward, but is there any way of improving the poor performance?
// Create a new instance of the DocumentClient
var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey);

// Get the database with the id=FamilyRegistry
var database = client.CreateDatabaseQuery().Where(db => db.Id == "FamilyRegistry").AsEnumerable().FirstOrDefault();

var stopWatch = new Stopwatch();
stopWatch.Start();

// Get the document collection with the id=FamilyCollection
var documentCollection = client.CreateDocumentCollectionQuery("dbs/" + database.Id)
    .Where(c => c.Id == "FamilyCollection").AsEnumerable().FirstOrDefault();

stopWatch.Stop();

// Get the elapsed time as a TimeSpan value.
var ts = stopWatch.Elapsed;

// Format and display the TimeSpan value.
var elapsedTime = String.Format("{0:00} seconds, {1:00} milliseconds",
    ts.Seconds,
    ts.Milliseconds);
Console.WriteLine("Time taken to get a document collection: " + elapsedTime);
Console.ReadKey();
Average output on local computer:
Time taken to get a document collection: 0 seconds, 752 milliseconds
In another piece of my code I do 20 small document updates, each about 400 bytes of JSON, and it still takes 12 seconds in total. I'm only running from my development environment, but I was expecting better performance.
In short, this can be done end to end in ~9 milliseconds with DocumentDB. I'll walk through the changes required, and why/how they impact results below.
The very first query always takes longer in DocumentDB because it does some setup work (fetching the physical addresses of DocumentDB partitions). The next couple of requests take a little longer to warm the connection pools. The subsequent queries will be as fast as your network (the read latency of DocumentDB is very low due to SSD storage).
For example, if you modify your code above to measure 10 readings instead of just the first one, as shown below:
using (DocumentClient client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey))
{
    long totalRequests = 10;
    var database = client.CreateDatabaseQuery().Where(db => db.Id == "FamilyRegistry").AsEnumerable().FirstOrDefault();

    Stopwatch watch = new Stopwatch();
    for (int i = 0; i < totalRequests; i++)
    {
        watch.Start();
        var documentCollection = client.CreateDocumentCollectionQuery("dbs/" + database.Id)
            .Where(c => c.Id == "FamilyCollection").AsEnumerable().FirstOrDefault();
        Console.WriteLine("Finished read {0} in {1}ms ", i, watch.ElapsedMilliseconds);
        watch.Reset();
    }
}
Console.ReadKey();
I get the following results running from my desktop in Redmond against the Azure West US data center, i.e. about 50 milliseconds. These numbers may vary based on the network connectivity and distance of your client from the Azure DC hosting DocumentDB:
Finished read 0 in 217ms
Finished read 1 in 46ms
Finished read 2 in 51ms
Finished read 3 in 47ms
Finished read 4 in 46ms
Finished read 5 in 93ms
Finished read 6 in 48ms
Finished read 7 in 45ms
Finished read 8 in 45ms
Finished read 9 in 51ms
Next, I switch to Direct/TCP connectivity from the default of Gateway to improve the latency from two hops to one, i.e., change the initialization code to:
using (DocumentClient client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey, new ConnectionPolicy { ConnectionMode = ConnectionMode.Direct, ConnectionProtocol = Protocol.Tcp }))
Now the operation to find the collection by ID completes within 23 milliseconds:
Finished read 0 in 197ms
Finished read 1 in 117ms
Finished read 2 in 23ms
Finished read 3 in 23ms
Finished read 4 in 25ms
Finished read 5 in 23ms
Finished read 6 in 31ms
Finished read 7 in 23ms
Finished read 8 in 23ms
Finished read 9 in 23ms
How about when you run the same code from an Azure VM or Worker Role in the same Azure DC? The same operation completes in about 9 milliseconds!
Finished read 0 in 140ms
Finished read 1 in 10ms
Finished read 2 in 8ms
Finished read 3 in 9ms
Finished read 4 in 9ms
Finished read 5 in 9ms
Finished read 6 in 9ms
Finished read 7 in 9ms
Finished read 8 in 10ms
Finished read 9 in 8ms
So, to summarize:
For performance measurements, please allow for a few measurement samples to account for startup/initialization of the DocumentDB client.
Please use TCP/Direct connectivity for lowest latency.
When possible, run within the same Azure region.
If you follow these steps, you'll get the best performance numbers DocumentDB has to offer.

Force lshosts command to return megabytes for "maxmem" and "maxswp" parameters

When I type "lshosts" I am given:
HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES
server1 X86_64 Intel_EM 60.0 12 191.9G 159.7G Yes ()
server2 X86_64 Intel_EM 60.0 12 191.9G 191.2G Yes ()
server3 X86_64 Intel_EM 60.0 12 191.9G 191.2G Yes ()
I am trying to get lshosts to return maxmem and maxswp in megabytes, not gigabytes. I am trying to send Xilinx ISE jobs to my LSF cluster, but the software expects integer megabyte values for maxmem and maxswp. From debugging, it appears that the software grabs these parameters using the lshosts command.
I have already checked in my lsf.conf file that:
LSF_UNIT_FOR_LIMITS=MB
I have tried searching the IBM Knowledge Base, but to no avail.
Do you use a specific command to specify maxmem and maxswp units within the lsf.conf, lsf.shared, or other config files?
Or does LSF force return the most practical unit?
Any way to override this?
LSF_UNIT_FOR_LIMITS should work, provided you completely drain the cluster of all running, pending, and finished jobs first. According to the docs, MB is the default, so I'm surprised it's printing gigabytes at all.
That said, you can use something like this to transform the results:
$ cat to_mb.awk
function to_mb(s) {
    # the last character selects the exponent: K=1, M=2, G=3
    e = index("KMG", substr(s, length(s)))
    # numeric part without the unit suffix (awk substr is 1-indexed)
    m = substr(s, 1, length(s) - 1)
    return m * 10^((e-2) * 3)
}
{ print $1 " " to_mb($6) " " to_mb($7) }
$ lshosts | tail -n +2 | awk -f to_mb.awk
server1 191900 159700
server2 191900 191200
server3 191900 191200
The to_mb function should also handle 'K' or 'M' units, should those pop up.
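For example, "159.7G" yields e = 3, so to_mb returns 159.7 * 10^((3-2)*3) = 159700, matching the server1 output above; a value like "512M" yields e = 2 and comes back unchanged as 512.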
If LSF_UNIT_FOR_LIMITS is defined in lsf.conf, lshosts will always print the output as a floating point number, and in some versions of LSF the parameter is defined as 'KB' in lsf.conf upon installation.
Try searching for any definitions of the parameter in lsf.conf and commenting them all out so that the parameter is left undefined; I think in that case it defaults to printing an integer number of megabytes.
(Don't ask me why it works this way)
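To make that concrete, here is a sketch of the workflow, assuming you have LSF admin rights (lsadmin and badmin are LSF's standard reconfiguration commands):
# In lsf.conf (and check lsf.shared as well), comment out any definition like:
# LSF_UNIT_FOR_LIMITS=KB
Then run lsadmin reconfig followed by badmin reconfig so the daemons pick up the change, and drain the cluster of jobs as noted above.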

Oprofile warning "could not check that the binary file"

We profile kernel modules with OProfile, and there is a warning in the opreport output as follows:
warning: could not check that the binary file /lib/modules/2.6.32-191.el6.x86_64/kernel/fs/ext4/ext4.ko has not been modified since the profile was taken. Results may be inaccurate.
samples  %       symbol name
1622     9.8381  ext4_iget
1591     9.6500  ext4_find_entry
1231     7.4665  __ext4_get_inode_loc
783      4.7492  ext4_ext_get_blocks
752      4.5612  ext4_check_dir_entry
644      3.9061  ext4_mark_iloc_dirty
583      3.5361  ext4_get_blocks
583      3.5361  ext4_xattr_get
Can anyone explain what this warning means, whether it impacts the accuracy of the OProfile output, and whether there is any way to avoid it?
Any suggestions are appreciated. Thanks a lot!
Additional information:
In daemon/opd_mangling.c
if (!sf->kernel)
binary = find_cookie(sf->cookie);
else
binary = sf->kernel->name;
...
fill_header(odb_get_data(file), counter,
sf->anon ? sf->anon->start : 0, last_start,
!!sf->kernel, last ? !!last->kernel : 0,
spu_profile, sf->embedded_offset,
binary ? op_get_mtime(binary) : 0);
For a kernel module, sf->kernel->name is the module name (e.g. "ext4", not a path that stat() can resolve), so fill_header() always fills mtime with 0 and generates the unwanted warning.
This failure indicates that a stat of the file in question failed. Do an strace -e stat to see the specific failure mode.
time_t op_get_mtime(char const * file)
{
    struct stat st;
    if (stat(file, &st))
        return 0;
    return st.st_mtime;
}
...
if (!header.mtime) {
    // FIXME: header.mtime for JIT sample files is 0. The problem could be that
    // in opd_mangling.c:opd_open_sample_file() the call of fill_header()
    // think that the JIT sample file is not a binary file.
    if (is_jit_sample(file)) {
        cverb << vbfd << "warning: could not check that the binary file "
              << file << " has not been modified since "
              "the profile was taken. Results may be inaccurate.\n";
Does it impact the accuracy of the oprofile output, and is there any way to avoid this warning?
Yes, it impacts the output in that opreport has no opportunity to warn you that "the last modified time of the binary file does not match that of the sample file...". As long as you're certain that the binary you measured then matches the binary installed now, the warning you're seeing is harmless.

How to get the parent thread in WinDBG?

When I analyze a crash dump file, I often get errors like this:
0:025> kP
Child-SP RetAddr Call Site
00000000`05a4fc78 00000000`77548638 ntdll!DbgBreakPoint(void) [d:\w7rtm\minkernel\ntos\rtl\amd64\debugstb.asm # 51]
00000000`05a4fc80 00000000`774b39cb ntdll!DbgUiRemoteBreakin(
void * Context = 0x00000000`00000000)+0x38 [d:\w7rtm\minkernel\ntdll\dlluistb.c # 310]
00000000`05a4fcb0 00000000`00000000 ntdll!RtlUserThreadStart(
<function> * StartAddress = 0x00000000`00000000,
void * Argument = 0x00000000`00000000)+0x25 [d:\w7rtm\minkernel\ntos\rtl\rtlexec.c # 3179]
It seems that the process crashed when creating a thread. So, I want to find out who, or which thread, created the current thread. How can I get this information?
You can look at the other threads in the process with ~*k to see if there's anything interesting. Other than that, this info simply isn't there in the dump.
-scott
