Solr uses too much memory - memory-leaks

We have a Solr 3.4 instance running on Windows 2008 R2 with Oracle Java 6 Hotspot JDK that becomes unresponsive. When we looked at the machine, we noticed that the available physical memory went to zero.
The Tomcat7.exe process was using ~70 GB (Private Working Set), but Working Set (Memory) was using all the memory on the system. There were no errors in the Tomcat / Solr logs. We used VMMap to identify that the memory was being used for memory-mapping the Solr segment files.
Restarting Tomcat fixed the problem temporarily, but it eventually came back.
We then tried decreasing the JVM size to give more space to the memory-mapped files, but then Solr eventually became unresponsive with the old generation at 100%. Again, restarting fixed the problem, but it did not throw an out-of-memory exception before we restarted.
Currently our spidey sense is telling us that the cache doesn't shrink when there is memory pressure, and that maybe there are too many MappedByteBuffers hanging around, so the OS cannot free up the memory from the memory-mapped files.

There are too many parameters and too little information to give specific advice. This answer is also quite old, as are the systems mentioned.
Here are some things that helped in my experience:
rather decrease RAM usage in both Tomcat and SOLR to reduce the risk of swapping. Leave the system room to breathe.
if this "starts" to appear without any changes to Tomcat or the SOLR config - than maybe it is due to the fact that the amount of data that SOLR has to index and query has increased. This can either mean that the original config was never good to begin with or that the limit of the current resources have been reached and have to be reviewed. Which one is it?
check the queries (if you can influence them): move subquery constructs that are requested often into filter queries (fq), and keep highly individual request constructs in the regular query parameter (q). Decrease the query cache; increase or keep the filter cache, or decrease it if filter queries aren't used much in your system. See the sketch after this list.
check SOLR's schema.xml for configuration errors (these could really just be misconceptions). I ran into this once: during an import, fields were being created en masse, causing RAM to overflow.
if it happens during import: check whether the import process autocommits, commits very often, and also optimizes. Maybe it is possible to commit less often and optimize only once at the end.
upgrade Java, Tomcat and SOLR
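For illustration of the filter-query point above, here is a minimal sketch, assuming a recent Node.js (18+, for the built-in fetch) and a hypothetical local Solr core called products; adjust the URL and field names to your own setup:

// Hypothetical example: the constraint that many requests share goes into fq,
// so Solr can answer it from the filter cache; the per-request part stays in q.
const params = new URLSearchParams({
  q: 'title:"galaxy s3"',   // highly individual part -> query cache
  rows: '10',
  wt: 'json',
});
params.append('fq', 'category:phones');   // repeated constraints -> filter cache
params.append('fq', 'inStock:true');      // fq may appear multiple times

fetch('http://localhost:8983/solr/products/select?' + params.toString())
  .then((res) => res.json())
  .then((json) => console.log(json.response.numFound, 'hits'))
  .catch((err) => console.error('Solr request failed:', err));

The more of your queries share the same fq strings, the better the filter cache hit rate, which is exactly what the advice above is after.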

It's probably due to the new default policy SOLR uses for its directories (it memory-maps the index files into the process address space rather than reading them into the heap).
Read this:
http://grokbase.com/t/lucene/solr-user/11789qq3nm/virtual-memory-usage-increases-beyond-xmx-with-solr-3-3

Related

Does RAM affect the time taken to sort an array?

I have an array of 500k to a million items to be sorted. Would a configuration with more RAM be beneficial, say going from 8GB to 32GB or above? I'm using a node.js/MongoDB environment.
Adding RAM for an operation like that would only make a difference if you had filled up the available memory with everything running on your computer and the OS was swapping data out to disk to make room for your sort operation. Chances are, if that were happening, you would know, because your computer would become pretty sluggish.
So, you just need enough memory for the working set of whatever applications you're running and then enough memory to hold the data you are sorting. Adding additional memory beyond that will not make any difference.
If you had an array of a million numbers to be sorted in Javascript, that array would likely take (1,000,000 * 8 bytes per number) + some overhead for a JS data structure = ~8MB. If your array values were larger than 8 bytes, then you'd have to account for that in the calculation, but hopefully you can see that this isn't a ton of memory in a modern computer.
If you have only an 8GB system, with a lot of services and other things configured on it, and you are perhaps running a few other applications at the same time, then it's possible that by the time you run nodejs you don't have much free memory. You should be able to look at some system diagnostics to see how much free memory you have. As long as you have some free memory and are not causing the system to do disk swapping, adding more memory will not increase the performance of the sort.
Now, if the data is stored in a database and you're doing some major database operation (such as creating a new index), then it's possible that the database may adjust how much memory it can use based on how much memory is available and it might be able to go faster by using more RAM. But, for a Javascript array which is already all in memory and is using a fixed algorithm for the sort, this would not be the case.
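If you want to sanity-check the ~8MB estimate and the free-memory point above on your own machine, here is a rough sketch in plain Node.js (no external modules; the numbers are illustrative, not a benchmark):

const os = require('os');

// How much memory is free before we start? If this is near zero and the box
// is swapping, more RAM (or closing things) would help; otherwise it won't.
console.log('free system memory (MB):', Math.round(os.freemem() / 1048576));

const heapBefore = process.memoryUsage().heapUsed;
const arr = Array.from({ length: 1000000 }, () => Math.random());

const t0 = Date.now();
arr.sort((a, b) => a - b);   // numeric compare, not the default string sort
const t1 = Date.now();

const heapAfter = process.memoryUsage().heapUsed;
console.log('sort took (ms):', t1 - t0);
console.log('approx heap used by the array (MB):',
  Math.round((heapAfter - heapBefore) / 1048576));

The exact numbers vary by machine, but the point is that the array itself is tiny relative to 8GB of RAM.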

Why does swap need to be turned off in Datastax Cassandra?

I am new to Datastax Cassandra. While going through the Cassandra installation procedure, I saw that it is recommended to turn off the OS swap area. Can anyone provide the reason for that? Will it affect any OS-level operations?
In production, if your database is using swap you will have very bad performance. In a ring of Cassandra nodes, you are better off having one node go completely down than allowing it to limp along in swap.
The easiest way to ensure that you never go into swap is to simply disable it.
If you don't disable swap space, then when the OS runs out of memory (for example when the address space is used up by Cassandra's mmap), it will page out a slice of the JVM, which the JVM in turn tries to reclaim (via JNI, by default). Your JVM is now slowed down because a slice of its heap memory has been swapped out, and GC runs alongside Cassandra write operations with less usable heap. This brings the overall performance of the Cassandra node down and gradually kills it, until at some point there is no memory left at the OS level.
That's why they suggest two things:
Bundle jna.jar. When there is a GC operation involving Cassandra's mmap, it is then all done by Java code rather than the JNI portion shipped by default in Cassandra, so it avoids the portion of address space that JNI would otherwise use for native operations.
Disable swap space. A poorly performing Cassandra node will slow down all operations at its clients' end. Even replication to that node will be slowed down, and hence your reads/writes will appear slower than you would expect. It is better for a node to die and restart when such an out-of-memory condition occurs than to have a portion of the JVM swapped out, slowing down the entire process.

linux CPU cache slowdown

We're getting overnight lockups on our embedded (Arm) linux product but are having trouble pinning it down. It usually takes 12-16 hours from power on for the problem to manifest itself. I've installed sysstat so I can run sar logging, and I've got a bunch of data, but I'm having trouble interpreting the results.
The targets only have 512MB of RAM (we have other models which have 1GB, but they see this issue much less often), and have no disk swap files, to avoid wearing the eMMCs.
Some kind of paging / virtual memory event is initiating the problem. In the sar logs, pgpgin/s, pgscand/s, pgsteal/s and majflt/s all increase steadily before snowballing to crazy levels. This pushes the CPU load up to correspondingly high levels (30-60 on dual-core Arm chips). At the same time, the frmpg/s values go very negative, whilst campg/s go highly positive. The upshot is that the system is trying to allocate a large number of cache pages all at once. I don't understand why this would happen.
The target then essentially locks up until it's rebooted or someone kills the main GUI process or it crashes and is restarted (We have a monolithic GUI application that runs all the time and generally does all the serious work on the product). The network shuts down, telnet blocks forever, as do /proc filesystem queries and things that rely on it like top. The memory allocation profile of the main application in this test is dominated by reading data in from file and caching it as textures in video memory (shared with main RAM) in an LRU using OpenGL ES 2.0. Most of the time it'll be accessing a single file (they are about 50Mb in size), but I guess it could be triggered by having to suddenly use a new file and trying to cache all 50Mb of it all in one go. I haven't done the test (putting more logging in) to correlate this event with these system effects yet.
The odd thing is that the actual free and cached RAM levels don't show an obvious lack of memory (I have seen the oom-killer swoop in and kill the main application with >100MB free and 40MB of cache RAM). The main application's memory usage seems reasonably well-behaved, with a VmRSS value that stays pretty stable. Valgrind hasn't found any progressive leaks that would happen during operation.
The behaviour seems like that of a system frantically swapping out to disk and making everything run dog slow as a result, but I don't know if this is a known effect in a free<->cache RAM exchange system.
My problem is superficially similar to the question linux high kernel cpu usage on memory initialization, but that issue seemed to be driven by disk swap file management. However, dirty page flushing does seem plausible for my issue.
I haven't tried playing with the various vm files under /proc/sys/vm yet. vfs_cache_pressure and possibly swappiness would seem good candidates for some tuning, but I'd like some insight into good values to try here. vfs_cache_pressure seems ill-defined as to what the difference between setting it to 200 as opposed to 10000 would be quantitatively.
The other interesting fact is that it is a progressive problem. It might take 12 hours for the effect to happen the first time. If the main app is killed and restarted, it seems to happen every 3 hours after that fact. A full cache purge might push this back out, though.
Here's a link to the log data, with two files: sar1.log, which is the complete output of sar -A, and overview.log, an extract of free / cached mem, CPU load, MainGuiApp memory stats, and the -B and -R sar outputs for the interesting period between midnight and 3:40am:
https://drive.google.com/folderview?id=0B615EGF3fosPZ2kwUDlURk1XNFE&usp=sharing
So, to sum up, what's my best plan here? Tune vm to tend to recycle pages more often to make it less bursty? Are my assumptions about what's happening even valid given the log data? Is there a cleverer way of dealing with this memory usage model?
Thanks for your help.
Update 5th June 2013:
I've tried the brute-force approach and put a script on the device which echoes 3 to drop_caches every hour. This seems to be maintaining the steady state of the system right now, and the sar -B stats stay on the flat portion, with very few major faults and 0.0 pgscand/s. However, I don't understand why keeping the cached RAM very low mitigates a problem where the kernel is trying to add the universe to cache RAM.

How to shrink the page table size of a process?

The mongodb server maps all db files into RAM. As the database gets bigger, the server ends up with a huge page table, up to 3GB in size.
Is there a way to shrink it when the server is running?
mongodb version is 2.0.4
Mongodb will memory-map all of the data files that it creates, plus the journal files (if you're using journaling). There is no way to prevent this from happening. This means that the virtual memory size of the MongoDB process will always be roughly twice the size of the data files.
Note that the OS memory management system will page out unused RAM pages, so that the physical memory size of the process will typically be much less than the virtual memory size.
The only way to reduce the virtual memory size of the 'mongod' process is to reduce the size of the MongoDB data files. The only way to reduce the size of the data files is to take the node offline and perform a 'repair'.
See here for more details:
- http://www.mongodb.org/display/DOCS/Excessive+Disk+Space#ExcessiveDiskSpace-RecoveringDeletedSpace
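If you want to see how much a repair would actually buy you before taking the node offline, a minimal sketch in the mongo shell (MongoDB 2.0.x era commands) would be:

// Compare how much disk the data files occupy (fileSize / storageSize)
// with how much data they actually hold (dataSize).
var s = db.stats();
print('data size (MB):    ' + Math.round(s.dataSize / 1048576));
print('storage size (MB): ' + Math.round(s.storageSize / 1048576));
print('file size (MB):    ' + Math.round(s.fileSize / 1048576));

// Shrinking the data files is the only thing that shrinks the mapped
// virtual memory (and hence the page table). Run this deliberately:
// it blocks the node and needs free disk space comparable to the data size.
// db.repairDatabase();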
Basically you are asking to do something that the MongoDB manual recommends against in this specific scenario: http://docs.mongodb.org/manual/administration/ulimit/. Recommended, however, does not mean required; it is really just a guideline.
This is just the way MongoDB runs and something you have got to accept unless you wish to toy around and test out different scenarios and how they work.
You probably want to reduce the memory used by the process. You could use the ulimit bash builtin (before starting your server, perhaps in some /etc/rc.d/mongodb script), which calls the setrlimit(2) syscall.

Mongo suffering from a huge number of faults

I'm seeing a huge (~200++) faults/sec number in my mongostat output, though a very low lock %.
My Mongo servers are running on m1.large instances on the Amazon cloud, so they each have 7.5GB of RAM:
root:~# free -tm
             total       used       free     shared    buffers     cached
Mem:          7700       7654         45          0          0       6848
Clearly, I do not have enough memory for all the caching mongo wants to do (which, btw, results in huge CPU usage %, due to disk IO).
I found this document that suggests that in my scenario (high fault, low lock %), I need to "scale out reads" and "more disk IOPS."
I'm looking for advice on how to best achieve this. Namely, there are LOTS of different potential queries executed by my node.js application, and I'm not sure where the bottleneck is happening. Of course, I've tried
db.setProfilingLevel(1);
However, this doesn't help me much, because the output just shows me slow queries, and I'm having a hard time translating that information into which queries are causing the page faults...
As you can see, this is resulting in a HUGE (nearly 100%) CPU wait time on my PRIMARY mongo server, though the 2x SECONDARY servers are unaffected...
Here's what the Mongo docs have to say about page faults:
Page faults represent the number of times that MongoDB requires data not located in physical memory, and must read from virtual memory. To check for page faults, see the extra_info.page_faults value in the serverStatus command. This data is only available on Linux systems.
Alone, page faults are minor and complete quickly; however, in aggregate, large numbers of page faults typically indicate that MongoDB is reading too much data from disk and can indicate a number of underlying causes and recommendations. In many situations, MongoDB’s read locks will “yield” after a page fault to allow other processes to read and avoid blocking while waiting for the next page to read into memory. This approach improves concurrency, and in high volume systems this also improves overall throughput.
If possible, increasing the amount of RAM accessible to MongoDB may help reduce the number of page faults. If this is not possible, you may want to consider deploying a shard cluster and/or adding one or more shards to your deployment to distribute load among mongod instances.
So, I tried the recommended command, which is terribly unhelpful:
PRIMARY> db.serverStatus().extra_info
{
    "note" : "fields vary by platform",
    "heap_usage_bytes" : 36265008,
    "page_faults" : 4536924
}
Of course, I could increase the server size (more RAM), but that is expensive and seems to be overkill. I should implement sharding, but I'm actually unsure what collections need sharding! Thus, I need a way to isolate where the faults are happening (what specific commands are causing faults).
Thanks for the help.
We don't really know what your data/indexes look like.
Still, an important rule of MongoDB optimization:
Make sure your indexes fit in RAM. http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-MakesureyourindexescanfitinRAM.
Consider that the smaller your documents are, the higher your key/document ratio will be, and the higher your RAM/Disksize ratio will need to be.
If you can adjust your schema a bit to lump some data together, and reduce the number of keys you need, that might help.
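To check whether the indexes actually fit in RAM, and to get at least a rough idea of which operations are hitting disk, something like the following in the mongo shell can help. This is only a sketch: the 100 ms profiler threshold is an arbitrary assumption, and profiling level 1 is assumed to be on already, as in the question.

// Total index size per collection, to compare against the 7.5GB of RAM.
db.getCollectionNames().forEach(function (name) {
    var stats = db.getCollection(name).stats();
    print(name + ': indexes ' + Math.round((stats.totalIndexSize || 0) / 1048576) +
          ' MB, data ' + Math.round((stats.size || 0) / 1048576) + ' MB');
});

// Slowest profiled operations -- queries scanning data that is not in RAM
// are the usual suspects for the page faults.
db.system.profile.find({ millis: { $gt: 100 } })
    .sort({ millis: -1 })
    .limit(10)
    .forEach(printjson);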
