GraphX Disk size running low


I am using Apache Spark with GraphX. Lately I have noticed that when I run my application on a lot of data, it uses a large part of my disk. For example, before I start the program the free disk space is around 8 GB, and while the application runs it drops to 1 GB. When I close the application the space is restored, but not in full: I have lost a few GB. At first I thought it had to do with swap and logs, but I cannot find what is stored on my disk after the application finishes.
Can someone explain why this is happening?
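One way to narrow this down is to measure which directories grow during a run. Spark writes shuffle output and spilled data under spark.local.dir (by default /tmp), so that is a reasonable first place to look; whether it accounts for the space lost here is an assumption, not something the question confirms. The helper names below are hypothetical, and this is only a minimal sketch:

```python
import os

def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping unreadable ones."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda e: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so a file is not counted via a link to it.
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # Files can vanish while a job is running; ignore them.
                pass
    return total

def largest_subdirs(root, top=10):
    """Return (size_bytes, path) for the biggest immediate subdirectories of root."""
    sizes = []
    try:
        entries = list(os.scandir(root))
    except OSError:
        return sizes
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size_bytes(entry.path), entry.path))
    return sorted(sizes, reverse=True)[:top]
```

Calling largest_subdirs("/tmp") before, during, and after a run, and comparing the three results, should show which directories are not cleaned up.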

Related

Buffer/cache exhaustion Spark standalone inside a Docker container

I have a very weird memory issue (which is what a lot of people will most likely say ;-)) with Spark running in standalone mode inside a Docker container. Our setup is as follows: we have a Docker container in which a Spring Boot application runs Spark in standalone mode. This Spring Boot app also contains a few scheduled tasks (managed by Spring). These tasks trigger Spark jobs. The Spark jobs scrape a SQL database, shuffle the data a bit, and then write the results to a different SQL table (writing the results doesn't go through Spark). Our current data set is very small (the table contains a few million rows).
The problem is that the Docker host (a CentOS VM) that runs the Docker container crashes after a while because its memory gets exhausted. I have currently limited the Spark memory usage to 512M (I have set both executor and driver memory), and in the Spark UI I can see that the largest job only takes about 10 MB of memory. I know that Spark runs best if it has 8 GB of memory or more available. I have tried that as well, but the results are the same.
After digging a bit further, I noticed that Spark eats up all the buffer/cache memory on the machine. After clearing this manually by forcing Linux to drop caches (echo 2 > /proc/sys/vm/drop_caches, which clears the dentries and inodes), the cache usage drops considerably, but if I don't keep doing this regularly, the cache usage slowly climbs until all memory is used by buffer/cache.
Does anyone have an idea what I might be doing wrong / what is going on here?
Big thanks in advance for any help!
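To see which part of buffer/cache is actually growing, it can help to log a few /proc/meminfo fields over time. Since writing 2 to drop_caches frees dentries and inodes, the reclaimable slab figure (SReclaimable) is a natural one to watch. The function names below are hypothetical; this is a minimal sketch that only parses text in the /proc/meminfo format:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def cache_summary(fields):
    """Return buffers, page cache, reclaimable slab, and available memory in MB."""
    def to_mb(key):
        # Fields missing on older kernels (e.g. MemAvailable) are reported as 0.
        return fields.get(key, 0) / 1024
    return {
        "buffers_mb": to_mb("Buffers"),
        "cached_mb": to_mb("Cached"),
        "slab_reclaimable_mb": to_mb("SReclaimable"),
        "available_mb": to_mb("MemAvailable"),
    }
```

On the Docker host, reading /proc/meminfo and passing its contents to parse_meminfo every few minutes would show whether the growth is in Cached or in SReclaimable.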

GPDB: Out of memory at segment

We are facing an OOM error when trying to execute multiple SQL query sessions via a scheduled job.
Detailed error:
The error message is: org.postgresql.util.PSQLException:ERROR: Out of memory (seg6 slice5 sungpmsh0:40002 pid=13610)
Detail: VM protect failed to allocate 65584 bytes from system, VM Protect 5835 MB available
What we tried:
After reading the Pivotal support docs, we did some basic troubleshooting.
Test 1: Validated two memory parameters.
Current settings in GPDB:
GPDB vmprotect limit: 8 GB
GPDB statement_mem: set based on the vmprotect limit. From what I have read, it governs the memory used to run a query on a segment.
Test 2: Tuned the SQL queries. What else should I tune here? Please guide.
Based on these sources:
https://discuss.pivotal.io/hc/en-us/articles/201947018-Pivotal-Greenplum-GPDB-Memory-Configuration
https://discuss.pivotal.io/hc/en-us/articles/204268778-What-are-VM-Protect-failed-to-allocate-d-bytes-d-MB-available-error-
But we are still getting the same OOM error.
Do we need to increase the vmprotect limit? If yes, by how much should we increase it?
How do we handle concurrency in GPDB?
How much swap do we need to add when we are already running with 30 GB of RAM?
Currently we have added 15 GB of swap. Is that OK?
What is the query to identify host connections to the Greenplum database?
Thanks in advance

Do we need to increase the vmprotect limit? if Yes, then by which amount should we increase it?
There is a nice calculator on setting gp_vmem_protect_limit on Greenplum.org. The setting depends on how much memory, swap, and segments per host you have.
http://greenplum.org/calc/
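As a rough cross-check of the calculator, the sizing formula given in the Greenplum documentation for hosts with less than 256 GB of RAM can be written out directly. This is a sketch from my recollection of that formula, so verify the constants against the calculator before applying anything. The segment count of 4 in the usage note is purely hypothetical, since the question does not say how many primary segments run per host:

```python
def gp_vmem_gb(ram_gb, swap_gb):
    """Total memory available to Greenplum on one host, in GB.

    Assumed formula: ((SWAP + RAM) - (7.5 GB + 0.05 * RAM)) / 1.7
    """
    return ((swap_gb + ram_gb) - (7.5 + 0.05 * ram_gb)) / 1.7

def gp_vmem_protect_limit_mb(ram_gb, swap_gb, max_acting_primary_segments):
    """Per-segment gp_vmem_protect_limit in MB.

    max_acting_primary_segments is the number of primaries that can be
    active on one host, including mirrors promoted after a failure.
    """
    per_segment_gb = gp_vmem_gb(ram_gb, swap_gb) / max_acting_primary_segments
    return int(per_segment_gb * 1024)
```

For the 30 GB RAM and 15 GB swap mentioned in the question, gp_vmem_gb(30, 15) comes to about 21.2 GB for the whole host, and gp_vmem_protect_limit_mb(30, 15, 4) gives the per-segment value for a hypothetical four acting primaries.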
You can get OOM errors for several reasons:
Bad query
Bad table distribution (skew)
Bad settings (like gp_vmem_protect_limit)
Not enough resources (RAM)
How to handle concurrency at gpdb?
More RAM, fewer segments per host, and workload management to limit the number of concurrent queries running.
How much swap we need to add here when we are already running with 30 GB RAM. currently, we have added 15GB swap here? is that ok ?
Only 30GB of RAM? That is pretty small. You can add more swap but it will slow down the queries compared to real RAM. I wouldn't use much more than 8GB of swap.
I recommend using 256GB of RAM or more especially if you are worried about concurrency.
What is the query to identify host connection with Greenplum database
select * from pg_stat_activity;

Cassandra keeps using 100% of CPU and not utilizing memory?

We have set up a single-node Cassandra 3.11 with JDK 1.8 on EC2, on a t2.large instance, which has 2 CPUs and 7 GB of RAM.
We are facing the issue that Cassandra keeps reaching 100% CPU even though we do not have that much load.
We have 7 GB of RAM, but Cassandra is not utilizing that memory; it only uses 1.7-1.8 GB of RAM.
What configuration needs to change so that CPU utilization does not reach 100%?
What is the best configuration to get better performance out of Cassandra?
Right now we are able to get only about 100-120 read and 50-100 write operations per second.
Could someone please help us understand the issue and how to improve the performance configuration?
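The 1.7-1.8 GB figure may simply be the default JVM heap. When no heap size is set explicitly, cassandra-env.sh derives it as max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). A minimal sketch of that calculation, assuming the node really is running with the default heap settings (the function name is hypothetical):

```python
def default_max_heap_mb(system_memory_mb):
    """Mirror the default heap calculation from cassandra-env.sh.

    Result: max(min(RAM / 2, 1024 MB), min(RAM / 4, 8192 MB)).
    """
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    return max(half, quarter)

# 7 GB of RAM, as stated in the question.
print(default_max_heap_mb(7 * 1024))  # 1792
```

1792 MB is 1.75 GB, which is in line with the usage described above. If that is the cause, the heap size would have to be set explicitly for Cassandra to use more of the RAM.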

Spark streaming on yarn - Container running beyond physical memory limits

I'm running a Spark Streaming application on YARN. It works well for several days, and after that I run into a problem. The error message from YARN is listed below:
Application application_1449727361299_0049 failed 2 times due to AM Container for appattempt_1449727361299_0049_000002 exited with exitCode: -104
For more detailed output, check application tracking page: https://sccsparkdev03:26001/cluster/app/application_1449727361299_0049 Then, click on links to logs of each attempt.
Diagnostics: Container [pid=25317,containerID=container_1449727361299_0049_02_000001] is running beyond physical memory limits. Current usage: 3.5 GB of 3.5 GB physical memory used; 5.3 GB of 8.8 GB virtual memory used. Killing container.
And here is my memory configuration:
spark.driver.memory = 3g
spark.executor.memory = 3g
mapred.child.java.opts -Xms1024M -Xmx3584M
mapreduce.map.java.opts -Xmx2048M
mapreduce.map.memory.mb 4096
mapreduce.reduce.java.opts -Xmx3276M
mapreduce.reduce.memory.mb 4096
This OOM error is strange because I don't keep any data in memory, since it's a streaming program. Has anyone encountered the same problem? Or does anyone know what causes it?

Check the mem on the box/vm instance you're running it on. My guess is the host machine is red lining it.
...due to, it appears, over-allocating memory.
Where do you think the streaming gets executed? Regardless of whether you store anything there? Yup. memory. Not cats or dancing Viking either (add "e").
Guess what? You're allocating 7 GB of memory that is heavily weighted towards physical over virtual mem.
Check your logging, as that would have similar build up time.
What's spark.yarn.am.memory value?
Get your VM and container memory allocation in balance :)
Another thought is to adjust memoryOverhead so that physical and virtual memory are more proportional.
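To make the memoryOverhead point concrete: on YARN, the container Spark requests is the JVM heap plus an off-heap overhead, and YARN kills the container when its total physical usage exceeds that size. The sketch below assumes the commonly documented default overhead of max(384 MB, 10% of the heap); the factor differs between Spark versions, and the 512 MB allocation increment is an assumption, since the question does not state the cluster's scheduler settings:

```python
import math

def spark_container_request_mb(heap_mb, overhead_factor=0.10, min_overhead_mb=384):
    """Heap plus memory overhead, i.e. what Spark asks YARN for.

    overhead_factor and min_overhead_mb are assumed defaults; check the
    documentation for the Spark version in use.
    """
    overhead = max(min_overhead_mb, int(heap_mb * overhead_factor))
    return heap_mb + overhead

def round_up_to_increment(request_mb, increment_mb):
    """YARN rounds each request up to a multiple of its allocation increment."""
    return math.ceil(request_mb / increment_mb) * increment_mb

request = spark_container_request_mb(3072)    # spark.driver.memory = 3g
print(request)                                # 3456
print(round_up_to_increment(request, 512))    # 3584, i.e. 3.5 GB
```

Under these assumptions, a 3g driver gets a 3.5 GB container, which matches the limit in the error message and leaves only a few hundred MB for everything outside the heap. Raising the overhead setting enlarges the container without enlarging the heap.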

GC in Server Mode Not Collecting the Memory

An IIS-hosted WCF service is consuming a large amount of memory, around 18 GB, and the server has slowed down.
I analyzed a mini dump file and it shows only 1 GB of active objects. I understand that the GC is not clearing the memory, and that the GC must be running in server mode on a 64-bit system. Any idea why the whole computer is stalling and the app is taking so much memory?

The GC was running in server mode; it was configured that way for better performance. I understand that server mode GC improves performance because collections are triggered less often when plenty of memory is available, and because server mode has a higher limit on memory usage. The problem here was that when the process reached that limit, the CLR triggered the GC, which tried to clear the whole 18 GB of memory in one shot. It used 90% of system resources, and the rest of the applications lagged.
We tried restarting, but it took forever, so we had to kill the process. Now, with workstation mode, GC is smooth and clean. The only difference is that response time has some delay due to GC after 1.5 GB of allocation.
One more piece of info: .NET 4.5 has a GC revision that resolves this issue.
