OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable - python-3.x

I am facing a problem and am unable to run any program on the cluster. It gives this error:
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
Traceback (most recent call last):
File "hello-world.py", line 1, in <module>
from keras.models import Sequential
File "/home/amalli2s/anaconda3/lib/python3.6/site-packages/keras/__init__.py", line 3, in <module>
from . import utils
File "/home/amalli2s/anaconda3/lib/python3.6/site-packages/keras/utils/__init__.py", line 2, in <module>
from . import np_utils
File "/home/amalli2s/anaconda3/lib/python3.6/site-packages/keras/utils/np_utils.py", line 6, in <module>
import numpy as np
File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
from . import add_newdocs
File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc
File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
from .type_check import *
File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
import numpy.core.numeric as _nx
File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/core/__init__.py", line 16, in <module>
from . import multiarray
SystemError: initialization of multiarray raised unreported exception
I assume this problem is the same as this one: Multiple instances of Python running simultaneously limited to 35
So, according to that solution, when I set
export OPENBLAS_NUM_THREADS=1
then I end up getting the following error:
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
Aborted
Is anybody else facing the same issue, or does anyone have an idea how to solve this? Thank you.
EDIT:
OK, it seems this happened because of some config restrictions the admins were trying to implement. Now it works again.

I had this problem running numpy on an Ubuntu server. I got all of the following errors, depending on whether I tried to import numpy in a shell or run it from my Django app:
PyCapsule_Import could not import module "datetime"
from numpy.core._multiarray_umath import (
OpenBLAS blas_thread_init: pthread_create failed for thread 25 of 32: Resource temporarily unavailable
I'm posting this answer since it drove me crazy. What helped for me was to add:
import os
os.environ['OPENBLAS_NUM_THREADS'] = '1'
before
import numpy as np
I guess the server has some limit on the number of threads it allows(?). Hope it helps someone!
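For illustration, here is a minimal, self-contained sketch of that ordering (the matrix size is arbitrary; the point is only that the environment variable is set before numpy is first imported):
import os
os.environ['OPENBLAS_NUM_THREADS'] = '1'  # must happen before numpy/OpenBLAS loads

import numpy as np  # OpenBLAS now initializes with a single thread

a = np.ones((500, 500))
print(a.dot(a).shape)  # BLAS-backed call runs without spawning extra threads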

This is for others in the future who encounter this error. The cluster setup most likely limits the number of processes that a user can run on an interactive node. The clue is in the second line of the error:
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
Here the limit is set to 64. While this is quite sufficient for normal CLI use, it's probably not sufficient for interactively running Keras jobs (like the OP); or in my case, trying to run an interactive Dask cluster.
It may be possible to increase the limit from your shell using, say, ulimit -u 10000, but that's not guaranteed to work. It's best to notify the admins, as the OP did.
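As an illustrative check (assuming a Linux node, where Python's resource module exposes RLIMIT_NPROC), you can read the same limit that OpenBLAS reports from inside Python:
import resource

# Soft and hard limits on the number of processes/threads this user may create.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print('RLIMIT_NPROC: soft=%s, hard=%s' % (soft, hard))  # compare with the OpenBLAS message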

Often this issue is related to the limit on the number of processes available through ulimit (on Linux):
→ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 127590
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096 # <------------------culprit
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
A temporary solution is to increase this limit:
ulimit -u unlimited
Most servers I've encountered have this value set to the number of pending signals (e.g. ulimit -i). So, in my example above, I did this instead:
ulimit -u 127590
Then I added a line to my ~/.bashrc file to set it on login.
For more info on how to permanently fix this, check out: https://serverfault.com/a/485277
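If you prefer to do this from Python rather than the shell, a rough equivalent of ulimit -u is to raise the soft limit up to the hard limit with the resource module (a sketch; Linux only, and an unprivileged user cannot go beyond the hard limit):
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
if soft < hard:
    # Raise the soft per-user process limit as far as the hard limit allows.
    resource.setrlimit(resource.RLIMIT_NPROC, (hard, hard))
print(resource.getrlimit(resource.RLIMIT_NPROC))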

If you are an administrator, you can:
temporarily change the limit on the number of processes with the command ulimit -u [number]
permanently change the limit on the number of processes, i.e. the nproc parameter in /etc/security/limits.conf
If you are a user, you can:
In bash
$ export OPENBLAS_NUM_THREADS=2
$ export GOTO_NUM_THREADS=2
$ export OMP_NUM_THREADS=2
In Python
>>> import os
>>> os.environ['OPENBLAS_NUM_THREADS'] = '1'
Then the problem caused by multiple threads in Python should be solved. The key is to set the number of threads below the limit that applies to you on the cluster.
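As a hedged sketch of the Python variant, setting all three variables before the numerical libraries are imported (the value 2 is only illustrative; pick something below your cluster's limit):
import os

for var in ('OPENBLAS_NUM_THREADS', 'GOTO_NUM_THREADS', 'OMP_NUM_THREADS'):
    # setdefault keeps any value already exported in the shell.
    os.environ.setdefault(var, '2')

import numpy as np  # the thread pools are sized when this import happens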

Building on Ylor's answer, rather than limiting yourself to a single thread, read through the error outputs (here are the first few lines of mine):
OpenBLAS blas_thread_init: pthread_create failed for thread 13 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 2048 current, 384066 max
OpenBLAS blas_thread_init: pthread_create failed for thread 58 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 2048 current, 384066 max
...
and find the minimum thread number which failed; then set the thread count to one fewer (12 for me here):
>>> import os
>>> os.environ['OPENBLAS_NUM_THREADS'] = '12'
This will maximize your code's ability to use threads while still staying within current system limits (if unable to change).
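If you want to automate that, here is a rough sketch, assuming the OpenBLAS messages have been captured to a hypothetical file named openblas_errors.log (the file name and parsing are illustrative, not part of any library):
import os
import re

failed_threads = []
with open('openblas_errors.log') as fh:
    for line in fh:
        m = re.search(r'pthread_create failed for thread (\d+) of \d+', line)
        if m:
            failed_threads.append(int(m.group(1)))

if failed_threads:
    # One fewer than the smallest thread index that failed to start.
    os.environ['OPENBLAS_NUM_THREADS'] = str(min(failed_threads) - 1)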

Related

Python script on ubuntu - OSError: [Errno 12] Cannot allocate memory

I am running a script on an AWS (Ubuntu) EC2 instance. It's a web scraper that uses selenium/chromedriver and headless chrome to scrape some webpages. I've had this script running previously with no problems, but today I'm getting an error. Here's the script:
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--window-size=1420,1080')
options.add_argument('--headless')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')
options.add_argument("--disable-notifications")
options.binary_location='/usr/bin/chromium-browser'
driver = webdriver.Chrome(chrome_options=options)
#Set base url (SAN FRANCISCO)
base_url = 'https://www.bandsintown.com/en/c/san-francisco-ca?page='
events = []
for i in range(1,90):
    # cycle through pages in range
    driver.get(base_url + str(i))
    pageURL = base_url + str(i)
    print(pageURL)
When I run this script from ubuntu, I get this error:
Traceback (most recent call last):
File "BandsInTown_Scraper_SF.py", line 91, in <module>
driver = webdriver.Chrome(chrome_options=options)
File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 76, in start
stdin=PIPE)
File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
I confirmed that I'm running the same version of Chromedriver/Chromium Browser:
ChromeDriver 79.0.3945.130 (e22de67c28798d98833a7137c0e22876237fc40a-refs/branch-heads/3945#{#1047})
Chromium 79.0.3945.130 Built on Ubuntu , running on Ubuntu 18.04
For what it's worth, I have this running on a mac, and I do have multiple web scraping scripts like this one running on the same EC2 instance (only 2 scripts so far, so not that much).
Update
I'm now getting these errors as well when trying to run this script on ubuntu:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/lib/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 346, in _make_request
self._validate_conn(conn)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 852, in _validate_conn
conn.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 284, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.bandsintown.com', port=443): Max retries exceeded with url: /en/c/san-francisco-ca?page=6 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "BandsInTown_Scraper_SF.py", line 39, in <module>
res = requests.get(url)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.bandsintown.com', port=443): Max retries exceeded with url: /en/c/san-francisco-ca?page=6 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
Finally, here's my current monthly AWS usage, which doesn't show any memory quota being exceeded.
This error message...
restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
...implies that the operating system was unable to allocate memory to initiate/spawn a new session.
Additionally, this error message...
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.bandsintown.com', port=443): Max retries exceeded with url: /en/c/san-francisco-ca?page=6 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
...implies that your program successfully iterated up to page 5, and the error occurs on page 6.
I don't see any issues in your code block as such. I have taken your code, made some minor adjustments and here is the execution result:
Code Block:
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
base_url = 'https://www.bandsintown.com/en/c/san-francisco-ca?page='
for i in range(1,10):
    # cycle through pages in range
    driver.get(base_url + str(i))
    pageURL = base_url + str(i)
    print(pageURL)
Console Output:
https://www.bandsintown.com/en/c/san-francisco-ca?page=1
https://www.bandsintown.com/en/c/san-francisco-ca?page=2
https://www.bandsintown.com/en/c/san-francisco-ca?page=3
https://www.bandsintown.com/en/c/san-francisco-ca?page=4
https://www.bandsintown.com/en/c/san-francisco-ca?page=5
https://www.bandsintown.com/en/c/san-francisco-ca?page=6
https://www.bandsintown.com/en/c/san-francisco-ca?page=7
https://www.bandsintown.com/en/c/san-francisco-ca?page=8
https://www.bandsintown.com/en/c/san-francisco-ca?page=9
Deep dive
This error is coming from subprocess.py:
self.pid = _posixsubprocess.fork_exec(
        args, executable_list,
        close_fds, tuple(sorted(map(int, fds_to_keep))),
        cwd, env_list,
        p2cread, p2cwrite, c2pread, c2pwrite,
        errread, errwrite,
        errpipe_read, errpipe_write,
        restore_signals, start_new_session, preexec_fn)
However, as per the discussion in OSError: [Errno 12] Cannot allocate memory, this error is related to RAM / swap.
Swap Space
Swap space is space on the system hard drive that has been designated as a place for the OS to temporarily store data it can no longer hold in RAM. This gives you the ability to increase the amount of data your programs can keep in their working memory. The swap space on the hard drive is used primarily when there is no longer sufficient space in RAM to hold in-use application data. However, information written to disk is significantly slower to access than information kept in RAM, so the operating system will prefer to keep running application data in memory and use swap space for older data. Deploying swap space as a fallback for when your system's RAM is depleted is a safety measure against out-of-memory issues on systems with non-SSD storage available.
System Check
To check if the system already has some swap space available, you need to execute the following command:
$ sudo swapon --show
If you don’t get any output, that means your system does not have swap space available currently. You can also verify that there is no active swap using the free utility as follows:
$ free -h
If there is no active swap in the system you will see an output as:
Output
total used free shared buff/cache available
Mem: 488M 36M 104M 652K 348M 426M
Swap: 0B 0B 0B
Creating Swap File
In that case you can either allocate a separate partition devoted to swap, or create a swap file that resides on an existing partition. To create a 1-gigabyte file, execute the following command:
$ sudo fallocate -l 1G /swapfile
You can verify that the correct amount of space was reserved by executing the following command:
$ ls -lh /swapfile
#Output
$ -rw-r--r-- 1 root root 1.0G Mar 08 10:30 /swapfile
This confirms the swap file has been created with the correct amount of space set aside.
Enabling the Swap Space
Once a file of the correct size is available, we need to actually turn it into swap space. First you need to lock down the permissions of the file so that only users with specific privileges can read its contents. This prevents unintended users from being able to access the file, which would have significant security implications. So you need to follow the steps below:
Make the file only accessible to specific user e.g. root by executing the following command:
$ sudo chmod 600 /swapfile
Verify the permissions change by executing the following command:
$ ls -lh /swapfile
#Output
-rw------- 1 root root 1.0G Apr 25 11:14 /swapfile
This confirms only the root user has the read and write flags enabled.
Now you need to mark the file as swap space by executing the following command:
$ sudo mkswap /swapfile
#Sample Output
Setting up swapspace version 1, size = 1024 MiB (1073737728 bytes)
no label, UUID=6e965805-2ab9-450f-aed6-577e74089dbf
Next you need to enable the swap file, allowing the system to start utilizing it, by executing the following command:
$ sudo swapon /swapfile
You can verify that the swap is available by executing the following command:
$ sudo swapon --show
#Sample Output
NAME TYPE SIZE USED PRIO
/swapfile file 1024M 0B -1
Finally check the output of the free utility again to validate the settings by executing the following command:
$ free -h
#Sample Output
total used free shared buff/cache available
Mem: 488M 37M 96M 652K 354M 425M
Swap: 1.0G 0B 1.0G
Conclusion
Once the Swap Space has been set up successfully the underlying operating system will begin to use it as necessary.
Probably what has happened is that the Chromium browser was updated and now takes more memory (or perhaps leaks memory worse; you don't say how many URLs it gets through before dying).
As a workaround, launch a larger instance size. You don't say what instance size you are using, but if you have a t3.micro, try a t3.medium instead.
There is an easy-to-understand chart here: https://www.ec2instances.info/?region=eu-west-1
If you have already launched an instance and want to resize it without rebuilding from scratch, use the console to take it to the stopped state, alter the size, and start it again.

Will setting the Linux locked memory limit to unlimited have an adverse effect?

I am running an MPI job on a Linux server. I got this error:
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory. This typically can indicate that the
memlock limits are set too low. For most HPC installations, the
memlock limits should be set to "unlimited". The failure occured
here:
Local host: yw0431
OMPI source: ../../../../../ompi/mca/btl/openib/btl_openib_component.c:1216
Function: ompi_free_list_init_ex_new()
Device: mlx4_0
Memlock limit: 65536
You may need to consult with your system administrator to get this
problem fixed. This FAQ entry on the Open MPI web site may also be
helpful:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: yw0431
Local device: mlx4_0
--------------------------------------------------------------------------
[yw0431:20193] 11 more processes have sent help message help-mpi-btl-openib.txt / init-fail-no-mem
[yw0431:20193] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[yw0431:20193] 11 more processes have sent help message help-mpi-btl-openib.txt / error in device init
forrtl: error (78): process killed (SIGTERM)
It means that my Linux server's locked memory limit is 65536, but my job needs more memory. I think 2G should be enough.
I have found a solution that involves raising the limit:
ulimit -l unlimited
But I am worried that it will cause a system crash or other bad things to happen.
So can I set "ulimit -l unlimited"?
When you set the ulimit to unlimited and your process starts using memory exhaustively, the OOM killer will kill your job to preserve system stability. I would set the ulimit to 80 to 90% of RAM instead of unlimited.
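As a hedged sketch of that suggestion (assuming the standard Linux /proc/meminfo layout), you could compute such a value in Python and then pass it to ulimit -l yourself:
def suggested_memlock_kb(fraction=0.85):
    # MemTotal in /proc/meminfo is reported in kB.
    with open('/proc/meminfo') as fh:
        for line in fh:
            if line.startswith('MemTotal:'):
                total_kb = int(line.split()[1])
                return int(total_kb * fraction)
    raise RuntimeError('MemTotal not found in /proc/meminfo')

print(suggested_memlock_kb())  # e.g. use this number with: ulimit -l <value>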

Hadoop error log jvm sqoop

My mistake: after 6-8 hours of running Java programs I get this log, hs_err_pid6662.log,
and this:
[testuser#apus ~]$ sh /home/progr/work/import.sh
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: Resource temporarily unavailable
The programs run every five minutes and try to import/export from Oracle.
How can I fix this?
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (gcTaskThread.cpp:48), pid=6662,
tid=0x00007f429a675700
#
--------------- T H R E A D ---------------
Current thread (0x00007f4294019000): JavaThread "Unknown thread"
[_thread_in_vm, id=6696, stack(0x00007f429a575000,0x00007f429a676000)]
Stack: [0x00007f429a575000,0x00007f429a676000], sp=0x00007f429a674550,
free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
code)
VM Arguments:
jvm_args: -Xmx1000m -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-5.11.1-
1.cdh5.11.1.p0.4/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -
Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.11.1-
1.cdh5.11.1.p0.4/lib/hadoop -Dhadoop.id.str= -
Dhadoop.root.logger=INFO,console -
Launcher Type: SUN_STANDARD
Environment Variables:
JAVA_HOME=/usr/java/jdk1.8.0_102
# JRE version: (8.0_102-b14) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.102-b14 mixed mode linux-
amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core
dumping, try "ulimit -c unlimited" before starting Java again
Memory: 4k page, physical 24591972k(6051016k free), swap 12369916k(11359436k
free)
I am running Java programs like sqoop-import and sqoop-export every 5 minutes.
example:
#!/bin/bash
hadoop jar /home/progr/import_sqoop/oracle.jar.
CDH version 5.11.1
java version jdk1.8.0_102
OS:Red Hat Enterprise Linux Server release 6.9 (Santiago)
Mem free:
total used free shared buffers cached
Mem: 24591972 20080336 4511636 132036 334456 2825792
-/+ buffers/cache: 16920088 7671884
Swap: 12369916 1008664 11361252
Host Memory Usage
The maximum heap memory is (by default) limited to 1 GB. You need to increase this:
JRE version: (8.0_102-b14) (build )
jvm_args: -Xmx1000m -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-5.11.1-
1.cdh5.11.1.p0.4/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -
Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.11.1-
1.cdh5.11.1.p0.4/lib/hadoop -Dhadoop.id.str= -
Dhadoop.root.logger=INFO,console -
Try the following to increase this to 2048 MB (or higher if required).
export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
Reference:
Pig: Hadoop jobs Fail
https://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201104.mbox/%3C5FFFF0E4-B3BA-420A-ADE3-B422A66E8B11#yahoo-inc.com%3E

apache spark "Py4JError: Answer from Java side is empty"

I get this error every time...
I use sparkling water...
My conf-file:
***"spark.driver.memory 65g
spark.python.worker.memory 65g
spark.master local[*]"***
The amount of data is about 5 GB.
There is no other information about this error...
Does anybody know why this happens? Thank you!
***"ERROR:py4j.java_gateway:Error while sending or receiving.
Traceback (most recent call last):
File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 746, in send_command
raise Py4JError("Answer from Java side is empty")
Py4JError: Answer from Java side is empty
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
self.socket.connect((self.address, self.port))
File "/usr/local/anaconda/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
self.socket.connect((self.address, self.port))
File "/usr/local/anaconda/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
self.socket.connect((self.address, self.port))
File "/usr/local/anaconda/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
Have you tried setting spark.executor.memory and spark.driver.memory in your Spark configuration file?
See https://stackoverflow.com/a/22742982/5453184 for more info.
Usually, you'll see this error when the Java process gets silently killed by the OOM Killer.
The OOM Killer (Out of Memory Killer) is a Linux kernel mechanism that kicks in when the system becomes critically low on memory. It selects a process based on its "badness" score and kills it to reclaim memory.
Read more on OOM Killer here.
Increasing spark.executor.memory and/or spark.driver.memory values will only make things worse in this case, i.e. you may want to do the opposite!
Other options would be to:
increase the number of partitions if you're working with very big data sources (see the sketch after this list);
increase the number of worker nodes;
add more physical memory to worker/driver nodes;
Or, if you're running your driver/workers using docker:
increase docker memory limit;
set --oom-kill-disable on your containers, but make sure you understand possible consequences!
Read more on --oom-kill-disable and other docker memory settings here.
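As an illustration of the first option above (increasing the number of partitions), here is a minimal PySpark sketch; the input path and partition count are hypothetical:
from pyspark import SparkConf, SparkContext

sc = SparkContext(conf=SparkConf().setAppName('repartition-example'))
rdd = sc.textFile('hdfs:///path/to/big/dataset')  # hypothetical input
# More, smaller partitions lower the peak memory each task needs at once.
rdd = rdd.repartition(400)
print(rdd.getNumPartitions())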
Another point to note if you are using PySpark on WSL 2: ensure that your WSL 2 config file has increased memory.
# Settings apply across all Linux distros running on WSL 2
[wsl2]
# Limits VM memory to use no more than 4 GB, this can be set as whole numbers using GB or MB
memory=12GB # This was originally set to 3GB, which caused failures for me since spark.executor.memory and spark.driver.memory could only reach a max of 3GB regardless of how high I set them.
# Sets the VM to use eight virtual processors
processors=8
For reference, your .wslconfig config file should be located in C:\Users\USERNAME

error with freebsd uwsgi

I have an error with uWSGI.
When I start with my config (uwsgi bottle.ini), I get:
!!! no internal routing support, rebuild with pcre support !!!
setgid() to 80
setuid() to 80
your processes number limit is 5547
your memory page size is 4096 bytes
detected max file descriptor number: 58982
lock engine: ipcsem
uwsgi_lock_ipcsem_init()/semget(): No space left on device [core/lock.c line 507]
uwsgi_ipcsem_clear()/semctl(): Invalid argument [core/lock.c line 631]
my bottle.ini
[uwsgi]
socket = 185.21.214.275:80
chdir = /usr/local/www/myapp/
virtualenv = /usr/local/www/mypython
master = true
wsgi-file = /usr/local/www/myapp/app.py
uid = www
gid = www
I have reinstalled uwsgi and pcre, but the problem still appears.
It is explained here: http://uwsgi-docs.readthedocs.org/en/latest/ThingsToKnow.html
On OpenBSD, NetBSD and FreeBSD < 9, SysV IPC semaphores are used as the locking subsystem. These operating systems tend to limit the number of allocable semaphores to fairly small values. You should raise the default limits if you plan to run more than one uWSGI instance. FreeBSD 9 has POSIX semaphores, so you do not need to bother with that.
