Python aiohttp has been receiving SSL transport errors - python-3.x

We have an application running that relies heavily on asyncio.
It sends hundreds of GET requests per minute, mostly to the same host but with different URLs.
For about three weeks now, we have observed the following issues:
The process gets stuck, often for up to (exactly) 2400 seconds.
We see the following error in the logging:
2018-12-07T23:37:33Z ERROR base_events.py: Fatal error on SSL transport protocol:
File "/usr/lib64/python3.6/asyncio/sslproto.py", line 638, in _process_write_backlog ssldata, offset = self._sslpipe.feed_appdata(data, offset)
Python version: 3.6.3
aiohttp version: 3.4.4
Question 1: Does anyone know what is going on here? And how can we get rid of those nasty periods of the process getting stuck ... ? (Or how to debug?)
Question 2: Can this be related?: https://bugs.python.org/issue29406
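One mitigation for the long stuck periods (not a root-cause fix) is to put an explicit timeout on every request so a hung SSL transport cannot block for 2400 seconds. A minimal sketch, assuming a plain aiohttp.ClientSession; the URL list and the 60-second limit are placeholders:
import asyncio
import aiohttp

async def fetch_all(urls):
    # total=60 caps each request at 60 seconds instead of letting it hang indefinitely
    timeout = aiohttp.ClientTimeout(total=60)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        for url in urls:
            try:
                async with session.get(url) as resp:
                    await resp.text()
            except asyncio.TimeoutError:
                print("timed out:", url)

asyncio.get_event_loop().run_until_complete(fetch_all(["https://example.com/"]))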

Related

how to use pythonping with executor

I'm trying to capture ICMP ping results using pythonping on my Raspberry Pi.
The documentation says it needs to be run as root (undesirable), or that I can work around this and run as user pi by using the executor.Communicator function.
I cannot find a working example of how to do this.
My test code is simple
from executor import execute
from pythonping import ping
# get average of 10 pings to host
# gives permission error
#ping("1.1.1.1",size=40,count=10)
# test of executor: capture result to variable
c=execute("hostname",capture=True)
print(c)
Somehow I need to use executor as a wrapper around the ping request to get around needing to be root.
I'd love for someone to show me a working example of how to do this.
pythonping is exactly what I want because I can tell it to give me the average of 10 pings to a host.
I resolved this by using the icmplib library instead because it fully manages the exchanges and the structure of ICMP packets.
from icmplib import ping
# icmplib builds and parses the ICMP packets itself; privileged=False avoids needing root
host = ping("1.1.1.1", count=4, privileged=False)
print(host.avg_rtt)
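For the original requirement of averaging 10 pings, the same call works with count=10; a small sketch using fields documented on icmplib's Host object (is_alive, packets_received, avg_rtt):
from icmplib import ping

host = ping("1.1.1.1", count=10, privileged=False)
if host.is_alive:
    # avg_rtt is the mean round-trip time in milliseconds
    print(host.packets_received, "replies, average", host.avg_rtt, "ms")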

Python too many subprocesses?

I'm trying to start a lot of Python processes on a single machine.
Here is a code snippet:
import subprocess
fout = open(path, 'w')  # each worker writes its stdout to its own file
p = subprocess.Popen((python_path, module_name), stdout=fout, bufsize=-1)
After about 100 processes I'm getting the error below:
Running on Windows 10 64-bit, Python 3.5. Any idea what that might be? I already tried splitting the start across two scripts, as well as adding a sleep command; after a certain number of processes the error still shows up. Thanks a lot for any hint!
PS:
Some background. Each process opens database connections as well as does some requests using the requests package. Then some calculations are done using numpy, scipy etc.
PPS: Just discovered this error message (raised when calling scipy):
DLL load failed: The paging file is too small for this operation to complete
The issue was solved by reinstalling numpy and scipy and installing MKL.
The strange thing about this error was that it only appeared after a certain number of processes. Would love to hear if anybody knows why this happened!
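A likely explanation is that every worker loads numpy/scipy/MKL and reserves a large block of committed memory, so launching them all at once exhausts the Windows page file. One possible mitigation, sketched here with jobs, python_path and the output paths as placeholders, is to cap how many workers run at the same time:
import subprocess

MAX_CONCURRENT = 16  # tune to the machine's RAM and page-file size
running = []
for module_name, out_path in jobs:  # jobs: hypothetical list of (module, output file) pairs
    fout = open(out_path, 'w')
    running.append(subprocess.Popen((python_path, module_name), stdout=fout, bufsize=-1))
    if len(running) >= MAX_CONCURRENT:
        running.pop(0).wait()  # wait for the oldest worker before launching another

for p in running:
    p.wait()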

OpenSIPS suddenly crashes after two to three days of running

I am using OpenSIPS; it works fine, but after 2-3 days it suddenly crashes. I don't understand the following log:
CRITICAL:core:receive_fd: EOF on 17
INFO:core:handle_sigs: child process 14090 exited by a signal 11
INFO:core:handle_sigs: core was generated
INFO:core:handle_sigs: terminating due to SIGCHLD
INFO:core:sig_usr: signal 15 received
How can I investigate what exactly is going wrong with my OpenSIPS? I am using Ubuntu; should I change to CentOS or Debian? Or does the above log indicate the error? Any idea?
The log isn't telling you anything other than that it's crashed. The question is why.
If you run the same version & config on a different environment you'll probably have the same issues.
The time dependence of the crashes would suggest it's crashing when a specific race condition is met. This could be a call coming in with an invalid Caller ID you're trying to parse as an int, a routing block that's seldom called being called, a resource limitation on the system, or something totally different.
This is a pretty generic crash message, so without more debugging it's just guesswork. Let's enable debugging:
Debugging is switched on at the start of the OpenSIPS config file; here's how the default config looks (assuming you've built off the standard template):
####### Global Parameters #########
log_level=3
log_stderror=no
log_facility=LOG_LOCAL0
children=4
/* uncomment the following lines to enable debugging */
#debug_mode=yes
If you change yours to:
####### Global Parameters #########
log_level=8
log_stderror=yes
log_facility=LOG_LOCAL0
children=4
/* uncomment the following lines to enable debugging */
debug_mode=yes
You'll have debugging features enabled and a whole lot more info available in syslog.
Once you've done that, sit back and wait the two days until it crashes, and you'll have an answer as to which module / routing block / packet is causing your instance to crash.
After that you can post the output here along with your config file, but there's a pretty high chance that someone on the OpenSIPS or Kamailio mailing lists will have had the same issue before.

Connect exception in Gatling - what does this mean?

I ran the config below in Gatling from my local machine to verify 20K requests per second...
scn
.inject(
atOnceUsers(20000)
)
It gave the errors below in the reports... What does this mean in Gatling?
j.n.ConnectException: Can't assign requested address: /xx.xx.xx:xxxx     3648    83.881 %
j.n.ConnectException: connection timed out: /xx.xx.xx:xxxx                416     9.565 %
status.find.is(200), but actually found 500                               201     4.622 %
j.u.c.TimeoutException: Request timeout to not-connected after 60000ms     84     1.931 %
Are these timeouts happening because the server is not processing the requests, or because the requests are not going out from my local machine?
Most probably yes, that's the reason.
It seems your simulation compiled successfully and started.
If you look at the error messages you will see percentages after each line (83.881 %, 9.565 %, 1.931 %). This means the requests were actually generated and sent, and some of them failed. The percentages are calculated against the total number of failures.
If some of the requests succeed and you still get these errors, then Gatling did its job: it stress-tested your application.
Try to simulate with a lower number of users, for example:
scn.inject(
  rampUsers(20) over (10 seconds)
)
If that works, then your application is definitely not capable of handling 20000 requests at once.
For more info on how to set up a simulation, see the Gatling documentation on simulation setup.

TensorBoard debugger and gRPC max message size between client and server

I'm trying to debug a TensorFlow model with the new TensorBoard debugger GUI. The communication between the client (TensorFlow side) and the server (TensorBoard side) fails with the following message:
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.RESOURCE_EXHAUSTED, Received message larger than max (11373336 vs. 4194304))>
Apparently the issue is well known in general and there are tricks to modify the max message size in gRPC. However, in TensorFlow this is transparent to the user, given that I am using the tf_debug.TensorBoardDebugWrapperSession wrapper.
My question is how to increase the max message size so I can debug my model. I am using TensorFlow 1.6 with Python 3.6.
Thank you!
Can you try creating TensorBoardDebugWrapperSession or TensorBoardDebugHook with the keyword argument send_source=False as a workaround? The root cause is that large source files and/or a large number of source files cause the gRPC message to exceed the 4 MB message size limit.
The issue will be fixed in the next release of TensorFlow and TensorBoard.
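For reference, the suggested workaround would look roughly like this in the client code; a minimal sketch assuming a plain tf.Session, with the debugger address "localhost:6064" as a placeholder and the keyword argument named exactly as in the answer above:
import tensorflow as tf
from tensorflow.python import debug as tf_debug

sess = tf.Session()
# Wrap the session so it streams to the TensorBoard debugger plugin.
# send_source=False (as suggested above) avoids shipping source files,
# keeping the gRPC messages under the default 4 MB limit.
sess = tf_debug.TensorBoardDebugWrapperSession(sess, "localhost:6064", send_source=False)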
