Odoo timeout killing cron - linux

I found in the logs that a timeout set to 120s is killing cron workers.
The first issue I noticed is that the plugin which makes database backups gets stuck in a loop and creates zip after zip, so in 1-2 hours the disk is full.
The second issue is the scheduled action called Mass Mailing: Process queue in Odoo.
It should run every 60 minutes, but it gets killed by the timeout and immediately runs again after being killed.
Where should I look for this timeout? I have already raised all the timeouts in odoo.conf to 500 seconds.
Odoo v12 Community, Ubuntu 18, nginx
2019-12-02 06:43:04,711 4493 ERROR ? odoo.service.server: WorkerCron (4518) timeout after 120s
2019-12-02 06:43:04,720 4493 ERROR ? odoo.service.server: WorkerCron (4518) timeout after 120s

The following timeouts, which you can find in odoo.conf, are usually the ones responsible for the behaviour you are experiencing (in particular the second one).
limit_time_cpu = 60
limit_time_real = 120
Some more explanations on Odoo documentation : https://www.odoo.com/documentation/12.0/reference/cmdline.html#multiprocessing
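For example, raising both limits in odoo.conf could look like the sketch below (the values are only illustrative; pick whatever fits your longest cron job, and restart the Odoo service so the workers pick up the change):
limit_time_cpu = 600
limit_time_real = 1200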

Related

Cypress UI tests throwing time out for waiting for browser

I am running Cypress UI tests in Azure DevOps CI/CD, and somehow most of the UI tests are failing. All of the tests were running fine a few days ago.
It is throwing a "Timed out waiting for the browser to connect. Retrying." error. Any advice on how to resolve the issue?
Environment Details:
Cypress version: 3.4.1,
Node: 10.x,
Azure DevOps CI/CD
Running: report/send-report.spec.js... (12 of 14)
2019-10-10T00:47:31.0294852Z
2019-10-10T00:47:31.0295427Z Warning: Cypress can only record videos when using the built in 'electron' browser.
2019-10-10T00:47:31.0295707Z
2019-10-10T00:47:31.0296579Z You have set the browser to: 'chrome'
2019-10-10T00:47:31.0296837Z
2019-10-10T00:47:31.0297613Z A video will not be recorded when using this browser.
2019-10-10T00:47:31.0313740Z (node:4030) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 end listeners added. Use emitter.setMaxListeners() to increase limit
2019-10-10T00:48:01.0316223Z
2019-10-10T00:48:01.0592004Z Timed out waiting for the browser to connect. Retrying...
2019-10-10T00:48:31.0587550Z
2019-10-10T00:48:31.0839142Z Timed out waiting for the browser to connect. Retrying again...
2019-10-10T00:49:01.0877330Z
2019-10-10T00:49:01.1241198Z The browser never connected. Something is wrong. The tests cannot run. Aborting...
I noticed that you have set the retries value to 2 so that a failed test is retried immediately instead of moving on to the next test. I recommend changing that value and checking whether the error still occurs.
You can also try another workaround: lower numTestsKeptInMemory from the default of 50 to something small like 1 or 0. Here is the official documentation: https://docs.cypress.io/guides/references/configuration.html#Global
In addition, it seems to be an intermittent error: some users fail on the first pipeline run but succeed on the second. It is probably a problem with Cypress itself or with your system's memory, so you can report it to Cypress directly.
Here is the Cypress issue tracker: https://github.com/cypress-io/cypress/issues/
And here is an issue with the same error message: https://github.com/cypress-io/cypress/issues/1305
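A minimal sketch of that change in cypress.json at the project root (the value 0 is only illustrative):
{
  "numTestsKeptInMemory": 0
}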

Zappa / Async AWS Lambda Function times out in 30s

I have a Python 3.6 Flask application deployed onto AWS Lambda using Zappa, in which I have an asynchronous task execution function defined using @task as discussed here.
However, I find that the function call still times out at 30 seconds, as opposed to the 5-minute timeout that AWS Lambda enforces for non-API calls. I even checked the timeout in my Lambda settings and it is set to 5 minutes.
The way I discovered this is that the Lambda's debug output started repeating without a request - something that happens because the Lambda is called 2 more times after either an error or a timeout (as per the AWS Lambda documentation).
Can anyone help me get this resolved?
[EDIT: The Lambda function is also not part of any VPC and is set to be accessible from the internet.]
Here are the logs below. Basically, the countdown is a sleep timer counting to 20 seconds, followed by a @task call to application.reviv_assign_responder, but as we see, there is no output past 'NEAREST RESPONDER' and the countdown starts again, indicating that the function has timed out and has been called again by (AWS') design.
Log output in Pastebin : https://pastebin.com/VEbdCALg
Second incident - https://pastebin.com/ScNhbMcn
As we can see in the second log, it clearly states:
[1515842321866] wait_one_and_notify : 30 : 26
[1515842322867] wait_one_and_notify : 30 : 27
[1515842323868] wait_one_and_notify : 30 : 28
[1515842324865] 2018-01-13T11:18:44.865Z 72a8d34a-f853-11e7-ac2f-dd12a3d35bcb Task timed out after 30.03 seconds
You can check the default settings that Zappa applies to all your Lambda functions here; you will see that by default timeout_seconds is set to 30 seconds. This overrides the default Lambda setting in the AWS Console, which is 3 seconds (you can check this limit in the AWS Lambda FAQ).
For your @task you must increase timeout_seconds in your zappa_settings.(json|yaml) file and redeploy. You can set it to 5 minutes (5*60 == 300 seconds), but the increase applies to all the functions defined in your virtualenv and deployed with Zappa.
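A minimal sketch of the relevant zappa_settings.json entry, assuming a stage named dev (the app_function path is a placeholder for your own Flask entry point):
{
    "dev": {
        "app_function": "application.app",
        "timeout_seconds": 300
    }
}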
You can check more details in this issue in the Zappa repo.
The timeout_seconds parameter in Zappa is misleading. It does limit the timeout of the Lambda function, but the requests are served through CloudFront, which has a default timeout of 30 seconds. To verify this, try lowering timeout_seconds to 20 - it will correctly time out in 20 seconds. Past 30 seconds, however, the setting has no effect because of the CloudFront limitation.
The default timeout is 30 seconds. You can change the value to be from 4 to 60 seconds. If you need a timeout value outside that range, request a change to the limit.
In other words, there is nothing you can do in either Zappa or Lambda to fix this, because the problem lies elsewhere (CloudFront).
I haven't tried it myself, but you might be able to raise the limit by creating the CloudFront distribution in front of Lambda yourself, though it seems you are still limited to a maximum of 60s (unless you request more through AWS support, as indicated in the previous link).

Consumer disappears from queue after 30-40 mins

My app just disappears from the list of consumers in RabbitMQ Admin after working just fine for like 30-40 mins. AMQP lib used: node-amqp. Here's the connection:
const con = amqp.createConnection(options,{defaultExchangeName: 'amq.topic', reconnect: true})
The following event handlers are configured too: connect, ready, close, tag.change, error
The worst part is that I don't get any error or close events; the app just disconnects and logs nothing...
It seems the connection is terminated for being 'idle' for a while...
Has anyone had something similar? How did you deal with it?
Perhaps this helps someone. To resolve the issue, we have to add the heartbeat field to the options and specify the interval, in seconds, at which the connection should be checked and refreshed.
The heartbeat doesn't have a default value, so if it is not explicitly added, amqp won't use it.
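A minimal sketch of the connection from the question with a heartbeat added (the host and the 30-second interval are assumptions for illustration; node-amqp takes the value in seconds):
const amqp = require('amqp');
// heartbeat tells the broker to probe the connection so an idle consumer is not dropped
const options = { host: 'localhost', port: 5672, heartbeat: 30 };
const con = amqp.createConnection(options, { defaultExchangeName: 'amq.topic', reconnect: true });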

Application deployed in weblogic instance is running Slow

My application deployed in a WebLogic instance sometimes gets very slow. When that happens, it hits an error related to Stuck Thread Time in the managed server log. When I first noticed this, I did some research and increased the Max Stuck Thread Time value from 600 to 800 seconds, but this didn't fix the issue. I got the following error again.
WatchRule: (SEVERITY = 'Error') AND ((MSGID = 'WL-000337') OR (MSGID = 'BEA-000337'))
WatchData: MESSAGE = [STUCK] ExecuteThread: '58' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "812" seconds working on the request "Http Request Information: weblogic.servlet.internal.ServletRequestImpl@42f38088[POST /****/faces/index.jsf]
", which is more than the configured time (StuckThreadMaxTime) of "800" seconds in "server-failure-trigger". Stack trace:
oracle.jbo.pcoll.PCollNode.objectAt(PCollNode.java:1753)
oracle.jbo.pcoll.PCollNode.objectAt(PCollNode.java:1753)
oracle.jbo.pcoll.PCollection.elementAt(PCollection.java:839)
oracle.jbo.server.QueryCollection.get(QueryCollection.java:2556)
oracle.jbo.server.ViewRowSetImpl.getRow(ViewRowSetImpl.java:5540)
oracle.jbo.server.ViewRowSetIteratorImpl.getRangeIndexOf(ViewRowSetIteratorImpl.java:1179)
oracle.jbo.server.ViewRowSetIteratorImpl.notifyRowUpdated(ViewRowSetIteratorImpl.java:3491)
We are using:
ADF 12c for application development
WebLogic version: 12.2.1 on Windows Server
Database: Oracle 11g
JDK version: 1.8-65
Can anyone please advise me on the reason and possible solution for this issue?
Thanks in advance.
Increasing the stuck thread time will not resolve your issue. From the stack trace it looks like your application may be reading some file. There can be different reasons for the slowness:
1) Maybe the file you are reading is very large
2) Maybe the network is slow
3) Maybe there is I/O latency
If possible, introduce some debug messages in your application. This will provide details on which step your application is spending its time in.
If you are running your application in a Linux/Unix environment, you can monitor process activity under the /proc directory.

"Server x timed out" during MongoDB aggregation

I have a script that periodically runs aggregation on a mongodb collection. As the dataset has grown, the amount of time it takes to aggregate has also grown. My aggregation script has recently stopped working consistently, and the error logs show:
error: { [MongoError: server <x> timed out]
name: 'MongoError',
message: 'server <x> timed out' }
I've tried debugging this, and the only pattern I can find is that the timeout seems to occur only when the aggregation takes longer than 2 minutes (it times out right around the 2-minute mark). Does anyone have additional debugging tips for this? The 2-minute pattern gives me the impression that I just need to configure some timeout somewhere, but I can't figure out where, or whether I'm just falling into a red-herring trap.
About the system configuration: This aggregation script is a node.js (v5.9.1) application running in an alpine-based docker (v1.9.1) container. It uses the mongodb node driver (v2.1.19). Single mongodb server (though this is also happening in a separate environment with a replSet) running mongod (v3.2.6)
I got the same problem with a log aggregation job. I think I have the solution for you.
I found that the socketTimeoutMS option is responsible for this.
Check the default socketTimeoutMS value in your mongo_client.js. For me it was 2 minutes (mongodb module version 2.1.18).
So just add this option to your URL:
mongodb://localhost:27017/test?maxPoolSize=2&socketTimeoutMS=600000
It will set the timeout to 10 minutes. That did the trick for me.
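A minimal sketch of wiring that URL into the 2.x node driver mentioned in the question (the database name, collection, and pipeline are placeholders):
const MongoClient = require('mongodb').MongoClient;
// socketTimeoutMS=600000 gives the aggregation up to 10 minutes before the driver drops the socket
const url = 'mongodb://localhost:27017/test?maxPoolSize=2&socketTimeoutMS=600000';
MongoClient.connect(url, function (err, db) {
  if (err) throw err;
  db.collection('events')
    .aggregate([{ $group: { _id: '$type', count: { $sum: 1 } } }])
    .toArray(function (err, results) {
      if (err) throw err;
      console.log(results);
      db.close();
    });
});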
