Torque: pbs_server "No such file or directory (2) in recov_attr, read2"

I'm trying to start the pbs_server daemon and I get this message:
05/30/2016 10:26:57;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::No such file or directory (2) in recov_attr, read2
05/30/2016 10:26:57;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::que_recov, recov_attr[common] failed
The pbs_server daemon appears to be running:
root 4670 1 0 10:27 ? 00:00:40 /usr/local/sbin/pbs_server -d /var/torque
but jobs are stuck.
Do you have any idea about the problem?
Thank you very much for the help
Vince

It looks like you might have a problem with your queues, possibly file corruption. I would check the files in server_priv/queues. If you have a new enough version, the files will be in XML format, in which case you can manually revise them. If you have doubts about any of them, you can just move them out of the way, but if you have jobs running in any queue the server cannot find when it loads up, you may lose those jobs.
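If it helps, here is a minimal sketch for inspecting and setting aside a suspect queue file, assuming the server directory from the question (/var/torque) and a hypothetical queue named batch:
# stop the server before touching server_priv
qterm -t quick
# newer Torque versions store the queue definitions as XML
ls -l /var/torque/server_priv/queues
cat /var/torque/server_priv/queues/batch
# move a corrupted-looking file out of the way, then restart
mv /var/torque/server_priv/queues/batch /root/queue-batch.bak
pbs_server -d /var/torque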

Related

Snakemake cannot write metadata

I'm having trouble getting snakemake-minimal=7.8.5 to run on Windows 10. I can execute rules, but Snakemake terminates due to an error regarding the metadata:
Failed to set marker file for job started ([Errno 2] No such file or directory: 'C:\\test\\project\\.snakemake\\incomplete\\cnVucy9leHBlcmltZW50XzAzL2RmX2ludGVuc2l0aWVzX3Byb3RlaW5Hcm91cHNfbG9uZ18yMDE3XzIwMThfMjAxOV8yMDIwX04wNTAxNV9NMDQ1NDcvUV9FeGFjdGl2ZV9IRl9YX09yYml0cmFwX0V4YWN0aXZlX1Nlcmllc19zbG90XyM2MDcwLzE0X2V4cGVyaW1lbnRfMDNfZGF0YS5pcHluYg=='). Snakemake will work, but cannot ensure that output files are complete in case of a kill signal or power loss. Please ensure write permissions for the directory C:\test\project\.snakemake
I tried to troubleshoot by doing the following:
Changing the folder: I tried my Documents folder, my user folder, and (as in the path above) the root of my C: drive.
Checking the security settings: Controlled Folder Access (ransomware protection) is deactivated.
If I delete the .snakemake directory, it is re-created upon execution, so I assume I have write access. However, some setting seems to disallow the long filename with the hash.
I tried the same workflow on a different Windows 10 machine and there I don't get the error, so I assume it is some Windows issue.
Did anyone encounter the same error and found a solution?
I agree it is due to the length of the path. The default maximum path length on Windows is 260 characters (MAX_PATH), and the path you pasted is 262 characters long. You can edit the registry to allow longer paths. Also consider opening an issue in Snakemake to improve the documentation or otherwise address this issue for Windows machines.
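On Windows 10 version 1607 and later you can lift this limit by setting the LongPathsEnabled registry value; a minimal sketch, run from an elevated command prompt (applications also have to be long-path aware, which recent Python builds are):
reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" /v LongPathsEnabled /t REG_DWORD /d 1 /f
Then retry the workflow in a fresh terminal.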

How can I prune executors' logs in spark streaming

I'm working on a Spark Streaming job which runs in standalone mode. By default the executors append their logs to the $SPARK_HOME/work/app_idxxxx/stderr and stdout files. The problem comes when the app runs for a long time, say a month or more, and generates a lot of logs inside the stderr file. I would like to roll the stderr over daily, keep a week's worth, and archive (delete) anything older. I changed log4j.properties to use org.apache.log4j.RollingFileAppender and directed the logs to a file instead of stderr, but the file doesn't respect the rolling and keeps growing.
Creating a cron job to do that doesn't work either, since Spark keeps a handle to that specific file, so renaming it probably won't work.
I couldn't find any documentation for these specific logs. I'd really appreciate any help.
After digging more, I finally found how to resolve the issue, and I post it here so that the next person doesn't have to go through all this suffering and trial and error.
The settings for those logs are in two different places. First, in $SPARK_HOME/conf/spark-defaults.conf on each node, add these three lines:
spark.executor.logs.rolling.time.interval daily
spark.executor.logs.rolling.strategy time
spark.executor.logs.rolling.maxRetainedFiles 7
The other file you need to change on each worker node is $SPARK_HOME/conf/spark-env.sh; add the following:
SPARK_WORKER_OPTS="$SPARK_WORKER_OPTS -Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=1800
-Dspark.worker.cleanup.appDataTtl=864000
-Dspark.executor.logs.rolling.strategy=time
-Dspark.executor.logs.rolling.time.interval=daily
-Dspark.executor.logs.rolling.maxRetainedFiles=7 "
export SPARK_WORKER_OPTS
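Note that spark-env.sh is only read when the worker daemons start, so the standalone workers have to be restarted for the change to take effect; a sketch using the sbin scripts shipped with Spark at the time (newer releases rename them to stop-workers.sh / start-workers.sh):
$SPARK_HOME/sbin/stop-slaves.sh
$SPARK_HOME/sbin/start-slaves.sh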
After these changes it started working properly. Hope this helps some people :)
If you are in standalone mode, just exporting an environment variable is enough:
export SPARK_WORKER_OPTS="-Dspark.executor.logs.rolling.strategy=time -Dspark.executor.logs.rolling.time.interval=daily -Dspark.executor.logs.rolling.maxRetainedFiles=7"
You can also refer to: http://apache-spark-user-list.1001560.n3.nabble.com/Executor-Log-Rotation-Is-Not-Working-td18024.html

Configure Logstash to wait before parsing a file

I wonder if you can configure logstash in the following way:
Background Info:
Every day I get an XML file pushed to my server, which should be parsed.
To indicate that the file transfer is complete, I afterwards get an empty .ctl (custom) file transferred to the same folder.
Both files follow the name schema 'feedback_{year}{yearday}_UTC{hoursminutesseconds}_51.{extension}' (e.g. feedback_16002_UTC235953_51.xml), so they share the same file name, but one has the .xml extension and the other .ctl.
Question:
Is there a way to configure Logstash to wait to parse the XML file until the corresponding .ctl file is present?
EDIT:
Is there maybe a way to achieve that with Filebeat?
EDIT2:
It would also be enough to configure Logstash so that it waits x minutes before starting to process a new file, if that is easier.
Thanks for any help in advance
Your problem is that you don't want to start the parser before the file transfer has completed. So why not push the data to a separate file (file-complete.xml) once you find your flag file (empty.ctl)?
Here is the possible logic as a small shell script, run via crontab:
if [ -e empty.ctl ]; then
    : > file-complete.xml               # clear file-complete.xml
    cat file.xml >> file-complete.xml   # add the content of file.xml
    rm empty.ctl
fi
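A matching crontab entry could then run the script every few minutes; both the path and the interval below are placeholders:
*/5 * * * * /usr/local/bin/merge-feedback.sh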
This way, you'd only need to parse the data from file-complete.xml. I think it is simpler to debug and configure.
Hope it helps,

How to retrieve files generated in the past 120 minutes in Linux and move them to another location

For one of my projects I have a challenge where I need to take all the reports generated in a certain path, and I want this to be an automated process on Linux. I know how to get the names of files that have been updated in the past 120 minutes, but not the files themselves. My requirements are:
1. Take the files that have been updated in the past 120 minutes from the path
/source/folder/which/contains/files
2. Run some business logic on these generated files (which I can take care of)
3. Move these files to
/destination/folder/where/files/should/go
I know how to achieve #2 and #3 but am not sure about #1. Can someone help me achieve this?
Thanks in advance.
Write a shell script. Sample below. I haven't provided the commands to get the actual list of file names as you said you know how to do that.
#!/bin/sh
files=<my file list>
for file in $files; do
    cp "$file" <destination_directory>
done
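For step #1 specifically, the usual tool is find with the -mmin test; a sketch assuming the source path from the question and that only regular files are wanted:
# list regular files modified within the last 120 minutes
files=$(find /source/folder/which/contains/files -type f -mmin -120)
Note that this breaks on file names containing spaces; in that case, let find do the copying directly with -exec cp {} /destination/folder/where/files/should/go \; instead.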

Maximum execution time of 300 seconds exceeded in

I know this subject has many duplicates here and there, but trust me, I've spent time reading those posts and my question remains unanswered.
I'm running a PHP script on Debian Linux / Nginx / PHP-FPM / APC, I think that's all.
I'm executing the script from my SSH terminal (CLI) like: php plzrunthisFscript.php &
It used to work flawlessly but now, it returns this famous error.
What I've tried ALREADY, and it failed (I mean it didn't change anything):
Checked which php.ini was used with a phpinfo() inside my script (it's /etc/php5/cli/php.ini).
max_execution_time is hardcoded to 0 (meaning no limit) for the CLI, if I'm not mistaken.
Tried to add set_time_limit(0); at the beginning of my script.
Tried to add ini_set('max_execution_time', -1);
Tried to execute my php command with the parameter -d max_execution_time=0.
Tried to execute from a web interface (served by Nginx).
Tried to execute from another web interface (served by Apache 2 this time); it gives the same error on the page.
Tried setting max_execution_time = 15 in /etc/php5/cli/php.ini to check whether the php.ini that is supposed to be used (according to the phpinfo() inside my script) is honored or ignored: it's ignored.
Every time, it produces this error, and sometimes even after more than 300 seconds I think, which is really confusing.
If someone has any idea on how to fix this, or has some things I can try, please advise.
Thank you for your time.
God, I finally solved it!
I think this could be considered a Magento bug, in a way.
Here is how to reproduce this:
You have Magento (my version is CE 1.7) running with Apache 2.
One day, you decide to get rid of Apache and try to adopt Nginx.
You configure everything and it's working great, but one day you end up with this error while trying to rebuild your indexes as you often do.
Thing is, when you run (for example): php indexer.php --reindex catalog_url &
This script includes another one called abstract.php which contains this awesome function :
protected function _applyPhpVariables()
This function looks for a .htaccess file in your Magento root directory, parses every single configuration parameter in it, and then executes the indexer script with those parameters.
How clever ...
Anyway, to solve this you just need to (delete / rename / burn) this .htaccess file, and then everything will go back to normal.
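For example, renaming instead of deleting keeps a backup (the Magento root path here is just a placeholder):
mv /var/www/magento/.htaccess /var/www/magento/.htaccess.disabled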
Thanks for your help everyone.
