I've been troubleshooting this issue for about a week and have gotten nowhere, so I wanted to reach out for some help.
I have a Perl script that I execute via the command line, usually along the lines of
nohup ./script.pl --param arg --param2 arg2 &
I usually have about ten of these running at once to process the same type of data from different sources (the source is specified through the parameters). The script works fine: I can see logs for everything in nohup.out and monitor status via the ps output. The script also uses a SQL database to track the status of various tasks, so I can tell when a given source finishes.
However, that was too much work, so I wrote a wrapper script to execute the script automatically and that is where I am running into problems. I want something exactly the same as I have, but automatic.
The getwork.pl script runs ps and parses the output to find out how many of these processes are already running; if the count is below the configured threshold, it queries the database for the most out-of-date source and kicks off the script.
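In shell terms, the check and launch amount to roughly this (a simplified illustration, not my actual code; the arguments and the threshold of ten are placeholders):
RUNNING=$(pgrep -fc 'script.pl')     # count how many copies are already running
if [ "$RUNNING" -lt 10 ]; then       # 10 = configured threshold
    # in the real script the command string comes from the SQL query
    nohup ./script.pl --param arg --param2 arg2 >> nohup.out 2>&1 &
fi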
The problem is that the kicked-off jobs aren't running properly: sometimes they terminate without any error messages, and sometimes they just hang and sit idle until I kill them.
The getwork script queries SQL and gets the entire execution command via SQL concatenation, so in the query I am doing something like CONCAT('nohup ./script.pl --arg ',param1,' --arg2 ',param2,' &') to get the command string.
I've tried everything to get these kicked off. I've tried using system(), but again, some jobs kick off, some don't, sometimes it gets stuck, and sometimes jobs start and then die within a minute. If I take the exact command I used to start the job and run it in bash, it works fine.
I've also tried opening a pipe to the command, like
open my $ca, "| $command" or die ($!);
print $ca $command;
close $ca;
That works just about as well as everything else I've tried. The getwork script used to be executed through cron every 30 minutes, but I scrapped that because I needed yet another shell wrapper script, so now there is an infinite loop in the getwork script that executes a function every 30 minutes.
I've also tried many variations of the execution command, including redirecting output to different files, etc... nothing seems to be consistent. Any help would be much appreciated, because I am truly stuck here....
EDIT:
Also, I've tried adding separate logging within each script; each one would start a new log file named with its PID ($$). There was a bunch of weirdness there too: all the log files would get created, but then some of the processes would be running and writing to their file, others would just have an empty text file, and some would have only one or two log entries. Sometimes the process would still be running and just not doing anything; other times it would die with nothing in the log. Running the command in the shell directly always works for me, though.
Thanks in advance
You need some kind of job-management framework.
One of the biggest is Gearman: http://www.slideshare.net/andy.sh/gearman-and-perl
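To give a feel for the model, here is roughly how the stock command-line tools are used (this assumes gearmand is running locally, and the function name process_source is made up; the Gearman::Worker / Gearman::Client Perl modules in the slides follow the same pattern):
# worker: waits for jobs and runs script.pl with whatever argument string is submitted
gearman -w -f process_source -- xargs ./script.pl
# client: submit a job instead of building a nohup command string in SQL
gearman -f process_source "--param arg --param2 arg2"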
Related
I am making a simple ping script in Python, and was looking to add functionality for getting host names for IPs that are up. To do this, I'm getting the output of nmblookup -A {ip} using os.popen and parsing the output. The problem I'm running into is that for systems where nmblookup won't work (such as routers), the command takes a long time to return an error, whereas when the command runs successfully it returns results in under a second. My question is: how do I wait only N seconds for the nmblookup command to return something, and if it doesn't, move on with the program? PS, this is all on Linux.
You can prefix the command with timeout.
Refer to its man page.
You can do a Popen and run your command with that prefix; for example:
root@ak-dev:~# timeout 1 sleep 20
root@ak-dev:~# echo $?
124
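Applied to your case, the command you hand to os.popen (or subprocess.Popen) would simply carry the prefix; the 5-second limit and IP here are arbitrary examples:
# kill nmblookup if it has not finished within 5 seconds;
# an exit status of 124 means it was cut off by timeout
timeout 5 nmblookup -A 192.168.1.10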
I have a program that I need to collect 300 pieces of data from, but to do the collecting manually I have to run the program on my Ubuntu virtual machine and record the data in Excel. This whole process takes a long time. I was wondering if there is a command in Linux that I could use to run make and then kill my program.
I searched and tried watch, but it doesn't do what I need:
watch -n 20 make play
where make play runs my program
Yet this doesn't do everything I want. I want to do the following every 20 seconds, so I have enough time to write my data to my Excel file:
1. make play (run my program so it prints what I need to record)
2. kill my program
Is there a command for this?
I think you should rethink what you are doing - I can't think of a setting where running and killing a program every 20 seconds makes any sense.
That being said, the standard way to run programs periodically in Linux is a cron job. Cron has a 1-minute minimum though, so you would have to write a script that starts 3 instances of your program with a 20-second delay between them, and run this script with cron every minute. You can combine this with the timeout utility, which will kill your program if it is still running after a given time. A quick Google search should provide you with further details.
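A rough sketch of such a wrapper (the path, the 18-second limit, and the schedule are only examples):
#!/bin/sh
# run from cron once a minute, e.g.:  * * * * * /home/me/run_play.sh
# starts the program three times, 20 seconds apart, and kills each run
# after 18 seconds so it never overlaps the next one
for i in 1 2 3; do
    timeout 18 make play &
    sleep 20
done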
I think you could use crontab; see man crontab for the manual. However, you will not be able to run and kill the program every 20 seconds, since cron's minimum interval is 1 minute. Hope it helps.
NOTICE: Feedback on how the question can be improved would be great, as I am still learning. I understand there is no code here; I am confident the code itself does not need fixing. I have researched online a great deal and cannot seem to find the answer to my question. My script works as it should when I change the parameters to produce fewer outputs, so I know it works just fine, and I have debugged the script and got no errors. When my parameters are changed to produce more outputs and the script runs for hours, it stops. My goal with the question below is to determine whether Linux will time out a long-running process (or something related) and, if so, how that can be resolved.
I am running a shell script that has several for loops which does the following:
- Goes through existing files and copies data into a newly saved/named file
- Makes changes to the data in each file
- Submits these files (which number in the thousands) to another system
The script is very basic (beginner here), but so long as I don't give it too much to generate, it works as it should. However, if I want it to loop through all possible cases, which means generating tens of thousands of files, then after a certain amount of time the shell script just stops running.
I have more than enough hard drive storage to support all the files being created. One thing to note however is that during the part where files are being submitted, if the machine they are submitted to is full at that moment in time, the shell script I'm running will have to pause where it is and wait for the other machine to clear. This process works for a certain amount of time but eventually the shell script stops running and won't continue.
Is there a way to make it continue or prevent it from stopping? I typed Ctrl+Z to suspend the script and then fg to resume it, but it still does nothing. I check the status with ls -la to see if the file sizes are increasing, and they are not, although top/ps says the script is still running.
Assuming that you are using Bash for your script, most likely you are running out of 'system resources' for your shell session, and most likely the manner in which your script works is causing the issue. Without seeing your script it is difficult to provide specific guidance; however, there are several items you can check at the 'system level' that may help, i.e.
review system logs for errors about your process or about 'system resources'
check your docs: man ulimit (or 'man bash' and search for 'ulimit'); a couple of quick checks are shown after the loop example below
consider removing 'deep nesting' (if present); instead, create work sets where step one builds the 'data' needed for the next step, i.e. if possible, instead of:
step 1 (all files) ## guessing this is what you are doing
step 2 (all files)
step 3 (all files)
try each step for each file instead; something like:
for MY_FILE in ${FILE_LIST}
do
    step_1 "${MY_FILE}"
    step_2 "${MY_FILE}"
    step_3 "${MY_FILE}"
done
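Regarding the ulimit item above, a couple of quick checks (output varies per system):
ulimit -a            # show all per-session limits for the current shell
ulimit -n            # just the open-files limit, a common culprit with thousands of files
dmesg | tail -50     # recent kernel messages, e.g. whether the script was killed by the OOM killer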
:)
Dale
I have a Cron Job scheduled to execute a command a few times a day. There are cases where the cron job isn't needed but will automatically run. If that happens the following error message shows:
PM2 [ERROR] Script already launched, add -f option to force re execution
Note: The Cron Job runs PM2 in reference to a script.
Is there any negative effect to having the cron job run even if the script is already running?
Please provide detailed information or references, not just your opinion.
Avoid erroneous error messages by writing a wrapper script that is run from cron instead. Inside the wrapper script, only run your job if it is not already running by querying the process table.
Assuming ksh, here's a snippet (I'm a tad rusty so syntax may need to be tweaked):
# grep's exit status is non-zero when no match is found
ps -ef | grep -v grep | grep -q MY_PROGRAM
running=$?
if [[ "$running" -gt 0 ]]; then
    # not already running - run your program
else
    # log that it is already running
fi
Not sure what detailed information or references there could be for a situation like this. It's not like someone commissions a study to look at this.
Assuming your command is intelligent enough to only allow one execution at a time (which appears to be the case judging by the error message you posted) then the only ill effect is a few CPU clock cycles (I think).
I have put a long-running Python program in a cron job on a server, so that I can turn off my computer without interrupting the job.
Now I would like to know whether the job started correctly, whether it has finished, whether it stopped at some point for some reason, and so on. How can I do that?
You could have it write to a logfile, but if that isn't possible, you can have cron email you the output of the job: try adding MAILTO=you@example.com to your crontab. You should also find evidence of cron activity in your system logfiles (try grep cron /var/log/* to find likely logs on your system).
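A minimal crontab illustrating both options (the address, schedule, and paths are examples):
MAILTO=you@example.com
# write output to a logfile you can check later:
0 2 * * * /usr/bin/python /home/me/job.py >> /home/me/job.log 2>&1
# or leave output unredirected so cron mails it to MAILTO when the job finishes:
# 0 2 * * * /usr/bin/python /home/me/job.py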
If you are using cron simply as a way to run processes after you disconnect from a server, consider using screen:
type screen and press return
set your script running
type Ctrl+A Ctrl+D to detach from the screen
The process continues running even if you log off. Later on simply
screen -r
and you will be reattached, allowing you to review the script's output.
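The same flow can also be scripted with a named session (the session and script names are examples):
screen -dmS myjob python /home/me/job.py   # start the job detached in a session called "myjob"
screen -ls                                 # list running screen sessions
screen -r myjob                            # reattach later to review the output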
Why not have the cron job write to a log file? Also, just do a ps before shutdown.