Delays just after bootup on CentOS 7.5 - linux

I'm using CentOS 7.5.1804.
Right after booting up, the operating system is sluggish.
For example, when I try to write "python" in a terminal, I first write "pyt" and press Tab.
I have to wait a few seconds for the OS to complete it to "python".
This phenomenon occurs just after booting up.
A few days later, the phenomenon goes away.
Does anyone have a clue how to solve this problem?

The completion you get when you press pyt and Tab is part of the bash-completion package, as command completion happens after you have typed the full command. So the cause has to be investigated starting with bash. My educated guess is that some process or I/O is keeping the system busy.
You can start with some generic system information tools as soon as the system starts:
uptime to see the system load
vmstat -n 1 to check the status of the CPU
ps aux to check running processes
iotop to check for I/O
systemctl list-jobs to show running jobs in systemd
and based on their results perform deeper analysis.
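For instance, a rough snapshot right after boot could be captured like this (the log file name is just an example) and compared with the same commands run a few days later, once the slowness has gone:
{ uptime; vmstat -n 1 5; ps aux --sort=-%cpu | head -n 20; systemctl list-jobs; } \
  > ~/boot-snapshot-$(date +%F-%H%M).log 2>&1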
Another possibility is that disk access is slowing down the system at startup. Where is the machine running?

I don't know about fixing it; there are all kinds of things that could cause delays. But I can offer a few tips to investigate.
The first step to investigate is to run set -x to get a trace of the commands that the shell executes to generate the completions. Watch where it pauses.
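For example (a minimal illustration; run it in the sluggish shell session itself):
set -x       # trace what the shell runs while building completions
# now type:  pyt<Tab>   and watch where the trace output pauses
set +x       # turn tracing back off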
Do you have the issue with other auto-completions? If it's only python, you can time the execution of your command:
time python
You can check whether you have problems at launch by redirecting standard output and error to a file:
strace python > launch.log 2>&1
Take a strace at boot and another one later; then you can check the difference between them:
diff -u delays.log delays2.log | grep ^+
Hope it can help.

Related

When PHP exec function creates a process, where is it queued so that it can be removed programmatically or from command line?

I have a PHP script that runs the following code:
exec("ls $image_subdir | parallel -j8 tesseract $image_subdir/{} /Processed/OCR/{.} -l eng pdf",$output, $result_code);
The code runs; however, even after I terminate the PHP script and close the browser, it continues to create the pdf files (thousands). It has been 24 hrs and it is still running. When I run a ps command, it only shows the 8 current processes that were created.
How can I find where all the pending ones are queued and kill them? I believe I can simply restart Apache/PHP, but I would like to know where these pending processes are and how they can be shut down or controlled. Originally, the script seemed to wait a minute while it executed the above line, then proceeded to the next line of code in the PHP script. So it appears that it created the jobs somewhere and then moved on.
Is it perhaps something peculiar to the parallel command? Any information is very much appreciated. Thank you.
The jobs appear to have been produced by a perl process:
perl /usr/bin/parallel -j8 tesseract {...basically the code from the exec() function call in the php script}
perl was invoked either by the GNU parallel command or by PHP's exec function. In any event, htop would not allow the process to be killed and did not produce any error or status, so it may be a permission problem preventing htop from killing the process. So it was done with sudo on the command line, which ultimately killed the process and stopped any further process creation from the original PHP exec() call.
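For example, a rough sketch of doing that from the command line (the PID here is purely illustrative; use whatever ps reports):
ps -ef | grep '[p]arallel'   # find the "perl /usr/bin/parallel -j8 tesseract ..." parent
sudo kill 12345              # replace 12345 with the reported PID; escalate to kill -9 only if needed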

at job scheduler doesn't work on my Ubuntu

I know there are many Linux experts here, and I wish to get a little help with the at command in Ubuntu.
I have been troubled by the at command on Ubuntu (18.04 and 20.04) for quite a while, but I don't know where I made a mistake. I've tried at on three of my Ubuntu systems and it doesn't work on any of them. at is a very handy and nice job scheduler, and I really want to get it to work so that I don't have to manually launch programs late at night on a shared Ubuntu server. I read many tutorials on the at command; here is a very good one.
at now + 1 minutes -f ~/myscript.sh looks really great and can save me lots of energy. Unfortunately, it only works when myscript.sh is extremely simple; then at now + 1 minutes -f ~/myscript.sh runs smoothly and I get what I expected. Here is everything I have in myscript.sh:
echo $(date) > ~/Desktop/time.txt
Beyond that, it never worked for me. For example, when I change myscript.sh to
echo $(date) > ~/Desktop/time.txt
pycharm.sh
Basically, what myscript.sh does is note down the time and open the PyCharm IDE. I can run sh myscript.sh without at and it works very well. However, when I run at now + 1 minutes -f ~/myscript.sh, the time is noted down but PyCharm is never opened (I can see the process in htop when PyCharm is open). Also, at now + 1 minutes -f ~/script.sh does not work with any of my other shell scripts.
Could you please help me understand where I have gone wrong and how to make it work? Thank you very much.
PyCharm and other GUI programs need a lot of information from your environment. The atd daemon which runs jobs for at does not have access to this environment. You will need to specify it directly.
I recommend running printenv redirected to a file in an at job. Then compare that to printenv running from a terminal in your GUI session. Find the differences and see if you can set them up the same way at the beginning of your at script.
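A minimal sketch of that comparison (the file names under /tmp are just placeholders):
echo "printenv > /tmp/env_from_at.txt" | at now + 1 minutes
# wait a minute, then from a terminal inside your GUI session:
printenv > /tmp/env_from_gui.txt
diff /tmp/env_from_at.txt /tmp/env_from_gui.txt   # variables like DISPLAY and XAUTHORITY often differ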

How to follow the progress of a linux command?

I am currently working with a large data set where even the file format conversion takes at least an hour per subject, and as a result I am often unsure whether my command is still executing or the program has frozen. I was wondering whether anyone has a tip on how to follow the progress of the commands/scripts I am trying to run in Linux?
Your help will be much appreciated.
In addition to Basile Starynkevitch's answer,
I have a bash script that can measure how much of a file you have processed, as a percentage.
It looks into procfs, gets the current position from the fd information (/proc/<pid>/fdinfo), and computes it as a percentage of the total file size.
See https://gist.github.com/azat/2830255
curl -s https://gist.github.com/azat/2830255/raw >| progress_fds.sh \
&& chmod +x progress_fds.sh
Usage:
./progress_fds.sh /path/to/file [PID]
Can be useful to someone.
If the long-lasting command produces some output in a file foo.out, you could do watch ls -l foo.out or tail -f foo.out
You could also list /proc/$(pidof prog)/fd to find out which files some prog has open.
You can follow the syscalls of a program by using strace, which will enable you to follow the open calls.
You can use verbose output, but it will slow things down even more.
I guess there can't be a general answer to that; it just depends on the type of program (it doesn't even have to do anything with Linux, see the "halting problem").
If you happen to use a pipe during the conversion, I find the pv(1) tool pretty helpful. Even if pv can't know the total size of the data, it helps to see whether there is actual progress and how good the data rate is. It isn't part of most standard installations, though, and probably has to be installed explicitly.
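For instance, a hedged sketch of wiring pv into a conversion pipeline (convert_tool and the file names are placeholders for whatever conversion you actually run):
pv subject01.raw | convert_tool > subject01.converted   # pv prints bytes read, rate, and an ETA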

simple timeout on I/O for command for linux

First, the background to this intriguing challenge. During development and testing, the continuous integration build can often fail with deadlocks, loops, or other issues that result in a never-ending test. So all the mechanisms for notifying that a build has failed become useless.
The solution will be to have the build script time out if there is zero output to the build log file for more than 5 minutes, since the build routinely writes out the names of unit tests as it proceeds. So that's the best way to identify that it's "frozen".
Okay. Now the nitty gritty...
The build server uses Hudson to run a simple bash script that invokes the more complex build script based on Nant and MSBuild (all on Windows).
So far all solutions around the net involve a timeout on the total run time of the command. But that solution fails in this case because the tests might hang or freeze in the first 5 minutes.
What we've thought of so far:
First, here's the high-level bash command that runs the full test suite in Hudson.
build.sh clean free test
That command simply sends all the Nant and MSBuild build logging to stdout.
It's obvious that we need to tee that output to a file:
build.sh clean free test 2>&1 | tee build.out
Then, in parallel, a command needs to sleep, check the modification time of the file, and if it is more than 5 minutes old, kill the main process. A kill -9 will be fine at that point; nothing graceful is needed once it has frozen.
That's the part you can help with.
In fact, I made a script like this over 15 years ago to kill the connection on a data phone line to Japan after periods of inactivity, but I can't remember how I did it.
Sincerely,
Wayne
build.sh clean free test 2>&1 | tee build.out &
sleep 300
kill -KILL %1
You may be able to use timeout:
timeout 300 command
Solved this myself by writing a bash script.
It's called iotimeout and takes one parameter, which is the number of seconds.
You use it like this:
build.sh clean dev test | iotimeout 120
iotimeout has 2 loops.
One is a simple while read line loop that echoes each line, but it also uses the touch command to update the modified time of a tmp file every time it writes a line. Unfortunately, it wasn't possible to monitor the build.out file itself because Windoze doesn't update the file's modified time until you close the file. Oh well.
The other loop runs in the background; it's a forever loop which sleeps 10 seconds and then checks the modified time of the temp file. If that is ever more than 120 seconds old, the loop forces the entire process group to exit.
The only tricky part was returning the exit code of the original program. Bash gives you a PIPESTATUS array to solve that. Also, figuring out how to kill the entire process group took some research, but it turns out to be easy: just kill 0.
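A minimal sketch of such a script, following the description above (the 10-second poll, the 120-second default, and the temp-file handling are assumptions for illustration, not the author's exact code):
#!/usr/bin/env bash
# iotimeout: kill the whole process group if no input arrives for N seconds.
# Usage: build.sh clean dev test | iotimeout 120
limit="${1:-120}"
stamp="$(mktemp)"                 # mtime of this file marks the last line seen

# Background watchdog: every 10 seconds, check how old the stamp file is.
(
  while true; do
    sleep 10
    age=$(( $(date +%s) - $(stat -c %Y "$stamp") ))
    if [ "$age" -gt "$limit" ]; then
      echo "iotimeout: no output for ${limit}s, killing process group" >&2
      kill 0                      # signal every process in the group, pipeline included
    fi
  done
) &
watchdog=$!

# Foreground loop: pass each line through and refresh the stamp file's mtime.
while IFS= read -r line; do
  printf '%s\n' "$line"
  touch "$stamp"
done

kill "$watchdog" 2>/dev/null      # input ended normally; stop the watchdog
rm -f "$stamp"
In the calling script, checking ${PIPESTATUS[0]} right after the pipeline is one way to recover the original program's exit code, as described above.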

/usr/bin/perl: bad interpreter: Text file busy

This is a new one for me: What does this error indicate?
/usr/bin/perl: bad interpreter: Text file busy
There were a couple of disk-intensive processes running at the time, but I've never seen that message before—in fact, this is the first time that I can remember getting an error when trying to run a Perl script. After a few seconds of waiting, I was able to run it, and haven't seen the issue since, but it would be nice to have an explanation for this.
Running Ubuntu 9.04, file system is ext3.
I'd guess you encountered this issue.
The Linux kernel will generate a bad interpreter: Text file busy error if your Perl script (or any other kind of script) is open for writing when you try to execute it.
You don't say what the disk-intensive processes were doing. Is it possible one of them had the script open for read+write access (even if it wasn't actually writing anything)?
This happens because the script file is open for writing, possibly by a rogue process which has not terminated.
Solution: Check what process is still accessing the file, and terminate it.
Eg:
# /root/wordpress_plugin_updater/updater.pl --wp-path=/var/www/virtual/joel.co.in/drjoel.in/htdocs
-bash: /root/wordpress_plugin_updater/updater.pl: /root/perl/bin/perl: bad interpreter: Text file busy
Run lsof (list open files command) on the script name:
# lsof | grep updater.pl
sftp-serv 4416 root 3r REG 144,103 11043 33046751 /root/wordpress_plugin_updater/updater.pl
Kill the process by its PID:
kill -9 4416
Now try running the script again. It works now.
# /root/wordpress_plugin_updater/updater.pl --wp-path=/www/htdocs
Wordpress Plugin Updater script v3.0.1.0.
Processing 24 plugins from
This always has to do with the perl interpreter (/usr/bin/perl) being inaccessible. In fact, it happens with shell scripts, awk, or whatever else is on the #! line at the top of the script.
The cause can be many things: permissions, a locked file, a filesystem offline, and so on.
It would obviously depend on what was happening at the exact moment you ran it when the problem occurred. But I hope the answer is what you were looking for.
If the script was edited in Windows, or any other OS with different "native" line endings, it could be as simple as a CR (^M) "hiding" at the end of the first line. Vim (vi improved) can be set up to hide this non-native line ending. In my case, I simply re-typed the offending first line in vi and the error went away.
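To check for and strip such a carriage return without opening an editor, something along these lines should work (yourscript.pl is a placeholder; dos2unix, if installed, does the same job):
head -n 1 yourscript.pl | cat -A   # a trailing ^M$ means a Windows CR is hiding on the shebang line
sed -i 's/\r$//' yourscript.pl     # strip carriage returns in place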
If you are using gnu parallel and you see this error then it may be because you are streaming a file in from the same place that you are writing the file out...
I had this same issue, and grepping to see what was using the file didn't work. It turns out I just needed to restart the droplet, and voilà, the script now works.
