Removing forks in a shell script so that it runs well in Cygwin

I'm trying to run a shell script on Windows in Cygwin. The problem I'm having is that it runs extremely slowly in the following section of code. From a bit of googling, I believe it's due to the large number of fork() calls within the script; since Windows has to go through Cygwin's emulation of fork(), it just slows to a crawl.
A typical scenario: on Linux the script completes in under 10 seconds (depending on file size), but on Windows under Cygwin the same file takes nearly 10 minutes.
So the question is: how can I remove some of these forks and still have the script return the same output? I'm not expecting miracles, but I'd like to cut that 10 minute wait time down a fair bit.
Thanks.
check_for_customization(){
    filename="$1"
    extended_class_file="$2"
    grep "extends" "$filename" | grep "class" | grep -v -e '^\s*<!--' | while read line; do
        classname="$(echo $line | perl -pe 's{^.*class\s*([^\s]+).*}{$1}')"
        extended_classname="$(echo $line | perl -pe 's{^.*extends\s*([^\s]+).*}{$1}')"
        case "$classname" in
            *"$extended_classname"*) echo "$filename"; echo "$extended_classname |$classname | $filename" >> "$extended_class_file";;
        esac
    done
}
Update: I changed the regex a bit and used a bit more Perl:
check_for_customization(){
    filename="$1"
    extended_class_file="$2"
    grep "^\(class\|\(.*\s\)*class\)\s.*\sextends\s\S*\(.*$\)" "$filename" | grep -v -e '^\s*<!--' | perl -pe 's{^.*class\s*([^\s]+).*extends\s*([^\s]+).*}{$1 $2}' | while read classname extended_classname; do
        case "$classname" in
            *"$extended_classname"*) echo "$filename"; echo "$extended_classname | $classname | $filename" >> "$extended_class_file";;
        esac
    done
}
So, using the above code, the run time was reduced from about 8 minutes to 2.5 minutes. Quite an improvement.
If anybody can suggest any other changes I would appreciate it.

Put more commands into one Perl script, e.g.:
check_for_customization(){
    filename="$1" extended_class_file="$2" perl -n - "$1" <<\EOF
next if /^\s*<!--/;
next unless /^.*class\s*([^\s]+).*/; $classname = $1;
next unless /^.*extends\s*([^\s]+).*/; $extended_classname = $1;
if (index($classname, $extended_classname) != -1)
{
    print "$ENV{filename}\n";
    open FILEOUT, ">>$ENV{extended_class_file}";
    print FILEOUT "$extended_classname | $classname | $ENV{filename}\n";
    close FILEOUT;
}
EOF
}
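If Perl startup is itself a noticeable cost on Cygwin, a further step is to do all of the matching with bash builtins so nothing is forked per line. This is only a sketch under the assumption that each class declaration sits on one line; the [[ =~ ]] regex below approximates, but is not identical to, the grep/perl pipeline above.

```shell
#!/bin/bash
# Fork-free sketch: parse "class X extends Y" lines with bash's own
# regex engine (BASH_REMATCH) instead of spawning grep/perl processes.
check_for_customization() {
    local filename="$1" extended_class_file="$2" line
    while IFS= read -r line; do
        # Skip commented-out lines, as grep -v '^\s*<!--' did
        if [[ $line =~ ^[[:space:]]*'<!--' ]]; then
            continue
        fi
        if [[ $line =~ class[[:space:]]+([^[:space:]]+).*extends[[:space:]]+([^[:space:]]+) ]]; then
            local classname="${BASH_REMATCH[1]}" extended_classname="${BASH_REMATCH[2]}"
            # Same containment test as the original case statement
            if [[ $classname == *"$extended_classname"* ]]; then
                echo "$filename"
                echo "$extended_classname | $classname | $filename" >> "$extended_class_file"
            fi
        fi
    done < "$filename"
}
```

Whether this beats the single-Perl-process version depends on file size; for very large files one Perl process may still win, but for many small files avoiding process creation entirely is the point on Cygwin.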


Is it possible to do watch logfile with tail -f and pipe updates/changes over netcat to another local system? [duplicate]

This question already has answers here:
Piping tail output though grep twice
(2 answers)
Closed 4 years ago.
There is a file located at $filepath, which grows gradually. I want to print every line that starts with an exclamation mark:
while read -r line; do
    if [ -n "$(grep ^! <<< "$line")" ]; then
        echo "$line"
    fi
done < <(tail -F -n +1 "$filepath")
Then, I rearranged the code by moving the comparison expression into the process substitution to make the code more concise:
while read -r line; do
    echo "$line"
done < <(tail -F -n +1 "$filepath" | grep '^!')
Sadly, it doesn't work as expected; nothing is printed to the terminal (stdout).
I prefer to write grep ^\! after tail. Why doesn't the second code snippet work? Why does putting the pipeline inside the process substitution make a difference?
PS1. This is how I manually produce the gradually growing file by randomly executing one of the following commands:
echo ' something' >> "$filepath"
echo '!something' >> "$filepath"
PS2. Test under GNU bash, version 4.3.48(1)-release and tail (GNU coreutils) 8.25.
grep is not line-buffered when its stdout isn't connected to a tty, so it accumulates a full block (usually 4 KiB or 8 KiB or so) of output before writing anything.
You need to tell grep to buffer its output by line. If you're using GNU grep, this works:
done < <(tail -F -n +1 "$filepath" | grep '^!' --line-buffered)
                                               ^^^^^^^^^^^^^^^
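A quick way to see the flag in action (both commands here are standard GNU tools; for filters that have no such flag, coreutils' stdbuf is the generic knob). Note that buffering only changes when output appears, not what appears, so a finite pipe can't reproduce the stall; it just shows the two equivalent fixes:

```shell
# grep flushes after every matching line instead of every 4-8 KiB block:
printf 'aaa\n!bbb\n!ccc\n' | grep --line-buffered '^!'

# For filters without such a flag, force line-buffered stdout with stdbuf:
printf 'aaa\n!bbb\n!ccc\n' | stdbuf -oL grep '^!'
```

With a growing file fed by tail -F, the first form is what makes each matching line appear immediately.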

Will Running A GNU Parallel Command With Variables Cause Conflict?

I think this comes down to me not fully understanding how GNU Parallel divides work among processes. I looked up whether using variables with GNU Parallel would conflict and found little to nothing online.
If I'm using a command like the following with parallel:
echo "$textdata" | parallel -j5 cat | for line in file; do var1=$(echo $line); var2=$(echo "$line" | grep -A 1); var3=$(echo "$line" | somecommand); echo "$var1" "$var2" "$var3"; done
Would it conflict by overriding each other when using the variable? Or, does it run in different processes and the same variable can be used?
In other words, would $var1, $var2 or $var3 get confused between different processes running in parallel?
For anyone else wanting to know whether variables used with GNU Parallel will overlap: simply take the command, such as:
echo "$textdata" | parallel -j5 cat | for line in file; do var1=$(echo $line); var2=$(echo "$line" | grep -A 1); var3=$(echo "$line" | somecommand); echo "$var1" "$var2" "$var3"; done
And run the command as a script.sh:
for line in file; do var1=$(echo $line); var2=$(echo "$line" | grep -A 1); var3=$(echo "$line" | somecommand); echo "$var1" "$var2" "$var3"; done
Thus it would be:
echo "$textdata" | parallel -j5 /script.sh
Hope this helps anyone working with parallel and variables.
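A quick way to convince yourself the variables cannot clash: every GNU Parallel job runs in its own process, which is the same model as the plain background subshells below (a stand-in used here only because the isolation comes from process separation, not from Parallel itself):

```shell
var1="parent"
for i in 1 2 3; do
    # Each ( ... ) & runs in its own process with a private copy of var1,
    # just like each GNU Parallel job gets its own shell process.
    ( var1="job$i"; echo "$var1" ) &
done
wait
echo "$var1"   # the parent's var1 is untouched
```

Each job prints its own value and the parent still sees "parent" at the end, so $var1, $var2, $var3 in a script run via parallel cannot get confused between jobs.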

Calling a shell script that is stored in another shell script variable

I searched SO but could not find any relevant post with this specific problem. I would like to know how to call a shell script which is stored in a variable of another shell script.
In the script below I am trying to read a service name and its corresponding shell script, check whether the service is running and, if not, start the service using the shell script associated with that service name. I have tried multiple options shared in various forums (like 'eval' etc.) with no luck. Please share your suggestions on this.
checker.sh
#!/bin/sh
while read service
do
    servicename=`echo $service | cut -d: -f1`
    servicestartcommand=`echo $service | rev | cut -d: -f1 | rev`
    if (( $(ps -ef | grep -v grep | grep $servicename | wc -l) > 0 ))
    then
        echo "$servicename Running"
    else
        echo "!!$servicename!! Not Running, calling $servicestartcommand"
        eval "$servicestartcommand"
    fi
done < names.txt
names.txt
WebSphere:\opt\software\WebSphere\startServer.sh
WebLogic:\opt\software\WebLogic\startWeblogic.sh
Your script can be refactored into this:
#!/bin/bash
while IFS=: read -r servicename servicestartcommand; do
    if ps cax | grep -q "$servicename"; then
        echo "$servicename Running"
    else
        echo "!!$servicename!! Not Running, calling $servicestartcommand"
        $servicestartcommand
    fi
done < names.txt
No need to use wc -l after grep's output as you can use grep -q
No need to use read full line and then use cut, rev etc later. You can use IFS=: and read the line into 2 separate variables
No need to use eval in the end
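A minimal illustration of the IFS=: trick in isolation (the sample line mirrors the names.txt format, with a forward-slash path for clarity):

```shell
# read splits the line on ":" because IFS is set just for this read;
# everything after the first ":" lands in the last variable.
printf '%s\n' 'WebLogic:/opt/software/WebLogic/startWeblogic.sh' |
while IFS=: read -r servicename servicestartcommand; do
    echo "name=$servicename"
    echo "cmd=$servicestartcommand"
done
```

This replaces both cut invocations and the rev|cut|rev pipeline with zero extra processes per line.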
It is much simpler than you expect. Instead of:
eval "$servicestartcommand"
eval should only be used in extreme circumstances. All you need is
$servicestartcommand
Note: no quotes.
As an example, try this on the command-line:
cmd='ls -l'
$cmd
That should work. But:
"$cmd"
will fail. It will look for a program with a space in its name called 'ls -l'.
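If the stored command may contain arguments with spaces, neither $cmd nor "$cmd" is reliable; a bash array keeps each argument intact. A small sketch:

```shell
# Each array element is exactly one argument, so spaces inside
# elements survive, unlike word-splitting an unquoted scalar.
cmd=(printf '%s\n' 'hello world')
"${cmd[@]}"    # runs printf with exactly two arguments
```

This is the usual recommendation when eval feels tempting: build the command as an array, then expand it with "${cmd[@]}".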
Maybe I don't get the idea, but why not use environment variables?
export FOO=bar
echo $FOO
bar

Linux 'cut' command line and replace

I need to create some text using the cut command, replacing characters with spaces, on a Linux terminal.
Examples:
Linux
inux
nux
ux
x
This is my bash script.
#!/bin/bash
INPUT=$@
SIZE=$(echo $INPUT | wc -c)
let $((SIZE--))
for i in $(seq 1 $SIZE); do
    echo $INPUT | cut -c ${i}-${SIZE}
done
and I have failed to create text like:
Linux
inux
nux
ux
x
This should do the trick:
#!/bin/bash
INPUT="$@"
SIZE=${#INPUT}
for ((i=0; i < ${SIZE}; i++)); do
    echo "${INPUT}"
    INPUT="${INPUT:0:${i}} ${INPUT:$((i+1)):${SIZE}}"
    #INPUT="$(echo "$INPUT" | sed "s/^\(.\{${i}\}\)./\1 /")"
done
I added a sed option in a comment, although it creates a subprocess when you don't really need one.
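The substring-expansion trick can be run standalone; this sketch (with a fixed input string instead of script arguments) blanks out one more leading character on each pass, entirely inside the shell:

```shell
#!/bin/bash
INPUT="Linux"
SIZE=${#INPUT}
for ((i=0; i<SIZE; i++)); do
    echo "$INPUT"
    # Replace character i with a space: prefix + " " + suffix
    INPUT="${INPUT:0:i} ${INPUT:i+1}"
done
```

Each iteration prints the current string and then overwrites one more character with a space, so no cut, sed, or other subprocess is ever started.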

newbie in bash scripting assistance please

I run bash scripts from time to time on my servers. I am trying to write a script that monitors log folders and compresses log files when a folder exceeds a defined capacity. I know there are better ways of doing what I am currently trying to do; your suggestions are more than welcome. The script below is throwing an "unexpected end of file" error.
dir_base=$1
size_ok=5000000
cd $dir_base
curr_size=`du -s -D | awk '{print $1}' | sed 's/%//g'`
zipname=archive`date +%Y%m%d`
if (( $curr_size > $size_ok ))
then
    echo "Compressing and archiving files, Logs folder has grown above 5G"
    echo "oldest to newest selected."
    targfiles=( `ls -1rt` )
    echo "Process files."
    for tfile in ${targfiles[@]}
    do
        let `du -s -D | awk '{print $1}' | sed 's/%//g' | tail -1`
        if [ $curr_size -lt $size_ok ];
        then
            echo "$size_ok has been reached. Stopping processes"
            break
        else if [ $curr_size -gt $size_ok ];
        then
            zip -r $zipname $tfile
            rm -f $tfile
            echo "Added '$tfile' to archive'`date +%Y%m%d`'.zip and removed"
        else [ $curr_size -le $size_ok ];
            echo "files in $dir_base are less than 5G, not archiving"
        fi
Look into logrotate. Here is an example of putting it to use.
With what you have given us, you lack a "done" to end the for loop and a "fi" to end the main if. Please reformat your code and you will get more precise answers...
EDIT :
Looking at your reformatted script, it is as said: the "unexpected end of file" comes from the fact that you have closed neither your "for" loop nor your "if".
As it seems that you mimic the logrotate behaviour, check it out as suggested by @Hank...
my2c
My du -s -D does not show a % sign, so you can just do:
curr_size=$(du -s -D)
set -- $curr_size
curr_size=$1
saves you a few overheads instead of du -s -D | awk '{print $1}' | sed 's/%//g'.
If it does show % sign, you can get rid of it like this
du -s -D | awk '{print $1+0}'. No need to use sed.
Use $() syntax instead of backticks whenever possible
For targfiles=( `ls -1rt` ), you can omit the -1. So it can be:
targfiles=( $(ls -rt) )
Use quotes around your variables whenever possible. eg "$zipname" , "$tfile"
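The set -- trick in isolation, using a canned string in place of real du output (a tab-separated size and path, which is what du -s prints):

```shell
curr_size=$(printf '4096\t/var/log')   # stand-in for: curr_size=$(du -s -D)
set -- $curr_size                      # deliberately unquoted: word-split on IFS
curr_size=$1                           # $1 is now just the size field
echo "$curr_size"
```

The unquoted expansion splits the string into positional parameters, so $1 holds the size with no awk or sed process started.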
