Will Running A GNU Parallel Command With Variables Cause Conflict? - linux

I think this comes down to me not fully understanding how the processes of GNU Parallel are divided. I looked up if using variables with GNU Parallel would conflict and I found very little to nothing online.
If I'm using a command like the following with parallel:
echo "$textdata" | parallel -j5 cat | for line in file; do var1=$(echo $line); var2=$(echo "$line" | grep -A 1); var3=$(echo "$line" | somecommand); echo "$var1" "$var2" "$var3"; done
Would it conflict by overriding each other when using the variable? Or, does it run in different processes and the same variable can be used?
In other words, would $var1, $var2 or $var3 get confused between different processes running in parallel?

For anyone else wanting to know whether variables will overlap when running under GNU parallel: take the command, such as:
echo "$textdata" | parallel -j5 cat | for line in file; do var1=$(echo $line); var2=$(echo "$line" | grep -A 1); var3=$(echo "$line" | somecommand); echo "$var1" "$var2" "$var3"; done
And put the loop into a script, script.sh:
for line in file; do var1=$(echo $line); var2=$(echo "$line" | grep -A 1); var3=$(echo "$line" | somecommand); echo "$var1" "$var2" "$var3"; done
Thus it would be:
echo "$textdata" | parallel -j5 /script.sh
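Each job that parallel starts runs the script in its own process, so $var1, $var2 and $var3 in one job are never visible to another job. Hope this helps anyone working with parallel and variables.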
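As a rough illustration of why the variables cannot clash: every job is a separate process, and each process has its own copy of its variables. The sketch below imitates that with plain background subshells rather than parallel itself, so it runs without parallel installed. The job function and the alpha/beta/gamma inputs are made up for the demonstration.

```shell
# Hypothetical stand-in for one job: stores its argument in var1,
# pauses so the jobs overlap, then prints var1.
job() {
    var1=$1
    sleep 1
    echo "$var1"
}

# Start three jobs at once. Each ( ... ) & is a separate process
# with its own private copy of var1.
result=$(
    for item in alpha beta gamma; do
        ( job "$item" ) &
    done
    wait
)

# Completion order is not guaranteed, so sort before printing.
sorted=$(printf '%s\n' "$result" | sort)
echo "$sorted"

# The parent shell never saw any of the assignments.
echo "parent var1: ${var1:-unset}"
```

Every job prints the value it was given, even though all three assign to the same variable name at the same time.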

Related

Using grep in an if statement

My goal is to write a shell script that takes the users I have already filtered out of a file and checks whether those users have a certain string; if they do, label them as major, if not, nonmajor. My trouble is coming from my first if statement, and I'm not sure if grep is the right way to go in an if statement. Here is what I have:
(
while read i
do
username=`echo $i | grep -v 'CMPSC 1513' | grep -P -v '(?!.*CPSMA 2923)CPSMA' | cut -d'|' -f2`
fullname=`echo $i | grep -v 'CMPSC 1513' | grep -P -v '(?!.*CPSMA 2923)CPSMA' | cut -d'|' -f3`
id=`echo $i | grep -v 'CMPSC 1513' | grep -P -v '(?!.*CPSMA 2923)CPSMA' | cut -d'|' -f4`
if [ $username ]
then
if grep -q "|0510"
then
echo $username":(password):(UID):(GID):"$fullname"+"$id":/home/STUDENTS/majors:/bin/bash"
else
echo $username":(password):(UID):(GID):"$fullname"+"$id":/home/STUDENTS/nonmajors:/bin/bash"
fi
fi
done
)<./cs_roster.txt
Just some info: this is contained in a while loop. In the while loop, I determine whether the person listed should even be major or nonmajor, and my if [ $username ] has been tested and does return all the correct users. At this point the while loop only runs once and then stops.
Just remove the square brackets and pass $i to grep:
if echo $i | grep -q "|0510"
In your code sample, grep does not have anything to work on.
The "binary operator expected" error occurs because you are invoking the command [ with the arguments "grep" and "-q" (you are not invoking grep at all), and [ expects a binary operator where you have specified -q. [ is a command, treated no differently than grep or ls or cat. It is better (IMO) to spell it test, and when invoked by the name test it does not require that its last argument be ]. If you want to use grep in an if statement, just do something like:
if echo "$username" | grep -q "|0510"; then ...
(Although I suspect, depending on the context, there are better ways to accomplish your goal.)
The basic syntax of an if statement is if pipeline; then.... In the common case, the pipeline is the simple command test, and at some point in pre-history, the decision was made to provide the name [ for the test command with the added caveat that its final argument must be ]. I believe this was done in an effort to make if statements look more natural, as if the [ is an operator in the language. Just ignore [ and always use test and much confusion will be avoided.
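A minimal sketch of the point above, that if simply runs a pipeline and branches on its exit status. The sample record and its field layout are invented for the demonstration.

```shell
# Sample record; the field layout is invented for this demonstration.
record='CMPSC 1000|jdoe|John Doe|0510123'

# if runs the pipeline and branches on the exit status of its last command.
# grep -q prints nothing and exits 0 on a match, 1 otherwise.
if printf '%s\n' "$record" | grep -q '|0510'; then
    group=majors
else
    group=nonmajors
fi
echo "$group"

# test (also spelled [ ) is just another command with an exit status.
if test -n "$record"; then
    echo "record is not empty"
fi
```

Here the record contains |0510, so the first branch is taken and majors is printed.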
You can use this code as an exercise. Write an awk script for it, or start with something like
while IFS='|' read -r f1 username fullname id otherfields; do
# I don't know which field you want to test. I will test with id
if [[ $id =~ ^0510 ]]; then
subdir=majors
else
subdir=nonmajors
fi
echo "${username}:(password):(UID):(GID):${fullname}+${id}:/home/STUDENTS/${subdir}:/bin/bash"
done < <( grep -v 'CMPSC 1513' ./cs_roster.txt | grep -P -v '(?!.*CPSMA 2923)CPSMA' )
This is nice for learning some bash syntax, but consider an awk script for avoiding a while-loop.
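One possible shape for such an awk script, as a sketch only. It assumes the same field order as the loop above (username, full name and id in fields 2 to 4), tests the id field like the loop does, and keeps only the CMPSC 1513 exclusion; the CPSMA look-ahead filter from the question is left out, and the roster lines are invented.

```shell
# Invented sample roster; real input would come from ./cs_roster.txt.
roster='CMPSC 1000|asmith|Alice Smith|0510111
CMPSC 1513|bjones|Bob Jones|0510222
MATH 2000|cdoe|Carol Doe|0420333'

passwd_lines=$(printf '%s\n' "$roster" | awk -F'|' '
    /CMPSC 1513/ { next }    # same exclusion as grep -v
    {
        # Field 4 is the id; ids starting with 0510 count as majors.
        subdir = ($4 ~ /^0510/) ? "majors" : "nonmajors"
        printf "%s:(password):(UID):(GID):%s+%s:/home/STUDENTS/%s:/bin/bash\n", $2, $3, $4, subdir
    }')
echo "$passwd_lines"
```

A single awk process reads the whole file, so there is no per-line echo, grep or cut.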

Place grep result into variable

I have a problem with bash script. I have a list of files in specific location. I have to take only a date from it and compare it with another date.
for i in *.gz; do
echo $i | grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'
done
The above greps the date from the filenames correctly, but only when I use echo. In other cases I get errors. I have tried:
tmp=$(echo $i | grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}')
That is also not working. Any suggestions? I would be grateful for any help!
I wouldn't use grep at all here; use bash's built-in regular-expression handling.
for i in *.gz; do
[[ $i =~ [[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2} ]]
echo "${BASH_REMATCH[0]}"
done
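For comparison, the grep-based assignment from the question can also be made to work, provided it runs inside the loop and the file name is quoted. The sketch below uses invented file names (no real files are needed) and skips names that contain no date.

```shell
# File names are invented for the demonstration; no real files are needed.
dates=''
for i in 'backup-2023-04-01.gz' 'notes.gz' 'db dump 2024-11-30.gz'; do
    # grep exits non-zero when nothing matches, so names without a date are skipped.
    tmp=$(printf '%s\n' "$i" | grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}') || continue
    dates="$dates$tmp "
    echo "$i -> $tmp"
done
```

Quoting "$i" matters for names with spaces, and checking grep's exit status avoids carrying an empty tmp into a later comparison.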
One way around it could be to use the stat command:
for i in *.gz; do
tmp=$(stat "$i" | awk '/Modify/ { print $2}' )
done
or if you want an array
declare -a tmp
# The extra parentheses make this an array assignment, one element per date.
tmp+=( $(
for i in *.gz; do
stat "$i" | awk '/Modify/ { print $2}'
done
) )
The advantage is that it is independent of the file names.
edit:
I cannot comment on others' answers yet, so this is how you compare the dates:
sixago=$(date --date='-6 month' +%s)
tmp=$(date --date="$tmp" +%s)
if [ "$tmp" -gt "$sixago" ];then
...
fi
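The comparison above can be wrapped in a small helper. This is a sketch that relies on GNU date's --date option, as the snippet above does; the function name is_recent is made up.

```shell
# is_recent DATE: succeed if DATE (e.g. 2024-01-31) is less than six months old.
# Relies on GNU date's --date option; the function name is made up for this sketch.
is_recent() {
    sixago=$(date --date='-6 month' +%s)       # cutoff as seconds since the epoch
    stamp=$(date --date="$1" +%s) || return 2  # date could not parse the argument
    [ "$stamp" -gt "$sixago" ]
}

today=$(date +%Y-%m-%d)
if is_recent "$today"; then
    echo "$today is recent"
fi
if ! is_recent 2000-01-01; then
    echo "2000-01-01 is older than six months"
fi
```

Converting both dates to epoch seconds turns the comparison into a plain integer test.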

Calling a shell script that is stored in another shell script variable

I searched SO but could not find any relevant post with this specific problem. I would like to know how to call a shell script which is stored in a variable of another shell script.
In the script below I am trying to read a service name and its corresponding shell script, check if the service is running and, if not, start the service using the shell script associated with that service name. I tried multiple options shared in various forums (like 'eval' etc.) with no luck. Please help with your suggestions on this.
checker.sh
#!/bin/sh
while read service
do
servicename=`echo $service | cut -d: -f1`
servicestartcommand=`echo $service | rev | cut -d: -f1 | rev`
if (( $(ps -ef | grep -v grep | grep $servicename | wc -l) > 0 ))
then
echo "$servicename Running"
else
echo "!!$servicename!! Not Running, calling $servicestartcommand"
eval "$servicestartcommand"
fi
done < names.txt
Names.txt
WebSphere:\opt\software\WebSphere\startServer.sh
WebLogic:\opt\software\WebLogic\startWeblogic.sh
Your script can be refactored into this:
#!/bin/bash
while IFS=: read -r servicename servicestartcommand; do
if ps cax | grep -q "$servicename"; then
echo "$servicename Running"
else
echo "!!$servicename!! Not Running, calling $servicestartcommand"
$servicestartcommand
fi
done < names.txt
No need to pipe grep's output to wc -l, as you can use grep -q.
No need to read the full line and then use cut, rev etc. later. You can use IFS=: and read the line into 2 separate variables.
No need to use eval in the end.
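To see how IFS=: read splits a line, here is a small sketch. The function name parse_services and the sample entries are invented, and the paths use forward slashes. Note that the last variable receives the rest of the line, including any further colons.

```shell
# parse_services: read "name:command" lines from standard input.
# The function name and the sample entries below are invented for this sketch.
parse_services() {
    while IFS=: read -r servicename servicestartcommand; do
        printf '%s -> %s\n' "$servicename" "$servicestartcommand"
    done
}

parse_services <<'EOF'
WebSphere:/opt/software/WebSphere/startServer.sh
Custom:/opt/tools/run.sh --listen localhost:8080
EOF
```

The second entry keeps localhost:8080 intact because only the first colon separates the two variables.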
It is much simpler than you expect. Instead of:
eval "$servicestartcommand"
eval should only be used in extreme circumstances. All you need is
$servicestartcommand
Note: no quotes.
As an example, try this on the command-line:
cmd='ls -l'
$cmd
That should work. But:
"$cmd"
will fail. It will look for a program with a space in its name called 'ls -l'.
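The difference can be checked without touching the file system. This sketch uses echo in place of ls so that the output is predictable.

```shell
cmd='echo hello world'

# Unquoted: the shell splits $cmd into the words echo, hello, world and runs echo.
unquoted=$($cmd)
echo "unquoted: $unquoted"

# Quoted: the whole string is taken as the command name, which does not exist.
status=0
"$cmd" 2>/dev/null || status=$?
echo "quoted exit status: $status"
```

The quoted form reports exit status 127, the shell's "command not found" status. Keep in mind that the unquoted form only splits on whitespace: quotes stored inside the string are not interpreted, so a command whose arguments contain spaces cannot be stored this way.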
Maybe I don't get the idea, but why not use system variables?
export FOO=bar
echo $FOO
bar

Removing lines matching a pattern

I want to search for patterns in a file and remove the lines containing the pattern. To do this, I am using:
originalLogFile='sample.log'
outputFile='3.txt'
temp=$originalLogFile
while read line
do
echo "Removing"
echo $line
grep -v "$line" $temp > $outputFile
temp=$outputFile
done <$whiteListOfErrors
This works fine for the first iteration. On the second iteration, it throws:
grep: input file ‘3.txt’ is also the output
Any solutions or alternate methods?
The following should be equivalent
grep -v -f "$whiteListOfErrors" "$originalLogFile" > "$outputFile"
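A small self-contained run of the same idea, using temporary files with invented content. The -F option is an addition in this sketch; it makes grep treat each whitelist line as a fixed string rather than a regular expression.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'ok: started' 'ERROR: disk full' 'WARN: retry 1/3' 'ok: done' > "$workdir/sample.log"
printf '%s\n' 'disk full' 'retry' > "$workdir/whitelist.txt"

# -f reads one pattern per line; -v keeps only lines matching none of them.
# -F (an addition here) treats the patterns as fixed strings instead of regexes.
filtered=$(grep -v -F -f "$workdir/whitelist.txt" "$workdir/sample.log")
echo "$filtered"
rm -rf "$workdir"
```

One caveat: an empty line in the pattern file matches every input line, so with -v nothing would be printed at all.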
originalLogFile='sample.log'
outputFile='3.txt'
tmpfile='tmp.txt'
temp=$originalLogFile
while read line
do
echo "Removing"
echo $line
grep -v "$line" $temp > $outputFile
cp $outputFile $tmpfile
temp=$tmpfile
done <$whiteListOfErrors
Use sed for this:
sed '/.*pattern.*/d' file
If you have multiple patterns you may use the -e option
sed -e '/.*pattern1.*/d' -e '/.*pattern2.*/d' file
If you have GNU sed (typical on Linux), the -i option is convenient, as it modifies the original file in place instead of writing to a new file. (But handle it with care, so that you do not lose your original.)
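A short sketch of both forms with invented log lines. The in-place run passes a suffix to -i so that sed keeps a backup copy of the original.

```shell
log='ok: started
ERROR: disk full
WARN: retry 1/3
ok: done'

# /regex/d deletes every line that matches; the surrounding .* is not needed.
cleaned=$(printf '%s\n' "$log" | sed -e '/disk full/d' -e '/retry/d')
echo "$cleaned"

# In-place editing with a backup: -i.bak keeps the original as sample.log.bak.
workdir=$(mktemp -d)
printf '%s\n' "$log" > "$workdir/sample.log"
sed -i.bak -e '/disk full/d' -e '/retry/d' "$workdir/sample.log"
kept=$(grep -c '' "$workdir/sample.log")         # lines left after editing
backup=$(grep -c '' "$workdir/sample.log.bak")   # lines in the untouched copy
echo "kept $kept of $backup lines"
rm -rf "$workdir"
```

The backup suffix is a cheap safeguard against a pattern that deletes more than intended.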
Used this to fix the problem:
while read line
do
echo "Removing"
echo $line
grep -v "$line" $temp | tee $outputFile
temp=$outputFile
done <$falseFailures
A trivial solution might be to work with alternating files, e.g.
idx=0
while ...
let next='(idx+1) % 2'
grep ... $file.$idx > $file.$next
idx=$next
A more elegant solution might be the creation of one large grep command, with a single -v and one -e per pattern:
args=( )
while read -r line; do args=( "${args[@]}" -e "$line" ); done < "$whiteList"
grep -v "${args[@]}" "$origFile"
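The alternating-files idea can be fleshed out roughly as follows. This is a sketch with invented log lines and patterns; the two files log.0 and log.1 live in a temporary directory.

```shell
workdir=$(mktemp -d)
file="$workdir/log"
printf '%s\n' 'ok: started' 'ERROR: disk full' 'WARN: retry 1/3' 'ok: done' > "$file.0"

idx=0
while IFS= read -r pattern; do
    next=$(( (idx + 1) % 2 ))
    # Input and output are always different files, so grep does not complain.
    # grep exits 1 when it prints nothing, hence the || true.
    grep -v -e "$pattern" "$file.$idx" > "$file.$next" || true
    idx=$next
done <<'EOF'
disk full
retry
EOF

# After the loop, $file.$idx holds the fully filtered result.
result=$(cat "$file.$idx")
echo "$result"
rm -rf "$workdir"
```

Each pass reads the previous pass's output and writes to the other file, so no file is ever both input and output of the same grep.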

Removing forks in a shell script so that it runs well in Cygwin

I'm trying to run a shell script on Windows in Cygwin. The problem I'm having is that it runs extremely slowly in the following section of code. From a bit of googling, I believe it's due to the large number of fork() calls within the script: as Windows has to use Cygwin's emulation of fork(), it just slows to a crawl.
A typical scenario: on Linux the script completes in under 10 seconds (depending on file size), but on Windows under Cygwin the same file takes nearly 10 minutes.
So the question is, how can I remove some of these forks and still have the script return the same output? I'm not expecting miracles, but I'd like to cut that 10 minute wait time down a fair bit.
Thanks.
check_for_customization(){
filename="$1"
extended_class_file="$2"
grep "extends" "$filename" | grep "class" | grep -v -e '^\s*<!--' | while read line; do
classname="$(echo $line | perl -pe 's{^.*class\s*([^\s]+).*}{$1}')"
extended_classname="$(echo $line | perl -pe 's{^.*extends\s*([^\s]+).*}{$1}')"
case "$classname" in
*"$extended_classname"*) echo "$filename"; echo "$extended_classname |$classname | $filename" >> "$extended_class_file";;
esac
done
}
Update: Changed the regex a bit and used a bit more perl:
check_for_customization(){
filename="$1"
extended_class_file="$2"
grep "^\(class\|\(.*\s\)*class\)\s.*\sextends\s\S*\(.*$\)" "$filename" | grep -v -e '^\s*<!--' | perl -pe 's{^.*class\s*([^\s]+).*extends\s*([^\s]+).*}{$1 $2}' | while read classname extended_classname; do
case "$classname" in
*"$extended_classname"*) echo "$filename"; echo "$extended_classname | $classname | $filename" >> "$extended_class_file";;
esac
done
}
So, using the above code, the run time was reduced from about 8 minutes to 2.5 minutes. Quite an improvement.
If anybody can suggest any other changes I would appreciate it.
Put more commands into one perl script, e.g.
check_for_customization(){
filename="$1" extended_class_file="$2" perl -n - "$1" <<\EOF
next if /^\s*<!--/;
next unless /^.*class\s*([^\s]+).*/; $classname = $1;
next unless /^.*extends\s*([^\s]+).*/; $extended_classname = $1;
if (index($classname, $extended_classname) != -1)
{
print "$ENV{filename}\n";
open FILEOUT, ">>$ENV{extended_class_file}";
print FILEOUT "$extended_classname |$classname | $ENV{filename}\n"
}
EOF
}
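Along the same lines, the whole check can run in a single awk process per file. The sketch below is a simplification rather than a drop-in replacement: it assumes class and extends appear as separate whitespace-delimited words, each followed by the name, which is stricter than the regexes above. The sample file content is invented.

```shell
# Single-process variant of check_for_customization (same two arguments).
check_for_customization() {
    awk -v out="$2" '
        /^[[:space:]]*<!--/ { next }    # skip comment lines
        {
            cls = ""; ext = ""
            for (i = 1; i < NF; i++) {
                if ($i == "class")   cls = $(i + 1)
                if ($i == "extends") ext = $(i + 1)
            }
            # Same condition as the case statement: cls contains ext.
            if (cls != "" && ext != "" && index(cls, ext) > 0) {
                print FILENAME
                print ext " | " cls " | " FILENAME >> out
            }
        }' "$1"
}

# Demonstration with an invented input file.
workdir=$(mktemp -d)
printf '%s\n' 'class MyWidget extends Widget {' \
              'class Other extends Base {' \
              '<!-- class OldWidget extends Widget -->' > "$workdir/Sample.txt"
found=$(check_for_customization "$workdir/Sample.txt" "$workdir/extended.txt")
report=$(cat "$workdir/extended.txt")
echo "$found"
echo "$report"
rm -rf "$workdir"
```

Only the first line qualifies: MyWidget contains Widget, Other does not contain Base, and the comment line is skipped.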

Resources