2 Linux scripts nearly identical. Variables getting confused between two different scripts - linux

I have two scripts. The only difference between the two scripts is the log file name and the device IP address that it fetches the data from. The problem is that the log file, which is concatenated continuously, gets mixed up and starts receiving the contents of one device in the log of the other. So one particular log file randomly switches from showing the data from one device to the other device.
Here is a sample of what it gets from the curl call.
{"method":"uploadsn","mac":"04786364933C","version":"1.35","server":"HT","SN":"267074DE","Data":[7.2]}
I'm 99% sure the issue is with the log variable, as one script runs every 30 minutes and the other runs every 15 minutes, so I can tell by the date stamps that the issue is not fetching from the wrong device but the concatenating of the files. It appears to concat the wrong file to the new file....
Here is the code of both.
#!/bin/bash
log="/scripts/cellar.log"
if [ ! -f "$log" ]
then
touch "$log"
fi
now=`date +%a,%m/%d/%Y#%I:%M%p`
json=$(curl -m 3 --user *****:***** "http://192.168.1.146/monitorjson" --silent --stderr -)
celsius=$(echo $json | cut -d "[" -f2 | cut -d "]" -f1)
temp=$(echo "scale=4; $celsius*1.8 + 32" | bc)
line=$(echo $now : $temp)
echo $line
echo $line | cat - $log > temp && mv temp $log | sed -n '1,192p' $log
and here is the second
#!/bin/bash
log="/scripts/gh.log"
if [ ! -f "$log" ]
then
touch "$log"
fi
now=`date +%a,%m/%d/%Y#%I:%M%p`
json=$(curl -m 3 --user *****:***** "http://192.168.1.145/monitorjson" --silent --stderr -)
celsius=$(echo $json | cut -d "[" -f2 | cut -d "]" -f1)
temp=$(echo "scale=4; $celsius*1.8 + 32" | bc)
line=$(echo $now : $temp)
#echo $line
echo $line | cat - $log > temp && mv temp $log | sed -n '1,192p' $log
Example of bad log file (shows contents of both devices when should only contain 1):
Mon,11/28/2022#03:30AM : 44.96
Mon,11/28/2022#03:00AM : 44.96
Mon,11/28/2022#02:30AM : 44.96
Tue,11/29/2022#02:15AM : 60.62
Tue,11/29/2022#02:00AM : 60.98
Tue,11/29/2022#01:45AM : 60.98

The problem is that you use "temp" as the filename for a temporary file in both scripts.
I'm not good at reading sed, but as I understand it, your command prints only the first 192 lines of the log file. You don't need a temporary file for that.
First: logfiles are usually written from oldest to newest entry (top to bottom), so probably you want to view the 192 newest lines? Then you can make use of the >> output redirection to append your output to the file. Then use tail to get only the bottom of the file. And if necessary, you could reverse that final output.
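Alternatively, to keep the current newest-first order without a temporary file, the last line of your script could be replaced by the following (GNU sed; the log must already contain at least one line, because 1i does nothing on an empty file):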
sed -i '1i '"$line"'
192,$d' $log
Further possible improvements:
Use a single script that gets URL and log filename as parameters
Use the usual log file order (newest entries appended at the end)
Don't truncate log files inside the script, but use logrotate to not exceed a certain filesize
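A minimal sketch of the temporary-file fix, assuming a hypothetical helper named prepend_line (it is not in the original scripts). It builds the new log in a file created by mktemp, so two scripts running at the same time never share a temporary file. The newest-first order and the 192-line cap from the question are kept:

```shell
#!/bin/bash
# prepend_line LOG LINE [MAX]: hypothetical helper, not part of the original scripts.
# Puts LINE at the top of LOG and keeps only the first MAX lines (default 192).
prepend_line() {
    local log=$1 line=$2 max=${3:-192} tmp
    touch "$log" || return 1
    # A unique temporary file next to the log, instead of a shared file named "temp".
    tmp=$(mktemp "${log}.XXXXXX") || return 1
    { printf '%s\n' "$line"; cat "$log"; } | head -n "$max" > "$tmp" && mv "$tmp" "$log"
}
```

Each script would then end with something like prepend_line /scripts/cellar.log "$now : $temp". Note that mktemp creates the file readable by its owner only, so the log ends up with those permissions; add a chmod if other users need to read it.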

Related

Limit number of parallel jobs in bash [duplicate]

I want to read links from file, which is passed by argument, and download content from each.
How can I do it in parallel with 20 processes?
I understand how to do it with an unlimited number of processes:
#!/bin/bash
filename="$1"
mkdir -p saved
while read -r line; do
url="$line"
name_download_file_sha="$(echo $url | sha256sum | awk '{print $1}').jpeg"
curl -L $url > saved/$name_download_file_sha &
done < "$filename"
wait
You can add this test :
until [ "$( jobs -lr 2>&1 | wc -l)" -lt 20 ]; do
sleep 1
done
This will keep at most 20 instances of curl running in parallel.
It waits until 19 or fewer are running before starting another one.
If you are using GNU sleep, you can use sleep 0.5 to shorten the wait.
So your code will be
#!/bin/bash
filename="$1"
mkdir -p saved
while read -r line; do
until [ "$( jobs -lr 2>&1 | wc -l)" -lt 20 ]; do
sleep 1
done
url="$line"
name_download_file_sha="$(echo $url | sha256sum | awk '{print $1}').jpeg"
curl -L $url > saved/$name_download_file_sha &
done < "$filename"
wait
xargs -P is the simple solution. It gets somewhat more complicated when you want to save to separate files, but you can use sh -c to add this bit.
: ${processes:=20}
< "$filename" xargs -P "$processes" -I% sh -c '
line="$1"
url_file="$line"
name_download_file_sha="$(echo "$url_file" | sha256sum | awk "{print \$1}").jpeg"
curl -L "$url_file" > "saved/$name_download_file_sha"
' -- %
Based on triplee's suggestions, I've lower-cased the environment variable and changed its name to 'processes' to be more correct.
I've also made the suggested corrections to the awk script to avoid quoting issues.
You may still find it easier to replace the awk script with cut -f1, but you'll need to specify the cut delimiter if it's spaces (not tabs).
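If polling with sleep feels wasteful, bash 4.3 and later has wait -n, which blocks until any one background job exits. A rough sketch, assuming a hypothetical function run_limited that reads one argument per line from standard input, and a hypothetical download function standing in for the curl call from the question. The counter is only an upper bound on the number of running jobs, so the limit should never be exceeded:

```shell
#!/bin/bash
# run_limited MAX CMD [ARGS...]: hypothetical helper (needs bash 4.3+ for wait -n).
# Runs CMD ARGS LINE in the background for every LINE on stdin,
# keeping at most MAX jobs alive at once.
run_limited() {
    local max=$1 running=0 line
    shift
    while IFS= read -r line; do
        if [ "$running" -ge "$max" ]; then
            wait -n                    # returns when any one job has exited
            running=$((running - 1))
        fi
        "$@" "$line" &
        running=$((running + 1))
    done
    wait                               # let the remaining jobs finish
}

# Stand-in for the curl call from the question.
download() {
    local url=$1 name
    name="$(printf '%s\n' "$url" | sha256sum | awk '{print $1}').jpeg"
    curl -sL "$url" > "saved/$name"
}
```

It would be called as mkdir -p saved; run_limited 20 download < "$filename".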

Check if a file exists with the same name as a directory

I'm trying to make a script that will determine whether a '.zip' file exists for each sub-directory. For example the directory I'm working in could look like this:
/folder1
/folder2
/folder3
folder1.zip
folder3.zip
The script would then recognise that a '.zip' of "folder2" does not exist and then do something about it.
So far I've come up with this (below) to loop through the folders but I'm now stuck trying to convert the directory path into a variable containing the file name. I could then run an if to see whether the '.zip' file exists.
#!/bin/sh
for i in $(ls -d */);
do
filename= "$i" | rev | cut -c 2- | rev
filename="$filename.zip"
done
# No need to use ls
for dir in */
do
# ${var%pattern} removes trailing pattern from a variable
file="${dir%/}.zip"
if [ -e "$file" ]
then
echo "It exists"
else
echo "It's missing"
fi
done
Capturing command output wasn't necessary here, but your line would have been:
# For future reference only
filename=$(echo "$i" | rev | cut -c 2- | rev)
You can do it with something like:
#!/bin/sh
for name in $(ls -d */); do
dirname=$(echo "${name}" | rev | cut -c 2- | rev)
filename="${dirname}.zip"
if [ -f "${filename}" ] ; then
echo "${dirname} has ${filename}"
else
echo "${dirname} has no ${filename}"
fi
done
which outputs, for your test case:
folder1 has folder1.zip
folder2 has no folder2.zip
folder3 has folder3.zip
You can do it without calling ls and this tends to become important if you do it a lot, but it's probably not a problem in this case.
Be aware I haven't tested this with space-embedded file names, it may need some extra tweaks for that.

How to clean csv by another csv while in a 'for' loop?

I'm not a linux expert, and usually in this situation PHP would be much more suitable... But due to the circumstances it occurred that I wrote it in Bash :)
I have the following .sh which runs over all .csv files in the current folder and execute a bunch of commands.
The goal: Cleaning email lists in .csv files (not actually .csv but just a .txt file in practice).
for file in $(find . -name "*.csv" ); do
echo "====================================================" >> db_purge_log.txt
echo "$file" >> db_purge_log.txt
echo "----------------------------------------------------" >> db_purge_log.txt
echo "Contacts BEFORE purge:" >> db_purge_log.txt
wc -l $file | cut -d " " -f1 >> db_purge_log.txt
echo " " >> db_purge_log.txt
cat $file | egrep -v "xxx|yyy|zzz" | grep -v -E -i '([0-z])\1{2,}' | uniq | sort -u > tmp_file
mv tmp_file $file ;
echo "Contacts AFTER purge:" >> db_purge_log.txt
wc -l $file | cut -d " " -f1 >> db_purge_log.txt
done
Now the trouble is:
I want to add a command, somewhere in the middle of this loop, to use another .csv file as suppression list, meaning - every line found as perfect match in that suppression list - delete from $file.
At this point my brain is stuck and I can't think of a solution. To be honest, I didn't manage using sort or grep on 2 different files and export to a 3rd file without completely eliminating the duplicated lines cross both files, so I end up with much less data.
Any help would be much appreciated!
Clean up
Before adding functionality to the script, the existing script needs to be cleaned up — a lot.
I/O Redirection — Don't Repeat Yourself
When I see wall-to-wall I/O redirections like that, I want to cry — that isn't how you do it! You have three options to avoid all that:
for file in $(find . -name "*.csv" )
do
echo "===================================================="
echo "$file"
echo "----------------------------------------------------"
echo "Contacts BEFORE purge:"
wc -l $file | cut -d " " -f1
echo " "
cat $file | egrep -v "xxx|yyy|zzz" | grep -v -E -i '([0-z])\1{2,}' | uniq | sort -u > tmp_file
mv tmp_file $file ;
echo "Contacts AFTER purge:"
wc -l $file | cut -d " " -f1
done >> db_purge_log.txt
Or:
{
for file in $(find . -name "*.csv" )
do
echo "===================================================="
echo "$file"
echo "----------------------------------------------------"
echo "Contacts BEFORE purge:"
wc -l $file | cut -d " " -f1
echo " "
cat $file | egrep -v "xxx|yyy|zzz" | grep -v -E -i '([0-z])\1{2,}' | uniq | sort -u > tmp_file
mv tmp_file $file ;
echo "Contacts AFTER purge:"
wc -l $file | cut -d " " -f1
done
} >> db_purge_log.txt
Or even:
exec >>db_purge_log.txt # By default, standard output will go to db_purge_log.txt
for file in $(find . -name "*.csv" )
do
echo "===================================================="
echo "$file"
echo "----------------------------------------------------"
echo "Contacts BEFORE purge:"
wc -l $file | cut -d " " -f1
echo " "
cat $file | egrep -v "xxx|yyy|zzz" | grep -v -E -i '([0-z])\1{2,}' | uniq | sort -u > tmp_file
mv tmp_file $file ;
echo "Contacts AFTER purge:"
wc -l $file | cut -d " " -f1
done
The first form is adequate for this script which has a single loop in it to provide I/O redirection to. The second form, using { and } would handle more general sequences of commands. The third form, using exec, is 'permanent'; you can't recover the original standard output, whereas with the { ... } form you can have different sections of the script writing to different places.
One other advantage of all these variations is that you can trivially send errors to the same place that you're sending standard output if that's what you desire. For example:
exec >>db_purge_log.txt 2>&1
Other issues
Suppressing file name from wc — instead of:
wc -l $file | cut -d " " -f1
use:
wc -l < $file
UUOC — Useless use of cat — instead of:
cat $file | egrep -v "xxx|yyy|zzz" | grep -v -E -i '([0-z])\1{2,}' | uniq | sort -u > tmp_file
use:
egrep -v "xxx|yyy|zzz" $file | grep -v -E -i '([0-z])\1{2,}' | uniq | sort -u > tmp_file
UUOU — Useless use of uniq
It is not at all clear why you need uniq and sort -u; in context, sort -u is sufficient, so:
egrep -v "xxx|yyy|zzz" $file | grep -v -E -i '([0-z])\1{2,}' | sort -u > tmp_file
UUOG — Useless use of grep
egrep is equivalent to grep -E, and both can handle multiple regular expressions. The second expression matches whatever the parenthesized expression matches, repeated three or more times (we really only need to match three times), so the second expression will do the job of the first. The [0-z] range is dubious: it probably matches sundry punctuation characters as well as digits and upper and lower case letters, and you're already doing a case-insensitive search because of the -i, so we can regularize all that to:
grep -Eiv '([0-9a-z]){3}' $file | sort -u > tmp_file
File names with spaces
The code is not going to handle file names with spaces, tabs or newlines because of the for file in $(find ...) notation. It probably isn't necessary to deal with that now — be aware of the issue.
Final clean up
for file in $(find . -name "*.csv" )
do
echo "===================================================="
echo "$file"
echo "----------------------------------------------------"
echo "Contacts BEFORE purge:"
wc -l < $file
echo " "
grep -Evi '([0-9a-z]){3}' $file | sort -u > tmp_file
mv tmp_file $file
echo "Contacts AFTER purge:"
wc -l <$file
done >> db_purge_log.txt
Add the extra functionality
I want to add a command, somewhere in the middle of this loop, to use another .csv file as suppression list — meaning that every line found as perfect match in that suppression list should be deleted from $file.
Since we're already sorting the input files ($file), we can sort the suppression file (call it suppfile='suppressions.txt') too if it is not already sorted. Given that, we then use comm to eliminate the lines that appear in both $file and $suppfile. We're interested in the lines that only appear in $file (or, as will be the case here, in the edited and sorted version of the file), so we want to suppress the common entries and the entries from $suppfile that do not appear in $file. The comm -23 - "$suppfile" command reads the edited, sorted file from standard input (-) and leaves out the entries from "$suppfile".
suppfile='suppressions.txt' # Must be in sorted order
for file in $(find . -name "*.csv" )
do
echo "===================================================="
echo "$file"
echo "----------------------------------------------------"
echo "Contacts BEFORE purge:"
wc -l < "$file"
echo " "
grep -Evi '([0-9a-z]){3}' "$file" | sort -u | comm -23 - "$suppfile" > tmp_file
mv tmp_file "$file"
echo "Contacts AFTER purge:"
wc -l < "$file"
done >> db_purge_log.txt
If the suppression file is not in sorted order, simply sort it into a temporary file. Beware of using the .csv suffix on the suppression file in the current directory; it will catch the file and empty it because every line in the suppression file matches a line in the suppression file, which is not helpful for any files processed after the suppression file.
Oops — I over-simplified the grep regex. It should (probably) be:
grep -Evi '([0-9a-z])\1{2}' $file
The difference is considerable. My original rewrite will look for any three adjacent digits or letters (e.g. 123 or abz); the revision (actually very similar to one of the original commands) looks for a character from [0-9A-Za-z] followed by two occurrences of the same character (e.g. 111 or aaa, but not 123 or abz).
If perchance the alternatives xxx|yyy|zzz were really not 3 repeated characters, you might need two invocations of grep in sequence.
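For the case where the suppression file is not sorted, here is a small sketch, assuming a hypothetical helper named remove_suppressed. It sorts the suppression list into a temporary file first and forces the C locale so that sort and comm agree on the ordering:

```shell
#!/bin/bash
# remove_suppressed FILE SUPPFILE: hypothetical helper.
# Prints the sorted, de-duplicated lines of FILE that are not in SUPPFILE.
remove_suppressed() {
    local file=$1 suppfile=$2 sorted_supp
    sorted_supp=$(mktemp) || return 1
    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort -u "$suppfile" > "$sorted_supp"
    # -23 drops lines unique to the second input (column 2)
    # and lines common to both inputs (column 3).
    LC_ALL=C sort -u "$file" | LC_ALL=C comm -23 - "$sorted_supp"
    rm -f "$sorted_supp"
}
```

Inside the loop this would replace the comm call, for example grep ... "$file" > filtered; remove_suppressed filtered "$suppfile" > tmp_file, where filtered is just an illustrative name.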
If I understand you correctly, assuming a recent 'nix, grep should do most of the trick for you. The command, grep -vf filterfile input.csv will output the lines in input.csv that do NOT match any regular expression found in filterfile.
A couple of other comments ... uniq needs the input sorted in order to remove dups, so you might want the sort before it in the pipe (unless your input data is sorted).
Or, if the input is not sorted to start with, sort -u will sort it and omit duplicates in one step.
Small suggestion -- you might add a #!/bin/bash as the first line in order to ensure that the script is run by bash rather than the user's login shell (it might not be bash).
HTH.
b
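Since the question asks for perfect matches only, it may be worth adding -F and -x to the grep -vf idea, so that the suppression lines are treated as fixed strings that must match a whole line rather than as regular expressions. A minimal sketch, assuming a hypothetical helper named suppress_exact:

```shell
#!/bin/bash
# suppress_exact LIST SUPP: hypothetical helper.
# Prints the lines of LIST that do not appear verbatim as a line of SUPP.
suppress_exact() {
    local list_file=$1 supp_file=$2
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$supp_file" -- "$list_file"
}
```

Unlike the comm approach, this keeps the original line order and does not need either file to be sorted. grep exits with status 1 when every line is suppressed, which matters only if the script checks exit codes.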

newbie in bash scripting assistance please

I run bash scripts from time to time on my servers. I am trying to write a script that monitors log folders and compresses log files if a folder exceeds a defined capacity. I know there are better ways of doing what I am currently trying to do, and your suggestions are more than welcome. The script below is throwing an "unexpected end of file" error. Below is my script.
dir_base=$1
size_ok=5000000
cd $dir_base
curr_size=`du -s -D | awk '{print $1}' | sed 's/%//g'`
zipname=archive`date +%Y%m%d`
if (( $curr_size > $size_ok ))
then
echo "Compressing and archiving files, Logs folder has grown above 5G"
echo "oldest to newest selected."
targfiles=( `ls -1rt` )
echo "Process files."
for tfile in ${targfiles[@]}
do
let `du -s -D | awk '{print $1}' | sed 's/%//g' | tail -1`
if [ $curr_size -lt $size_ok ];
then
echo "$size_ok has been reached. Stopping processes"
break
else if [ $curr_size -gt $size_ok ];
then
zip -r $zipname $tfile
rm -f $tfile
echo "Added ' $tfile ' to archive'date +%Y%m%d`'.zip and removed"
else [ $curr_size -le $size_ok ];
echo "files in $dir_base are less than 5G, not archiving"
fi
Look into logrotate. Here is an example of putting it to use.
With what you give us, you lack a "done" to end the for loop and a "fi" to end the main if. Please reformat your code and you will get more precise answers ...
EDIT :
Looking at your reformatted script, it is as I said: the "unexpected end of file" comes from the fact that you have closed neither your "for" loop nor your "if".
As it seems that you mimic the logrotate behaviour, check it as suggested by @Hank...
my2c
My du -s -D does not show % sign. So you can just do.
curr_size=$(du -s -D)
set -- $curr_size
curr_size=$1
saves you a few overheads instead of du -s -D | awk '{print $1}' | sed 's/%//g'.
If it does show % sign, you can get rid of it like this
du -s -D | awk '{print $1+0}'. No need to use sed.
Use $() syntax instead of backticks whenever possible
For targfiles=(ls -1rt) , you can omit the -1. So it can be
targfiles=( $(ls -rt) )
Use quotes around your variables whenever possible. eg "$zipname" , "$tfile"
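Putting the advice above together, here is a rough sketch of the same idea with every loop and if closed. It is not the original script: the function name archive_until_small is hypothetical, sizes are taken in KiB from du -sk, gzip stands in for the zip archive, and file names are assumed to contain no newlines (ls is still parsed, as in the question):

```shell
#!/bin/bash
# archive_until_small DIR LIMIT_KB: hypothetical helper, runs in a subshell
# so the caller's working directory is left alone.
archive_until_small() (
    dir=$1
    limit_kb=$2
    cd "$dir" || exit 1

    dir_size_kb() { du -sk . | awk '{print $1}'; }

    if [ "$(dir_size_kb)" -le "$limit_kb" ]; then
        echo "files in $dir are within ${limit_kb}K, not archiving"
        exit 0
    fi

    # ls -rt lists oldest first. The loop is closed by "done" and each
    # "if" by its own "fi", which is what the original script lacked.
    ls -rt | while IFS= read -r tfile; do
        if [ "$(dir_size_kb)" -le "$limit_kb" ]; then
            echo "limit of ${limit_kb}K reached, stopping"
            break
        fi
        case $tfile in *.gz) continue ;; esac   # already compressed
        [ -f "$tfile" ] || continue              # skip subdirectories
        gzip -- "$tfile" && echo "compressed $tfile"
    done
)
```

It would be called as archive_until_small "$dir_base" 5000000. For real log directories logrotate is still the better tool, as suggested above.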

how to loop files in linux from svn status

Being quite a newbie in Linux, I have the following question.
I have a list of files (this time resulting from svn status) and I want to create a script to loop over them all and replace tabs with 4 spaces.
So I want from
....
D HTML/templates/t_bla.tpl
M HTML/templates/t_list_markt.tpl
M HTML/templates/t_vip.tpl
M HTML/templates/upsell.tpl
M HTML/templates/t_warranty.tpl
M HTML/templates/top.tpl
A + HTML/templates/t_r1.tpl
....
to something like
for i in <files>; expand -t4;do cp $i /tmp/x;expand -t4 /tmp/x > $i;done;
but I don't know how to do that...
You can use this command:
svn st | cut -c8- | xargs ls
This removes the first seven characters of each line, which hold the Subversion status flags, leaving only the file names. You can also add grep before cut to keep only some types of change, like /^M/. xargs will pass the list of files as arguments to a given command (ls in this case).
I would use sed, like so:
for i in $files
do
sed -i 's/\t/    /g' "$i"
done
That big block in there is four spaces. ;-)
I haven't tested that, but it should work. And I'd back up your files just in case. The -i flag means that it will do the replacements on the files in-place, but if it messes up, you'll want to be able to restore them.
This assumes that $files contains the filenames. However, you can also use Adam's approach at grabbing the filenames, just use the sed command above without the "$i".
Not asking for any votes, but for the record I'll post the combined answer from @Adam Byrtek and @Dan Fego:
svn st | cut -c8- | xargs sed -i 's/\t/    /g'
I could not test it with real subversion output, but this should do the job:
svn st | cut -c8- | while read -r file; do expand -t4 "$file" > "$file-temp"; mv "$file-temp" "$file"; done
svn st | cut -c8- will generate a list of files without subversion flags. read will then save each entry in the variable $file and expand is used to replace the tabs with four spaces in each file.
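A slightly more defensive sketch of the same loop, assuming a hypothetical function named expand_listed_files. It skips deleted, missing and unversioned entries, copes with spaces in paths, and writes the result back with cat so the file keeps its permissions. The 8-character status prefix is an assumption about the svn status layout, so check it against your own output first:

```shell
#!/bin/bash
# expand_listed_files: hypothetical helper, reads `svn status` lines on stdin.
# Assumption: every line starts with an 8-character status prefix
# (seven flag columns and a space) followed by the path.
expand_listed_files() {
    local line status path tmp
    while IFS= read -r line; do
        status=${line:0:1}
        path=${line:8}
        # Skip deleted (D), missing (!) and unversioned (?) entries.
        case $status in 'D'|'!'|'?') continue ;; esac
        [ -f "$path" ] || continue
        tmp=$(mktemp) || return 1
        # Write back with cat so the file keeps its permissions.
        expand -t4 "$path" > "$tmp" && cat "$tmp" > "$path"
        rm -f "$tmp"
    done
}
```

It would be used as svn status | expand_listed_files.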
Not quite what you're asking, but perhaps you should be looking into commit hooks in subversion?
You could create a hook to block check-ins of any code that contains tabs at the start of a line, or contains tabs at all.
In the repo directory on your subversion server there'll be a directory called hooks. Put something in there which is executable called 'pre-commit' and it'll be run before anything is allowed to be committed. It can return a status to block the commit if you wish.
Here's what I have to stop php files with syntax errors being checked in:
#!/bin/bash
REPOS="$1"
TXN="$2"
PHP="/usr/bin/php"
SVNLOOK=/usr/bin/svnlook
$SVNLOOK log -t "$TXN" "$REPOS" | grep "[a-zA-Z0-9]" > /dev/null
if [ $? -ne 0 ]
then
echo 1>&2
echo "You must enter a comment" 1>&2
exit 1
fi
CHANGED=`$SVNLOOK changed -t "$TXN" "$REPOS" | awk '{print $2}'`
for LINE in $CHANGED
do
FILE=`echo $LINE | egrep \\.php$`
if [ $? == 0 ]
then
MESSAGE=`$SVNLOOK cat -t "$TXN" "$REPOS" "${FILE}" | $PHP -l`
if [ $? -ne 0 ]
then
echo 1>&2
echo "***********************************" 1>&2
echo "PHP error in: ${FILE}:" 1>&2
echo "$MESSAGE" | sed "s| -| $FILE|g" 1>&2
echo "***********************************" 1>&2
exit 1
fi
fi
done

Resources