How to append string in the contents of files on the basis of search

How to append string in the contents of files on the basis of search - linux

So far i have tried below unix script to append the contents of files on the basis of search. Everything is fine. But if i run the same below script multiple times the perl command is keep on appending in the files.
searchArr=(test1 test2 test3)
replaceArr=(somestring1 somestring2 somestring3)
a=`echo ${#searchArr[#]}`
echo $a
for ((i=0;i<$a;i++))
do
searchValue=`echo ${searchArr[i]}`
replaceValue=`echo ${replaceArr[i]}`
for file in `find . -name \* -print`; do
grep "$searchValue" $file &> /dev/null
if [ $? -ne 0 ]; then
echo "Search string not found in $file!"
else
perl -p -i -e "s/$searchValue/$replaceValue/g" $file
echo $file "Replace string Success!"
fi
done
done
Current Output:-
If run more than one times it will keep on appending like that.
For Example if run two times it will append two times
somestring1somestring1test1
somestring2somestring2test2
somestring3somestring3test3
Needed Output:-
No matter how many times it runs
somestring1test1
somestring2test2
somestring3test3

Add word boundaries to your sed search pattern.
Beginning and end of words in sed and grep

Related

Shell - iterate over content of file but do something only the first x lines

So guys,
I need your help trying to identify the fastest and the most "fault" tolerant solution to my problem.
I have a shell script which executes some functions, based on a txt file, in which I have a list of files.
The list can contain from 1 file to X files.
What I would like to do is iterate over the content of the file and execute my scripts for only 4 items out of the file.
Once the functions have been executed for these 4 files, go over to the next 4 .... and keep on doing so until all the files from the list have been "processed".
My code so far is as follows.
#!/bin/bash
number_of_files_in_folder=$(cat list.txt | wc -l)
max_number_of_files_to_process=4
Translated_files=/home/german_translated_files/
while IFS= read -r files
do
while [[ $number_of_files_in_folder -gt 0 ]]; do
i=1
while [[ $i -le $max_number_of_files_to_process ]]; do
my_first_function "$files" & # I execute my translation function for each file, as it can only perform 1 file per execution
find /home/german_translator/ -name '*.logs' -exec mv {} $Translated_files \; # As there will be several files generated, I have them copied to another folder
sed -i "/$files/d" list.txt # We remove the processed file from within our list.txt file.
my_second_function # Without parameters as it will process all the files copied at step 2.
done
# here, I want to have all the files processed and don't stop after the first iteration
done
done < list.txt
Unfortunately, as I am not quite good at shell scripting, I do not know how to structure it so that it won't waste any resources and mostly, to make sure that it "processes" everything from that file.
Do you have any advice on how to achieve what I am trying to achieve?

only 4 items out of the file. Once the functions have been executed for these 4 files, go over to the next 4
Seems to be quite easy with xargs.
your_function() {
echo "Do something with $1 $2 $3 $4"
}
export -f your_function
xargs -d '\n' -n 4 bash -c 'your_function "$#"' _ < list.txt
xargs -d '\n' for each line
-n 4 take for arguments
bash .... - run this command with 4 arguments
_ - the syntax is bash -c <script> $0 $1 $2 etc..., see man bash.
"$#" - forward arguments
export -f your_function - export your function to environment so child bash can pick it up.
I execute my translation function for each file
So you execute your translation function for each file, not for each 4 files. If the "translation function" is really for each file with no inter-file state, consider rather executing 4 processes in parallel with same code and just xargs -P 4.

If you have GNU Parallel it looks something like this:
doit() {
my_first_function "$1"
my_first_function "$2"
my_first_function "$3"
my_first_function "$4"
my_second_function "$1" "$2" "$3" "$4"
}
export -f doit
cat list.txt | parallel -n4 doit

List files greater than 100K in bash

I want to list the files recursively in the HOME directory. I'm trying to write my own script , so I should not use the command find or ls. My script is:
#!/bin/bash
minSize=102400;
printFiles() {
for x in "$1/"*; do
if [ -d "$x" ]; then
printFiles "$x";
else
size=$(wc -c "$x");
if [[ "$size" -gt "$minSize" ]]; then
echo "$size";
fi
fi
done
}
printFiles "/~";
So, the problem here is that when I run this script, the terminal throws Line 11: division by 0 and /home/gandalf/Videos/*: No such file or directory. I have not divided by any number, why I'm getting this error?. And the second one?
Alternatively, I can't use find or ls because I have to display the files one by one asking to the user if he want to see the next file or not. This is possible using the command find or ls or only can be done writing my own function?
Thanks.

size=$(wc -c "$x");
That's the line that is failing. When you run that wc command manually you should be able to see why:
$ wc -c /tmp/out
5 /tmp/out
The output contains not only the file size but also the file name. So you can't use $size with the -gt comparator on the next line. One way to fix that is to change the wc line to use cut (or awk, or sed, etc) to keep just the file size.
size=$(wc -c "$x" | cut -f1 -d " ")
A simpler alternative suggested by #mklement0:
size=$(wc -c < "$x")

How to extract only file name return from diff command?

I am trying to prepare a bash script for sync 2 directories. But I am not able to file name return from diff. everytime it converts to array.
Here is my code :
#!/bin/bash
DIRS1=`diff -r /opt/lampp/htdocs/scripts/dev/ /opt/lampp/htdocs/scripts/www/ `
for DIR in $DIRS1
do
echo $DIR
done
And if I run this script I get out put something like this :
Only
in
/opt/lampp/htdocs/scripts/www/:
file1
diff
-r
"/opt/lampp/htdocs/scripts/dev/File
1.txt"
"/opt/lampp/htdocs/scripts/www/File
1.txt"
0a1
>
sa
das
Only
in
/opt/lampp/htdocs/scripts/www/:
File
1.txt~
Only
in
/opt/lampp/htdocs/scripts/www/:
file
2
-
second
Actually I just want to file name where I find the diffrence so I can take perticular action either copy/delete.
Thanks

I don't think diff produces output which can be parsed easily for your purposes. It's possible to solve your problem by iterating over the files in the two directories and running diff on them, using the return value from diff instead (and throwing the diff output away).
The code to do this is a bit long, but here it is:
DIR1=./one # set as required
DIR2=./two # set as required
# Process any files in $DIR1 only, or in both $DIR1 and $DIR2
find $DIR1 -type f -print0 | while read -d $'\0' -r file1; do
relative_path=${file1#${DIR1}/};
file2="$DIR2/$relative_path"
if [[ ! -f "$file2" ]]; then
echo "'$relative_path' in '$DIR1' only"
# Do more stuff here
elif diff -q "$file1" "$file2" >/dev/null; then
echo "'$relative_path' same in '$DIR1' and '$DIR2'"
# Do more stuff here
else
echo "'$relative_path' different between '$DIR1' and '$DIR2'"
# Do more stuff here
fi
done
# Process files in $DIR2 only
find $DIR2 -type f -print0 | while read -d $'\0' -r file2; do
relative_path=${file2#${DIR2}/};
file1="$DIR1/$relative_path"
if [[ ! -f "$file2" ]]; then
echo "'$relative_path' in '$DIR2 only'"
# Do more stuff here
fi
done
This code leverages some tricks to safely handle files which contain spaces, which would be very difficult to get working by parsing diff output. You can find more details on that topic here.
Of course this doesn't do anything regarding files which have the same contents but different names or are located in different directories.
I tested by populating two test directories as follows:
echo "dir one only" > "$DIR1/dir one only.txt"
echo "dir two only" > "$DIR2/dir two only.txt"
echo "in both, same" > $DIR1/"in both, same.txt"
echo "in both, same" > $DIR2/"in both, same.txt"
echo "in both, and different" > $DIR1/"in both, different.txt"
echo "in both, but different" > $DIR2/"in both, different.txt"
My output was:
'dir one only.txt' in './one' only
'in both, different.txt' different between './one' and './two'
'in both, same.txt' same in './one' and './two'

Use -q flag and avoid the for loop:
diff -rq /opt/lampp/htdocs/scripts/dev/ /opt/lampp/htdocs/scripts/www/
If you only want the files that differs:
diff -rq /opt/lampp/htdocs/scripts/dev/ /opt/lampp/htdocs/scripts/www/ |grep -Po '(?<=Files )\w+'|while read file; do
echo $file
done
-q --brief
Output only whether files differ.
But defitnitely you should check rsync: http://linux.die.net/man/1/rsync

Bash scripting wanting to find a size of a directory and if size is greater than x then do a task

I have put the following together with a couple of other articles but it does not seem to be working. What I am trying to do eventually do is for it to check the directory size and then if the directory has new content above a certain total size it will then let me know.
#!/bin/bash
file=private/videos/tv
minimumsize=2
actualsize=$(du -m "$file" | cut -f 1)
if [ $actualsize -ge $minimumsize ]; then
echo "nothing here to see"
else
echo "time to sync"
fi
this is the output:
./sync.sh: line 5: [: too many arguments
time to sync
I am new to bash scripting so thank you in advance.

The error:
[: too many arguments
seems to indicate that either $actualsize or $minimumsize is expanding to more than one argument.
Change your script as follows:
#!/bin/bash
set -x # Add this line.
file=private/videos/tv
minimumsize=2
actualsize=$(du -m "$file" | cut -f 1)
echo "[$actualsize] [$minimumsize]" # Add this line.
if [ $actualsize -ge $minimumsize ]; then
echo "nothing here to see"
else
echo "time to sync"
fi
The set -x will echo commands before attempting to execute them, something which assists greatly with debugging.
The echo "[$actualsize] [$minimumsize]" will assist in trying to establish whether these variables are badly formatted or not, before the attempted comparison.
If you do that, you'll no doubt find that some arguments will result in a lot of output from the du -m command since it descends into subdirectories and gives you multiple lines of output.
If you want a single line of output for all the subdirectories aggregated, you have to use the -s flag as well:
actualsize=$(du -ms "$file" | cut -f 1)
If instead you don't want any of the subdirectories taken into account, you can take a slightly different approach, limiting the depth to one and tallying up all the sizes:
actualsize=$(find . -maxdepth 1 -type f -print0 | xargs -0 ls -al | awk '{s += $6} END {print int(s/1024/1024)}')

sed with variable as argument in bash script

I am trying to write a bash script to scan for authorized_keys files and remove the keys of a couple previous employees if found. I am having one heck of a time figuring out the escaping for the sed command at the end. I am using commas instead of / since / can show up in the ssh-key. Any help would be appreciated
#!/bin/bash
declare -A keys
keys["employee1"]='AAAAB3NzaC1yc2EAAAABJQAAAIEAxoZ7ZdpJkL98n8cSTkFBwaAeSNK0m/tOWtF1mu5NAzMM/+1SDO6rJH/ruyyqBJo9s+AHWZLGRHfXT2XBg2SRaUnubAKp0w6qNIbej0MsA/ifAs8AfVGdj0pUPLtKpo6XVZkB8vEZSIQ+xNk1n5HJrGJnFGWKWeY3z1/KOLxcLHU='
keys["employee2"]='AAAAB3NzaC1yc2EAAAABIwAAAQEAwHYNAVhb319OBVXPhYF8cSTkFBwaAekr7UcKjfLPCHMpz19W0L/C0g+75Hn8COxOQILDUhIPhYHXOduQjGD/6NXgJDWxgyT00Azg5BREUnBd58WqZPlEvTZYlAgmdMIbnWPPGdJwzqKH/k7/STK6vTKxL6rxBo4lSNK0m/tOWtF1mu5NAzMM/+1SDO6rJH/ruyyqBJo9s+NIbej0MsA/ifAs8AfAkfO2JjgeQpJMyZ7B02XVN5iSLAyC3Cb5FXIjJuk4LPhcApuVyszH2lgve0r5bt/nFgVujJTvJTHPlGrqkYDcDJVUtfbjoLqGPrnpijp6rGIC7aFDDe7bk0ygHYMXDFWcjJBerfLGUWTYWFFLY3bfiO/h/9oEycmQHyB2co4a0IyyDnaYn9OY6xsRRATVlk4Q=='
files=`find / -name authorized_keys`
echo "Checking Authorized_Keys files on: " `hostname`
echo ""
echo "Located files: "
for file in $files; do
echo " $file"
done
echo""
for file in $files; do
for key in "${!keys[#]}"; do
if grep -q ${keys[$key]} $file; then
echo " *** Removing $key from $file"
sed "s,${keys[$key]},d" $file
fi
done
done

You've made it a bit complicated I think.
You can do this using grep -vf and process substitution:
# array to hold the value you want to remove
keys=(
'AAAAB3NzaC1yc2EAAAABJQAAAIEAxoZ7ZdpJkL98n8cSTkFBwaAeSNK0m/tOWtF1mu5NAzMM/+1SDO6rJH/ruyyqBJo9s+AHWZLGRHfXT2XBg2SRaUnubAKp0w6qNIbej0MsA/ifAs8AfVGdj0pUPLtKpo6XVZkB8vEZSIQ+xNk1n5HJrGJnFGWKWeY3z1/KOLxcLHU=',
'AAAAB3NzaC1yc2EAAAABIwAAAQEAwHYNAVhb319OBVXPhYF8cSTkFBwaAekr7UcKjfLPCHMpz19W0L/C0g+75Hn8COxOQILDUhIPhYHXOduQjGD/6NXgJDWxgyT00Azg5BREUnBd58WqZPlEvTZYlAgmdMIbnWPPGdJwzqKH/k7/STK6vTKxL6rxBo4lSNK0m/tOWtF1mu5NAzMM/+1SDO6rJH/ruyyqBJo9s+NIbej0MsA/ifAs8AfAkfO2JjgeQpJMyZ7B02XVN5iSLAyC3Cb5FXIjJuk4LPhcApuVyszH2lgve0r5bt/nFgVujJTvJTHPlGrqkYDcDJVUtfbjoLqGPrnpijp6rGIC7aFDDe7bk0ygHYMXDFWcjJBerfLGUWTYWFFLY3bfiO/h/9oEycmQHyB2co4a0IyyDnaYn9OY6xsRRATVlk4Q=='
)
while IFS= read -d '' -r file; do
grep -vf <(printf "%s\n" "${keys[#]}") "$file" > "$file.tmp"
mv "$file.tmp" "$file"
done < <(find / -name authorized_keys -print0)

In your case, it's easy, just need to use a sign which not contained in base64 code as the delimiter, eg |:
sed "\|${keys[$key]}|d" $file
Explanation in the sed manual:
\%regexp%
(The % may be replaced by any other single character.)
This also matches the regular expression regexp, but allows one to use a different delimiter than /.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string