I'm writing a bash script to read a set of files line by line and perform some edits. To begin with, I'm simply trying to move the files to backup locations and write them out as-is, to test the script is working. However, it is failing to copy the last line of each file. Here is the snippet:
while IFS= read -r line
do
    echo "Line is ***$line***"
    echo "$line" >> $POM
done < $POM.backup
I obviously want to preserve whitespace when I copy the files, which is why I have set the IFS to null. I can see from the output that the last line of each file is being read, but it never appears in the output.
I've also tried an alternative variation, which does print the last line, but adds a newline to it:
while IFS= read -r line || [ -n "$line" ]
do
    echo "Line is ***$line***"
    echo "$line" >> $POM
done < $POM.backup
What is the best way to do this read-write operation, so that the files are written out exactly as they are, with the correct whitespace and no newlines added?
The command that is adding the line feed (LF) is not the read command, but the echo command. read does not return the line with the delimiter still attached to it; rather, it strips the delimiter off (that is, it strips it off if it was present in the line, IOW, if it just read a complete line).
So, to solve the problem, you have to use echo -n to avoid adding back the delimiter, but only when you have an incomplete line.
Secondly, I've found that when you give read a NAME (in your case line) and IFS has its default value, it trims leading and trailing whitespace, which you don't want. Setting IFS= avoids that, as you already do; alternatively you can give read no NAME at all and use the default variable REPLY, which preserves all whitespace regardless of IFS.
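As a quick illustration (not part of the fix), read strips the delimiter from a complete line, and it still leaves an incomplete final line in REPLY even though it returns non-zero:
$ printf 'complete\nincomplete' | { while IFS= read -r; do echo "got=$REPLY"; done; echo "leftover=$REPLY"; }
got=complete
leftover=incomplete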
So, this should work:
#!/bin/bash
inFile=in;
outFile=out;
rm -f "$outFile";
rc=0;
while [[ $rc -eq 0 ]]; do
    read -r;
    rc=$?;
    if [[ $rc -eq 0 ]]; then ## complete line
        echo "complete=\"$REPLY\"";
        echo "$REPLY" >>"$outFile";
    elif [[ -n "$REPLY" ]]; then ## incomplete line
        echo "incomplete=\"$REPLY\"";
        echo -n "$REPLY" >>"$outFile";
    fi;
done <"$inFile";
exit 0;
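For a quick sanity check that nothing is added or lost, you can compare the two files byte for byte (copy.sh is just a hypothetical name for the script above; cmp is silent when the files are identical):
printf 'line1\nline2' > in     # 'line2' deliberately has no trailing newline
./copy.sh                      # hypothetical name for the script above
cmp in out && echo "in and out are byte-identical"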
Edit: Wow! Three excellent suggestions from Charles Duffy, here's an updated script:
#!/bin/bash
inFile=in;
outFile=out;
while { read -r; rc=$?; [[ $rc -eq 0 || -n "$REPLY" ]]; }; do
    if [[ $rc -eq 0 ]]; then ## complete line
        echo "complete=\"$REPLY\"";
        printf '%s\n' "$REPLY" >&3;
    else ## incomplete line
        echo "incomplete=\"$REPLY\"";
        printf '%s' "$REPLY" >&3;
    fi;
done <"$inFile" 3>"$outFile";
exit 0;
After review, I wonder if:
{
    line=
    while IFS= read -r line
    do
        echo "$line"
        line=
    done
    echo -n "$line"
} < "$INFILE" > "$OUTFILE"
is just not enough: when read hits end-of-file on an incomplete last line, it still leaves that partial line in $line, so the final echo -n writes it out without adding a newline.
Here is my initial proposal:
#!/bin/bash
INFILE=$1
if [[ -z $INFILE ]]
then
    echo "[ERROR] missing input file" >&2
    exit 2
fi
OUTFILE=$INFILE.processed
# a way to know if last line is complete or not :
lastline=$(tail -n 1 "$INFILE" | wc -l)
if [[ $lastline == 0 ]]
then
    echo "[WARNING] last line is incomplete -" >&2
fi
# we add a newline ANYWAY; if the last line was already complete, the end of file is simply seen as an extra empty line
echo | cat "$INFILE" - | {
    first=1
    while IFS= read -r line
    do
        if [[ $first == 1 ]]
        then
            echo "First Line is ***$line***" >&2
            first=0
        else
            echo "Next Line is ***$line***" >&2
            echo
        fi
        echo -n "$line"
    done
} > "$OUTFILE"
if diff "$OUTFILE" "$INFILE"
then
    echo "[OK]"
    exit 0
else
    echo "[KO] processed file differs from input"
    exit 1
fi
The idea is to always add a newline at the end of the file and to print newlines only BETWEEN the lines that are read.
This should work for just about any text file, provided it does not contain a NUL byte (the \0 character); if it does, the NUL bytes will be lost.
The initial test can be used to decide whether an incomplete text file is acceptable or not.
Add a newline only when the line was actually complete. read strips the trailing newline, so you cannot look for \n at the end of $line; check read's exit status instead. Like this:
while IFS= read -r line; rc=$?; [[ $rc -eq 0 || -n $line ]]
do
    echo "Line is ***$line***";
    printf '%s' "$line" >&3;
    if [[ $rc -eq 0 ]]      # read stripped a newline, so put it back
    then
        printf '\n' >&3;
    fi
done < "$POM.backup" 3>"$POM"
Related
I have a comma-separated (sometimes tab-separated) text file, as below:
parameters.txt:
STD,ORDER,ORDER_START.xml,/DML/SOL,Y
STD,INSTALL_BASE,INSTALL_START.xml,/DML/IB,Y
With the code below I try to loop through the file and do something:
while read line; do
    if [[ $1 = "$(echo "$line" | cut -f 1)" ]] && [[ "$(echo "$line" | cut -f 5)" = "Y" ]] ; then
        # do something...
        if [[ $? -eq 0 ]] ; then
            # code to replace the final flag
        fi
    fi
done < <text_file_path>
I wanted to update the last column of the file to N if the above operation is successful; however, the approaches below are not working for me:
sed 's/$f5/N/'
'$5=="Y",$5=N;{print}'
$(echo "$line" | awk '$5=N')
Update: a few considerations I missed at first that should give more clarity, apologies!
The parameters file may contain lines whose last field flag is already "N".
The final flag needs to be updated only if the "do something" code has executed successfully.
After looping through all lines, i.e. after exiting the while loop, the flags for all rows are to be set to "Y".
Perhaps invert the operations and do the processing in awk:
$ awk -v f1="$1" 'BEGIN {FS=OFS=","}
  f1==$1 && $5=="Y" { # do something
  $5="N"}1' file
Not sure what the "do something" operation is; if you need to call another command/script from within awk, that is possible as well.
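For example, a rough sketch of driving an external script from awk (process_line.sh is a hypothetical placeholder; system() returns the command's exit status):
awk -v f1="$1" 'BEGIN {FS=OFS=","}
f1==$1 && $5=="Y" {
    cmd = "./process_line.sh " $3    # hypothetical script; field 3 is the xml file name
    if (system(cmd) == 0)            # only flip the flag if the command succeeded
        $5 = "N"
} 1' file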
with bash:
(
    IFS=,
    while read -ra fields; do
        if [[ ${fields[0]} == "$1" ]] && [[ ${fields[4]} == "Y" ]]; then
            # do something
            fields[4]="N"
        fi
        echo "${fields[*]}"
    done < file | sponge file
)
I run that in a subshell so the effects of altering IFS are localized.
This uses sponge to write back to the same file. You need the moreutils package to use it, otherwise use
done < file > tmp && mv tmp file
Perhaps a bit simpler and less bash-specific:
while IFS= read -r line; do
    case $line in
        "$1",*,Y)
            # do something
            line="${line%Y}N"
            ;;
    esac
    echo "$line"
done < file
To replace ,N at the end of the line ($) with ,Y:
sed 's/,N$/,Y/' file
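If the change has to be written back into the same file, one way to do it (assuming GNU sed for -i; BSD/macOS sed wants -i '' instead):
sed -i 's/,N$/,Y/' file                               # in-place with GNU sed
sed 's/,N$/,Y/' file > file.tmp && mv file.tmp file   # portable alternative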
I want to call a function ONLY when the file name ends with .request, but somehow it calls the function whenever request is a substring.
for file in ${filesOrDir[*]}; do
    if [[ -f "$file" ]]; then
        if [[ "$file"=*[".request"] ]]; then
            # Enters here when .request is a substring.
        fi
    fi
    if [[ -d "$file" ]]; then
        # ... some logics
    fi
done
Try to use: [[ $file == *.request ]]
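For context, here is roughly how that check could sit inside the original loop (filesOrDir comes from the question; handle_request is a hypothetical stand-in for the function being called):
for file in "${filesOrDir[@]}"; do          # [@] with quotes keeps names containing spaces intact
    if [[ -f $file && $file == *.request ]]; then
        handle_request "$file"              # hypothetical function; only reached for *.request files
    elif [[ -d $file ]]; then
        : # ... some logics
    fi
done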
handle-request.sh is a helper script for checking file names.
handle-request.sh:
while IFS='' read -r line || [[ -n "$line" ]]; do
    if [[ $line == *.request ]]; then
        echo "$line"
    fi
done < "$1"
Explanation: reference
IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
-r prevents backslash escapes from being interpreted.
|| [[ -n $line ]] prevents the last line from being ignored if it doesn't end with a \n (since read returns a non-zero exit code when it encounters EOF).
input file:
hello.request
.request.hello
file name with space.request
Output:
hello.request
file name with space.request
I have a text file.
I want to get lines starting with a specific format. I just want the lines that have the x/x/x format, where x is a number.
But this regex is not working; it always gives no match:
while read line
do
    regex="\d+\/\d+\/\d+"
    if [[ ${line} =~ ${regex} ]]; then
        echo ${line}
    else
        echo "no match : ${line}"
    fi
done <${textFileName}
File is :
Don't use bash if you can use a better tool:
grep -E '^[[:digit:]]+/[[:digit:]]+/[[:digit:]]+' "${textFileName}"
But if you have to use bash:
while IFS= read -r line
do
    if [[ "$line" =~ ^[[:digit:]]+/[[:digit:]]+/[[:digit:]]+ ]]; then
        echo "$line"
    else
        echo "no match: $line"
    fi
done < "$textFileName"
\d is not valid POSIX regex syntax (see regex(7)); bash's =~ uses POSIX extended regular expressions, so use [[:digit:]] or [0-9] instead.
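If you prefer keeping the pattern in a variable as in the question, that still works; just leave the variable unquoted on the right-hand side of =~ so it is treated as a regex:
regex='^[[:digit:]]+/[[:digit:]]+/[[:digit:]]+'
while IFS= read -r line; do
    if [[ $line =~ $regex ]]; then       # $regex must be unquoted to act as a regex
        printf '%s\n' "$line"
    else
        printf 'no match: %s\n' "$line"
    fi
done < "$textFileName"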
I need to extract entries from a log file and put them in an errors file.
I don't want to duplicate the entries in the errors file every time the script is run, so I created this:
grep $1 $2 | while read -r line ; do
    echo "$line"
    if [ ! -z "$line" ]
    then
        echo "Line is NOT empty"
        if grep -q "$line" $3; then
            echo "Line NOT added"
        else
            echo $line >> $3
            echo "Line added"
        fi
    fi
done
And is run using:
./log_monitor.sh ERROR logfile.log errors.txt
The first time the script runs, it finds the entries and creates the errors file (there is no errors file beforehand).
The next time, this line never finds the recently added lines in the errors file,
if grep -q "$line" $3;
therefore, the script adds the same entries to the errors file.
Any ideas of why this is happening?
This most likely happens because you are not searching for the line itself, but treating the line as a regex. Let's say you have this file:
$ cat file
[ERROR] This is a test
O This is a test
and you try to find the first line:
$ grep "[ERROR] This is a test" file
O This is a test
As you can see, it does not match the line we're looking for (causing duplicates) and does match a different line (causing dropped entries). You can instead use -F -x to search for literal strings matching the full line:
$ grep -F -x "[ERROR] This is a test" file
[ERROR] This is a test
Applying this to your script:
grep $1 $2 | while read -r line ; do
    echo "$line"
    if [ ! -z "$line" ]
    then
        echo "Line is NOT empty"
        if grep -F -x -q "$line" $3; then
            echo "Line NOT added"
        else
            echo $line >> $3
            echo "Line added"
        fi
    fi
done
And here with additional fixes and cleanup:
grep -e "$1" -- "$2" | while IFS= read -r line ; do
printf '%s\n' "$line"
if [ "$line" ]
then
echo "Line is NOT empty"
if grep -Fxq -e "$line" -- "$3"; then
echo "Line NOT added"
else
printf '%s\n' "$line" >> "$3"
echo "Line added"
fi
fi
done
PS: this could be shorter, faster, and have better time complexity with a snippet of awk.
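Not a drop-in replacement, but a sketch of how such an awk snippet might look (it assumes the errors file exists, hence the touch, and loads it into a hash so each new line is checked in constant time):
touch "$3"    # make sure the errors file exists on the first run
awk -v pat="$1" -v out="$3" '
    FILENAME == out { seen[$0] = 1; next }                       # pass 1: remember existing error lines
    $0 ~ pat && !($0 in seen) { print >> out; seen[$0] = 1 }     # pass 2: append unseen matching lines
' "$3" "$2"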
There's no need to check for blank lines; the first grep only passes through lines containing the word "ERROR", which cannot be blank.
If you can do without the diagnostic echo messages, pretty much the whole of that code boils down to what might be done using two greps, a bash process substitution, sponge, and touch for the first-run case:
[ ! -f "$3" ] && touch "$3" ; grep -vf "$3" <(grep "$1" "$2") | sponge -a "$3"
I have a requirement to check the syntax of some config files.
The format of the config is as below:
[sect1]
sect1file1
sect1file2
[sect1_ends]
[sect2]
sect2file1
sect2file2
[sect2_ends]
My requirement is to check for the start of sect1, which is inside square brackets [sect1], then check that the files sect1file1 and sect1file2 exist, then check for the end of sect1 by reading sect1_ends inside square brackets [sect1_ends]. Then repeat the same for sect2, and so on.
There is already a set of section names which are permitted. My objective is to check whether the section names are in that list, and whether the syntax is free of errors.
I tried using
perl -lne 'print $1 while (/^\[(.*?)\]$/g)' <config filename>
but I'm not sure how to check and go through the file.
I am happy to see you have tried. Try again with this prototype:
while read -r line; do
    if [ ${#line} -eq 0 ]; then
        continue # ignore empty lines
    fi
    if [[ "${line}" = \[*\] ]]; then
        echo "Line with [...]"
        if [ -n "${inSection}" ]; then
            if [ "${line}" = "${inSection/]/_ends]}" ]; then
                echo "End of section"
                unset inSection
            else
                echo "Invalid endtag ${line} while processing ${inSection}"
                exit 1
            fi
        else
            echo "Start of new section ${line}"
            inSection="${line}"
        fi
    else
        if [ -f "${line}" ]; then
            echo "OK file ${line}"
        else
            echo "NOK file ${line}"
        fi
    fi
done < inputfile
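To also enforce the question's list of permitted section names, one possible addition inside the "Start of new section" branch (the allowed array below is an assumption; fill in your own names):
allowed=(sect1 sect2 sect3)                 # hypothetical whitelist of permitted section names
name=${line#\[}; name=${name%\]}            # strip the surrounding [ and ]
if [[ " ${allowed[*]} " != *" ${name} "* ]]; then
    echo "Unknown section ${line}"
    exit 1
fi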
Your solution looks good, but -n reads lines from standard input (STDIN). You need to feed your config file into STDIN to pass it to your script:
perl -lne 'print $1 while (/^\[(.*?)\]$/g)' <config.ini
An alternative option would be to use -p.