Reverse file using tac and sed - linux

I have a use case where I need to search for and replace the last occurrence of a string in a file and write the changes back to the file. The case below is a simplified version of that use case:
I'm attempting to reverse the file, make some changes, reverse it back again and write to the file. I've tried the following snippet for this:
tac test | sed s/a/b/ | sed -i '1!G;h;$!d' test
test is a text file with contents:
a
1
2
3
4
5
I was expecting this command to make no changes to the order of the file, but it has actually reversed the contents to:
5
4
3
2
1
b
How can I make the substitution and still retain the order of the file?

You can tac your file, apply the substitution to the first occurrence of the desired pattern, tac again and redirect the result to a temporary file, then rename it to the original name (the 0,/a/ address requires GNU sed):
tac file | sed '0,/a/{s//b/}' | tac > tmp && mv tmp file
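A worked sketch of the tac / sed / tac approach above, wrapped in a function. It assumes GNU sed and tac from GNU coreutils; replace_last_match and the sample data are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: replace only the last line-match of a pattern in a file.
# Assumes GNU sed (0,/re/ is a GNU extension) and tac from GNU coreutils.
# replace_last_match is a hypothetical helper name.
replace_last_match() {
    # $1 = file, $2 = basic regex, $3 = replacement
    # Caveat: "/" or "&" inside $2 or $3 would need escaping.
    tmp=$(mktemp) || return 1
    # Reverse the lines, substitute in the first matching line only (the
    # last one in the original order), then reverse back.
    tac "$1" | sed "0,/$2/{s//$3/}" | tac > "$tmp" && mv "$tmp" "$1"
}

demo=$(mktemp)
printf '%s\n' a 1 a 2 3 > "$demo"   # two lines contain "a"
replace_last_match "$demo" a b
cat "$demo"                         # only the second "a" becomes "b"
rm -f "$demo"
```

Note that mv replaces the original file with the temporary one, so ownership and permissions follow the temporary file.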

Another way is to use grep to get the number of the last line that contains the text you want to change, then use sed to change that line:
$ linno=$( grep -n 'abc' <file> | tail -1 | cut -d: -f1 )
$ sed -i "${linno}s/abc/def/" <file>
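The two commands above can be combined into one function, sketched here under a few assumptions: sed -i without a suffix is GNU sed syntax, the pattern is a basic regex without "/" in it, and replace_on_last_line is a hypothetical name.

```shell
#!/bin/sh
# Sketch: find the number of the last matching line with grep, then let
# sed substitute on that line only. replace_on_last_line is a hypothetical
# helper; sed -i without a suffix is GNU sed syntax.
replace_on_last_line() {
    # $1 = file, $2 = basic regex, $3 = replacement
    linno=$(grep -n -- "$2" "$1" | tail -n 1 | cut -d: -f1)
    # No match: leave the file alone instead of running sed with an
    # empty line address.
    [ -n "$linno" ] || return 1
    sed -i "${linno}s/$2/$3/" "$1"
}

demo=$(mktemp)
printf '%s\n' abc 1 abc 2 > "$demo"
replace_on_last_line "$demo" abc def
cat "$demo"                          # only the second abc is changed
rm -f "$demo"
```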

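Try tac test | sed s/a/b/ | tac. Note that rev would not work here (it reverses the characters within each line, not the order of the lines), and sed -i needs a file operand, so it cannot read from a pipe.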
Or you can use only the sed command to replace the last occurrence on each line.
For example, to replace ABC with DEF:
sed -e 's/\(.*\)ABC/\1DEF/'
Because \(.*\) is greedy, it matches everything up to the last ABC on the line, so only that occurrence is replaced. No g flag is needed.
Add a $ if ABC must also be at the end of the line:
sed -e 's/\(.*\)ABC$/\1DEF/'
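A quick demonstration of the greedy-match substitution, as a sketch with made-up input. replace_last_on_each_line is a hypothetical wrapper; note that it works line by line, so it finds the last ABC on every line rather than the last one in the file.

```shell
#!/bin/sh
# Sketch: the greedy \(.*\) swallows as much as possible, so the ABC that
# follows it is the last ABC on the line. This works per line, not per file.
# replace_last_on_each_line is a hypothetical wrapper reading stdin.
replace_last_on_each_line() {
    sed -e 's/\(.*\)ABC/\1DEF/'
}

# First line has two ABCs (only the second changes), second line has none,
# third line consists of ABC only.
printf '%s\n' 'ABC x ABC y' 'no match here' 'ABC' | replace_last_on_each_line
```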
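EDIT
To write the result back to the file, replace the second sed in your command with another tac, send the output to a temporary file and rename it:
tac test | sed s/a/b/ | tac > tmp && mv tmp test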

Here is a way to do this in a single command using awk.
First input file:
cat file
a
1
2
3
4
a
5
Now this awk command:
awk '{a[i++]=$0} END{p=i; while(i--) if (sub(/a/, "b", a[i])) break;
for(i=0; i<p; i++) print a[i]}' file
a
1
2
3
4
b
5
To save output back into original file use:
awk '{a[i++]=$0} END{p=i; while(i--) if (sub(/a/, "b", a[i])) break;
for(i=0; i<p; i++) print a[i]}' file > $$.tmp && mv $$.tmp file

Another in awk. First a test file:
$ cat file
a
1
a
2
a
and solution:
$ awk '
$0=="a" && NR>1 { # when we meet "a"
print b; b="" # output and clear buffer b
}
{
b=b (b==""?"":ORS) $0 # gather the buffer
}
END { # in the end
sub(/^a/,"b",b) # replace the leading "a" in buffer b with "b"
print b # output buffer
}' file
a
1
a
2
b
Writing back happens by redirecting the output to a temp file which then replaces the original file (awk ... file > tmp && mv tmp file), or, if you are using GNU awk v. 4.1.0+, you can use in-place editing (awk -i inplace ...).

Related

Searching specific lines of files using GREP

I have a directory with many text files. I want to search for a given string in specific lines of the files (like searching for 'abc' in only the 2nd and 3rd line of each file). Then, when I find a match, I want to print line 1 of the matching file.
My approach: I'm doing a grep search with the -n option and storing the output in a different file, and then searching that file for the line number. Then I'm trying to get the file name and print out its first line.
Using the approach I mentioned above, I'm not able to get the file name of the right file, and even if I could, this approach is very lengthy.
Is there a better and faster solution to this?
Eg.
1.txt
file 1
one
two
2.txt
file 2
two
three
I want to search for "two" in line 2 of each file using grep and then print the first line of the file with a match. In this example that would be 2.txt, and the output should be "file 2".
I know it is easier using sed/awk, but is there any way to do this using grep?
Use sed instead (GNU sed):
parse.sed
1h # Save the first line to hold space
2,3 { # On lines 2 and 3
/my pattern/ { # Match `my pattern`
x # If there is a match bring back the first line
p # and print it
:a; n; ba # Loop to the end of the file
}
}
Run it like this:
sed -snf parse.sed file1 file2 ...
Or as a one-liner:
sed -sn '1h; 2,3 { /my pattern/ { x; p; :a; n; ba; } }' file1 file2 ...
You might want to emit the filename as well, e.g. with your example data:
parse2.sed
1h # Save the first line to hold space
2,3 { # On lines 2 and 3
/two/ { # Match `two`
F # Output the filename of the file currently being processed
x # If there is a match bring back the first line
p # and print it
:a; n; ba # Loop to the end of the file
}
}
Run it like this:
sed -snf parse2.sed file1 file2 | paste -d: - -
Output:
file1:file 1
file2:file 2
$ awk 'FNR==2{if(/one/) print line; nextfile} FNR==1{line=$0}' 1.txt 2.txt
file 1
$ awk 'FNR==2{if(/two/) print line; nextfile} FNR==1{line=$0}' 1.txt 2.txt
file 2
FNR holds the line number within the file currently being read
use FNR>=2 && FNR<=3 if you need a range of lines
FNR==1{line=$0} saves the contents of the first line for later use
nextfile is supported by most implementations; if yours lacks it, the solution still works without it (only slower), but then the condition must be true on at most one line per file
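The range variant mentioned above could look like the following sketch. first_line_if_match is a hypothetical helper name; it uses a per-file flag instead of nextfile so that it does not depend on the awk implementation, and the sample files mirror the ones from the question.

```shell
#!/bin/sh
# Sketch: print the first line of every file whose line 2 or 3 matches.
# first_line_if_match is a hypothetical helper; nextfile is avoided so the
# sketch does not depend on awk supporting it.
first_line_if_match() {
    # $1 = regex, remaining arguments = files
    pat=$1; shift
    awk -v pat="$pat" '
        # Remember line 1 and reset the flag for every new file.
        FNR == 1 { first = $0; found = 0; next }
        # Print the saved line once if line 2 or 3 matches.
        !found && FNR >= 2 && FNR <= 3 && $0 ~ pat { print first; found = 1 }
    ' "$@"
}

dir=$(mktemp -d)
printf '%s\n' 'file 1' one two   > "$dir/1.txt"
printf '%s\n' 'file 2' two three > "$dir/2.txt"
first_line_if_match one "$dir"/*.txt    # only 1.txt has "one" on line 2 or 3
rm -rf "$dir"
```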
With grep and bash:
# Grep for a pattern and print filename and line number
grep -Hn one file[12] |
# Loop over matches where f=filename, n=match-line-number and s=matched-line
while IFS=: read -r f n s; do
# If match was on line 2 or line 3
# print the first line of the file
(( n == 2 || n == 3 )) && head -n1 "$f"
done
Output:
file 1
Only using grep, cut and | (pipe):
grep -rnw pattern dir | grep ":line_num:" | cut -d':' -f 1
Explanation
grep -rnw pattern dir
It returns the name of the file(s) where the pattern was found, along with the line number.
Its output will be something like this
path/to/file/file1(.txt):8:some pattern 1
path/to/file/file2(.txt):4:some pattern 2
path/to/file/file3(.txt):2:some pattern 3
Now I'm using another grep to get the file with the right line number (e.g. the file that contains the pattern in line 2)
grep -rnw pattern dir | grep ":2:"
Its output will be
path/to/file/file3(.txt):2:some pattern 3
Now I'm using cut to get the filename
grep -rnw pattern dir | grep ":2:" | cut -d':' -f 1
It will output the file name like this
path/to/file/file3(.txt)
P.S. - If you want to remove the "path/to/file/" from the filename, you can use rev, then cut, and then rev again. You can try this yourself or see the code below.
grep -rnw pattern dir | grep ":2:" | cut -d':' -f 1 | rev | cut -d'/' -f 1 | rev
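One weakness of the pipeline above is that grep ":2:" also matches when ":2:" appears in the matched text itself. A possible refinement, sketched below, anchors the second grep to the line-number field. files_matching_on_line is a hypothetical helper name, and the sketch assumes that file names contain no ":" and that grep supports -r.

```shell
#!/bin/sh
# Sketch: same grep | grep | cut pipeline, but the second grep is anchored
# so that only the line-number field is compared.
# files_matching_on_line is a hypothetical helper; it assumes no ":" in
# file names and a grep that supports -r.
files_matching_on_line() {
    # $1 = pattern, $2 = line number, $3 = directory
    grep -rnw -- "$1" "$3" | grep -E "^[^:]+:$2:" | cut -d: -f1
}

dir=$(mktemp -d)
# 1.txt has "two" on line 3, followed by text that contains ":2:".
printf '%s\n' 'file 1' one 'two :2:' > "$dir/1.txt"
printf '%s\n' 'file 2' two three     > "$dir/2.txt"
files_matching_on_line two 2 "$dir"   # prints only the path of 2.txt
rm -rf "$dir"
```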

Cut matching line and X successive lines until newline and paste into file

I would like to match all lines in a file that contain a word, and take each matching line plus all the lines under it until reaching two newline characters in a row (that is, an empty line).
I have the following sed code to cut and paste specific lines, but not subsequent lines:
sed 's|.*|/\\<&\\>/{w results\nd}|' teststring | sed -i.bak -f - testfile
How could I modify this to take all subsequent lines?
For example, say I wanted to match lines with 'dog'; the following should take the first 3 lines of the 5 (the 4th line is empty):
The best kind of an animal is a dog, for sure
-man's best friend
-related to wolves
Racoons are not cute
Is there a way to do this?
This should do:
awk '/dog/ {f=1} /^$/ {f=0} f {print > "new"} !f {print > "tmp"}' file && mv tmp file
It sets f to true if the word dog is found, and sets f back to false when an empty line is found.
If f is true, the line is printed to the file new.
If f is false, the line is printed to the file tmp.
Finally, the tmp file is moved over the original file.
Edit: Can be shortened some:
awk '/dog/ {f=1} /^$/ {f=0} {print > (f?"new":"tmp")}' file && mv tmp file
Edit2: as requested, add an empty line before every section in the new file:
awk '/dog/ {f=1;print ""> "new"} /^$/ {f=0} {print > (f?"new":"tmp")}' file && mv tmp file
If the original file contains tabs or spaces instead of just an empty line after each dog section, change /^$/ to /^[ \t]*$/
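The flag logic described above, written out as a small function. This is a sketch only: split_blocks and the output file names are hypothetical, and the sample text is taken from the question.

```shell
#!/bin/sh
# Sketch: move each "dog" block (the matching line up to the next empty
# line) into one file and keep everything else in another.
# split_blocks and the file names are hypothetical.
split_blocks() {
    # $1 = input file, $2 = file for matched blocks, $3 = file for the rest
    awk -v hit="$2" -v rest="$3" '
        /dog/ { f = 1 }    # a block starts at a matching line
        /^$/  { f = 0 }    # and ends at the next empty line
        { print > (f ? hit : rest) }
    ' "$1"
}

dir=$(mktemp -d)
printf '%s\n' 'The best kind of an animal is a dog, for sure' \
    "-man's best friend" '-related to wolves' '' \
    'Racoons are not cute' > "$dir/in"
split_blocks "$dir/in" "$dir/new" "$dir/tmp" && mv "$dir/tmp" "$dir/in"
cat "$dir/new"     # the three lines of the dog block
cat "$dir/in"      # what is left: the empty line and the last line
rm -rf "$dir"
```

If every line of the input belongs to a matched block, the file for the rest is never created and the mv fails, so a real script would want to check for that case.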
This might work for you (GNU sed):
sed 's|.*|/\\<&\\>/ba|' stringFile |
sed -f - -e 'b;:a;w resultFile' -e 'n;/^$/!ba' file
Build a set of regexps from the stringFile and send matches to :a. Then write the matched line and any further lines until an empty line (or end of file) to the resultFile.
N.B. The results could be sent directly to resultFile, using:
sed 's#.*#/\\<&\\>/ba#' stringFile |
sed -nf - -e 'b;:a;p;n;/^$/!ba' file > resultFile
To cut the matches from the original file use:
sed 's|.*|/\\<&\\>/ba|' stringFile |
sed -f - -e 'b;:a;N;/\n\s*$/!ba;w resultFile' -e 's/.*//p;d' file
Is this what you're trying to do?
$ awk -v RS= '/dog/' file
The best kind of an animal is a dog, for sure
-man's best friend
-related to wolves
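The one-liner above relies on awk's paragraph mode: when RS is the empty string, records are separated by empty lines, so a pattern selects whole blocks. A small sketch with made-up input (print_paragraphs is a hypothetical name) shows that consecutive matching blocks are printed directly after each other:

```shell
#!/bin/sh
# Sketch: with RS set to the empty string awk reads blocks separated by
# empty lines as single records, so the pattern selects whole blocks.
# print_paragraphs is a hypothetical helper name.
print_paragraphs() {
    # $1 = regex, $2 = file
    awk -v RS= -v pat="$1" '$0 ~ pat' "$2"
}

demo=$(mktemp)
# Three blocks; the first and the third contain "dog".
printf '%s\n' 'a dog' '-friend' '' 'Racoons' '' 'hot dog' > "$demo"
print_paragraphs dog "$demo"
rm -f "$demo"
```

Adding -v ORS='\n\n' would keep an empty line between the printed blocks.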
Could you please try the following.
awk '/dog/{count="";found=1} found && ++count<4' Input_file > temp && mv temp Input_file

Copy first row to the last in file

The purpose here is to copy the first row of the file to the end of the file.
Here is the input file:
335418.75,2392631.25,36091,38466,1
335418.75,2392643.75,36092,38466,1
335418.75,2392656.25,36093,38466,1
335418.75,2392668.75,36094,38466,1
335418.75,2392681.25,36095,38466,1
335418.75,2392693.75,36096,38466,1
335418.75,2392706.25,36097,38466,1
335418.75,2392718.75,36098,38466,1
335418.75,2392731.25,36099,38466,1
Using the following code I got the desired output. Is there an easier option?
awk 'NR==1 {print}' FF1-1.csv > tmp1
cat FF1-1.csv tmp1
Output desired
335418.75,2392631.25,36091,38466,1
335418.75,2392643.75,36092,38466,1
335418.75,2392656.25,36093,38466,1
335418.75,2392668.75,36094,38466,1
335418.75,2392681.25,36095,38466,1
335418.75,2392693.75,36096,38466,1
335418.75,2392706.25,36097,38466,1
335418.75,2392718.75,36098,38466,1
335418.75,2392731.25,36099,38466,1
335418.75,2392631.25,36091,38466,1
Thanks in advance.
Save the first line in a variable and print it at the end using the END block:
$ seq 5 | awk 'NR==1{fl=$0} 1; END{print fl}'
1
2
3
4
5
1
head can produce the same output as your awk, so you can cat that instead.
You can use process substitution to avoid the temporary file.
cat FF1-1.csv <(head -n 1 FF1-1.csv)
As mentioned by Sundeep, if process substitution isn't available you can simply cat the file and then head it to obtain the same result, putting both in a subshell if you need to redirect the output:
(cat FF1-1.csv; head -n1 FF1-1.csv) > dest
Another alternative would be to pipe the output of head to cat and refer to it with -, which for cat represents standard input:
head -1 FF1-1.csv | cat FF1-1.csv -
When you want to overwrite the existing file, the usual solutions can fail: do not redirect output to a file you are still reading from.
A solution for editing the file is:
printf "%s\n" 1y $ x w q | ed -s file > /dev/null
Explanation:
printf will help for entering all commands in new lines.
1y will put the first line in a buf.
$ moves to the last line.
x will paste the contents of the buf.
w will write the results.
q will quit the editor.
ed is the editor that performs all work.
-s is suppressing diagnostics.
file is your input file.
> /dev/null is suppressing output to your screen.
With GNU sed:
seq 1 5 | sed '1h;$G'
Output:
1
2
3
4
5
1
1h: In first row: copy current row (pattern space) to sed's hold space
$G: In last row ($): append content from hold space to pattern space
See: man sed
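To store the result back in the same file, the sed command above can be combined with a temporary file, so the original is never read and truncated at the same time. A minimal sketch; append_first_line is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: append a copy of the first line to the end of the same file.
# The result is built in a temporary file and then moved over the
# original. append_first_line is a hypothetical helper name.
append_first_line() {
    # $1 = file to modify
    tmp=$(mktemp) || return 1
    # 1h saves the first line in the hold space; $G appends the hold
    # space after the last line.
    sed '1h;$G' "$1" > "$tmp" && mv "$tmp" "$1"
}

demo=$(mktemp)
printf '%s\n' 1 2 3 > "$demo"
append_first_line "$demo"
cat "$demo"        # the file now ends with a copy of its first line
rm -f "$demo"
```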
The following solutions may also help:
Solution 1: using awk with RS and FS (without using variables). This reads the whole file as one record, so it assumes the file contains no empty lines:
awk -v RS="" -v FS="\n" '{print $0 ORS $1}' Input_file
Solution 2: using cat and head:
cat Input_file && head -n1 Input_file

Replace string in a file from a file [duplicate]

This question already has answers here:
Difference between single and double quotes in Bash
(7 answers)
Closed 5 years ago.
I need help with replacing strings in a file, where the "from" and "to" strings come from a given file.
fromto.txt:
"TRAVEL","TRAVEL_CHANNEL"
"TRAVEL HD","TRAVEL_HD_CHANNEL"
"FROM","TO"
The first column is what I'm searching for, which is to be replaced with the second column.
So far I wrote this small script:
while read p; do
var1=`echo "$p" | awk -F',' '{print $1}'`
var2=`echo "$p" | awk -F',' '{print $2}'`
echo "$var1" "AND" "$var2"
sed -i -e 's/$var1/$var2/g' test.txt
done <fromto.txt
Output looks good (x AND y), but for some reason it does not replace the first column ($var1) with the second ($var2).
test.txt:
"TRAVEL"
Output:
"TRAVEL" AND "TRAVEL_CHANNEL"
sed -i -e 's/"TRAVEL"/"TRAVEL_CHANNEL"/g' test.txt
"TRAVEL HD" AND "TRAVEL_HD_CHANNEL"
sed -i -e 's/"TRAVEL HD"/"TRAVEL_HD_CHANNEL"/g' test.txt
"FROM" AND "TO"
sed -i -e 's/"FROM"/"TO"/g' test.txt
$ cat test.txt
"TRAVEL"
input:
➜ cat fromto
TRAVEL TRAVEL_CHANNEL
TRAVELHD TRAVEL_HD
➜ cat inputFile
TRAVEL
TRAVELHD
The work:
➜ awk 'BEGIN{while((getline < "fromto") > 0) {from[$1] = $2}} {for (key in from) {gsub(key,from[key])} print}' inputFile > output
and output:
➜ cat output
TRAVEL_CHANNEL
TRAVEL_CHANNEL_HD
➜
This first (BEGIN{}) loads your fromto file into an associative array, e.g. from["TRAVEL"] = "TRAVEL_CHANNEL", then rather inefficiently performs a search and replace on each line of the input file for every array element, outputting the results, which I redirected to a separate output file.
The caveat, you'll notice, is that the replacements can interfere with each other. The 2nd line of output is a perfect example: TRAVELHD first became TRAVEL_HD, and then the TRAVEL in that result was replaced as well. You can try ordering your replacements differently, or use a more specific regex in the gsub. Note that awk does not guarantee any particular order for for (key in from), though. Something to get you started, anyway.
2nd caveat. There's a way to do the gsub for the whole file as the 2nd step of your BEGIN and probably make this much faster, but I'm not sure what it is.
You can't do this in one shot; you have to use variables within a script.
Maybe something like the sed command below for a full replacement:
-bash-4.4$ cat > toto.txt
1
2
3
-bash-4.4$ cat > titi.txt
a
b
c
-bash-4.4$ sed 's|^\s*\(\S*\)\s*\(.*\)$|/^\2\\>/s//\1/|' toto.txt | sed -f - titi.txt > toto.txt
-bash-4.4$ cat toto.txt
a
b
c
-bash-4.4$
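As the linked duplicate suggests, the loop from the question fails because single quotes stop the shell from expanding $var1 and $var2. A minimal sketch of the same loop with double quotes follows. It assumes GNU sed (-i without a suffix) and from/to values without "/", "&" or regex metacharacters; apply_pairs is a hypothetical name.

```shell
#!/bin/sh
# Sketch: the loop from the question with the sed script in double quotes,
# so that the shell expands the variables before sed sees them.
# Assumes GNU sed (-i without a suffix) and that the from/to values contain
# no "/", "&" or regex metacharacters; apply_pairs is a hypothetical name.
apply_pairs() {
    # $1 = file with "from","to" pairs, $2 = file to edit in place
    while IFS=, read -r from to; do
        sed -i -e "s/$from/$to/g" "$2"
    done < "$1"
}

dir=$(mktemp -d)
printf '%s\n' '"TRAVEL","TRAVEL_CHANNEL"' \
    '"TRAVEL HD","TRAVEL_HD_CHANNEL"' > "$dir/fromto.txt"
printf '%s\n' '"TRAVEL"' '"TRAVEL HD"' > "$dir/test.txt"
apply_pairs "$dir/fromto.txt" "$dir/test.txt"
cat "$dir/test.txt"
rm -rf "$dir"
```

Because the surrounding double quotes from fromto.txt stay part of the search string, "TRAVEL" does not match inside "TRAVEL HD" in this sample data.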

Extracting whole line if a character is present at certain position

I am a Java programmer and a newbie to shell scripting. I have a daunting task: parse multi-gigabyte logs and look for lines where '1' (just 1, no quotes) is present at the 446th position of the line. I am able to verify that the character 1 is present by running cat *.log | cut -c 446-446 | sort | uniq -c, but I am not able to extract the lines and print them to an output file.
awk '{if (substr($0,446,1) == "1") {print $0}}' file
is the basics.
You can use FILENAME in the print statement to add the filename to the output, so then you could do
awk '{if (substr($0,446,1) == "1") {print FILENAME ":" $0}}' file1 file2 ...
IHTH
Try adding grep to the pipe:
grep '^.\{445\}1.*$'
You can use an awk command for that:
awk 'substr($0, 446, 1) == "1"' file.log
The substr function returns the single character at position 446, and == "1" ensures that this character is 1.
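The same substr test with the position and character passed in as parameters, as a sketch. lines_with_char_at is a hypothetical helper name, and position 3 stands in for 446 to keep the sample data short.

```shell
#!/bin/sh
# Sketch: print the lines that have a given character at a given position.
# lines_with_char_at is a hypothetical helper; position 446 from the
# question is replaced by 3 to keep the demo short.
lines_with_char_at() {
    # $1 = 1-based position, $2 = character, remaining arguments = files
    pos=$1 ch=$2; shift 2
    awk -v pos="$pos" -v ch="$ch" 'substr($0, pos, 1) == ch' "$@"
}

demo=$(mktemp)
# Lines 1 and 3 have "1" as their third character; line 4 is too short.
printf '%s\n' 'ab1cd' '1bcde' 'xy1' 'xy' > "$demo"
lines_with_char_at 3 1 "$demo" > "$demo.out"   # write the matches to a file
cat "$demo.out"
rm -f "$demo" "$demo.out"
```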
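Another in awk. To make a more sane example, we print lines where the third char is 3. This uses an empty field separator, which GNU awk treats as "one field per character"; POSIX leaves that behaviour unspecified: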
$ cat file
123 # this
456 # not this
$ awk -F '' '$3==3' file
123 # this
Based on that example, but untested:
$ awk -F '' '$446==1' file

Resources