Linux file splitting - linux

I am using sed to split a file in two.
I have a file that contains a custom separator, "/-sep-/", and I want to split the file at that separator.
Currently I have:
sed -n '1,/-sep-/ {p}' /export/data.temp > /export/data.sql.md5
sed -n '/-sep-/,$ {p}' /export/data.temp > /export/data.sql
but file 1 contains /-sep-/ at the end and file 2 begins with /-sep-/.
How can I handle this?
Note that in file 1 I should remove a line break and the /-sep-/, and in file 2 I should remove the /-sep-/ and a line break :S

Reverse it: tell it what to not print instead.
sed '/-sep-/Q' /export/data.temp > /export/data.sql.md5
sed '1,/-sep-/d' /export/data.temp > /export/data.sql
(Regarding that break line, I did not understand it. A sample input would probably help.)
By the way, your original code needs only minor addition to do what you want:
sed -n '1,/-sep-/{/-sep-/!p}' /export/data.temp > /export/data.sql.md5
sed -n '/-sep-/,${/-sep-/!p}' /export/data.temp > /export/data.sql

$ cat >testfile
a
a
a
a
/-sep-/
b
b
b
b
and then
$ csplit testfile '/-sep-/' '//'
8
8
8
$ head -n 999 xx*
==> xx00 <==
a
a
a
a
==> xx01 <==
/-sep-/
==> xx02 <==
b
b
b
b

sed -n '/-sep-/q; p' /export/data.temp > /export/data.sql.md5
sed -n '/-sep-/,$ {p}' /export/data.temp | sed '1d' > /export/data.sql
Might be easier to do in one pass with awk:
awk -v out=/export/data.sql.md5 -v f2=/export/data.sql '
/-sep-/ { out=f2; next}
{ print > out }
' /export/data.temp
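To try the one-pass idea on a small sample, here is a minimal sketch. The function name split_at_sep and the temporary file names are made up for illustration; only the separator pattern comes from the question. It assumes the separator sits on a line of its own and splits at the first separator only.

```shell
#!/bin/sh
# split_at_sep INPUT FIRST SECOND  (hypothetical helper, not from the answers above)
# Lines before the first separator line go to FIRST, lines after it to SECOND.
# The separator line itself is dropped.
split_at_sep() {
    : > "$2"; : > "$3"    # make sure both outputs exist, even if one stays empty
    awk -v first="$2" -v second="$3" '
        !seen && /-sep-/ { seen = 1; next }    # skip only the first separator line
        { print > (seen ? second : first) }    # route the line to the current part
    ' "$1"
}

# Demo in a throwaway directory
tmp=$(mktemp -d)
printf '%s\n' a a /-sep-/ b b > "$tmp/data.temp"
split_at_sep "$tmp/data.temp" "$tmp/part1" "$tmp/part2"
cat "$tmp/part1"    # the two "a" lines
cat "$tmp/part2"    # the two "b" lines
```

Because the separator line is skipped with next, neither output file needs a second clean-up pass.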

Related

Reverse file using tac and sed

I have a usecase where I need to search and replace the last occurrence of a string in a file and write the changes back to the file. The case below is a simplified version of that usecase:
I'm attempting to reverse the file, make some changes reverse it back again and write to the file. I've tried the following snippet for this:
tac test | sed s/a/b/ | sed -i '1!G;h;$!d' test
test is a text file with contents:
a
1
2
3
4
5
I was expecting this command to make no changes to the order of the file, but it has actually reversed the contents to:
5
4
3
2
1
b
How can I make the substitution as well as retain the order of the file?
You can tac your file, apply the substitution to the first occurrence of the desired pattern, tac again, and redirect the result to a temporary file before renaming it to the original name:
tac file | sed '0,/a/{s//b/}' | tac > tmp && mv tmp file
Another way is to use grep to get the number of the last line that contains the text you want to change, then use sed to change that line:
$ linno=$( grep -n 'abc' <file> | tail -1 | cut -d: -f1 )
$ sed -i "${linno}s/abc/def/" <file>
Try to cat test | rev | sed -i '1!G;h;$!d' | rev
Or you can use only the sed command.
For example, if you want to replace ABC with DEF:
You need to add 'g' to the end of your sed:
sed -e 's/\(.*\)ABC/\1DEF/g'
This tells sed to replace every occurrence of your regex ("globally") instead of only the first occurrence.
You should also add a $, if you want to ensure that it is replacing the last occurrence of ABC on the line:
sed -e 's/\(.*\)ABC$/\1DEF/g'
EDIT
Or simply add another | tac to your command:
tac test | sed s/a/b/ | sed -i '1!G;h;$!d' | tac
Here is a way to do this in a single command using awk.
First input file:
cat file
a
1
2
3
4
a
5
Now this awk command:
awk '{a[i++]=$0} END{p=i; while(i--) if (sub(/a/, "b", a[i])) break;
for(i=0; i<p; i++) print a[i]}' file
a
1
2
3
4
b
5
To save output back into original file use:
awk '{a[i++]=$0} END{p=i; while(i--) if (sub(/a/, "b", a[i])) break;
for(i=0; i<p; i++) print a[i]}' file > $$.tmp && mv $$.tmp file
Another in awk. First a test file:
$ cat file
a
1
a
2
a
and solution:
$ awk '
$0=="a" && NR>1 { # when we meet "a"
print b; b="" # output and clear buffer b
}
{
b=b (b==""?"":ORS) $0 # gather the buffer
}
END { # in the end
sub(/^a/,"b",b) # replace the leading "a" in buffer b with "b"
print b # output buffer
}' file
a
1
a
2
b
Writing the result back is done by redirecting the output to a temp file that then replaces the original file (awk ... file > tmp && mv tmp file), or, if you are using GNU awk v. 4.1.0+, you can use in-place editing (awk -i inplace ...).
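A two-pass variant of the same idea, as a rough sketch that avoids both tac and buffering the whole file in memory. The helper name replace_last_match is invented here, and the sample data mirrors the test file above. Note that sub() changes the first match on the last matching line, and that "&" in the replacement has its usual special meaning in awk.

```shell
#!/bin/sh
# replace_last_match FILE REGEX REPLACEMENT  (hypothetical helper)
# Pass 1 records the number of the last line matching REGEX;
# pass 2 prints every line, substituting only on that line.
replace_last_match() {
    awk -v re="$2" -v rep="$3" '
        NR == FNR { if ($0 ~ re) last = FNR; next }   # first read: find the line
        FNR == last { sub(re, rep) }                  # second read: edit that line
        { print }
    ' "$1" "$1"
}

# Demo: write to a temp file first, then move it over the original
tmp=$(mktemp)
printf '%s\n' a 1 2 3 4 a 5 > "$tmp"
replace_last_match "$tmp" a b > "$tmp.new" && mv "$tmp.new" "$tmp"
cat "$tmp"    # the second "a" is now "b", the order is unchanged
```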

How do I add a line number to a file?

The contents of file.txt:
"16875170";"172";"50"
"11005137";"28";"39"
"16981017";"9347";"50"
"13771676";"13";"45"
"5865226";"963";"28"
File with the result:
"1";"16875170";"172";"50"
"2";"11005137";"28";"39"
"3";"16981017";"9347";"50"
"4";"13771676";"13";"45"
"5";"5865226";"963";"28"
awk can do this for you pretty easily.
$ cat test.txt
"16875170";"172";"50"
"11005137";"28";"39"
"16981017";"9347";"50"
"13771676";"13";"45"
"5865226";"963";"28"
$ awk '{print "\""NR"\";"$0}' test.txt
"1";"16875170";"172";"50"
"2";"11005137";"28";"39"
"3";"16981017";"9347";"50"
"4";"13771676";"13";"45"
"5";"5865226";"963";"28"
This tells awk to print a literal ", followed by the record number, followed by ";, then the rest of the line. Depending on other needs not stated (e.g. the quoting not being strictly necessary), there may be a better method, but given the question and the expected output this works.
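The same thing can be spelled with printf, which some find easier to read than the escaped quotes. This is only a sketch: the wrapper name number_lines and its optional separator argument are assumptions added for illustration.

```shell
#!/bin/sh
# number_lines [SEP]: read stdin and prefix each line with its quoted
# line number followed by SEP (default ";"). Hypothetical wrapper.
number_lines() {
    awk -v sep="${1:-;}" '{ printf "\"%d\"%s%s\n", NR, sep, $0 }'
}

printf '%s\n' '"16875170";"172";"50"' '"11005137";"28";"39"' | number_lines
# "1";"16875170";"172";"50"
# "2";"11005137";"28";"39"
```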
Grep solution for funsies:
$ grep ".*" test.txt -n | sed 's/\([0-9]*\):/"\1";/g;'
"1";"16875170";"172";"50"
"2";"11005137";"28";"39"
"3";"16981017";"9347";"50"
"4";"13771676";"13";"45"
"5";"5865226";"963";"28"
For the fun of sed:
sed "=" test.txt | sed "N;s/\([0-9]\{1,\}\)\n/\"\1\";/"
Output:
"1";"16875170";"172";"50"
"2";"11005137";"28";"39"
"3";"16981017";"9347";"50"
"4";"13771676";"13";"45"
"5";"5865226";"963";"28"
also, bash-based:
i=0; while IFS= read -r line; do i=$((i + 1)); printf '"%s";%s\n' "$i" "$line"; done < my_file.txt > results.txt
There is also coreutils nl:
<file.txt nl -s';' -w1 | sed 's/[0-9]*/"&"/'
Or perl:
<file.txt perl -pne 's/^/"$.";/'
Or sed and paste:
<file.txt sed = | paste -d\; - - | sed 's/[0-9]*/"&"/'
Output in all cases:
"1";"16875170";"172";"50"
"2";"11005137";"28";"39"
"3";"16981017";"9347";"50"
"4";"13771676";"13";"45"
"5";"5865226";"963";"28"

awk add string to each line except last blank line

I have a file with a blank line at the end. I need to add a suffix to each line except the last blank line.
I use:
awk '$0=$0"suffix"' | sed 's/^suffix$//'
But maybe it can be done without sed?
UPDATE:
I want to skip all lines which contain only '\n' symbol.
EXAMPLE:
I have file test.tsv:
a\tb\t1\n
\t\t\n
c\td\t2\n
\n
I run cat test.tsv | awk '$0=$0"\t2"' | sed 's/^\t2$//':
a\tb\t1\t2\n
\t\t\t2\n
c\td\t2\t2\n
\n
It sounds like this is what you need:
awk 'NR>1{print prev "suffix"} {prev=$0} END{ if (NR) print prev (prev == "" ? "" : "suffix") }' file
The test for NR in the END is to avoid printing a blank line given an empty input file. It's untested, of course, since you didn't provide any sample input/output in your question.
To treat all empty lines the same:
awk '{print $0 (/./ ? "suffix" : "")}' file
#try:
awk 'NF{print $0 "suffix"}' Input_file
this will skip all blank lines
awk 'NF{$0=$0 "suffix"}1' file
to only skip the last line if blank
awk 'NR>1{print p "suffix"} {p=$0} END{print p (NF?"suffix":"") }' file
If perl is okay:
$ cat ip.txt
a b 1
c d 2
$ perl -lpe '$_ .= "\t 2" if !(eof && /^$/)' ip.txt
a b 1 2
2
c d 2 2
$ # no blank line for empty file as well
$ printf '' | perl -lpe '$_ .= "\t 2" if !(eof && /^$/)'
$
-l strips newline from input, adds back when line is printed at end of code due to -p option
eof to check end of file
/^$/ blank line
$_ .= "\t 2" append to input line
Try this -
$ cat f ###Blank line only in the end of file
-11.2
hello
$ awk '{print (/./?$0"suffix":"")}' f
-11.2suffix
hellosuffix
$
OR
$ cat f ####blank line in middle and end of file
-11.2
hello
$ awk -v val=$(wc -l < f) '{print (/./ || NR!=val?$0"suffix":"")}' f
-11.2suffix
suffix
hellosuffix
$
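To make the difference between the two interpretations easier to see, here is a small sketch that generates its own input with printf. The sample data, the suffix _X and the function names are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample: one blank line in the middle and one at the end.
sample() { printf 'a\n\nb\n\n'; }

# Variant 1: leave every empty line alone.
suffix_nonempty() {
    awk -v sfx="$1" '{ print $0 (/./ ? sfx : "") }'
}

# Variant 2: leave only a trailing empty line alone.
suffix_except_last_blank() {
    awk -v sfx="$1" '
        NR > 1 { print prev sfx }     # every line but the last gets the suffix
        { prev = $0 }
        END { if (NR) print prev (prev == "" ? "" : sfx) }
    '
}

sample | suffix_nonempty _X             # a_X, empty, b_X, empty
sample | suffix_except_last_blank _X    # a_X, _X, b_X, empty
```

Variant 2 has to delay printing by one line because awk cannot know that a line is the last one until it reaches END.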

How to delete first two lines and last four lines from a text file with bash?

I am trying to delete first two lines and last four lines from my text files. How can I do this with Bash?
You can combine tail and head:
$ tail -n +3 file.txt | head -n -4 > file.txt.new && mv file.txt.new file.txt
Head and Tail
cat input.txt | tail -n +3 | head -n -4
Sed Solution
cat input.txt | sed '1,2d' | sed -n -e :a -e '1,4!{P;N;D;};N;ba'
This is the quickest way I found (note that it only removes the first two lines, editing the file in place):
sed -i 1,2d filename
You can call the ex editor from the bash command line using the following sample. Note it uses a here document to end the list of commands to ex.
ex text.file << EOF
1,2d
$
-3,.d
x
EOF
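If head -n -4 is not available (negative counts are a GNU extension), a small awk buffer can do the same job in one pass. This is a sketch; the function name trim_lines and its two arguments are assumptions for illustration.

```shell
#!/bin/sh
# trim_lines HEAD TAIL: copy stdin to stdout without the first HEAD
# and the last TAIL lines. Hypothetical helper.
trim_lines() {
    awk -v head="$1" -v tail="$2" '
        NR > head { buf[NR] = $0 }      # remember lines past the leading block
        NR > head + tail {              # a line is safe to print once TAIL
            print buf[NR - tail]        # newer lines have been read after it
            delete buf[NR - tail]
        }
    '
}

printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | trim_lines 2 4
# 3
# 4
# 5
# 6
```

Usage on a real file would follow the same temp-file pattern as above: trim_lines 2 4 < file.txt > file.txt.new && mv file.txt.new file.txt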

How to read the second-to-last line in a file using Bash?

I have a file that has the following as the last three lines. I want to retrieve the penultimate line, i.e. 100.000;8438; 06:46:12.
.
.
.
99.900; 8423; 06:44:41
100.000;8438; 06:46:12
Number of patterns: 8438
I don't know the line number. How can I retrieve it using a shell script? Thanks in advance for your help.
Try this:
tail -2 yourfile | head -1
A short sed one-liner inspired by https://stackoverflow.com/a/7671772/5287901
sed -n 'x;$p'
Explanation:
-n quiet mode: don't automatically print the pattern space
x: exchange the pattern space and the hold space (the hold space now stores the current line, and the pattern space holds the previous line, if any)
$: on the last line, p: print the pattern space (the previous line, which in this case is the penultimate line).
Use this
tail -2 <filename> | head -1
ed and sed can do it as well.
str='
99.900; 8423; 06:44:41
100.000;8438; 06:46:12
Number of patterns: 8438
'
printf '%s' "$str" | sed -n -e '${x;1!p;};h' # print last line but one
printf '%s\n' H '$-1p' q | ed -s <(printf '%s' "$str") # same
printf '%s\n' H '$-2,$-1p' q | ed -s <(printf '%s' "$str") # print last line but two
From: Useful sed one-liners by Eric Pement
# print the next-to-the-last line of a file
sed -e '$!{h;d;}' -e x # for 1-line files, print blank line
sed -e '1{$q;}' -e '$!{h;d;}' -e x # for 1-line files, print the line
sed -e '1{$d;}' -e '$!{h;d;}' -e x # for 1-line files, print nothing
You don't need all of them, just pick one.
tail -n +2 <filename>
This prints from the second line to the last line.
To clarify what has already been said:
ec2thisandthat | sort -k 5 | grep 2012- | awk '{print $2}' | tail -2 | head -1
snap-e8317883
snap-9c7227f7
snap-5402553f
snap-3e7b2c55
snap-246b3c4f
snap-546a3d3f
snap-2ad48241
snap-d00150bb
returns
snap-2ad48241
tac <file> | sed -n '2p'
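For completeness, an awk sketch of the same idea that keeps only two lines in memory and prints nothing for files shorter than two lines. The function name penultimate_line is made up here; the sample lines are the ones from the question.

```shell
#!/bin/sh
# penultimate_line [FILE...]: print the second-to-last line of the input,
# or nothing if there are fewer than two lines. Hypothetical helper.
penultimate_line() {
    awk '{ prev = cur; cur = $0 } END { if (NR > 1) print prev }' "$@"
}

printf '%s\n' '99.900; 8423; 06:44:41' \
              '100.000;8438; 06:46:12' \
              'Number of patterns: 8438' | penultimate_line
# 100.000;8438; 06:46:12
```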
