Making a horizontal string vertical with shell or awk

I have a string
ABCDEFGHIJ
I would like it to print as:
A
B
C
D
E
F
G
H
I
J
i.e., from horizontal to vertical, with no editing between the characters. Bonus points for how to put a number next to each one, still as a one-liner. It'd be nice if this were an awk or shell script, but I am open to learning new things. :) Thanks!

If you just want to convert a string to one character per line, you just need to tell awk that each input character is a separate field (FS=) and that each output field should be separated by a newline (OFS='\n'), and then rebuild each record by assigning a field to itself:
awk -v FS= -v OFS='\n' '{$1=$1}1'
e.g.:
$ echo "ABCDEFGHIJ" | awk -v FS= -v OFS='\n' '{$1=$1}1'
A
B
C
D
E
F
G
H
I
J
and if you want field numbers next to each character, see @Kent's solution or pipe to cat -n.
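For example, piping the same one-liner to cat -n numbers each character (cat -n pads the numbers and separates them from the text with a tab):
echo "ABCDEFGHIJ" | awk -v FS= -v OFS='\n' '{$1=$1}1' | cat -n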
The sed solution you posted is not portable and will fail with some seds on some OSs. It will also add an undesirable blank line to the end of the sed output, which then becomes a trailing line number after the pipe to cat -n, so it's not a good alternative. You should accept @Kent's answer.

awk one-liner:
awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)print i,$i}'
test :
kent$ echo "ABCDEF"|awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)print i,$i}'
1 A
2 B
3 C
4 D
5 E
6 F
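If you would rather avoid awk entirely, a similar numbered listing can be had from coreutils alone (a sketch; fold -w1 wraps the input at one character per line and nl numbers the lines, separating the number from the text with a tab):
echo "ABCDEF" | fold -w1 | nl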

So I figured this one out on my own with sed.
sed 's/./&\n/g' horiz.txt > vert.txt
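If portability matters (the \n in the replacement is a GNU sed extension), a sketch of a variant that should work with POSIX sed, assuming a bash-style shell for the $'...' quoting, is to embed an escaped literal newline in the replacement:
sed $'s/./&\\\n/g' horiz.txt > vert.txt
Either way, this approach still appends a newline after the last character, which is the trailing blank line pointed out in the answer above.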

One more awk
echo "ABCDEFGHIJ" | awk '{gsub(/./,"&\n")}1'
A
B
C
D
E
F
G
H
I
J

This might work for you (GNU sed):
sed 's/\B/\n/g' <<<ABCDEFGHIJ
for line numbers:
sed 's/\B/\n/g' <<<ABCDEFGHIJ | sed = | sed 'N;y/\n/ /'
or:
sed 's/\B/\n/g' <<<ABCDEFGHIJ | cat -n
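For reference, the numbered form with sed = | sed 'N;y/\n/ /' should produce:
1 A
2 B
3 C
4 D
5 E
6 F
7 G
8 H
9 I
10 J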

Related

Removing the newline in the last line of a text file using sed

I want to remove the newline at the end of the last line of my text file using sed. For example, the input is like the following:
1
1
1
1
1
1
And I want to have an output like this, without any newlines at the end of the text file:
1
1
1
1
1
1
This might work for you (GNU sed):
sed -z 's/\n\+$//' file
This will remove the newline(s) at the end of a file, provided the file contains no NUL characters.
N.B. In normal use, i.e. without the -z option (which slurps the whole file into memory), sed removes the newlines before the sed commands can act on them.
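A quick way to check (assuming GNU sed for -z and od from coreutils to show the bytes):
$ printf '1\n1\n1\n\n\n' | sed -z 's/\n\+$//' | od -c
0000000 1 \n 1 \n 1
0000005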
Using sed
sed '/^$/d' input_file
This will remove all empty lines anywhere in the file, not just at the end; it does not strip the final newline character.
Starting with some test data:
$ printf "%s\n" {a..e} "" "" | cat -n
1 a
2 b
3 c
4 d
5 e
6
7
I would approach the problem like this: reverse the file, remove blank lines at the top, then re-reverse the file:
$ printf "%s\n" {a..e} "" "" | tac | awk 'NF && !p {p=1}; p' | tac | cat -n
1 a
2 b
3 c
4 d
5 e
NF is the awk variable for "number of fields in this record". p is a variable I'm using to indicate when to start printing. The first time that NF is non-zero, we set the p variable to a true value. The standalone p at the end triggers the default action to print the record.
Removing the newline on the last line is a different story.
Given this file:
$ cat > file
first
second
third
$ od -c file
0000000 f i r s t \n s e c o n d \n t h i
0000020 r d \n
0000023
We can use perl:
$ perl -i -0777 -ne 's/\n+$//; print' file
$ od -c file
0000000 f i r s t \n s e c o n d \n t h i
0000020 r d
0000022
or tr to translate newlines to some other character, and sed to remove the trailing character
$ tr '\n' '\034' < file | sed $'s/\034$//' | tr '\034' '\n' | od -c
0000000 f i r s t \n s e c o n d \n t h i
0000020 r d
0000022

Substitute consecutive tabs with "\tNA\t"

I have a badly formatted TSV file with empty fields all over the place. I wish to fill these empty fields with "NA" on Linux.
I tried awk '{gsub("\t\t","\tNA\t"); print $0}' but that only substitutes one NA per run of consecutive empty fields. Chaining the command, awk '{gsub("\t\t","\tNA\t"); print $0}' | awk '{gsub("\t\t","\tNA\t"); print $0}', does two substitutions per line, but that is not particularly helpful if I have many columns to deal with.
Is there a faster (neater) way to do this?
It's a bit complex since you have to handle empty fields at the start of a line, empty fields at the end of a line, and potentially successive empty fields. I could not achieve it with sed; it's probably insane. But with awk this seems to work:
$ cat test.txt
a c d e
g h i j
k l m n
p s t
w x
$ awk -F$'\t' '{for(i=1;i<=NF;++i){if($i==""){printf "NA"}else{printf $i} if(i<NF)printf "\t"} printf "\n"}' test.txt
a NA c d e
NA g h i j
k l m n NA
p NA NA s t
NA NA w x NA
Beware of copy-paste: the tabs will probably be transformed into spaces. By the way, I searched for a solution for CSV files and adapted it from this thread ;) where you can see that the most readable option is the awk one.
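A shorter way to express the same idea, as a sketch, is to let awk split on tabs and rebuild the record itself: loop over the fields, replace the empty ones, and rely on the fact that assigning to a field makes awk rejoin the record with OFS:
awk 'BEGIN{FS=OFS="\t"} {for(i=1;i<=NF;i++) if($i=="") $i="NA"} 1' test.txt
This fills leading, trailing, and successive empty fields alike, since each empty field is its own $i.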
Did you try with sed? For example:
cat test.txt
test test test
test test test
sed 's:\t\t*:\tNA\t:g' test.txt
test NA test NA test
test NA test NA test
Ok this works:
awk '{ gsub(/\t\t\t/,"\tNA\tNA\t"); print $0}' test.txt | awk '{ gsub(/\t\t/,"\tNA\t"); print $0}' | awk '{ gsub(/\t\t/,"\tNA\t"); print $0}' | awk '{gsub(/^[\t]+/,"NA\t"); print $0}'
interestingly this doesn't:
awk '{ gsub(/\t\t\t/,"\tNA\tNA\t"); print $0}' test.txt | awk '{ gsub(/\t\t/,"\tNA\t"); print $0}' | awk '{gsub(/^[\t]+/,"NA\t"); print $0}'
I'm sure there is a more elegant solution though..
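One way to collapse the chain into a single awk call, sketched here on the assumption that your awk (e.g. gawk) accepts \t in regular expressions, is to repeat the substitution until it stops matching and then patch up leading and trailing tabs:
awk '{ while (gsub(/\t\t/, "\tNA\t")) continue; gsub(/^\t/, "NA\t"); gsub(/\t$/, "\tNA"); print }' test.txt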

Joining a pair of lines with specific starting points

I know that with sed I can convert
cat current.txt | sed 'N;s/\n/,/' > new.txt
A
B
C
D
E
F
to
A,B
C,D
E,F
What I would like to do is following:
A
B
C
D
E
F
to
A,D
B,E
C,F
I'd like to join 1 with 4, 2 with 5, 3 with 6 and so on.
Is this possible with sed? Any idea how it could be achieved?
Thank you.
Try printing in columns:
pr -s, -t -2 current.txt
This is longer than I was hoping, but:
$ lc=$(( $(wc -l current.txt | sed 's/ .*//') / 2 ))
$ paste <(head -"$lc" current.txt) <(tail -"$lc" current.txt) | column -t -o,
The variable lc stores the number of lines in current.txt divided by two. Then head and tail are used to print the first lc and the last lc lines, respectively (i.e. the first and second half of the file); paste then puts the two halves side by side, and column changes the tabs to commas.
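The same idea fits in two short lines if paste inserts the comma itself (a sketch; the <(...) process substitutions assume bash, ksh, or zsh):
lc=$(( $(wc -l < current.txt) / 2 ))
paste -d, <(head -n "$lc" current.txt) <(tail -n "$lc" current.txt)
Reading the file on wc's stdin also avoids having to strip the filename from its output.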
An awk version
awk '{a[NR]=$0} NR>3 {print a[NR-3]","$0}' current.txt
A,D
B,E
C,F
This solution is easy to adjust if you want a different interval: just change NR>3 and NR-3 to the desired number.
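For example, the interval can be passed in with -v instead of editing the script (a sketch, with n set to half the number of lines):
awk -v n=3 '{a[NR]=$0} NR>n {print a[NR-n] "," $0}' current.txt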

Linux command to get the last appearance of a string in a text file

I want to find the last appearance of a string in a text file with linux commands. For example
1 a 1
2 a 2
3 a 3
1 b 1
2 b 2
3 b 3
1 c 1
2 c 2
3 c 3
In such a text file, I want to find the line number of the last appearance of b, which is 6.
I can find the first appearance with
awk '/ b / {print NR;exit}' textFile.txt
but I have no idea how to do it for the last occurrence.
cat -n textfile.txt | grep " b " | tail -1 | cut -f 1
cat -n prints the file to STDOUT, prepending line numbers.
grep selects all the lines containing " b " (you can use egrep for more advanced patterns or fgrep for faster matching of fixed strings).
tail -1 prints the last of those lines.
cut -f 1 prints the first column, which is the line number from cat -n.
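A close variant of the same pipeline (a sketch) lets grep do the numbering itself, so cat -n is no longer needed:
grep -n ' b ' textfile.txt | tail -1 | cut -d: -f1
Here grep -n prefixes each matching line with its line number and a colon, and cut -d: -f1 keeps just the number.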
Or you can use Perl if you wish. It's very similar to what you'd do in awk, but frankly I personally don't ever use awk if I have Perl handy: Perl supports 100% of what awk can do, by design, as one-liners (YMMV):
perl -ne '{$n=$. if / b /} END {print "$n\n"}' textfile.txt
This can work:
$ awk '{if ($2~"b") a=NR} END{print a}' your_file
We check whether the second field contains "b" and record the line number. The variable is overwritten each time, so by the time we finish reading the file it holds the last occurrence.
Test:
$ awk '{if ($2~"b") a=NR} END{print a}' your_file
6
Update, based on sudo_O's advice:
$ awk '{if ($2=="b") a=NR} END{print a}' your_file
to avoid matching a second field like "abc" that merely contains "b".
This one is also valid (shorter; I keep the one above because it is the one I thought of :D):
$ awk '$2=="b" {a=NR} END{print a}' your_file
Another approach if $2 is always grouped (may be more efficient than waiting until the end):
awk 'NR==1||$2=="b",$2=="b"{next} {print NR-1; exit}' file
or
awk '$2=="b"{f=1} f==1 && $2!="b" {print NR-1; exit}' file

How to duplicate lines in Linux while keeping the original order?

I have a file (with only 1 column) like this:
A
B
Z
D
N
and what I want to do is to duplicate each line so I get this:
A
A
B
B
Z
Z
D
D
N
N
I could only think of using cat with the same file twice and then sorting it:
cat file1 file1 | sort -k1 > file1_duplicate
but then I lose the order of my file, which is important to me:
A
A
B
B
D
D
N
N
Z
Z
any suggestion would be helpful.
Try e.g.
sed p file >newfile
awk '{print $1;}{print $1;}' file.txt > duplicatefile.txt
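If the input ever has more than one column, a close variant of the awk answer above duplicates the whole line rather than just the first field (a sketch):
awk '{print; print}' file.txt > duplicatefile.txt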
LSB has perl5. This will do the trick:
cat file1 | perl -pe '$_.=$_' > file1_duplicate
With coreutils paste you can do it like this:
paste -d'\n' file file
cat file | tail --lines=1 >> file
