How to duplicate lines in Linux while keeping the original order?

I have a file (with only 1 column) like this:
A
B
Z
D
N
and what I want to do is to duplicate each line so I get this:
A
A
B
B
Z
Z
D
D
N
N
The only thing I could think of was to cat the file twice and then sort it:
cat file1 file1 | sort -k1 > file1_duplicate
but then I lose the order of my file, which is important to me:
A
A
B
B
D
D
N
N
Z
Z
Any suggestion would be helpful.

Try e.g.
sed p file >newfile
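The p command prints the pattern space, and sed's default auto-print then prints each line again, so every line comes out twice in the original order. A quick sanity check (file name is illustrative):

```shell
# Build a small sample file, then duplicate each line with sed's p command.
printf '%s\n' A B Z > file
sed p file
# prints A, A, B, B, Z, Z — each line twice, order preserved
```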

awk '{print $1;}{print $1;}' file.txt > duplicatefile.txt

LSB has perl5. This will do the trick:
perl -pe '$_ .= $_' file1 > file1_duplicate

With coreutils paste you can do it like this:
paste -d'\n' file file
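paste reads one line from each file argument per output record; with -d'\n' the paired lines are joined by a newline instead of a tab, so pasting the file against itself interleaves it with itself. For example (file name is illustrative):

```shell
# Duplicate every line by pasting the file against itself,
# joining each pair of lines with a newline rather than a tab.
printf '%s\n' A B Z > file
paste -d'\n' file file
# prints: A, A, B, B, Z, Z
```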

cat file | tail --lines=1 >> file
(Note: this only appends the file's last line back onto it; it does not duplicate every line, so it does not answer the question.)

Related

How do I print a line in one file based on a corresponding value in a second file?

I have two files:
File1:
A
B
C
File2:
2
4
3
I would like to print each line in file1 the number of times given on the corresponding line of file2, and write the result to a separate file.
Desired output:
A
A
B
B
B
B
C
C
C
Here is one of the approaches I have tried:
touch output.list
paste file1 file2 > test.dict
cat test.dict
A 2
B 4
C 3
while IFS="\t" read -r f1 f2
do
yes "$f1" | head -n "$f2" >> output.list
done < test.dict
For my output I get a bunch of lines that read:
head: : invalid number of lines
Any guidance would be greatly appreciated. Thanks!
Change IFS to an ANSI-C quoted string, or drop the IFS assignment entirely (the default value already contains a tab). With IFS="\t", read splits on the literal characters backslash and t, so $f2 ends up empty and head complains about an invalid number of lines.
You can also use process substitution to avoid the temporary file.
while IFS=$'\t' read -r f1 f2; do
yes "$f1" | head -n "$f2"
done < <(paste file1 file2) > output.list
You could loop through the output of paste and use a C-style for loop in place of yes and head.
#!/usr/bin/env bash
while read -r first second; do
for ((i=1;i<=second;i++)); do
printf '%s\n' "$first"
done
done < <(paste file1.txt file2.txt)
If the output looks correct, pipe the loop through tee to print it to stdout and write it to file3.txt at the same time:
done < <(paste file1.txt file2.txt) | tee file3.txt
You can do this with an awk one-liner:
$ awk 'NR==FNR{a[++i]=$0;next}{for(j=0;j<$0;j++)print a[FNR]}' File1 File2
A
A
B
B
B
B
C
C
C
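Broken apart, the one-liner works in two passes: while NR==FNR (i.e. while reading File1) it stores each line in an array by line number; on File2 it uses each line's value as a repeat count. A commented sketch with the question's sample data:

```shell
# Recreate the sample inputs from the question.
printf '%s\n' A B C > File1
printf '%s\n' 2 4 3 > File2

awk '
  NR==FNR { a[++i] = $0; next }              # first file: remember each line
  { for (j = 0; j < $0; j++) print a[FNR] }  # second file: repeat a[FNR] $0 times
' File1 File2
# prints: A A B B B B C C C (one per line)
```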

How to get a single output file with N lines from N multiline files?

I am looking for some effective way to concatenate multiple multiline files into one file - here is an example for three input files:
1.txt:
a b
c d
2.txt:
e f
g
h
3.txt:
ijklmn
output.txt:
a b c d
e f g h
ijklmn
(Replacing each line break with a single space.) What approach can you recommend?
Using BASH for loop:
for i in [0-9]*.txt; do tr '\n' ' ' < "$i"; echo; done > output.txt
cat output.txt
a b c d
e f g h
ijklmn
If you also want to strip the trailing space before each line break, use:
for i in [0-9]*.txt; do tr '\n' ' ' < "$i"; echo; done | sed 's/ *$//' > output.txt

Joining a pair of lines with specific starting points

I know that with
sed 'N;s/\n/,/' current.txt > new.txt
I can turn
A
B
C
D
E
F
to
A,B
C,D
E,F
What I would like to do is following:
A
B
C
D
E
F
to
A,D
B,E
C,F
I'd like to join line 1 with line 4, line 2 with line 5, line 3 with line 6, and so on.
Is this possible with sed? Any idea how it could be achieved?
Thank you.
Try printing in columns:
pr -s, -t -2 current.txt
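pr -2 lays the file out column-major (the first half of the lines down the left column, the second half down the right), -t drops the page headers, and -s, separates the columns with a single comma. With the six-line sample (GNU coreutils pr assumed):

```shell
# Two-column, header-less layout with a comma separator.
printf '%s\n' A B C D E F > current.txt
pr -s, -t -2 current.txt
# prints: A,D  B,E  C,F (one pair per line)
```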
This is longer than I was hoping, but:
$ lc=$(( $(wc -l current.txt | sed 's/ .*//') / 2 ))
$ paste <(head -"$lc" current.txt) <(tail -"$lc" current.txt) | column -t -o,
The variable lc stores the number of lines in current.txt divided by two. head and tail then print the first lc and the last lc lines, respectively (i.e. the first and second half of the file); paste joins the two halves side by side, and column turns the tab separators into commas.
An awk version
awk '{a[NR]=$0} NR>3 {print a[NR-3]","$0}' current.txt
A,D
B,E
C,F
This solution is easy to adjust if you want a different interval: just change NR>3 and NR-3 to the desired offset.
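For instance, with a four-line file and an offset of 2 (pairing line 1 with line 3, line 2 with line 4):

```shell
# Same idea as above, with the offset changed from 3 to 2.
printf '%s\n' A B C D > current.txt
awk '{a[NR]=$0} NR>2 {print a[NR-2] "," $0}' current.txt
# prints: A,C  B,D (one pair per line)
```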

Making horizontal String vertical shell or awk

I have a string
ABCDEFGHIJ
I would like it to print.
A
B
C
D
E
F
G
H
I
J
i.e. from horizontal to vertical, with no editing between the characters. Bonus points for a way to put a number next to each one in a single line. It'd be nice if this were an awk or shell script, but I am open to learning new things. :) Thanks!
If you just want to convert a string to one character per line, tell awk that each input character is a separate field (FS=) and that output fields are separated by a newline (OFS='\n'), then rebuild each record by assigning a field to itself:
awk -v FS= -v OFS='\n' '{$1=$1}1'
e.g.:
$ echo "ABCDEFGHIJ" | awk -v FS= -v OFS='\n' '{$1=$1}1'
A
B
C
D
E
F
G
H
I
J
and if you want field numbers next to each character, see @Kent's solution or pipe to cat -n.
The sed solution you posted is non-portable and will fail with some seds on some OSes. It also adds an unwanted blank line to the end of the output, which turns into a trailing line number after your pipe to cat -n, so it is not a good alternative. You should accept @Kent's answer.
awk one-liner:
awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)print i,$i}'
test :
kent$ echo "ABCDEF"|awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)print i,$i}'
1 A
2 B
3 C
4 D
5 E
6 F
So I figured this one out on my own with sed.
sed 's/./&\n/g' horiz.txt > vert.txt
One more awk
echo "ABCDEFGHIJ" | awk '{gsub(/./,"&\n")}1'
A
B
C
D
E
F
G
H
I
J
This might work for you (GNU sed):
sed 's/\B/\n/g' <<<ABCDEFGHIJ
for line numbers:
sed 's/\B/\n/g' <<<ABCDEFGHIJ | sed = | sed 'N;y/\n/ /'
or:
sed 's/\B/\n/g' <<<ABCDEFGHIJ | cat -n
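In GNU sed, \B matches the empty string everywhere except at a word boundary, which here means between every pair of adjacent word characters, so substituting a newline at each match splits the string one character per line. A sketch using printf instead of a bash here-string so it runs in any POSIX shell (GNU sed assumed for \B and \n in the replacement):

```shell
# Insert a newline between every pair of adjacent word characters (GNU sed).
printf '%s\n' ABCDE | sed 's/\B/\n/g'
# prints one character per line: A B C D E
printf '%s\n' ABCDE | sed 's/\B/\n/g' | cat -n   # same, with line numbers
```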

Use a file to extract specified rows from another file

input1:
1 s1
100 s100
90 s90
input2:
a 1
b 3
c 7
d 100
e 101
f 90
Output:
a 1
d 100
f 90
I know join can do this, but it requires (1) sorting on the common field and (2) removing the second column from input1 after the join. Does anyone have a better solution?
Here's one way using awk:
awk 'FNR==NR { a[$1]; next } $2 in a' file1 file2
Results:
a 1
d 100
f 90
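The trick is the two-file idiom: FNR==NR holds only while the first file is being read, so its first column becomes the set of wanted keys; afterwards the bare pattern $2 in a prints the lines of the second file whose second field is a stored key, in their original order. A commented sketch with the question's data:

```shell
# Recreate the sample inputs from the question.
printf '%s\n' '1 s1' '100 s100' '90 s90' > input1
printf '%s\n' 'a 1' 'b 3' 'c 7' 'd 100' 'e 101' 'f 90' > input2

awk '
  FNR==NR { a[$1]; next }   # input1: record each first field as an array key
  $2 in a                   # input2: print lines whose 2nd field is a key
' input1 input2
# prints: a 1, d 100, f 90
```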
This might work for you (GNU sed):
sed -r 's|(\S+).*|/\\<\1$/p|' input1 | sed -nf - input2
Depending on your requirements, grep might do:
grep -wFf <(cut -d' ' -f1 input1) input2
Output:
a 1
d 100
f 90
Note that grep is not column-aware and will happily match where it can.
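Here cut -d' ' -f1 extracts input1's first column, and grep treats those strings as fixed patterns (-F) that must match whole words (-w), reading them from the process substitution via -f. The same thing without bash's <(...), feeding the patterns through stdin:

```shell
# Portable variant: pipe the keys into grep's pattern list via "-f -".
printf '%s\n' '1 s1' '100 s100' '90 s90' > input1
printf '%s\n' 'a 1' 'b 3' 'c 7' 'd 100' 'e 101' 'f 90' > input2
cut -d' ' -f1 input1 | grep -wFf - input2
# prints: a 1, d 100, f 90  (-w keeps "1" from matching inside "101")
```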
As far as I know awk is the better tool for this, but since an awk answer is already provided, below is a Perl solution:
> perl -F -lane '$H{$F[0]}=$F[1];END{%T=reverse(%H);foreach (values %H){if(exists($H{$_})){print $T{$_}." ".$_;}}}' file1 file2
a 1
d 100
f 90