Re-ordering columns with a Perl one-liner - linux

How do you reorganize this with a one-liner?
foo r1.1 abc
foo r10.1 pqr
qux r2.1 lmn
bar r33.1 xpq
# In fact, there could be more fields preceding the column with "rxx.x".
Into this
r1.1 foo abc
r10.1 foo pqr
r2.1 qux lmn
r33.1 bar xpq
Basically, move the second column to the front and keep everything else after it, in the original order.

Assuming your text is in the file "test", this will do it:
perl -lane 'print "$F[1] $F[0] $F[2]"' test

If you have more than three columns, you will want something like:
perl -lane 'print join q( ),$F[1],$F[0],@F[2..$#F]' test

$ perl -pale '$_ = "@F[1,0,2..$#F]"' file
If it's tab-separated, a little more is needed:
$ perl -pale 'BEGIN { $"="\t"; } $_ = "@F[1,0,2..$#F]"' file

Content of 'infile':
foo r1.1 abc
foo r10.1 pqr
qux r2.1 lmn
bar r33.1 xpq
Perl one-line:
perl -pe 's/\A(\S+\s+)(\S+\s+)/$2$1/' infile
Result:
r1.1 foo abc
r10.1 foo pqr
r2.1 qux lmn
r33.1 bar xpq
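For the case mentioned in the question, where more fields may precede the "rxx.x" column, here is a rough sketch in shell with awk rather than Perl. It assumes the column to move is the first field that looks like r<digits>.<digits>, and the function name move_r_first is made up for illustration. Note that it rejoins fields with single spaces, so the original spacing is not preserved.

```shell
# Hypothetical helper: move the first field shaped like r<digits>.<digits>
# to the front of each line; all other fields keep their relative order.
move_r_first() {
  awk '{
    k = 0
    for (i = 1; i <= NF; i++)
      if ($i ~ /^r[0-9]+\.[0-9]+$/) { k = i; break }
    if (k == 0) { print; next }      # no such field: leave the line untouched
    out = $k
    for (i = 1; i <= NF; i++)
      if (i != k) out = out " " $i
    print out
  }' "$@"
}

printf 'foo r1.1 abc\nfoo extra r10.1 pqr\n' | move_r_first
# prints:
# r1.1 foo abc
# r10.1 foo extra pqr
```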

The basic answers have been provided by others; I considered the case of fixed-width data with possibly empty fields:
>cat spacedata.txt
foo  r1.1   abc
foo  r10.1  pqr
qux  r2.1   lmn
bar  r33.1  xpq
     r1.2   cake
is   r1.2   alie
>perl -lpwE '$_=pack "A7A5A*", (unpack "A5A7A*")[1,0,2];' spacedata.txt
r1.1   foo  abc
r10.1  foo  pqr
r2.1   qux  lmn
r33.1  bar  xpq
r1.2        cake
r1.2   is   alie

file
a 5 ss
b 3 ff
c 2 zz
cat file | awk '{print $2, $1, $3}' # will print column 2,1,3
5 a ss
3 b ff
2 c zz
#or if you want to sort by column and print to new_file
cat file | sort -n -k2 | awk '{print $0}' > new_file
new_file
c 2 zz
b 3 ff
a 5 ss

Related

Treat spaces as spaces after n column

How can I run the column command so that, after the first n columns, it treats spaces as literal spaces and not as separators?
Input:
field1 field2 field3 field 4 with spaces
foo1 foo2 foo3 foo4
bar1 bar2 bar3 bar 4 with spaces
Output:
col1 col2 col3 col4
field1 field2 field3 field 4 with spaces
foo1 foo2 foo3 foo4
bar1 bar2 bar3 bar 4 with spaces
Maybe replace the spaces with another character before the column command and then turn them back into spaces afterwards? awk or sed might be the right tool for this, but I'm not too familiar with them.
Any help is appreciated! Please, don't shoot me down. This is my first question here...
Another awk that replaces first 3 spaces with a tab:
awk '{for (i=1; i<=3; ++i) sub(/ +/, "\t")} 1' file
field1 field2 field3 field 4 with spaces
foo1 foo2 foo3 foo4
bar1 bar2 bar3 bar 4 with spaces
How about this
$ cat t
field1 field2 field3 field 4 with spaces
foo1 foo2 foo3 foo4
bar1 bar2 bar3 bar 4 with spaces
$ cat t | sed -E 's/^([^ ]+) ([^ ]+) ([^ ]+) (.+)$/\1\t\2\t\3\t\4/g'
field1 field2 field3 field 4 with spaces
foo1 foo2 foo3 foo4
bar1 bar2 bar3 bar 4 with spaces
$
How about building another variable, here s4 to cause confusion:
$ awk '
BEGIN {
OFS="\t"
}
{
for(i=4;i<=NF;i++)
s4=s4 (s4==""?"":" ") $i
print $1,$2,$3,s4
s4=""
}' file
Output:
field1 field2 field3 field 4 with spaces
foo1 foo2 foo3 foo4
bar1 bar2 bar3 bar 4 with spaces
If there are multiple spaces like this, you need to set FS=" ".
Using awk and the example data in a file "spaces", while still utilising column:
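awk '{ # pass the data as arguments, not as the printf format, so a "%" in the input is safe
  printf "%s:%s:%s:", $1, $2, $3
  for (i=4;i<=NF;i++)
  { if (i != NF) { printf "%s ", $i
    }
    else { printf "%s", $i
    }
  }
  printf "\n"
}' spaces | column -t -s":"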
Use awk to separate the first four fields with ":" and then pipe through to column using ":" as a separator.
This might work for you (GNU sed):
sed 'y/ /\t/;s/\t/ /4g' file
Translate all spaces to tabs and then replace the 4th tab and thereafter with spaces.
If you prefer a kind of symmetry:
sed 's/ /\t/g;s/\t/ /4g' file
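The "first n separators become tabs" idea from the answers above can be wrapped so that n is a parameter. This is only a sketch: the function name first_n_to_tabs is made up, and it assumes lines do not start with spaces (a leading run of spaces would be counted as the first separator).

```shell
# Hypothetical helper: turn the first n runs of spaces on each line into tabs.
# Everything after the nth run keeps its spaces.
first_n_to_tabs() {
  n=$1
  shift
  awk -v n="$n" '{ for (i = 1; i <= n; i++) sub(/ +/, "\t") } 1' "$@"
}

printf 'bar1 bar2 bar3 bar 4 with spaces\n' | first_n_to_tabs 3
```

The result can then be piped to column -t -s "$(printf '\t')" so that only the tabs act as column separators.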

How to Print All line between matching first occurrence of word?

input.txt
ABC
CDE
EFG
XYZ
ABC
PQR
EFG
From the above file I want to print the lines between 'ABC' and the first following occurrence of 'EFG'.
Expected output :
ABC
CDE
EFG
ABC
PQR
EFG
How can I print the lines from one word to the first occurrence of a second word?
EDIT: In case you want to print every run of lines from ABC to EFG and leave out the others, try the following.
awk '/ABC/{found=1} found;/EFG/{found=""}' Input_file
Could you please try the following? It prints only the first ABC-to-EFG block.
awk '/ABC/{flag=1} flag && !count;/EFG/{count++}' Input_file
$ awk '/ABC/,/EFG/' file
Output:
ABC
CDE
EFG
ABC
PQR
EFG
This might work for you (GNU sed):
sed -n '/ABC/{:a;N;/EFG/!ba;p}' file
Turn off implicit printing by using the -n option.
Gather up lines between ABC and EFG and then print them. Repeat.
If you want to only print between the first occurrence of ABC to EFG, use:
sed -n '/ABC/{:a;N;/EFG/!ba;p;q}' file
To print the second through fourth occurrences, use:
sed -En '/ABC/{:a;N;/EFG/!ba;x;s/^/x/;/^x{2,4}$/{x;p;x};x;}' file
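The flag-based awk approach above can be written with the two markers as parameters. A minimal sketch, assuming the markers are given as awk regular expressions; the function name print_blocks is hypothetical. Because awk -v interprets backslash escapes in the value, markers containing backslashes would need extra care.

```shell
# Hypothetical helper: print every run of lines from a line matching START
# up to and including the next line matching END.
print_blocks() {
  start=$1
  end=$2
  shift 2
  awk -v s="$start" -v e="$end" '
    $0 ~ s { inside = 1 }   # a start marker opens a block
    inside { print }        # print while inside, markers included
    $0 ~ e { inside = 0 }   # an end marker closes the block
  ' "$@"
}

printf 'ABC\nCDE\nEFG\nXYZ\nABC\nPQR\nEFG\n' | print_blocks ABC EFG
```

On the sample input this skips only the XYZ line.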

Swap column x of tab-separated values file with column x of second tsv file

Let's say I have:
file1.tsv
Foo\tBar\tabc\t123
Bla\tWord\tabc\tqwer
Blub\tqwe\tasd\tqqq
file2.tsv
123\tzxcv\tAAA\tqaa
asd\t999\tBBB\tdef
qwe\t111\tCCC\tabc
And I want to overwrite column 3 of file1.tsv with column 3 of file2.tsv to end up with:
Foo\tBar\tAAA\t123
Bla\tWord\tBBB\tqwer
Blub\tqwe\tCCC\tqqq
What would be a good way to do this in bash?
Take a look at this awk:
awk 'FNR==NR{a[NR]=$3;next}{$3=a[FNR]}1' OFS='\t' file{2,1}.tsv > output.tsv
If you want to use just bash, with little more effort:
while IFS=$'\t' read -r a1 a2 _ a4; do
IFS=$'\t' read -ru3 _ _ b3 _
printf '%s\t%s\t%s\t%s\n' "$a1" "$a2" "$b3" "$a4"
done <file1.tsv 3<file2.tsv >output.tsv
Output:
Foo Bar AAA 123
Bla Word BBB qwer
Blub qwe CCC qqq
Another way to do this, with a correction pointed out by @PesaThe:
paste -d$'\t' <(cut -d$'\t' -f1,2 file1.tsv) <(cut -d$'\t' -f3 file2.tsv) <(cut -d$'\t' -f4 file1.tsv)
The output will be:
Foo Bar AAA 123
Bla Word BBB qwer
Blub qwe CCC qqq
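If the column number should be a parameter, the awk answer generalizes roughly like this. It is a sketch under a few assumptions: both files are strictly tab-separated, the donor file is not empty, and both have the same number of lines. The function name swap_column is made up. Unlike the one-liner above, it sets the input separator to a tab as well, so fields containing spaces survive.

```shell
# Hypothetical helper: print TARGET with its column COL replaced by
# column COL of DONOR, matching the files line by line.
swap_column() {
  col=$1
  donor=$2
  target=$3
  awk -F '\t' -v OFS='\t' -v c="$col" '
    FNR == NR { repl[FNR] = $c; next }   # first file: remember the donor column
    { $c = repl[FNR] }                   # second file: overwrite that column
    1                                    # print the (modified) line
  ' "$donor" "$target"
}
```

For the example in the question, that would be called as swap_column 3 file2.tsv file1.tsv > output.tsv.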

accessing text between specific words in UNIX multiple times

if the file is like this:
ram_file
abc
123
end_file
tony_file
xyz
456
end_file
bravo_file
uvw
789
end_file
Now I want to access the text between ram_file and end_file, between tony_file and end_file, and between bravo_file and end_file, all in one go. I tried sed, but I don't know how to specify *_file in it.
Thanks in advance
This awk should do the job for you.
This solution treats end_file as the end of a block, and every other xxxx_file line as the start of a block.
It will not print text between the blocks, if there is any, such as "do not print this" in my example.
awk '/end_file/{f=0} f; /_file/ && !/end_file/ {f=1}' file
abc
123
xyz
456
uvw
789
cat file
ram_file
abc
123
end_file
do not print this
tony_file
xyz
456
end_file
nor this data
bravo_file
uvw
789
end_file
If you would like some formatting, that can be done easily with awk:
awk -F_ '/end_file/{printf (f?RS:"");f=0} f; /file/ && !/end_file/ {f=1;print "-Block-"++c"--> "$1}' file
-Block-1--> ram
abc
123
-Block-2--> tony
xyz
456
-Block-3--> bravo
uvw
789
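If only one named block is needed at a time, the same flag idea can be narrowed down. A minimal sketch, assuming the start line is exactly NAME_file and the end line is exactly end_file; the function name block_body is hypothetical.

```shell
# Hypothetical helper: print the lines between "<name>_file" and the
# following "end_file", excluding the two marker lines themselves.
block_body() {
  name=$1
  shift
  awk -v start="${name}_file" '
    $0 == "end_file" { inside = 0 }   # close the block before printing
    inside { print }
    $0 == start { inside = 1 }        # open the block after the marker line
  ' "$@"
}

printf 'ram_file\nabc\n123\nend_file\ntony_file\nxyz\n456\nend_file\n' | block_body tony
# prints:
# xyz
# 456
```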

How to do a numeric UNIX's sort on fields with a character attached in front of the number

I have a very large data file (12G) that looks like this:
foo r1.1 abc
foo r10.1 pqr
qux r2.1 lmn
bar r33.1 xpq
What I want to do is to sort 2nd field numerically yielding (in reality there are more leading fields):
foo r1.1 abc
qux r2.1 lmn
foo r10.1 pqr
bar r33.1 xpq
I tried the following, but it doesn't work:
sort -k1 -n
What's the right way to do it?
How about sort -t ' ' -k2.2n, if the second field always starts with just an r and the fields are separated by single spaces?
You almost had it - you need to do:
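sort -t ' ' -k2.2n
-k1 starts from the first field. -k2.2 starts at the second character of the second field, which skips the leading r; -t ' ' makes sort split fields on single spaces so that the character count is not thrown off by leading blanks.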
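When the number of leading fields or the letter prefix varies, a decorate-sort-undecorate pipeline is one way around the key-offset details. This is a sketch, not a tuned solution for a 12G file: it assumes whitespace-separated fields and lines that do not depend on the extra key column, and the function name sort_by_rnum is made up. LC_ALL=C is set so that "." is read as the decimal point.

```shell
# Hypothetical helper: sort lines numerically by field COL after stripping
# any non-digit prefix (such as the leading "r") from that field.
sort_by_rnum() {
  col=$1
  shift
  tab=$(printf '\t')
  # 1. prepend the numeric part of the field as a tab-separated sort key
  awk -v c="$col" '{ key = $c; sub(/^[^0-9]+/, "", key); print key "\t" $0 }' "$@" |
    # 2. sort numerically on that key only
    LC_ALL=C sort -t "$tab" -k1,1n |
    # 3. drop the key again
    cut -f2-
}

printf 'foo r1.1 abc\nfoo r10.1 pqr\nqux r2.1 lmn\nbar r33.1 xpq\n' | sort_by_rnum 2
# prints:
# foo r1.1 abc
# qux r2.1 lmn
# foo r10.1 pqr
# bar r33.1 xpq
```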
