linux command to combine multiple columns in a tab-delim file?

Hi everyone!
How can I convert this
a 2 3 4
b 3 1 6
c 3 5 2
d 6 3 5
to the output below?
a-2:3 4
b-3:1 6
c-3:5 2
d-6:3 5
Thank you!!!

You can use awk. In your case:
awk -F'\t' -v OFS='\t' '{print $1"-"$2":"$3, $4}' input.txt
This reads the input from input.txt; you can also pipe the data into awk instead.
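As a quick check, here is a minimal sketch that rebuilds the sample input and runs an awk command on it. It assumes the input really is tab-delimited and that the fourth column should stay tab-separated; the temporary file and the `result` variable exist only for illustration.

```shell
# Build the sample input with real tab characters (assumption: tab-delimited).
tmp=$(mktemp)
printf 'a\t2\t3\t4\nb\t3\t1\t6\nc\t3\t5\t2\nd\t6\t3\t5\n' > "$tmp"

# Join columns 1 and 2 with "-", append column 3 after ":",
# and keep column 4 as a separate tab-delimited field.
result=$(awk -F'\t' -v OFS='\t' '{ print $1 "-" $2 ":" $3, $4 }' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```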

Related

AWK (or something else) Average of multiple columns from multiple files

I would appreciate some help with an awk script, or whatever would do the job.
So, I've got multiple files (with the same number of lines and columns) and I want to average every number in every column (except the first) across all the files. I have no idea how many columns there are in a file (though I could probably get the number if needed).
filename.1
1 1 2 3 4
2 3 4 5 6
3 2 3 5 6
filename.2
1 3 4 6 6
2 5 6 7 8
3 4 5 7 8
output
1 2 3 5 5
2 4 5 6 7
3 3 4 6 7
I've found this somewhere on here that does it for a single column (as far as I understand it):
awk '{a[FNR]+=$2;b[FNR]++;}END{for(i=1;i<=FNR;i++)print i,a[i]/b[i];}' fort.*
So the only change would be to replace the +=$2 with a loop over all columns? Is there a way to do that without knowing the exact number of columns?
Thanks.

$ cat tst.awk
{
    key[FNR] = $1
    for (colNr=2; colNr<=NF; colNr++) {
        sum[FNR,colNr] += $colNr
    }
}
END {
    for (rowNr=1; rowNr<=FNR; rowNr++) {
        printf "%s%s", key[rowNr], OFS
        for (colNr=2; colNr<=NF; colNr++) {
            printf "%s%s", int(sum[rowNr,colNr]/ARGIND+0.5), (colNr<NF ? OFS : ORS)
        }
    }
}
$ awk -f tst.awk file1 file2
1 2 3 5 5
2 4 5 6 7
3 3 4 6 7
The above uses GNU awk for ARGIND; with other awks, just add the line FNR==1{ARGIND++} at the start.
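For awks without ARGIND, the same idea can be sketched with an explicit file counter. This is only an illustration: the names `nfiles`, `rows` and `cols` are made up here, and the two input files are recreated from the question in a scratch directory.

```shell
# Recreate the two sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '1 1 2 3 4\n2 3 4 5 6\n3 2 3 5 6\n' > "$dir/filename.1"
printf '1 3 4 6 6\n2 5 6 7 8\n3 4 5 7 8\n' > "$dir/filename.2"

# nfiles stands in for ARGIND: it is incremented on the first line of each file.
result=$(awk '
FNR == 1 { nfiles++ }
{
    key[FNR] = $1
    for (c = 2; c <= NF; c++) sum[FNR, c] += $c
    rows = FNR; cols = NF          # remember the shape for the END block
}
END {
    for (r = 1; r <= rows; r++) {
        printf "%s", key[r]
        # int(x + 0.5) rounds the average to the nearest integer
        for (c = 2; c <= cols; c++) printf "%s%d", OFS, int(sum[r, c] / nfiles + 0.5)
        printf "%s", ORS
    }
}' "$dir/filename.1" "$dir/filename.2")
printf '%s\n' "$result"

rm -rf "$dir"
```

Saving FNR and NF into `rows` and `cols` avoids relying on their values inside END, which not every awk preserves.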

How to use paste command for different lengths of columns

I have:
file1.txt file2.txt file3.txt
8 2 2
1 2 1
8 1 0
3 3
5 3
3
4
I want to paste all these three columns in ofile.txt
I tried with
paste file1.txt file2.txt file3.txt > ofile.txt
Result I got in ofile.txt:
ofile.txt:
8 2 2
1 2 1
8 1 0
3 3
5 3
3
4
What it should look like:
ofile.txt
8 2 2
1 2 1
8 1 0
3 3
5 3
3
4

You can try this paste command in bash using process substitution:
paste <(sed 's/^[[:blank:]]*//' file1.txt) file2.txt file3.txt
8 2 2
1 2 1
8 1 0
3 3
5 3
3
4
The sed command removes leading whitespace from file1.txt.

I can reproduce your output when I create the input files with tabs.
paste also puts tabs between the columns and lays them out the way it thinks it should.
You can see the result when I replace the tabs with -:
# more x* | tr '\t' '-'
::::::::::::::
x1
::::::::::::::
-1a
-1b
-1c
-1d
::::::::::::::
x2
::::::::::::::
-2a
-2b
::::::::::::::
x3
::::::::::::::
-3a
-3b
-3c
-3d
-3e
-3f
-3g
# paste x? | tr '\t' '-'
-1a--2a--3a
-1b--2b--3b
-1c---3c
-1d---3d
---3e
---3f
---3g
Decide how you want it to look. If you want correct indentation, you need to append lines containing a tab to the files with fewer lines. Or manipulate the result: turn 3 tabs into 4, and 4 tabs at the beginning of a line into 5:
sed -e 's/\t\t\t/\t\t\t\t/' -e 's/^\t\t\t\t/\t\t\t\t\t/'
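Another way to see where each value lands is to fill the empty cells with a visible placeholder. The following is a sketch under the assumption that each input file holds one value per line with no leading whitespace; the `-` placeholder and the scratch directory are arbitrary choices for illustration.

```shell
# Recreate three single-column files of different lengths.
dir=$(mktemp -d)
printf '8\n1\n8\n'             > "$dir/file1.txt"
printf '2\n2\n1\n3\n5\n'       > "$dir/file2.txt"
printf '2\n1\n0\n3\n3\n3\n4\n' > "$dir/file3.txt"

# paste emits an empty field once a file runs out of lines;
# awk replaces every empty tab-delimited field with "-".
result=$(paste "$dir/file1.txt" "$dir/file2.txt" "$dir/file3.txt" |
    awk -F'\t' -v OFS='\t' '{ for (i = 1; i <= NF; i++) if ($i == "") $i = "-" } 1')
printf '%s\n' "$result"

rm -rf "$dir"
```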

Missing numbers from two sequences

How do I find the missing numbers in two sequences using a bash script?
For example, I have a file which contains the following data:
1 1
1 2
1 3
1 5
2 1
2 3
2 5
Output: the missing numbers are
1 4
2 2
2 4

This awk one-liner gives the requested output for the specified input:
$ awk '$2!=l2+1&&$1==l1{for(i=l2+1;i<$2;i++)print l1,i}{l1=$1;l2=$2}' file
1 4
2 2
2 4
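The one-liner is dense, so here is the same logic spread out with comments. It is a sketch only: the variable names `prev1` and `prev2` are chosen here for readability, and the sample file is recreated from the question.

```shell
# Recreate the sample data from the question.
tmp=$(mktemp)
printf '1 1\n1 2\n1 3\n1 5\n2 1\n2 3\n2 5\n' > "$tmp"

result=$(awk '
# Same group as the previous line, but column 2 skipped ahead:
# print every value that lies in between.
$1 == prev1 && $2 != prev2 + 1 {
    for (i = prev2 + 1; i < $2; i++) print prev1, i
}
# Remember the current line for the next comparison.
{ prev1 = $1; prev2 = $2 }
' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Like the one-liner, this only reports gaps between two existing lines of the same group; a missing first or last number in a group goes unnoticed.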

A solution using grep:
printf "%s\n" {1..2}" "{1..5} | grep -vf file
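The brace expansion above needs bash. A sketch of the same approach for a plain POSIX shell follows; the bounds (groups 1 to 2, values 1 to 5) are assumptions taken from the sample data, and `-x -F` are added so that file lines are matched as whole fixed strings rather than as regular expressions.

```shell
# Recreate the sample data from the question.
tmp=$(mktemp)
printf '1 1\n1 2\n1 3\n1 5\n2 1\n2 3\n2 5\n' > "$tmp"

# Generate every expected pair, then drop the pairs already present in the file.
# -F: patterns are fixed strings, -x: a pattern must match the whole line.
result=$(
    for g in 1 2; do
        for v in 1 2 3 4 5; do
            printf '%s %s\n' "$g" "$v"
        done
    done | grep -vxFf "$tmp"
)
printf '%s\n' "$result"

rm -f "$tmp"
```

Unlike the awk approach, this also reports a number missing at the start or end of a group, as long as the bounds are known in advance.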

Searching a column in a unix file?

I have the data file below:
136110828724515000007700877
137110904734015000007700877
138110911724215000007700877
127110626724515000007700871
127110626726015000007700871
131110724724515000007700871
134110814725015000007700871
134110814734015000007700871
104110122726027000001810072
107110208724527000002900000
And I want to extract the value of column 3, i.e. the values 6787714447.
I tried using:
awk "print $3" <filename>
but it didn't work. What should I use instead?

This is a better job for cut:
$ cut -c 3 < file
6
7
8
7
7
1
4
4
4
7
As per man cut:
-c, --characters=LIST
select only these characters
To make them all appear on the same line, pipe to tr -d '\n':
$ cut -c 3 < file | tr -d '\n'
6787714447
Or even pipe that to sed to add the newline at the end:
$ cut -c 3 < file | tr -d '\n' | sed 's/$/\n/'
6787714447
With grep:
$ grep -oP "^..\K." file
6
7
8
7
7
1
4
4
4
7
With sed:
$ sed -r 's/..(.).*/\1/' file
6
7
8
7
7
1
4
4
4
7
With awk:
$ awk '{split ($0, a, ""); print a[3]}' file
6
7
8
7
7
1
4
4
4
7
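As far as I know, splitting on an empty separator is a common awk extension rather than guaranteed behaviour, so here is a sketch that uses substr instead and also joins the characters onto one line. The data file is recreated from the question; the `result` variable is only for illustration.

```shell
# Recreate the data file from the question.
tmp=$(mktemp)
printf '%s\n' \
    136110828724515000007700877 137110904734015000007700877 \
    138110911724215000007700877 127110626724515000007700871 \
    127110626726015000007700871 131110724724515000007700871 \
    134110814725015000007700871 134110814734015000007700871 \
    104110122726027000001810072 107110208724527000002900000 > "$tmp"

# Print the third character of every line without a newline,
# then emit a single newline at the end.
result=$(awk '{ printf "%s", substr($0, 3, 1) } END { print "" }' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```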

Cut is probably the simpler/cleaner option, but here are two alternatives:
AWK version:
awk '{print substr($1, 3, 1) }' <filename>
Python version:
python3 -c 'print("\n".join(line[2] for line in open("<filename>")))'
EDIT: Please see 1_CR's comments and disregard this option in favour of his.

How can I separate some repeated patterns in a row into multiple rows using bash script?

I have a problem with a bash script.
I've got a string which has some repeated patterns like this:
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 ...
Each field is separated by a tab.
I want it to look like this:
1 2 3 4
1 2 3 4
1 2 3 4
…
How can I solve this problem using shell tools like cut, sed, awk ...?
I've tried a command like cut -f 'seq 4, 4, 40' example.txt
but it doesn't work.
It looks very easy, but it is difficult for me.

You can use sed like this:
s='1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4'
p='1 2 3 4'
echo "$s"|sed "s/$p\s*/&\n/g"
1 2 3 4
1 2 3 4
1 2 3 4
1 2 3 4
Live Demo: http://ideone.com/P59OCJ

Here's a pure bash solution:
IFS=$'\t' set -- $(<input_file)
seen=()
while [[ $1 ]]; do
    if (( ${seen[$1]} )); then # If we've seen the value before, start a new line.
        echo
        unset seen
    fi
    printf '%s ' "$1"
    seen[$1]=1
    shift
done

If you know the ending number of your sequence beforehand, you can do something like:
LAST_NUMBER=4
sed -e "s/$LAST_NUMBER\t*/&\n/g" < example.txt
Just replace 4 with the last number of the sequence.
If you don't know the number, you have to search for it using the following:
#!/bin/bash
declare -A CHECKED_NUMBERS
LAST_NUMBER=
while read LINE; do
    SPLIT_LINE=$(cut -d" " -f1- <<< "$LINE")
    for number in $SPLIT_LINE; do
        if [ "${CHECKED_NUMBERS[$number]}" == "1" ]; then
            LAST_NUMBER=$number
        else
            CHECKED_NUMBERS[$number]=1
        fi
    done
done < example.txt
# do the replacement
sed -e "s/$LAST_NUMBER\t*/&\n/g" < example.txt

An awk version
awk '{for (i=1;i<=NF;i++) {printf "%s"(i%4?" ":"\n"),$i}}' file
1 2 3 4
1 2 3 4
1 2 3 4
1 2 3 4
A GNU awk version
awk -v RS="\t" '{printf "%s"(NR%4?" ":"\n"),$0}' file
1 2 3 4
1 2 3 4
1 2 3 4
1 2 3 4
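The group size does not have to be hard-coded. A small sketch, assuming the pattern length is known and passed in as a variable (the name `n` is chosen here for illustration):

```shell
# One tab-separated input line holding three repetitions of the pattern.
line=$(printf '1\t2\t3\t4\t1\t2\t3\t4\t1\t2\t3\t4')

# Print each field followed by OFS, except every n-th field,
# which is followed by ORS (a newline by default).
result=$(printf '%s\n' "$line" |
    awk -v n=4 '{ for (i = 1; i <= NF; i++) printf "%s%s", $i, (i % n ? OFS : ORS) }')
printf '%s\n' "$result"
```

If the number of fields is not a multiple of `n`, the last output line ends with a separator instead of a newline.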

xargs may help:
kent$ echo "1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4"|xargs -n4
1 2 3 4
1 2 3 4
1 2 3 4
1 2 3 4

This might work for you:
printf "%s\t%s\t%s\t%s\n" $string
or, if you want the fields space-separated:
printf "%s %s %s %s\n" $string
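This works because printf reuses its format string until all arguments are consumed. A minimal sketch, assuming the values sit in a shell variable named `string` as in the answer:

```shell
string='1 2 3 4 1 2 3 4'

# $string is deliberately left unquoted so the shell splits it into words;
# printf then applies the four-slot format once per group of four words.
result=$(printf '%s %s %s %s\n' $string)
printf '%s\n' "$result"
```

If the word count is not a multiple of four, the remaining slots in the last line are filled with empty strings.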
