I have 100 file that looks like this
>file.csv
gene1,55
gene2,23
gene3,33
I want to insert the filename and make it look like this:
file.csv
gene1,55,file.csv
gene2,23,file.csv
gene3,33,file.csv
Now, I can almost get there using awk
awk '{print $0,FILENAME}' *.csv > concatenated_files.csv
But this prints the filenames with a space, instead of a comma. Is there a way to replace the space with a comma?
Is there a way to replace the space with a comma?
Yes, change the OFS
$ awk -v OFS="," '{print $0,FILENAME}' file.csv
gene1,55,file.csv
gene2,23,file.csv
gene3,33,file.csv
Figured it out, turns out:
for d in *.csv; do (awk '{print FILENAME (NF?",":"") $0}' "$d" > ${d}.all_files.csv); done
Works just fine.
You can also create a new field
awk -vOFS=, '{$++NF=FILENAME}1' file.csv
gene1,55,file.csv
gene2,23,file.csv
gene3,33,file.csv
I want to remove all the brackets value from first column and add some field in last column
Sample Input
EXAMPLE(abc#gmail.com),60,6
EXAMPLE(bcd#gmail.com),30,6
EXAMPLE1(sample#gmail.com),60,3
Sample Output
EXAMPLE,60,6.ABC
EXAMPLE,30,6,ABC
EXAMPLE1,60,3,ABC
Below is code which I tried but no luck :
for file_name in tmp/*.csv
do
sed -i 's/$/,"AB"/' "$file_name"| awk '{sub(/[(].*[)]/,"")}1' $file_name > tmp.csv && mv tmp.csv $file_name
done
when I try it with single file it is working but in look it is not working:
sed -i 's/$/,"AB"/' abc.csv| awk '{sub(/[(].*[)]/,"")}1' abc.csv > tmp.csv
Perhaps this will do:
awk '{sub(/[(].*[)]/,""); print $0",ABC"}' file | awk '{sub(/60,6,/,"60,6.")}1'
EXAMPLE,60,6.ABC
EXAMPLE,30,6,ABC
EXAMPLE1,60,3,ABC
Try this
/tmp> cat jyoti.txt
EXAMPLE(abc#gmail.com),60,6
EXAMPLE(bcd#gmail.com),30,6
EXAMPLE1(sample#gmail.com),60,3
/tmp> perl -lne 's/\(.*\)//; print "$_,ABC" ' jyoti.txt
EXAMPLE,60,6,ABC
EXAMPLE,30,6,ABC
EXAMPLE1,60,3,ABC
/tmp>
I'm trying to use sed to show only the 1st, 2nd, and 8th word in a line.
The problem I have is that the words are random, and the amount of spaces between the words are also random... For example:
QST334 FFR67 HHYT 87UYU HYHL 9876S NJI QD112 989OPI
Is there a way to get this to output as just the 1st, 2nd, and 8th words:
QST334 FFR67 QD112
Thanks for any advice or hints for the right direction!
Use awk
awk '{print $1,$2,$8}' file
In action:
$ echo "QST334 FFR67 HHYT 87UYU HYHL 9876S NJI QD112 989OPI" | awk '{print $1,$2,$8}'
QST334 FFR67 QD112
You do not really need to put " " between two columns as mentioned in another answer. By default awk consider single white space as output field separator AKA OFS. so you just need commas between the desired columns.
so following is enough:
awk '{print $1,$2,$8}' file
For Example:
echo "QST334 FFR67 HHYT 87UYU HYHL 9876S NJI QD112 989OPI" |awk '{print $1,$2,$8}'
QST334 FFR67 QD112
However, if you wish to have some other OFS then you can do as follow:
echo "QST334 FFR67 HHYT 87UYU HYHL 9876S NJI QD112 989OPI" |awk -v OFS="," '{print $1,$2,$8}'
QST334,FFR67,QD112
Note that this will put a comma between the output columns.
Another solution is to use the cut command:
cut --delimiter '<delimiter-character>' --fields <field> <file>
Where:
'<delimiter-character>'>: is the delimiter on which the string should be parsed.
<field>: specifies which column to output, could a single column 1, multiple columns 1,3 or a range of them 1-3.
In action:
cut -d ' ' -f 1-3 /path/to/file
This might work for you (GNU sed):
sed 's/\s\+/\n/g;s/.*/echo "&"|sed -n "1p;2p;8p"/e;y/\n/ /' file
Convert spaces to newlines. Evaluate each line as a separate file and print only the required lines i.e. fields. Replace remaining newlines with spaces.
I have a csv file with data as follows
16:47:07,3,r-4-VM,230000000.,0.466028518635,131072,0,0,0,60,0
16:47:11,3,r-4-VM,250000000.,0.50822578824,131072,0,0,0,0,0
16:47:14,3,r-4-VM,240000000.,0.488406067907,131072,0,0,32768,0,0
16:47:17,3,r-4-VM,230000000.,0.467893525702,131072,0,0,0,0,0
I would like to shorten the value in the 5th column.
Desired output
16:47:07,3,r-4-VM,230000000.,0.46,131072,0,0,0,60,0
16:47:11,3,r-4-VM,250000000.,0.50,131072,0,0,0,0,0
16:47:14,3,r-4-VM,240000000.,0.48,131072,0,0,32768,0,0
16:47:17,3,r-4-VM,230000000.,0.46,131072,0,0,0,0,0
Your help is highly appreciated
awk '{$5=sprintf( "%.2g", $5)} 1' OFS=, FS=, input
This will round and print .47 instead of .46 on the first line, but perhaps that is desirable.
Try with this:
cat filename | sed 's/\(^.*\)\(0\.[0-9][0-9]\)[0-9]*\(,.*\)/\1\2\3/g'
So far, the output is at GNU/Linux standard output, so
cat filename | sed 's/\(^.*\)\(0\.[0-9][0-9]\)[0-9]*\(,.*\)/\1\2\3/g' > out_filename
will send the desired result to out_filename
If rounding is not desired, i.e. 0.466028518635 needs to be printed as 0.46, use:
cat <input> | awk -F, '{$5=sprintf( "%.4s", $5)} 1' OFS=,
(This can another example of Useless use of cat)
You want it in perl, This is it:
perl -F, -lane '$F[4]=~s/^(\d+\...).*/$1/g;print join ",",#F' your_file
tested below:
> cat temp
16:47:07,3,r-4-VM,230000000.,0.466028518635,131072,0,0,0,60,0
16:47:11,3,r-4-VM,250000000.,10.50822578824,131072,0,0,0,0,0
16:47:14,3,r-4-VM,240000000.,0.488406067907,131072,0,0,32768,0,0
16:47:17,3,r-4-VM,230000000.,0.467893525702,131072,0,0,0,0,0
> perl -F, -lane '$F[4]=~s/^(\d+\...).*/$1/g;print join ",",#F' temp
16:47:07,3,r-4-VM,230000000.,0.46,131072,0,0,0,60,0
16:47:11,3,r-4-VM,250000000.,10.50,131072,0,0,0,0,0
16:47:14,3,r-4-VM,240000000.,0.48,131072,0,0,32768,0,0
16:47:17,3,r-4-VM,230000000.,0.46,131072,0,0,0,0,0
sed -r 's/^(([^,]+,){4}[^,]{4})[^,]*/\1/' file.csv
This might work for you (GNU sed):
sed -r 's/([^,]{,4})[^,]*/\1/5' file
This replaces the 5th occurence of non-commas to no more than 4 characters length.