Strip whitespaces in AWK - linux

I have tried to remove all the white spaces but it's not taking it.
awk -F, -v OFS=", " 'NR==1 {
print $0,"FILENAME,DATE_LOADED,TEST";
next
}
{
line=$0
key=echo "${11//[[:space:]]/}" "${12//[[:space:]]/}" "${57//[[:space:]]/}"
key | getline
k=$0
cmd="md5 <<<"k
cmd | getline
md5sum=$0
print line, ENVIRON["FILE"], ENVIRON["ISODATE"], md5sum
}' $FILE > $NAME"_ready.csv"
If I try this it throws errors. I tried all options and really at a loss here.
key=echo $11$12$57 | tr -d

awk '{gsub(" +","");print $0}' test.text
If you want only awk based solution then this one liner will remove all the spaces from the file.

Related

Joining consecutive lines using awk

How can i join consecutive lines into a single lines using awk? Actually i have this with my awk command:
awk -F "\"*;\"*" '{if (NR!=1) {print $2}}' file.csv
I remove the first line
44895436200043
38401951900014
72204547300054
38929771400013
32116464200027
50744963500014
i want to have this:
44895436200043 38401951900014 72204547300054 38929771400013 32116464200027 50744963500014
csv file
That's a job for tr:
# tail -n +2 prints the whole file from line 2 on
# tr '\n' ' ' translates newlines to spaces
tail -n +2 file | tr '\n' ' '
With awk, you can achieve this by changing the output record separator to " ":
# BEGIN{ORS= " "} sets the internal output record separator to a single space
# NR!=1 adds a condition to the default action (print)
awk 'BEGIN{ORS=" "} NR!=1' file
I assume you want to modify your existing awk, so that it prints a horizontal space separated list, instead of words, one per row.
You can replace the print $2 action in your command, you can do this:
awk -F "\"*;\"*" 'NR!=1{u=u s $2; s=" "} END {print u}' file.csv
or replace the ORS (output record separator)
awk -F "\"*;\"*" -v ORS=" " 'NR!=1{print $2}' file.csv
or pipe output to xargs:
awk -F "\"*;\"*" 'NR!=1{print $2}' file.csv | xargs

How to get 1st field of a file only when 2nd field matches a string?

How to get 1st field of a file only when 2nd field matches a given string?
#cat temp.txt
Ankit pass
amit pass
aman fail
abhay pass
asha fail
ashu fail
cat temp.txt | awk -F"\t" '$2 == "fail" { print $1 }'*
gives no output
Another syntax with awk:
awk '$2 ~ /^faild$/{print $1}' input_file
A deleted 'cat' command.
^ start string
$ end string
It's the best way to match patten.
Either:
Your fields are not tab-separated or
You have blanks at the end of the relevant lines or
You have DOS line-endings and so there are CRs at the end of every
line and so also at the end of every $2 in every line (see
Why does my tool output overwrite itself and how do I fix it?)
With GNU cat you can run cat -Tev temp.txt to see tabs (^I), CRs (^M) and line endings ($).
Your code seems to work fine when I remove the * at the end
cat temp.txt | awk -F"\t" '$2 == "fail" { print $1 }'
The other thing to check is if your file is using tab or spaces. My copy/paste of your data file copied spaces, so I needed this line:
cat temp.txt | awk '$2 == "fail" { print $1 }'
The other way of doing this is with grep:
cat temp.txt | grep fail$ | awk '{ print $1 }'

How to join every newline Strings within single or double quote

How to join every newline Strings within single or double quote separated by comma.
Example:
I have below names..
$ cat file
James kurt
Suji sane
Bhujji La
Loki Hapa
Desired:
"James kurt", "Suji sane", "Bhujji La", "Loki Hapa"
EDIT:
My Side Efforts:
Below which i have done but there i'm completing it in two steps, jst curious if it can be clubbed into one only.
$ awk '{print "\x22" $1" "$2 "\x22"}'| tr '\n' ','
First print all lines with the " and then join the lines with a comma:
< file xargs -d '\n' printf '"%s"\n' | paste -sd,
Instead of newline you could just remove trailing (or leading comma):
< file xargs -d '\n' printf '"%s",' | sed 's/,$//'
< file xargs -d '\n' printf ',"%s"' | cut -c2-
< file xargs -d '\n' printf ', "%s"' | cut -c3- # with space after comma
With sed add the " and hold the lines, then on last line replace newline with comma and remove the leading command and print:
sed -n 's/^/"/;s/$/"/;H;${x;s/\n/, /g;s/^, //;p}' file
You were close! The " " in your attempt adds a space between the line and ". You could:
awk '{print "\x22" $0 "\x22"}' | tr '\n' ',' |
# and then remove trailing comma:
sed 's/,$//'
But joining the lines with paste is just simpler then replacing newlines with comma and removing the last one:
awk '{print "\x22" $0 "\x22"}' | paste -sd,
Could you please try following.
awk -v lines=$(wc -l < Input_file) -v s1="\"" '
BEGIN{
OFS=", "
}
{
printf("%s%s",s1 $0 s1,lines==FNR?ORS:OFS)
}
' Input_file
Explanation: Adding detailed explanation for above.
awk -v lines=$(wc -l < Input_file) -v s1="\"" ' ##Starting awk program, creating variable lines which has total number of lines in Input_file and creating s1 variable with " in it.
BEGIN{ ##Starting BEGIN section of this program from here.
OFS=", " ##Setting OFS value as comma space here.
}
{
printf("%s%s",s1 $0 s1,lines==FNR?ORS:OFS) ##Printing current line and either printing space or new line as per condition.
}
' Input_file ##Mentioning Input_file name here.
awk '{printf "%s",(NR==1?"":",")"\042"$0"\042"}END{print ""}'
Note that the last END statement is only used to add the last new-line to the output. This makes it POSIX complaint.
This might work for you (GNU sed):
sed ':a;N;$!ba;s/.*/"&"/mg;s/\n/, /g' file
Slurp file into the pattern space, surround lines by double quotes and replace newlines by a comma and a space.
Alternative:
sed -z 's/\n$//;s/.*/"&"/mg;s/\n/, /g;s/$/\n/' file

replace sed command text inline

I have this file
file.txt
unknown#mail.com||unknown#mail.com||
unknown#mail2.com||unknown#mail2.com||
unknown#mail3.com||unknown#mail3.com||
unknown#mail4.com||unknown#mail4.com||
unknownpass
unknownpass2
unknownpass3
unknownpass4
How can I use the sed command to obtain this:
unknown#mail.com|unknownpass|unknown#mail.com|unknownpass|
unknown#mail2.com|unknownpass2|unknown#mail2.com|unknownpass2|
unknown#mail3.com|unknownpass3|unknown#mail3.com|unknownpass3|
unknown#mail4.com|unknownpass4|unknown#mail4.com|unknownpass4|
This might work for you (GNU sed):
sed ':a;N;/\n[^|\n]*$/!ba;s/||\([^|]*\)||\(\n.*\)*\n\(.*\)$/|\3|\1|\3|\2/;P;D' file
Slurp the first part of the file into pattern space and one of the replacements, substitute, print and delete the first line and then repeat.
Well, this does use sed anyway:
{ sed -n 5,\$p file.txt; sed 4q file.txt; } | awk 'NR<5{a[NR]=$0; next}
{$2=a[NR-4]; $4=a[NR-4]} 1' FS=\| OFS=\|
awk to the rescue!
awk 'BEGIN {FS=OFS="|"}
NR==FNR {if(NF==1) a[++c]=$1; next}
NF>4 {$2=a[FNR]; $4=$2; print}' file{,}
a two pass algorithm, caches the entries in the first round and inserts them into the empty fields, assumes the number of items match.
Here is another approach with one pass, powered by tac wrapped awk
tac file |
awk 'BEGIN {FS=OFS="|"}
NF==1 {a[++c]=$1}
NF>4 {$2=a[c--]; $4=$2; print}' |
tac
I would combine the related lines with paste and reshuffle the elements with awk (I assume the related lines are exactly half a file away):
n=$(wc -l < file.txt)
paste -d'|' <(head -n $((n/2)) file.txt) <(tail -n $((n/2)) file.txt) |
awk '{ print $1, $6, $3, $6, "" }' FS='|' OFS='|'
Output:
unknown#mail.com|unknownpass|unknown#mail.com|unknownpass|
unknown#mail2.com|unknownpass2|unknown#mail2.com|unknownpass2|
unknown#mail3.com|unknownpass3|unknown#mail3.com|unknownpass3|
unknown#mail4.com|unknownpass4|unknown#mail4.com|unknownpass4|

Linux/Unix bash basic script awk/sed

I'm working on bash script.
var=$(ls -t1 | head -n1);
cat $var | sed 's/"//g' > latest.csv
cat latest.csv | sed -e 's/^\|$/"/g' -e 's/,/","/g' > from_epos.csv
echo "LATEST: $var";
Here's the whole script, it's meant to delete all quotation mark from current file and add new one, between each field.
INPUT:
"sku","item","price","qty"
5135,"ITEM1",1.79,5
5338,"ITEM2",1.39,5
5318,"ITEM3",1.09,5
5235,"ITEM4",1.09,5
9706,"ITEM5",1.99,5
OUTPUT:
"sku","item","price","qty"
"5135","ITEM1","1.79","5
"
"5338","ITEM2","1.39","5
"
"5318","ITEM3","1.09","5
"
"5235","ITEM4","1.09","5
"
"9706","ITEM5","1.09","5
"
My ideal output is:
"sku","item","price","qty"
"5135","ITEM1","1.79","5"
"5338","ITEM2","1.39","5"
"5318","ITEM3","1.09","5"
"5235","ITEM4","1.09","5"
"9706","ITEM5","1.99","5"
It seems like it's entering random character between line in current output like
" and quotation mark is between CR and LF.
What's the problem and how to get it to my ideal vision?
Thanks,
Adam
awk 'BEGIN{FS=OFS=","}{gsub(/\"/,"");gsub(/[^,]+/,"\"&\"")}1' input
Solution using sed:
sed -e 's/"//g; s/,/","/g; s/^/"/; s/$/"/'
Long-piped-commented version:
sed -e 's/"//g' | # removes all quotations
sed -e 's/,/","/g' | # changes all colons to ","
sed -e 's/^/"/; s/$/"/' # puts quotations in the start and end of each line
awk can do all this in one command:
awk -F"," 'NR>1{for(i=1; i<=NF; i++) {if (!($i ~ /^"/)) printf("\"%s\"",$i);
else printf("%s",$i); if (i<NF) printf(","); else print "";}}' latest.csv
EDIT:
Try this awk: (modified from JS's suggested command)
awk 'BEGIN{FS=OFS=","}{gsub(/\"/,"");gsub(/[^,\r]+/,"\"&\"")}1'
OR
awk -F"[,\r]" 'NR==1{print} NR>1{for(i=1; i<NF; i++) {if (!($i ~ /^"/))
printf("\"%s\"",$i); else printf("%s",$i); if (i<NF-1) printf(",");
else print "";}}'

Resources