I have a file like this:
2018-01-02;1.5;abcd;111
2018-01-04;2.75;efgh;222
2018-01-07;5.25;lmno;333
2018-01-09;1.25;prs;444
I'd like to add double quotes around the non-numeric columns, so the new file should look like this:
"2018-01-02";1.5;"abcd";111
"2018-01-04";2.75;"efgh";222
"2018-01-07";5.25;"lmno";333
"2018-01-09";1.25;"prs";444
This is what I have tried so far; I know it is not the correct way:
head myfile.csv -n 4 | awk 'BEGIN{FS=OFS=";"} {gsub($1,echo $1 ,$1)} 1' | awk 'BEGIN{FS=OFS=";"} {gsub($3,echo "\"" $3 "\"",$3)} 1'
Thanks in advance.
You may use this awk that sets ; as input/output delimiter and then wraps each field with "s if that field is non-numeric:
awk '
BEGIN {
FS = OFS = ";"
}
{
for (i=1; i<=NF; ++i)
$i = ($i+0 == $i ? $i : "\"" $i "\"")
} 1' file
"2018-01-02";1.5;"abcd";111
"2018-01-04";2.75;"efgh";222
"2018-01-07";5.25;"lmno";333
"2018-01-09";1.25;"prs";444
Alternative gnu-awk solution:
awk -v RS='[;\n]' '$0+0 != $0 {$0 = "\"" $0 "\""} {ORS=RT} 1' file
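To see how the `$i+0 == $i` test used above decides, it can help to run it on a few extra values. The following is only a sketch: the function name `quote_non_numeric` and the second input line are made up for illustration, while the awk program is the first one from this answer.

```shell
# Hypothetical helper name; the awk program is the one shown in the answer.
quote_non_numeric() {
    awk '
    BEGIN { FS = OFS = ";" }
    {
        # $i+0 == $i is a numeric comparison only when the field looks like
        # a number; otherwise both sides are compared as strings and differ.
        for (i = 1; i <= NF; ++i)
            $i = ($i+0 == $i ? $i : "\"" $i "\"")
    } 1'
}

# One line from the question plus one made-up line with edge cases.
printf '%s\n' '2018-01-02;1.5;abcd;111' '-7;12abc;x1;3' | quote_non_numeric
```

The made-up second line comes out as `-7;"12abc";"x1";3`: a field such as 12abc starts with digits but is not a number as a whole, so it is quoted.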
Using GNU awk and typeof() (available since gawk 4.2): fields that are numeric strings have the strnum attribute; otherwise, they have the string attribute.
$ gawk 'BEGIN {
FS=OFS=";"
}
{
for(i=1;i<=NF;i++)
if(typeof($i)=="string")
$i=sprintf("\"%s\"",$i)
}1' file
Some output:
"2018-01-02";1.5;"abcd";111
...
Edit:
If some of the fields are already quoted:
$ gawk 'BEGIN {
    FS=OFS=";"
}
{
    for(i=1;i<=NF;i++)
        if(typeof($i)=="string")
            gsub(/^"?|"?$/,"\"",$i)
}1' <<< 'string;123;"quoted string"'
Output:
"string";123;"quoted string"
Further enhancing anubhava's solution (including handling fields that are already double-quoted):
gawk -e 'sub(".+",$-_==+$-_?"&":(_)"&"_\
)^gsub((_)_, _)^(ORS = RT)' RS='[;\n]' \_='\42'
"2018-01-02";1.5;"abcd";111
"2018-01-04";2.75;"efgh";222
"2018-01-07";5.25;"lmno";333
"2018-01-09";1.25;"prs";444
"2018-01-09";1.25;"prs";111111111111111111112222222222
222222223333333333333333333333
333344444444444444444499999999
999991111111111111111111122222
222222222222233333333333333333
333333333444444444444444444999
999999999991111111111111111111
122222222222222222233333333333
333333333333333444444444444444
444999999999999991111111111111
111111122222222222222222233333
333333333333333333333444444444
444444444999999999999991111111
111111111111122222222222222222
233333333333333333333333333444
444444444444444999999999999991
111111111111111111122222222222
222222233333333333333333333333
333444444444444444444999999999
999991111111111111111111122222
222222222222233333333333333333
333333333444444444444444444999
999999999999
In a directory, there are several files such as:
file1
file2
file3
Is there a simple way in bash to join those filenames into one line (connected by "OR"), as follows:
file1 OR file2 OR file3
Or do I need to write a script for it?
You can use this function to print all filenames (including ones with space, newline or special characters) with " OR " as separator (assuming your filename doesn't contain ASCII code 4):
orfiles() {
local IFS=$'\4'
local out="$*"
echo "${out//$'\4'/ OR }"
}
Then call it as:
orfiles *
How it works:
We set IFS (Internal Field Separator) to ASCII 4 locally inside the function.
We store the expansion of "$*" in the local variable out. This places \4 between the filenames in $out.
Finally, using Bash string substitution, we globally replace \4 with " OR " while printing $out.
When "$*" joins the arguments, it uses only the first character of IFS, so IFS cannot supply the multi-character string " OR " and we have to do this in two steps as shown above.
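The reason for the two-step approach can be checked directly. This is a minimal sketch with a made-up function name (`join_args`); it sets IFS to two characters and shows that only the first one ends up between the arguments:

```shell
# Hypothetical demo function: join all arguments the way "$*" does.
join_args() {
    local IFS=',;'          # two characters, but "$*" uses only the first
    printf '%s\n' "$*"
}

join_args file1 file2 'file 3'
```

This prints file1,file2,file 3, with no semicolon anywhere, which is why a placeholder character has to be swapped for " OR " afterwards.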
You can simply do that with
printf '%s OR ' $(ls -1 *) | sed 's/OR $/''/'; echo -e '\n'
where ls -1 * lists the contents of the directory.
One thing to keep in mind is that a filename may contain whitespace.
Use the following ls + awk solution:
ls -1 * | awk '{ r=(r)? r" OR "$0 : $0 }END{ print r }'
Workaround for filenames with newline(s):
echo -e $(ls -1b hello* | awk -v RS= '{gsub(/\n/," OR ",$0); gsub(/\\ /," ",$0); print $0}')
-b - ls option to print C-style escapes for nongraphic characters
ls -1|awk -v q='"' '{printf "%s%s", NR==1?"":" OR ", q $0 q}END{print ""}'
This is the ls and awk way to do it, with an example where a filename contains spaces:
kent$ ls -1
file1
file2
'file with OR and space'
kent$ ls -1|awk -v q='"' '{printf "%s%s", NR==1?"":" OR ", q $0 q}END{print ""}'
"file1" OR "file2" OR "file with OR and space"
$ for f in *; do printf '%s%s' "$s" "$f"; s=" OR "; done; printf '\n'
file1 OR file2 OR file3
I'm writing an awk script to parse something for me. For convenience, I want the script to be executable on Linux. Here is my code:
#!/usr/bin/awk -f
BEGIN {
FILENAME=ARGV[1]
sub_name=ARGV[2]
run=ARGV[3]
count=0
}
{
if ($4 == "ARGV[2]" && $8 == ARGV[3])
{
print $15
count=count+1
}
}
END {
print count
}
When I run my awk script on Linux like this:
./my_script 001.log type1 2
awk says:
awk: ./awk_script:23: fatal: cannot open file `type1' for reading (No such file or directory)
I just want the argument "type1" to be a variable in my script, not an input file to parse. How can I stop awk from treating it as an input file?
Thank you,
Don't use a shebang to execute the awk script as it just complicates things:
/usr/bin/awk -v sub_name="$2" -v run="$3" '
{
if ($4 == sub_name && $8 == run)
{
print $15
count=count+1
}
}
END {
print count
}
' "$1"
Note that your script could be cleaned up a bit:
/usr/bin/awk -v sub_name="$2" -v run="$3" '
($4 == sub_name) && ($8 == run) {
print $15
count++
}
END { print count+0 }
' "$1"
Delete the non-file options from ARGV:
delete ARGV[2]
delete ARGV[3]
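As a rough sketch of where those delete statements could go: copy the values in BEGIN first, then delete them. The wrapper name `count_matches` and the two sample log lines are invented for this illustration; the same BEGIN block should work unchanged in a file starting with #!/usr/bin/awk -f.

```shell
# Hypothetical wrapper; arguments: LOGFILE SUB_NAME RUN
count_matches() {
    awk '
    BEGIN {
        sub_name = ARGV[2]      # keep the values ...
        run      = ARGV[3]
        delete ARGV[2]          # ... then remove them so awk does not
        delete ARGV[3]          # try to open them as input files
    }
    $4 == sub_name && $8 == run { print $15; count++ }
    END { print count + 0 }
    ' "$@"
}

# Made-up sample data: 15 whitespace-separated fields per line.
log=$(mktemp)
printf '%s\n' 'a b c type1 e f g 2 i j k l m n HIT' \
              'a b c type2 e f g 2 i j k l m n MISS' > "$log"
count_matches "$log" type1 2
rm -f "$log"
```

With this sample data the call prints HIT followed by the count 1; only the first argument is ever opened as a file.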
If you want to use them as variables, pass them with the -v option. Written the way you have it, awk treats the second argument as another input file.
I have two CSV files, the first one looks like below:
File1:
3124,3124,0,2,,1,0,1,1,0,0,0,0,0,0,0,0,1106,11
6118,6118,0,0,,0,0,1,0,0,0,0,1,1,1,1,1,5156,51
6679,6679,0,0,,1,0,1,0,0,0,0,0,1,0,1,0,1106,11
5249,5249,0,0,,0,0,1,1,0,0,0,0,0,0,0,0,1106,13
2658,2658,0,0,,1,0,1,1,0,0,0,0,0,0,0,0,1197,11
4322,4322,0,0,,1,0,1,1,0,0,0,0,0,0,0,0,1307,13
File2:
7792,1307,2012-06-07,,,,
5249,4001,2016-07-02,,,,
6001,1334,2017-01-23,,,,
2658,4001,2009-02-09,,,,
9279,1326,2014-12-20,,,,
What I need:
If $2 in file2 equals 4001, I need to look up that line's $1 in file1; if the matching file1 line has $18 equal to 1106, print that file1 line.
the expected output:
5249,5249,0,0,,0,0,1,1,0,0,0,0,0,0,0,0,1106,13
I have tried the following, but with no success:
awk 'NR=FNR {A[$1]=$1;next} {print $1}'
P.S: The files are compressed, so I have to use the zcat command
I would try something like:
$ cat t.awk
BEGIN { FS = "," }
# Processing first file
NR == FNR && $18 == 1106 { a[$1] = $0; next }
# Processing second file
$2 == 4001 && $1 in a { print a[$1] }
$ awk -f t.awk file1.txt file2.txt
5249,5249,0,0,,0,0,1,1,0,0,0,0,0,0,0,0,1106,13
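Since the question mentions that the files are compressed, one possible way to feed them to the same logic is bash process substitution. This is a sketch under assumptions: the function name `match_gz` and the .gz filenames are hypothetical, and gzip -dc is used as the equivalent of zcat. The first rule here always skips to the next line while the first input is being read.

```shell
# Hypothetical wrapper (bash only); arguments: FILE1.gz FILE2.gz
match_gz() {
    awk -F, '
    # First input: remember rows whose 18th field is 1106, keyed by field 1.
    NR == FNR { if ($18 == 1106) a[$1] = $0; next }
    # Second input: print the remembered row when field 2 is 4001.
    $2 == 4001 && ($1 in a) { print a[$1] }
    ' <(gzip -dc "$1") <(gzip -dc "$2")
}
```

It would be called as match_gz file1.csv.gz file2.csv.gz, so each file is decompressed on the fly and awk still sees two separate inputs.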
I have a problem with this code: I can't figure out what to write as the condition to filter my file with awk.
i=0
while [ $i -lt 10 ]; # from 1 to 9, Ap1..Ap9
do
case $i in
1) RX="54:75:D0:3F:1E:F0";;
2) RX="54:75:D0:3F:4D:00";;
3) RX="54:75:D0:3F:51:50";;
4) RX="54:75:D0:3F:53:60";;
5) RX="54:75:D0:3F:56:10";;
6) RX="54:75:D0:3F:56:E0";;
7) RX="54:75:D0:3F:5A:B0";;
8) RX="54:75:D0:3F:5F:90";;
9) RX="D0:D0:FD:68:BC:70";;
*) echo "Numero invalido!";;
esac
echo "RX = $RX" #check
awk -F, '$2 =="$RX" { print $0 }' File1 > File2[$i] #this is the line!
i=$(( $i + 1 ))
done
The echo command prints correctly, but when I use the same "$RX" as the condition in awk it doesn't work (it prints a blank page).
My File1:
1417164082794,54:75:D0:3F:53:60,54:75:D0:3F:1E:F0,-75,2400,6
1417164082794,54:75:D0:3F:56:10,54:75:D0:3F:1E:F0,-93,2400,4
1417164082794,54:75:D0:3F:56:E0,54:75:D0:3F:1E:F0,-89,2400,4
1417164082794,54:75:D0:3F:5A:B0,54:75:D0:3F:1E:F0,-80,2400,4
1417164082794,54:75:D0:3F:53:60,54:75:D0:3F:1E:F0,-89,5000,2
Could you tell me the right expression for "awk -F ..."?
Thank you very much!
To pass variables from shell to awk use -v:
awk -F, -v R="$RX" '$2 ==R { print $0 }' File1 > File2[$i]
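The difference between the two forms can be seen side by side. In this small sketch the variable name `sample` is made up, and its two lines are taken from the File1 excerpt in the question:

```shell
RX='54:75:D0:3F:53:60'
sample='1417164082794,54:75:D0:3F:53:60,54:75:D0:3F:1E:F0,-75,2400,6
1417164082794,54:75:D0:3F:56:10,54:75:D0:3F:1E:F0,-93,2400,4'

# Inside single quotes the shell does not expand $RX, so awk compares $2
# with the literal string "$RX" and no line matches.
printf '%s\n' "$sample" | awk -F, '$2 == "$RX"'

# With -v the shell expands "$RX" and awk receives the value as variable R.
printf '%s\n' "$sample" | awk -F, -v R="$RX" '$2 == R'
```

The first command prints nothing, which matches the "blank page" in the question; the second prints only the line whose second field is 54:75:D0:3F:53:60.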
@Ricky - any time you write a loop in shell just to manipulate text, you have the wrong approach. It's just not what the shell was created to do; it's what awk was created to do, and the shell was created to invoke commands like awk.
Just use a single awk command: instead of reading File1 10 times and switching on variables for every line of the file, do it all once, something like this:
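BEGIN {
    FS = ","    # the MAC address is the second comma-separated field
    split(file2s,f2s)
    # The space before each backslash keeps the addresses separated
    # once the continued lines are joined into one string.
    split("54:75:D0:3F:1E:F0 \
54:75:D0:3F:4D:00 \
54:75:D0:3F:51:50 \
54:75:D0:3F:53:60 \
54:75:D0:3F:56:10 \
54:75:D0:3F:56:E0 \
54:75:D0:3F:5A:B0 \
54:75:D0:3F:5F:90 \
D0:D0:FD:68:BC:70", rxs)
    for (i in rxs) {
        rx2file2s[rxs[i]] = f2s[i]
    }
}
{
    if ($2 in rx2file2s) {
        print > rx2file2s[$2]
    }
    else {
        print NR, $2, "Numero invalido!" | "cat>&2"
    }
}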
which you'd then invoke as awk -v file2s="${File2[*]}" -f script.awk File1
I say "something like" because you didn't provide any sample input (File1 contents) or expected output (File2* values and contents) so I couldn't test it but it will be very close to what you need if not exactly right.