converting variable to date format in unix script - linux

if [ -f $FILEPATH3 ]; then
#Will print the Header columns from properties file.
print $header >$CFILEPATH3
#To add rows in output file from input file .
awk -F\" 'BEGIN{OFS=FS;} {for(i=1;i<=NF;i=i+2){gsub(/,/,"~\t",$i);}}1' $FILEPATH3 > $TFILEPATH3
#Removes the footer,header and prints the columns as per mapping by picking column numbers from properties file
cat $TFILEPATH3 | sed '1d' | awk 'BEGIN { FS = "~\t"; OFS = ",";}{ DATE = date -d "$'$ms2'" "+%Y%m%d" } { printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s\n", $'$ms1', $DATE,$'$ms3',$'$ms4',$'$ms5',$'$ms6',$'$ms7',$'$ms8',"MS",$'$ms10',$'$ms11',$'$ms12',$'$ms13',$'$ms14',$'$ms15',$'$ms16',$'$ms17'}' >> $CFILEPATH3
In the above code, I'm trying to copy the data from the input file to the output file. ms1, ms2, etc. are the column positions in a CSV input file.
ms2 is the date column, in mm/dd/yyyy format, and it is held in a variable. I need to convert the variable to YYYYMMDD format and write it to the output file.
In the script I'm trying to change the date format to YYYYMMDD, but I'm getting an error.
I think the error is from this code
{ DATE = date -d "$'$ms2'" "+%Y%m%d" }

You are trying to run the date command inside awk's single quotes, which won't work: awk is not a shell and cannot execute date that way. Try the following syntax instead.
DATE=$(echo "$ms2" | awk -F '/' '{print $3$1$2}')   # mm/dd/yyyy -> YYYYMMDD
cat $TFILEPATH3 | sed '1d' | awk -v ms2="$DATE" 'BEGIN { FS = "~\t"; OFS = ",";} { printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s\n", $'$ms1', ms2,$'$ms3',$'$ms4',$'$ms5',$'$ms6',$'$ms7',$'$ms8',"MS",$'$ms10',$'$ms11',$'$ms12',$'$ms13',$'$ms14',$'$ms15',$'$ms16',$'$ms17'}' >> $CFILEPATH3
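Note that -v ms2="$DATE" injects a single fixed value, so every output row gets the same date. If the date can differ per row, the conversion can be done inside awk instead; here is a minimal sketch, assuming the shell variable ms2 holds the field number and that field contains mm/dd/yyyy:
awk -v col="$ms2" 'BEGIN { FS = "~\t" }
{
    split($col, d, "/")      # d[1]=mm, d[2]=dd, d[3]=yyyy
    print d[3] d[1] d[2]     # YYYYMMDD
}' "$TFILEPATH3"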

Related

How to print the value in the third column of the line that comes after a line containing a specific string, using AWK, to a different file?

I have an output which contains something like this in the middle.
Stopping criterion = max iterations
Energy initial, next-to-last, final =
-83909.5503696 -86748.8150981 -86748.8512012
What I am trying to do is print out the last value (3rd column) in the line after the line which contains the string "Energy" to a different file, and I have to print out these values from 100 different files. Currently I have been trying with this line, which only looks at a single file.
awk -F: '/Energy/ { getline; print $0 }' inputfile > outputfile
but this gives output like:
-83909.5503696 -86748.8150981 -86748.8512012
Update - With the help of a suggestion below I was able to output the value to a file, but as it reads through the different files it overwrites the output file, so I end up with only the value from the last file read. What I tried was this:
#SBATCH --array=1-100
num=$SLURM_ARRAY_TASK_ID
fold=$(printf '%03d' $num)
cd $main_path/surf_$fold
awk 'f{print $3; f=0} /Energy/{f=1}' inputfile > outputfile
This would not be an appropriate job for getline, see http://awk.freeshell.org/AllAboutGetline, and idk why you're setting FS to : with -F: when your fields are space-separated as awk assumes by default.
Here's how to do what I think you're trying to do with 1 call to awk:
awk 'f{print $3; f=0} /Energy/{f=1}' "$main_path/surf_"*"/inputfile" > outputfile
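If you instead want one result file per input file, awk's built-in FILENAME variable can name the output dynamically. A minimal sketch (the _energy.out suffix is just an illustrative choice):
awk 'f { print $3 > (FILENAME "_energy.out"); f=0 }
     /Energy/ { f=1 }' "$main_path"/surf_*/inputfile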

Filtering CSV file based on string name

I'm trying to get specific columns of a CSV file (those whose header contains "SOF", in this case). It's a large file and I need to copy these columns to another CSV file using shell.
I've tried something like this:
#!/bin/bash
awk '{
    i = 1
    j = 1
    while ( NR==1 )
        if ( "$i" ~ /SOF/ )
        then
            array[j] = $i
            $j += 1
        fi
        $i += 1
    for ( k in array )
        print array[k]
}' fil1.csv > result.csv
In this attempt I tried to save the numbers of the columns whose header contains "SOF" in an array, and then copy the columns using those numbers.
Preliminary note: contrary to what one may infer from the code included in the OP, the values in the CSV are delimited with a semicolon.
Here is a solution with two separate commands:
the first parses the first line of your CSV file and identifies which fields must be exported. I use awk for this.
the second only prints the fields. I use cut for this (simpler syntax and quicker than awk, especially if your file is large)
The idea is that the first command yields a list of field numbers, separated with ",", suited to be passed as parameter to cut:
# Command #1: identify fields
fields=$(awk -F";" '
    {
        for (i = 1; i <= NF; i++)
            if ($i ~ /SOF/) {
                fields = fields sep i
                sep = ","
            }
        print fields
        exit
    }' fil1.csv
)
# Command #2: export fields
{ [ -n "$fields" ] && cut -d";" -f "$fields" fil1.csv; } > result.csv
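For example, given a hypothetical header line such as ID;SOF_A;QTY;SOF_B, the first command yields the field list 2,4 and cut then exports just those two columns:
$ head -1 fil1.csv
ID;SOF_A;QTY;SOF_B
$ echo "$fields"
2,4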
try something like this...
$ awk 'BEGIN {FS=OFS=","}
NR==1 {for(i=1;i<=NF;i++) if($i~/SOF/) {col=i; break}}
{print $col}' file
Note that this only keeps the first matching column, and there is no handling for the case where the sought header doesn't exist: col then stays unset, $col evaluates to $0, and the whole line is printed.
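If several headers can match, here is a sketch that collects every matching column number from the first line and prints all of them for each record:
$ awk 'BEGIN {FS=OFS=","}
NR==1 {for(i=1;i<=NF;i++) if($i~/SOF/) cols[++n]=i}
{for(j=1;j<=n;j++) printf "%s%s", $cols[j], (j<n?OFS:ORS)}' file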
One of the useful commands you probably need is cut:
cut -d , -f 2 input.csv
Here, 2 is the number of the column you want to cut from your CSV file.
Try this one out; it transposes the file, greps the rows whose values contain SOF, then transposes the result back:
awk '{for(i=1;i<=NF;i++)a[i]=a[i]" "$i}END{for (i in a ){ print a[i] } }' filename | grep SOF | awk '{for(i=1;i<=NF;i++)a[i]=a[i]" "$i}END{for (i in a ){ print a[i] } }'

How can I concat current date with title in shell script?

I am working on a shell script with an Excel sheet. So far I have been doing it with the command below:
bash execution.sh BehatIPOP.xls| awk '/Script|scenario/' | awk 'BEGIN{print "Title\tResult"}1' | awk '0 == NR%2{printf "%s",$0;next;}1' >> BehatIPOP.xls
My requirement is to append (concat) the current date to the heading Result. I am getting the date with the command below:
$(date +"%d-%m-%y %H:%M:%S")
So the date displays like this: 25-08-16 17:00:00
But I cannot figure out how to use the date command within the command above to get a heading like this:
| Title | Result # 25-08-2016 17:00:00|
Thanks for any suggestions..
You can pick up the date inside awk and store it in a variable d like this, if that is what you mean:
awk 'BEGIN{cmd="date +\"%d-%m-%y %H:%M:%S\""; cmd |getline d; close(cmd);print "Result # " d}'
Result # 25-08-16 13:44:05
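If GNU awk is available, the built-in strftime function avoids spawning date entirely (a minimal sketch):
awk 'BEGIN{print "Result # " strftime("%d-%m-%y %H:%M:%S")}'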
Don't use awk at all for the header, just use date directly:
{ printf "Title\tResult # "; date +"%d-%m-%y %H:%M:%S"; bash execution.sh BehatIPOP.xls |
awk '/Script|scenario/' |
awk '1 == NR%2{printf "%s",$0;next;}1'; } >> BehatIPOP.xls
Note that there's no need for 2 awks, but I'm keeping that here to minimize the diff. Since I've pulled the header out of the awk, the comparison changes from 0==NR%2 to 1==NR%2.

How to Compare CSV Column using awk?

I receive a CSV like this:
column$1,column$2,column$3
john,P,10
john,P,10
john,A,20
john,T,30
john,T,10
marc,P,10
marc,C,10
marc,C,20
marc,T,30
marc,A,10
I need to sum the values and display the name and results, but column$2 needs to show the sum of the T values separately from the P, A and C values.
Output should be this:
column$1,column$2,column$3,column$4
john,PCA,40
john,T,40,CORRECT
marc,PCA,50
marc,T,30,INCORRECT
All I could do was extract the columns I need from the original CSV:
awk -F "|" '{print $8 "|" $9 "|" $4}' input.csv >> output.csv
Also sort by the correct column:
sort -t "|" -k1 input.csv >> output.csv
And add a new column to the end of the csv:
awk -F, '{NF=2}1' OFS="|" input.csv >> output.csv
I managed to sum and display the totals by column$1 and column$2, but I don't know how to group the different values of column$2:
awk -F "," '{col[$1,$2]++} END {for(i in col) print i, col[i]}' file > output
Awk is stream oriented. It processes input and outputs what you change; it does not make in-place changes to the file.
You just need to add a corresponding print
awk '{if($2 == "T") {print "MATCHED"}}'
If you want to output more than the "MATCHED" flag, you need to add the fields to the print,
e.g. '{print $1 "|" $2 "|" $3 "|" "MATCHED"}'
or use print $0 as the comment above mentions.
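Putting that together, a minimal sketch assuming the pipe-delimited layout from the attempts above:
awk -F "|" '$2 == "T" {print $1 "|" $2 "|" $3 "|" "MATCHED"}' input.csv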
Assuming that "CORRECT" and "INCORRECT" are determined by comparing the "PCA" value to the "T" value, the following awk script should do the trick:
awk -F, -vOFS=, '$2=="T"{t[$1]+=$3;n[$1]} $2!="T"{s[$1]+=$3;n[$1]} END{ for(i in n){print i,"PCA",s[i]; print i,"T",t[i],(t[i]==s[i] ? "CORRECT" : "INCORRECT")} }' inputfile
Broken out for easier reading, here's what this looks like:
awk -F, -vOFS=, '
$2=="T" {          # match all records that are "T"
    t[$1]+=$3      # add the value for this record to an array of totals
    n[$1]          # record this name in our authoritative name list
}
$2!="T" {          # match all records that are NOT "T"
    s[$1]+=$3      # add the value for this record to an array of sums
    n[$1]          # record this name too
}
END {              # Now that we've collected data, analyse the results
    for (i in n) { # step through our authoritative list of names
        print i, "PCA", s[i]
        print i, "T", t[i], (t[i]==s[i] ? "CORRECT" : "INCORRECT")
    }
}
' inputfile
Note that array order is not guaranteed in awk, so your output may not come out in the same order as your input.
If you want your output to be delimited using vertical bars, change the -vOFS=, to -vOFS='|'.
Then you can sort using:
awk ... | sort
which defaults to -k1.

Unix command to create new output file by combining 2 files based on condition

I have 2 files. Basically I want to match the column names from File 1 with the column names listed in File 2. The resulting output file should have data for the columns that match File 2 and a Null value for the remaining column names in File 2.
Example:
file1
Name|Phone_Number|Location|Email
Jim|032131|xyz|xyz#qqq.com
Tim|037903|zzz|zzz#qqq.com
Pim|039141|xxz|xxz#qqq.com
File2
Location
Name
Age
Based on these 2 files, I want to create new file which has data in the below format:
Output:
Location|Name|Age
xyz|Jim|Null
zzz|Tim|Null
xxz|Pim|Null
Is there a way to get this result using join, awk or sed? I tried with join but couldn't get it working.
$ cat tst.awk
BEGIN { FS=OFS="|" }
NR==FNR { names[++numNames] = $0; next }
FNR==1 {
    for (nameNr=1; nameNr<=numNames; nameNr++) {
        name = names[nameNr]
        printf "%s%s", name, (nameNr<numNames ? OFS : ORS)
    }
    for (i=1; i<=NF; i++) {
        name2fldNr[$i] = i
    }
    next
}
{
    for (nameNr=1; nameNr<=numNames; nameNr++) {
        name = names[nameNr]
        fldNr = name2fldNr[name]
        printf "%s%s", (fldNr ? $fldNr : "Null"), (nameNr<numNames ? OFS : ORS)
    }
}
$ awk -f tst.awk file2 file1
Location|Name|Age
xyz|Jim|Null
zzz|Tim|Null
xxz|Pim|Null
Get the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
I'd suggest using csvcut, which is part of CSVKit (https://csvkit.readthedocs.org), along the lines of the following:
#!/bin/bash
HEADERS=File2
PSV=File1
headers=$(tr '\n' ',' < "$HEADERS" | sed 's/,$//')
awk -F'|' '
    BEGIN {OFS=FS}
    NR==1 {print $0, "Age"; next}
    {print $0, "Null"}' "$PSV" |
csvcut -d'|' -c "$headers"
I realize this may not be entirely satisfactory, but csvcut doesn't currently have options to handle missing columns or translate missing data to a specified value.
