Paste output into a CSV file in bash with the paste command (Excel)

I am using the paste command as below:
message+=$(paste <(echo "${arr[0]}") <(echo "$starttimeF") <(echo "$ttime") <(echo "$time") | column -t)
echo "$message"
It gives me the following output:
ASPPPLD121 11:45:00 00:00:16 00:02:23
FPASDDF123 11:45:00 00:00:16 00:02:23
ZWASD77F0D 09:04:58 02:40:18 03:51:10
DDPADSDSD5 11:29:41 00:15:35 01:17:33
How do I redirect this to a CSV or an Excel file?
OR
How do I put it into HTML table?

You have three questions in one. The easiest to solve is the CSV file output:
echo "$message" | awk 'BEGIN{OFS=","}{$1=$1}1' > mycsvfile.csv
This will also open in Excel, since Excel can read CSV files, so maybe that solves the second question?
Awk reads the input line by line. We set the OFS (Output Field Separator) to a comma. Then we set one of the fields equal to itself, which is just a trick to make awk rebuild the record with the new separator, and the final 1 prints the result to the CSV file. Note the quotes around $message: without them the shell would collapse the newlines and awk would see everything as a single record.
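With the sample output above, mycsvfile.csv ends up containing:
ASPPPLD121,11:45:00,00:00:16,00:02:23
FPASDDF123,11:45:00,00:00:16,00:02:23
ZWASD77F0D,09:04:58,02:40:18,03:51:10
DDPADSDSD5,11:29:41,00:15:35,01:17:33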
I suppose you could use awk to print out an HTML table too, but it seems like a gross way of doing things:
echo "$message" | awk 'BEGIN{print "<table>"} {print "<tr>";for(i=1;i<=NF;i++)print "<td>" $i"</td>";print "</tr>"} END{print "</table>"}'
The HTML-output awk script was adapted from another answer.
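For the first input row the generated markup looks like this (the remaining rows repeat the same <tr> pattern before the closing </table>):
<table>
<tr>
<td>ASPPPLD121</td>
<td>11:45:00</td>
<td>00:00:16</td>
<td>00:02:23</td>
</tr>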

Related

Using awk to delete multiple lines using arguments passed via a function

My input.csv file is semicolon-separated, with the first line being a header row of attributes. The first column contains customer numbers. The function is called from a script that I run in the terminal.
I want to delete all lines containing the customer numbers that are entered as arguments for the script. EDIT: And then export the file as a different file, while keeping the original intact.
bash deleteCustomers.sh 1 3 5
Currently only the last argument is filtered from the CSV file. I understand that this is happening because the output file gets overwritten on each loop iteration, undoing all previous deletions.
How can I match all the lines to be deleted, and then delete them (or print everything BUT those lines), and then output it to one file containing ALL edits?
delete_customers () {
    echo "These customers will be deleted: $@"
    for i in "$@"; do
        awk -F ";" -v customerNR="$i" -v input="$inputFile" '($1 != customerNR) NR > 1 { print }' "input.csv" > output.csv
    done
}
delete_customers "$@"
Here's some sample input (first piece of code is the first line in the csv file). In the output CSV file I want the same formatting, with the lines for some customers completely deleted.
Klantnummer;Nationaliteit;Geslacht;Title;Voornaam;MiddleInitial;Achternaam;Adres;Stad;Provincie;Provincie-voluit;Postcode;Land;Land-voluit;email;gebruikersnaam;wachtwoord;Collectief ;label;ingangsdatum;pakket;aanvullende verzekering;status;saldo;geboortedatum
1;Dutch;female;Ms.;Josanne;S;van der Rijst;Bliek 189;Hellevoetsluis;ZH;Zuid-Holland;3225 XC;NL;Netherlands;JosannevanderRijst@dayrep.com;Sourawaspen;Lae0phaxee;Klant;CZ;11-7-2010;best;tand1;verleden;-137;30-12-1995
2;Dutch;female;Mrs.;Inci;K;du Bois;Castorweg 173;Hengelo;OV;Overijssel;7557 KL;NL;Netherlands;InciduBois@gustr.com;Hisfireeness;jee0zeiChoh;Klant;CZ;30-8-2015;goed ;geen;verleden;188;1-8-1960
3;Dutch;female;Mrs.;Lusanne;G;Hijlkema;Plutostraat 198;Den Haag;ZH;Zuid-Holland;2516 AL;NL;Netherlands;LusanneHijlkema@dayrep.com;Digum1969;eiTeThun6th;Klant;Achmea;12-2-2010;best;mix;huidig;-335;9-3-1973
4;Dutch;female;Dr.;Husna;M;Hoegee;Tiendweg 89;Ameide;ZH;Zuid-Holland;4233 VW;NL;Netherlands;HusnaHoegee@fleckens.hu;Hatimon;goe5OhS4t;Klant;VGZ;9-8-2015;goed ;gezin;huidig;144;12-8-1962
5;Dutch;male;Mr.;Sieds;D;Verspeek;Willem Albert Scholtenstraat 38;Groningen;GR;Groningen;9711 XA;NL;Netherlands;SiedsVerspeek@armyspy.com;Thade1947;Taexiet9zo;Intern;CZ;17-2-2004;beter;geen;verleden;-49;12-10-1961
6;Dutch;female;Ms.;Nazmiye;R;van Spronsen;Noorderbreedte 180;Amsterdam;NH;Noord-Holland;1034 PK;NL;Netherlands;NazmiyevanSpronsen@jourrapide.com;Whinsed;Oz9ailei;Intern;VGZ;17-6-2003;beter;mix;huidig;178;8-3-1974
7;Dutch;female;Ms.;Livia;X;Breukers;Everlaan 182;Veenendaal;UT;Utrecht;3903
Try this in a loop (a full sketch follows the explanation below):
awk -v variable="$var" '$1 != variable' input.csv
awk - makes decisions based on columns
-v - passes a shell variable into the awk command
variable - stores the value for awk to process
$var - the shell variable holding the string to search for at run time
!= - checks that the column does not match
input.csv - your input file
This is standard awk behavior: with -v, awk receives the variable at run time and prints only the lines whose first column is not equal to the value you passed. This way, you get all the rows that do not match your variable. Hope this is helpful. :)
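A minimal sketch of that loop, assuming the customer numbers arrive as script arguments (tmp.csv is just a placeholder name). Filtering a working copy on each pass means every deletion survives into the final file, and input.csv stays intact:
cp input.csv output.csv
for i in "$@"; do
    awk -F ";" -v variable="$i" '$1 != variable' output.csv > tmp.csv && mv tmp.csv output.csv
done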
This bash script should work:
#!/bin/bash
FILTER="!/(^"$(echo "$@" | sed -e "s/ /|^/g")")/ {print}"
awk "$FILTER" input.csv > output.csv
The idea is to build a suitable awk FILTER and then use it.
Assuming the call parameters are: 1 2 3, the filter will be: !/(^1|^2|^3)/ {print}
!: to invert matching
^: Beginning of the line
The input data are in the input.csv file and the result will be in the output.csv file.
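For example:
bash deleteCustomers.sh 1 3 5
With the sample data above, output.csv keeps the header row (it doesn't start with a filtered digit) plus customers 2, 4, 6 and 7. Note that ^1 would also match customer numbers 10, 11 and so on, so for larger files you may want to anchor the numbers with the field separator as well (e.g. ^1;).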

Sed/awk: Aligning words in a file

I have a file with the following structure:
# #################################################################
# TEXT: MORE TEXT
# TEXT: MORE TEXT
# #################################################################
___________________________________________________________________
ITEM 1
___________________________________________________________________
PROPERTY1: VALUE1_1
PROPERTY222: VALUE2_1
PROPERTY33: VALUE3_1
PROPERTY4444: VALUE4_1
PROPERTY55: VALUE5_1
Description1: Some text goes here
Description2: Some text goes here
___________________________________________________________________
ITEM 2
___________________________________________________________________
PROPERTY1: VALUE1_2
PROPERTY222: VALUE2_2
PROPERTY33: VALUE3_2
PROPERTY4444: VALUE4_2
PROPERTY55: VALUE5_2
Description1: Some text goes here
Description2: Some text goes here
I want to add another item to the file, using sed or awk:
sed -i -r "\$a$PROPERTY1: VALUE1_3" file.txt
sed -i -r "\$a$PROPERTY2222: VALUE2_3" file.txt
etc. So my next item looks like this:
___________________________________________________________________
ITEM 3
___________________________________________________________________
PROPERTY1: VALUE1_3
PROPERTY222: VALUE2_3
PROPERTY33: VALUE3_3
PROPERTY4444: VALUE4_3
PROPERTY55: VALUE5_3
Description1: Some text goes here
Description2: Some text goes here
The column of values is jagged. How do I align the values to the left like in the previous items? I can see 2 solutions here:
To align the values while inserting them into the file.
To insert the values into the file the way I did it and align them next.
The command
sed -i -r "s|.*:.*|&|g" file.txt
catches the properties and values I want to align, but I haven't been able to align them properly; for example:
awk '/^.*:.*$/{ printf "%-40s %-70s\n", $1, $2 }' file.txt
It prints out the file, but it includes the description tags and values, and it cuts values that contain spaces or dashes. It's just a big mess.
I've tried more commands based on what I've found on Stack Overflow and some blogs, but nothing does what I need.
Note: the values of the description tags are not jagged; that's because I write them to the file in a separate way.
What is wrong with my commands? How do I achieve what I need?
If your file doesn't contain tabs, try this:
sed -r 's/: +/:\t/' file.txt | expand -20
When this works, redirect the output to a tmpfile and move the tmpfile to file.txt.
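For example (tmp.txt is just a placeholder name):
sed -r 's/: +/:\t/' file.txt | expand -20 > tmp.txt && mv tmp.txt file.txt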
You can use gensub and thoughtful field separators to take care of this:
for i in {1..5}; do
    echo "$(( 10 ** i )): $i"
done | awk -F ':::' '/^[^:]+:.+/ {
    # replace the ": " after the key with a separator that cannot occur in the data
    $0 = gensub( /: +/, ":::", 1 )
    key = $1 ":"
    printf "%-40s %s\n", key, $2
}'
The relevant part is where we swap out ": +" for just ":::" and then do a printf to bring it back together, left-aligned. (Note that gensub is specific to GNU awk.)
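The demo loop prints the keys padded to 40 columns:
10:                                       1
100:                                      2
1000:                                     3
10000:                                    4
100000:                                   5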
You could use \t to insert tabs (rather than spaces, which are why you get the 'jagged' values).
instead of
sed -i -r "\$a$PROPERTY1: VALUE1_3" file.txt
use
sed -i -r "\$a$PROPERTY1:\t\tVALUE1_3" file.txt
All you need to do is remember the existing indentation when inserting the new line, e.g.:
echo 'PROPERTY732: VALUE9_8_7' |
awk -v prop="PROPERTY1" -v val="VALUE1_3" '
    match($0,/^PROPERTY[^[:space:]]+[[:space:]]+/) { wid=RLENGTH }
    { print }
    END { printf "%-*s%s\n", wid, prop":", val }
'
PROPERTY732: VALUE9_8_7
PROPERTY1:   VALUE1_3
but it's not clear that adding 1 line at a time makes sense or where all of the other text you're adding is coming from.
The above will work with any awk on any UNIX system.
If your "properties" don't actually start with the word PROPERTY then you just need to edit your question to show more realistic sample input/output and tell/show us how to distinguish a PROPERTY line from a Description line and, again, the solution will be trivial with awk.

How to add numbers in C-shell

I have a question about C-shell. In my script, I want to automatically add up all the numbers and get the total. How do I implement such a function in C-shell?
My script is shown below:
#!/bin/csh -f
set log_list = $1
echo "Search begins now"
foreach subdir (`cat $log_list`)
    grep "feature identified" "$subdir" -A1 | grep "ne=" | awk '{print $7}'
    echo "done"
end
This script greps each log file listed in "log_list" for the keyword "feature identified" and the following line containing the keyword "ne=". I care about the number after "ne=", for example ne=140.
Then the grep output will be like this:
ne=100
ne=115
ne=120
...
There are more than 1K lines of such numbers. Of course I could redirect the grep output to a new file (in Linux) and then copy all the data into an Excel spreadsheet to add them up. But I want to do this in the script; it would make things easier.
The final result should be like this:
total_ne=335
Do you know how to do this in the C-shell? Thanks!
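One way to do the summation is to let awk keep the running total; a minimal sketch, collecting the ne= fields into a scratch file (/tmp/ne_values.$$ is just a placeholder name):
#!/bin/csh -f
set log_list = $1
set tmp = /tmp/ne_values.$$
echo "Search begins now"
foreach subdir (`cat $log_list`)
    grep "feature identified" "$subdir" -A1 | grep "ne=" | awk '{print $7}' >> $tmp
end
awk -F= '{ total += $2 } END { print "total_ne=" total }' $tmp
rm -f $tmp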

Subsetting a CSV by unique column values

I am fairly new to Linux and feel this should be a fairly simple task, but I cannot quite figure it out. I have a large data file with millions of rows, and I want to break the file into smaller files based on date. I have a time column that contains YYMMDDHH data, and I want to create sub-files based on the DD. For each new DD, I want a new file created with all entries for that day. The file is a CSV and is already sorted by time.
From what I have read it looks like I should be able to use cat, awk and possibly grep to perform what I want.
To elaborate further, there are 14 columns per row. One column contains YYMMDDHH data (i.e. 14071000, 14071000, ..., 14071022, 14071022, ..., 14071100, ..., 14071200, ...).
I can manually subset with
cat trial | awk 'NR>=1 && NR<=100 {print}' >output.txt
This gives me the rows between 1 and 100. I was wondering if there is a command that lets me extract rows based on the YYMMDDHH column, so that all data points on 140710 end up in a single file. Hope that helps explain my problem a little better.
You should be able to use something like this:
awk -F, '{ line_date = int($1 / 100); print > ("out_" line_date ".txt") }'
BTW you might want to avoid 'useless use of cat' by not piping but using awk directly on your file.
Imagine YYMMDDHH (e.g. 14071000) is in the 1st column:
awk '{ fn = substr($1, 1, 6); print $0 >> fn }' 1.txt
awk '{print $0 >> "File" substr($1, 0, 6) ".txt"}' file
Assuming date is in the first column. Logic is to append each line to corresponding file (name of the file is the date in YYMMDD format). So that all data corresponding to each date will be in corresponding "FileYYMMDD.txt". If date is in some other column, you can just change $1 to the column number.
Sample Output:
sdlcb@Goofy-Gen:~/AMD/SO$ cat file
14071000 asasaa
14071022 iosido
14071000 lsdksld
14071022 sodisdois
14071100 iwiwe
14071022 iosido
14071100 iwiwe
14071200 yqiwyq
sdlcb@Goofy-Gen:~/AMD/SO$ awk '{print $0 >> "File" substr($1, 1, 6) ".txt"}' file
sdlcb@Goofy-Gen:~/AMD/SO$ ls
file File140710.txt File140711.txt File140712.txt
sdlcb@Goofy-Gen:~/AMD/SO$ cat File140710.txt
14071000 asasaa
14071022 iosido
14071000 lsdksld
14071022 sodisdois
14071022 iosido
sdlcb@Goofy-Gen:~/AMD/SO$ cat File140711.txt
14071100 iwiwe
14071100 iwiwe
sdlcb@Goofy-Gen:~/AMD/SO$ cat File140712.txt
14071200 yqiwyq

Executing zgrep recursively in Linux

This zgrep command outputs a particular field of each line containing the word yellow, given a giant input log file for all 24 hours of 26th Feb 1989.
zgrep 'yellow' /color_logs/1989/02/26/*/1989-02-26-00_* | cut -f3 -d'+'
1) I prefer using a perl script. Are there advantages to using a bash script?
Also, when writing this script I would like it to create a file after processing the data for each DAY (so it will look at all the hours in a day):
zgrep 'yellow' /color_logs/1989/02/*/*/1989-02-26-00_* | cut -f3 -d'+'
2) How do I determine the value of the first star (in Perl) after processing a day's worth of data, so that I can output the file with the YYMMDD in its name? I'm interested in getting the value of the first star in the line of code directly above this question.
Grep writes out the name of the file each line came from, but your cut command is throwing that away. You could do something like:
open(PROCESS, "zgrep 'yellow' /color_logs/1989/02/*/*/1989-02-26_* |");
while(<PROCESS>) {
if (m!/color_logs/(\d\d\d\d)/(\d\d)/(\d\d)/[^:]+:(.+)$!) {
my ($year, $month, $day, $data) = ($1, $2, $3, $4);
# Do the cut -f3 -d'+' on the line from the log
my $data = (split('+', $data))[2];
open(OUTFILE, ">>${year}${month}${day}.log");
print OUTFILE $data, "\n";
close(OUTFILE);
}
}
That's inefficient in that you're opening and closing the file for each line; you could use an IO::File object instead and only open when the date changes, but you get the idea.
