Modify values of a CSV based on previous values using awk

I'd like to replace extraneous data values (any value >100,000) of a csv with the previous non-extraneous value.
Input
1/1/2017 01:00 56242
1/1/2017 02:00 51214
1/1/2017 03:00 101442
1/1/2017 04:00 44242
1/1/2017 05:00 990919
1/1/2017 06:00 221512
1/1/2017 07:00 52100
Expected Output
1/1/2017 01:00 56242
1/1/2017 02:00 51214
1/1/2017 03:00 51214
1/1/2017 04:00 44242
1/1/2017 05:00 44242
1/1/2017 06:00 44242
1/1/2017 07:00 52100
Can this be done using awk in a bash script?
Something like:
awk 'BEGIN{FS=OFS=" "} NF{print $1, $2, ($3 >=100000 "not sure") output.csv
Any help would be appreciated - I'm not very familiar with awk.

awk to the rescue!
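awk '{if($NF>100000) $NF=p; else p=$NF}1' file
Here p holds the most recent last-field value that did not exceed 100000. Any larger value is overwritten with p, and the trailing 1 prints every line. If the very first line is already above the limit, p is still empty, so that field is printed blank.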
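For readers who want the same carry-forward logic outside awk, here is a minimal Python sketch. The function name replace_outliers and the limit parameter are illustrative, not part of the original answer. Unlike the awk one-liner, this sketch leaves a too-large value untouched when no earlier acceptable value exists.

```python
def replace_outliers(lines, limit=100000):
    """Replace a last field above `limit` with the last accepted value."""
    previous = None  # last field seen that was at or below the limit
    result = []
    for line in lines:
        fields = line.split()
        if not fields:
            # Keep blank lines as they are.
            result.append(line)
            continue
        value = float(fields[-1])
        if value > limit:
            if previous is not None:
                fields[-1] = previous  # carry the earlier value forward
        else:
            previous = fields[-1]      # remember this acceptable value
        result.append(" ".join(fields))
    return result
```

Calling replace_outliers on the seven input lines from the question yields the expected output shown there.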

Related

Excel: How to change the date column to display only Month of the date?

I'm working on generating a report by month and I'm going to create Graph based on the month values.
Consider I'm having 1000+ records in my excel sheet and there is a column called created_date which is containing the values like 11/1/2019 1:34:00 AM. I'm looking for a function or any solution to convert the created_date value to 11 or 11/2019 so I can generate a chart by Month.
Note: I'm using the online version of Microsoft Excel for this operation.
For Example - I have attached some records below.
Created_date
11/1/2019 1:34
11/1/2019 0:10
10/31/2019 19:31
10/31/2019 8:32
10/31/2019 3:59
10/31/2019 0:06
10/29/2019 23:48
10/29/2019 23:37
10/29/2019 22:35
10/29/2019 22:33
10/29/2019 22:26
10/29/2019 19:15
10/25/2019 20:44
10/25/2019 3:36
10/5/2019 3:25
10/5/2019 1:52
10/3/2019 0:40
10/2/2019 19:23
10/1/2019 3:56
9/27/2019 4:23
9/27/2019 0:19
9/25/2019 0:46
9/24/2019 22:22
9/24/2019 22:20
9/24/2019 17:12
9/20/2019 20:21

Assuming your data is in cell A1, enter the formula =TEXT(A1,"mm/yyyy") in cell B1. This will give you an output of 11/2019. If you only want a two-digit month, change the formula to =TEXT(A1,"mm"). You can then point your chart at this new column.
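If the same grouping is needed outside Excel, the month label can be sketched in Python. This is only an illustration: the function name month_label and the assumed input layout (month/day/year followed by a 24-hour time, as in the sample rows) are not from the original answer.

```python
from collections import Counter
from datetime import datetime

def month_label(text, fmt="%m/%d/%Y %H:%M"):
    """Return a label such as '11/2019' for a string like '11/1/2019 1:34'."""
    return datetime.strptime(text, fmt).strftime("%m/%Y")

def count_per_month(values):
    """Count how many records fall in each month label."""
    return Counter(month_label(v) for v in values)
```

The counts returned by count_per_month play the role of the helper column that the chart would reference.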

How do I find a list of available timeslots given scheduled dates from list in Excel/VBA sheet?

I have a list of scheduled appointments already and want to be able to show all possible timeslots available between 7:30 AM - 5:00 PM for a 2 hour appointment. I've tried a visual and been able to get it through a hack, but I need to get it to work just from reading the below table
SCHEDULED APPOINTMENTS
|---------------------|-------------------|
| Start Date/Time | End Date/Time |
| 6/12/2019 7:30 AM | 6/12/2019 8:30 AM |
| 6/12/2019 8:45 AM | 6/12/2019 9:15 AM |
| 6/12/2019 3:00 PM | 6/12/2019 3:30 PM |
| 6/12/2019 3:45 PM | 6/12/2019 4:15 PM |
| 6/12/2019 4:15 PM | 6/12/2019 5:00 PM |
|---------------------|-------------------|
EXPECTED OUTCOME:
6/12/2019 9:15 AM
6/12/2019 9:30 AM
6/12/2019 9:45 AM
6/12/2019 10:00 AM
6/12/2019 10:15 AM
6/12/2019 10:30 AM
6/12/2019 10:45 AM
6/12/2019 11:00 AM
6/12/2019 11:15 AM
6/12/2019 11:30 AM
6/12/2019 11:45 AM
6/12/2019 12:00 PM
6/12/2019 12:15 PM
6/12/2019 12:30 PM
6/12/2019 12:45 PM
6/12/2019 1:00 PM

Getting just that list directly would require VBA, which is possible, but StackOverflow is not a write-your-code-for-you service. We would help if you got stuck with your code, but you need to know how to code in the first place and have made a start.
That said, if you accept a slightly easier solution, then a single formula can give you your desired result:
Convert your appointments range to a data table with column headings "Start" and "End"
Set the table name to "Appointments"
Store your new appointment length (2) in a cell and give it the name "Length"
Create a list of every possible appointment start time, starting from A1
Enter this formula next to the first time in B1, and save it by pressing CTRL+SHIFT+ENTER:
=AND((ROUND(Appointments[Start],4)>=ROUND(A1+Length/24,4))+(ROUND(Appointments[End],4)<=ROUND(A1,4)),ROUND(A1-TRUNC(A1),4)<=ROUND((17-Length)/24,4))
Then fill down that formula against every time slot and it will say TRUE for the available time slots.
For each possible time slot, the formula checks that every existing appointment either ends at or before the slot's start, or starts 2 or more hours after it. It also checks that at least 2 hours remain before the day ends at 5pm. To handle a different length for the new appointment, change the value in the "Length" cell.
The ROUND functions are added to eliminate issues with floating point precision on fractions/times not always correctly identifying when 2 times are the same.
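The check the formula performs can also be sketched in Python, which may help when porting the idea to code. The names parse, is_available and available_slots, as well as the 15-minute step, are assumptions made for this illustration. Because datetime arithmetic is exact, the rounding used in the Excel formula is not needed here.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%Y %I:%M %p"  # matches strings such as "6/12/2019 7:30 AM"

def parse(text):
    return datetime.strptime(text, FMT)

def is_available(slot, appointments, day_end, length=timedelta(hours=2)):
    """True if a new appointment starting at `slot` fits.

    It must end by `day_end`, and every existing appointment must either
    start at or after the new end, or end at or before the new start.
    """
    new_end = slot + length
    if new_end > day_end:
        return False
    return all(start >= new_end or end <= slot for start, end in appointments)

def available_slots(day_start, day_end, appointments,
                    step=timedelta(minutes=15), length=timedelta(hours=2)):
    """List every candidate start time in the day that passes the check."""
    slots = []
    slot = day_start
    while slot < day_end:
        if is_available(slot, appointments, day_end, length):
            slots.append(slot)
        slot += step
    return slots
```

With the five scheduled appointments from the question and a day running from 7:30 AM to 5:00 PM, available_slots returns the start times from 9:15 AM through 1:00 PM.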

Excel: convert float number to time

In Excel I need to convert float numbers to time.
For example:
8,3 must become 08:30
10 must become 10:00
11,3 must become 11:30
Any ideas?
Thank you

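Simple: treat the digits after the decimal point as minutes, convert the result to a fraction of a day, and format the cell as a time: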
=(INT(A1)+(A1-INT(A1))/0.6)/24
Input | Output
----- | --------
8.3 | 08:30:00
10 | 10:00:00
11.3 | 11:30:00
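The arithmetic behind the formula can be checked with a short Python sketch. The function names are illustrative, and the sketch assumes the fractional part always encodes minutes as two decimal digits (so .3 means 30 minutes).

```python
def split_hours_minutes(value):
    """Split 8.3 into (8, 30): whole hours and the minutes in the decimals."""
    hours = int(value)
    minutes = round((value - hours) * 100)  # .3 -> 30 minutes
    return hours, minutes

def to_clock(value):
    """Format the value as HH:MM."""
    hours, minutes = split_hours_minutes(value)
    return f"{hours:02d}:{minutes:02d}"

def to_day_fraction(value):
    """Fraction of a day, which is how Excel stores a time."""
    hours, minutes = split_hours_minutes(value)
    return (hours + minutes / 60) / 24
```

For 8.3 this gives 8 hours and 30 minutes, that is 8.5 hours, or 8.5/24 of a day, which is the quantity the Excel formula computes.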

How to count entries with specific time range in Excel?

Can anyone help me with this brain teaser? :)
I need to count entries by hour and date, and as the list is huge, a formula will save my life.
Below is an example of how it looks.
Thank you in advance for your help!
17/05/2017 00:40
17/05/2017 01:10
17/05/2017 04:30
17/05/2017 05:00
17/05/2017 05:00
17/05/2017 05:05
17/05/2017 05:15
17/05/2017 05:20
17/05/2017 05:20
17/05/2017 05:30
17/05/2017 05:30
17/05/2017 05:30
17/05/2017 05:40
17/05/2017 05:45
17/05/2017 05:45
17/05/2017 05:50
17/05/2017 06:00
17/05/2017 06:00
17/05/2017 06:00
17/05/2017 06:20
17/05/2017 06:25

To do it with your dates and times in one column, use the following, where C1 and D1 hold the start and end times of the range:
=SUMPRODUCT((MOD($A$1:$A$21,1)>=C1)*(MOD($A$1:$A$21,1)<=D1))
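For comparison, counting by date and hour can be sketched in Python. The function name count_by_hour is made up for this example, and the format string assumes day/month/year timestamps as in the sample data.

```python
from collections import Counter
from datetime import datetime

def count_by_hour(stamps, fmt="%d/%m/%Y %H:%M"):
    """Count entries per (date, hour) pair."""
    counts = Counter()
    for text in stamps:
        moment = datetime.strptime(text, fmt)
        # Key on the calendar date and the hour of the day.
        counts[(moment.date().isoformat(), moment.hour)] += 1
    return counts
```

Each key is a pair such as ("2017-05-17", 5), so the same hour on different dates is counted separately.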

Edit: If the date and time are in one column, just use DATA --> Text to Columns with SPACE as the delimiter to split them into two columns. There may be other ways to get your answer while keeping the info in one column, but that would likely need a relatively convoluted formula. Text to Columns allows for quicker analysis.
If your data is in two columns, you can use COUNTIFS():
=COUNTIFS($B$1:$B$21,">="&E1,$B$1:$B$21,"<="&F1)

If your data is in one column, add the following formula in the column to its right
=REPLACE(REPT(REPT("0",2-LEN(HOUR(A1)))&HOUR(A1),2),3,0,":00 - ")&":55"
and then use a pivot table to count each group.

sed - extract specific characters from a string

So I have some unclean HTML:
"<table class="content divbackground"><tr><td class='title'> </td><td class='title'>From</td><td class='title'>To</td></tr><tr><td class='entry'>Monday</td><td class='entry'>09:00</td><td class='entry'>18:00</td></tr><tr><td class='entry'>Tuesday</td><td class='entry'>09:00</td><td class='entry'>18:00</td></tr><tr><td class='entry'>Wednesday</td><td class='entry'>09:00</td><td class='entry'>18:00</td></tr><tr><td class='entry'>Thursday</td><td class='entry'>09:00</td><td class='entry'>20:00</td></tr><tr><td class='entry'>Friday</td><td class='entry'>09:00</td><td class='entry'>20:00</td></tr><tr><td class='entry'>Saturday</td><td class='entry'>09:00</td><td class='entry'>18:00</td></tr><tr><td class='entry'>Sunday</td><td class='entry'>11:00</td><td class='entry'>18:00</td></tr></table></td></td>"
It's the opening hours of a pharmacy (the information is published on a public register).
Now I could parse the HTML using a parser, but I find that this is not robust to errors and I still have to pull out the code between <table> and </table>.
Is there some nice Unix command (sed?) that searches for all occurrences of:
XX:XX
inside <td></td> tags
where X must be a number?

Handling HTML with regex is not good practice. However, if your input format is fixed, you can try this grep line (the -P option requires GNU grep):
grep -oP '<td[^>]*>\K\d\d:\d\d' input
with your example input, it outputs:
09:00
18:00
09:00
18:00
09:00
18:00
09:00
20:00
09:00
20:00
09:00
18:00
11:00
18:00
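The same pattern can be tried in Python, which may be handy where GNU grep is unavailable. This is a sketch under the same assumption that the input format is fixed; the function name extract_times is illustrative. A capture group stands in for grep's \K.

```python
import re

# A <td ...> opening tag immediately followed by a time of the form NN:NN.
TIME_IN_TD = re.compile(r"<td[^>]*>(\d\d:\d\d)")

def extract_times(html):
    """Return every NN:NN value that directly follows a <td> opening tag."""
    return TIME_IN_TD.findall(html)
```

Cells such as Monday or From are skipped because their content does not start with two digits, a colon, and two more digits.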
