Power Query change values in a cell on condition - excel

I have a table with 100+ pieces of equipment. A sample is below:
Equipment | Rounded Start of operation | Rounded end of operation
----------------------------------------------------------------
eq1 | 01/09/2020 20:10 | 01/09/2020 20:15
eq1 | 01/09/2020 20:15 | 01/09/2020 20:40
eq2 | 01/09/2020 20:30 | 01/09/2020 20:45
eq2 | 01/09/2020 20:50 | 01/09/2020 21:55
As you can see, for eq1 the start of the second operation is the same as the end of the first operation.
I need to move such a start time forward by 5 minutes.
I have tried adding 2 additional columns:
column_1 = equipment + rounded start of operation
column_2 = equipment + rounded end of operation
Then I wanted to check whether column_2 contains the value of column_1, but I could not find a good way to do it. Table.Contains took around an hour to calculate everything.
Is there a better way to do this in Power Query?
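To make the intended rule concrete, here is a minimal Python sketch of the logic (not Power Query code). It assumes the dates are day-first and that "same as the end of a previous operation" means any end time recorded for the same equipment. The function name shift_touching_starts is made up for illustration. All (equipment, end) pairs go into a set once, so each row needs a single lookup instead of a scan of the whole table, which is roughly what a self-merge on those two keys would give you in Power Query.

```python
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M"  # assumption: day-first dates

rows = [
    ("eq1", "01/09/2020 20:10", "01/09/2020 20:15"),
    ("eq1", "01/09/2020 20:15", "01/09/2020 20:40"),
    ("eq2", "01/09/2020 20:30", "01/09/2020 20:45"),
    ("eq2", "01/09/2020 20:50", "01/09/2020 21:55"),
]

def shift_touching_starts(rows, minutes=5):
    """Move a start forward when it equals an end time of the same equipment."""
    parsed = [
        (eq, datetime.strptime(start, FMT), datetime.strptime(end, FMT))
        for eq, start, end in rows
    ]
    # Build the lookup once: every (equipment, end time) pair in the table.
    ends = {(eq, end) for eq, _, end in parsed}
    step = timedelta(minutes=minutes)
    return [
        (eq, start + step if (eq, start) in ends else start, end)
        for eq, start, end in parsed
    ]

shifted = shift_touching_starts(rows)
for eq, start, end in shifted:
    print(eq, start.strftime(FMT), end.strftime(FMT))
```

Only the second eq1 row changes here: its start moves from 20:15 to 20:20.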

Related

Is there a way to do IntervalMatch in Azure Data Factory?

I'm trying to do an IntervalMatch, as described at the following link, in ADF:
https://help.qlik.com/en-US/qlikview/May2022/Subsystems/Client/Content/QV_QlikView/Scripting/ScriptPrefixes/IntervalMatch.htm
Is there an activity (Join, ...) or another way to achieve this?
I tried to repro this in ADF data flow with sample inputs and below is the approach.
Input tables:
Event_log:
Time | Event | Comment
---------------------------------
0:00 | 0 | Start of shift 1
1:18 | 1 | Line stop
2:23 | 2 | Line restart 50%
4:15 | 3 | Line speed 100%
8:00 | 4 | Start of shift 2
11:43 | 5 | End of production
Order_log:
Start | End | Order
---------------------
1:00 | 03:35 | A
2:30 | 07:58 | B
3:04 | 10:27 | C
7:23 | 11:43 | D
Add a Source transformation for each of the tables above (source1 and source2).
Add a Join transformation. In the Join settings:
Source1 (Event_log) is the left stream and Source2 (Order_log) is the right stream.
The join type is Right Outer.
The join conditions are Start<=Time and End>=Time.
Output of Join Transformation:
Start | End | Order | Time | Event | Comment
-----------------------------------------------------
NULL | NULL | NULL | 0:00 | 0 | Start of shift 1
1:00 | 3:35 | A | 1:18 | 1 | Line stop
1:00 | 3:35 | A | 2:23 | 2 | Line restart 50%
2:30 | 7:58 | B | 4:15 | 3 | Line speed 100%
3:04 | 10:27 | C | 4:15 | 3 | Line speed 100%
3:04 | 10:27 | C | 8:00 | 4 | Start of shift 2
7:23 | 11:43 | D | 8:00 | 4 | Start of shift 2
7:23 | 11:43 | D | 11:43 | 5 | End of production
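
For readers who want to check the matching rule outside ADF, the join condition above (Start<=Time and End>=Time) can be sketched in plain Python. This is only an illustration of the interval logic, not ADF code. The names to_minutes and interval_match are made up, and the sketch keeps events that fall in no order interval because the output table above includes such a row for 0:00.

```python
def to_minutes(hhmm):
    """Convert 'h:mm' or 'hh:mm' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

events = [
    ("0:00", 0, "Start of shift 1"),
    ("1:18", 1, "Line stop"),
    ("2:23", 2, "Line restart 50%"),
    ("4:15", 3, "Line speed 100%"),
    ("8:00", 4, "Start of shift 2"),
    ("11:43", 5, "End of production"),
]

orders = [
    ("1:00", "03:35", "A"),
    ("2:30", "07:58", "B"),
    ("3:04", "10:27", "C"),
    ("7:23", "11:43", "D"),
]

def interval_match(events, orders):
    """Pair each event with every order whose [Start, End] contains its time."""
    result = []
    for time, event, comment in events:
        t = to_minutes(time)
        matches = [
            (start, end, order)
            for start, end, order in orders
            if to_minutes(start) <= t <= to_minutes(end)
        ]
        if not matches:
            # Keep the unmatched event with empty order columns.
            result.append((None, None, None, time, event, comment))
        for start, end, order in matches:
            result.append((start, end, order, time, event, comment))
    return result

matched = interval_match(events, orders)
for row in matched:
    print(row)
```

The sketch produces eight rows, with the 4:15 and 8:00 events each appearing twice because they fall inside two overlapping orders.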

Generate an interval-based time series using Spark SQL

I am new to Spark SQL. I want to generate the following series of start and end times, at 5-second intervals, for the current date. For example, if I run my job on 1st Jan 2018, I want a series of start and end times 5 seconds apart, so there will be 17280 records for 1 day.
START TIME | END TIME
-----------------------------------------
01-01-2018 00:00:00 | 01-01-2018 00:00:04
01-01-2018 00:00:05 | 01-01-2018 00:00:09
01-01-2018 00:00:10 | 01-01-2018 00:00:14
.
.
01-01-2018 23:59:55 | 01-01-2018 23:59:59
01-02-2018 00:00:00 | 01-01-2018 00:00:05
I know I can generate this DataFrame using a Scala for loop. My constraint is that I can use only queries to do this.
Is there any way I can create this data structure using select * constructs?
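
The arithmetic behind the series is simple enough to state in a few lines. The sketch below is plain Python rather than a Spark SQL query, so it does not satisfy the queries-only constraint; it only shows the row count and the start/end offsets a query would have to reproduce. The function name five_second_windows is made up for illustration.

```python
from datetime import datetime, timedelta

def five_second_windows(day, step_seconds=5):
    """Return (start, end) pairs covering one day in fixed-size steps."""
    midnight = datetime(day.year, day.month, day.day)
    count = 24 * 60 * 60 // step_seconds  # 86400 / 5 = 17280 rows
    windows = []
    for i in range(count):
        start = midnight + timedelta(seconds=i * step_seconds)
        # The end is inclusive, so it sits one second before the next start.
        end = start + timedelta(seconds=step_seconds - 1)
        windows.append((start, end))
    return windows

windows = five_second_windows(datetime(2018, 1, 1))
print(len(windows))
print(windows[0])
print(windows[-1])
```

In other words, row i starts at midnight plus 5*i seconds and ends 4 seconds later, for i from 0 to 17279.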

How to value hh:mm in MS Excel correctly so that certain values can be assigned to it

Please could anyone kindly help. Thanking you in advance. My problem is as the following:
I have thousands of lines of data with two clusters of time. One is in sheet1: for example, random times from 16:00 to 20:00 (4 hours, or 240 minutes). I would like to assign them values from 1 to 241 (in column B).
A B
17:19
17:19
17:19
18:06
18:06
18:06
16:30
16:30
16:30
I have a second sheet which will give values to sheet1 column B, the content of sheet2 is:
16:00 1
16:01 2
16:02 3
.
.
.
17:19 80
17:20 81
17:21 82
.
.
.
18:06 127
18:07 128
18:08 129
.
.
.
16:30 31
16:31 32
16:32 33
.
.
.
19:58 239
19:59 240
20:00 241
I tried to use VLOOKUP, HOUR and MINUTE to get the values for sheet1 column B from sheet2, but I was unsuccessful (comparing the two columns containing times kept returning FALSE). For example, in sheet1 cell B2 I have
=IFERROR(VLOOKUP($B2,'sheet2'!$A:$B,2,FALSE),"")
My solution did not work. If possible, sheet1 should be filled in like this:
A B
17:19 80
17:19 80
17:19 80
18:06 127
18:06 127
18:06 127
16:30 31
16:30 31
16:30 31
You can use an excel formula like this:
B2: =(HOUR(A2)*60)+MINUTE(A2)+1
This is just calculating the number of minutes after midnight. If you wanted to start at say, 16:00, you would just modify it like this:
Set a value somewhere that is the START time... In this example I have "$E$4" set to 16:00
=(HOUR(A3-$E$4)*60)+MINUTE(A3-$E$4)+1
You could put your start time on another sheet or anywhere.
Of course you can always add the If statement to deal with empty rows:
=IF(A2="","",(HOUR(A2-$E$4)*60)+MINUTE(A2-$E$4)+1)
In the example, note in the screenshot in column A, the formatting is TIME for row 2, and General for rows 3 & 4. The formula will work for either.
edit:added IF statement & description of screenshot.
EDIT AFTER COMMENT: Modified formula to add 1 minute.
I finally found a correct answer:
=(HOUR(A2)-16)*60+MINUTE(A2)+1
so 16:00 will give 1, 16:30 will give 31, 17:19 will give 80 and so on, 20:00 will give 241.
Thanks for trying to help, PJ Rosenburg. Indeed, I was on the right track in using HOUR and MINUTE, but I do not even need to use VLOOKUP.
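
As a quick cross-check of the numbers quoted above, here is the same calculation as a small Python sketch. The names to_minutes and minute_index are made up for illustration, and the 16:00 start is passed as a parameter rather than hard-coded.

```python
def to_minutes(hhmm):
    """Convert 'hh:mm' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def minute_index(hhmm, start="16:00"):
    """1-based count of minutes from the start time, so the start itself is 1."""
    return to_minutes(hhmm) - to_minutes(start) + 1

for t in ["16:00", "16:30", "17:19", "18:06", "20:00"]:
    print(t, minute_index(t))
```

Subtracting the whole start time, rather than only its hour, keeps the result correct when the start is not on the hour.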

Calculate Average based on multiple condition in Excel

I am back with a new Excel question.
Let's say I have a table like this.
| A | B
------------------------------------------
1 | ENV | Value
------------------------------------------
2 | ABC - 10/1/2014 1:38:32 PM | 4
3 | XYZ - 10/1/2014 1:38:32 PM | 6
4 | ABC - 9/1/2014 1:38:32 PM | 1
5 | XYZ - 10/1/2014 1:38:32 PM | 10
6 | ABC - 10/1/2014 1:38:32 PM | 7
7 | XYZ - 9/1/2014 1:38:32 PM | 1
8 | ABC - 9/1/2014 1:38:32 PM | 10
9 | ABC - 10/1/2014 1:38:32 PM | 7
10 | XYZ - 10/1/2014 1:38:32 PM | 7
Now, in Cell C2, I've selected ABC.
So in cell D2, I want the average (from col B) of all the "ABC" (col A) where Month = 10 (col A) and in cell E2, Max (from col B) of all the "ABC" where Month = 10 (col A).
So, my result in cells D2 and E2 would be 6 and 7 respectively.
I hope my question and example make sense.
UPDATE:
Thank you all for all your help.
Now let's say I am not sure how many rows I'll have in this spreadsheet, so I came up with this formula, but it's not working; it gives me a #DIV/0! error.
*Note: I am using formula to get "ABC" and "10" from cell C2.
=AVERAGEIFS(
(OFFSET($A$1,1,1,COUNTA($B:$B)-1,1)),
OFFSET($A$1,1,0,COUNTA($A:$A)-1,1), (MID(C2,1,(FIND("-",C2))-2)),
OFFSET($A$1,1,0,COUNTA($A:$A)-1,1), (MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))
Even tried this, but same error:
=SUMPRODUCT(((MID(A2:A10,1,(FIND("-",A2:A10))-1))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))*
(MONTH(DATEVALUE(MID(A2:A10,7,99)))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))*
(B2:B10))/SUMPRODUCT(((MID(A2:A10,1,(FIND("-",A2:A10))-1))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))*
(MONTH(DATEVALUE(MID(A2:A10,7,99)))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1)))))
Can you help me with this...?
Solution with Intermediary Values
To solve the issue (I tested the average only) I first used 2 intermediary values: this solution is not optimal and there will be many smarter ways to address the issue (e.g. pivot tables).
ENV | Value | Intermediary 1 | Intermediary 2
------------------------------------------------------------------
ABC - 10/1/2014 1:38:32 PM | 4 | ABC | 10
XYZ - 10/1/2014 1:38:32 PM | 6 | XYZ | 10
ABC - 9/1/2014 1:38:32 PM | 1 | ABC | 9
XYZ - 10/1/2014 1:38:32 PM | 10 | XYZ | 10
The first intermediary column contains the first 3 chars of ENV column (=LEFT(A9,3)), while the second intermediary column contains the month (=MID(A9,7,2)). This works only if your ENV records are fixed size and homogeneous (e.g. your env name has exactly 3 chars).
With this layout, you can compute the average putting in any cell the following formula:
=AVERAGEIFS(D9:D12, F9:F12,"=ABC", G9:G12, "=10")
Where D9:D12 is the values interval, F9:F12 is the 1st intermediary column and G9:G12 the second intermediary column.
One Shot Compact Solution (Arrays)
A more compact solution can be built on array formulas. For instance, to calculate the average and the max of an interval based on 2 "vectorial" conditions, you can write these one-liners:
= MAX(IF((LEFT(A9:A12,3)="ABC")*(MID(A9:A12,7,2)="10"),D9:D12))
= AVERAGE(IF((LEFT(A9:A12,3)="ABC")*(MID(A9:A12,7,2)="10"),D9:D12))
Here A9:A12 holds your original records and D9:D12 is the values interval.
The advantages of this solution are that you don't need any intermediary column and that you can extend this approach to all the other formulas that don't have 'xxxxxIFS' (it's the case for MAX).
NOTE: you have to confirm this formula with CTRL + SHIFT + RETURN or your formula will fail with #VALUE error.
You can start by splitting column A into letters and a date using Data > Text to Columns with the delimiter " - ".
After you have the two new columns (let's say F and G), you can use the function "AVERAGEIF" with a condition that checks whether the value of the cell in "F" is ABC and MONTH(cell in "G") = 10.
As for the max, you can do the same with MAX(IF....) for column E.
SUMPRODUCT will allow you to parse the left-most and date characters from your combined string. A pseudo-MAXIF() can be similarly constructed using MAX() and INDEX().
In D2 use =SUMPRODUCT((LEFT(A2:A10,3)="ABC")*(MONTH(DATEVALUE(MID(A2:A10,7,99)))=10)*(B2:B10))/SUMPRODUCT((LEFT(A2:A10,3)="ABC")*(MONTH(DATEVALUE(MID(A2:A10,7,99)))=10))
In E2 use =MAX(INDEX((LEFT(A2:A10,3)="ABC")*(MONTH(DATEVALUE(MID(A2:A10,7,99)))=10)*(B2:B10),,))
Both SUMPRODUCT and INDEX like to choke on anything remotely resembling an error when parsing text so keep the cell range references to what your actual data is and avoid blanks.
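
To double-check the expected values independently of Excel, the same filter can be sketched in Python. The sketch assumes the ENV strings always have the form "NAME - m/d/yyyy h:mm:ss AM/PM" with the month first, as the question implies. The function name matching_values is made up for illustration.

```python
from datetime import datetime

rows = [
    ("ABC - 10/1/2014 1:38:32 PM", 4),
    ("XYZ - 10/1/2014 1:38:32 PM", 6),
    ("ABC - 9/1/2014 1:38:32 PM", 1),
    ("XYZ - 10/1/2014 1:38:32 PM", 10),
    ("ABC - 10/1/2014 1:38:32 PM", 7),
    ("XYZ - 9/1/2014 1:38:32 PM", 1),
    ("ABC - 9/1/2014 1:38:32 PM", 10),
    ("ABC - 10/1/2014 1:38:32 PM", 7),
    ("XYZ - 10/1/2014 1:38:32 PM", 7),
]

def matching_values(rows, env, month):
    """Collect the values whose label has the given name and month."""
    values = []
    for label, value in rows:
        # Split on the first " - " so the name may be any length.
        name, stamp = label.split(" - ", 1)
        when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
        if name == env and when.month == month:
            values.append(value)
    return values

values = matching_values(rows, "ABC", 10)
average = sum(values) / len(values)
maximum = max(values)
print(values, average, maximum)
```

Parsing the date instead of slicing fixed character positions means one-digit and two-digit months are handled the same way.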
Your results should be 6 in D2 and 7 in E2.

MDB query for Time

I have a table as follows:
Id Name Date Time
1 S 1-Dec-2009 9:00
2 N 1-Dec-2009 10:00
1 S 1-Dec-2009 10:30
1 S 1-Dec-2009 11:00
2 N 1-Dec-2009 11:10
I need a query that displays it as:
Id Name Date Time
1 S 1-Dec-2009 9:00
1 S 1-Dec-2009 11:00
2 N 1-Dec-2009 10:00
2 N 1-Dec-2009 11:10
My backend database is MS Access and I am using VB6. I need the Max and Min time for each Id.
I would make an additional two [int] columns, say hour and minute and then use an MS Access query to sort them. It would be MUCH easier to call that in VB. The query itself would be something like the following:
SELECT * FROM YOURTABLE ORDER BY id, hour, minute;
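
The query above only sorts the rows. To get the output the question asks for (the earliest and latest time per Id), the rows also need to be grouped. Here is a minimal Python sketch of that grouping, not Access SQL or VB6; the name first_and_last is made up for illustration. Times are converted to minutes before comparing, because as plain text "9:00" would sort after "10:00".

```python
rows = [
    (1, "S", "1-Dec-2009", "9:00"),
    (2, "N", "1-Dec-2009", "10:00"),
    (1, "S", "1-Dec-2009", "10:30"),
    (1, "S", "1-Dec-2009", "11:00"),
    (2, "N", "1-Dec-2009", "11:10"),
]

def to_minutes(hhmm):
    """Convert 'h:mm' or 'hh:mm' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def first_and_last(rows):
    """Keep only the earliest and latest row for each (Id, Name, Date)."""
    groups = {}
    for row in rows:
        groups.setdefault(row[:3], []).append(row)
    result = []
    for key in sorted(groups):
        ordered = sorted(groups[key], key=lambda r: to_minutes(r[3]))
        result.append(ordered[0])
        if len(ordered) > 1:
            result.append(ordered[-1])
    return result

summary = first_and_last(rows)
for row in summary:
    print(*row)
```

In a query this corresponds to grouping by Id, Name and Date and taking the minimum and maximum of a properly typed time column.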
