Excel: convert float number to time - excel

In Excel I need to convert float numbers to time.
For example:
8,3 must become 08:30
10 must become 10:00
11,3 must become 11:30
Any ideas?
Thank you

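Simple: turn the fractional part into a fraction of an hour, divide the whole thing by 24, and format the cell as a time: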
=(INT(A1)+(A1-INT(A1))/0.6)/24
Input | Output
----- | --------
8.3 | 08:30:00
10 | 10:00:00
11.3 | 11:30:00
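For anyone who wants to check the arithmetic outside Excel, here is a minimal Python sketch of the same conversion. It assumes, as the formula does, that the digits after the decimal point are minutes (8.3 means 8 h 30 min); the function name is made up for illustration.

```python
def decimal_hhmm_to_time(value):
    """Interpret 8.3 as 8 h 30 min and return an 'HH:MM' string."""
    hours = int(value)
    # The fractional part carries the minutes as hundredths (0.3 -> 30);
    # round() absorbs floating-point noise such as 30.000000000000071.
    minutes = round((value - hours) * 100)
    return f"{hours:02d}:{minutes:02d}"


for sample in (8.3, 10, 11.3):
    print(sample, "->", decimal_hhmm_to_time(sample))
```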


PySpark: (broadcast) joining two datasets on closest datetimes/unix

I am using PySpark and am close to giving up on my problem.
I have two data sets: one very large (set A) and one rather small (set B).
They are of the form:
Data set A:
variable | timestampA
---------------------------------
x | 2015-01-01 09:29:21
y | 2015-01-01 12:01:57
Data set B:
different information | timestampB
-------------------------------------------
info a | 2015-01-01 09:30:00
info b | 2015-01-01 09:30:00
info a | 2015-01-01 12:00:00
info b | 2015-01-01 12:00:00
A has many rows, each with a different time stamp. B has a time stamp every couple of minutes. The main problem is that no time stamps match exactly between the two data sets.
My goal is to join the data sets on the nearest time stamp. An additional problem arises since I want to join in a specific way.
For each entry in A, I want to map the entire information for the closest timestamp while duplicating the entry in A. So, the result should look like:
Final data set
variable | timestampA | information | timestampB
--------------------------------------------------------------------------
x | 2015-01-01 09:29:21 | info a | 2015-01-01 09:30:00
x | 2015-01-01 09:29:21 | info b | 2015-01-01 09:30:00
y | 2015-01-01 12:01:57 | info a | 2015-01-01 12:00:00
y | 2015-01-01 12:01:57 | info b | 2015-01-01 12:00:00
I am very new to PySpark (and also stackoverflow). I figured that I probably need to use a window function and/or a broadcast join, but I really have no point to start and would appreciate any help. Thank you!
You can use broadcast to avoid shuffling.
If I understand correctly, the timestamps in set_B are consecutive, with a fixed interval between them? If so, you can do the following:
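from pyspark.sql import functions as F
# assuming 5 minutes is your interval in set_B; match within half an interval on either side
interval = 'INTERVAL {} SECONDS'.format(5 * 60 // 2)  # integer division gives 150, not 150.0
res = set_A.join(
    F.broadcast(set_B),  # send the small set_B to every executor instead of shuffling set_A
    (set_A['timestampA'] > (set_B['timestampB'] - F.expr(interval)))
    & (set_A['timestampA'] <= (set_B['timestampB'] + F.expr(interval))))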
Output:
+--------+-------------------+------+-------------------+
|variable| timestampA| info| timestampB|
+--------+-------------------+------+-------------------+
| x|2015-01-01 09:29:21|info a|2015-01-01 09:30:00|
| x|2015-01-01 09:29:21|info b|2015-01-01 09:30:00|
| y|2015-01-01 12:01:57|info a|2015-01-01 12:00:00|
| y|2015-01-01 12:01:57|info b|2015-01-01 12:00:00|
+--------+-------------------+------+-------------------+
If you don't have a fixed interval, then the only option is a cross join followed by keeping the rows with the smallest abs(timestampA - timestampB). You can do that with a window function and row_number, as follows:
from pyspark.sql import Window
res = set_A.crossJoin(F.broadcast(set_B))  # pair every row of A with every row of B
w = Window.partitionBy('variable', 'info').orderBy(F.abs(F.col('timestampA').cast('int') - F.col('timestampB').cast('int')))
res = res.withColumn('rn', F.row_number().over(w)).filter('rn = 1').drop('rn')  # closest B row per (variable, info)
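To make the matching logic easier to follow without a Spark cluster, here is a rough plain-Python sketch of the same idea: for each row of A, find the smallest absolute time difference to any row of B and keep every B row at that distance. The function name and the in-memory lists are only for illustration, and this is not how Spark executes the join.

```python
from datetime import datetime


def nearest_join(rows_a, rows_b):
    """Attach to each (variable, timestamp) row of A every B row with the closest timestamp."""
    result = []
    for variable, ts_a in rows_a:
        # Smallest absolute distance (in seconds) from this A row to any B row.
        best = min(abs((ts_a - ts_b).total_seconds()) for _, ts_b in rows_b)
        for info, ts_b in rows_b:
            if abs((ts_a - ts_b).total_seconds()) == best:
                result.append((variable, ts_a, info, ts_b))
    return result


fmt = "%Y-%m-%d %H:%M:%S"
set_a = [("x", datetime.strptime("2015-01-01 09:29:21", fmt)),
         ("y", datetime.strptime("2015-01-01 12:01:57", fmt))]
set_b = [("info a", datetime.strptime("2015-01-01 09:30:00", fmt)),
         ("info b", datetime.strptime("2015-01-01 09:30:00", fmt)),
         ("info a", datetime.strptime("2015-01-01 12:00:00", fmt)),
         ("info b", datetime.strptime("2015-01-01 12:00:00", fmt))]
joined = nearest_join(set_a, set_b)
```

Note that in this sketch a row of A that is exactly halfway between two B timestamps is matched to both.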

Computing First Day of Previous Quarter in Spark SQL

How do I derive the first day of the previous quarter for any given date in a Spark SQL query, using the SQL API? A few sample inputs and the required outputs are below:
input_date | start_date
------------------------
2020-01-21 | 2019-10-01
2020-02-06 | 2019-10-01
2020-04-15 | 2020-01-01
2020-07-10 | 2020-04-01
2020-10-20 | 2020-07-01
2021-02-04 | 2020-10-01
The Quarters generally are:
1 | Jan - Mar
2 | Apr - Jun
3 | Jul - Sep
4 | Oct - Dec
Note: I am using Spark SQL v2.4.
Any help is appreciated. Thanks.
Use date_trunc to truncate to the quarter after subtracting 3 months from the date:
df.withColumn("start_date", to_date(date_trunc("quarter", expr("input_date - interval 3 months"))))
.show()
+----------+----------+
|input_date|start_date|
+----------+----------+
|2020-01-21|2019-10-01|
|2020-02-06|2019-10-01|
|2020-04-15|2020-01-01|
|2020-07-10|2020-04-01|
|2020-10-20|2020-07-01|
|2021-02-04|2020-10-01|
+----------+----------+
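The logic can be checked by hand with a small plain-Python sketch (no Spark involved; the function name is made up for illustration). It finds the first month of the current quarter and steps back three months, which has the same effect as subtracting 3 months and then truncating to the quarter.

```python
from datetime import date


def first_day_of_previous_quarter(d):
    """Return the first day of the quarter before the one containing d."""
    # First month of the current quarter: 1, 4, 7 or 10.
    quarter_start_month = 3 * ((d.month - 1) // 3) + 1
    if quarter_start_month == 1:
        # The quarter before Q1 is Q4 of the previous year.
        return date(d.year - 1, 10, 1)
    return date(d.year, quarter_start_month - 3, 1)


for sample in (date(2020, 1, 21), date(2020, 4, 15), date(2021, 2, 4)):
    print(sample, "->", first_day_of_previous_quarter(sample))
```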
Personally, I would create a table of dates covering the next twenty years (using Excel or something similar) and just reference that table.

Excel: Difference in hours between duplicates

I am having a problem and hope you can help.
I need the difference in hours between duplicates. Example:
Date Time | SESSION_ID | Column I need
24/01/2020 10:00 | 100 | NaN
24/01/2020 11:00 | 100 | 1
14/03/2020 12:00 | 290 | NaN
16/03/2020 13:00 | 254 | NaN
16/03/2020 14:00 | 100 | 1251
In the SESSION_ID column, there are 3 duplicates with value 100.
I need to know the difference in hours between those sessions, which would be 1 hour between the first and the second, and 1251 hours between the second and the third.
Does anyone have any clue how this could be done?
If one has the Dynamic Array formula XLOOKUP, put this in C2 and copy down:
=IF(COUNTIF($B$1:B1,B2),A2-XLOOKUP(B2,$B$1:B1,$A$1:A1,,0,-1),"NaN")
Then format the column with the custom number format [h] so the difference is shown as elapsed hours.
If not then use INDEX/AGGREGATE in its place:
=IF(COUNTIF($B$1:B1,B2),A2-INDEX(A:A,AGGREGATE(14,7,ROW($B$1:B1)/($B$1:B1=B2),1)),"NaN")
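If it helps to see what both formulas are computing, here is a minimal Python sketch of the same idea: remember the last timestamp seen for each session ID and subtract it from the current one. The function name is hypothetical, and None stands in for the "NaN" text used in the sheet.

```python
from datetime import datetime


def hours_since_previous(rows):
    """For each (timestamp, session_id) row, hours since that session_id last appeared."""
    last_seen = {}
    result = []
    for ts, session_id in rows:
        previous = last_seen.get(session_id)
        # No earlier row with this ID means there is nothing to subtract from.
        result.append(None if previous is None
                      else (ts - previous).total_seconds() / 3600)
        last_seen[session_id] = ts
    return result


fmt = "%d/%m/%Y %H:%M"
rows = [(datetime.strptime("24/01/2020 10:00", fmt), 100),
        (datetime.strptime("24/01/2020 11:00", fmt), 100),
        (datetime.strptime("14/03/2020 12:00", fmt), 290),
        (datetime.strptime("16/03/2020 13:00", fmt), 254),
        (datetime.strptime("16/03/2020 14:00", fmt), 100)]
diffs = hours_since_previous(rows)
```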

Is there a way to convert a date as 2008.5 (the middle of the year 2008) in a proper date with excel?

I have a series of values (concentrations of ions in the atmosphere), each plotted as a function of its date, that looks a bit like this:
+---------+----------+---------+
| Year | Sulphate | Nitrate |
+---------+----------+---------+
| 2008.0 | 22.8 | 12.5 |
| 2007.75 | 13.5 | 13.4 |
| 2007.5 | 10.2 | 12.7 |
| 2007.25 | 19.4 | 10.3 |
| 2007.0 | 25.4 | 12.4 |
+---------+----------+---------+
Is there a way to convert the year into a proper date? For example, 2008 should become 01/01/2008 (the first of January 2008), and so on.
If you want to display a proper date (middle of the year, a quarter of the year, etc.) then you can try the following formula:
=DATE(LEFT(A1,4),1,1)+(DATE(LEFT(A1,4)+1,1,1)-DATE(LEFT(A1,4),1,1))*MOD(A1,1)
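The same calculation can be sketched in Python to see what the formula does: take 1 January of the year and add the fractional part multiplied by the number of days in that year. The function name is made up for illustration, and int() plays the role of LEFT(A1,4).

```python
from datetime import datetime, timedelta


def fractional_year_to_datetime(value):
    """Convert a value like 2007.5 to the matching point in that year."""
    year = int(value)
    fraction = value - year
    start = datetime(year, 1, 1)
    # 365 or 366, depending on whether the year is a leap year.
    days_in_year = (datetime(year + 1, 1, 1) - start).days
    return start + timedelta(days=days_in_year * fraction)


for sample in (2008.0, 2007.75, 2007.5, 2007.25, 2007.0):
    print(sample, "->", fractional_year_to_datetime(sample))
```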
Another method, that might come a bit closer:
=DATE(INT(A2),MOD(A2,1)*12+1,MOD(MOD(A2,1)*12,1)*30+1)
The formula assumes all months have 30 days.
Understand that, because months have different numbers of days, a formula like this may not always give an "exact" result.
is there a way to convert the year in a proper date? Like, 2008 should become 01/01/2008
Yes, with the DATE function, together with the LEFT function to get the first 4 characters.
My formula in B2 is (my locale uses ; as the argument separator, so replace it with , if yours does not):
=DATE(LEFT(A2;4);1;1)
Hope this helps.

Excel spreadsheet for hours worked tracking

I want to track my hours for work on my personal Excel spreadsheet.
My company records time in 6 minute intervals. So 8 hrs and 6 min worked is represented as 8.1 in the time card systems and so forth.
I have in cell A1 the header and A2 is the data.
+------------+-------------+-----------+---------+-------------+----------+
| Start Time | Start Lunch | End Lunch | End day | Total Hours | TC Hours |
+------------+-------------+-----------+---------+-------------+----------+
| 07:00 | 11:00 | 12:00 | 16:03 | 8:03 | 8 |
| 07:00 | 11:00 | 12:00 | 16:06 | 8:06 | 8.1 |
+------------+-------------+-----------+---------+-------------+----------+
I would like to achieve two results. The first is that anything from 0-3 minutes is rounded down and anything from 4-6 minutes is rounded up. The second is to output the Time Card (TC) format, where 6 minutes = 0.1 and so forth. I have formulas that I used previously for both, but I need to adjust them.
This is for the rounding:
=IF(ISBLANK($E10)," ", TIME(HOUR(F10), ROUND((MINUTE(F10)/60)*4, 0) * 15, 0))
This is for the TC:
=IF(ISBLANK(E10)," ", ROUND((G10*24)/0.25,0)*0.25)
Get away from using math tricks to achieve your rounding and use MROUND and TIME instead. It is every bit as accurate (if not more so) in avoiding floating point problems with time and generally makes more sense to the user.
To round any time value to the nearest 6 minutes use,
=MROUND(E2, TIME(0, 6, 0))
To convert the total hours to decimal hours (whole hours plus the minutes as a fraction of an hour) in F2,
=HOUR(MROUND(E2, TIME(0, 6, 0)))+MINUTE(MROUND(E2, TIME(0, 6, 0)))/60
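As a cross-check, here is a minimal Python sketch of the rule exactly as the question states it: a remainder of 0-3 minutes rounds down and 4-5 minutes rounds up to the next 6-minute block. As far as I know, MROUND rounds a value that sits exactly halfway between two multiples away from zero, so a total of exactly 8:03 may come out as 8:06 rather than 8:00; the sketch applies the 0-3 rule explicitly. The function name is hypothetical.

```python
def tc_hours(hours, minutes):
    """Return time-card hours in tenths, where every 6 minutes counts as 0.1."""
    total_minutes = hours * 60 + minutes
    blocks, remainder = divmod(total_minutes, 6)
    # Remainders of 0-3 minutes round down, 4-5 minutes round up.
    if remainder >= 4:
        blocks += 1
    return blocks / 10


for h, m in ((8, 3), (8, 4), (8, 6)):
    print(f"{h}:{m:02d} ->", tc_hours(h, m))
```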
