I've been trying to convert a Unix timestamp (13 digits, the one in the first column of the picture) to a readable date:
from pyspark.sql import functions as F
#TRY TO CHANGE THE DATA FORMAT
sd = df.withColumn('Date2', F.from_unixtime(F.col("Date") / 1000 ** 3, "yyyy-MM-dd"))
display(sd)
This is the result:
The first column is the Unix timestamp to convert, and the result is in the last column. I don't know why it comes out with the wrong date.
F.from_unixtime(F.col("Date") / 1000, "yyyy-MM-dd")
You don't need the ** 3 part: it raises 1000 to the power of 3, so you end up dividing by 1,000,000,000 instead of 1,000. This is why "Date2" in your case is almost at the beginning of the Unix epoch.
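The effect of the exponent can be checked outside Spark with plain Python. This is only a minimal sketch: the sample value 1650860217823 is a made-up 13-digit timestamp, not one taken from the question's data, and UTC is assumed for display.

```python
from datetime import datetime, timezone

# Hypothetical 13-digit Unix timestamp in milliseconds (not from the question's data).
ts_ms = 1650860217823

# Correct: milliseconds -> seconds.
good = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)

# The original expression: ** binds tighter than /, so this divides by 1000 ** 3.
bad = datetime.fromtimestamp(ts_ms / 1000 ** 3, tz=timezone.utc)

print(good.strftime("%Y-%m-%d"))  # 2022-04-25
print(bad.strftime("%Y-%m-%d"))   # 1970-01-01
```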
The Excel column contains timestamps in the Zulu (UTC) time zone. How can I calculate the difference between two of them in seconds.milliseconds?
From Time 2022-04-25T04:16:57.823121842Z
To Time
2022-04-25T04:16:58.173194593Z
2022-04-25T04:16:58.089133751Z
2022-04-25T04:16:58.462278784Z
2022-04-25T04:16:57.829376293Z
2022-04-25T04:16:57.961790312Z
2022-04-25T04:16:58.445884586Z
2022-04-25T04:16:57.830806273Z
2022-04-25T04:16:58.067723338Z
2022-04-25T04:16:58.470913276Z
2022-04-25T04:16:57.838787068Z
When I try to do something like =B13-B14
Error
Function MINUS parameter 1 expects number values. But '2022-04-25T04:35:59.092943356Z' is a text and cannot be coerced to a number.
Converted to Number format
REVISED: I forgot to convert the milliseconds
You can convert the date strings into time values by breaking them into parts:
=DATEVALUE(LEFT(A2,10)) + TIMEVALUE( MID(A2,12,8) ) --MID(A2,20,10)/24/60/60
Where A2 is the date string.
This assumes that the strings have exactly the structure you have shown and are fully padded with zeros. If that is not the case, for example if the fractional seconds could be .095Z, then you can modify this to:
=DATEVALUE(LEFT(A2,10)) + TIMEVALUE( MID(A2,12,8) ) --MID(SUBSTITUTE(A2,"Z",""),20,999)/24/60/60
to be safe.
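If the values can be processed outside Excel, the same split-into-parts idea can be sketched in Python. Because an Excel date-time serial is a double with roughly 15 significant digits, it cannot hold all nine fractional digits shown above, so this sketch keeps integer nanoseconds instead. The helper name parse_zulu_ns is hypothetical, and the sketch assumes every string ends in Z and has at most nine fractional digits.

```python
from datetime import datetime, timezone

def parse_zulu_ns(text):
    """Return integer nanoseconds since the Unix epoch (hypothetical helper)."""
    body = text.strip().rstrip("Z")
    whole, _, frac = body.partition(".")
    # Whole seconds: parse the date and time parts, treating them as UTC.
    parsed = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    seconds = int(parsed.timestamp())
    # Fractional part: pad on the right to nine digits so ".095" becomes 95,000,000 ns.
    nanos = int(frac.ljust(9, "0")[:9]) if frac else 0
    return seconds * 1_000_000_000 + nanos

start = parse_zulu_ns("2022-04-25T04:16:57.823121842Z")
end = parse_zulu_ns("2022-04-25T04:16:58.173194593Z")

diff_ns = end - start
print(diff_ns)        # 350072751
print(diff_ns / 1e9)  # difference in seconds
```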
I'm currently working on a dataset that has a duration field in time format. I'm trying to convert this into just minutes but haven't been successful in doing so.
Is there a formula to convert these formats into minutes or seconds?
The format is HH:MM:SS some examples of data displayed and required output below.
Example
00:01:00 = 1
00:01:30 = 1.5
00:02:00 = 2
It depends on whether Excel sees these as DateTime or Text. To know this you can put =ISNUMBER( cell address ). If true, then it is DateTime. You can do =ISTEXT( cell address ) to see if it is text.
If it is DateTime, you can use this to convert it to minutes:
= A1 * 24 * 60
where A1 is the cell with the 00:xx:xx value.
If text, then you need to do:
=TIMEVALUE(A1)*24*60
And even if it is already datetime, you can use =TIMEVALUE( cell address )*24*60 - it will figure out if it is text or already datetime.
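For data handled outside Excel, the same conversion can be sketched in Python. This is an illustration only: the function name hms_to_minutes is hypothetical, and the input is assumed to be a well-formed HH:MM:SS string.

```python
def hms_to_minutes(text):
    """Convert an HH:MM:SS string to minutes (hypothetical helper)."""
    hours, minutes, seconds = (float(part) for part in text.split(":"))
    return hours * 60 + minutes + seconds / 60

for sample in ("00:01:00", "00:01:30", "00:02:00"):
    print(sample, "=", hms_to_minutes(sample))
# 00:01:00 = 1.0
# 00:01:30 = 1.5
# 00:02:00 = 2.0
```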
I want to convert a column with Excel date numbers (float) to datetime. Below is the function I am trying to use:
import datetime
datetime.datetime.fromtimestamp(42029.0).strftime('%Y-%m-%d')
But the result I got is '1970-01-01', which I don't think is right. I must be missing a constant that should be added to the variable, because in Excel the number 42029.0 represents the date 1/25/2015.
Can anyone please advise how to improve the code?
datetime.fromtimestamp() follows the Unix convention and expects seconds since '1970-01-01'. The Excel date number for '1970-01-01' is 25569, so you have to subtract this number from your given date number and multiply the result, which is in days, by 86400 (seconds per day):
datetime.datetime.fromtimestamp(
(42029.0 - 25569) * 86400).strftime('%Y-%m-%d')
'2015-01-25'
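An alternative sketch that avoids fromtimestamp(), which interprets its argument in the local time zone, is to add the serial number as a timedelta to a fixed base date. The names EXCEL_EPOCH and excel_serial_to_datetime are hypothetical, and the sketch assumes the default 1900 date system and serial numbers from 1 March 1900 onward (earlier values are shifted by Excel's fictitious 29 February 1900).

```python
from datetime import datetime, timedelta

# Base date chosen so that serial 25569 lands on 1970-01-01 (1900 date system assumed).
EXCEL_EPOCH = datetime(1899, 12, 30)

def excel_serial_to_datetime(serial):
    """Convert an Excel serial date number to a naive datetime (hypothetical helper)."""
    return EXCEL_EPOCH + timedelta(days=serial)

print(excel_serial_to_datetime(42029.0).strftime("%Y-%m-%d"))  # 2015-01-25
```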
I am trying to run the following code :
import openpyxl
wb = openpyxl.load_workbook("/Users/kakkar/Downloads/TRADEBOOK ALL-EQ 01-01-2016 TO 04-02-2017.xlsx")
sheet = wb.get_sheet_by_name('TRADEBOOK')
print(sheet.cell(row=15, column=3).value)
print(sheet.cell(row=15, column=3).internal_value)
In the input Excel file, column 3 contains the time value 10:47:49. This is being treated in openpyxl as the floating-point value 0.449872685185185.
How can I get the time value in the form HH:MM:SS?
Excel stores time (and date) values as floating-point numbers. The whole-number part is the number of days since Excel's own epoch (in the default 1900 date system, serial number 1 is 1 January 1900, not the Unix epoch), while the fractional part is the portion of the day.
Here is an answer that can help you get the hours, minutes, and seconds out of the value, but this answer might be better for getting the answer without writing code (it uses the xlrd library).
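As a rough sketch of that conversion in plain Python, the fraction of a day can be turned into whole seconds and then split with divmod. The function name day_fraction_to_hms is hypothetical, and rounding to the nearest second is an assumption made to absorb floating-point noise.

```python
def day_fraction_to_hms(value):
    """Format the time-of-day part of an Excel number as HH:MM:SS (hypothetical helper)."""
    fraction = value % 1                      # drop any whole-day (date) part
    total_seconds = round(fraction * 86400)   # 86400 seconds per day
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

print(day_fraction_to_hms(0.449872685185185))  # 10:47:49
```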
The .internal_value of 10:47:49 (HH:MM:SS) is 0.449872685185185, so that is OK.
Please give us the output of the following:
cell = sheet['C15']
print('cell=\t%s\n\ttype=%s\n\t.number_format=%s;\n\t.value=%s;\n\t.internal_value=%s;' % (cell,str(type(cell.value)),cell.number_format,cell.value,cell.internal_value) )
Tested with Python:3.4.2 - openpyxl:2.4.1 - LibreOffice: 4.3.3.2
I have time data from the unix time command like
203m53.5s
I have this data in Excel. I want it to be converted to Excel time format so I can do mathematical operations like sums and averages over them.
How can I do this?
Replace the m with : and the s with "":
=--SUBSTITUTE(SUBSTITUTE(A1,"m",":"),"s","")
Now that the time is in a format that Excel will recognize, we need to change it from a text string to a number. The -- forces the string into a number: each unary minus negates the value, so the net effect is multiplying it by -1 * -1, which leaves the value unchanged but coerces the text to a number.
It can be replaced by TIMEVALUE()
Then format the cell with a custom format of:
[mm]:ss.0
One way is to use a formula to strip out the m and s and use those values for time in a new column in Excel.
Assume the Unix data is in column A.
=(LEFT($A1,FIND("m",$A1)-1)*60+MID($A1,FIND("m",$A1)+1,LEN($A1)-FIND("m",$A1)-1))/86400
then format the cell as custom and choose the time without the AM/PM
Breakdown:
(get the minutes by taking everything to the left of "m")
multiply by 60 to convert to seconds
+ (get the seconds by starting one character after "m" and taking the rest of the string)
-1 to account for the trailing "s"
Then divide the whole thing by 86400 (the number of seconds in a day) to convert to time as a decimal
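The same breakdown can be sketched in Python for checking values before they go into Excel. The function name unix_time_to_seconds is hypothetical, and the pattern assumes the input always has the form <minutes>m<seconds>s with no hours component.

```python
import re

def unix_time_to_seconds(text):
    """Parse a string like '203m53.5s' into seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unexpected time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

seconds = unix_time_to_seconds("203m53.5s")
print(seconds)          # 12233.5
print(seconds / 86400)  # fraction of a day, the value Excel stores for a time
```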