Python pandas: unify datetime column format from CSV input - python-3.x

I have a mixed format column in a dataframe from a pd.read_csv(). There is a lot of information out there about datetime handling but I didn't find anything for this specific problem:
There are 2 datetime types:
Custom dd/mm/yyyy hh:mm that shows up in excel as such: 10/03/2018 07:18
General that shows up in Excel as such: 8/13/2018 2:28:34 PM
I used :
df.Last_Updated = pd.to_datetime(df['Last_Updated'])
df = df.sort_values('Last_Updated').drop_duplicates(['Name'], keep='last')
But I get a mixed bunch where the custom format returns as another datetime type:
yyyy-mm-dd hh:mm:ss and shows up in my Excel export as 2017-11-22 19:54:35
Upon checking, it changes the dd/mm/yyyy hh:mm:ss format (02/09/2018 17:55:44) to yyyy-mm-dd hh:mm:ss (2018-02-09 17:55:44), swapping day and month. Since I have to perform an exclusion of the type 'older than', this causes errors; in this particular case, a computer that had its last connection in September is reported as having it in February.
Does anyone know a way to unify the datetime format?
Date format:
from notepad:
X = "10/2/2018 10:07:31 PM"
Y = "8/13/2018 2:28:34 PM"
from CSV (and by opening the .txt via Excel):
X = 10/02/2018 22:07 PM
Y = 8/13/2018 2:28:34 PM
after datetime applied in code:
X = 02/10/2018 22:07:31
Y = 13/08/2018 14:28:34
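One possible way to unify the column, sketched under assumptions. pd.to_datetime defaults to dayfirst=False, so an ambiguous string such as 02/09/2018 is read month-first, which matches the September/February swap described above. The sketch below assumes the two formats can be told apart by their shape (24-hour dd/mm/yyyy with or without seconds, versus m/d/yyyy with seconds and AM/PM) and tries each explicit format in turn. The helper name parse_mixed and the format list are illustrative, not part of pandas.

```python
from datetime import datetime

# Assumed formats; adjust the list to match the real data.
KNOWN_FORMATS = (
    "%m/%d/%Y %I:%M:%S %p",  # e.g. 8/13/2018 2:28:34 PM (month first, 12-hour clock)
    "%d/%m/%Y %H:%M:%S",     # e.g. 02/09/2018 17:55:44 (day first, 24-hour clock)
    "%d/%m/%Y %H:%M",        # e.g. 10/03/2018 07:18 (day first, no seconds)
)

def parse_mixed(text):
    """Return a datetime for the first matching format, or None if none match."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None

samples = ["10/03/2018 07:18", "8/13/2018 2:28:34 PM", "02/09/2018 17:55:44"]
for s in samples:
    print(s, "->", parse_mixed(s))
```

Assuming the column holds only strings, this could be applied with df['Last_Updated'] = df['Last_Updated'].map(parse_mixed) before sorting. Strings that match none of the formats come back as None and can then be inspected by hand.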

Related

read xlsb file as pandas dataframe and parse the date column as datetime format

I have a 'some.xlsb' file with some 10 columns, out of which 2 are DateTime column.
When I load using pandas the date-time column is parsed in a different form.
Explanations:
where DateTime value corresponding to 4/10/2021 11:50:24 AM - read as 44296.5
Below is the code I tried.
goods_df = pd.read_excel('some.xlsb',
engine='pyxlsb', sheet_name='goods_df')
goods_df_header = goods_df.iloc[1]
goods_df.columns = goods_df_header #set the header row as the df header
goods_df= goods_df[2:]
goods_df.head(2)
When you read an xlsb file with pandas you get Excel's serial date as a float, because the xlsb format stores a datetime as a floating-point day count.
According to Microsoft, 44296.5 means 44296.5 days in the 1900 date system (for dates after February 1900, day 0 is effectively Dec 30th 1899).
You need to convert this into an epoch value and then into a date using the formula below (epoch value = number of seconds passed since Jan 1st 1970 00:00:00; Excel serial 25569 corresponds to that date).
a = (datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=(<datevalue from excel> - 25569) * 86400)).strftime("%m/%d/%Y")
Or you can save the xlsb as xlsx and read that instead; you will then get proper datetime objects.
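As a minimal sketch of the same conversion, the serial can also be added directly to Excel's day 0 as a timedelta, which keeps the time of day. The function name excel_serial_to_datetime is an illustrative choice, and the sketch assumes the workbook uses the 1900 date system and that all dates fall after February 1900.

```python
from datetime import datetime, timedelta

# Day 0 of the Excel 1900 date system for serials after February 1900.
EXCEL_EPOCH = datetime(1899, 12, 30)

def excel_serial_to_datetime(serial):
    """Convert an Excel serial day count (possibly fractional) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=float(serial))

print(excel_serial_to_datetime(44296.5))  # 2021-04-10 12:00:00
```

For a whole DataFrame column, pd.to_datetime(goods_df[col], unit="D", origin="1899-12-30") should give the same result without a Python-level loop, where col stands for the name of the date column.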

Pandas set date as day(int)-month(str)-year(int)

I am trying to change the formatting of a date column
original: 2020/05/22
Desired outcome: 22/may/2020
so far I've done:
.to_datetime
dt.strftime('%d/%m/%Y')
converting into: 22/05/2020
how can I get the middle part to convert into alphabetical?
Try this (all the format codes are listed in the Python strftime documentation):
df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%d/%b/%Y')
print(df)
Date
0 22/May/2020
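If the lowercase month from the question (22/may/2020) is wanted, the formatted string can simply be lowercased afterwards. A small sketch with the standard library follows; the function name is illustrative, and it assumes the default C/English locale, since %b is locale-dependent.

```python
from datetime import datetime

def format_day_month_year(text):
    """Turn 'YYYY/MM/DD' into 'dd/mon/yyyy' with a lowercase month abbreviation."""
    parsed = datetime.strptime(text, "%Y/%m/%d")
    return parsed.strftime("%d/%b/%Y").lower()

print(format_day_month_year("2020/05/22"))  # 22/may/2020
```

In pandas the same idea would be to append .str.lower() to the .dt.strftime('%d/%b/%Y') call.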

Convert a DateTime in C# from US to UK format

string DateAndTime = Cells[1].Text; // Output is 3/18/2020 3:00:18 PM
DateTime DT = DateTime.ParseExact(DateAndTime, "dd/MM/yyyy HH:mm:ss", CultureInfo.CurrentCulture);
Error: string was not recognized as a valid datetime
Current string is this 3/18/2020 3:00:18 PM
I want to convert and parse it to DateTime as 18/03/2020 15:00:18
ParseExact does exactly that, parses the string using the exact specification you provide. And, per your specification, "18" isn't a valid month. It sounds like you want to swap the month and day identifiers (and only use M instead of MM for the month, and use h for the single-digit 12-hour clock, and add tt for the AM/PM specification):
DateTime.ParseExact(DateAndTime, "M/dd/yyyy h:mm:ss tt", CultureInfo.InvariantCulture)
Once it's parsed as a DateTime you can output the value in any format you like. For example:
DT.ToString("dd/MM/yyyy HH:mm:ss")
But your input format very much is not "dd/MM/yyyy HH:mm:ss". For parsing you need to match the input format, not the intended downstream format.
DateTime DT = DateTime.Parse(DateAndTime, new CultureInfo("en-US"));

How to convert the Norwegian date format to a standard date format using Python

I have the Norwegian date format of the date, but don’t know how to convert it to the standard date format using Python.
I want to convert the date below to a standard date format.
tir. 18. jun. 2019
I have tried using the locale date format, but that did not work.
This is the expected output:
18-06-2019
Use dateparser
Ex:
import dateparser
d = "tir. 18. jun. 2019"
print(dateparser.parse(d).strftime("%d-%m-%Y"))
Output:
18-06-2019
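If installing dateparser is not an option, a standard-library sketch is possible under some assumptions: the input always has the shape "weekday. day. month. year", and the month is one of the usual Norwegian abbreviations. The table NORWEGIAN_MONTHS and the function parse_norwegian_date are hypothetical helpers written for this example.

```python
from datetime import date

# Norwegian month abbreviations mapped to month numbers.
NORWEGIAN_MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "mai": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "okt": 10, "nov": 11, "des": 12,
}

def parse_norwegian_date(text):
    """Parse strings shaped like 'tir. 18. jun. 2019' into a date."""
    # Split on whitespace and drop the dots: ["tir", "18", "jun", "2019"].
    parts = [part.strip(".").lower() for part in text.split()]
    _weekday, day, month, year = parts
    # Use the first three letters so longer spellings such as "juni" also match.
    return date(int(year), NORWEGIAN_MONTHS[month[:3]], int(day))

print(parse_norwegian_date("tir. 18. jun. 2019").strftime("%d-%m-%Y"))  # 18-06-2019
```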

Convert dates from Excel to Matlab

I have a series of dates and some corresponding values. The format of the data in Excel is "Custom" dd/mm/yyyy hh:mm.
When I try to convert this column into an array in Matlab, in order to use it as the x axis of a plot, I use:
a = datestr(xlsread('filename.xlsx',1,'A:A'), 'dd/mm/yyyy HH:MM');
But I get: Empty string: 0-by-16.
Therefore I am not able to convert it into a date array using the function datenum.
Where am I making a mistake? Edit: changing hh:mm to HH:MM doesn't work either. When I try only
a = xlsread('filename.xlsx',1,'A2')
I get: a = []
According to the documentation of datestr the syntax for minutes, months and hours is as follows:
HH -> Hour in two digits
MM -> Minute in two digits
mm -> Month in two digits
Therefore you have to change the format string in the call to datestr. Because the serial date numbers of Excel and Matlab use different starting points, you also have to add an offset of 693960 to the numbers retrieved by xlsread.
dateval = xlsread('test.xls',1,'A:A') + 693960;
datestring = datestr(dateval, 'dd/mm/yyyy HH:MM');
This will read the first column (A) of the first sheet (1) in the Excel-file. For better performance you can specify the range explicitly (for example 'A1:A20').
The code converts the Excel date values to:
datestring =
22/06/2015 16:00
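For reference, the offset of 693960 can be checked with a few lines of Python (used here only for the arithmetic). The sketch assumes Excel's 1900 date system, whose day 0 is effectively 1899-12-30, and relies on the fact that a Matlab datenum counts days from year 0000 while Python's date ordinals count from 0001-01-01, a difference of 366 days. The names below are illustrative.

```python
from datetime import date

# Python ordinals start at 0001-01-01 = 1; Matlab datenum starts one
# (proleptic leap) year earlier, which adds 366 days.
MATLAB_ORDINAL_SHIFT = 366

excel_day_zero = date(1899, 12, 30)
offset = excel_day_zero.toordinal() + MATLAB_ORDINAL_SHIFT
print(offset)  # 693960

def excel_serial_to_matlab_datenum(serial):
    """Shift an Excel serial date onto Matlab's datenum scale."""
    return serial + offset
```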
Edit: The following code should work for your provided Excel-file:
% read from file
tbl = readtable('data.xls','ReadVariableNames',false);
dateval = tbl.(1);
dateval = dateval + 693960;
datestring = datestr(dateval)
% plot with dateticks as x-axis
plot(dateval,tbl.(2))
datetick('x','mmm/yy')
%datetick('x','dd/mmm/yy') % this is maybe better than only the months
Minutes need to be called with a capital M to distinguish them from months.
Use a=datestr(xlsread('filename.xlsx',1,'A:A'),'dd/mm/yyyy HH:MM')
Edit: Corrected my original answer, where I had mixed up the cases needed.
I tried with this. It works but it is slow and I am not able to plot the dates at the end. Anyway:
table= readtable ('filename.xlsx');
dates = table(:,1);
dates = table2array (dates);
dates = datenum(dates);
dates = datestr (dates);
