Excel::Writer::XLSX writing unformatted date - excel

I have a script which is using the Spreadsheet::XLSX module to read in data which includes unformatted dates. When I open the spreadsheet on the desktop it does show the dates in the mm/dd/yyyy format. After I am finished reading them I write them out to a spreadsheet using the Excel::Writer::XLSX module. I am basically adding the next date in sequence. Everything works fine until I then read in from the spreadsheet that I created. It ONLY reads the date as formatted, no matter if I use either of these to read them:
$cell->{Val}
$cell->value()
This is the write format I'm using to write out the date. Just so I'm clear, I am writing out the date in the unformatted value using this format. If I don't use the num_format then the dates are in the ddddd format when I open the spreadsheet.
$workbook->add_format(bold => 1, align => 'center', num_format => 'mm/dd/yyyy');
How do I get it to be consistent when reading the spreadsheets and also viewing the dates as mm/dd/yyyy when I open it?

Quoting from CELL FORMATTING section:
Cell formatting is defined through a Format object. Format objects are created by calling the workbook add_format() method as follows:
my $format2 = $workbook->add_format( %props ); # Set at creation
The format object holds all the formatting properties that can be applied to a cell, a row or a column. The process of setting these properties is discussed in the next section.
Another quote from write_date_time() section:
A date should always have a $format, otherwise it will appear as a number, see "DATES AND TIME IN EXCEL" and "CELL FORMATTING".
According to this your code should be:
my $date_format = $workbook->add_format(
    bold       => 1,
    align      => 'center',
    num_format => 'mm/dd/yyyy',
);
# apply the format to a specific cell
# $string should be an ISO-8601 date-time, e.g. yyyy-mm-ddThh:mm:ss.sss
# (a trailing Z is accepted, but time zones are not supported)
$worksheet->write_date_time($row, $column, $string, $date_format);
My weapon of choice for wrangling time stamps would be the Date::Manip distribution.

Related

Read hh:mm:ss from Excel properly

I have a column in my Excel sheet called "Start_Time" and the data in the column is in "HH:MM:SS" format, for example "10:13:20".
But when I use the pandas.read_excel() function to load the data, the "Start_Time" column shows decimal values (for example: 0.425925925925926) with the data type "object".
How can I make df["Start_Time"] display as "10:13:20"?
I tried pd.Timedelta(), but it works for only one value at a time. I want to convert all values in that column.
Start Time
End Time
16:24:50
16:32:27
10:35:53
15:06:46
15:21:43
6:39:50
6:39:50
21:55:02
3:29:04
3:29:13
0:53:06
0:53:06
10:21:13
10:25:18
16:15:25
16:19:31
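Excel stores a date as a serial number counting days from a starting date, and a time of day as the fractional part of that number (0.5 is noon).
When you choose a format, Excel converts that serial number to the format demanded.
If you want pandas to display the correct time, you have to convert the serial number yourself: 0.425925925925926 of a day is 36,800 seconds, i.e. 10:13:20.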
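As a rough illustration of that conversion, here is a minimal Python sketch that uses only the standard library. It assumes each cell holds a plain fraction of one day (possibly as a string, since the column came in as "object"); the name fraction_to_hms is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def fraction_to_hms(day_fraction):
    """Format an Excel time value (a fraction of one day) as HH:MM:SS."""
    # float() also accepts numeric strings, which an "object" column may hold.
    total_seconds = round(float(day_fraction) * SECONDS_PER_DAY)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(fraction_to_hms(0.425925925925926))  # 10:13:20
```

With pandas, something like df["Start_Time"].map(fraction_to_hms) should apply the same conversion to every value in the column, provided no cell is empty or non-numeric.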

Exceljs package not retrieving some cell values

I am using the ExcelJS package. When I retrieve the value of some cells, it doesn't return the value inside; instead it returns something that looks like a date format.
const workbook = new Excel.Workbook();
workbook.csv.readFile(path)
  .then(worksheet => {
    const seenCell = worksheet.getCell('A3').value;
    console.log(seenCell);
  });
When I run this code and try to get cell A4, it returns the content, which is a string, but trying to get cell A3 returns
2027-02-11T23:00:00.000Z
I would like to know which format this is; it looks like a date to me, and my data is not a date.
Since CSV files don't contain any information about data types, ExcelJS tries to guess: anything that even remotely looks like a date is converted to a Date. But the test isn't perfect, and something like 123-456-7890 gets converted to 7891-01-13T22:00:00.000Z.
You can disable date detection by passing an empty dateFormats list,
e.g. workbook.csv.readFile('foo.csv', {dateFormats: []}).

Changing format of TODAY() in excel

I'm using TODAY() to acquire today's date and then appending a static value to it using the following:
=TODAY()&"T23:00:00"
which returns 43202T23:00:00
I really need it in the format 2018-04-12T23:00:00
Any help on this would be great!
There are a couple ways to accomplish this, depending on whether your goal is a formatted String (to display) or a numeric value (such as data type Date) for storing or using with calculations.
If you want a formatted date/time result (to display to the user)...
Use the TEXT worksheet function:
=TEXT(TODAY(),"yyyy-mm-dd")&"T23:00:00"
...this works because TODAY() returns a Date data type, which is basically just a number representing the date/time (where 1 = midnight on January 1, 1900, 2 = midnight on January 2, 1900, 2.5 = noon on January 2, 1900, etc.).
You can convert the date type to a String (text) with the TEXT function, in whatever format you like. The example above will display today's date as 2018-04-12.
If, for example, you wanted the date portion of the string displayed as April 12, 2018, then you would instead use:
TEXT(TODAY(),"mmmm d, yyyy")
Note that the TEXT worksheet function (and VBA's Format function) always return Strings, ready to be concatenated with the rest of the String that you're trying to add ("T23:00:00").
If you want to use the result in calculations...
If you instead want the result to be in a Date type, then instead of concatenating a string (produced by the TEXT function) to a string (from "T23:00:00"), you could instead add a date to a date:
=TODAY()+TIME(23,0,0)
or
=TODAY()+TIMEVALUE("23:00")
...and then you can format it as you like to show or hide Y/M/D/H/M/S as necessary with Number Formats (shortcut: Ctrl+1).
More Information:
MSDN : TEXT Function (Excel)
MSDN : TIMEVALUE Function (Excel)
MSDN : TIME Function (Excel)
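To make the "date is just a number" point concrete, here is a minimal Python sketch (standard library only) that turns an Excel serial number into a calendar date and time. It assumes the default 1900 date system; the function name is hypothetical. One detail is not covered in the answer above: Excel also counts a nonexistent 1900-02-29 as serial 60, so the sketch uses a different base date on either side of it.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial):
    """Convert an Excel 1900-system serial number to a datetime.

    The fractional part is rounded to the nearest second.
    """
    if serial < 1:
        raise ValueError("serial numbers below 1 are not handled in this sketch")
    whole_days = int(serial)
    seconds = round((serial - whole_days) * 86400)
    if whole_days == 60:
        raise ValueError("serial 60 is Excel's nonexistent 1900-02-29")
    # Before the phantom leap day, serial 1 is 1900-01-01; after it, every
    # serial is one day ahead, so the effective base moves back by one day.
    base = datetime(1899, 12, 31) if whole_days < 60 else datetime(1899, 12, 30)
    return base + timedelta(days=whole_days, seconds=seconds)


print(excel_serial_to_datetime(2.5))                        # 1900-01-02 12:00:00
print(excel_serial_to_datetime(43202 + 23 / 24).isoformat())  # 2018-04-12T23:00:00
```

The second call mirrors =TODAY()+TIME(23,0,0) for the date in the question: the serial stays numeric, and the ISO-style string is produced only at display time.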

SAS: Date reading issue

I have imported an Excel sheet where date1 is 4/1/16, date2 is 5/29/14, and date3 is 5/2/14. However, when I import the sheet into SAS, PROC PRINT shows the first two variables as "42461" and "41788", while date3 is 05/02/2014.
I need these date formats consistent b/c I am doing a Cox regression with PROC PHREG.
Any thoughts about how to make these dates consistent?
Thanks!
This probably depends on how the data is represented in Excel and how it is imported into SAS. First, are the formats the same in Excel? The first two are being imported as numbers, the third as a string.
In Excel, you can format the column using a date format. Perhaps your import method will recognize this. You can also define another column as a string, using the text(<whatever>, "YYYY-MM-DD") to convert to a string in that format.
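Alternatively, you can import all as numbers and then add the value to 1899-12-30. That is the effective base date for Excel: Excel treats "1" as 1900-01-01 but also counts a nonexistent 1900-02-29, so for any date after February 1900 the count is one day ahead of a 1899-12-31 base.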
Because your column had mixed numeric (date) and character values SAS imported the field as character. So the actual dates got imported as the text version of the actual number that Excel stores for dates. The ones that look like date strings in SAS are the fields that were strings in Excel also.
Or if in your case one of the three columns was all valid dates then SAS imported it as a number and assigned a date format to it so there is nothing to fix for that column.
The best way to fix it is to make sure that all of the values in the date column are either real dates or empty cells. Then PROC IMPORT will be able to make the right guess at how to import it.
Once you have the strings in SAS and you want to try to fix them then you need to decide which strings look like integers and which should be treated as date strings.
So you might just check if they have any non-digit characters and assume those are the ones that are date strings instead of numbers. For the ones that look like integers just adjust the number to account for the fact that Excel numbers dates from 1900 and SAS numbers them from 1960.
data want ;
  set have ;
  if missing(excel_string) then date=.;
  else if notdigit(trim(excel_string)) then date=input(excel_string,anydtdte32.);
  else date=input(excel_string,32.) + '01JAN1900'd -2 ;
  format date yymmdd10. ;
run;
You might wonder why the minus 2? It is because Excel starts from 1 instead of 0 and also because Excel thinks 1900 was a leap year. Here are the Excel date numbers for some key dates and a little SAS program to convert them. Try it.
data excel_dates;
input datestr :$10. excel_num :comma32. #1 sas_num :yymmdd10. ;
diff = sas_num - excel_num ;
format _numeric_ comma14. ;
sasdate1 = excel_num - 21916;
sasdate2 = excel_num + '01JAN1900'd -2 ;
format sasdate: yymmdd10.;
cards;
1900-01-01 1
1900-02-28 59
1900-03-01 61
1960-01-01 21,916
2018-01-01 43,101
;
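The same offset can be checked outside SAS. The following Python sketch (standard library only) takes the constant 21916 from the table above and compares it with day counts computed from real calendar dates; the names SAS_EPOCH and excel_to_sas are invented for this illustration.

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)       # SAS date value 0
EXCEL_SERIAL_OF_SAS_EPOCH = 21916  # Excel's number for 1960-01-01 (from the table)


def excel_to_sas(excel_num):
    """Convert an Excel serial (1900-03-01 or later) to a SAS date value."""
    return excel_num - EXCEL_SERIAL_OF_SAS_EPOCH


# Rows from the table that fall after Excel's phantom 1900-02-29.
rows = [("1900-03-01", 61), ("1960-01-01", 21916), ("2018-01-01", 43101)]
for text, excel_num in rows:
    # Days since 1960-01-01 according to the real calendar.
    expected = (date.fromisoformat(text) - SAS_EPOCH).days
    print(text, excel_num, excel_to_sas(excel_num), expected)
```

For the first two rows of the table (serials 1 and 59) a fixed offset of 21916 would be off by one day, which is the leap-year quirk the answer describes.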

SAS import excel date format changes

I need to import an Excel file. It has a few columns, and the first column, A, is a date column. Column A has the date format DDMMMYYYY, e.g. '01Jan2017', and in Excel the data type is date. When I import it into SAS, all the other columns keep the same data type (numeric, character, etc.) and value, but column A becomes a number (e.g. '42736' for '01Jan2017'). How do I import the data as it is, without converting the data type?
libname out '/path';
proc import out=out.sas_output_dataset
datafile='/path/excel_file.xlsx'
DBMS=XLSX
REPLACE;
sheet="Sheet1";
run;
It is hard to know without seeing the data. The below is general information, it may not answer your precise problem.
To avoid common errors you should set mixed=yes in your libname statement. You may also want to include the stringdates=yes option.
mixed=yes allows for any out-of-range Excel date values.
stringdates=yes brings all dates into SAS in character format, so you will need to use the input() function to convert them into SAS dates in a new numeric variable:
sas_date = input(Date, mmddyy10.);
I would suggest that you import the excel with the import wizard in SAS. Afterwards right-click on the query and extract the code, see here: SAS Import Query DE
In the generated code itself you can format each imported column into the desired format.
For the possible format see: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/leforinforref/n0p2fmevfgj470n17h4k9f27qjag.htm
Hope this helps.
A value of '42736' for '01Jan2017' is an indication that the column in the Excel file has a mix of cells with date values and cells with string values. In that case SAS will make the variable character and store the date values as a digit string that represents the raw number excel uses for the date. To convert '42736' to a date value you need to first convert it to a number and then adjust the number for the difference in the base date used by Excel.
date_value = input(date_string,32.) + '30DEC1899'd ;
To convert the strings that look like '01JAN2017' use the DATE informat instead.
date_value = input(date_string,date11.);
You could add logic to do both to handle a column with mixed values.
date_value = input(date_string,??date11.);
if missing(date_value) then
  date_value = input(date_string,??32.) + '30DEC1899'd;
To have the new variable print the date values in a human readable style attach a date type format to the variable.
format date_value date9. ;
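For readers who want to try the same two-step logic outside SAS, here is a minimal Python sketch under the same assumptions: each value is either a ddMONyyyy string or a digit string holding an Excel serial for a date after February 1900. The function name parse_mixed_date is hypothetical, and month names are assumed to be English abbreviations.

```python
from datetime import date, datetime, timedelta

EXCEL_BASE = date(1899, 12, 30)  # same base as '30DEC1899'd in the SAS code


def parse_mixed_date(date_string):
    """Parse '01Jan2017'-style text or an Excel serial such as '42736'."""
    text = date_string.strip()
    if not text:
        return None  # treat blank cells as missing
    try:
        # First attempt: a date string like 01Jan2017.
        return datetime.strptime(text, "%d%b%Y").date()
    except ValueError:
        pass
    # Fallback: a digit string holding Excel's day count.
    # int() raises ValueError if the text is neither form.
    return EXCEL_BASE + timedelta(days=int(text))


print(parse_mixed_date("42736"))      # 2017-01-01
print(parse_mixed_date("01Jan2017"))  # 2017-01-01
```

Trying the date pattern first and falling back to the serial mirrors the order of the two input() calls above.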
