MATLAB xlsread Function to Import Dates - excel

thanks for taking a look at my question.
I'm having a peculiar issue importing an xlsx file into MATLAB R2016a (Mac OS X), more specifically importing dates.
I am using the below code to import my bank statement history from the Worksheet 'Past' in the xlsx file 'bank_statements.xlsx'. A snippet of column 1 with the dates in dd/mm/yyyy format is also included.
[ndata, text, data] = xlsread('bank_statements.xlsx','Past');
My understanding is that MATLAB uses filters to distinguish between text and numeric data with these being represented in the 'text' and 'data' arrays respectively whilst 'ndata' is a cell array with everything included. Previously, when running the script on MATLAB 2015a (Windows) the dates from column 1 were treated as strings and populated in the 'text' array, whilst on MATLAB 2016a (Mac OS X) column 1 of the text array is blank. I assumed this was because updates had been made to how the xlsread function interprets date information.
Here's the strange part. Whilst inspecting the text array through the Variables window and referencing in the Command Window shows text(2,1) to be empty, performing the datenum function on this "empty" cell successfully gives the date in a numbered format:
Whilst I can solve this issue by using the ndata array (or ignoring the fact that the above doesn't make sense to me), I'd really like to understand what is happening here and why a seemingly empty cell can actually hold information that operations can be performed on.
Best regards,
Jim

I was able to replicate your problem and although I can't answer the intricacies of what is happening, I could offer a suggestion. I was only able to replicate it when I was converting a string of non-date text, which leads me to believe that there might be an issue with the way the data was imported.
Instead of:
[ndata,text,data] = xlsread('bank_statements.xlsx','Past');
maybe try adding the @convertSpreadsheetDates function handle if you have it, along with the range of values you want to import, i.e.
[ndata,text,data] = xlsread('bank_statements.xlsx','Past','A2:A100','',@convertSpreadsheetDates);
Probably not what you are looking for but it might help!

Related

How to properly format dates in Google Sheets or Microsoft Excel

I have a spreadsheet I need to make in Google Sheets. The source of some of the data is exported to an Excel sheet. The data arrives in a dd/mm/yyyy format and I need to display it in a MON d format (Ex Sep 5).
The problem is that both Excel and Sheets look at the date that arrives and think it is mm/dd/yyyy.
For example, 02/08/2022 is believed to be February 8 even though it should be Aug 2. The problem then arises that neither of these platforms knows how to convert this to Aug 2, and I end up having to do this manually.
Does anyone know how to get around this?
I have tried adjusting the format of the date, as well as using DateValue to convert (this fails since it understands the date as mm/dd/yyyy even when it is dd/mm/yyyy).
Any leads would be appreciated!
Thanks!
In Google Sheets, choose File > Settings > Locale and select a locale that uses the dd/mm/yyyy date format, before importing the data. You can then format the date column the way you prefer.
In Google Sheets:
=TEXT(REGEXREPLACE(A1&""; "(\d+)\/(\d+)\/(\d+)"; "$2/$1/$3"); "mmm d")
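For anyone who wants to check the day/month swap outside the spreadsheet, the same regular-expression idea can be sketched in Python. This is only an illustration, assuming the input is text in dd/mm/yyyy form; the helper name swap_day_month is made up for this example.

```python
import re

def swap_day_month(text):
    """Swap the first two numeric fields of a d/m/y string (hypothetical helper)."""
    # Capture the three slash-separated numbers and emit them as second/first/third.
    return re.sub(r"(\d+)/(\d+)/(\d+)", r"\2/\1/\3", text)

print(swap_day_month("02/08/2022"))  # 08/02/2022
```

A month-first parser then reads the swapped string as Aug 2, which is the date the data meant.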
Try the following and format the result to your liking
=INDEX(IF(ISNUMBER(U2:U5),U2:U5,
IF(U2:U5=DATEVALUE("1899-12-30"),,
(MID(U2:U5,4,3)&LEFT(U2:U5,3)&RIGHT(U2:U5,4))*1)))
(Do adjust the formula according to your ranges and locale)
Functions used:
INDEX
IF
DATEVALUE
ISNUMBER
MID
LEFT
RIGHT
Well, for a formulaic solution, if the date is in A1, then the following places the correct date in B1:
=DATEVALUE(TEXT(A1,"DD/MM/YYYY"))
The TEXT function makes a string that will be the same form as your imported string out of the date produced during import. DATEVALUE then gives the proper date you desired.
The trick is in the TEXT step in which you reverse month and day in the string for DATEVALUE.
Naturally, instead of a helper column, it could just be wrapped around any reference to a date from column A, though one would have to remember to do so for all the years the spreadsheet is in use.
If you are importing, not just opening a .CSV file via File|Open and going from there, you have an opportunity to solve all your problems. You use the Ribbon menuing system's Data menu, select the very leftmost thing, Get Data and from the (no arguing THIS isn't a menu) menu that drops down, Legacy Wizards, then finally From Text (Legacy) which will open the old Excel Import Wizard. (You may notice this is very like the Data|Text to Columns Ribbon menu choice and that is because that choice is the old wizard minus the steps at the start that go looking to another file for the data because it knows, by law, that it has to already be in the spreadsheet... in other words, it looks the same because it IS the same.)
Then make selections for the first couple dialogs it presents you to get to the dialog in which you tell it to import columns as whatever: general (let Excel decide), text, date, and do not import. Choose Date and make the selection of DMY to import them properly as you desire them to be so you are never presented with the problem at all.
As you might guess, you can use the abbreviated wizard via the "Text to Columns" feature to do the same thing after import when you see they are reversed. Since it is a single column of data, the result will overwrite the original simplifying your work.
Why does this happen at all? Well, the "locale" folks have the idea. When Excel imports numbers that are in a form it recognizes could be a date, it looks to the operating system settings for the selected ways dates are understood. So if your operating system believes a date should be displayed "Month Day, Year" and Excel has a set of data it thinks fits that mold, it will convert them all using it. So you get those Feb 8's rather than Aug 2's.
Interestingly, it does two other things of note:
It looks at 8, count 'em, 8 rows of data to decide the data fits the pattern. Even with 1,000,000 rows to import, it looks at... 8.
Then it does them ALL as if God himself wrote the "8"... and dates like 25/03/2022 get imported as text not a real date, because they (oh, obviously) can't be dates... "25" can't be a month!
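The two outcomes described above (a wrong date for some rows, plain text for others) can be reproduced with a small Python sketch. It is not how Excel is implemented; it only assumes a parser that expects month-first dates and is fed day-first data. The names parse_with, MDY and DMY are invented for the illustration.

```python
from datetime import datetime

def parse_with(fmt, text):
    """Return the parsed date, or None when the text does not fit the format."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None

MDY = "%m/%d/%Y"  # what a month-first locale assumes
DMY = "%d/%m/%Y"  # what the data actually uses

for raw in ("02/08/2022", "25/03/2022"):
    print(raw, "| month-first:", parse_with(MDY, raw), "| day-first:", parse_with(DMY, raw))
```

The first value parses under both formats but to different dates, so nothing flags the mistake; the second fails under the month-first format, which corresponds to the rows that stay as text.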
It IS possible to change settings (DEEP settings) to make Excel consider X number of rows in a data set before deciding such things. I found them here, on the internet, once upon a time, though I shouldn't like trying to find them again. It will consider up to a million rows in such an import, but... that'd make it pret-ty slow. And that's a million rows for EACH data column. I won't even say that "adds up" - I'll point out it "multiplies up."
Another technique is to add some number of starting rows to force the desired pattern onto the import. I've heard it works in TIME column imports so it ought to in DATE column imports but I've not verified such.
My bet is you will find the use of the "Text to Columns" feature of most use if you can use a hands-on approach - it does require literal action on your part, but is a fast operation. If you will see others using the spreadsheet though... well, you need a formulaic solution or a VBA one (macro with button for them to have some fun clicking as their reward for doing what they were trained to do instead of complaining to the boss you make bad spreadsheets). For a formulaic solution, the above formula is simple.
Last thought though: there's no error-checking and error-overcoming in it. So a date like "25/03/2022" in the data that imported as literal text is a problem. For handling the latter, an up-to-date approach could be:
=IF(TYPE(A1)=1,DATEVALUE(TEXT(A1,"dd/mm/yyyy")),DATE(INDEX(TEXTSPLIT(A1,"/"),1,3),INDEX(TEXTSPLIT(A1,"/"),1,2),INDEX(TEXTSPLIT(A1,"/"),1,1)))
in which the DATE(etc. portion handles finding text of the "25/03/2022" kind. Lots of less up-to-date ways to split the text Excel would have placed in the column, but since demonstrating what to do if it existed was the point, I took the easy way out. (Tried for a simple version but it wouldn't take INDEX(TEXTSPLIT(A1,"/"),1,{3,2,1}) from me for the input parameters to DATE.) TYPE will give a 1 if Excel imported a datum as a date (number), and a 2 if brought in as text. If empty or strange strings could exist, you'll need to deal with what those present you as well.
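The same two-branch logic, sketched in Python for readers who post-process the data outside Excel. It assumes the situation described above: real dates arrive with day and month swapped, and values that could not be month-first dates arrive as "dd/mm/yyyy" text. The function name repair_dmy is hypothetical.

```python
from datetime import date

def repair_dmy(value):
    """Return the intended date for a cell imported from dd/mm/yyyy data.

    Assumes a month-first import: real dates arrive with day and month
    swapped, and anything else arrives as "dd/mm/yyyy" text.
    """
    if isinstance(value, date):
        # Misread date: the stored day is really the month, and vice versa.
        return date(value.year, value.day, value.month)
    # Text such as "25/03/2022": split it and build the date directly.
    day, month, year = (int(part) for part in value.split("/"))
    return date(year, month, day)

print(repair_dmy(date(2022, 2, 8)))  # 2022-08-02
print(repair_dmy("25/03/2022"))      # 2022-03-25
```

As with the formula, empty cells or strange strings are not handled here and would raise an error.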

Excel converts imported from csv numbers to text

When I import the data from csv, I cannot work on it because Excel treats the numbers as text. When I try to sum them or get the average I get 0 or an error because none of them are numbers. It changes when I delete the dot '.' in one cell and put it in again. That operation changes the type of the value to number and it works. But I don't want to change thousands of values this way. How can I convert the data somehow to make it work?
Thanks for every answer.
Make sure the right options are selected when importing: don't import the column with the Text format; select the General format instead, as shown in the picture.

Concatenated DATEVALUE() and TIMEVALUE() only returns string in Google Sheets

Good morning everyone, currently losing my marbles over this and hoping there's a workaround. I concatenated two values to jerry-rig a timestamp, and converted a string into a number conditionally:
TIMEVALUE(IF(H2="Breakfast","08:00:00",0))+DATEVALUE(C2)
I then formatted this as a timestamp in Google Sheets and used IMPORTRANGE() to pull it into another sheet. BUT, when I plugged the values into MAX(), it always returns DATEVALUE(0), "12/30/1899 00:00:00". I wanted to sort and get the most recent date, but these values refuse to sort. Some notes:
Everything is in datevalue format;
I've tried a workaround using SORT(), which didn't work (it just returned the unsorted range);
Currently using this as a workaround: RIGHT(TEXTJOIN(",",TRUE,FILTER('Sheet1'!C1:DU1,'Sheet1'!C2:DU2=1))), but this is causing issues because I can't use SORT().
I am using a lot of helper sheets and imported ranges (I did read that this can cause problems, but it's not clear whether those are fixable issues.)
This formula does return the values I'm asking for; I just want it to sort, which I know I can't do with string values.
I'm probably missing something huge. Please help!
Sorry—here's a link to a dummy version of the Google Sheet: https://docs.google.com/spreadsheets/d/1NmdEgnDU0fHTeRWpagrr1niORnlfwZO8uYCkSYrfFLA/edit?usp=sharing

Excel functions do not work properly after getting csv output from Pandas

Recently, I ran into a very strange thing in Microsoft Excel. I made a dataframe in Python 3.6 and filled it with some integer numbers, then I used the "to_csv" function to get csv output. I opened the file with Microsoft Excel to do some basic statistical analysis and draw some charts; however, Microsoft Excel doesn't recognize the numbers in the cells as numbers. For example, when I add two cells, the result is zero, no matter what the numbers are.
This is a screenshot from my Excel environment:
In the yellow cell (C101) I tried to get the sum of the cells in column C, but as I explained, the sum function (and all the other functions like Count or Max) doesn't work properly. I should also say that all the cells have the "Number" data type. I'm quite confused; any suggestion would help.
I would have written the answer as a comment, but my reputation is too low.
By default, the decimal separator is set to a dot ('.'). You have to switch it to a comma (',') like this:
df.to_csv(file, decimal=',')
EDIT:
I forgot that you also have to set the separator, since its default value is a comma:
df.to_csv(file, sep=';', decimal=',')

Prevent comma-separated list of numbers being interpreted as single large value

33266500,332665100,332665200,332665300 was the original value, cell should look like this: 33266500,332665100,332665200,332665300 but what I see as the cell value in excel is 3.32665E+34
So the question is: I want to convert it back into the original string. I found the format function on Google and used it like this:
format(3.32665E+34,"standard")
giving it as 332,6650,033,266,510,000,000,000
How can I parse it or get back the original string? I believe format is a function in VBA.
Excel has a 15 digit precision limit. If the numbers are already shown like this when you access the file, there is no way to get the number back - you have already lost some digits. VBA code and formulas will not help you.
If this is not the case, you can add a single quote ' mark before the number to store it as text. This will ensure Excel does not try to treat it as a number and thus lose precision.
If you want the value kept exactly, store the data as a string, not as a number. The data type you are using simply doesn't have the ability to do what you are asking it to do.
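A quick way to see why the digits cannot be recovered is to push the same value through a double-precision number in Python. This is only an analogy: it assumes the cell holds an ordinary double, and it uses 15-significant-digit formatting to mimic the precision limit mentioned above. The variable names are invented for the sketch.

```python
# The original cell content with the commas read as thousands separators.
original = "33266500332665100332665200332665300"

as_number = float(original)       # what a numeric cell can hold
round_trip = f"{as_number:.0f}"   # every digit the number still carries
excel_like = f"{as_number:.15g}"  # 15 significant digits

print(excel_like)                 # 3.32665003326651e+34
print(round_trip == original)     # False: the trailing digits are gone
print(str(original) == original)  # True: text is kept exactly
```

Once the value has been stored as a number, the string printed from it no longer matches the original, while the text version is untouched.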
If you're starting with an Excel file that has already been created then you've already lost the information: Excel has tried to understand what it was given and its best guess has turned out to be wrong. All you can do (if you can't get the source data) is go back to the creator of the Excel file and tell them what's wrong.
If you're starting with, say, a text file that you're importing, then the news is much better:
If you're importing manually using the Text Import Wizard, then at "Step 3 of 3" you need to set "Column Data Format" for the problem field to "Text".
If you're using a macro, you'll need to specify a value for the TextFileColumnDataTypes property that does the same thing. The easiest way to get it right is to use the Macro Recorder.
If you want the four values in the string to be separate cells, then again, look at the Text Import Wizard settings: in Step 1 of 3 you need to set "Delimited" data type (usually the default) and in Step 2 make sure that "Comma" is checked.
The value needs to be entered into the cell as a string. You need to make whatever it is that inserts the value precede the value with a '.

Resources