Text to columns: How to prevent date formatting of decimal numbers? - excel

With a dataset such as this:
Date, Value
04.03.2020, 13.35
04.03.2020, 13.8
04.03.2020, 21.21
You can split the data using Data > Text to columns and specifying the value separator.
The problem is that Excel recognizes 13.8 as August 13, so the output is:
Date Value
04.03.2020 13.35
04.03.2020 13.aug
04.03.2020 21.21
How can I make sure that Excel never interprets decimal numbers such as 13.8 as dates?
I'm working in a region that uses , as decimal separator, but I often have to work with data that is set up with . as the decimal separator. One work-around is of course to replace , with ; and . with , before opening a .csv file. And if the only other solution is to set it up with VBA, I'm perfectly able to do so myself. But I often find myself trying to help colleagues on their computers without my own VBA configurations ready to go. So if there's any other way to do this using standard Excel system settings, that would be great!
This little problem has bugged me for years and I'm eager to get rid of it once and for all.
Edit:
I'm running Excel version 1908 on Windows 7, Office 365.
This problem often occurs when I'd like to inspect csv files that are recognized as Excel files on my system. There have been suggestions to format the cells as General before splitting the data. That does not seem to work on my end.
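For reference, the separator-swapping workaround mentioned above could be scripted roughly like this. It is only a sketch in Python, not an Excel setting. The function names are made up for illustration, and it assumes that numeric fields are exactly the ones Python can parse as a float, so dates like 04.03.2020 are left alone.

```python
import csv
import io


def swap_decimal(field: str) -> str:
    """Turn '13.8' into '13,8'; leave anything that is not a plain number untouched."""
    try:
        float(field)
    except ValueError:
        return field  # e.g. "Date" or "04.03.2020" (two dots, not a float)
    return field.replace(".", ",")


def to_comma_decimal_csv(text: str) -> str:
    """Rewrite comma-separated, dot-decimal CSV text as semicolon-separated, comma-decimal text."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    for row in reader:
        writer.writerow([swap_decimal(field) for field in row])
    return out.getvalue()


sample = "Date, Value\n04.03.2020, 13.35\n04.03.2020, 13.8\n"
print(to_comma_decimal_csv(sample))
```

Whether Excel then reads the result as numbers still depends on the regional settings of the machine that opens the file.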

Another solution is to get what you want with formulas:
Formula for Dates:
=TEXT(LEFT(A2,FIND(",",A2,1)-1),"General")
Formula for Values:
=TEXT(MID(A2,(FIND(",",A2,1)+2),(LEN(A2)-(FIND(",",A2,1)+1))),"General")
Results:
During the Text to Columns process, Step 3 gives you the option to select a Column data format. You could also select the Value column there and experiment with Text or General formatting.

Related

How to properly format dates in Google Sheets or Microsoft Excel

I have a spreadsheet I need to make in Google Sheets. The source of some of the data is exported to an Excel sheet. The data arrives in a dd/mm/yyyy format and I need to display it in a MON d format (Ex Sep 5).
The problem is both excel and sheets look at the date that arrives and think it is mm/dd/yyyy.
For example, 02/08/2022 is believed to be February 8 even though it should be Aug 2. The problem then arises that neither of these platforms ends up knowing how to convert this to Aug 2, and I end up having to do this manually.
Does anyone know how to get around this?
I have tried adjusting the format of the date, as well as using DateValue to convert (this fails since it understands the date as mm/dd/yyyy even when it is dd/mm/yyyy).
Any leads would be appreciated!
Thanks!
In Google Sheets, choose File > Settings > Locale and select a locale that uses the dd/mm/yyyy date format, before importing the data. You can then format the date column the way you prefer.
In Google Sheets:
=TEXT(REGEXREPLACE(A1&""; "(\d+)\/(\d+)\/(\d+)"; "$2/$1/$3"); "mmm d")
Try the following and format the result to your liking
=INDEX(IF(ISNUMBER(U2:U5),U2:U5,
IF(U2:U5=DATEVALUE("1899-12-30"),,
(MID(U2:U5,4,3)&LEFT(U2:U5,3)&RIGHT(U2:U5,4))*1)))
(Do adjust the formula according to your ranges and locale)
Functions used:
INDEX
IF
DATEVALUE
ISNUMBER
TRUNC
MID
LEFT
RIGHT
Well, for a formulaic solution, if the date is in A1, then the following places the correct date in B1:
=DATEVALUE(TEXT(A1,"DD/MM/YYYY"))
The TEXT function makes a string that will be the same form as your imported string out of the date produced during import. DATEVALUE then gives the proper date you desired.
The trick is in the TEXT step in which you reverse month and day in the string for DATEVALUE.
Naturally, instead of a helper column, it could just be wrapped around any reference to a date from column A, though one would have to remember to do so for all the years the spreadsheet is in use.
If you are importing, not just opening a .CSV file via File|Open and going from there, you have an opportunity to solve all your problems. You use the Ribbon menuing system's Data menu, select the very leftmost thing, Get Data and from the (no arguing THIS isn't a menu) menu that drops down, Legacy Wizards, then finally From Text (Legacy) which will open the old Excel Import Wizard. (You may notice this is very like the Data|Text to Columns Ribbon menu choice and that is because that choice is the old wizard minus the steps at the start that go looking to another file for the data because it knows, by law, that it has to already be in the spreadsheet... in other words, it looks the same because it IS the same.)
Then make selections for the first couple dialogs it presents you to get to the dialog in which you tell it to import columns as whatever: general (let Excel decide), text, date, and do not import. Choose Date and make the selection of DMY to import them properly as you desire them to be so you are never presented with the problem at all.
As you might guess, you can use the abbreviated wizard via the "Text to Columns" feature to do the same thing after import when you see they are reversed. Since it is a single column of data, the result will overwrite the original simplifying your work.
Why does this happen at all? Well, the "locale" folks have the idea. When Excel imports numbers that are in a form it recognizes could be a date, it looks to the operating system settings for the selected ways dates are understood. So if your operating system believes a date should be displayed "Month Day, Year" and Excel has a set of data it thinks fits that mold, it will convert them all using it. So you get those Feb 8's rather than Aug 2's.
Interestingly, it does two other things of note:
It looks at 8, count 'em, 8 rows of data to decide the data fits the pattern. Even with 1,000,000 rows to import, it looks at... 8.
Then it does them ALL as if God himself wrote the "8"... and dates like 25/03/2022 get imported as text not a real date, because they (oh, obviously) can't be dates... "25" can't be a month!
It IS possible to change settings (DEEP settings) to make Excel consider X number of rows in a data set before deciding such things. I found them here, on the internet, once upon a time, though I shouldn't like trying to find them again. It will consider up to a million rows in such an import, but... that'd make it pret-ty slow. And that's a million rows for EACH data column. I won't even say that "adds up" - I'll point out it "multiplies up."
Another technique is to add some number of starting rows to force the desired pattern onto the import. I've heard it works in TIME column imports so it ought to in DATE column imports but I've not verified such.
My bet is you will find the use of the "Text to Columns" feature of most use if you can use a hands-on approach - it does require literal action on your part, but is a fast operation. If you will see others using the spreadsheet though... well, you need a formulaic solution or a VBA one (macro with button for them to have some fun clicking as their reward for doing what they were trained to do instead of complaining to the boss you make bad spreadsheets). For a formulaic solution, the above formula is simple.
Last thought though: there's no error-checking and error-overcoming in it. So a date like "25/03/2022" in the data that imported as literal text is a problem. For handling the latter, an up-to-date approach could be:
=IF(TYPE(A1)=1,DATEVALUE(TEXT(A1,"dd/mm/yyyy")),DATE(INDEX(TEXTSPLIT(A1,"/"),1,3),INDEX(TEXTSPLIT(A1,"/"),1,2),INDEX(TEXTSPLIT(A1,"/"),1,1)))
in which the DATE(etc. portion handles finding text of the "25/03/2022" kind. Lots of less up-to-date ways to split the text Excel would have placed in the column, but since demonstrating what to do if it existed was the point, I took the easy way out. (Tried for a simple version but it wouldn't take INDEX(TEXTSPLIT(A1,"/"),1,{3,2,1}) from me for the input parameters to DATE.) TYPE will give a 1 if Excel imported a datum as a date (number), and a 2 if brought in as text. If empty or strange strings could exist, you'll need to deal with what those present you as well.
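To make the two branches of that formula easier to follow, here is the same logic as a small Python sketch. It is an illustration only, not something Excel runs. The function name is hypothetical, and it assumes each cell arrives either as a date with day and month swapped or as "dd/mm/yyyy" text.

```python
from datetime import date, datetime


def fix_imported_date(value):
    """Return the intended dd/mm/yyyy date for one imported cell value."""
    if isinstance(value, date):
        # Imported as a real date, but with day and month swapped
        # (02/08/2022 read as Feb 8): swap them back.
        return date(value.year, value.day, value.month)
    # Imported as text because the "month" was greater than 12,
    # e.g. "25/03/2022": parse it as day/month/year directly.
    return datetime.strptime(value, "%d/%m/%Y").date()


print(fix_imported_date(date(2022, 2, 8)))   # misread date
print(fix_imported_date("25/03/2022"))       # left as text by the import
```

As with the formula, empty cells or odd strings are not handled and would raise an error here.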

Excel functions do not work properly after getting csv output from Pandas

Recently, I ran into a very strange thing in Microsoft Excel. I made a dataframe in Python 3.6, filled it with some integers, and used the "to_csv" function to get CSV output. I opened the file in Microsoft Excel to do basic statistical analysis and draw some charts; however, Excel doesn't recognize the numbers in the cells as numbers. For example, when I add two cells, the result is zero, no matter what the numbers are.
This is a screenshot from my Excel environment:
In the yellow cell (C101) I tried to get the sum of the cells in column C, but as I explained, the sum function (and all the other functions like Count or Max) doesn't work properly. I should also say that all the cells have the "Number" data type. I'm quite confused; any suggestion would help.
I would have written the answer as a comment, but my reputation is too low.
By default, the decimal separator is set to a dot ('.'). You have to switch it to a comma (',') like this:
df.to_csv(file, decimal=',')
EDIT:
I forgot that you also have to set the field separator, since its default value is a comma:
df.to_csv(file, sep=';', decimal=',')
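A small self-contained sketch of what those two parameters do to the output. The column names and values are made up for illustration, and omitting the path makes to_csv return the CSV text instead of writing a file.

```python
import pandas as pd

# Hypothetical data: one integer column and one float column.
df = pd.DataFrame({"id": [1, 2], "value": [13.35, 13.8]})

# sep sets the field delimiter, decimal sets the decimal mark for floats.
csv_text = df.to_csv(sep=";", decimal=",", index=False)
print(csv_text)
```

Note that decimal only affects floating-point columns; integer columns are written unchanged, so for purely integer data it is the sep setting that matters.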

How to prevent Excel from handling strings containing a colon as formulas

I am generating csv files, and some cells have the format nn:nnnn , i.e. digits separated by a colon. It's not a time format nor a date format, it's just text cells and I don't want them to be re-formatted at all.
I've added some logic to my code in order to identify if it looks like a legal time format or a date, and if so, I wrap that string like this ="nn:nnnn". But I'm not interested in adding those characters to all the cells.
It almost solved my problem, but there are still some cases, such as 07:1155, that MS Excel insists on translating to 1.09375. Other cells, such as 68:0062, remain intact. Is there a way to recognize which strings are going to be calculated or translated?
Is there any workaround, such as a setting in MS Excel, to tell it not to perform any translation on this kind of text?
Instead of just opening your CSV in Excel, you might try doing (Menus) Data/Get External Data/From Text. Or if you're using VBA, that would be something like:
With ActiveSheet.QueryTables.Add(Connection:="TEXT;H:\csvtest.csv", _
        Destination:=Range("$A$3"))
    .Name = "csvtest_1"
    .FieldNames = True
    ' etc. (set the remaining import properties here)
End With
You may need to specify Text as the incoming column format.
I've got the following answer from Mr. JP Ronse at the Microsoft Community forum
Try to precede a string like 07:1155 with a single quote.
A single quote prevents Excel from interpreting the value.
For some reason Excel interprets a string like 07:1155 as a time and translates it to the value.
Excel sees 07:1155 as 7 hours and 1155 minutes, translated to values:
07:00 => 0.291666666666667
1155 minutes => (1155/60)/24 => 0.802083333333333
The sum is 1.09375
It looks as if there is no translation of values like n:00nn or n:0nnnnn.
Checking on the 2 n's after the colon (not 00) could be a workaround.
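The arithmetic above can be reproduced with a few lines of Python. This sketch only redoes the calculation (hours plus minutes as a fraction of a day); it does not try to predict which strings Excel will or will not convert, and the function name is made up for illustration.

```python
def time_like_to_day_fraction(text: str) -> float:
    """Treat 'hh:mmmm' as hours and minutes and return the value as a fraction of a day."""
    hours, minutes = text.split(":")
    total_minutes = int(hours) * 60 + int(minutes)
    return total_minutes / (24 * 60)


print(time_like_to_day_fraction("07:00"))    # the 7 hours part on its own
print(time_like_to_day_fraction("07:1155"))  # 7 hours plus 1155 minutes
```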

Fixing MS Excel date time format

A reporting service generates a csv file, and certain columns (oddly enough) have a mixed date/time format: some rows contain the datetime expressed as m/d/y, others as d.m.y.
Applying =TYPE() returns either 1 or 2, i.e. Excel recognizes the value either as a number (the Excel timestamp) or as text.
How can I convert any kind of wrong date-time format into a "normal" format that can be used and ensure some consistency of data?
I am thinking of 2 solutions at this moment :
I should somehow process the odd data with existing Excel functions
I should ask for the report to be generated correctly from the very beginning and avoid this hassle in the first place
Thanks
Certainly your second option is the way to go in the medium-to-long term. But if you need a solution now, and if you have access to a text editor that supports Perl-compatible regular expressions (like Notepad++, UltraEdit, EditPad Pro etc.), you can use the following regex:
(^|,)([0-9]+)/([0-9]+)/([0-9]+)(?=,|$)
to search for all dates in the format m/d/y, surrounded by commas (or at the start/end of the line).
Replace that with
\1\3.\2.\4
and you'll get the dates in the format d.m.y.
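If no such editor is at hand, the same search and replace can be done with Python's re module. This is a minimal sketch with a hypothetical function name. It uses the pattern and replacement from above unchanged and adds the MULTILINE flag so that ^ and $ match at every line of the file.

```python
import re

# Same pattern as above: a date m/d/y delimited by commas or line boundaries.
MDY = re.compile(r"(^|,)([0-9]+)/([0-9]+)/([0-9]+)(?=,|$)", re.MULTILINE)


def mdy_to_dmy(csv_text: str) -> str:
    """Rewrite every m/d/y field as d.m.y, leaving all other fields alone."""
    return MDY.sub(r"\1\3.\2.\4", csv_text)


sample = "a,3/25/2020,b\n12/1/2019,x"
print(mdy_to_dmy(sample))
```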
If you can't get the data changed then you may have to resort to another column that translates the dates: (assumes date you want to change is in A1)
=IF(ISERR(DATEVALUE(A1)),DATE(VALUE(RIGHT(A1,LEN(A1)-FIND(".",A1,4))),VALUE(MID(A1,FIND(".",A1)+1,2)),VALUE(LEFT(A1,FIND(".",A1)-1))),DATEVALUE(A1))
It tests whether it can read the text as a date. If that fails, it chops up the string and converts it to a date; otherwise it reads the date directly. Either way, it should convert it to a date you can use.

Some but not all Excel numbers show as a date

I have a big .xls file. Some numbers show as a date.
31.08 shows as 31.aug
31.13 shows as 31.13 (that is what I want all columns to be)
When I reformat 31.aug to number it shows as 40768,00
I have found no ways to convert 31.aug to 31.08 as a number. All I am able to do is to reformat 31.aug as d.mm and then it shows as 31.08 and when I try to reformat it from 31.08 to number it shows as 40768,00. No way to cheat Excel using different types of cell formats.
How's your regional settings? There are some Regions where the short date is identified by dd.mm.yyyy. (Estonian, for instance). Maybe if you change the regional settings for US / UK and paste the data again it won't be changed.
Worked in a small test I did here. Hope it helps.
Internally, Excel stores dates as integers; 1 is January 1, 1900. If you enter something that Excel interprets as a date, it is converted into an integer. I think from that point on there is no way back.
There is a setting in Options on the "International" tab where you can define your decimal separator. If you set this to ".", Excel should accept 30.12 as a decimal number and not as a date.
As pointed out by others, Excel interprets some of your data as a date instead of a number, which depends on your regional settings. To avoid this happening try Tiago's and stema's responses, they will work depending on your regional settings.
To repair your problem in a large file after it has happened without re-entering/re-importing your data, you can use something like
=DAY(B5)+MONTH(B5)/100
to convert a "date" back to a number. Excel will still display it as a date when you first enter this, but when you reformat it as "Number" now it will display the value you originally entered.
Since your column seems to contain a mix between correct numbers and dates, you need to add an if() construct to separate the two cases. If you haven't changed the display format yet (i.e. it still displays 31.Aug) you can use
=IF(LEFT(CELL("format";B7);1)="D";DAY(B7)+MONTH(B7)/100;B7)
which checks if the format is a "D"ate format. If you have already changed the format to Number, but know all your correct data is below 40000, you can use
=IF(B5>40000;DAY(B5)+MONTH(B5)/100;B5)
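The same repair logic, written out as a Python sketch for anyone who wants to check the idea outside Excel. The function name is made up, the threshold of 40000 is taken from the formula above, and the serial-to-date step assumes the usual convention that counting days from 1899-12-30 reproduces Excel's dates for serial numbers after February 1900.

```python
from datetime import date, timedelta

# Day zero that matches Excel serial numbers for dates after February 1900 (assumption).
EXCEL_EPOCH = date(1899, 12, 30)


def repair(value: float, threshold: float = 40000) -> float:
    """Return the number originally typed, undoing an accidental date conversion."""
    if value <= threshold:
        return value  # a genuine number that Excel left alone
    d = EXCEL_EPOCH + timedelta(days=int(value))
    # Same idea as DAY(B5)+MONTH(B5)/100, rounded to two decimals.
    return round(d.day + d.month / 100, 2)


print(repair(31.13))   # untouched value stays as it is
print(repair(40786))   # a serial number is turned back into day.month
```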
As suggested above, go to Control Panel - Region and Language - Advanced Settings - Numbers - and change the Decimal Symbol from "," to "."
Good luck!
The data you are pasting, is it by any chance a pivot table?
For example, like you, I am copying a lot of data into a large spreadsheet. The data I am copying is from another sheet and it is a pivot table.
If I paste normally, half will show up as numbers, which they are in the source file and half will show up as dates, for no reason, which drives me insane.
If I Paste->Values however, they will all show up as numbers, and as I don't need the pivot functionality in the destination file this solution is fine.
All you have to do is format the cell:
1. Right-click the cell where you want to insert the number.
2. Then click Number and select 'General' from the number menu.
Hope this will help future people with the same issue.
