A reporting service generates a CSV file, and certain columns (oddly enough) contain mixed date/time formats: some rows express the datetime as m/d/y, others as d.m.y.
Applying =TYPE() returns either 1 or 2, so Excel recognizes a given cell either as a number (the Excel timestamp) or as text.
How can I convert any kind of wrong date-time format into a "normal" format that can be used and ensure some consistency of data?
I am thinking of 2 solutions at this moment:
I should somehow process the odd data with existing Excel functions
I should ask for the report to be generated correctly from the very beginning and avoid this hassle in the first place
Thanks
Certainly your second option is the way to go in the medium-to-long term. But if you need a solution now, and if you have access to a text editor that supports Perl-compatible regular expressions (like Notepad++, UltraEdit, EditPad Pro etc.), you can use the following regex:
(^|,)([0-9]+)/([0-9]+)/([0-9]+)(?=,|$)
to search for all dates in the format m/d/y, surrounded by commas (or at the start/end of the line).
Replace that with
\1\3.\2.\4
and you'll get the dates in the format d.m.y.
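If you would rather script the replacement than do it in an editor, the same search and replace can be sketched in Python. This is only a sketch: the function name `mdy_to_dmy` and the sample data are made up for illustration, and it assumes plain "\n" line endings and unquoted fields.

```python
import re

# Same pattern as above; re.MULTILINE lets ^ and $ match at every line boundary.
MDY = re.compile(r"(^|,)([0-9]+)/([0-9]+)/([0-9]+)(?=,|$)", re.MULTILINE)

def mdy_to_dmy(csv_text: str) -> str:
    """Rewrite every m/d/y field as d.m.y and leave all other fields untouched."""
    return MDY.sub(r"\1\3.\2.\4", csv_text)

# Hypothetical sample: one row in m/d/y, one already in d.m.y.
sample = "1,3/25/2012\n2,25.03.2012\n"
print(mdy_to_dmy(sample))
```

Only fields that consist entirely of digits separated by slashes are touched, so rows that are already in d.m.y pass through unchanged.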
If you can't get the data changed, then you may have to resort to another column that translates the dates (this assumes the date you want to change is in A1):
=IF(ISERR(DATEVALUE(A1)),DATE(VALUE(RIGHT(A1,LEN(A1)-FIND(".",A1,4))),VALUE(MID(A1,FIND(".",A1)+1,2)),VALUE(LEFT(A1,FIND(".",A1)-1))),DATEVALUE(A1))
It tests whether Excel can read the text as a date. If that fails, it chops up the string at the dots and builds a date from the pieces; otherwise it reads the date directly. Either way, you should end up with a date you can use.
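The same "try one format, fall back to the other" idea can be sketched outside Excel as well. The following Python sketch assumes the two formats from the question (m/d/y and d.m.y) with four-digit years; `parse_mixed_date` is a hypothetical helper name, not part of any library.

```python
from datetime import datetime

# Assumed formats, taken from the question: m/d/y and d.m.y, four-digit years.
FORMATS = ("%m/%d/%Y", "%d.%m.%Y")

def parse_mixed_date(text: str) -> datetime:
    """Try each known format in turn and return the first successful parse."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")
```

Because the two formats use different separators, a value can only ever match one of them, so the order of `FORMATS` does not matter here.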
I had a data set of 8 million rows in a tab-delimited txt file without quotes.
I had 5 of the 14 columns with date values in dd.MM.yyyy format.
Problem 1
I am trying to import the file. In the "Format your columns" step, if I choose "date" as the type of those columns, it gives errors and all cells in the columns turn into "?".
So I selected "polynominal" and planned to convert the attribute type to date later.
Problem 2 (the real one)
I imported the data and added a "Nominal to Date" operator. When I ran it, I got an error in line 14.899:
Cannot parse date: Unparseable date: "0"
I found the line and saw that the columns were separated wrongly. There was a tab character inside a string in an earlier cell, so the values moved one cell to the right. And this row was not the only one that was shifted.
I want to split off the rows that have values of the wrong data type for specified attributes. So I can't correct them manually.
How can I do that in RapidMiner?
Or any other ideas to figure these problems out?
so most likely you need to adjust the date formatting in this pull-down menu:
To be honest, I usually just import as polynominal and then convert to date in my process. It's easier and reproducible.
You appear to have a broken input file.
The best solution, obviously, is to fix the process that generates the data. Escape or replace tab characters and format the date in a non-ambiguous format such as the ISO date format.
Assuming that you can't fix the data, you should probably write a robust parser program yourself. A generic parser such as RapidMiner's won't be able to fix every problem.
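A minimal sketch of such a pre-filter in Python, assuming the 14 tab-separated columns and the dd.MM.yyyy dates described in the question. The positions in `DATE_COLUMNS` are hypothetical and have to be adjusted to the real file; `is_valid` and `split_rows` are made-up names.

```python
from datetime import datetime

EXPECTED_COLUMNS = 14   # column count from the question
DATE_COLUMNS = (3, 7)   # hypothetical zero-based positions of the date columns

def is_valid(fields):
    """A row is valid if it has the right width and every date column parses."""
    if len(fields) != EXPECTED_COLUMNS:
        return False
    for index in DATE_COLUMNS:
        try:
            datetime.strptime(fields[index], "%d.%m.%Y")
        except ValueError:
            return False
    return True

def split_rows(lines):
    """Separate lines into (good, bad) so the bad ones can be inspected later."""
    good, bad = [], []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        (good if is_valid(fields) else bad).append(line)
    return good, bad
```

For a file with millions of rows you would probably write the two groups to separate files while iterating over the input instead of collecting them in lists. The clean file can then be imported as usual and the rejected rows handled separately.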
I have an Excel file storing a thousand lines of dates. Each date seems to be (auto)formatted as a Date. A (PHP Excel) parser I'm using (really can't update/use another one) parses each of these into a string containing the number of days since 1900.
Is there a way to format the values in Excel as simple text ("08.03.1991") so that this file is parsed correctly?
I could add a quote: "'08.03.1991", but I need an (Excel-based) one-action solution for all the thousand lines.
Remark: Since this is a user's file, I can't just write a simple VBA script or the like to handle this, because there will be new files in the future and the user needs to be able to solve this alone.
I admit I am not quite sure what you have and what you want but it may be worth trying: Select column of dates, apply Text to Columns with Tab as delimiter and in step 3 of 3 select Text.
You could use the TEXT function like this:
=TEXT(A1,"dd.mm.yyyy")
For more details have a look here
I have an excel file, with a date column, but I want to convert the date column to
YY/MM/DD/Time
I've been searching for 2 hours with no result yet.
This is my data:
Source Data: http://i.stack.imgur.com/75zbS.jpg
Expected Output: YY/MM/DD/Time
Can someone help me how I can do it? I want to insert it into postgresql and I want to change everything to compatible date format.
EDIT: I have tried Right Click -> Format cells -> date but it does not change anything!
Thanks
You could use this method and split the date and time into separate cells:
=DATE((LEFT(A1,4)),(MID(A1,5,2)),MID(A1,7,2))
=TIME(MID(A1,10,2),(MID(A1,12,2)),0)
Once your date value is in a format Excel can recognize, you can then change the formatting to whatever you'd like.
Or if you don't care to have the value in a recognizable date format, you can just get your desired formatting like this (will give you a string that looks like this: YY/MM/DD/Time):
=MID(A1,3,2)&"/"&MID(A1,5,2)&"/"&MID(A1,7,2)&"/"&MID(A1,10,4)
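If the values are cleaned up outside Excel before loading, the same fixed-position parsing can be sketched in Python. This assumes the raw value looks like "20131014 1733" (YYYYMMDD, one space, HHMM), which is only inferred from the character offsets used in the formulas above; `to_iso` is a hypothetical helper.

```python
from datetime import datetime

def to_iso(raw: str) -> str:
    """Parse an assumed 'YYYYMMDD HHMM' value and return 'YYYY-MM-DD HH:MM:SS'."""
    parsed = datetime.strptime(raw, "%Y%m%d %H%M")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")

print(to_iso("20131014 1733"))
```

The resulting string is in a form that databases generally accept as a timestamp without extra settings.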
ISO 8601 format would be YYYY-MM-DD HH24:MI:SS.
But you can set Postgres to accept various date styles by setting the datestyle setting. You can do that globally in postgresql.conf or temporarily for your session.
SET datestyle = SQL, DMY
For more exotic formats, you can create a temporary staging table, COPY to it and INSERT into your target table from there. Among others, you can use to_timestamp():
SELECT to_timestamp('13/10/14/17:33', 'YY/MM/DD/hh24:mi')
More info and example code in related answers like these:
Replacing whitespace with sed in a CSV (to use w/ postgres copy command)
How to bulk insert only new rows in PostreSQL
You're going to have to parse the date into four columns using fixed-width parsing.
Then reassemble the columns any way you want.
Just Google "excel parse columns fixed".
I have a big .xls file. Some numbers show as a date.
31.08 shows as 31.aug
31.13 shows as 31.13 (that is what I want all columns to be)
When I reformat 31.aug to number it shows as 40768,00
I have found no ways to convert 31.aug to 31.08 as a number. All I am able to do is to reformat 31.aug as d.mm and then it shows as 31.08 and when I try to reformat it from 31.08 to number it shows as 40768,00. No way to cheat Excel using different types of cell formats.
What are your regional settings? There are some regions where the short date format is dd.mm.yyyy (Estonian, for instance). Maybe if you change the regional settings to US / UK and paste the data again, it won't be changed.
Worked in a small test I did here. Hope it helps.
Internally, Excel stores dates as integers. 1 is January 1, 1900. If you enter something that Excel interprets as a date, it will be converted into an integer. I think from this point on there is no way back.
There is a setting in Options on the "International" tab where you can define your decimal separator. If you set this to ".", then your Excel should accept 30.12 as a decimal number and not as a date.
As pointed out by others, Excel interprets some of your data as a date instead of a number, which depends on your regional settings. To avoid this happening try Tiago's and stema's responses, they will work depending on your regional settings.
To repair your problem in a large file after it has happened without re-entering/re-importing your data, you can use something like
=DAY(B5)+MONTH(B5)/100
to convert a "date" back to a number. Excel will still display it as a date when you first enter this, but when you reformat it as "Number" now it will display the value you originally entered.
Since your column seems to contain a mix between correct numbers and dates, you need to add an if() construct to separate the two cases. If you haven't changed the display format yet (i.e. it still displays 31.Aug) you can use
=IF(LEFT(CELL("format";B7);1)="D";DAY(B7)+MONTH(B7)/100;B7)
which checks if the format is a "D"ate format. If you have already changed the format to Number, but know all your correct data is below 40000, you can use
=IF(B5>40000;DAY(B5)+MONTH(B5)/100;B5)
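The arithmetic behind these formulas can be sketched in Python for anyone repairing an exported file outside Excel. This is a sketch under assumptions: `serial_to_day_month` and `repair` are hypothetical names, the 40000 threshold is the one suggested above, and serials are taken from Excel's default 1900 date system.

```python
from datetime import date, timedelta

# Day zero that reproduces Excel's 1900 date system for serials from March 1900 on
# (it compensates for Excel treating 1900 as a leap year).
EXCEL_EPOCH = date(1899, 12, 30)

def serial_to_day_month(serial: int) -> float:
    """Turn an Excel date serial back into the day.month number that was typed."""
    d = EXCEL_EPOCH + timedelta(days=serial)
    return d.day + d.month / 100

def repair(value: float, threshold: float = 40000) -> float:
    """Convert values that look like date serials, pass real numbers through."""
    if value > threshold:
        return serial_to_day_month(int(value))
    return value

print(round(repair(40786), 2))  # serial for 31 August 2011
print(repair(31.13))            # already a plain number, returned as-is
```

The year that Excel silently attached to the date is simply discarded, which is also what the DAY/MONTH formula does.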
As suggested above, go to Control Panel - Region and Language - Advanced Settings - Numbers - and change the Decimal Symbol from "," to "."
Good luck!
The data you are pasting, is it by any chance a pivot table?
For example, like you, I am copying a lot of data into a large spreadsheet. The data I am copying is from another sheet and it is a pivot table.
If I paste normally, half will show up as numbers, which they are in the source file and half will show up as dates, for no reason, which drives me insane.
If I Paste->Values however, they will all show up as numbers, and as I don't need the pivot functionality in the destination file this solution is fine.
All you have to do is format the cell.
1-right click on the cell where you want to insert the number and choose Format Cells.
2-then click on Number and select 'General' from the number menu.
Hope this will help future people with the same issue.
We're exporting our analytics reports in various formats, among them CSV. For some clients this CSV finds its way into Excel.
Inside the CSV file one of the columns is a Date, for example
"Start Date","Name"
"07-04-2010", "Maxim"
Excel has trouble parsing this date format, obviously depending on the locale of the user. Is "07" the day or the month?
Could you recommend some textual format for a Date field that excel will not have trouble parsing? I'm aiming at the most fail safe option possible. I would settle for some escape sequence that will cause excel to avoid parsing the text in the column altogether.
Thanks for helping,
Maxim.
You have two options. Go with the month as a string and the year as 4 digits, or use ISO formatting: yyyy-mm-dd.
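A minimal sketch of the ISO option on the export side, in Python. It assumes the two-column layout from the question; `export_rows` is a hypothetical function, and the sample date is read as 7 April 2010 purely for illustration.

```python
import csv
import io
from datetime import date

def export_rows(rows):
    """Write (start_date, name) pairs as CSV with dates in yyyy-mm-dd form."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["Start Date", "Name"])
    for start, name in rows:
        # isoformat() always yields year-month-day, independent of the locale.
        writer.writerow([start.isoformat(), name])
    return buf.getvalue()

print(export_rows([(date(2010, 4, 7), "Maxim")]))
```

Since the year comes first and has four digits, there is no day/month order left for the reader to guess.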
If you format your dates as follows in the csv output, Excel will parse the content exactly as a date (other columns for realism only)
43,somestring,="03/03/2003",anotherval
55,anotherstring,="01/02/2004",finalval
so add ="{date}" and it parses as date!