Different results exporting to CSV or Excel

I have a simple report that I want to export to a CSV file. There is only a detail line grouped by one field, no group header, and a group footer for totals. The problem is that when I export to CSV format, the group's total row is repeated in front of every record.
If I export to Excel and then save as a CSV file, the total row is where it belongs. However, one field is spread across 3 columns that are then "merged and centered," which adds two commas in the middle of each line. One extra column is also added at the beginning of the record and two at the end, for 3 more extra commas.
It would be easy enough to write a macro to clean up the spreadsheet and export it as a CSV file for my end users. However, corporate "insecurity" will not allow the end users to have macros.
Any help, suggestions, or pointers to where else to look are greatly appreciated.
cheers
bob

The CSV generated by any standard reporting tool uses a flat data structure, so group-level values get repeated on every data row.
The XLS generated by reporting tools is meant to be opened in Excel, and it is Excel's default behaviour to emit an extra comma for every merged cell when saving as CSV.
The best approach is to create a report layout in which every column has the same width, including the header: while formatting the report, do not center the header or make it wider, bold, italic, etc. Put it as the first column and match its width to the data in the detail record.
This way you can create a report that does not look presentable in XLS but gives you the required data in the CSV.
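If fixing the report layout is not an option, the clean-up macro from the question can also live outside Excel entirely, which side-steps the corporate macro ban. A minimal Python sketch, assuming the stray commas show up as columns that are empty on every row (file names hypothetical):

    import csv

    # read the CSV that Excel's "save as CSV" produced
    with open("report_raw.csv", newline="") as f:
        rows = list(csv.reader(f))

    # pad rows to equal width, then keep only columns that hold a value
    # somewhere; merged-cell and padding commas are empty everywhere
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]
    keep = [i for i in range(width) if any(row[i].strip() for row in rows)]

    with open("report_clean.csv", "w", newline="") as f:
        csv.writer(f).writerows([[row[i] for i in keep] for row in rows])

End users then just run the script over the export instead of enabling macros.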

Related

How do you guarantee that an incoming Excel file is from your original data source and not a fake?

I have a series of Excel files that I send out to customers. They fill them out and send them back with their info. How do I ensure that the Excel files coming back in are the same ones I sent out and don't just share the same title and row/column names?
The data could be falsified under the same title and row/column names. Ideally, I need some kind of fingerprint, artifact, or key attached to each Excel file that proves it came from my original data source.
I used to add white characters to headings as one simple trick.
Or I would put odd names combined with dates into cells in rows far below or columns far to the right.
I even inserted a name using Insert > Name. You can also define names with VBA, and sometimes deleting does not completely remove them; I have used that to hide passwords...
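Along the same lines, a hidden marker becomes verifiable rather than just obscure if it is a keyed hash tied to each recipient. A minimal sketch of the idea with openpyxl; the secret, cell location, and file handling are all hypothetical choices:

    import hashlib
    import hmac

    from openpyxl import load_workbook
    from openpyxl.styles import Font

    SECRET = b"server-side-secret"  # kept on your side, never sent to customers

    def tag(path, customer_id):
        token = hmac.new(SECRET, customer_id.encode(), hashlib.sha256).hexdigest()
        wb = load_workbook(path)
        ws = wb.active
        cell = ws.cell(row=5000, column=700, value=token)  # far outside normal use
        cell.font = Font(color="FFFFFF")  # the white-character trick from above
        wb.save(path)

    def verify(path, customer_id):
        expected = hmac.new(SECRET, customer_id.encode(), hashlib.sha256).hexdigest()
        ws = load_workbook(path).active
        return ws.cell(row=5000, column=700).value == expected

Anyone who knows where to look can still copy the cell into a forged file, so treat this as tamper evidence against casual substitution rather than cryptographic proof; the latter would need a real digital signature over the workbook contents.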

AnyLogic: False number format when exporting data to excel

I collect various data in time plots. If I copy the time plot data and then paste it into Excel, the number format is often wrong. For example, I often get a date like Aug 94 instead of the actual number from the TimePlot. Unfortunately, I can't easily reformat this date back into a number either, since the reformatted number does not match the actual number from the time plot: if I apply the same format as the numbers above and below it, I get 34547, which does not correspond to the actual TimePlot value. Does anyone know how I can prevent this problem?
You can only solve this on the Excel side; AnyLogic hands you the raw data, and Excel then interprets it. You can test this by pasting the chart's raw data into a txt or csv file.
So either fix your Excel settings or paste into a csv, then into an xlsx.
Or better still: Do not manually paste at all. Instead, write your model results into the AnyLogic database and export to Excel from there: this takes away a lot of the pain for you. Check the example models to learn how to do that.
This is not an AnyLogic question, but rather an Excel and computer formatting problem. One way of resolving it is to change the computer's date and time settings.
Another way is to save your output as a txt file from AnyLogic, replace all . with ,, then open an empty Excel workbook, select Text format for the columns, and copy-paste from the txt file.
In Excel there are a few options:
1. When you paste, use the "paste as text only" option. This does not always work, as Excel may still try to format the data for you.
2. Use the Paste Special option and then choose Text. It is also possible this will not work, depending on your Excel settings.
3. Paste using the Text Import Wizard. (This works for me without fail.)
On step 2, choose tab delimited.
On step 3, set the column format to Text for every column (you need to select them in the little diagram below).
You will then see the data exactly as it came from AnyLogic; I once deliberately imported text containing a value Excel would take for a date, and the wizard left it untouched. You will also be able to see what in your data made Excel think it needed to be formatted that way, and then you can fix it. (Post a new question if you struggle with this conversion.)
But as noted in other answers, first prize is to write all the important data to external files. I know that even I sometimes want to export data from a chart and review it in Excel, though, and option 3 works for me every time.
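If you do this round trip often, the same everything-as-text import can be scripted instead of clicked through the wizard each time. A minimal sketch with pandas, assuming the chart data has been pasted into a tab-delimited text file first (file names hypothetical):

    import pandas as pd

    # dtype=str keeps every column as text, so nothing is coerced into a date
    df = pd.read_csv("chart_data.txt", sep="\t", dtype=str)

    # writing to .xlsx preserves the string values (needs openpyxl installed)
    df.to_excel("chart_data.xlsx", index=False)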

Export and customize a Crystal Report in Excel

I am having an issue where, when exporting a report to an Excel sheet, there are lots of spaces and empty cells between the data, and some cells are merged.
Is there a way to export the report so that each field ends up in its own cell, or to otherwise control the export? Suppose my report looks like this:
No Trans_No
1 123
2 333
In my Excel sheet, I would like:
A B
No Trans_No
1 123
2 333
But currently it shows merged cells and spaces, so instead of Trans_No being in cell B, it is in D.
So, is there a way to control the export?
mohs, welcome to StackOverflow.
Crystal Reports and Excel have very different methods and data structures. When exporting a .rpt into .xls format, Crystal has to make many compromises and judgement calls. Here are some suggestions:
1. Do you absolutely need to use Crystal in this process?
A. You can import data directly from your data source into Excel (without using Crystal) using Data > Import External Data.
B. You can export from Crystal into CSV format. If the Excel file is being made just for a machine to read, CSV is a better option.
2. Keep your Crystal Report very simple.
A. After you drag and drop fields onto your design, do not resize or overlap them.
B. Make sure "snap to grid" is checked in your options.
C. Are your fields horizontally aligned? If not, they will probably be put on different rows.
D. If you are grouping data, you may want to suppress the group headers and footers.
3. If you are finding empty rows between your data, you can filter them out in Excel:
Select the column
Data > Filter (Excel 2010)
Dropdown > uncheck 'Blanks'
I don't use Crystal Reports, but could you export to a CSV file and then import it into Excel? The import will allow you to specify the delimiters and should format your data better.
From experience with exporting from older versions of Crystal to Excel, a couple of options:
(1) Export to CSV and open the CSV file in Excel.
This had the disadvantage that, instead of appearing once at the top of the report above the data values, the column headings would appear on every line of the output before the column values, like so:
No Trans_No 1 123
No Trans_No 2 333
This issue may have been resolved in CR XI; if not, the workaround we used was to suppress column headings (so that only the values were included in the output), then copy and paste a standard spreadsheet heading for the report into the output in Excel.
(2) Consistently format all fields to the same, minimum size (typically, two grid widths), with columns aligned by snapping the left edge of fields to guidelines.
This produces output which is almost unreadable in the standard report viewer, but which should align correctly in Excel.
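If you are stuck with a merged-cell export you cannot change, the clean-up can also be scripted after the fact rather than done by hand. A minimal sketch with openpyxl, assuming the export has been saved as .xlsx (openpyxl does not read the old .xls format) and using hypothetical file names:

    from openpyxl import load_workbook

    wb = load_workbook("crystal_export.xlsx")
    ws = wb.active

    # iterate over a copy: unmerging mutates ws.merged_cells.ranges
    for rng in list(ws.merged_cells.ranges):
        ws.unmerge_cells(str(rng))

    wb.save("crystal_export_clean.xlsx")

After unmerging, each value stays in the top-left cell of its former range; leftover empty spacer columns can then be dropped with ws.delete_cols.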

Skipping rows when importing Excel into SQL using SSIS 2008

I need to import sheets which look like the following:
March Orders
***Empty Row
Week Order # Date Cust #
3.1 271356 3/3/10 010572
3.1 280353 3/5/10 022114
3.1 290822 3/5/10 010275
3.1 291436 3/2/10 010155
3.1 291627 3/5/10 011840
The column headers are actually in row 3. I can use an Excel Source to import the sheet, but I don't know how to specify that the information starts at row 3.
I Googled the problem, but came up empty.
Have a look at these threads. The links have more details, but I've included some text from the pages (just in case the links go dead):
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/97144bb2-9bb9-4cb8-b069-45c29690dfeb
Q: While loading a text file into SQL Server via SSIS, we have the provision to skip any number of leading rows from the source and load the data to SQL Server. Is there any provision to do the same for an Excel file? The source Excel file for me has some description in the leading 5 rows; I want to skip it and start the data load from row 6. Please provide your thoughts on this.
A: Easiest would be to give each row a number (a bit like an identity in SQL Server) and then use a conditional split to filter out everything where the number <= 5.
http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/947fa27e-e31f-4108-a889-18acebce9217
Q: Is it possible to skip, for example, the first 6 rows when importing data from Excel into a DB table? Also, the Excel data is divided into sections with headers. Is it possible, for example, to skip every 12th row?
A: YES YOU CAN. Actually, you can do this very easily if you know the number of columns that will be imported from your Excel file. In your Data Flow task, you will need to set the "OpenRowset" custom property of your Excel connection (right-click your Excel connection > Properties; in the Properties window, look for OpenRowset under Custom Properties). To ignore the first 5 rows in Sheet1 and import columns A-M, you would enter the following value for OpenRowset: Sheet1$A6:M (notice I did not specify a row number for column M; you can enter a row number if you like, but in my case the number of rows can vary from one iteration to the next).
AGAIN, YES YOU CAN. You can import the data using a conditional split. You'd configure the conditional split to look for something in each row that uniquely identifies it as a header row, and skip the rows that match this 'header logic'. Another option would be to import all the rows and then remove the header rows using a SQL script in the database, like a cursor that deletes every 12th row. Or you could add an identity field with a seed/increment of 1/1 and then delete all rows with row numbers that divide evenly by 12. Something like that...
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/847c4b9e-b2d7-4cdf-a193-e4ce14986ee2
Q: I have an SSIS package that imports from an Excel file with data beginning in the 7th row. Unlike the same operation with a csv file ('Header Rows to Skip' in the Connection Manager Editor), I can't seem to find a way to ignore the first 6 rows of an Excel file connection. I'm guessing the answer might be in one of the Data Flow Transformation objects, but I'm not very familiar with them.
A: rbhro, actually there were 2 fields in the upper 5 rows that had some data, which I think prevented the importer from ignoring those rows completely.
Anyway, I did find a solution to my problem. In my Excel Source object, I used 'SQL Command' as the 'Data Access Mode' (it's a drop-down when you double-click the Excel Source object). From there I was able to build a query ('Build Query' button) that only grabbed the records I needed, something like this:
SELECT F4, F5, F6 FROM [Spreadsheet$] WHERE (F4 IS NOT NULL) AND (F4 <> 'TheHeaderFieldName')
Note: I initially tried ISNUMERIC instead of 'IS NOT NULL', but that wasn't supported for some reason.
In my particular case, I was only interested in rows where F4 wasn't NULL (and fortunately F4 didn't contain any junk in the first 5 rows). I could skip the whole header row (row 6) with the second WHERE condition.
So that cleaned up my data source perfectly. All I needed to do now was add a Data Conversion object between the source and destination (everything needed to be converted from Unicode in the spreadsheet), and it worked.
My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time.
We provide guidance to our customers and vendors about how files must be formatted before we can process them, and it is up to them to meet the guidelines as much as possible. People often aren't aware that files like that create a problem in processing (next month it might have six lines before the data starts), and they need to be educated that Excel files must start with the column headers, have no blank lines in the middle of the data, and not repeat the headers multiple times. Most important of all, they must have the same columns with the same column titles in the same order every time. If they can't provide that, then you probably don't have something that will work for automated import, as you will get the file in a different format every time, depending on the mood of the person who maintains the Excel spreadsheet.
Incidentally, we push really hard to never receive any data from Excel (this only works some of the time, but if they have the data in a database, they can usually accommodate us). They also must know that any changes they make to the spreadsheet format will result in a change to the import package, and that they will be charged for those development changes (assuming these are outside clients and not internal ones). These changes must be communicated in advance and developer time scheduled; a file with the wrong format will fail and be returned to them to fix.
If that doesn't work, may I suggest that you open the file, delete the first two rows, and save it as a text file; then write a data flow that will process the text file. SSIS does a lousy job of supporting Excel, and anything you can do to get the file into a different format will make life easier in the long run.
"My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time."
Not entirely correct.
SSIS forces you to work with the format you are given, and quite often it does not work correctly with Excel.
If you can't change the format, consider using our Advanced ETL Processor. You can skip rows or fields, and you can validate the data the way you want.
http://www.dbsoftlab.com/etl-tools/advanced-etl-processor/overview.html
The sky is the limit.
You can just use the OpenRowset property, which you can find in the Excel Source properties.
Take a look here for details:
SSIS: Read and Export Excel data from nth Row
Regards.

Excel changes date formats

I run a process that produces a rather large CSV file of data. Sometimes I find it helpful to open the CSV in Excel, make changes manually, and then re-save. However, if there are dates in the CSV, Excel automatically reformats them. Is there a way to prevent Excel from doing this? It would be helpful if I could turn off all text formatting altogether.
If you prepend an apostrophe ' to the beginning of any date string during the export process, Excel will read the value literally (i.e. as plain text) rather than trying to convert it to a date.
This particular solution is handled during the export process; I'm not sure how you would make Excel treat the file differently when it opens it.
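If the export step can produce .xlsx instead of CSV for the manual-edit round trip, another option is to mark the date column as literal text so Excel never reinterprets it. A minimal sketch with openpyxl (file name, columns, and sample rows are all hypothetical):

    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active
    ws.append(["id", "event_date"])

    data = [(1, "2013-09-12"), (2, "2013-10-01")]  # hypothetical rows
    for i, (row_id, date_str) in enumerate(data, start=2):
        ws.cell(row=i, column=1, value=row_id)
        c = ws.cell(row=i, column=2, value=date_str)
        c.number_format = "@"  # '@' is Excel's Text format, so no date coercion

    wb.save("export.xlsx")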
Excel does some nasty tricks when writing out CSV. One of its tricks is to drop the trailing column delimiters if 16 or so consecutive rows have no values for those columns. This means that if you're splitting lines on commas, these rows will have a different number of columns than the rest.
It will also drop any leading 0's, so things like numeric IDs can become messed up.
Another risk you run is chopping the file off short, since Excel can only support a maximum number of rows (prior to Excel 2007 this was 65,536).
If you need to do anything to a CSV file other than read it use a text editor.
When you import the CSV file into Excel, be sure to pre-format the date column as text. There's a frequently overlooked option in the import parsing that allows you to control the format column by column. This also works well for preventing the leading zeros in New England ZIP codes from getting dropped in your contact lists.
If you are using Excel 2010 or later (I am not sure about earlier versions), you can control whether a date in a CSV file follows the current operating system's date format:
Right-click a cell with a date value (e.g. '9/12/2013') in the CSV file to pop up the menu.
Click 'Format Cells' to open the dialog.
Go to the 'Number' tab; 'Date' will be selected under 'Category' (left side), with a list of types on the right side.
Observe that there are two kinds of date format there, ones marked with an asterisk (*) and ones without. As the comment on that screen explains, the formats without the asterisk do not respond to changes in the operating system's date settings. Pick one of those, and your changes to the CSV file will not be rewritten into the system date format, so the dates should stay put in this case.
