concatenate long strings of text for uploading on website - excel

We have used the CONCATENATE function in Excel to build a string of text. We are trying to combine around 10 fields with a total of 300 characters. The CONCATENATE function works in Excel, and once we paste the values to remove the formula, the correct text strings are created. However, the problem arises when we try to use this information as a CSV or tab-delimited file to import into our webstore: the concatenated text is not recognised. When we inspect the format of the cell, the characters are displayed as a row of ############ rather than the text, and I believe this is why we cannot import the file. Short text strings work, but long strings do not. We have also used an OpenOffice Calc spreadsheet to concatenate, but this has the same problem. We have saved the file as UTF-8.

You shouldn't have a cell to inspect if you are saving as a csv. You should have a text file to examine. You also don't need to paste the formulas as values, as none of the formulas will remain after you save as a text file.
After you get the Excel file ready with the concatenate formulas, save the sheet as either a tab-delimited .txt or comma-delimited .csv.
Open it up in Notepad to verify that the values are still what you expect them to be, rather than the ##### characters.
At that point, your import should work.
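As a rough cross-check outside Excel, the same kind of export can be sketched in Python. The field values, the file name, and the helper name below are made up for illustration; the only point is that a tab-delimited text file stores the full concatenated string, however long, and that you can confirm this by reading the file back as plain text.

```python
import csv
import os
import tempfile

def write_tab_delimited(path, rows):
    """Write rows to a UTF-8, tab-delimited text file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter="\t").writerows(rows)

# Hypothetical product fields standing in for the spreadsheet columns.
fields = ["Blue", "cotton", "T-shirt"] + ["extra detail"] * 25
description = " ".join(fields)  # plays the role of CONCATENATE

path = os.path.join(tempfile.mkdtemp(), "products.txt")
write_tab_delimited(path, [["SKU-1", description]])

# Read the file back as plain text, as you would in Notepad.
with open(path, encoding="utf-8") as f:
    line = f.read().rstrip("\r\n")

print(len(description), "#" in line)
```

If the text file looks right at this stage, the ##### display is only how the spreadsheet draws a cell that is too narrow or too long to render, not what gets exported.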

Related

Save Positive Numbers in parentheses in csv

I want to save the positive numbers in parentheses in .csv. For example:
My number = (0.235)
When I try to use number format, it automatically converts this number to -0.235. I changed the format, use text format and then it works. However, once I save and re-open the data it again converts to -0.235. I should save it in .csv format. Any advice?
Format -> Cells -> Custom - add new with.. (0);-0.
Then save as normal Excel.
If you save as CSV - then open it in Notepad or TextEdit you will see the parentheses and minus sign are saved.
But when re-opening it in Excel it reconverts it back to -1/1 etc losing your parentheses.
Best to use Excel format when saving to preserve the formatting options.
The problem is not when you save it in CSV (assuming you converted to text format as you mention above, or use a custom number format like that shown by @JGFMK). The problem is when you open it in Excel.
As is usually the case, in order to prevent Excel from doing automatic conversions, you need to IMPORT the csv file and not OPEN it. You will then have the opportunity to designate that column as TEXT, and your parentheses will be preserved.
When you are dealing with any CSV file, formatting is not included. So Excel tries to interpret the values as best it can (as if you were typing the value into a cell manually). This can cause issues not only with your circumstance, but also with certain fractions, ratios, dates not in the format of the computer system, etc. For all these reasons, you are often best off importing rather than opening any csv file in Excel.
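To illustrate that the CSV itself carries only characters and no formatting, here is a small Python sketch (the column contents are invented for the example). The parentheses survive in the file, and a reader that does no type guessing hands them back unchanged; any conversion to -0.235 happens in the program that opens the file.

```python
import csv
import io

# Write the value the way a text-formatted cell would be saved.
buffer = io.StringIO()
csv.writer(buffer).writerow(["item", "(0.235)"])
raw = buffer.getvalue()

# Read it back without any type guessing: every field stays a string.
row = next(csv.reader(io.StringIO(raw)))

print(repr(raw), row)
```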

Excel IFilters, Concatenated Strings xlsx / xls

I have an application that uses the delivered MS Office IFilters to extract text content from Excel files.
I have an issue with .xlsx files and concatenated strings. The IFilter extracts text, but not concatenated strings.
xls returns concatenated strings (I know they are different file formats and that .xlsx is essentially a zip file with the data being stored as xml). Essentially though, xls returns concatenated strings, xlsx does not.
An example is:
A1=ABC, H2=123, G3=XYZ, D1=Concatenate(A1, H2, G3)
xls IFilter returns the concatenated string as ("ABC123XYZ"), the same as it appears visually in the file, xlsx does not return the concatenated values.
If the cells are adjacent, it may appear that xlsx is returning the concatenated values, but it is not, only the cell values are returned.
I have tried unzipping the xlsx and parsing the .xml files, but again, it does not return the concatenated string.
I'm really after suggestions as how best to handle this. Ultimately I need to be able to extract the concatenated strings from xlsx.
Is my only option to convert the file to xls before extracting the text? Is there an easy way to do this dynamically with no real performance hit and without actually saving the file? Would I be better off 'extracting' the text using Microsoft.Office.Interop.Excel and somehow copying and pasting into a listview? Seems like either would be a huge performance hit.
Any help and advice is gratefully received!
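One thing that may be worth checking when parsing the unzipped XML directly: in the xlsx format a formula cell normally holds the formula in an f element and the last calculated result in a v element of the same cell, so string results of formulas are not in the shared strings part. The Python sketch below assumes the workbook was saved by an application that wrote those cached results; the sample XML is hand-written to resemble xl/worksheets/sheet1.xml and is not taken from a real file, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

def cached_formula_values(sheet_xml):
    """Map cell reference -> cached result for every formula cell."""
    root = ET.fromstring(sheet_xml)
    results = {}
    for cell in root.iterfind(".//m:c", NS):
        formula = cell.find("m:f", NS)
        value = cell.find("m:v", NS)
        if formula is not None and value is not None:
            results[cell.get("r")] = value.text
    return results

# Hand-written sample (an assumption, not captured from a real workbook).
SAMPLE = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1" t="inlineStr"><is><t>ABC</t></is></c>
      <c r="D1" t="str"><f>CONCATENATE(A1,H2,G3)</f><v>ABC123XYZ</v></c>
    </row>
  </sheetData>
</worksheet>"""

print(cached_formula_values(SAMPLE))
```

If the v element is missing for formula cells in your files, the values were never cached and some application would have to recalculate the workbook first.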

Dot at end of numbers in CSV file

I have an excel file with some numbers followed by a dot, say:
12345.
I created a simple macro to convert and save the excel file into a .csv file.
The problem is that the CSV file does not save the dot. The data comes out as 12345 and not 12345. I also tried manually adding the dot and saving the CSV file, but the dot just disappears.
Is there any way to add the dot into the CSV file?
http://i.stack.imgur.com/UoboK.jpg
Add a single quote/apostrophe in front of the number (without a closing quote at the end), so that it reads it as text and does not auto-format it as number.
For example you have:
12345.
Excel will auto-format that column as a number and drop the period unless you specify that it's a text only column.
Change it to:
'12345.
Excel will read that value as a text column and will not save the apostrophe in a text-based CSV file.
Edit:
In testing this, I found that if you save it as "12345.", Excel changes the CSV file to """12345.""".
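The tripled quotes in that edit are ordinary CSV escaping rather than anything specific to Excel: a field that contains a double quote is wrapped in quotes and each inner quote is doubled. A short Python sketch shows the same result, and also that a text value with a trailing dot is written unchanged once nothing treats it as a number. The helper name is made up for the example.

```python
import csv
import io

def to_csv_line(fields):
    """Return one CSV record as a string, without the line ending."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue().rstrip("\n")

plain = to_csv_line(["12345."])      # text value, trailing dot kept
quoted = to_csv_line(['"12345."'])   # value that itself contains quotes

print(plain)
print(quoted)
```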

Excel: Default to TEXT rather than GENERAL when opening a .csv file

Is it possible to change the default data type Excel uses when opening a .csv file? I would like Excel to default to TEXT rather than General for the Column Data Format when reading a .csv file.
I would like to be able to open a .csv without having leading 0's removed from my data. Currently I use the Import External Data wizard when reading a .csv file but I would prefer to be able to use File/Open or to just double click on the .csv file.
One option is to record a macro of the import process, that way you can define the TextFileColumnDataTypes to be Text.
When you record the macro you will see that the format is set with the line .TextFileColumnDataTypes = Array(2, 2, 2)
where 2 sets the Text format and the 3 elements in the array refer to 3 columns.
You can set the array to contain more elements than the number of columns you expect to have in your text files as any extra are ignored.
You can press Alt+F8 to launch the Macro dialog, which shortens the process so that it's similar to opening from the File menu (although still not as convenient as being able to double click a file).
I found a useful example macro with some further explanations here
This goes into a bit more detail explaining what the relevant settings do, e.g. setting the correct delimiter in the macro.
If you have the option, you can save the data to an XML spreadsheet (I know, these files get large very fast) - to open it, just drag it to an open Excel window. This is the only way I know of to get the result you'd like. -- It is only useful for moderate to small data sets.

Excel CSV - Number cell format

I produce a report as a CSV file.
When I try to open the file in Excel, it makes an assumption about the data type based on the contents of the cell, and reformats it accordingly.
For example, if the CSV file contains
...,005,...
Then Excel shows it as 5.
Is there a way to override this and display 005?
I would prefer to do something to the file itself, so that the user could just double-click on the CSV file to open it.
I use Excel 2003.
There isn’t an easy way to control the formatting Excel applies when opening a .csv file. However listed below are three approaches that might help.
My preference is the first option.
Option 1 – Change the data in the file
You could change the data in the .csv file as follows: ...,="005",...
This will be displayed in Excel as ...,005,...
Excel will have kept the data as a formula, but copying the column and using Paste Special > Values will get rid of the formula while retaining the formatting.
Option 2 – Format the data
If it is simply a format issue and all the data in that column is three digits long, open the data in Excel and format the column with the custom format 000.
Option 3 – Change the file extension to .dif (Data interchange format)
Change the file extension and use the file import wizard to control the formats.
Files with a .dif extension are automatically opened by Excel when double clicked on.
Step by step:
Change the file extension from .csv to .dif
Double click on the file to open it in Excel.
The 'File Import Wizard' will be launched.
Set the 'File type' to 'Delimited' and click on the 'Next' button.
Under Delimiters, tick 'Comma' and click on the 'Next' button.
Click on each column of your data that is displayed and select a 'Column data format'. The column with the value '005' should be formatted as 'Text'.
Click on the finish button, the file will be opened by Excel with the formats that you have specified.
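For Option 1 above, if the report is generated by a program, the ="005" wrapping can be applied while writing the file. The sketch below is in Python with invented column values and a hypothetical helper name. Note that a standard CSV writer will quote the field and double its inner quotes, so the line looks different from the hand-typed form, but it decodes to the same ="005" cell content.

```python
import csv
import io

def as_excel_text(value):
    """Wrap a value as ="value" so Excel evaluates it to a text string."""
    return '="' + value.replace('"', '""') + '"'

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["id", as_excel_text("005"), "widget"])
line = buffer.getvalue().rstrip("\n")

print(line)
```

The trade-off is that other programs reading the same file will see the literal formula text, so this suits files meant only for Excel.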
Don't use CSV, use SYLK.
http://en.wikipedia.org/wiki/SYmbolic_LinK_(SYLK)
It gives much more control over formatting, and Excel won't try to guess the type of a field by examining the contents. It looks a bit complicated, but you can get away with using a very small subset.
This works for Microsoft Office 2010, Excel Version 14
I misread the OP's preference "to do something to the file itself." I'm still keeping this for those who want a solution to format the import directly
Open a blank (new) file (File -> New from workbook)
Open the Import Wizard (Data -> From Text)
Select your .csv file and Import
In the dialogue box, choose 'Delimited', and click Next.
Choose your delimiters (uncheck everything but 'comma'), choose your Text qualifiers (likely {None}), click Next
In the Data preview field select the column you want to be text. It should highlight.
In the Column data format field, select 'Text'.
Click Finish.
You can simply format your range as Text.
Also here is a nice article on the number formats and how you can program them.
Actually I discovered that, at least starting with Office 2003, you can save an Excel spreadsheet as an XML file.
Thus, I can produce an XML file and when I double-click on it, it'll be opened in Excel.
It provides the same level of control as SYLK, but XML syntax is more intuitive.
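As a minimal sketch of that XML approach, the Python below builds an XML Spreadsheet 2003 document in which every cell is declared as a String, so 005 keeps its leading zeros. The function name, sheet name, and sample rows are made up for the example, and only a very small subset of the format is used.

```python
from xml.sax.saxutils import escape

def spreadsheetml(rows):
    """Build a minimal XML Spreadsheet 2003 document, all cells typed String."""
    out = [
        '<?xml version="1.0"?>',
        '<?mso-application progid="Excel.Sheet"?>',
        '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"',
        ' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">',
        ' <Worksheet ss:Name="Report"><Table>',
    ]
    for row in rows:
        cells = "".join(
            '<Cell><Data ss:Type="String">%s</Data></Cell>' % escape(str(v))
            for v in row
        )
        out.append("  <Row>%s</Row>" % cells)
    out.append(" </Table></Worksheet>")
    out.append("</Workbook>")
    return "\n".join(out)

document = spreadsheetml([["code", "name"], ["005", "widget"]])
print(document)
```

Saving the returned text with an .xml extension is what lets a double click open it in Excel on a machine where Office has registered that file type.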
Adding a non-breaking space in the cell could help.
For instance:
"firstvalue";"secondvalue";"005 ";"othervalue"
It forces Excel to treat it as a text and the space is not visible.
On Windows you can add a non-breaking space by typing Alt+0160.
See here for more info: http://en.wikipedia.org/wiki/Non-breaking_space
Tried on Excel 2010.
Hope this helps people who are still searching for a proper solution to this problem.
I had this issue when exporting CSV data from C# code, and resolved this by prepending the leading zero data with the tab character \t, so the data was interpreted as text rather than numeric in Excel (yet unlike prepending other characters, it wouldn't be seen).
I did like the ="001" approach, but this wouldn't allow exported CSV data to be re-imported again to my C# application without removing all this formatting from the import CSV file (instead I'll just trim the import data).
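The tab-prefix idea and the trim on re-import can be sketched as follows. This is a Python rendition rather than the original C# code, and the function names and sample row are invented for the example.

```python
import csv
import io

def export_rows(rows, text_columns):
    """Write CSV, prefixing values in text_columns with a tab character."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(
            ["\t" + v if i in text_columns else v for i, v in enumerate(row)]
        )
    return buffer.getvalue()

def import_rows(text):
    """Read the CSV back, trimming the tab prefix and other edge whitespace."""
    return [[v.strip() for v in row] for row in csv.reader(io.StringIO(text))]

data = export_rows([["001", "widget"]], text_columns={0})
print(repr(data), import_rows(data))
```

Trimming on import keeps the round trip lossless as long as no field legitimately starts or ends with whitespace.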
I believe when you import the file you can select the Column Type. Make it Text instead of Number. I don't have a copy in front of me at the moment to check though.
Load csv into oleDB and force all inferred datatypes to string
I asked the same question and then answered it with code.
Basically, when the CSV file is loaded the OLE DB driver makes assumptions, and you can tell it what assumptions to make.
My code forces all data types to string, though it's very easy to change the schema.
For my purposes I used an XSLT to get it the way I wanted, but I am parsing a wide variety of files.
I know this is an old question, but I have a solution that isn't listed here.
When you produce the CSV, add a space after the comma but before your value, e.g. , 005,.
This worked to prevent auto date formatting in Excel 2007, anyway.
The Text Import Wizard method does NOT work when the CSV file being imported has line breaks within a cell. This method handles that scenario (at least with tab-delimited data):
Create new Excel file
Ctrl+A to select all cells
In Number Format combobox, select Text
Open tab delimited file in text editor
Select all, copy and paste into Excel
Just add ' before the number in the CSV doc.
This has been driving me crazy all day (since indeed you can't control the Excel column types before opening the CSV file), and this worked for me, using VB.NET and Excel Interop:
'Convert .csv file to .txt file.
FileName = ConvertToText(FileName)

Dim ColumnTypes(,) As Integer = New Integer(,) {{1, xlTextFormat}, _
    {2, xlTextFormat}, _
    {3, xlGeneralFormat}, _
    {4, xlGeneralFormat}, _
    {5, xlGeneralFormat}, _
    {6, xlGeneralFormat}}

'We are using OpenText() in order to specify the column types.
mxlApp.Workbooks.OpenText(FileName, , , Excel.XlTextParsingType.xlDelimited, , , True, , True, , , , ColumnTypes)
mxlWorkBook = mxlApp.ActiveWorkbook
mxlWorkSheet = CType(mxlApp.ActiveSheet, Excel.Worksheet)

Private Function ConvertToText(ByVal FileName As String) As String
    'Convert the .csv file to a .txt file.
    'If the file is a text file, we can specify the column types.
    'Otherwise, the Codes are first converted to numbers, which loses trailing zeros.
    Try
        Dim MyReader As New StreamReader(FileName)
        Dim NewFileName As String = FileName.Replace(".CSV", ".TXT")
        Dim MyWriter As New StreamWriter(NewFileName, False)
        Dim strLine As String
        Do While Not MyReader.EndOfStream
            strLine = MyReader.ReadLine
            MyWriter.WriteLine(strLine)
        Loop
        MyReader.Close()
        MyReader.Dispose()
        MyWriter.Close()
        MyWriter.Dispose()
        Return NewFileName
    Catch ex As Exception
        MsgBox(ex.Message)
        Return ""
    End Try
End Function
When opening a CSV, you get the text import wizard. At the last step of the wizard, you should be able to import the specific column as text, thereby retaining the '00' prefix. After that you can then format the cell any way that you want.
I tried with Excel 2007 and it appeared to work.
Well, Excel never pops up the wizard for CSV files. If you rename it to .txt, you'll see the wizard when you do a File > Open in Excel the next time.
Put a single quote before the field. Excel will treat it as text, even if it looks like a number.
...,'005,...
EDIT: This is wrong. The apostrophe trick only works when entering data directly into Excel. When you use it in a CSV file, the apostrophe appears in the field, which you don't want.
http://support.microsoft.com/kb/214233

Resources