How to convert a semicolon-delimited file to CSV - Excel

How can I change the delimiter of a file from semicolon to comma, in Google Sheets or some other way? I have already tried the Text to Columns feature, but my file is too large (around 10 MB), so it doesn't work.

I have been through that path quite a few times, since business logic likes to use Excel, and software likes CSV.
Excel, and probably Google Sheets and others, are a pain to use for CSV. Excel especially, because it tries to be "friendly": it typically exports with separators determined by the Windows locale (e.g. ";" or ","), and it is inconsistent about whether fields are quoted.
If you HAVE the text file and wish to IMPORT it into Excel, use a test file to find the combination of separators that suits your language variant. To convert the big text file, a tool like AWK is simple and awesome.
To EXPORT from Excel, I would again just go with whatever Excel wants to do, then adapt the file afterwards using AWK or similar. That is easier for me than trying to find the settings for separator and quoting.
There is an article here that I find helpful.
PS: I should mention that in most of Europe, CSV IS semicolon-separated by default.
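The answer above suggests AWK; as an alternative, here is a minimal Python sketch of the same conversion. It is not the answerer's method. The function name convert_delimiter and the sample data are made up for illustration. It reads row by row, so a 10 MB file should not be a problem, and it lets the csv module re-quote any field that contains the new separator.

```python
import csv
import io

def convert_delimiter(src, dst, old=";", new=","):
    """Copy CSV rows from src to dst, changing the field separator."""
    reader = csv.reader(src, delimiter=old)
    writer = csv.writer(dst, delimiter=new, lineterminator="\n")
    for row in reader:        # one row at a time, so memory use stays small
        writer.writerow(row)  # fields containing the new separator get quoted

# In-memory demonstration with hypothetical data. For a real file, pass
# objects from open(path, newline="", encoding=...) instead of StringIO.
sample = io.StringIO('id;name;price\n1;"Smith; John";3,50\n', newline="")
out = io.StringIO(newline="")
convert_delimiter(sample, out)
print(out.getvalue())
```

Note that a value with a decimal comma such as 3,50 comes out quoted, because it now contains the separator.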

Related

Opening CSV files in Excel 2016

I have a new install of Excel 2016, that hates CSV files. It opens them with everything in one column flagpole style, down column A, with commas and speech marks visible.
Salient points:
I have two machines, desktop and laptop, both running same version of Excel. Desktop works fine, opens the same problem files formatted correctly.
I can create CSV files on laptop, save those, open them again on laptop, and it's fine.
I even tried opening the file in Notepad++ and saving it, in the hope of some sort of file format normalisation, but still no good.
I have compared regional settings and almost all settings in Excel.
I tried renaming the file to TXT, which brought up the text file conversion dialogue, and I chose comma delimited. The first time it ignored that and I still got everything in column A; on the second attempt it actually worked. However, that is a pants solution: I want to be able to just natively open CSV files without saving them as TXT, since I use many different ones every day.
Anyone got any ideas?
Thanks in advance.
CSV files are character separated value files, not necessarily comma separated. For more than half the world the separator character is a semicolon (;), not a comma (,)
Excel 2016 properly respects your Windows regional settings, and uses the specified List Separator character
One solution is to change your regional settings for the List Separator attribute to the character you want Excel to default to using, e.g. a comma (,)
This can be changed in the operating system Control Panel, under Region settings, Additional Settings, List separator
For various reasons some people seem to have the incorrect regional settings for the culture they most commonly work in, and therefore have semicolon as the default separator
If you prefer not to change your operating system regional setting to what you think is normal for CSV files, you can change the default behavior in Excel with the Use system separators checkbox under the File/Options/Advanced menu
If you want custom options each time you open a CSV file, use the Data/From Text menu, but this becomes slow and awkward for lots of files
CSV References:
https://en.wikipedia.org/wiki/Decimal_separator (see map of world using comma as decimal point separator, it's very common, and hence CSV's often use semicolon separators)
https://data-gov.tw.rpi.edu/wiki/CSV_files_use_delimiters_other_than_commas
https://en.wikipedia.org/wiki/Comma-separated_values (spec point 3)
https://ec.europa.eu/esco/portal/escopedia/Comma-separated_values_%2528CSV%2529
https://parse-o-matic.com/parse/pskb/CSV-File-Format.htm
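If you process such files outside Excel, you do not have to rely on a locale setting at all. The sketch below is an illustration only, not something Excel does: it uses Python's csv.Sniffer to guess the separator from a sample of the file. The sample text is invented, and the candidate list ";,\t" is an assumption about which separators you expect.

```python
import csv
import io

# Hypothetical file content: semicolon-separated, with decimal commas
text = "name;price\nfoo;1,5\nbar;2,5\n"

# Guess the dialect from a sample, restricted to the likely separators
dialect = csv.Sniffer().sniff(text, delimiters=";,\t")

# Parse the data with whatever separator was detected
rows = list(csv.reader(io.StringIO(text, newline=""), dialect))
print(repr(dialect.delimiter), rows)
```

Sniffing is a heuristic and can guess wrong on unusual files, so for files you produce yourself it is safer to state the separator explicitly.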
I've found a way of saving messy CSV files into a nice table format but I'm not sure if it will work for your case.
Data -> New Query -> From File -> From CSV
By opening the CSV file this way, a pop-up 'Query Editor' window will appear with a nicely organised table format where you can edit, save and load into your excel sheet.
I hope this helps.
For me the solution was to:
Data > From Text > Choose your csv file
Then you can define all the import settings for csv files.
I found another way to fix this, without changing your Windows regional settings.
In Excel, you go to File > Options > Advanced.
Un-check "Use system separators" within the Editing options, and change the Decimal separator to "," and the Thousands separator to ".".
Even though it looks more like a bug than a feature of Excel 2016, it works without changing the Windows regional settings, and it's just a local Excel change.
Just had this same problem. Changing the file extension from csv to txt and opening in Excel brings up the classic wizard so you can map the strings to fields.
The correct answer is to edit your regional settings as suggested above (if a long term change in behavior is desired)
Control Panel -> Region -> Additional Settings -> List separator:
But for my purposes a simple Edit -> Find and Replace using Notepad to replace all commas with semi-colons was a quick and dirty solution that I preferred.
Despite the comment that csv means 'Character Separated Values', in Office 2016 my .csv file association to Excel still says 'Microsoft Excel Comma Separated Values File'.
I have quite a complex csv file where none of the suggestions worked out for me. So I ended up using LibreOffice Calc for the job. It worked like a dream.
I had the same problem, I fixed it in this way (Excel 2020)
Data -> Text to Columns
Now you can configure as you wish the CSV delimiters/endlines...
I had the same issue on Mac OS X El Capitan. The answer given here worked for me. Reproducing it here in case the link doesn't work in future:
Close the Excel application
Click on the Apple button
Select System Preferences
Select Language and Region
Click Advanced
Change the Decimal separator from a comma (,) to a full stop (.)
Then click on Ok/Save
Test the Excel import again
When changing the list separator, make sure it doesn't overlap with the decimal symbol or the digit grouping character. I had to change my list separator to (,), my decimal symbol to (.) and my digit grouping to ('). Now .CSV files open nicely!
In my Excel, it's: Data > Get Data > From File > From Text/CSV
Try opening in excel, then using text to columns, based on commas.
You could probably create some simple vba to open it in that way too.

Create csv immediately recognizable by Excel (both US and EU)

In many EU countries a comma ',' is used as the decimal separator, whereas in the US a dot '.' is used.
CSV (Comma Separated Values) files are supposed to use the comma to separate cell values. However, often a tab '\t' or other characters are used instead.
Interestingly, if you save a .csv file using Microsoft Excel in an EU country that uses the comma as the decimal separator, the character it uses to separate cell values is not an escaped comma but a semicolon ';'. Looking on the net it seems that, if you are in the US, Excel will save .csv files using a proper comma (I can't verify this).
I'm trying to find a way to create a csv file that can be recognized by Excel without any user action, both in the EU and the US.
Here's an example using Excel with an Italian locale
The above, saved as .csv (MS-DOS), translates to
foo;foo bar;
foo'bar;"foo""bar";
foo,bar;foo.bar;
foo:bar;"foo;bar";
foo/bar;foo\bar;
"foo
bar";foo|bar;
foo;bar;foobar
this is to make the empty line appear
It may be possible that, depending on the local "list separator", this may not be recognized correctly.
I've read that the new Excel 2013 needs sep=; to be set as the first line in order to work correctly. This is an ugly hack, but it seems to also be working for Excel 2010 (except it gets overwritten on save)...
Does the above text work for you, if you save it as a csv?
Is there a less hacky way to tell Excel which character is the cell separator, without having the user to set things up?
Thanks.
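For reference, the sep=; hack mentioned above can be produced programmatically. This is a minimal Python sketch; the function name write_with_sep_hint and the rows are invented for illustration, and whether a given Excel version honours the hint is exactly what the question reports, not something this code can guarantee.

```python
import csv
import io

def write_with_sep_hint(rows, out, sep=";"):
    """Write rows as CSV, preceded by the 'sep=' line described above."""
    out.write(f"sep={sep}\r\n")             # hint line, not part of the data
    writer = csv.writer(out, delimiter=sep)  # quotes fields that contain sep
    writer.writerows(rows)

buf = io.StringIO(newline="")
write_with_sep_hint([["foo", "foo bar"], ["foo;bar", 'foo"bar']], buf)
print(buf.getvalue())
```

Other CSV readers will treat the sep= line as an ordinary first record, so it only makes sense for files meant for Excel.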
Time to head back to a time before visual anything, and grab a command from the past. It will involve you manually writing the file out with VBA, but it meets the criteria you expect: the Write # statement.
' Write columns A to C of rows 1 to 100 to a CSV file
Open "c:\tmp\myfile.csv" For Output As #1
For i = 1 To 100
    Write #1, Range("A" & i).Value, Range("B" & i).Value, Range("C" & i).Value
Next i
Close #1
You will have to do a little manual work - it doesn't translate a single quote into a double quote, but the rest is as desired:
the Write # statement inserts commas between items and quotation marks around strings as they are written to the file
Numeric data is always written using the period as the decimal separator.
Dates are written as #yyyy-mm-dd hh:mm:ss#
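If you generate the file outside Excel, a similar locale-independent layout can be sketched in Python. This is an analogy to the Write # behaviour quoted above, not a reimplementation of it: csv.QUOTE_NONNUMERIC quotes every string and leaves numbers bare with a period as the decimal separator. Unlike the VBA approach described above, the csv module also doubles embedded double quotes, and it does not produce the #...# date format. The sample row is hypothetical.

```python
import csv
import io

buf = io.StringIO(newline="")
# Comma separator, strings always quoted, numbers written unquoted
writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(["foo", 1.5, 3, 'a"b'])  # hypothetical sample row
print(buf.getvalue())
```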

Difference between Excel .csv and plain .csv?

I am running Windows 7 and have MS Office installed. Any time I download a .csv file the "file type" line in the "save as..." dialog defaults to "Microsoft Office Excel comma separated values file". Is there actually a Microsoft specific format that is distinct from "plain" .csv?
Googling the relevant terms returns various incredibly uninformative pages such as this one. Is any information lost or gained, or anything encoded differently by using this format than by just treating a file as a .csv, conforming to the general standards?
Yes, there are almost certainly differences.
Off the top of my head: English Excel uses "," as a separator. The German locale uses ";" as a separator, requiring an additional importing step if you want to import a CSV with a comma separator. This is not unique to German locales; roughly 1/4 to 1/3 of the world uses ";".
Also, there might be differences in how complicated strings are escaped (; and " in texts), which probably differ from program to program.
This is not Excel's fault, since the CSV "format" is not really standardised and there are countless programs rolling their own CSV parser, which leads to all sorts of problems because they forgot to handle corner cases.
I once read the comment that CSV is the plague of data exchange formats because it is so difficult to do right. I could not agree more: I have to deal with CSV files on a daily basis and they are extremely annoying to work with.
Open source fans will hate me for this, but I think csv is a poor choice for data exchange, even xlsx is better because it has rules which are well defined.
There are two things going on. The abbreviation (and suffix) "CSV" can mean character-separated values or it can mean comma-separated values. "Microsoft Office Excel comma separated values file" is a disambiguation, and means that you have a number of values in a record, with the field values separated by a comma.
The values themselves, in comma-separated value files, may contain commas if they are properly stropped (quoted). Usually, the stropping is putting a double quote around some or all of the field.
MS Excel also supports newlines in the middle of fields, again being properly stropped.
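To make the quoting rules concrete, here is a small Python sketch (the sample rows are invented) that writes fields containing a comma and a newline and reads them back unchanged. It shows the general comma-separated convention described above, not any Excel-specific behaviour.

```python
import csv
import io

rows = [
    ["id", "note"],
    ["1", "contains, a comma"],
    ["2", "line one\nline two"],  # newline in the middle of a field
]

buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)   # problem fields are wrapped in double quotes
text = buf.getvalue()

# Reading the text back recovers the original values, comma and newline included
parsed = list(csv.reader(io.StringIO(text, newline="")))
print(text)
```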

Delimiter in Excel

I'm having a problem finding a unique delimiter for Excel when I import a bunch of data from CSV. I wanted "^;" to be the unique delimiter. However, Excel isn't smart enough to know that I want only "^;" as the delimiter, NOT a single "^" and/or a single ";", even though I checked both "Semicolon" and "Other" (i.e. ^) and selected "Treat consecutive delimiters as one". Is there any way to make "^;" the only delimiter Excel looks for? Or is it just a painful reality we need to accept in MS Excel?
OpenOffice allows you to specify the separation string when saving your file as .csv.
As far as I know there is no way of having a multi character delimiter when importing with Excel.
If it's possible with your dataset, I suggest using find and replace to change all instances of "^;" to a more common delimiter format. Having done this you should be able to import into Excel easily.
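The find-and-replace idea can be made a little safer with a short script that splits on the two-character delimiter and lets a CSV writer quote whatever needs quoting. This is a minimal Python sketch: the function name convert_multichar and the sample line are hypothetical, and it assumes the input has no quoted fields and no line breaks inside fields.

```python
import csv
import io

def convert_multichar(lines, out, delimiter="^;"):
    """Split each line on a multi-character delimiter and write standard CSV."""
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        # A lone "^" or ";" stays inside its field; only "^;" splits
        writer.writerow(line.rstrip("\r\n").split(delimiter))

src = io.StringIO("a^;b;c^;d^e\nx, y^;z^;w\n")
dst = io.StringIO(newline="")
convert_multichar(src, dst)
print(dst.getvalue())
```

A plain text replace of "^;" with "," would break any field that already contains a comma; here such fields are quoted instead.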

CSV for Excel, Including Both Leading Zeros and Commas

I want to generate a CSV file for user to use Excel to open it.
If I want to escape the comma in values, I can write it as "640,480".
If I want to keep the leading zeros, I can use ="001234".
But if I want to keep both the comma and the leading zeros in the value, writing it as ="001,002" gets split into two columns. There seems to be no way to express the correct data.
Is there any way to express 001, 002 in CSV for Excel?
Kent Fredric's answer contains the solution:
"=""001,002"""
(I'm bothering to post this as a separate answer because it's not clear from Kent's answer that it is a valid Excel solution.)
Put a prefix String on your data:
"N001,002","N002,003"
( As long as that prefix is not an E )
That notation (in OpenOffice at least) parses as a total of 2 columns, with the N001,002 bytes correctly stored.
The CSV specification says that a comma is permitted inside quoted strings.
Also, a warning from experience: make sure you do this with phone numbers too. Excel will otherwise interpret phone numbers as floating point numbers and save them in scientific notation :/ , and 1.800E10 is not a really good phone number.
In OpenOffice, this RawCSV chunk also decodes as expected:
"=""001,002""","=""002,004"""
ie:
$rawdata = '001,002';
$equation = "=\"$rawdata\"";
$escaped = str_replace('"','""',$equation);
$csv_chunk = "\"$escaped\"" ;
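The same escaping can be sketched in Python, where the csv module does the outer quoting for you. This is an illustration of the steps above under the assumption that you build the file with the standard library; the helper name excel_text_formula is made up.

```python
import csv
import io

def excel_text_formula(value):
    """Wrap a value as ="..." so it is read as a text formula, not a number."""
    inner = value.replace('"', '""')  # escape quotes inside the formula string
    return f'="{inner}"'

buf = io.StringIO(newline="")
writer = csv.writer(buf, lineterminator="\n")
# The writer quotes each field and doubles the quotes already inside it
writer.writerow([excel_text_formula("001,002"), excel_text_formula("002,004")])
print(buf.getvalue())
```

The printed line matches the raw CSV chunk shown above.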
Do
"""001,002"""
I found this out by typing "001,002" and then doing save-as CSV in Excel. If this isn't exactly what you want (you don't want quotes), this might be a good way for you to find what you want.
Another option might be use tab-delimited text, if this is an option for you.
A reader of my blog found a solution, ="001" & CHAR(44) & "002", it seems workable on my machine!
Pretty old thread but why don't you just add whitespace after your value. It will be then treated as string and no leading zeros will be stripped.
"001,002"." "
Since no-one mentioned it already, figured it was worth mentioning it in this old post.
If you add a horizontal tab character \t before the number, then MS Excel will also show the leading zeros, and the tab character doesn't show in the Excel sheet, even if it's surrounded by double quotes (e.g. \"\t001,002\").
It also looks nicer in Notepad++, compared to putting a \0 aka NULL before such number.
Looking more at the Excel spreadsheet, it looks like what you want can't be done using CSV.
This site http://office.microsoft.com/en-us/excel/HP052002731033.aspx says "If cells display formulas instead of formula values, the formulas are converted as text. All formatting, graphics, objects, and other worksheet contents are lost. The euro symbol will be converted to a question mark."
However, you can change how you load it to get the result you want. See this web page:
Microsoft import a text file.
The key thing is to choose Import External Data-Import Data-Text Files, go Next, Next, and then tick "Text" under column data format. This will prevent it being interpreted as a number, and losing formatting.
I was fiddling around with CSV to Excel (I use PHP to create the CSV, but I guess this solution works for any language). When you spot that leading characters (such as +, - or 0) are disappearing, create the CSV with chr(13) as a prefix. This is a non-printable character and it works wonders for my Excel Office 2010 version. I tried other non-printable characters, but with no luck.
So I use the Chirp Internet solution, but tweaked with my prefix:
if (preg_match("/^0/", $str) || preg_match("/^\+?\d{8,}$/", $str) || preg_match("/^\d{4}.\d{1,2}.\d{1,2}/", $str)) {
$str = chr(13)."$str";
}
If you are using "Content-Disposition" and exporting from ASP to Excel using HTML tags, then you have to add "style='mso-number-format:\#;'" to that tag to make it accept only text values, so that leading zeros are not dropped. If a single backslash "\" is not accepted, use a double backslash "\\".
All the suggested answers don't seem to work for me right now ("=""blahblah""" and others) in all current Excel versions or Numbers app on OS X.
The only solution I found to be working by fiddling around is to add an escaped null character at the beginning of the string (which is \0 in PHP or C based languages). Everything ends up treated as is without being calculated or processed by the software when opening the calc sheet.
echo "\0" . $data;
Excel applies a default format to CSV columns depending on their content. So if you have 001 in a CSV, Excel will automatically turn it into 1.
The only way to keep the leading zeros in Excel from a CSV file is by changing the extension of the CSV file to .txt. Then open Excel, click on Open, select the .txt file, and you'll see the Text Import Wizard. Select your CSV format (separated by commas), then make sure you select "Text" as the column data format.
And that's it, now you can export that previous csv data to any other while keeping the leading zeros.
This is straightforward using Excel's Power Query functionality that allows you to perform step-by-step transformations.
Original File:
Add a Custom Column:
