Converting MS documents to csv files - linux

I have a large batch of MS Office documents and I'm using Ubuntu. I need to convert all of these documents to CSV format.
Is there any way to do it?

You could try opening them with OpenOffice.org and then saving them as CSV files.

Look for the 'xlhtml' package.
It's far from perfect, but if your Excel documents are simple it will probably work.
It only converts Excel files. If you want to convert Word to plain text, look at 'antiword'.

Related

How to programmatically create MS Office .doc or .docx files on a linux server

In the past I've used catdoc for reading .doc files, but now I need to write them.
What is the best way to go about this? I don't need it to be perfect or fully featured.
A quick and dirty way would be to write your file as HTML and save it with a .doc extension.
Because Word can open HTML, you would effectively have a Word file.
Beware that when you open the file in Word, it sometimes comes up in "web view" mode.
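
To illustrate the HTML-as-.doc trick, here is a minimal Python sketch. The function name `write_html_as_doc` and the file name `report.doc` are made up for this example. Note that the result is plain HTML with a .doc extension, not a real Word binary, so it relies on Word being willing to open HTML.

```python
import html
from pathlib import Path


def write_html_as_doc(path, title, paragraphs):
    """Write a minimal HTML document to `path` (expected to end in .doc)."""
    # Escape the text so characters like & and < survive as literal text.
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    document = (
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n<h1>{html.escape(title)}</h1>\n{body}\n</body>\n"
        "</html>\n"
    )
    Path(path).write_text(document, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical output name; Word should open this as an HTML document.
    write_html_as_doc("report.doc", "Daily report", ["First paragraph.", "Second one."])
```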

Batch convert xls-Files to csv

I need to convert over 100 Excel files to CSV. Worse, these files consist of multiple sheets and I only need one of them.
At first I stumbled upon the Perl program xls2csv. Luckily, at the bottom of the "XLS file conversion" page, I even found a convenient script that converts all sheets into separate CSV files. Unfortunately, this converter is broken and skips lines.
I also tried pyodconverter but that only converts the first sheet.
Any suggestions? It would be ok if that conversion had to be done on Windows though I would really prefer Linux. And if it has to be Windows it would be nice if it wouldn't need an Excel installation.
There's a very useful java library called Apache POI at http://poi.apache.org/
The following link provides an example application that converts xls to csv.
http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
If you know Java, you can adjust it to your needs. Since it's Java, it also runs on Linux.
You could also have a look at StatTransfer... (Windows only, I'm afraid)
I know this is late but there is actually an HTA (HTML Application) which can do this. The details and download link can be found here.

How can I convert an Excel file on a Linux server to a delimited text file?

I am running a Linux server and one of our suppliers only knows how to send me an Excel file which I need to import into our system daily. Does anyone know of a good way to export the Excel file to a delimited file? Preferably with php or perl.
Thanks!
Chris Edwards
The Java library Apache POI does this quite well, with a very simple API.
http://poi.apache.org/
OpenOffice (or LibreOffice) also has scripting support, although I have never looked at it. It seems it would be straightforward to open the Excel file in Calc and then do a Save As .csv operation.
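
For what it's worth, a scripted version of the "open in Calc, save as CSV" idea might look like the Python sketch below. It assumes a LibreOffice install whose `soffice` binary is on the PATH and uses its headless `--convert-to` option. The directory names are hypothetical. As far as I know, this route exports only the first sheet of each workbook, so it may not help if you need a different one.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, outdir, soffice="soffice"):
    """Return the argument list for a headless conversion to CSV."""
    return [
        soffice,              # assumed to be on PATH
        "--headless",         # run without opening a window
        "--convert-to", "csv",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_all(indir, outdir, soffice="soffice"):
    """Convert every .xls/.xlsx file in `indir`, one process per file."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    for source in sorted(Path(indir).iterdir()):
        if source.suffix.lower() in {".xls", ".xlsx"}:
            # check=True raises CalledProcessError if soffice exits non-zero.
            subprocess.run(build_convert_command(source, outdir, soffice), check=True)


if __name__ == "__main__":
    convert_all("incoming", "csv_out")  # hypothetical directory names
```

A daily import could call `convert_all` from cron and then read the resulting CSV files from PHP or Perl as usual.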

Use OpenOffice Calc to open Excel files and convert to CSV or tab-delimited

Is there any type of automation available where I can use OpenOffice Calc to open Excel files and convert them to CSV or tab-delimited files?
I'm currently using PHPExcel to open the files and iterate through them and import each row into a database but have begun to run into memory issues with large files and need another alternative.
These are xls and xlsx files so it has to work for all of them.
If there is, how would I go about programming this in PHP?
If you have other alternatives, please feel free to suggest them.
OpenOffice can be run in server mode and used to convert files between a number of supported formats.
I have used this mainly with Java through the JODConverter library, available at http://www.artofsolving.com/opensource/jodconverter
A quick web search brought up http://sourceforge.net/projects/phopo-org/ which claims to be a PHP implementation.

Mass convert Excel files into tab-delimited text files

Is there a tool to convert a large number of Excel files into tab-delimited files automatically?
I just threw this together; it's not pretty, but it should do what you need. Tested on Windows XP / Office 2007.
download from: http://stembro.byethost17.com/utility_scripts/xl2tab/xl2tab.html
Extract the xl2tab.vbs file to the directory containing the excel files and double-click to run. It will place the converted files into a new directory called "output." The original directory-structure remains intact within the output folder.
I don't think there are any good free tools to do so right now, but you could look into using the OpenOffice API to write something:
http://www.oooforum.org/forum/viewtopic.phtml?t=7657&highlight=convert+xls+csv+command+line
Or, for a quick and dirty solution, you could record an OpenOffice Calc macro that does it, and launch that macro from the command line.
This might also help http://dag.wieers.com/home-made/unoconv/
Convert to CSV, and then maybe replace the commas with tabs?
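
If you go the CSV-then-tabs route, a plain search-and-replace on "," will break any field that itself contains a comma. A safer sketch, using Python's standard `csv` module to parse and re-write each row, could look like this. The function name `csv_to_tsv` and the file names are only illustrative, and UTF-8 input is an assumption.

```python
import csv


def csv_to_tsv(csv_path, tsv_path):
    """Re-write a comma-separated file as a tab-separated one."""
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(tsv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        # csv.reader handles quoted fields, so embedded commas stay intact.
        for row in csv.reader(src):
            writer.writerow(row)


if __name__ == "__main__":
    csv_to_tsv("input.csv", "output.txt")  # hypothetical file names
```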
