Convert Excel documents to wiki markup [closed] - excel

Is it possible to convert Excel spreadsheets into MediaWiki markup? I stumbled upon strange recommendations to export to HTML and then convert that into markup. Is there a better solution, maybe exporting to XML first or converting directly?

I know of three options:
Install a WYSIWYG editor extension like FCKeditor (see also Official). Advantage: fairly easy Paste As Word (and therefore also as Excel) button. Disadvantage: installation can be tricky.
Use a macro in Excel. Advantage: a one-click creation of markup. Disadvantage: client-side solution (so need it for all users).
My preference is the FCKeditor option because once it is installed it works pretty well.

You can do this in a graphical interface, with proper software.
Install LibreOffice and add the "wiki publisher" extension, often contained in a package named libreoffice-wiki-publisher.
Open your spreadsheet with LibreOffice Calc, copy your table.
Open a new Writer document, paste with Edit > Paste special (RTF style).
Find "Export" in the menu, select "MediaWiki (txt)" in the format dropdown and confirm.
(Optional.) In your preferred text editor, remove any excess table HTML markup with a couple of simple text replacements and replace the first line {| with {| class="wikitable" to have pretty backgrounds and borders.
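The class replacement in the optional step can also be scripted. This is a minimal Python sketch, assuming the exported wikitext is already in a string; the function name add_wikitable_class is made up for illustration and is not part of LibreOffice or MediaWiki. It does not attempt the HTML cleanup, since the exact markup to remove depends on the export.

```python
def add_wikitable_class(wikitext: str) -> str:
    """Replace the first table-opening line with {| class="wikitable"."""
    lines = wikitext.splitlines()
    for i, line in enumerate(lines):
        # A MediaWiki table starts on a line beginning with "{|".
        if line.lstrip().startswith("{|"):
            lines[i] = '{| class="wikitable"'
            break  # only the first table is touched
    return "\n".join(lines)


if __name__ == "__main__":
    exported = "{|\n|-\n| a\n| b\n|}"
    print(add_wikitable_class(exported))
```

Note that any attributes already on that first line are discarded, which matches the "replace the first line" instruction above.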
LibreOffice is free/open source software and is better than Excel at handling Excel's own spreadsheets, as you see. There used to be an Office plugin too but I've not heard of anyone using it recently.
You may need table styles, available in LibreOffice 5+; but in the meantime you can just apply CSS classes to your MediaWiki table.
Alternatively, just copy and paste your table into a page powered by VisualEditor, which is quite good for tables. If your wiki doesn't have it, you could still use the MediaWiki.org sandbox: paste your rich text, click the pencil button at the top and then "wikitext/source editing", cut the wikitext and paste it into your wiki.

My port of Shan Carter's Mr. Data Converter now supports the Wiki table format. You can copy & paste directly from Excel or from a CSV file.
http://thdoan.github.io/mr-data-converter/

Here is a simple Python script that I threw together for my needs. It doesn't handle cell formatting or anything of that nature, but if you just need to get a large table into the MediaWiki format, it'll do the trick. It depends on xlrd.
Usage of this script is as simple as
python xl2wiki.py input.xls
If you want to save the output to another file, just do
python xl2wiki.py input.xls output.txt
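The script itself is not reproduced in this post. As a rough idea of what such a converter might look like, here is a minimal sketch, not the author's actual xl2wiki.py. The function names, the choice to treat the first row as a header, and the nowiki escaping of pipe characters are all my assumptions. Only read_xls_rows needs xlrd.

```python
import sys


def rows_to_wikitable(rows, header=True):
    """Turn a list of rows (lists of cell values) into MediaWiki table markup."""
    lines = ['{| class="wikitable"']
    for i, row in enumerate(rows):
        lines.append("|-")  # row separator
        marker = "!" if header and i == 0 else "|"  # "!" marks header cells
        for cell in row:
            # Assumption: a literal pipe would be read as table syntax,
            # so wrap it in <nowiki>.
            text = str(cell).replace("|", "<nowiki>|</nowiki>")
            lines.append(f"{marker} {text}")
    lines.append("|}")
    return "\n".join(lines)


def read_xls_rows(path, sheet_index=0):
    """Read all rows of one sheet with xlrd (third-party, .xls files)."""
    import xlrd  # imported lazily so the formatter works without it

    book = xlrd.open_workbook(path)
    sheet = book.sheet_by_index(sheet_index)
    return [sheet.row_values(i) for i in range(sheet.nrows)]


if __name__ == "__main__" and len(sys.argv) > 1:
    markup = rows_to_wikitable(read_xls_rows(sys.argv[1]))
    if len(sys.argv) > 2:
        with open(sys.argv[2], "w", encoding="utf-8") as fh:
            fh.write(markup)
    else:
        print(markup)
```

Like the original, this ignores cell formatting, merged cells and multiple sheets.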

MediaWiki supports HTML syntax for tables. The wikitext doesn't look nice and is harder to edit, but if you are just going to copy&paste anyway, it works. And there should be plenty of tools for converting from Excel (or CSV, ODS) to HTML.
Damn, I should find the time to add native CSV support to MediaWiki.
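For the CSV-to-HTML route, a few lines of Python are enough. This is only a sketch under my own assumptions: the function name csv_to_html_table is invented, the first CSV row is treated as the header, and the table gets the wikitable class. It uses nothing beyond the standard library.

```python
import csv
import io
from html import escape


def csv_to_html_table(csv_text: str) -> str:
    """Convert CSV text into an HTML table that can be pasted into a wiki page."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    out = ['<table class="wikitable">']
    for i, row in enumerate(rows):
        tag = "th" if i == 0 else "td"  # assumption: first row is a header
        # escape() keeps <, > and & in the data from being read as markup
        cells = "".join(f"<{tag}>{escape(cell)}</{tag}>" for cell in row)
        out.append(f"<tr>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)


if __name__ == "__main__":
    print(csv_to_html_table("Name,Qty\nApple,3\n"))
```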

I use this macro available on the main Wikipedia site.
It converts Excel tables to wiki formatting. The output is pretty close to the original Excel file: it transfers font formatting and cell colors. There are limitations on the borders, but those come from MediaWiki itself.
You can find the code at:
https://de.wikipedia.org/wiki/Wikipedia:Technik/Text/Basic/EXCEL-2003_Tabellenumwandlung_VBA

It's 2021 now.
You can copy basic Excel spreadsheets directly into MediaWiki's Visual Editor.
The only thing that's missing in the copy/paste method is cell formatting.

Related

DataTables export to CSV but maintain styling / force formatting

Using datatables and the tabletools plugin which works.
I am however looking for a way to also export the styling of the table if this is possible?
I am creating an 'online' version of a commonly used Excel sheet used in my company. Users keep messing up the formulae, deleting rows, etc., and although I can create something much simpler and more maintainable in the browser, it needs to have the same styling as before once exported.
Just wondering if this is possible?
I could also use the copy to clipboard feature of tabletools if that helps the case.
CSV is a dumb format. It only knows about data and is not meant to store formatting information.
If you want to preserve the formatting, you need to export your file as XLS or XLSX (or PDF if you don't want people to edit the file).
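To see how little CSV carries, here is a small Python illustration (my own example, standard library only, with a made-up roundtrip helper). Even the cell type is lost on the way through, never mind fonts or colors.

```python
import csv
import io


def roundtrip(rows):
    """Write rows to CSV text and read them straight back."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return list(csv.reader(io.StringIO(buf.getvalue())))


if __name__ == "__main__":
    # The number 9.5 goes in as a float and comes back as plain text;
    # there is nowhere in the file to record styling at all.
    print(roundtrip([["Item", "Price"], ["Widget", 9.5]]))
```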

How to make a table in Xpages that works like Excel

I have an excel file with data and need to make this available on web.
The web version of the Excel file needs to have the following features:
Switch between read and edit mode
All cells should be editable at the same time
Inline editing of each cell
Save all cells that have been changed with a single button.
Ability to add and remove rows
Store values in notes document(s)
I have looked at the Dojo Grid JSON REST control in the extension library sample database. It does basically all that I want, but I am not happy with the presentation, and it seems a bit limited, as I may later need to add other actions to the table cells.
I am looking for an html table version
Which controls should I use to accomplish this? and how can I create a submit button that saves all cells/rows?
Thanks for helping out
Thomas
There is also a project on OpenNTF that gives you a full-fledged spreadsheet that can even load Excel files. It is based on the open source ZK-Spreadsheet.
Have a look!
The OpenNTF project was one of the winners in the first OpenNTF contest.
All of those options are possible with the EXTJS grid
You can see some examples here
http://demo.xomino.com/xomino/extjs.nsf
or on the blog
http://xomino.com/extjs
but also check out the examples on the sencha page
http://docs.sencha.com/extjs/4.2.2/extjs-build/examples/

PostgreSQL: Copy/paste resulting with headers into Excel without code

I used MS SQL Server 2008 R2 (MS SQL), where I could right-click the query result and copy/paste it with headers to Excel for easy exploration. Now with PG Admin (PostgreSQL) I have to do an export (File > Export > CSV) and then a bunch of Excel steps (Text To Columns).
Is there an easy way to copy/paste the query result with headers into Excel?
For pgAdmin 4, there is an option to "Copy with headers". It is a drop-down beside the copy button in the Query Tool menu.
PgAdmin seems to make the semicolon the default field separator. Excel seems to like tabs by default.
You could try to change Excel, or just use the "text to columns" feature each time.
I personally would go to Preferences->Query tool->Results grid and change the following
Result copy quote character: "
Result copy field separator: Tab
Copy column names: True
This will make it behave more like SQL Server Management Studio.
There are a lot of different ways to accomplish what you want here. The question is a bit confusing because you are talking about Excel, but then you talk about '/var/lib/postgres/myfile1.csv', which makes me think you are now using some flavor of Linux.
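To show why those three settings help, here is a small Python sketch of the text they should put on the clipboard: tab-separated fields, a double quote as the quote character, and the column names as the first line. It does not touch pgAdmin; the function name and sample data are made up for illustration.

```python
import csv
import io


def to_clipboard_text(columns, rows):
    """Build tab-separated text with a header line, quoting only when needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", quotechar='"', lineterminator="\n")
    writer.writerow(columns)  # "Copy column names: True"
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_clipboard_text(["id", "name"], [[1, "Alice"], [2, "Bob"]]))
```

Excel splits pasted text on tabs by default, so text in this shape lands in separate columns without the Text To Columns step.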
I'm using Ubuntu 12.04 with pgAdminIII 1.16.0. And I have Open Office installed with LibreOffice 3.5.4.2 as the Excel replacement.
I'm not sure why you want to take the information out of the grid in pgAdminIII, but assuming you just want to move the data over to a spreadsheet to play with it, about the easiest way is to run your query, click the upper left corner of the results (which, just like in a spreadsheet, selects everything) and copy. Then you should be able to open LibreOffice and paste in the information. It will bring up the same dialog as you would see when importing a CSV file.
Also, you should be able to start psql and then do a "COPY" command. If you get a permissions error, then try the suggested "\COPY" instead. Please see the PostgreSQL docs. Here is a link to a wiki page here.
If I'm missing what you are trying to do, please ask questions in the comments section, and I'll try to improve my answer accordingly.
You have to set your query tool output to text rather than the data grid. That way the column names and the query results are all in the same cut-and-paste text. When you do this you are no longer dealing with CSV: the whole result set, field names included, comes over as plain text in the cut-and-paste process.
Answering to quite an old post:
The answer by @Phillip Fleischer seems to be the best way, at least in pgAdmin III. But in pgAdmin III version 1.22.2 (the one I am using), the settings mentioned are found under File > Options > Query tool > Results grid instead of Preferences....

Working with Office "open" XML - just how hard is it?

I'm considering replacing a (very) large body of Office-automation code with something that works with the Office XML format directly. I'm just starting out, but already I'm worried that it's too big a task.
I'll be dealing with Word, Excel and PowerPoint. So far I've only looked at Word and Excel. It looks like Word documents should be reasonably easy to manipulate, but Excel workbooks look like a nightmare. For example...
In Word, it looks like you could delete a paragraph simply by deleting the corresponding "w:p" tag. However, the supplied code snippet for deleting a row in Excel takes about 150 lines of code(!).
The reason the Excel code is so big is that deleting a row means updating the row indexes of all the subsequent rows, fixing up the "shared strings" table, etc. According to a comment at the top, the code snippet is not even complete, in that it won't deal with a workbook that has tables in it (I can live with that).
What I'm not clear on is whether that's the only restriction that the sample code has. For example, would there also be a problem if the workbook contained a Pivot Table? Or a chart that references data from the same sheet? Or some named ranges? Wouldn't you also have to update the formulae for any cells (etc.) that referenced a row whose row index had changed?
[That's not to mention the "calc chain", which (thankfully) I think you can simply delete since it's only a cache that can be rebuilt.]
And that's my question, woolly though it is. Just how hard do you have to work to do something as simple as deleting a row properly? Is it an insurmountable task?
Also, if there are other, similar issues either with Excel or with Word or PowerPoint, I'd love to hear about them now, before I waste too much time going down a blind alley. Thanks.
Having worked with the Open XML SDK 2.0 for almost two years now I can say that doing seemingly trivial tasks can take many hours and sometimes days to figure out how to do it properly. For example, deleting an Excel row should be fairly straightforward and easy to do right? Nope because not only do you need code to delete your row, but then you have to update all the row indices, update any merged cell references, update hyperlink references, etc. Our internal delete method is close to 500 lines of code to just delete a row and I'm sure we don't have all the cases accounted for either.
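To make the bookkeeping concrete, here is a minimal Python sketch of only the first step: removing a row element from a worksheet part and renumbering the rows and cell references after it. It is not the Open XML SDK and not the snippet from the question. The function name delete_row is invented, and it assumes every row and cell carries an explicit r attribute. It deliberately ignores formulas, merged cells, hyperlinks, shared strings, tables and named ranges, which is exactly where the other hundreds of lines go.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
ET.register_namespace("", NS)  # keep the default namespace on output


def delete_row(sheet_xml: str, row_number: int) -> str:
    """Remove one row from worksheet XML and shift later rows up by one."""
    root = ET.fromstring(sheet_xml)
    sheet_data = root.find(f"{{{NS}}}sheetData")
    for row in list(sheet_data.findall(f"{{{NS}}}row")):
        r = int(row.get("r"))
        if r == row_number:
            sheet_data.remove(row)
        elif r > row_number:
            row.set("r", str(r - 1))
            for cell in row.findall(f"{{{NS}}}c"):
                # A cell reference such as "B3" is column letters + row number.
                col = re.match(r"[A-Z]+", cell.get("r")).group(0)
                cell.set("r", f"{col}{r - 1}")
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    f'<worksheet xmlns="{NS}"><sheetData>'
    '<row r="1"><c r="A1"><v>1</v></c></row>'
    '<row r="2"><c r="A2"><v>2</v></c></row>'
    '<row r="3"><c r="A3"><v>3</v></c><c r="B3"><v>4</v></c></row>'
    "</sheetData></worksheet>"
)

if __name__ == "__main__":
    print(delete_row(SAMPLE, 2))
```

Any formula elsewhere in the workbook that pointed at row 3 would now be wrong, which is why a real implementation is so much longer.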
The biggest complaint I have is the lack of documentation on how to do the most common tasks. The MSDN section on the Open XML SDK is very limited and whenever you need to do anything complicated you are really on your own. I've had to read the Open XML standard a lot to figure out what certain elements mean and how they should be implemented since I could find very little online.
The other challenging part is if you insert an element in a spot where it doesn't belong or put an invalid attribute on an element you will get a corrupt file when you try and open it. Most of the time you will not get any information on what caused the error and you will have to look at the Open XML standard spec to see what you did wrong.
If you need a fast turnaround time on converting that Office automation code into Open XML and what you are doing is not really basic, then I would say pass. If you have time and the patience to read up on the Word, Excel and PowerPoint XML structures and get familiar with how they relate then I say go for it. In my opinion it is really the only way to have very fine control over these office documents, but there will be a great learning curve when you start.
Oh and just for fun here is how much code is needed to add a comment to an Excel cell.
Just for completeness, here are some libraries I found for working with Excel XML:
www.extremexml.com - a layer on top of the Open XML SDK classes; focusses on injecting data into an existing spreadsheet; handles many of the cross-reference problems I identified in my question. Open source but GPL2 not LGPL. Code looks nice, and documentation is excellent. Does not appear terribly active on codeplex though.
Closed XML - another layer on top of the Open XML SDK - again open source, but with a less restrictive license (MIT). Looks nice, and looks more "active" than the above.
SpreadsheetLight - from what I can tell, a closed-source library sitting atop the Open XML SDK classes. Targeted more at those looking to create a spreadsheet from scratch rather than making changes to existing spreadsheets.
Here is another third party library dedicated to working with OpenXML:
http://www.officewriter.com
In the example cited by amurra above of deleting Excel spreadsheet rows, this is a single method call with this tool. It updates formulas and all the other references that would otherwise seem to require those 500 lines of code.
The OpenXML SDK itself is a great tool for very simple things, but you still have to concern yourself with a lot of the internals of the file format and packaging structure to get things really right.
Here are some additional libraries that can manipulate OOXML formats:
- GemBox.Spreadsheet (XLSX)
- GemBox.Document (DOCX)
GemBox also published some articles that demonstrate how to manipulate the OOXML file format with pure .NET (without using any library). I think you'll find these interesting:
www.codeproject.com/Articles/15593/Read-and-write-Open-XML-files-MS-Office
(Introduction to SpreadsheetML format and an explanation on how we can read and write worksheet's cell content)
www.codeproject.com/Articles/649064/Show-Word-File-in-WPF
(Introduction to WordprocessingML format and demonstration on how we can read document's text)

Convert richtext strings to excel

I have a form that has TinyMCE for richtext formatting. All of our data is available to export as an HTML report, PDF Report, and Excel Spreadsheet (report).
The fields, that we allow richtext in, show up as the formatted values in both the HTML and PDF reports, but in Excel we show them as strings. For instance:
<b>this part is bold</b><br />line 2 here.
I need a way to make that show up as bold text with a line break in Excel rather than just showing that string, or at least a way to strip the HTML tags out and show plain text (though I would really like to at least keep the line breaks). Is there some type of macro I can include in the Excel download, or some C++ program that can convert it, or something?
Thanks for your time!
I've done something similar with PHPExcel
The trick is to take your formatted data and find a pattern. In your case, it would probably be table rows/table cells. Iterate through that structure setting the excel cell values as you go. For complex formatting you could fairly simply regex replace what is necessary to get formatted as you desire. The theory may sound a little complicated, but once you get down to it, it's only an hour or two's worth of work.
Certainly there are equivalent programs based on other server technologies. But this one has worked brilliantly for me over the years, and I trust it to work on sites for very big clients with crazy inbound traffic numbers...and it's never failed. It's the only reliable way I've found to write perfect, properly formatted Excel without requiring the user to jump through hoops to get a specific browser.
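For the fallback mentioned in the question (strip the tags but keep the line breaks), here is a minimal Python sketch using only the standard library. It is not PHPExcel and says nothing about how the spreadsheet itself is written; the names _TextExtractor and html_to_text are made up for illustration. Keeping the bold formatting would need a library that supports rich text runs inside a cell.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, turning <br> tags into newlines."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <br />, via handle_startendtag.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment: str) -> str:
    """Strip tags from an HTML fragment but keep line breaks."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


if __name__ == "__main__":
    print(html_to_text("<b>this part is bold</b><br />line 2 here."))
```

The resulting string can be written to a cell as-is. Whether Excel shows the newline depends on wrap text being enabled for that cell.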
