A new idea on how to beat the 32,767 text limit in Excel

As many others have asked in the past: is there a way to beat the 32k-character limit per cell in Excel?
I have found ways to do it by splitting the workload into two different .txt files and then merging them. However, it is a giant PITA, and more often than not I end up only using Excel up to its limits, because I no longer have time to validate the data after the .txt file merges. It is a long and tedious process, IMO.
I think the limitation is there because it was coded in when Microsoft developed Excel, and they have yet to raise it (the limit is still the same in the 2013 version, so upgrading would do no good).
I also know that many will say that if you need that much information in a single cell, you should use Access. Well, I have no idea how to use Access or how to import a tab-delimited file into Access like you would into Excel, and even if I could figure that out, I would still have to learn all the new commands and their Excel equivalents, if there even is such a thing.
So I was browsing some blog posts the other day on how to beat software limitations, and I read something about reverse engineering.
Would it be possible to load Excel into a hex editor, go in and change every instance of 32767 to something greater?

While 32767 may seem like an arbitrary number, it's actually the upper limit of a 16-bit signed integer (called a short in C). The range of a short goes from -32768 to 32767.
A 16-bit integer can also be unsigned, in which case its range is 0 to 65535.
Since it's impossible for a cell to have a negative number of characters, it seems odd that Microsoft would limit a cell's length based on a signed rather than unsigned 16-bit integer. When they wrote the original program, they probably couldn't imagine anyone storing so much information in a single cell. Using shorts may have simplified the code. (My first computer had only 4K of memory, so it's still amazing to me that Excel can store 8 times that much information in a single cell.)
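To make the ranges above concrete, here is a minimal Python sketch. It only illustrates 16-bit integer limits; the helper name fits_in_short is made up for this example, and nothing here reflects how Excel is actually implemented internally.

```python
import struct

# Ranges of 16-bit integers (two's complement for the signed case).
SHORT_MAX = 2**15 - 1    # 32767
SHORT_MIN = -(2**15)     # -32768
USHORT_MAX = 2**16 - 1   # 65535


def fits_in_short(n):
    """Return True if n can be stored as a 16-bit signed integer."""
    try:
        struct.pack("<h", n)  # "h" is a 16-bit signed integer
        return True
    except struct.error:
        return False


print(SHORT_MIN, SHORT_MAX, USHORT_MAX)
print(fits_in_short(32767), fits_in_short(32768))
```

A length counter stored this way simply cannot represent 32768, which would explain why the limit sits exactly at 32767.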
Microsoft may have kept the 32767 limit to maintain backward compatibility with previous versions of Excel. However, that doesn't really make sense, because the row and column counts greatly increased in recent versions of Excel, making large spreadsheets incompatible with previous versions.
Now to your question of reverse-engineering Excel. It would be a gargantuan task, but not impossible. In the early '90s, I reverse-engineered and wrote vaccines for a few small computer viruses (several hundred bytes). In the '80s, I reverse-engineered an 8KB computer chess program.
When reverse-engineering an executable, you'll need a good disassembler or decompiler. Depending on what you use, you may get assembly-language or C code as the output. But note that this will not be commented code, and you will not see meaningful variable or function names. You'll have to read every line of code to determine what it does. And you'll quickly discover that the executable is the least of your worries. Excel's executable links in a number of DLL files, which would also need reverse-engineering.
To be successful, you will need an extensive knowledge of Windows programming in addition to C or Intel assembly code – not to mention a large amount of patience. Learning Access would be a much simpler task.
I'd be interested in why 32767 is insufficient for your needs. A database may make more sense, and it wouldn't necessarily need to duplicate the functionality of Excel. I store information in a database for output to Web pages, in which case I use HTML+JavaScript for anything that needs to be interactive.

In case anyone is still having this issue:
I had the same problem when generating a pipe-separated file of longitudinal research data. The header row exceeded the 32767 limit. This is not an issue unless the end user opens the file in Excel. The workaround is to have the end user open the file in Google Sheets, perform the text-to-columns transformation, then download the result and open it in Excel.
https://support.clarivate.com/ScientificandAcademicResearch/s/article/Web-of-Science-Length-limit-of-cell-contents-in-Excel-when-opening-exported-bibliographic-data?language=en_US
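If you generate such files yourself, you could check for the problem before handing the file over. This is only a rough sketch: the function name oversized_lines and the sample column names are hypothetical, and it assumes the trouble comes from a whole line landing in one cell before text-to-columns is applied.

```python
EXCEL_CELL_LIMIT = 32767  # maximum number of characters Excel keeps in one cell


def oversized_lines(text, limit=EXCEL_CELL_LIMIT):
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


# Hypothetical data: a header with 4000 made-up column names, then one data row.
header = "|".join(f"var_{i:05d}" for i in range(4000))
sample = header + "\n" + "1|2|3\n"
print(oversized_lines(sample))
```

Any line reported here would be truncated if the file were opened directly in Excel and split afterwards.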

Jack Straw from Wichita (https://stackoverflow.com/users/10327211/jack-straw-from-wichita), surely you can import a pipe-separated file directly into Excel using Data>Get Data? For me it finds the pipe and treats the piped file in the same way as a CSV. Even if it did not do so for you, the import has an option to specify the separator used in your text file.
Kind regards
Sefton Hall

Related

Standard method for creating file read/write 'library'?

I am attempting to create a file read/write 'library' using Excel VBA for an old file format. Just for info, the format is LIS79, an old oil industry format for writing well-site data to tape - largely streams of wellbore measurements like density, resistance, temperature etc. The full spec is here (PDF doc) - it's pretty long and boring.
As I'm using Excel VBA I guess it's not really a library, but a collection of subs and functions etc.
I have been tapping away for a month or so and am making good progress, though it's becoming increasingly complicated - the number of subs and functions I need to write keeps on growing.
I figure I'm probably not the first person to try to write a file read/write library from its specification. So I've searched StackExchange, searched with Google and browsed programming books at the local university library to see if any sort of standard process or method exists that would make the whole task a bit simpler. Alas, I haven't been able to find anything, though I have no formal programming background, so it's possible I don't know precisely what to search for.
Does anyone know of a standard method, procedure or guidelines for creating file read/write libraries that you could refer me to?
Or is it just a matter of persevering - file read/write libraries can be long and complicated things to create?
Many thanks.

What is the common knowledge about NPOI, EPPlus and Koogra as of 2015?

Yes, Koogra only reads. EPPlus only supports .xlsx and is buggy in edge cases.
What else should one know for choosing between them?
Is one of them much slower than others?
NPOI seems way too complicated and is a Java port, so is it worth using?
Should one use EPPlus for .xlsx and NPOI for .xls?
What is the general knowledge about them today?
Jet/ACE OLE DB reads a worksheet either as strings or as typed columns, so you either lose numeric precision or must have headers in the first row. Thus, they are to be avoided.
No library supports XLSB.
Speed.
For a large XLS, the time of reading for NPOI:Jet:Koogra:EDR is 14:8:7:5.
For the same XLSX, the time for EPPlus:NPOI:Koogra:EDR is 52:36:20:16.
For relatively small files with many tabs EPPlus can be a bit faster than EDR.
Errors (#DIV/0!, #VALUE!, etc.).
EDR and Koogra don't explicitly support errors. EDR reads them as usual strings, Koogra -- as blank cells.
NPOI and EPPlus do.
Koogra reads dates as [OLE date] numbers, and they are indistinguishable from real numbers. Also, it sometimes reads numbers with many decimal digits incorrectly. EDR gets this right. So, no to Koogra.
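For reference, an OLE date is a count of days since 30 December 1899, with the fraction giving the time of day. The sketch below shows the conversion in Python rather than .NET, and the function name ole_to_datetime is made up for illustration. It covers non-negative values only, and the caller still has to know which columns hold dates, since the raw number carries no such hint.

```python
from datetime import datetime, timedelta

OLE_EPOCH = datetime(1899, 12, 30)  # day zero of the OLE Automation date scale


def ole_to_datetime(value):
    """Convert a non-negative OLE date number to a datetime."""
    if value < 0:
        # Negative OLE dates encode the time fraction differently; not handled here.
        raise ValueError("negative OLE dates are not supported by this sketch")
    return OLE_EPOCH + timedelta(days=value)


print(ole_to_datetime(42005.5))
```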
NPOI is complicated, 5 dlls of 4 MB. Koogra and EDR are simple, 200 KB and two dlls (themselves and zip) each.
EDR works as an IDataReader, so it reads data sequentially. It also has a built-in function to get a DataSet. With sequential reading you can only go through the first sheet in the workbook. Koogra supports random access to cells and sheets.
EDR is based on SharpZip; Koogra is based on Ionic.Zip. The former allows opening a file from a .zip as a Stream, which can be beneficial for other parts of the project.
I haven't looked at writing aspects of NPOI, so without the need to distinguish errors, I would go with EPPlus for .xlsx and with EDR for reading .xls.

Importing from Excel to Visma Contracting

My company uses a program called Visma Contracting. In this program you type in a product number (example: 10 240 75) and how many of that item were used (example: 4). We (my co-workers and I) put these numbers into an Excel sheet and deliver it to the guy in charge of systematizing them (column A being the product number and column B the amount used). From there it is retyped from Excel into Visma. This is madness! There must be a way for the two programs to talk to each other? I have talked to Visma support and they are giving me nothing other than a no. I wish I could give more info about this Visma stuff, but I fear that it is a locked program. I have also been searching for third-party software that can eliminate this massively annoying problem, with no luck. Does anyone have anything that might ease my itch?
Thanks in advance!
Sounds impossible but...
If you want to try blind keystrokes, check out Mouse and Keystroke Recorder. I have some experience automating stuff with this program. It works sometimes, with varying degrees of reliability.
Be forewarned that nobody recommends this as it could cause problems. It simply plays back keystrokes without being aware of what it is doing. When used with care it can work but it could be dangerous.
Or use SendKeys from Excel VBA; that might work better as the data is already in Excel. But the same warnings apply. Use at your own risk.

How to read excel(2007+ xlsx) sheet using actionscript(AIR)?

How to read excel(2007+ xlsx) sheet using actionscript(AIR)?
as3xls
An Actionscript 3 library for reading and writing Excel files. Currently reading numbers, text, and formulas from Excel version 2.0-2003 and writing numbers, text, and dates to Excel 2.0 is supported. No server-side help is needed.
SUPPORT INFORMATION
Documentation and samples are at http://code.google.com/p/as3xls/
I wrote this: https://github.com/childoftv/as3-xlsx-reader I'd love to know if it helps
Do you have any idea how... Inefficient this is?
Excel uses a complex setup for files, and unless you want to write a full-scale parser for its spreadsheets (which, believe me, will be difficult, if only to figure out what the format chars do), you'd be better off finding another solution.
Say, using a "save to XML" option would make your job a few thousand times easier, without exaggeration. AS3 has no native support for Excel, there is no real point for it to have such. But it has great integrated methods for working with XML.
If possible, save the Excel files to XML and parse those.
Better still, use databases, and parse them as XML through PHP.
I did a search and came up with this: http://code.google.com/p/php-excel-reader/
Once you've got it in PHP, passing it on to Flash is no problem at all. I'd recommend turning it into straight arrays of objects and converting it to AMF3 via Zend_Amf, AMFPHP or WebOrb, whichever one you're most comfortable with. You can then create tables, manipulate the data or whatever you like. It'd also be a lot faster and lighter than using XML.
PK
I took a look at the xlsx breakdown and it would take me 1 week to write an xlsx writer that could do basic formatting and formulas. I've only spent 1 hour perusing through the directories in an xlsx file and all you'd have to do is create the same directory structure...mostly cut and paste some strings..and then zip it and call it xlsx.
I tested this theory by manually making an xlsx file using 7zip. I downloaded childoftv's reader and, though I don't need the reader, the package includes a few zip/unzip classes that would prove helpful for anyone who wants to make an xlsx writer.
Long story short, the setup isn't complex, somebody just has to take a week out of their busy schedule to do it. I need this functionality so if nobody's done it yet, then I'll have to. Hopefully my search will find something better than a forum where the general consensus is "it's too hard, give up."
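To show what "a directory structure that gets zipped" looks like in practice, here is a small Python sketch (not ActionScript). The helper name list_parts is hypothetical, and the stand-in archive built below holds placeholder XML only, so it is not a workbook Excel would open. It just shows that the package is an ordinary zip whose entries can be listed and read.

```python
import io
import zipfile


def list_parts(xlsx_bytes):
    """Return the names of the entries stored inside an .xlsx package."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        return package.namelist()


# Build a stand-in archive in memory. The entry names follow the usual
# workbook layout, but the XML bodies are placeholders, not valid parts.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("xl/workbook.xml", "<workbook/>")
    package.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")

print(list_parts(buffer.getvalue()))
```

Passing the bytes of a real .xlsx file to list_parts should print a similar list, with the worksheet data under xl/worksheets/.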

Free VB6/VBA profiler and best Excel practices

We have a lot of reports that are generated via VBA & Excel. Only a small percentage of the reports are actual calculations; the majority of the work is SQL calls and formatting/writing of cells. The longest report takes several hours, and the majority take around 20-30 minutes each.
The VBA/Excel code plugs into a dll that the VB6 desktop apps use - it's here that all the sql calls are made. While I am sure that there is room for improvement here, it's not this that concerns me - the desktop apps are fairly snappy.
Two VBA functions are used in abundance: GetRange and SetupCell, and they nearly always appear together. The GetRange function is a wrapper for the Excel.Range object. It takes a sheet and 4 values for the extents of the range. Its main use is to pick the cell for editing. There doesn't appear to be much chance of optimising it, but is it the best way?
Its partner is SetupCell. This takes an Excel.Range object, text and a dozen parameters about the cell (font, borders, etc). Most of these parameters are optional booleans, but again, it seems very wasteful. Some of these can be set afterwards, but some are dependent on the values contained in the cell.
There's quite a lot of code contained in these functions, mainly if statements, and work won't appreciate me posting it.
I guess I've got two questions: is there a better way (and what is it), and is there a free profiler that I can use to see whether the bulk of the time is spent here or in the dll?
Several hours is ridiculous for a report.
If the problem is VBA, buy "Professional Excel Development" (Stephen Bullen, Rob Bovey et al): it comes with a free VBA profiler called PerfMon.
If the problem is Excel Calculation see http://msdn.microsoft.com/en-us/library/aa730921.aspx?ppud=4
But I would guess that the problem is the high overhead associated with referencing things cell-by-cell: you should always work in large blocks of cells at a time.
Have you thought about using an actual reporting solution? What's your backend db? If you are using MSSQL 2000 or higher there is a fairly decent reporting solution you can use free of charge. SQL Server Reporting Services.
It sounds as if the reports are spending most of their time formatting cells. This could be why the reports seem so slow and the desktop app doesn't.
Alternatively, if you know the formatting before hand and it is fairly static, you could pre-format the sheets to cut down on some of the work.
I will throw this in there as well. Most reporting solutions will allow for conditional formatting and such, but since they are designed to work as such performance will be much better than having Excel do it.
This isn't a profiler recommendation, but it is a suggestion for speeding up Excel macros that are spending their time updating the screen. I've had excellent results by turning off screen updating while the macro is running: set Application.ScreenUpdating = False, and also use a number of other similar settings. Just be sure to turn them back on again when the macro finishes :P
It's not free but you can profile with this. I suspect the demo will be adequate to your needs: http://www.aivosto.com/vbwatch.html
It sounds like the VBA code (or the VB code that's writing to the sheets) is doing so line by line. This can take ages and is poor design. Write to Excel as a variant in one go, and format the sheet after the data is all imported.
Thanks
Ross
