Sorting data from excel spreadsheets into new files - excel

My issue comes from the Excel data I currently have, which I need to convert into 4 separate forms, each with different details. The specifics don't really matter, but what I'm trying to do is write some kind of script that would go into this data and extract the stuff I need, saving me tons of time copying and pasting.
The problem is, I'm not really sure where to start. I have done some research, so I am familiar with CSV files, and I already have a pretty good grasp of Java. What would be the best way to approach this problem? From what I have researched, Python is very helpful for this kind of string manipulation. I also know that it could be done in Java using buffered reads and file writes, but I feel like that could get really clunky.
Thanks
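For illustration only, here is a minimal sketch of the Python route, assuming the spreadsheet has first been exported to CSV. The column names and the two layouts in FORM_LAYOUTS are hypothetical stand-ins (the question mentions four forms but does not describe them), and split_into_forms is a made-up helper name, not part of any library.

```python
import csv
import io

# Hypothetical layouts: each output "form" keeps a different subset of columns.
# The real forms and column names are not given in the question.
FORM_LAYOUTS = {
    "contacts": ["name", "email"],
    "billing": ["name", "amount"],
}


def split_into_forms(source, layouts):
    """Read CSV text from a file-like object and return {form_name: csv_text}."""
    rows = list(csv.DictReader(source))
    outputs = {}
    for form_name, columns in layouts.items():
        buffer = io.StringIO()
        # extrasaction="ignore" silently drops columns this form does not list.
        writer = csv.DictWriter(
            buffer,
            fieldnames=columns,
            extrasaction="ignore",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
        outputs[form_name] = buffer.getvalue()
    return outputs


# Stand-in for a CSV file exported from Excel.
sample = io.StringIO(
    "name,email,amount\n"
    "Ada,ada@example.com,10\n"
    "Bob,bob@example.com,25\n"
)
forms = split_into_forms(sample, FORM_LAYOUTS)
```

To write real files instead of in-memory strings, the StringIO buffer could be swapped for open(form_name + ".csv", "w", newline=""). The same DictReader/DictWriter pairing works unchanged.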

Related

How to properly write Excel macros for other people

I recently started writing VBA macros, even though I never learned VBA. So basically I am translating my Python knowledge to VBA, which works most of the time.
However I want to improve my coding skills, especially when it comes to working with other people.
Usually I write a macro and then install it on my colleagues' computers so that they can use it as well. But the way we develop the macro is currently a trial-and-error approach, so I have to edit the code quite often. Currently I just copy and paste the new lines into an email, and the recipient has to replace those lines themselves.
My Question: How do professionals handle updating VBA Macros? Is there something like "patches"? Or can you have a shared "Personal.xlsb", so that all changes are automatically updated for everyone else as well? I am the only person editing the code anyway, the others just use it.
I googled a lot and also looked through stackoverflow, but only found this: How to write a patch to Excel VBA codes? where the answer basically was, that you can't write patches. But there surely must be a way, to update my macros in a better way, right?
Thanks in advance!

Can Haskell [easily] do COM?

Alright, so I don't really know much about COM. What I do know is that if you write code in one of the Microsoft-sponsored programming languages, then you can write something like 3 lines of code to launch Excel, open a blank workbook, stuff some data into the cells and tell Excel to graph it. But I have no idea how this black magic actually works; all I know is that it's related to COM somehow.
Is it possible to do this kind of thing with Haskell? Is it "easy", or is it going to be hellishly difficult? Because if it's easy, I might try and get this to work, but if it's really hard, there are simpler ways to make Excel graph things...
I'm aware that you don't actually need to learn COM just to graph stuff. (E.g., I could use GraphVis or GNUplot, or Google Chart, or write a small Cairo function, or...) I'm interested in how easy or hard it is to do COM with Haskell, and this is just a motivating example.
HDirect used to be the standard, but as it was last uploaded 3 years ago, I imagine it's bitrotted a fair bit.
Looks like there's a new package aimed at doing the same sorts of things.
Sorry, I may be a little bit late.
There is already someone who plays with excel:
Excel Automation with haskell gives a seg fault
I've written some scripts that communicate with Clearcase and Clearquest.
It was quite easy until I ran into problems with a variant StringArray (look at my question on SO).
I've used HDirect in order to generate the Haskell glue code. The procedure is:
launch the OLE/COM Object Viewer and select View Typelib in the File menu
select the DLL (e.g. ccauto.dll for Clearcase)
save the IDL file
run HDirect on this IDL file in order to get the haskell glue code
import it in your project

Does a good wrapper class and/or library for open xml Excel editing exist?

I'm looking for a nice library for editing and/or generating Excel documents on our Windows server. I feel that the Open XML SDK is probably the way to go, but to me the learning curve seems steep and our dev time is limited. I think that it just shouldn't be that difficult to edit an Excel document. I'm ready to reinvent the wheel, but thought it would be worthwhile to ask first whether there is a good project/library out there that wraps Open XML and makes interacting with Excel easier.
In this official MS tutorial, the code that retrieves the value of a cell is dozens of lines long.
http://msdn.microsoft.com/en-us/library/bb739834.aspx?cs-save-lang=1&cs-lang=vb#ManipulateOpenXMLExcelPowerPoint_RetrievetheValueofaCellinaWorksheet
That seems incredibly unwieldy, and I'm hoping for a better interface to this functionality.
Note: I've puzzled about how to make this question more StackOverflow friendly, but I don't know of a better way.
One of the comments mentioned a wrapper library that is exactly what I've been looking for. closedxml.codeplex.com
Try SpreadsheetLight. Disclaimer: I wrote SpreadsheetLight.
And this question is possibly a duplicate of this:
OpenXML libraries (alternatives to ClosedXML)
I use Simple OOXML and it's quite neat. And free.

Data manipulating environment

I am looking for something* to aid me in manipulating and interpreting data.
Data such as names, addresses and the like.
Currently, I am making heavy use of Python to find whether one piece of information relates to another, but I am noticing that a lot of my code could easily be replaced with some sort of query language.
Mainly, I need an environment where I can import data in any format, be it xml, html, csv, or excel or database files. And I wish for the software to read it and tell me what columns there are etc., so that I can only worry about writing code that interprets it.
Does this sound concrete enough? If so, is anyone in possession of such elegant software?
*Can be a programming language, IDE, combination of those.
Have you looked at the Pandas module in Python? http://pandas.pydata.org/pandas-docs/stable/
When combined with the IPython Notebook, it makes a great data manipulation platform.
I think it may let you do a lot of what you want to do. I am not sure how well it handles HTML, but it's built to handle CSV, Excel and database files.
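As a rough sketch of what that looks like in practice, the snippet below loads two small CSV sources with pandas, asks which columns were found, and then relates the two with a query-style join. The data, column names and variable names are invented for illustration. Only read_csv, merge and groupby are actual pandas calls.

```python
import io

import pandas as pd

# Invented sample data standing in for two imported files.
people_csv = io.StringIO("name,city\nAda,London\nBob,Paris\n")
orders_csv = io.StringIO("name,amount\nAda,10\nAda,5\nBob,25\n")

people = pd.read_csv(people_csv)
orders = pd.read_csv(orders_csv)

# pandas infers the columns from the header row, so they can be inspected
# before writing any interpretation code.
columns = list(people.columns)

# Relate one data set to the other, much like a SQL inner join on "name".
joined = people.merge(orders, on="name", how="inner")

# Aggregate the joined rows, much like GROUP BY city with SUM(amount).
totals = joined.groupby("city")["amount"].sum()
```

The same pattern should carry over to Excel sheets and database tables via pd.read_excel and pd.read_sql, though those need an extra engine or connection that is not shown here.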

What would you switch to using instead of excel in a corporate workplace, when you aren't a programmer by trade?

I have a friend who works at a company without any real IT people, and they've gone the classic corporate route of stringing things together with Excel macros whenever they need something. I was trying to figure out what alternatives are available for someone who isn't a programmer by trade.
What is an easy alternative to Excel, without a steep learning curve, when you want to distribute data offline together with forms for manipulating it? I was going to suggest he learn Python and SQLite, but I'm hoping StackOverflow can come up with a wiser answer.
Honestly, for non developers (and if you do not have a dev staff in-house) there really isn't anything wrong with Excel.
That being said, Lightswitch is a new and fairly interesting option for basic forms over data work (although it's still a bit green).
IMO once you go down the route of languages like Python, etc. you're really looking at someone who is going to have to be a programmer (and they may be shooting themselves in the foot on a regular basis).
In that type of environment, users end up with Excel or Access to manipulate data. Excel is convenient where the cells in a row are calculated the same way, but with exceptional cases. Access is better for calculating over multiple rows easily, for data management forms (yes, Excel can do it too, but Access is easier) and for formatted reports.
The best situation I've arrived at in this type of environment is standing up a read-only "data warehouse" that Excel and Access users can link to or download data from to manipulate on their own. For this situation SQL Server is probably the right choice. I use quotes around "data warehouse" because I don't mean it in the technical sense, but rather just as a convenient repository. That way you have one definitive system of record. Then any report that is generated repeatedly in either tool becomes a candidate for incorporation into that warehouse.

Resources