Convert excel formula code to programming language - excel

Hi, I have this code and I want to translate it to another programming language such as Python, Java, or Ruby.
As you can see, this is a raffle game. The user presses F5,
which generates a raffle number such as Aqua 2231 7533. All entries are stored, and at the end one is selected at random.
This is the generation part:
=LOOKUP(RANDBETWEEN(1,3),{1,2,3},{"Aqua","Blue","Red"})&" "&TEXT(RANDBETWEEN(0,9999),"0000")&" "&TEXT(RANDBETWEEN(0,9999),"0000")
And this part is the random selection:
=INDEX(A:A,RANDBETWEEN(1,COUNTA(A:A)))

There are a couple of libraries which may be able to help. I don't know if they support the LOOKUP, RANDBETWEEN or TEXT functions specifically, but as they are open source libraries there's potential for you, or someone you know, to add that particular functionality (all of these projects welcome such input).
The libraries for Python that read Excel and generate Python code are:
PyCel
Formulas and Schedula
xlcalculator (this is my project)
Koala
In each case these libraries have an "evaluator" which can actually run the generated Python code. Some of these evaluators are integrated with the Excel file reading functionality, in other cases it's separate.
PyCel and Koala2 have the evaluation functionality integrated with the Excel file reading functions. xlcalculator has them in separate objects. Formulas and Schedula have them in separate projects.
It's not easy for them to print out or save "vanilla" Python code due to what needs to happen to evaluate the generated code. The evaluation code needs access to the Python implementation of the Excel formula and, as things are at the moment, all of these projects use their own libraries with their own objects and their own mechanisms to call said objects. xlcalculator has attempted to extract the Python implementations of the Excel formulas to have a library anyone can use but it is also the most recently written so who knows how successful that will become. That library is called xlfunctions.
Provided the above-mentioned projects support the LOOKUP, RANDBETWEEN and TEXT functions, they will be able to read the function you've written (without the need for Excel to be installed), translate the function into Python code and execute the resulting code in Python.

It shouldn't be that hard. Just:
Copy the formula and break it up in Notepad++.
For each function used, read its documentation word by word.
Once you understand the data flow (inputs, changes, lists, start and end) in the program, write that understanding down in your mother tongue. You may draw it if that helps.
In the programming language of your choice, implement the "mother tongue" algorithm.
Test the program with exactly the same input you gave to the original Excel formula, and edit if necessary.
Since you didn't share any attempts, this is the best I have. It works for me; sorry if it doesn't solve your problem.
Hope someone can improve or edit my answer for the better. (:
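Following the steps above for the two formulas in the question gives roughly the following Python. This is only a sketch: the function names generate_ticket and pick_winner and the COLOURS list are made up for illustration, and a plain Python list stands in for column A.

```python
import random

# The three labels from the LOOKUP array in the original formula
COLOURS = ["Aqua", "Blue", "Red"]


def generate_ticket(rng=random):
    """Mimic LOOKUP(RANDBETWEEN(1,3),...) & TEXT(RANDBETWEEN(0,9999),"0000") twice."""
    colour = rng.choice(COLOURS)   # one of three labels, equal probability
    first = rng.randint(0, 9999)   # randint is inclusive at both ends, like RANDBETWEEN
    second = rng.randint(0, 9999)
    # :04d pads with leading zeros, like TEXT(n, "0000")
    return f"{colour} {first:04d} {second:04d}"


def pick_winner(entries, rng=random):
    """Mimic INDEX(A:A, RANDBETWEEN(1, COUNTA(A:A))) on a list of entries."""
    return rng.choice(entries)


if __name__ == "__main__":
    entries = [generate_ticket() for _ in range(10)]
    print(entries)
    print("Winner:", pick_winner(entries))
```

Unlike the spreadsheet, nothing here recalculates on F5; each call to generate_ticket produces one new entry, so storing the entries is up to the caller.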

Related

Can Haskell [easily] do COM?

Alright, so I don't really know much about COM. What I do know is that if you write code in one of the Microsoft-sponsored programming languages, then you can write something like 3 lines of code to launch Excel, open a blank workbook, stuff some data into the cells and tell Excel to graph it. But I have no idea how this black magic actually works; all I know is that it's related to COM somehow.
Is it possible to do this kind of thing with Haskell? Is it "easy", or is it going to be hellishly difficult? Because if it's easy, I might try and get this to work, but if it's really hard, there are simpler ways to make Excel graph things...
I'm aware that you don't actually need to learn COM just to graph stuff. (E.g., I could use GraphVis or GNUplot, or Google Chart, or write a small Cairo function, or...) I'm interested in how easy or hard it is to do COM with Haskell, and this is just a motivating example.
HDirect used to be the standard, but as it was last uploaded 3 years ago, I imagine it has bitrotted a fair bit.
Looks like there's a new package aimed at doing the same sorts of things.
Sorry, I may be a little bit late.
Someone has already been playing with Excel:
Excel Automation with haskell gives a seg fault
I've written some scripts that communicate with Clearcase and Clearquest.
It was quite easy until I ran into problems with variant StringArray (look at my question on SO).
I've used HDirect in order to generate the Haskell glue code. The procedure is:
launch the OLE/COM Object Viewer and select View Typelib in the File menu
select the DLL (e.g. ccauto.dll for Clearcase)
save the IDL file
run HDirect on this IDL file in order to get the haskell glue code
import it in your project

Access Excel Programmatically

I am given a huge excel file with a great deal of data and formulas and I need to access the functionality implemented therein to create a web service running on MacOSX. I tried at first to use POI to access its functionality from Java but unfortunately POI does not support some functions like MINVERSE, TRANSPOSE, MMULT and others and I found it particularly difficult to implement them (also see this question).
Is there a programmatic way to access the Excel sheet? I found RCOMClient for R, but it seems to work only on Windows. Perl, Python, R, Octave, C++, Java, or anything else is fine, provided that it gives full functionality over Excel formulas.
Python Excel: why Python Excel is NOT an option for my problem. In the tutorial it is written that:
A relative reference is useful only if you have external knowledge of
what cells can be used as the origin. Many formulas found in Excel
files include function calls and multiple references and are not
useful, and can be too hard to evaluate. A full calculation engine is
not included in xlrd.
Unfortunately, I need very good support for the Excel engine.
Sure you can.
Have you checked the wonderful Python Excel project?
Even better: you could take a look at that stupendous blog post.
Have fun and keep us posted if you need more help.
Don't bother accessing Excel from a language other than Python, as Python is where you'll waste the least time.

Data manipulating environment

I am looking for something* to aid me in manipulating and interpreting data.
Data such as names, addresses, and the like.
Currently, I am making heavy use of Python to find whether one piece of information relates to another, but I am noticing that a lot of my code could easily be replaced with some sort of query language.
Mainly, I need an environment where I can import data in any format, be it XML, HTML, CSV, Excel, or database files. I want the software to read it and tell me what columns there are, etc., so that I only have to worry about writing the code that interprets it.
Does this sound concrete enough? If so, does anyone know of such elegant software?
*Can be a programming language, IDE, combination of those.
Have you looked at the Pandas module in Python? http://pandas.pydata.org/pandas-docs/stable/
When combined with IPython Notebook, it makes a great data manipulation platform.
I think it may let you do a lot of what you want to do. I am not sure how well it handles HTML, but it's built to handle CSV, Excel, and database files.

Detecting programming language from a snippet [closed]

What would be the best way to detect what programming language is used in a snippet of code?
I think that the method used in spam filters would work very well. You split the snippet into words. Then you compare the occurrences of these words with known snippets, and compute the probability that this snippet is written in language X for every language you're interested in.
http://en.wikipedia.org/wiki/Bayesian_spam_filtering
If you have the basic mechanism then it's very easy to add new languages: just train the detector with a few snippets in the new language (you could feed it an open source project). This way it learns that "System" is likely to appear in C# snippets and "puts" in Ruby snippets.
I've actually used this method to add language detection to code snippets for forum software. It worked 100% of the time, except in ambiguous cases:
print "Hello"
Let me find the code.
I couldn't find the code so I made a new one. It's a bit simplistic but it works for my tests. Currently if you feed it much more Python code than Ruby code it's likely to say that this code:
def foo
puts "hi"
end
is Python code (although it really is Ruby). This is because Python has a def keyword too. So if it has seen def 1000 times in Python and 100 times in Ruby, then it may still say Python even though puts and end are Ruby-specific. You could fix this by keeping track of the number of words seen per language and dividing by that somewhere (or by feeding it equal amounts of code in each language).
class Classifier
  def initialize
    @data = {}
    @totals = Hash.new(1)
  end

  # Split code into runs of lowercase letters
  def words(code)
    code.split(/[^a-z]/).reject { |w| w.empty? }
  end

  def train(code, lang)
    @totals[lang] += 1
    @data[lang] ||= Hash.new(1)
    words(code).each { |w| @data[lang][w] += 1 }
  end

  def classify(code)
    ws = words(code)
    @data.keys.max_by do |lang|
      # We really want to multiply here but I use logs
      # to avoid floating point underflow
      # (adding logs is equivalent to multiplication)
      Math.log(@totals[lang]) +
        ws.map { |w| Math.log(@data[lang][w]) }.reduce(0, :+)
    end
  end
end

# Example usage
c = Classifier.new
# Train from files
c.train(File.read("code.rb"), :ruby)
c.train(File.read("code.py"), :python)
c.train(File.read("code.cs"), :csharp)
# Test it on another file
c.classify(File.read("code2.py")) # => :python (hopefully)
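The fix suggested earlier in this answer (dividing by the number of words seen per language) could look roughly like this in Python. It is a sketch, not a port of the Ruby class: the name NormalizedClassifier is made up, add-one smoothing is an assumption, and the per-language prior is left out so that unequal amounts of training code do not tilt the result.

```python
import math
import re
from collections import Counter, defaultdict


class NormalizedClassifier:
    """Word-frequency classifier that normalises by words seen per language."""

    def __init__(self):
        # language -> Counter of word occurrences
        self.word_counts = defaultdict(Counter)

    @staticmethod
    def words(code):
        # Same idea as the Ruby version: runs of lowercase letters
        return re.findall(r"[a-z]+", code)

    def train(self, code, lang):
        self.word_counts[lang].update(self.words(code))

    def classify(self, code):
        ws = self.words(code)
        # Vocabulary across all languages, used for add-one smoothing
        vocab = set().union(*self.word_counts.values())

        def score(lang):
            counts = self.word_counts[lang]
            total = sum(counts.values()) + len(vocab)
            # Sum of log relative frequencies instead of raw counts
            return sum(math.log((counts[w] + 1) / total) for w in ws)

        return max(self.word_counts, key=score)


if __name__ == "__main__":
    c = NormalizedClassifier()
    c.train("def f(x):\n    return x\n" * 50, "python")   # lots of Python
    c.train('def foo\n  puts "hi"\nend\n' * 5, "ruby")    # a little Ruby
    print(c.classify('def foo\nputs "hi"\nend'))
```

With relative frequencies, def is no longer strong evidence for Python just because more Python was fed in; puts and end decide the outcome.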
Language detection solved by others:
Ohloh's approach: https://github.com/blackducksw/ohcount/
Github's approach: https://github.com/github/linguist
Guesslang is a possible solution:
http://guesslang.readthedocs.io/en/latest/index.html
There's also SourceClassifier:
https://github.com/chrislo/sourceclassifier/tree/master
I became interested in this problem after finding some code in a blog article which I couldn't identify. Adding this answer since this question was the first search hit for "identify programming language".
An alternative is to use highlight.js, which performs syntax highlighting but uses the success-rate of the highlighting process to identify the language. In principle, any syntax highlighter codebase could be used in the same way, but the nice thing about highlight.js is that language detection is considered a feature and is used for testing purposes.
UPDATE: I tried this and it didn't work that well. Compressed JavaScript completely confused it, i.e. the tokenizer is whitespace sensitive. Generally, just counting highlight hits does not seem very reliable. A stronger parser, or perhaps unmatched section counts, might work better.
First, I would try to find the specific keywords of a language, e.g.
"package, class, implements" => Java
"<?php" => PHP
"include main fopen strcmp stdout" => C
"cout" => C++
etc.
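A minimal sketch of that keyword lookup in Python, using only the four keyword lists given above. The names SIGNATURES and guess_language are made up for illustration, and counting one point per matching keyword is an assumption about how to combine them.

```python
import re

# Keyword lists taken from the examples above; extend as needed
SIGNATURES = {
    "Java": ["package", "class", "implements"],
    "PHP": ["<?php"],
    "C": ["include", "main", "fopen", "strcmp", "stdout"],
    "C++": ["cout"],
}


def guess_language(snippet):
    """Return the language with the most keyword hits, or None if nothing matches."""
    # Treat "<?php" as a single token, otherwise split into identifiers
    tokens = set(re.findall(r"<\?php|[A-Za-z_]+", snippet))
    scores = {
        lang: sum(1 for kw in keywords if kw in tokens)
        for lang, keywords in SIGNATURES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_language('<?php echo "hi"; ?>'))
    print(guess_language("std::cout << 1;"))
    print(guess_language("int i = 5;"))
```

A snippet with no listed keyword returns None, which is the honest answer for short or ambiguous code.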
It's very hard and sometimes impossible. Which language is this short snippet from?
int i = 5;
int k = 0;
for (int j = 100 ; j > i ; i++) {
j = j + 1000 / i;
k = k + i * j;
}
(Hint: It could be any one out of several.)
You can try to analyze various languages and try to decide using frequency analysis of keywords. If certain sets of keywords occur with certain frequencies in a text it's likely that the language is Java etc. But I don't think you will get anything that is completely fool proof, as you could name for example a variable in C the same name as a keyword in Java, and the frequency analysis will be fooled.
If you take it up a notch in complexity you could look for structures, if a certain keyword always comes after another one, that will get you more clues. But it will also be much harder to design and implement.
It would depend on what type of snippet you have, but I would run it through a series of tokenizers and see which language's BNF it came up as valid against.
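As a rough illustration of the "see which grammar accepts it" idea, here is a Python sketch limited to the three parsers that ship with the standard library (JSON, XML, and Python itself). The names PARSERS and accepted_by are made up; a real tool would need a parser per language of interest.

```python
import ast
import json
import xml.etree.ElementTree as ET

# Parsers available in the standard library, keyed by format name
PARSERS = {
    "JSON": json.loads,
    "XML": ET.fromstring,
    "Python": ast.parse,
}


def _parses(parser, text):
    """Return True if the parser accepts the text without raising."""
    try:
        parser(text)
        return True
    except (SyntaxError, ValueError):
        # json.JSONDecodeError is a ValueError; ET.ParseError is a SyntaxError
        return False


def accepted_by(text):
    """List every format whose parser accepts the snippet."""
    return [name for name, parser in PARSERS.items() if _parses(parser, text)]


if __name__ == "__main__":
    print(accepted_by("<a>1</a>"))
    print(accepted_by('{"a": 1}'))
    print(accepted_by("def f():\n    return 1\n"))
```

Note that a snippet such as {"a": 1} is accepted both as JSON and as a Python expression, so a valid parse narrows the candidates but does not always give a unique answer.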
I needed this, so I created my own.
https://github.com/bertyhell/CodeClassifier
It's very easily extendable by adding a training file in the correct folder.
Written in C#, but I imagine the code is easily converted to any other language.
Best solution I have come across is using the linguist gem in a Ruby on Rails app. It's kind of a specific way to do it, but it works. This was mentioned above by @nisc but I will tell you my exact steps for using it. (Some of the following command line commands are specific to Ubuntu but should be easily translated to other OS's)
If you have any rails app that you don't mind temporarily messing with, create a new file in it to insert your code snippet in question. (If you don't have rails installed there's a good guide here although for ubuntu I recommend this. Then run rails new <name-your-app-dir> and cd into that directory. Everything you need to run a rails app is already there).
After you have a rails app to use this with, add gem 'github-linguist' to your Gemfile (literally just called Gemfile in your app directory, no ext).
Then install ruby-dev (sudo apt-get install ruby-dev)
Then install cmake (sudo apt-get install cmake)
Now you can run gem install github-linguist (if you get an error that says icu required, do sudo apt-get install libicu-dev and try again)
(You may need to do a sudo apt-get update or sudo apt-get install make or sudo apt-get install build-essential if the above did not work)
Now everything is set up. You can now use this any time you want to check code snippets. In a text editor, open the file you've made to insert your code snippet (let's just say it's app/test.tpl, but if you know the extension of your snippet, use that instead of .tpl. If you don't know the extension, don't use one). Now paste your code snippet in this file. Go to the command line and run bundle install (must be in your application's directory). Then run linguist app/test.tpl (more generally linguist <path-to-code-snippet-file>). It will tell you the type, mime type, and language. For multiple files (or for general use with a ruby/rails app) you can run bundle exec linguist --breakdown in your application's directory.
It seems like a lot of extra work, especially if you don't already have rails, but you don't actually need to know ANYTHING about rails if you follow these steps and I just really haven't found a better way to detect the language of a file/code snippet.
This site seems to be pretty good at identifying languages, if you want a quick way to paste a snippet into a web form, rather than doing it programmatically: http://dpaste.com/
Nice puzzle.
I think it is impossible to detect all languages. But you could trigger on key tokens (certain reserved words and often-used character combinations).
But there are a lot of languages with similar syntax, so it depends on the size of the snippet.
Prettify is a Javascript package that does an okay job of detecting programming languages:
http://code.google.com/p/google-code-prettify/
It is mainly a syntax highlighter, but there is probably a way to extract the detection part for the purposes of detecting the language from a snippet.
I wouldn't think there would be an easy way of accomplishing this. I would probably generate lists of symbols/common keywords unique to certain languages/classes of languages (e.g. curly brackets for C-style language, the Dim and Sub keywords for BASIC languages, the def keyword for Python, the let keyword for functional languages). You then might be able to use basic syntax features to narrow it down even further.
I think the biggest distinction between languages is its structure. So my idea would be to look at certain common elements across all languages and see how they differ. For example, you could use regexes to pick out things such as:
function definitions
variable declarations
class declarations
comments
for loops
while loops
print statements
And maybe a few other things that most languages should have. Then use a point system. Award at most 1 point for each element if the regex is found. Obviously, some languages will use the exact same syntax (for loops are often written like for(int i=0; i<x; ++i)), so multiple languages could each score a point for the same thing, but at least you're reducing the likelihood of it being an entirely different language. Some of them might score 0s across the board (the snippet doesn't contain a function at all, for example), but that's perfectly fine.
Combine this with Jules' solution, and it should work pretty well. Maybe also look for frequencies of keywords for an extra point.
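The point system could be sketched like this in Python. The PATTERNS table is illustrative only: it covers three of the elements listed above for three languages, and the regexes are rough assumptions rather than complete grammars.

```python
import re

# One regex per structural element per language (deliberately rough)
PATTERNS = {
    "Python": {
        "function": r"^\s*def \w+\(.*\):",
        "comment": r"^\s*#",
        "print": r"\bprint\(",
    },
    "Ruby": {
        # A def line that does not end in a colon
        "function": r"^\s*def \w+.*(?<!:)$",
        "comment": r"^\s*#",
        "print": r"\bputs\b",
    },
    "C": {
        "function": r"^\s*\w+ \w+\(.*\)\s*\{",
        "comment": r"//|/\*",
        "print": r"\bprintf\(",
    },
}


def score_snippet(snippet):
    """Award at most one point per element whose regex is found."""
    return {
        lang: sum(
            1 for pattern in elements.values()
            if re.search(pattern, snippet, re.MULTILINE)
        )
        for lang, elements in PATTERNS.items()
    }


if __name__ == "__main__":
    print(score_snippet('def foo\n  puts "hi"\nend'))
    print(score_snippet('def foo():\n    print("hi")'))
```

Shared syntax, such as the # comment in Python and Ruby, scores a point for both languages, which matches the behaviour described above.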
Interesting. I have a similar task to recognize text in different formats. YAML, JSON, XML, or Java properties? Even with syntax errors, for example, I should tell apart JSON from XML with confidence.
I figure how we model the problem is critical. As Mark said, single-word tokenization is necessary but likely not enough. We will need bigrams, or even trigrams. But I think we can go further from there, knowing that we are looking at programming languages. I notice that almost any programming language has two unique types of tokens: symbols and keywords. Symbols are relatively easy to recognize (though some symbols might be literals and not part of the language). Bigrams or trigrams of symbols will then pick up unique syntax structures around symbols. Keywords are another easy target if the training set is big and diverse enough. A useful feature could be bigrams around possible keywords. Another interesting type of token is whitespace. If we tokenize in the usual way by whitespace, we will lose this information. I'd say that, for analyzing programming languages, we should keep the whitespace tokens, as they may carry useful information about the syntax structure.
Finally, if I choose a classifier like random forest, I will crawl GitHub and gather all the public source code. Most of the source code files can be labeled by file suffix. For each file, I will randomly split it at empty lines into snippets of various sizes. I will then extract the features and train the classifier using the labeled snippets. After training is done, the classifier can be tested for precision and recall.
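The feature extraction described here (symbols, keywords, and whitespace kept as tokens, then bigrams) might start out like the Python sketch below. The placeholder token names <SP>, <NL> and <NL+INDENT> and the regex are assumptions made for illustration; the classifier itself is not shown.

```python
import re
from collections import Counter

# Identifiers/keywords, numbers, newline plus indentation, inline spaces, single symbols
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\n[ \t]*|[ \t]+|[^\w\s]")


def tokenize(code):
    """Split code into tokens, keeping whitespace as placeholder tokens."""
    tokens = []
    for tok in TOKEN_RE.findall(code):
        if tok.startswith("\n"):
            # Distinguish a bare newline from a newline followed by indentation
            tokens.append("<NL+INDENT>" if len(tok) > 1 else "<NL>")
        elif tok.isspace():
            tokens.append("<SP>")
        else:
            tokens.append(tok)
    return tokens


def bigrams(tokens):
    """Count adjacent token pairs; these counts are the candidate features."""
    return Counter(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    toks = tokenize("if x:\n    y")
    print(toks)
    print(bigrams(toks).most_common(3))
```

A pair such as (":", "<NL+INDENT>") is exactly the kind of whitespace-sensitive feature that would be lost by splitting on whitespace.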
I believe that there is no single solution that could possibly identify what language a snippet is in, just based upon that single snippet. Take the keyword print. It could appear in any number of languages, each of which are for different purposes, and have different syntax.
I do have some advice. I'm currently writing a small piece of code for my website that can be used to identify programming languages. As most of the other posts say, there is a huge range of programming languages that you simply haven't heard of; you can't account for them all.
What I have done is identify each language by a selection of keywords. For example, Python could be identified in a number of ways. It's probably easier if you pick 'traits' that are almost certainly unique to the language. For Python, I chose the trait of using colons to start a set of statements, which I believe is a fairly unique trait (correct me if I'm wrong).
If, in my example, you can't find a colon starting a statement set, then move on to another possible trait, let's say using the def keyword to define a function. Now this can cause some problems, because Ruby also uses the keyword def to define a function. The key to telling the two (Python and Ruby) apart is to use various levels of filtering to get the best match. Ruby uses the keyword end to finish a function, whereas Python doesn't have anything to finish a function, just a de-indent, but you don't want to go there. But again, end could also be Lua, yet another programming language to add to the mix.
You can see that programming languages simply overlap too much. A keyword in one language may also be a keyword in another. Using a combination of keywords that often go together, like Java's public static void main(String[] args), helps to eliminate those problems.
Like I've already said, your best chance is looking for relatively unique keywords or sets of keywords to separate one from the other. And, if you get it wrong, at least you had a go.
Set up the random scrambler like
matrix S = matrix(GF(2),k,[random()<0.5for _ in range(k^2)]); while (rank(S) < k) : S[floor(k*random()),floor(k*random())] +=1;

Convert Excel 4 macros to VBA

I have an old Excel 4 macro that I use to run monthly invoices. It is about 3000 lines and has many Excel 5 Dialog Box sheets (for dialog boxes). I would like to know what the easiest way would be to change it into VBA and if it is worth it. Also, if once I have converted it to VBA, how to create a standalone application out of it?
I have attempted this before and in the end you do need to rewrite it as Biri has said.
I did quite a bit of this work when our company was upgrading from Windows NT to Windows XP. I often found that it is easier not to look at the old code at all and start again from scratch. You can spend so much time trying to work out what the Excel 4 macro did, especially around the "strange" dialog box notation. In the end, if you know what the inputs and the outputs are, then it is often more time-effective and cleaner to rewrite.
Whether to use VBA or not is in some ways another question, but VBA is rather powerful and extensible, so although I would rather use other tools like .NET in many circumstances, it works well and is easy to deploy.
In terms of whether it is worth it: if you could say that you were never going to need to change your Excel 4 macro again, then maybe not. But in business there is always something that changes, e.g. tax rates, especially end-of-year things. Given how hard it is to find someone to support Excel 4, or even to find documentation on it, I would say it is risky not to move to VBA, but that is something to weigh up.
(Disclaimer: I develop the Excel-DNA library)
Instead of moving the macro to VBA, which uses a COM automation object model that differs from the Excel macro language, rather move it to VB.NET (or C#) running with the Excel-DNA library.
Excel-DNA allows you to use the C API, which mirrors the macro language very closely. For every macro command, like your dialogs, there is a C API function that takes the same parameters. For example, the equivalent of DIALOG.BOX would be xlfDialogBox - there is some discussion and example in this thread: http://groups.google.com/group/exceldna/browse_thread/thread/53a8253269fdf0a5.
One big advantage of this move is that you can then gradually change parts of your code to use the COM automation interface - Excel-DNA allows you to mix and match the COM interfaces and C API.
AFAIK there is no way to convert it automatically. You basically have to rewrite it in VBA.
Once you have converted it to VBA, you cannot run it as a standalone application. As VBA stands for Visual Basic for Applications, it lives inside an application (Word, Excel, Scala, whatever).
You have to learn a standard language (not a macro language) to create standalone applications. But you have to learn much more than the language itself. You have to learn different techniques, for example database handling instead of Excel sheet handling, printing instead of Excel printing, and so on. So basically you will lose a lot of functionality that you take for granted when you use Excel.
Here is a good article about this topic: http://msdn.microsoft.com/en-us/library/aa192490.aspx
You can download VB2008-Express for free at: http://www.microsoft.com/express/default.aspx
