I am looking for something* to help me manipulate and interpret data.
The data consists of names, addresses and the like.
Currently, I make heavy use of Python to find out whether one piece of information relates to another, but I am noticing that a lot of my code could easily be replaced by some sort of query language.
Mainly, I need an environment where I can import data in any format, be it XML, HTML, CSV, Excel or database files. I would like the software to read it and tell me what columns there are etc., so that I only have to worry about writing the code that interprets it.
Does this sound concrete enough? If so, does anyone know of such elegant software?
*Can be a programming language, an IDE, or a combination of those.
Have you looked at the pandas module in Python? http://pandas.pydata.org/pandas-docs/stable/
Combined with the IPython Notebook, it makes a great data manipulation platform.
I think it may let you do a lot of what you want to do. I am not sure how well it handles HTML, but it's built to handle CSV, Excel and database files.
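As a rough illustration of what that looks like, here is a minimal sketch that loads two small CSV sources, lists their columns, and relates them with a join instead of hand-written loops. The sample data and column names are made up for the example. pandas also provides `read_excel`, `read_sql` and `read_html` (the last one needs an extra HTML parser installed), which follow the same pattern.

```python
import io

import pandas as pd

# Made-up sample data standing in for two real files.
people_csv = io.StringIO("name,city\nAda,London\nLinus,Helsinki\n")
addresses_csv = io.StringIO("name,street\nAda,12 Example Road\nGrace,1 Sample Street\n")

# read_csv infers the column names from the header row.
people = pd.read_csv(people_csv)
addresses = pd.read_csv(addresses_csv)
print(list(people.columns))

# An inner join keeps only the names present in both sources,
# much like a SQL JOIN would.
matched = people.merge(addresses, on="name", how="inner")
print(matched)
```

With real files, the `io.StringIO` objects would simply be replaced by file paths.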
Related
I am new to writing Python code. I have currently written a few modules for data analysis projects. The data is queried from AWS Redshift tables and summarized in CSVs and Excel spreadsheets.
At this point I do not want to pass it on to other users in the org, as I do not want to expose the code.
Is there an easy way to operationalize the code without exposing it?
PS: I am in the process of learning front-end development (Flask, HTML, CSS) so users can input data and get results back.
Python programs are almost always shipped as bare source. There are ways of compiling Python code into binaries, but this is not a common thing to do and usually I would not recommend it, as it's not as easy as one might expect (which is too bad, really).
That said, you can check out cx_Freeze and Cython.
I've always had a great interest in computer security and after reading:
Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection
I'd like to implement some of these algorithms, which assume that you are able to modify an executable at the binary level.
Using a hex editor, I have managed to insert a simple checksum algorithm for tamper-proofing a code region. However, doing this by hand is not feasible in practice, so I'm looking for ways of automating it.
Are there any well-known frameworks or techniques that take an executable as input and allow the programmer to work with the code programmatically? If not, what are my options?
By programmatically, I mean parsing it into, e.g., a tree-like structure that can be written back out as a new (modified) executable.
Thank you for your time and interest.
I'm looking into using Cassandra to store 50M+ documents that I currently have in XML format. I've been hunting around but I can't seem to find anything I can really follow on how to bulk load this data into Cassandra without needing to write some Java (not high on my list of language skills!).
I can happily write a script to convert this data into any format if it would make the loading easier although CSV might be tricky given the body of the document could contain just about anything!
Any suggestions welcome.
Thanks
Si
If you're willing to convert the XML to a delimited format of some kind (e.g. CSV), then here are a couple of options:
The COPY command in cqlsh. This actually got a big performance boost in a recent version of Cassandra.
The cassandra-loader utility. This is a lot more flexible and has a bunch of different options you can tweak depending on the file format.
If you're willing to write code other than Java (for example, Python), there are Cassandra drivers available for a bunch of programming languages. No need to learn Java if you've got another language you're better with.
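For the conversion step, a minimal sketch using only the Python standard library is shown below. The XML layout (a `doc` element with an `id` attribute and a `body` child) is an assumption made for the example, since the question does not show the real structure. Quoting every field lets the `csv` module cope with bodies that contain commas, quotes or newlines; whether a given loader accepts embedded newlines should be checked against its documentation.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_docs_to_csv(xml_strings, out):
    """Write one CSV row per XML document (hypothetical <doc id=..><body> layout)."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    writer.writerow(["id", "body"])
    for xml_text in xml_strings:
        root = ET.fromstring(xml_text)
        doc_id = root.get("id")
        body = root.findtext("body", default="")
        writer.writerow([doc_id, body])


# Made-up sample documents, one of them with awkward characters in the body.
samples = [
    '<doc id="1"><body>Plain text</body></doc>',
    '<doc id="2"><body>Commas, "quotes" and\nnewlines</body></doc>',
]

buffer = io.StringIO()
xml_docs_to_csv(samples, buffer)

# Reading the result back shows that the awkward body survives the round trip.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows)
```

For 50M+ documents the same function could be fed from a generator that reads the files one at a time, so nothing has to be held in memory.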
Hi, I have this code and I want to translate it to another programming language such as Python, Java or Ruby.
As you can see, this is a raffle game: the user presses F5,
which generates a raffle number such as Aqua 2231 7533. All entries are stored, and at the end one is selected at random.
This is the generation part:
=LOOKUP(RANDBETWEEN(1,3),{1,2,3},{"Aqua","Blue","Red"})&" "&TEXT(RANDBETWEEN(0,9999),"0000")&" "&TEXT(RANDBETWEEN(0,9999),"0000")
And this part is the random selection:
=INDEX(A:A,RANDBETWEEN(1,COUNTA(A:A)))
There are a couple of libraries that may be able to help. I don't know whether they support the LOOKUP, RANDBETWEEN or TEXT functions specifically, but as they are open-source libraries there's potential for you, or someone you know, to add that particular functionality (all of these projects welcome such input).
The libraries for Python that read Excel and generate Python code are:
PyCel
Formulas and Schedula
xlcalculator (This is my project)
Koala
In each case these libraries have an "evaluator" which can actually run the generated Python code. Some of these evaluators are integrated with the Excel file reading functionality, in other cases it's separate.
PyCel and Koala2 have the evaluation functionality integrated with the Excel file reading functions. xlcalculator has them in separate objects. Formulas and Schedula have them in separate projects.
It's not easy for them to print out or save "vanilla" Python code due to what needs to happen to evaluate the generated code. The evaluation code needs access to the Python implementation of the Excel formula and, as things are at the moment, all of these projects use their own libraries with their own objects and their own mechanisms to call said objects. xlcalculator has attempted to extract the Python implementations of the Excel formulas to have a library anyone can use but it is also the most recently written so who knows how successful that will become. That library is called xlfunctions.
Provided the above-mentioned projects support the LOOKUP, RANDBETWEEN and TEXT functions, they will be able to read the formula you've written (without the need for Excel to be installed), translate it into Python code and execute the resulting code in Python.
Shouldn't be that hard. Just:
Copy the formula and break it up in Notepad++.
For each function used, read its documentation word by word.
Once you understand the data flow in the formula (what changes, which lists are involved, where it starts and ends), write that understanding down in your mother tongue. You may draw it if that helps.
In the programming language of your choice, implement the "mother tongue" algorithm.
Test the program with exactly the same input you gave to the original Excel formula, and edit if necessary.
Since you didn't share any attempts, this is the best I have. It works for me; sorry if it doesn't solve your problem.
Hope someone can improve/edit my answer for the better. (:
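For the two raffle formulas in the question, the last steps might come out roughly as follows in Python. This is one possible translation, not the only one: `random.randint` is inclusive at both ends like RANDBETWEEN, the format spec `04d` plays the role of TEXT(..., "0000"), and `random.choice` stands in for INDEX over a random row, assuming the entries sit in one list with no gaps. The function names are made up for the example.

```python
import random

COLOURS = ["Aqua", "Blue", "Red"]


def generate_entry(rng=random):
    """Build one raffle entry such as 'Aqua 2231 7533'."""
    # LOOKUP(RANDBETWEEN(1,3), {1,2,3}, {...}) picks one of three colours.
    colour = COLOURS[rng.randint(1, 3) - 1]
    # TEXT(RANDBETWEEN(0,9999), "0000") is a zero-padded four-digit number.
    first = rng.randint(0, 9999)
    second = rng.randint(0, 9999)
    return f"{colour} {first:04d} {second:04d}"


def draw_winner(entries, rng=random):
    """Pick one stored entry uniformly at random, like INDEX with RANDBETWEEN."""
    return rng.choice(entries)


# Each call replaces one recalculation in the spreadsheet.
entries = [generate_entry() for _ in range(10)]
print(entries)
print("Winner:", draw_winner(entries))
```

Passing a seeded `random.Random(...)` instance as `rng` makes the draws repeatable, which helps when comparing behaviour against the spreadsheet.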
My issue comes from the Excel data I currently have, which I need to convert into 4 separate forms, each with different details. The specifics don't really matter, but what I'm trying to do is write some kind of script that would go into this data and extract the parts I need, saving me tons of time copying and pasting.
The problem is, I'm not really sure where to start. I have done some research, so I am familiar with CSV files, and I already have a pretty good grasp of Java. What would be the best way to approach this problem? From what I have researched, Python is very helpful for this kind of string manipulation. I also know that it could be done in Java using buffered reads and file writes, but I feel like that could get really clunky.
Thanks
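To give a feel for the Python route, here is a minimal sketch that assumes the Excel sheet has been exported to CSV and that each form needs a subset of the columns. The form names and column names are invented for the example, since the question does not give the real ones.

```python
import csv
import io

# Hypothetical mapping from each output form to the columns it needs.
FORM_COLUMNS = {
    "form_a": ["name", "email"],
    "form_b": ["name", "address"],
}


def split_into_forms(source, form_columns):
    """Read CSV rows once and collect the relevant columns for every form."""
    forms = {form: [] for form in form_columns}
    for row in csv.DictReader(source):
        for form, columns in form_columns.items():
            forms[form].append({col: row[col] for col in columns})
    return forms


# Made-up data standing in for the exported spreadsheet.
sample = io.StringIO("name,email,address\nAda,ada@example.com,12 Example Road\n")
forms = split_into_forms(sample, FORM_COLUMNS)
print(forms)
```

Each list in `forms` could then be written out with `csv.DictWriter`, one file per form.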