Dissecting a Java class file in Haskell

I started learning Haskell earlier this year and am working on a project that takes a Java class file (e.g. FileName1.class) as input and dissects it to print out the following:
the name of the class defined by the class file
the number of methods of the class, their names and types
I did some research and found that this is possible using a JVM, but I am a little lost in the process. Does anyone have suggestions on how to tackle this?

You need to write a parser for the Java class format.
Luckily there are already libraries to do that.
Parse the file, inspect the generated AST, and print the required information.
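As a rough illustration of the "parse, then inspect" idea, here is a minimal sketch that reads just enough of a class file to recover the class name. It assumes the binary and bytestring packages that ship with GHC, and it relies on the class file layout from the JVM specification (magic number, versions, constant pool, access flags, this_class). The names CpEntry, getEntry, getPool, getClassName and className are made up for this example, not taken from any existing library, and the name bytes are treated as plain ASCII for simplicity.

```haskell
import           Control.Monad         (when)
import           Data.Binary.Get       (Get, getByteString, getWord16be,
                                        getWord32be, getWord8, runGetOrFail,
                                        skip)
import qualified Data.ByteString       as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy  as BL
import           Data.Word             (Word16)

-- Only the constant-pool entries needed to resolve the class name are kept.
data CpEntry
  = CpUtf8 BS.ByteString  -- tag 1: raw bytes of a string constant
  | CpClass Word16        -- tag 7: index of the Utf8 entry holding the name
  | CpOther               -- any other entry; its payload is skipped
  deriving (Show, Eq)

-- Parse one entry and report how many pool slots it occupies
-- (Long and Double constants take two slots).
getEntry :: Get (CpEntry, Int)
getEntry = do
  tag <- getWord8
  case tag of
    1 -> do
      len   <- getWord16be
      bytes <- getByteString (fromIntegral len)
      pure (CpUtf8 bytes, 1)
    7 -> do
      idx <- getWord16be
      pure (CpClass idx, 1)
    _ | tag `elem` [3, 4]                  -> skipping 4 1
      | tag `elem` [5, 6]                  -> skipping 8 2
      | tag `elem` [8, 16, 19, 20]         -> skipping 2 1
      | tag `elem` [9, 10, 11, 12, 17, 18] -> skipping 4 1
      | tag == 15                          -> skipping 3 1
      | otherwise -> fail ("unknown constant-pool tag " ++ show tag)
  where
    skipping n slots = skip n >> pure (CpOther, slots)

-- The pool is indexed from 1 up to (count - 1).
getPool :: Int -> Get [(Int, CpEntry)]
getPool count = go 1
  where
    go i
      | i >= count = pure []
      | otherwise  = do
          (entry, slots) <- getEntry
          rest <- go (i + slots)
          pure ((i, entry) : rest)

-- Read the header and constant pool, then follow this_class to its name.
getClassName :: Get String
getClassName = do
  magic <- getWord32be
  when (magic /= 0xCAFEBABE) $ fail "not a class file (bad magic number)"
  _minor <- getWord16be
  _major <- getWord16be
  count  <- getWord16be
  pool   <- getPool (fromIntegral count)
  _flags <- getWord16be
  thisIx <- getWord16be
  case lookup (fromIntegral thisIx) pool of
    Just (CpClass nameIx) ->
      case lookup (fromIntegral nameIx) pool of
        Just (CpUtf8 bytes) -> pure (BC.unpack bytes)
        _ -> fail "class name index does not point to a Utf8 entry"
    _ -> fail "this_class does not point to a Class entry"

-- Pure wrapper: Left carries the parse error, Right the class name.
className :: BL.ByteString -> Either String String
className input =
  case runGetOrFail getClassName input of
    Left (_, _, err)   -> Left err
    Right (_, _, name) -> Right name
```

In GHCi something like `className <$> BL.readFile "FileName1.class"` should give the name in the JVM's internal form (with `/` instead of `.` as the package separator). The interface, field and method tables follow this_class in the file and could be read in the same style to list method names and descriptors.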

Give parsec a try.
http://www.haskell.org/haskellwiki/Parsec
This is an incredibly good tutorial on how to get started: http://www.haskell.org/haskellwiki/Parsing_expressions_and_statements
And also from Real World Haskell: http://book.realworldhaskell.org/read/using-parsec.html
Parsec even contains a default Java language definition.
http://hackage.haskell.org/packages/archive/parsec/3.0.0/doc/html/src/Text-Parsec-Language.html#javaStyle
Also, when inspecting your AST, you might want to use the Reader monad to keep your type signatures from getting too polluted.

Related

Python typing, pickle and serialisation

I've started learning the typing system in Python and came across an issue in defining function arguments that are picklable. Not everything in Python can be pickled. Can I define a type annotation that says "only accept objects that are picklable"?
At first it sounds like something that should be possible, similar to Java's Serializable. But there is no Picklable interface in Python, and on further thought it occurs to me that pickling is an inherently runtime task. "What can be pickled" lists a number of things that can be pickled, and it's not difficult to imagine a container of lambda functions that would not be picklable, but I can't think of a way of determining that beforehand (without touching the container definition).
The only way I've come up with is to define something like a typing.Union[Callable, Iterable, ...] of all the things listed in "What can be pickled", but that does not seem like a good solution.
This issue on GitHub partially answers the question. It is specifically about JSON rather than pickle, but the first answer from Guido should still apply to pickle:
I tried to do that but a recursive type alias doesn't work in mypy right now, and I'm not sure how to make it work. In the mean time I use JsonDict = Dict[str, Any] (which is not very useful but at least clarifies that the keys are strings), and Any for places where a more general JSON type is expected.
https://github.com/python/typing/issues/182

Groovy static analysis for finding variable modifications

I have a simple task which requires finding the modifications of variables in a given piece of code. This will be a static analysis. For instance, given a variable (e.g., age), I would like to create a list or tree (a data structure) that tells me what modifies this variable, preferably with the name of the function that makes the modification (as a return value) or any other auxiliary information. I started writing my script, but I see that it's very error-prone, as I need to consider many cases such as nested loops, etc.
Could you suggest where to start?
If the code to be analyzed happens to be Groovy code then you could write an AST transformation (probably a global one) that walks the code and obtains the information you seek.
The Groovy documentation site has a section on AST transformations; have a look at http://groovy-lang.org/metaprogramming.html#_compile_time_metaprogramming
This page describes existing AST xforms and how you can develop your own. I'd recommend browsing the code that implements the standard AST xforms such as @Immutable, @Canonical, and others.
CodeNarc (http://codenarc.sourceforge.net/) is a static code analyzer for Groovy code, inspired by PMD. It also relies on AST xforms.
GContracts (https://github.com/andresteingress/gcontracts) is another tool implemented using AST xforms. These two can serve as a basis for understanding more about AST transformations.
OTOH if the analyzed code happens to be Java then AST transformations will not help you.

Importing modules as a function, with string as input

I want to make a function called 'load' which imports definitions of functions from another file. I know how to import modules, but in my program I want the definitions of the functions to change depending on which module is 'loaded' with this new function. Is there a way to do this? Is there a better way to write my program so that this is not necessary?
I think its type signature would look something like:
load :: String -> IO ()
where the string is the name of the module to be loaded (and the module is in the same directory).
Edit: Thanks for all the replies. Most people agree that this is not the best way to do what I want. Instead, is there a way to declare a global variable from within an I/O program? That is, I want it so that if I type (function "thing") into a function of type String -> IO (), I can still type 'thing' into GHCi to get the value assigned to it. Any suggestions?
There is almost certainly a better way to write your program so that this is not necessary. It's hard to say what without knowing more details about your situation, though. You could, for instance, represent the generic interface each module implements as a data-type, and have each module export a value of that type with the implementation.
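To make the "interface as a data type" suggestion concrete, here is a minimal sketch. The Plugin record, the two implementations and the registry are all hypothetical names invented for this example; in a real program each implementation would live in its own module and export one value of the shared type.

```haskell
import Data.Char (toUpper)

-- Hypothetical interface that every "loadable" implementation provides.
data Plugin = Plugin
  { pluginName :: String
  , transform  :: String -> String
  }

-- Two implementations; each could be exported from its own module.
shout :: Plugin
shout = Plugin { pluginName = "shout", transform = map toUpper }

backwards :: Plugin
backwards = Plugin { pluginName = "backwards", transform = reverse }

-- All implementations known at compile time.
registry :: [Plugin]
registry = [shout, backwards]

-- Choosing an implementation by name is now an ordinary lookup.
load :: String -> Maybe Plugin
load name = lookup name [(pluginName p, p) | p <- registry]

-- Apply the named implementation, if it exists.
runWith :: String -> String -> Maybe String
runWith name input = fmap (\p -> transform p input) (load name)
```

With this shape, `load` returns a value rather than changing what is in scope, so the rest of the program simply takes a Plugin argument and no dynamic importing is needed.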
Basically, the set of loaded modules is a static, compile-time property, so it makes no sense to want your program's behaviour to change based on its contents. Are you trying to write a library? Your users probably won't appreciate it doing such evil magic to their import lists :) (And it probably isn't possible without Template Haskell in that case, anyway.)
The exception is if you're trying to implement a Haskell tool (e.g. REPL, IDE, etc.) or trying to do plugins; i.e. dynamically-loaded modules of Haskell source code to integrate into your Haskell program. The first thing to try for those should be hint, but you may find you need something more advanced; in that case, the GHC API is probably your best bet. plugins used to be the de-facto standard in this area, but it doesn't seem to compile with GHC 7; you might want to check out direct-plugins, a simplified implementation of a similar interface that does.
mueval might be relevant; it's designed for executing short (one-line) snippets of Haskell code in a safe sandbox, as used by lambdabot.
Unless you're building a Haskell IDE or something like that, you most likely don't need this (^1).
But in case you do, there is always the hint package, which allows you to embed a Haskell interpreter into your program. This allows you both to load Haskell modules and to convert strings into Haskell values at runtime. There is a nice example of how to use it here
^1: If you're looking for a way to make things polymorphic, i.e. changing some, but not all, of the definitions in your code, you're probably looking for typeclasses.
With regards to your edit, perhaps you might be interested in IORef.
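As a small sketch of what the IORef route could look like: a mutable table of named values that IO actions can add to and read from. The Env type and the functions newEnv, define and valueOf are assumptions made up for this example, and the values are restricted to Int to keep it short. Note that this gives you a lookup by name at runtime, not a new top-level Haskell binding.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- A mutable association list from names to values.
type Env = IORef [(String, Int)]

-- Create an empty environment.
newEnv :: IO Env
newEnv = newIORef []

-- Bind a name; a newer binding shadows an older one with the same name.
define :: Env -> String -> Int -> IO ()
define env name value = modifyIORef' env ((name, value) :)

-- Look a name up; Nothing if it was never defined.
valueOf :: Env -> String -> IO (Maybe Int)
valueOf env name = lookup name <$> readIORef env
```

In GHCi you could run `env <- newEnv`, then `define env "thing" 42`, and later `valueOf env "thing"` to get the stored value back.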

During Groovy AST Transformation, some FieldNodes wrongly say they have 0 annotations

Some FieldNodes wrongly say they have 0 annotations during an AST Transformation. My AST Transformation is during the CLASS_GENERATION phase. Why does it do this, and how can I get the missing annotations to show up?
EDIT: The problem mainly seems to happen on super classes of the class that the AST Transformation is running on.
Why CLASS_GENERATION? Are you just analyzing the code? In any case, I haven't heard of anyone using this phase for a transform.
A general guideline is to use phase CONVERSION for transforms that don't need much semantic (e.g. type) information, and phase CANONICALIZATION for the rest.
GroovyConsole's AST browser (open with CTRL+T) is a handy tool to get an idea of what the AST looks like after each phase. Maybe it will help you find the problem.

How do I do automatic data serialization of data objects?

One of the huge benefits of languages that have some sort of reflection/introspection is that objects can be automatically constructed from a variety of sources.
For example, in Java I can use the same objects for persisting to a db (with Hibernate), serializing to XML (with JAXB), and serializing to JSON (json-lib). You can do the same in Ruby and Python, usually by following some simple rules for properties (or, in Java, by using annotations).
Thus I don't need lots of "Domain Transfer Objects". I can concentrate on the domain I am working in.
It seems that in very strict FP languages like Haskell and OCaml this is not possible.
Particularly Haskell. The only thing I have seen is doing some sort of preprocessing or meta-programming (OCaml). Is it just accepted that you have to do all the transformations from the bottom upwards?
In other words, you have to do lots of boring work to turn a data type in Haskell into a JSON/XML/DB Row object and back again into a data object.
I can't speak to OCaml, but I'd say that the main difficulty in Haskell is that deserialization requires knowing the type in advance--there's no universal way to mechanically deserialize from a format, figure out what the resulting value is, and go from there, as is possible in languages with unsound or dynamic type systems.
Setting aside the type issue, there are various approaches to serializing data in Haskell:
The built-in type classes Read/Show (de)serialize algebraic data types and most built-in types as strings. Well-behaved instances should generally be such that read . show is equivalent to id, and that the result of show can be parsed as Haskell source code constructing the serialized value.
Various serialization packages can be found on Hackage; typically these require that the type to be serialized be an instance of some type class, with the package providing instances for most built-in types. Sometimes they merely require an automatically derivable instance of the type-reifying, reflective metaprogramming Data class (the charming fully qualified name for which is Data.Data.Data), or provide Template Haskell code to auto-generate instances.
For truly unusual serialization formats--or to create your own package like the previously mentioned ones--one can reach for the biggest hammer available, sort of a "big brother" to Read and Show: parsing and pretty-printing. Numerous packages are available for both, and while it may sound intimidating at first, parsing and pretty-printing are in fact amazingly painless in Haskell.
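The first approach in the list above, Read/Show, needs nothing beyond the Prelude. The following is a minimal sketch with a made-up Shape type; it uses readMaybe from Text.Read so that malformed input yields Nothing instead of a runtime error.

```haskell
import Text.Read (readMaybe)

-- An example type; Show and Read are both derived automatically.
data Shape
  = Circle Double
  | Rect Double Double
  deriving (Show, Read, Eq)

-- Serialization is just show: the output looks like Haskell source.
serialize :: Shape -> String
serialize = show

-- Deserialization needs the target type to be known in advance;
-- here it is fixed by the type signature.
deserialize :: String -> Maybe Shape
deserialize = readMaybe
```

For well-behaved derived instances, `deserialize (serialize x)` should give back `Just x`. The type signature on deserialize is what tells read which type to produce, which is the "knowing the type in advance" point made earlier in this answer.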
A glance at Hackage indicates that serialization packages already exist for various formats, including binary data, JSON, YAML, and XML, though I've not used any of them so I can't personally attest to how well they work. Here's a non-exhaustive list to get you started:
binary: Performance-oriented serialization to lazy ByteStrings
cereal: Similar to binary, but a slightly different interface and uses strict ByteStrings
genericserialize: Serialization via built-in metaprogramming, output format is extensible, includes R5RS sexp output.
json: Lightweight serialization of JSON data
RJson: Serialization to JSON via built-in metaprogramming
hexpat-pickle: Combinators for serialization to XML, using the "hexpat" package
regular-xmlpickler: Serialization to XML of recursive data structures using the "regular" package
The only other problem is that, inevitably, not all types will be serializable--if nothing else, I suspect you're going to have a hard time serializing polymorphic types, existential types, and functions.
For what it's worth, I think the pre-processor solution found in OCaml (as exemplified by sexplib, binprot and json-wheel among others) is pretty great (and I think people do very similar things with Template Haskell). It's far more efficient than reflection, and can also be tuned to individual types in a natural way. If you don't like the auto-generated serializer for a given type foo, you can always just write your own, and it fits beautifully into the auto-generated serializers for types that include foo as a component.
The only downside is that you need to learn camlp4 to write one of these for yourself. But using them is quite easy, once you get your build-system set up to use the preprocessor. It's as simple as adding with sexp to the end of a type definition:
type t = { foo: int; bar: float }
with sexp
and now you have your serializer.
You wanted:
"to do lot of boring work to turn a data type in haskell into JSON/XML/DB Row object and back again into a data object."
There are many ways to serialize and unserialize data types in Haskell. You can use, for example,
Data.Binary
Text.JSON
as well as other common formats (protocol buffers, thrift, xml)
Each package often/usually comes with a macro or deriving mechanism to allow you to e.g. derive JSON. For Data.Binary for example, see this previous answer: Erlang's term_to_binary in Haskell?
The general answer is: we have many great packages for serialization in Haskell, and we tend to use the existing class 'deriving' infrastructure (with either generics or template Haskell macros to do the actual deriving).
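As one hedged illustration of the deriving route, the binary package can fill in a Binary instance from a derived Generic instance. The Person type and roundTrip function below are invented for this sketch; the only assumption about the library is that Data.Binary provides Generic-based defaults for put and get, which is why the instance body is empty.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Binary  (Binary, decode, encode)
import GHC.Generics (Generic)

-- An example record; Generic is derived by the compiler.
data Person = Person
  { name :: String
  , age  :: Int
  } deriving (Show, Eq, Generic)

-- Empty instance: put and get come from the Generic-based defaults.
instance Binary Person

-- Encode to a lazy ByteString and decode it again.
roundTrip :: Person -> Person
roundTrip = decode . encode
```

No per-field serialization code is written by hand here; adding a field to Person changes the encoding automatically.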
My understanding is that the simplest way to serialize and deserialize in Haskell is to derive Read and Show. This is simple, but it doesn't fulfil your requirements.
However there are HXT and Text.JSON which seem to provide what you need.
The usual approach is to employ Data.Binary. This provides the basic serialisation capability. Binary instances for data types are easy to write and can easily be built out of smaller units.
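To show what "built out of smaller units" can mean in practice, here is a minimal sketch of two hand-written Binary instances, where the instance for the larger type simply reuses the instance for the smaller one. Point and Segment are made-up example types.

```haskell
import Data.Binary (Binary (..), decode, encode)

-- A small unit with a hand-written instance.
data Point = Point Int Int
  deriving (Show, Eq)

instance Binary Point where
  put (Point x y) = put x >> put y   -- write the two fields in order
  get = Point <$> get <*> get        -- read them back in the same order

-- A larger type whose instance is built from the Point instance.
data Segment = Segment Point Point
  deriving (Show, Eq)

instance Binary Segment where
  put (Segment a b) = put a >> put b
  get = Segment <$> get <*> get
```

With these in scope, `decode (encode (Segment (Point 0 0) (Point 3 4))) :: Segment` should give back the original segment; the type annotation tells decode which instance to use.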
If you want to generate the instances automatically then you can use Template Haskell. I don't know of any package to do this, but I wouldn't be surprised if one already exists.
