Is this pseudocode or some programming language?

I'm reading about graph traversal in this PDF and see some unfamiliar symbols. Is this code sample just pseudocode or is it some language I haven't seen before? If pseudocode, does it have some formal schema that will let me look up the unfamiliar symbols?
(Bitmap sample of the code appeared here.)

It is a formal mathematical style of pseudocode sometimes referred to as pidgin code. Very few real languages use symbols like that, but there are some, such as APL.
There is no "glossary of symbols" per se, because every author uses his or her own flavor, but they tend to follow commonly-understood math notation.
This particular piece of code on BFS is very widely quoted. Wikipedia's BFS article has an "ASCII" translation of it.

It looks like pidgin code - a mix of pseudocode and math notation. For determining the meaning of the math notation, rapidtables has a good reference for the symbols.
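For what it's worth, the pseudocode maps quite directly onto real code. As a rough illustration (my own sketch, not the PDF's listing verbatim), here is a minimal breadth-first search in Haskell using an adjacency map and an explicit queue:

import qualified Data.Map as Map
import qualified Data.Set as Set
import Data.List (nub)

-- Adjacency list representation: each vertex maps to its neighbours.
type Graph a = Map.Map a [a]

-- Breadth-first search from a root, returning vertices in visit order.
bfs :: Ord a => Graph a -> a -> [a]
bfs g root = go (Set.singleton root) [root]
  where
    go _    []          = []
    go seen (v : queue) = v : go seen' (queue ++ new)
      where
        neighbours = Map.findWithDefault [] v g
        new        = nub (filter (`Set.notMember` seen) neighbours)
        seen'      = foldr Set.insert seen new

For example, bfs (Map.fromList [(1,[2,3]),(2,[4]),(3,[4]),(4,[])]) 1 evaluates to [1,2,3,4].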

Related

Tutorial on stochastic simulation in Haskell

I'd like to use Haskell for stochastic simulation, but I don't know how. I've read Hutton's 'Programming in Haskell', and I'm comfortable writing deterministic functional programs. However, I don't know how to start writing stochastic simulations of the sort that are easy in imperative languages like R or Python. Is there a tutorial or primer on this that I could read, or can anyone provide some tips on getting started?
There's a nice self-contained paper Erwig and Kollmansberger: Functional Pearls - Probabilistic Functional Programming in Haskell on this topic. I used this as a starting point for writing a natural language processor based on Hidden Markov Models in Haskell. There's a package that is based on this paper, which also seems to provide a basic interface to R plotting.
There's also an entry on the HaskellWiki with more links to hackage. In particular, the ProbabilityMonads package might be useful for you.
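The core idea of that paper is small enough to sketch here: a distribution is a list of value/probability pairs, and the monad instance threads probabilities through a computation. This is a toy version in the same spirit, not the actual API of the packages mentioned above:

newtype Dist a = Dist { runDist :: [(a, Double)] }

instance Functor Dist where
  fmap f (Dist xs) = Dist [(f x, p) | (x, p) <- xs]

instance Applicative Dist where
  pure x = Dist [(x, 1)]
  Dist fs <*> Dist xs = Dist [(f x, p * q) | (f, p) <- fs, (x, q) <- xs]

instance Monad Dist where
  Dist xs >>= f = Dist [(y, p * q) | (x, p) <- xs, (y, q) <- runDist (f x)]

-- A fair coin as a distribution.
coin :: Dist Bool
coin = Dist [(True, 0.5), (False, 0.5)]

-- Probability that a predicate holds under a distribution.
prob :: (a -> Bool) -> Dist a -> Double
prob holds (Dist xs) = sum [p | (x, p) <- xs, holds x]

-- Probability of at least one head in two flips: 0.75.
atLeastOneHead :: Double
atLeastOneHead = prob id (or <$> sequence [coin, coin])

Random sampling (e.g. via System.Random) can be layered on top of the same interface when exhaustive enumeration is too expensive.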
http://learnyouahaskell.com/a-fistful-of-monads#the-list-monad
This small section in Learn You a Haskell talks about using the list monad and functor functions to easily deal with non-determinism. It may be a little simplistic depending on what your needs are, but it makes good use of the tools that are already in the standard library.
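As a tiny illustration of the non-determinism that chapter covers (my own example, not taken from the book): all the ways two dice can sum to 7, computed with nothing but the list monad from the standard library.

import Control.Monad (guard)

-- The list monad explores every combination; guard prunes unwanted branches.
pairsSummingTo7 :: [(Int, Int)]
pairsSummingTo7 = do
  x <- [1 .. 6]        -- first die
  y <- [1 .. 6]        -- second die
  guard (x + y == 7)
  return (x, y)

This evaluates to [(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)].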

APL readability

I have to code in APL. Since the code is going to be maintained for a long time, I am wondering if there are some papers/books which contain heuristics/tips/samples to help in designing clean and readable APL programs.
It is a different experience than coding in other programming languages. Making a function small, for example, will not help: such a function can still be a single line of code that is completely incomprehensible.
First, welcome to the wonderful world of APL.
Writing readable and maintainable APL code is not much different than writing readable and maintainable code in any language. Any good book on writing clean code is as applicable to APL as any other language, perhaps even more so. I recommend Clean Code by Robert C. Martin.
Consider the guideline in this book that all code in a function should be at the same level of abstraction. This applies to APL 100 times over. For example, if you have a function named DoThisBigTask it should have very few APL primitive symbols in it, and certainly no long complex one-liners. It should just be a series of calls to other, lower-level functions. If these higher-level functions are all well-named and well-defined, the general drift should be easily determined by someone who does not even know APL. The lowest-level functions will be nothing but primitives and will be inscrutable to the non-APLer. Depending on how they are written they may even initially appear inscrutable to a seasoned APLer. However, these low-level functions should be short, have no side effects, and can easily be re-written rather than modified if the maintaining programmer is unable to understand the original coding technique.
In general, keep your functions short, well-named, well-defined, and to the point. And keep the lines of code even shorter. It is much more important to have well-defined and well-documented functions than it is to have well-written or well-documented lines of code.
Since you asked for books and other references, I can suggest:
APL2 in Depth by Norman D. Thomson and Raymond P. Polivka. I worked with Ray Polivka for years and he was one of the best APL teachers I have ever known.
The classic APL: An Interactive Approach by Leonard Gilman and Allen J. Rose is good for the core language, but is rather outdated and doesn't contain much that is truly relevant on readability.
APL2 at a Glance by James A. Brown and Sandra Pakin serves in some ways as an update to Gilman and Rose. It covers nested arrays and other additions to APL, but doesn't have much specifically directed at readability. Still, if you follow the examples here you will be writing readable code.
APL is Easy by STSC and Jerry R. Turner is an intro directed specifically at the APL*Plus line. Again, not much specifically on readability, but the models are generally well-designed readable code.
Mastering Dyalog APL: A Complete Introduction to Dyalog APL by Bernard Legrand is quite good if you are specifically working in Dyalog APL, not so much if you are working in one of the other versions such as APL*Plus (from APL2000).
It is my view that the reputation of APL as a "write-only language" is much overstated. One does need to get used to the primitives and the symbols used to represent them. But then one needs to get used to the syntax and the various library functions in many other language environments. I have seen convoluted code in C, C++, and Java as hard to follow as any APL. Of course, it isn't good C, C++, or Java, even if it is clever.
Some advice:
Writing 'one-liners' is a way to test one's mastery of the language, but it is very poor practice for production code.
Comment to make the algorithm, and especially the data structures being used, clear. As with any code, comments should add something that cannot be easily read from the code itself, or call attention to complex or obscure code.
If possible, avoid obscure code so there is no need to explain it. It is usually possible.
Make each function do one and only one job, with a clear interface.
Avoid global variables for the most part, and document any that are needed.
Document the interface, purpose, and effect of any function at the top. Make utilities black boxes without side effects if possible. If side effects are essential, document them as part of the interface. Develop a standard header comment structure.
Dynamic code built on the fly can add flexibility to a solution, but it is often much harder to debug when problems occur. Make such code bullet-proof to the extent you can, and build in optional logging to help when it turns out to have problems anyway.
You can use an OOP-like style if you wish. But there is no need to do so. If you do, it should IMO be used fairly pervasively through an application, except perhaps for low-level utilities. But OOP-style code can be at least as convoluted as non-OOP code, and APL doesn't have built-in inheritance or other OOP-supporting syntax.
(I'll use here "A" instead of comment, "'" instead of symbol sign.)
Well, I was developing APL for a year, I have only used Aplusdev.org.
You don't even need more. The trick is to try to think OOP-like. You should have -- if I remember correctly -- structured fields used as class data, something like {'attribute1 'attribute2, {value1, value2}}, so you can easily pick them out the way you would use obj.attribute1 in C++.
(Here 'attribute1 Pick object does the lookup; use it only inside class functions :) )
Moreover, use namespaced functions:
namespace_classname.method(this, arg1)
namespace_classname._private_method(this, arg1, arg2)
and lots of simple tool functions instead of nifty, long lines. The performance drop is not substantial, and you can optimize later (for, say, arrays) once you see that something could be faster.
And before anything: think MATLAB and Mathematica without for loops! :) It helps a lot.
My suggestions for robust, maintainable code:
use an extensive set of utility functions instead of trickery with those unreadable symbols, to keep your code always to the point.
try-catch blocks: there is built-in exception handling which can be utilized here; something like
try_begin();
A the code being tried goes here, maybe in extra brackets so as not to forget try_end() at the end
try_end();
catch(sth, function_here);
can be nicely implemented. (You'll see, catching errors is very important.)
crude type checking: implement a standard and use it for functions that aren't called too many times... (you can put a check function with flexible parameters right after a function definition)
Syntax:
function(point2i, ch):
{
    typecheck({{'int, [1 2]}, 'char});  A do some assertions in typecheck...
    A your function goes here
}
lambda functions can be very effective; you can do some reflection to achieve lambdas.
always declare returns by explicitly writing "return"!
Unit tests based on try-catch, testing each and every function you write.
I also used 'apply' and 'map' from Mathematica a lot, implementing my own versions; they are very, very effective here.
I wrote with MATLAB-style thinking, since here you can have a list of structured fields (= class data) in a variable. You will write lots of those if you want to keep things for-loop-less (and you do want to, trust me). For that you need a standard naming convention, say, indicating collections with plurals:
namespace_class.method(objects, arg1, arg2)
Finally: I also wrote inputBox and messageBox functions like the ones in JavaScript or Visual Basic; they make it very easy to hack together simple tools or check state. The only catch with messageBox is that it can't put the function flow on hold, so you need a callback:
AA documentation of f1
f1():
{
    A do something
    msgbox.call("Hi there", {'Ok, {'f2}});
}
f2():
{
    A continue doing stuff
}
You can generate auto-docs in bash with a gawk/sed combination and put them on a web page.
Also, creating HTML-formatted code helps with printing. ;)
I hope this was a good outline for a proper build-up. Before writing your own tools, try to dig up the tools already available in the legacy codebase... functions are often implemented four times over under different names, thanks to the mess of those days.

machine representation of natural text

I'm currently working on high-level machine representation of natural text.
For example,
"I had one dog but I gave it to Danny who didn't have any"
would be
I.have.dog = 1
I.have.dog -= 1
Danny.have.dog = 0
Danny.have.dog += 1
something like this...
I'm trying to find resources, but can't really find matching topics.
Is there a valid subject name for this type of research? Any library of resources?
Natural logic sounds like something related but it's not really the same thing I'm working on. Please help me out!
Representing natural language's meaning is the domain of computational semantics. Within that area, lots of frameworks have been developed, though the basic one is still first-order logic.
Specifically, your problem seems to be that of recognizing discourse semantics, which deals with information change brought about by language use. This is pretty much an open area of research, so expect to find a lot of research papers and PhD positions, but little readily-usable software.
As larsmans already said, this is pretty much a really open field of research, called computational semantics (a subfield of computational linguistics.)
There's one important thing that you'll need to understand before starting off in the comp-sem world: most people there use fancy high-level languages. By high-level I don't mean C, but more something like LISP, Prolog, or, as of late, Haskell. Computational semantics is very close to logic, which is why people researching the topic are more comfortable with functional and logical languages — they're closer to what they actually use all day long.
It will also be very useful for you to first look at some foundational course in predicate logic, since that's what the underlying literature usually takes for granted.
A good introduction to the connection between logic and language is L.T.F. Gamut, Logic, Language, and Meaning, volume I. This deals with the linguistic side of semantics, which won't help you implement anything, but it will help you understand the following literature. That said, there are at least some books that will explain predicate logic as they go, but if you ask me, anyone really interested in the representation of language as a formal system should take a course in predicate and possibly intuitionistic and intensional logic.
To give you a bit of a peek, your example is rather difficult to treat with current comp-sem approaches. Not impossible, but already pretty high up the scale of difficulty. What makes it difficult is the tense, for one part (dealing with tense and aspect will typically bring you into event semantics), but also that you'd have to define the give and have relations in a way that works for this example. (An easier example to work with would be, say, "I had a dog, but I gave it to Danny who didn't have any." Can you see why?)
Let's translate "I have a dog."
∃x[dog(x) ∧ have(I,x)]
(There is an object x, such that x is a dog and the have-relation holds between
"I" and x.)
These sentences would then be evaluated against a model, where the "I"
constant might already be defined. By evaluating multiple sentences in sequence,
you could then alter that model so that it keeps track of a conversation.
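To make "evaluated against a model" concrete, here is a toy sketch in Haskell (the entities and relations are invented purely for illustration; Blackburn and Bos, and van Eijck and Unger, develop this properly):

-- A toy domain of entities.
data Entity = Speaker | Danny | Fido deriving (Eq, Show, Enum, Bounded)

domain :: [Entity]
domain = [minBound .. maxBound]

-- The model: which entities are dogs, and who has what.
dog :: Entity -> Bool
dog e = e == Fido

have :: Entity -> Entity -> Bool
have Speaker Fido = True
have _       _    = False

-- "I have a dog": there exists an x such that dog(x) and have(I, x).
iHaveADog :: Bool
iHaveADog = any (\x -> dog x && have Speaker x) domain

Keeping track of a conversation then amounts to updating the have relation (i.e. the model) as each new sentence is processed.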
Let me give you some suggestions to start you off.
The classic comp-sem system is SHRDLU, which places geometric figures of a certain color in a virtual environment. You can play around with it, since there's a Windows-compatible demo online at the page I linked to.
The best modern book on the topic is probably Blackburn and Bos (2005). It's written in Prolog, but there are sources linked on the page for learning Prolog (now!).
Van Eijck and Unger give a good course on computational semantics in Haskell, which is a bit more recent, but in my eyes not quite as educational in terms of raw computational semantics as Blackburn and Bos.

Tools for automated porting, and languages that can compile into others

I'm just asking this out of curiosity:
Is there any tool that can automatically convert source code of reasonable complexity from one language to another?
Is there any "meta-language" that can compile into several other languages? For example, CoffeeScript compiles into JavaScript.
If you know of any open-source examples, that'd be great!
Thank you for your time.
PS: No idea how to tag this. Feel free to edit.
GCC converts complex C++ code into machine code and thus is technically an answer to your question. In fact, there are lots of compilers like this, but I don't think they are what you intended to ask about.
There are tools that are hardwired to translate just one language to another as source code (another poster suggested "f2c", which is a perfect example). These are just like compilers... but rarer.
There are virtually no tools that will map from one language to many others, out of the box. The problem is that languages have different execution models, data types, and execution schemes, which such a translator has to simulate properly in the target language.
The are "code generators" that claim to do this, but they are largely IMHO specifications of rather simple functions that translate trivially to simple code in the target langauge.
If you want to translate one language to another in a sort of general way, you need a program transformation system, e.g., a system that can parse arbitrary langauges, and for which you can provide translation rules that map to other languages in a sort of straightforward way.
Our DMS Software Reengineering Toolkit is one of these. This SO What kinds of patterns could I enforce on the code to make it easier to translate to another programming language? discusses the issues in more detail.
You can convert Fortran code to C using the f2c tool.
For Python, you can convert a subset of the language to C++ using shedskin.
The Vala language is converted to C before the real compilation.

Literate Haskell (.lhs) and Haddock

At the moment I'm only using Haddock but after seeing some really interesting examples (e.g. this gist) of literate Haskell, I'm interested in trying it out in a project.
The questions I have are:
What do you write as Haddock comments and what do you write in the literate part?
How do you scale literate programming to multiple files? Can anyone point me to an example where literate programming is used in a package with multiple modules? What is your experience of using literate programming in larger packages?
Which flavour (markdown, latex, ...) of literate Haskell is preferred?
Why are you programming in literate Haskell or plain vanilla Haskell? Are you programming in both styles and if so why?
Do you prefer block-style (\begin{code}) or Bird-style (>)? Why?
I used to write a lot of literate programs.
What do you write as Haddock comments and what do you write in the literate part?
The external API documentation goes into the Haddock comments. Everything else goes into the literate part. "Everything else" might include:
Internal invariants of data structures
Why you are doing things this way
What the design of the code is
Why this design was chosen, what other designs were tried and found wanting
How do you scale literate programming to multiple files?
The same way you scale a large LaTeX document to multiple files: one file per module, then a giant file that \includes them all.
Can anyone point me to an example where literate programming is used in a package with multiple modules?
It's not Haskell, but the Quick C-- compiler is a large functional program that is written using literate programming.
What is your experience of using literate programming in larger packages?
Literate programming works very well for documenting tricky, difficult, or complex modules. For most simple modules, the external API documentation (e.g., Haddock) is enough. And no literate program is really going to give you the big picture of a design that contains more than a dozen modules. For that you need other tools and techniques.
Which flavour (markdown, latex, ...) of literate Haskell is preferred?
If you're making such a major investment, I'd definitely go with LaTeX just because of the math capability, and the generally greater power of the tool.
Why are you programming in literate Haskell or plain vanilla Haskell? Are you programming in both styles and if so why?
My Haskell code is almost always plain vanilla, for two reasons:
I work with senior people who have more Haskell experience, and they have abandoned literate Haskell. Only the very oldest modules in their system have any chance of being .lhs.
For Haskell, literate programming is kind of superfluous. One of the big benefits of a literate-programming tool is that you are freed from any constraints that the compiler or language definition might put on the order in which your code appears. But Haskell has almost no such constraints: there's no define-before-use restriction, and for a typical function definition I have the choice of let-binding or where-binding auxiliary names (or both). Literate programming was never just about fancy comments, and with "literate" Haskell that's about all you get. It's not worth the bother.
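To make the ordering point concrete, here is a tiny made-up example: the compiler is perfectly happy even though main refers to names defined further down, and the auxiliary definition is where-bound next to its only use.

main :: IO ()
main = putStrLn (report total)   -- 'report' and 'total' appear later in the file

total :: Int
total = sum [1 .. 10]

report :: Int -> String
report n = banner n
  where
    -- auxiliary name lives right next to the code that uses it
    banner m = "total = " ++ show m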
Do you prefer block-style (\begin{code}) or Bird-style (>)? Why?
I strongly prefer block style:
It's roughly compatible with every other literate-programming tool on the planet. (Bird tracks are unique to Haskell.)
My editor copes better with block style.
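For reference, a minimal block-style fragment might look like this (a made-up example; everything outside \begin{code} ... \end{code} is treated as prose/LaTeX, and only the block is compiled):

The running sum below uses an explicit accumulator so that the
recursion is tail recursive.

\begin{code}
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go acc k
      | k > n     = acc
      | otherwise = go (acc + k) (k + 1)
\end{code}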
If you intend to share programs on the internet, I've found literate Haskell in markdown style combined with MathJax to be a great combination. The program Pandoc is seriously brilliant for taking this "markdown+lhs" input to any format you desire, including PDF or HTML. If you tell Pandoc to output HTML, you can use the --mathjax flag (or other similar flags if you prefer) to have your LaTeX math formulas render.
When using this style, I find Bird style preferable because it is just more readable to me and seems to fit in with the markdown style better.
The great thing about using Pandoc with markdown is that you can add citations to your code, math formulas, and have a really portable format. You can build something that resembles a scientific research paper but is executable and can also be posted to blogs/wikis/websites.
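As a made-up sketch of what such a "markdown+lhs" file looks like, Pandoc's --mathjax renders the formula and GHC compiles the Bird-track lines:

Gauss's formula $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ is easy to check
for small inputs:

> -- holds for every non-negative n we try
> checkGauss :: Int -> Bool
> checkGauss n = sum [1 .. n] == n * (n + 1) `div` 2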
To give an alternative to Norman's point about literate programming and code arrangement: it could be argued that Haskell is expressive enough that the problems you solve with the code are actually interesting and can really benefit from being surrounded with explanatory text. Think of a mathematical research paper. Good papers in pure mathematics have a lot of text explaining the motivation or higher-level interpretations of what the mathematical notation means. In a paper about the Navier-Stokes equations, for example, it would be super useful to surround the notation of the equations with text explaining how they relate to Newton's conservation of momentum.
In summary, I have had good success with, and recommend, the markdown+lhs style, dollar signs to embed LaTeX math formulas, Bird style, and Pandoc. I would recommend writing programs as if they were research papers and treating the Haskell itself as you would mathematical expressions in a research paper.
