I have to code in APL. Since the code is going to be maintained for a long time, I am wondering if there are some papers/books which contain heuristics/tips/samples to help in designing clean and readable APL programs.
It is a different experience from coding in other programming languages. Making a function small, for example, will not help: such a function can contain a single line of code that is completely incomprehensible.
First, welcome to the wonderful world of APL.
Writing readable and maintainable APL code is not much different than writing readable and maintainable code in any language. Any good book on writing clean code is as applicable to APL as any other language, perhaps even more so. I recommend Clean Code by Robert C. Martin.
Consider the guideline in this book that all code in a function should be at the same level of abstraction. This applies to APL 100 times over. For example, if you have a function named DoThisBigTask it should have very few APL primitive symbols in it, and certainly no long complex one-liners. It should just be a series of calls to other, lower-level functions. If these higher-level functions are all well-named and well-defined, the general drift should be easily determined by someone who does not even know APL. The lowest-level functions will be nothing but primitives and will be inscrutable to the non-APLer. Depending on how they are written they may even initially appear inscrutable to a seasoned APLer. However, these low-level functions should be short, have no side effects, and can easily be re-written rather than modified if the maintaining programmer is unable to understand the original coding technique.
In general, keep your functions short, well-named, well-defined, and to the point. And keep the lines of code even shorter. It is much more important to have well-defined and well-documented functions than it is to have well-written or well-documented lines of code.
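The "same level of abstraction" guideline is not tied to APL, so here is a minimal sketch of it in Python (chosen only because it is easy to run; every function name is made up for illustration). The top-level function reads as a list of steps, and each small helper is side-effect free and could be rewritten from scratch without touching its callers:

```python
from statistics import mean

def parse_readings(raw_lines):
    """Low-level: turn 'name,value' lines into (name, float) pairs."""
    pairs = []
    for line in raw_lines:
        name, value = line.split(",")
        pairs.append((name.strip(), float(value)))
    return pairs

def drop_outliers(pairs, limit):
    """Low-level: keep only readings whose absolute value is within limit."""
    return [(name, value) for name, value in pairs if abs(value) <= limit]

def average_value(pairs):
    """Low-level: mean of the values, or 0.0 for an empty list."""
    return mean(value for _, value in pairs) if pairs else 0.0

def summarize_readings(raw_lines, limit=100.0):
    """High-level: only calls to well-named helpers, no low-level detail."""
    pairs = parse_readings(raw_lines)
    kept = drop_outliers(pairs, limit)
    return average_value(kept)
```

In APL the three helpers would probably be dense one-liners of primitives, but the top-level function would look much the same.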
Since you asked for books and other references, I can suggest:
APL2 in Depth by Norman D. Thomson and Raymond P. Polivka. I worked with Ray Polivka for years and he was one of the best APL teachers I
have ever known.
The classic APL: An Interactive Approach by Leonard Gilman and Allen J. Rose is good for the core language, but is rather outdated and doesn't contain much that is truly relevant to readability.
APL2 at a Glance by James A. Brown and Sandra Pakin serves in some ways as an update to Gilman and Rose. It covers nested operations and other updates to APL, but does not have much specifically directed at readability. Still, if you follow the examples there you will be writing readable code.
APL is Easy by STSC and Jerry R. Turner is an intro directed specifically at the APL*Plus line. Again, not much specifically on readability, but the models are generally well-designed readable code.
Mastering Dyalog APL: A Complete Introduction to Dyalog APL by Bernard Legrand is quite good if you are specifically working in Dyalog APL, but less so if you are working in one of the other versions such as APL*Plus (from APL2000).
It is my view that the reputation of APL as a "write-only language" is much overstated. One does need to get used to the primitives and the symbols used to represent them. But then one needs to get used to the syntax and the various library functions in many other language environments. I have seen convoluted code in C, C++, and Java as hard to follow as any APL. Of course, it isn't good C, C++, or Java, even if it is clever.
Some advice:
Writing 'one-liners' is a way to test one's mastery of the language,
but is very poor practice for production code.
Comment to make the algorithm and especially the data structure being used clear. As with any code, comments should add something
that cannot be easily read from the code itself, or call attention to
complex or obscure code.
If possible avoid obscure code so there is no need to explain it. It is usually possible.
Make each function do one and only one job, with a clear interface.
Avoid global variables for the most part, and document any that are needed.
Document the interface, purpose, and effect of any function at the top. Make utilities black boxes without side-effects if possible. If side-effects are essential, document those as part of the interface.
Develop a standard header comment structure.
Dynamic code built on-the-fly can add flexibility to a solution, but is often much harder to debug if problems occur. Make such code bullet-proof to the extent you can, and build in optional logging to help when it turns out to have problems anyway.
You can use an OOP-like style if you wish. But there is no need to do so. If you do, it should IMO be used fairly pervasively through an application, except perhaps for low-level utilities. But OOP-style code can be at least as convoluted as non-OOP code, and APL doesn't have built-in inheritance or other OOP-supporting syntax.
(I'll use here "A" instead of comment, "'" instead of symbol sign.)
Well, I developed in APL for a year, and I have only used Aplusdev.org.
You don't even need more. The trick is to try to think OOP-like. If I remember correctly, you should have structured fields used as class data, something like {'attribute1 'attribute2, {value,value2}}, so you can easily pick them out like obj.attribute1 in C++.
(here: 'attribute Pick object; use this only in class functions :) )
Moreover, use namespaced functions:
namespace_classname.method(this, arg1)
namespace_classname._private_method(this, arg1, arg2)
and lots of simple tool functions instead of nifty, long lines. The performance drop is not substantial, and you can optimize later (for arrays, say) once you see that something could be faster.
And before anything: think matlab and mathematica without for loops! :) It helps a lot.
My suggestions for robust, maintainable code:
use an extensive set of utility functions instead of trickery with those unreadable symbols, so that your code is always to the point.
try-catch blocks: there is built-in exception handling, which can be utilized here;
try_begin();
A tried code, maybe in extra brackets not to forget try_end() at the end.
try_end();
catch(sth, function_here);
can be nicely implemented. (You'll see, catching errors is very important.)
crude type checking: implement a standard and use it for functions that are not called very often... (you can put a function with flexible parameters right after a function definition)
Syntax:
function(point2i, ch):
{
typecheck({{'int, [1 2]}, 'char}); A do some assertions in typecheck...
A your function goes here
}
lambda functions can be very effective, you can do some reflections to achieve lambdas.
always declare returns by saying "return"!
Unit tests based on try-catch testing each and every function you write.
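The try-catch, crude type checking and unit test suggestions above can be sketched together. This is a Python illustration of the idea, not A+ code; typecheck, attempt and repeat_char are hypothetical names, and the real A+ versions would be built on that language's own exception handling:

```python
def typecheck(args, expected_types):
    """Crude check: raise TypeError if any argument has an unexpected type."""
    if len(args) != len(expected_types):
        raise TypeError(
            f"expected {len(expected_types)} arguments, got {len(args)}"
        )
    for position, (arg, expected) in enumerate(zip(args, expected_types)):
        if not isinstance(arg, expected):
            raise TypeError(
                f"argument {position} should be {expected.__name__}, "
                f"got {type(arg).__name__}"
            )

def attempt(action, on_error):
    """Run action(); if it raises, hand the exception to on_error instead."""
    try:
        return action()
    except Exception as error:
        return on_error(error)

def repeat_char(count, ch):
    """Example function that validates its arguments before doing any work."""
    typecheck((count, ch), (int, str))
    return ch * count

# Unit tests in the same spirit: one normal call, one call that must be rejected.
assert repeat_char(3, "x") == "xxx"
assert attempt(lambda: repeat_char("3", "x"), lambda e: "rejected") == "rejected"
```

Keeping the check as the first statement of the function makes the expected interface visible right at the top, which is the point of the suggestion.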
I also used a lot of 'apply' and 'map' from mathematica, implementing my own version, they are very-very effective here.
I said to think matlab because here you can have a list of structured fields (= class data) in a variable. You will write lots of those if you wanna keep things for-loop-less (and you wanna, trust me). For that you need a standard naming convention, say, indicating lists with plurals:
namespace_class.method(objects, arg1, arg2)
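A small sketch of that plural convention, again in Python rather than A+ and with invented names: functions whose names start with a plural take a whole list of records and return a new list, so the calling code never needs an explicit loop.

```python
def make_point(x, y):
    """A 'structured field' record standing in for class data."""
    return {"x": x, "y": y}

def points_shift(points, dx, dy):
    """Plural name: takes a whole list of point records, returns a new list."""
    return [make_point(p["x"] + dx, p["y"] + dy) for p in points]

def points_xs(points):
    """Plural name: extract one attribute from every record with map."""
    return list(map(lambda p: p["x"], points))

shifted = points_shift([make_point(1, 2), make_point(3, 4)], 10, 0)
```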
Finally, I also wrote inputBox and messageBox like the ones in Javascript or VisualBasic; they make it very easy to hack together simple tools or check states. The only catch with messageBox is that it can't put the function flow on hold,
so you need
AA documentation of f1
f1():
{
A do sth
msgbox.call("Hi there",{'Ok, {'f2}});
}
f2():
{
A continue doing stuff
}
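The same split can be sketched in Python to show why it is needed. Here pending_dialogs and press are stand-ins for a GUI event queue and a user click (both are assumptions for the sake of a runnable example): because msgbox returns immediately, everything that must happen after the user answers has to live in a second function that is passed in as a callback.

```python
pending_dialogs = []  # stands in for the GUI event queue
log = []

def msgbox(text, buttons):
    """Non-blocking: remember the dialog and return immediately."""
    pending_dialogs.append((text, buttons))

def press(button):
    """Simulate the user pressing a button on the oldest open dialog."""
    text, buttons = pending_dialogs.pop(0)
    return buttons[button]()

def f1():
    log.append("f1: before dialog")
    msgbox("Hi there", {"Ok": f2})
    # control returns here at once; the rest of the work lives in f2

def f2():
    log.append("f2: after Ok")
    return "done"
```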
You can write auto-docs in bash with a gawk/sed combination to put it into a webpage.
Also creating HTML formatted code helps in printing. ;)
I hope this was a good outline for a proper build-up. Before writing your own tools, try to dig up the available tools from the legacy codebase... functions are often implemented as many as 4 times under different names, due to the mess of that time.
First, I have to say that I really like Groovy and all the good stuff it is bringing to the Java dev world. But since I'm using it for more than little scripts, I have some concerns.
In this Groovy help page about dynamic vs static typing, there is this statement about the absence of compilation error/warning when you have typo in your code because it could be a call to a method added later at runtime:
It might be scary to do away with all of your static typing and
compile time checking at first. But many Groovy veterans will attest
that it makes the code cleaner, easier to refactor, and, well, more
dynamic.
I pretty much agree with the 'more dynamic' part, but not with 'cleaner' and 'easier to refactor'.
For those other two statements I'm not sure: from my Groovy beginner's perspective, this results in less code, but code that is more difficult to read later and more trouble to maintain (you can no longer rely on the IDE to find who declares a dynamic method and who uses one).
To clarify, I find reading Groovy code very pleasant, and I love collections and closures (a concise and expressive way to tackle complicated problems).
But I have a lot of trouble in these situations:
no more auto-completion inside 'builder' using Map (of Map (of Map)) everywhere
confusing dynamic method calls (you don't know if it is a typo or a dynamic name)
method extraction is more complicated inside closure (often resulting in code duplicate: 'it is only a small closure after all')
hard to guess closure parameters when you have to write one for a method of a subsystem
no more learning by browsing the code: you have to use text search instead
I can only see some benefits with GORM, but in this case the dynamic methods are well known and my IDE is aware of them (so to me it looks more like systematic code generation than dynamic methods).
I would be very glad to learn from Groovy veterans how they can attest to these benefits.
It does lead to different classes of bugs and processes. It also makes writing tests faster and more natural, helping to alleviate the bug issues.
Discovering where behavior is defined, and used, can be problematic. There isn't a great way around it, although IDEs are getting better at it over time.
Your code shouldn't be more difficult to read--mainline code should be easier to read. The dynamic behavior should disappear into the application, and be documented appropriately for developers that need to understand functionality at those levels.
Magic does make discovery more difficult. This implies that other means of documentation, particularly human-readable tests (think easyb, spock, etc.) and prose, become that much more important.
This is somewhat old, but i'd like to share my experience if someone comes looking for some thoughts on the topic:
Right now we are using Eclipse 3.7 and groovy-eclipse 2.7 on a small team (3 developers), and since we don't have test scripts, we do most of our Groovy development with explicit types.
For example, when using service classes methods:
void validate(Product product) {
// groovy stuff
}
Box pack(List<Product> products) {
def box = new Box()
box.value = products.inject(0) { total, item ->
// some BigDecimal calculations =)
}
box
}
We usually fill out the type, which enables Eclipse to autocomplete and, most importantly, allows us to refactor code, find usages, etc.
This blocks us from using metaprogramming, except for Categories, which I found are supported and detected by groovy-eclipse.
Still, Groovy is pretty good and a LOT of our business logic is in groovy code.
We had two issues in production code when using Groovy, and both cases were due to bad manual testing.
We also have a lot of XML building and parsing, and we validate it before sending it to webservices and the likes.
There's a small script we use to connect to an internal system whose usage is very restricted (and not needed in other parts of the system). This code i developed using entirely dynamic typing, overriding methods using metaclass and all that stuff, but this is an exception.
I think Groovy 2.0 (with groovy-eclipse coming along, of course) and its @TypeChecked will be great for those of us who use Groovy as a "java++".
To me there are 2 types of refactoring:
IDE based refactoring (extract to method, rename method, introduce variable, etc.).
Manual refactoring. (moving a method to a different class, changing the return value of a method)
For IDE based refactoring I haven't found an IDE that does as good of a job with Groovy as it does with Java. For example in eclipse when you extract to method it looks for duplicate instances to refactor to call the method instead of having duplicated code. For Groovy, that doesn't seem to happen.
Manual refactoring is where I believe that you could see refactoring made easier. Without tests though I would agree that it is probably harder.
The statement about cleaner code is 100% accurate. I would venture a guess that good Java to good Groovy code is at least a 3:1 reduction in lines of code. Being a newbie at Groovy, though, I would strive to learn at least 1 new way to do something every day. Something that greatly helped me improve my Groovy was to simply read the APIs. I feel that Collection, String, and List are probably the ones that have the most functionality, and they are the ones I used the most to help make my Groovy code actually Groovy.
http://groovy.codehaus.org/groovy-jdk/java/util/Collection.html
http://groovy.codehaus.org/groovy-jdk/java/lang/String.html
http://groovy.codehaus.org/groovy-jdk/java/util/List.html
Since you edited the question I'll edit my answer :)
One thing you can do is tell IntelliJ about the dynamic methods on your objects; see the question "What does 'add dynamic method' do in Groovy/IntelliJ?". That might help a little bit.
Another trick that I use is to type my objects when doing the initial coding and remove the typing when I'm done. For example I can never seem to remember if it's .substring(..) or .subString(..) on a String. So if you type your object you get a little better code completion.
As for your other bullet points, I'd really need to look at some code to be able to give a better answer.
Possible Duplicates:
Is there any advantage of being a case-sensitive programming language?
Why are many languages case sensitive?
Something I have always wondered, is why are languages designed to be case sensitive?
My pea brain can't fathom any possible reason why it is helpful.
But I'm sure there is one out there. And before anyone says it, having a variable called dog and Dog differentiated by case sensitivity is really really bad practise, right?
Any comments appreciated, along with perhaps any history on the matter! I'm insensitive about case sensitivity generally, but sensitive about sensitivity around case sensitivity so let's keep all answers and comments civil!
It's not necessarily bad practice to have two members which are only differentiated by case, in languages which support it. For example, here's a fairly common bit of C#:
private readonly string name;
public string Name { get { return name; } }
Personally I'm quite happy with case sensitivity - particularly as it allows code like the above, where the member variable and property follow conventions anyway, avoiding confusion.
Note that case-sensitivity has a culture aspect too... not all cultures will deem the same characters to be equivalent...
One of the biggest reasons for case-sensitivity in programming languages is readability. Things that mean the same should also look the same.
I found the following interesting example by M. Sandin in a related discussion:
I used to believe case sensitivity was a mistake, until I did this in the case insensitive language PL/SQL (syntax now entirely forgotten):
function IsValidUserLogin(user:string, password :string):bool begin
result = select * from USERS
where USER_NAME=user and PASSWORD=password;
return not is_empty(result);
end
This passed unnoticed for several months on a low-volume production system, and no harm came of it. But it is a nasty bug, sprung from case insensitivity, coding conventions, and the way humans read code. The lesson for me was that: Things that are the same should look the same.
Can you see the problem immediately? I couldn't...
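For readers who want it spelled out: as I read the quoted query, the parameter password and the column PASSWORD are the same name once case is ignored, so the clause PASSWORD=password ends up comparing the column with itself and any password is accepted. The Python sketch below imitates that name lookup. The rule that a column shadows a parameter of the same name is an assumption of this sketch, and resolve and password_matches are invented for illustration.

```python
def resolve(name, row, params):
    """Look a name up ignoring case; columns shadow parameters (assumed rule)."""
    key = name.lower()
    columns = {column.lower(): value for column, value in row.items()}
    if key in columns:
        return columns[key]
    return params[key]

def password_matches(row, params):
    """Mimics the clause PASSWORD = password from the quoted query."""
    return resolve("PASSWORD", row, params) == resolve("password", row, params)

row = {"USER_NAME": "alice", "PASSWORD": "s3cret"}
wrong_guess = {"user": "alice", "password": "letmein"}

# The wrong password still "matches": both sides resolved to the column value.
accepted = password_matches(row, wrong_guess)
```

In a case-sensitive language the two spellings would be different names, and the convention of upper-case columns and lower-case parameters would actually carry meaning.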
I like case sensitivity in order to differentiate between class and instance.
Form form = new Form();
If you can't do that, you end up with variables called myForm or form1 or f, which are not as clean and descriptive as plain old form.
Case sensitivity also means that you don't have references to form, FORM and Form which all mean the same thing. I find it difficult to read such code. I find it much easier to scan code where all references to the same variable look exactly the same.
Something I have always wondered, is why are languages designed to be case sensitive?
Ultimately, it's because it is easier to implement a case-sensitive comparison correctly; you just compare bytes/characters without any conversions. You can also do other things like hashing really easily.
Why is this an issue? Well, case-insensitivity is rather hard to add unless you're in a tiny domain of supported characters (notably, US-ASCII). Case conversion rules vary by locale (the Turkish rules are not the same as those in the rest of the world) and there's no guarantee that flipping a single bit will do the right thing, or that it is always the same bit and under the same preconditions. (IIRC, there's some really complex rules in some language for throwing away diacritics when converting vowels to upper case, and reintroducing them when converting to lower case. I forget exactly what the details are.)
If you're case sensitive, you just ignore all that; it's just simpler. (Mind you, you still ought to pay attention to Unicode normalization forms, but that's another story and it applies whatever case rules you're using.)
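A few concrete cases, shown with Python's built-in string methods and the standard unicodedata module, illustrate why case conversion is not a one-bit flip. The helper caseless_equal is my own name; it uses str.casefold, which is locale-independent and therefore does not apply the Turkish rules mentioned above.

```python
import unicodedata

# Upper-casing can change the length of a string.
sharp_s = "stra\u00dfe"              # "straße"
assert sharp_s.upper() == "STRASSE"

# Lower-casing the capital dotted I (U+0130) yields two code points by default.
dotted_i = "\u0130"
assert dotted_i.lower() == "i\u0307"

def caseless_equal(a, b):
    """Compare two strings ignoring case, using full Unicode case folding."""
    return a.casefold() == b.casefold()

# Normalization is a separate issue: same visible text, different code points.
composed = "\u00e9"                  # e with acute accent, one code point
decomposed = "e\u0301"               # e followed by a combining acute accent
assert composed != decomposed
assert unicodedata.normalize("NFC", decomposed) == composed
```

A case-sensitive comparison needs none of the first three steps, only the normalization question at the end.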
Imagine you have an object called dog, which has a method called Bark(). Also you have defined a class called Dog, which has a static method called Bark(). You write dog.Bark(). So what's it going to do? Call the object's method or the static method from the class? (in a language where :: doesn't exist)
I'm sure originally it was a performance consideration. Converting a string to upper or lower case for caseless comparison isn't an expensive operation exactly, but it's not free either, and on old systems it may have added complexity that the systems of the day weren't ready to handle.
And now, of course, languages like to be compatible with each other (VB for example can't distinguish between C# classes or functions that differ only in case), people are used to naming things the same text but with different cases (See Jon Skeet's answer - I do that a lot), and the value of caseless languages wasn't really enough to outweigh these two.
The reason you can't understand why case-sensitivity is a good idea, is because it is not. It is just one of the weird quirks of C (like 0-based arrays) that now seem "normal" because so many languages copied what C did.
C uses case-sensitivity in identifiers, but from a language design perspective that was a weird choice. Most languages that were designed from scratch (with no consideration given to being "like C" in any way) were made case-insensitive. This includes Fortran, Cobol, Lisp, and almost the entire Algol family of languages (Pascal, Modula-2, Oberon, Ada, etc.)
Scripting languages are a mixed bag. Many were made case-sensitive because the Unix filesystem was case-sensitive and they had to interact sensibly with it. C kind of grew up organically in the Unix environment, and probably picked up the case-sensitive philosophy from there.
Case-sensitive comparison is (from a naive point of view that ignores canonical equivalence) trivial (simply compare code points), but case-insensitive comparison is not well defined and extremely complex in all cases, and the rules are impossible to remember. Implementing it is possible, but will inadvertently lead to unexpected and surprising behavior. BTW, some languages like Fortran and Basic have always been case-insensitive.
I'm coding in an embedded language called JS.
I want to be able to call three functions in any order. (ABC, ACB, BAC, BCA, CBA, CAB.)
The trick? The language doesn't have user-defined functions.
It does have a conditional and a looping construct.
I think I have three choices.
1. Duplicate a whole bunch of code.
2. Write a preprocessor (that would create all the duplicated code).
3. Do a loop with three iterations, using an array to control which functionality gets called on each pass of the loop.
I hate #1. Duplicated code is nasty. How do I change anything without screwing up?
I guess #2 is OK. At least I don't have duplicated code in the source. But my output code is what I'll be debugging, and I wonder if I want to diverge from it. On the plus side, I could add a bunch of sugar to the language.
I think my best bet is #3.
Any other ideas? There is no goto. No functions. No existing preprocessor.
Funny thing about #3 is that it's essentially the infamous for/switch nightmare.
Perhaps some kind of mutant state-machine, viz:
int CODEWORD=0x123;
while (CODEWORD)
{
    switch(CODEWORD&15)
    {
    case 1:
        /// case 1
        break;
    case 2:
        /// case 2
        break;
    case 3:
        //// case 3
        break;
    }
    CODEWORD=CODEWORD>>4;
}
DRY, no preprocessor, no array. for/switch seems somewhat unavoidable.
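Both the array-driven loop from the question and the packed codeword variant can be sketched in Python. The two def wrappers exist only so the sketch can be run and tested; in a language without user-defined functions, the body of run_in_order is what would be written inline, using nothing but one loop and one conditional chain. All names are hypothetical.

```python
def run_in_order(order):
    """order is a sequence such as ["B", "A", "C"]; returns a trace of steps run."""
    trace = []
    for step in order:           # the single loop
        if step == "A":          # the conditional picks the block for this pass
            trace.append("did A")
        elif step == "B":
            trace.append("did B")
        elif step == "C":
            trace.append("did C")
    return trace

def decode_codeword(codeword):
    """Unpack 4-bit step numbers, lowest nibble first, as the C sketch does."""
    steps = []
    while codeword:
        steps.append(codeword & 15)
        codeword >>= 4
    return steps
```

Note that with the codeword 0x123 the lowest nibble is consumed first, so the blocks run in the order 3, 2, 1.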
You might be able to use the C preprocessor instead of writing your own. That would at least let you try it to see if it's a workable solution.
The technically best solution (assuming that you have access to the code or the developers) is to modify the JS language to do what you really need.
Failing that, the best solution depends on aspects of the problem that you haven't explained:
1. are the 'functions' recursive?
2. are there function parameters?
3. do you need (are you likely to need) other control structures not provided in JS?
4. does the function call order depend on runtime parameters?
5. are you skilled and confident enough to design and implement a preprocessor language that meets your current and projected requirements?
6. is implementing a preprocessor going to save you / coworkers time in the long run?
If the answers to 5. and enough of the others are "yes", then your option #2 is the right answer. Otherwise ... an ugly solution like your #1 or #3 might actually be a better idea.
EDIT: If you don't have source code access and the development team is not responsive to your needs, consider looking for an open-source alternative.
I'm starting on a project where strings are written into the code most of the time. Many strings might only be used in a few places but some strings are common throughout many pages.
Is it a good use of my time to refactor the literals into constants being that the app is pretty well established and runs well? What would be the long-term benefits to doing so?
One common thing to consider would be i18n. If you (or your muckity-mucks) ever want to sell your product in Mexico or France (etc.) you're going to appreciate having those string literals not littered throughout the code base.
EDIT: I realize this doesn't directly answer your question, so I'm voting up some of the other answers re: rule of three, and the like. I understand you're talking about an existing code base, so it's a little late to talk about incorporating i18n from the start. It's so easy to do when you're in the habit from the start.
I like to apply the rule of three when refactoring. If it happens three or more times, then the code needs to be updated.
This is a good use of time only if the project needs to be supported into the future. If you will be regularly maintaining/expanding this system, however, this is a great idea.
1) There is a large degree of risk associated with string literals as a single misspelling can usually only be detected at run time. The reduced risk of run time errors is a serious advantage as they can be embarrassing/frustrating.
2) Also, should they ever need to be changed, for example when they are used to reference another system (like table names, server names etc) they can be very difficult to update when those other system names change. Centralize them and it's a trivial issue.
If a string is used in more than one place, refactor it. If it is only used in one place, leave it alone.
If you've refactored out all your common strings, it makes it easier to internationalize/translate them. It's even easier if they're all in properties files, or whatever your language equivalent is.
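As a rough sketch of what that centralization can look like (in Python, with made-up names such as ORDERS_TABLE, MESSAGES and message; a real project would more likely use its platform's properties or resource files):

```python
# Names of other systems live in one place, so a rename is a one-line change.
ORDERS_TABLE = "orders"

# User-facing text is keyed by language, which is what makes translation easy.
MESSAGES = {
    "en": {"greeting": "Welcome back, {name}!"},
    "fr": {"greeting": "Bon retour, {name} !"},
}

def message(key, lang="en", **values):
    """Look up a message, falling back to English for unknown languages."""
    catalog = MESSAGES.get(lang, MESSAGES["en"])
    template = catalog.get(key, MESSAGES["en"][key])
    return template.format(**values)

def count_query():
    """Build a query from the shared constant instead of a repeated literal."""
    return f"SELECT COUNT(*) FROM {ORDERS_TABLE}"
```

A misspelled constant name such as ORDRES_TABLE fails loudly with a NameError the first time that line runs, whereas a misspelled literal "ordres" would only surface as a wrong result from the other system.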
Is it a good use of my time to refactor the literals into constants being that the app is pretty well established and runs well?
No, you better leave it like it is.
What would be the long-term benefits to doing so?
If no one ever touches that code, the benefits are none.
What you can do, however, is avoid adding new literals. But I would pretty much leave the existing ones the way they are.
You could probably refactor them in your free time to sleep better.
Probably there are some other bugs already that need your attention. Fix those instead.
Finally, if you manage to add "refactoring" to your task list, go ahead!!!
I agree with JMD; just keep in mind that there is more to i18n than changing strings (currencies, a UI that must be adapted to right-to-left languages, etc.).
Even if you don't wish to i18n your application, it would be useful to refactor your strings, since a string that is used only once today may be reused several times tomorrow, and if it's hardcoded you may not be aware of it and start replicating the string all over the place.
Best let sleeping dogs lie. If you need to change a string that's used eighteen-lumpy times, yeah, go ahead and turn it into a constant somewhere. If you find yourself working in a module that has a string that could be constant-ized, do it if you feel like it. But going through the whole app changing all strings to constants... that should be on the very bottom of the to-do list.