Related
Over the last few decades, multiple algebraic modeling languages (AMLs) have been created; AMPL, GAMS, and AIMMS are among the most well-known and widely used. I've mainly been using the first two.
However, new AMLs have recently been created and now have excellent community support; Pyomo and JuMP are probably the two most promising. I have tried both a little. The obvious point is that for people with experience in Python or Julia, these two AMLs are great tools, and the learning curve is much gentler.
What are other main benefits (maybe new features, better efficiency, extended functionality) and motivation in general for creating these new AMLs?
I guess the most prominent reasons would be unified handling and support for many different solvers, clever use of data structures, interfaces to other commonly used tools such as Excel and MATLAB, and convenient embedding in programs written in popular languages. Since optimization for standard problems has increasingly become a black-box tool, it can readily be applied in the context of, e.g., web-based applications, which are often written in object-oriented code.
I have a Gurobi licence and I am after a good MILP/LP modelling language, which should be
1) free/open source
2) intuitive, i.e. something that looks like (taken from MiniZinc)
var int: x;
constraint x >= 0.5;
solve minimize x;
3) fast: the time to build the model and send it to Gurobi should be of similar order to the best ones (AMPL, GAMS, etc.)
4) flexible/powerful (ability to deal with 3D+ arrays, activate/deactivate constraints easily, provide initial solutions to the solver, etc.)
Of course, and correct me if I'm wrong, AMPL GAMS fail at 1), Python and R fail at 2) (and perhaps at 3)?).
How about GLPK, MiniZinc, ZIMPL, etc.? They satisfy 1) and 2), but what about 3) and 4)? Are they as good as AMPL in this regard? If not, is there a modelling language satisfying 1-4?
I've used AMPL with Gurobi for mid-sized MIPs (~ 100k-1m variables?) and MiniZinc, mostly with Gecode, for smaller combinatorial problems. I've seen some Gurobi work done with R and Python, but haven't used it that way myself.
I'm less familiar with the other options. My understanding is that GAMS is quite similar to AMPL and much of what I have to say about AMPL may also be valid for GAMS, but I can't vouch for it.
Of course, and correct me if I'm wrong, AMPL GAMS fail at 1),
Yes, generally. There is an exception which probably isn't helpful for your specific requirements but might be useful to others: you can get free use of AMPL, Gurobi, and many other optimisation products, by using the NEOS web service. This is restricted to academic non-commercial purposes and you have to grant NEOS certain rights in relation to the problems you send them; definitely read those terms of service before using it. It also requires waiting for an available server, so if speed is a high priority this probably isn't the solution for you.
Python and R fail at 2) (and perhaps at 3)?).
In my limited experience, yes for (2). AMPL, GAMS, and MiniZinc are designed specifically for defining optimisation problems, so it's unsurprising that their syntax is more user-friendly for that purpose than languages like Python and R.
The flip-side to this is that if you want to do just about anything other than defining an optimisation problem with these languages, Python/R/etc. will probably be better for that purpose.
On speed: for the problems I usually work with, AMPL takes maybe a couple of seconds to build and presolve a MIP model which takes Gurobi a couple of minutes to solve. Obviously this is going to vary somewhat with hardware and details of the problem, but in general I would expect build time to be small compared to solve time for any of the solutions under discussion. Even with a good solver like Gurobi, big MIPs are hard. Many of the serious optimisation programmers I've met do use Python, so I presume the performance side is good enough.
However, that doesn't mean the choice of language/platform is irrelevant to speed. One of the nice features of AMPL (and also GAMS) is presolve, which attempts to reduce the problem size before sending it to the solver. My standard problems have a lot of redundant variables and constraints; AMPL identifies and eliminates many of these, reducing the problem size by about 80% and giving a noticeable improvement in solver time (as compared to runs where I switch off presolve, which I sometimes do for debugging-related reasons). This might be a consideration if you expect a lot of redundancy.
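To make the idea of presolve a bit more concrete, here is a toy sketch in Python. It is not AMPL's actual presolve, just an illustration under my own assumptions: the function name toy_presolve, the data layout, and the example numbers are all made up. It only does two things: it substitutes out variables whose lower and upper bounds coincide, and it drops constraints that become empty or turn out to be duplicates.

```python
from typing import Dict, List, Tuple

# A constraint is (coefficients by variable name, sense, right-hand side).
Constraint = Tuple[Dict[str, float], str, float]
Bounds = Dict[str, Tuple[float, float]]


def toy_presolve(bounds: Bounds, constraints: List[Constraint]):
    """Hypothetical, heavily simplified presolve pass (illustration only)."""
    # Variables with identical lower and upper bounds are effectively constants.
    fixed = {v: lo for v, (lo, hi) in bounds.items() if lo == hi}

    seen = set()
    reduced: List[Constraint] = []
    for coeffs, sense, rhs in constraints:
        # Move the contribution of fixed variables to the right-hand side.
        new_rhs = rhs - sum(c * fixed[v] for v, c in coeffs.items() if v in fixed)
        new_coeffs = {v: c for v, c in coeffs.items() if v not in fixed and c != 0}
        if not new_coeffs:
            # No free variables left. A real presolve would also check
            # whether "0 <sense> new_rhs" holds and report infeasibility.
            continue
        key = (tuple(sorted(new_coeffs.items())), sense, new_rhs)
        if key in seen:
            continue  # exact duplicate of a constraint we already kept
        seen.add(key)
        reduced.append((new_coeffs, sense, new_rhs))

    free_bounds = {v: b for v, b in bounds.items() if v not in fixed}
    return free_bounds, reduced, fixed


# Made-up example: y is fixed at 3, and the first constraint appears twice.
bounds = {"x": (0, 10), "y": (3, 3), "z": (0, 5)}
constraints = [
    ({"x": 1, "y": 2}, "<=", 10),
    ({"x": 1, "y": 2}, "<=", 10),
    ({"y": 1}, ">=", 1),
    ({"x": 1, "z": 1}, ">=", 2),
]
free_bounds, reduced, fixed = toy_presolve(bounds, constraints)
print(len(bounds), "->", len(free_bounds), "variables")
print(len(constraints), "->", len(reduced), "constraints")
```

On this example the model shrinks from three variables and four constraints to two of each. Real presolvers go much further (bound tightening, for instance), so treat this purely as a picture of why redundancy can make presolve pay off.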
flexible/powerful (ability to deal with 3D+ arrays, activate/deactivate constraints easily, provide initial solutions to the solver, etc.)
MiniZinc handles up to 6D arrays, which may or may not be enough depending on your applications.
It's more flexible than AMPL in some areas and less so in others. AMPL has a lot of set-based functionality that I find useful (e.g. I can define a variable whose index set is something like "pairs of non-identical cities separated by no more than 500 km") and MiniZinc doesn't have this. OTOH, MiniZinc seems to be better than AMPL for solver-hopping, e.g. if I write a MZ model with a combinatorial constraint like "alldifferent" but then try to run it on a solver that doesn't recognise such constraints, MZ will translate it into something the solver can deal with.
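For comparison, this is roughly how the "pairs of non-identical cities separated by no more than 500 km" index set could be built in a general-purpose language. It is a minimal Python sketch using only the standard library; the city names, coordinates and the flow variable names are invented for illustration, and a real model would create solver variables where I create placeholder strings.

```python
from itertools import permutations
from math import hypot

# Hypothetical planar coordinates in km (made up for illustration).
cities = {
    "A": (0.0, 0.0),
    "B": (300.0, 300.0),
    "C": (900.0, 0.0),
    "D": (600.0, 0.0),
}

MAX_KM = 500.0


def distance_km(a: str, b: str) -> float:
    """Straight-line distance between two cities in the table above."""
    (x1, y1), (x2, y2) = cities[a], cities[b]
    return hypot(x2 - x1, y2 - y1)


# Ordered pairs of distinct cities that are at most MAX_KM apart.
# permutations(..., 2) never yields a city paired with itself.
close_pairs = [
    (a, b) for a, b in permutations(cities, 2) if distance_km(a, b) <= MAX_KM
]

# One placeholder "variable" per pair; a modelling library would
# create a solver variable here instead of a string.
flow = {(a, b): f"flow_{a}_{b}" for a, b in close_pairs}

print(close_pairs)
```

The comprehension does the same job as the set expression, but the filtering logic lives in ordinary code rather than in the modelling language, which is the trade-off discussed above.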
I haven't tried deactivating constraints in MZ other than by commenting them out, so I can't help there, and similarly on providing initial solutions.
Overall, MiniZinc is a good choice to consider. Some pluses and minuses relative to AMPL ("free" being a big plus!) but it fills a similar niche.
IMHO, there is no such system if you consider the Python interfaces/modeling environments to SCIP or Gurobi too complicated:
x = model.addVar()
y = model.addVar(vtype="INTEGER")
model.setObjective(x + y)
model.addCons(2*x - y*y >= 0)
model.optimize()
To me this looks quite natural and straightforward. The immense benefit of using an actual programming language instead of a modeling language is that you can do anything in there, while there will always be boundaries in the latter.
If you are looking for a modeling GUI, you should check out LITIC. It can be used almost entirely with drag-and-drop operations: https://litic.com/showcase.html
I've used a lot of the options mentioned, and some not yet mentioned
GAMS
GAMS' Python API
GAMS' MATLAB API
AMPL
FICO Xpress Mosel
FICO Xpress Model's Python API
IBM ILOG OPL
Gurobi's Python API
PuLP (Python)
Pyomo (Python)
Python-MIP
JuMP (Julia)
MATLAB Optimization Toolbox
Google OR-Tools
Based on your requirements, I'd suggest trying Python-MIP, PuLP or JuMP. They are free and have easy syntax with no limit on array dimensionality.
Take a look at Google OR-Tools. I'm not sure whether providing an initial solution to the solver is available in all of its interfaces, but if you use it from Python, it should probably satisfy all of 1-4.
In the context of programming language discussion/comparison, what does the term "power" mean?
Does it have a well defined meaning? Even a poorly defined meaning?
Say if someone says "language X is more powerful than language Y" or asks the same as a question, what do they mean - or what information are they trying to find out?
It does not have a well-defined meaning. In these types of discussions, "language X is more powerful than language Y" usually means little more than "I like language X more than language Y." On the other end of the spectrum, you'll also usually have someone chime in about how any Turing-complete language can accomplish the same tasks as any other Turing-complete language, so that neither is strictly more powerful than the other.
I think a good meaning for it is expressivity. When a language is highly expressive, it means less code is required to express concepts. To me, this doesn't just mean that you have to write less code to accomplish the same tasks, but also that the code is easily readable by humans. Of course, generally (to a point), having fewer lines of code to read makes the task of reading and understanding easier for humans.
Having a "powerful" standard library comes into play here along the same lines. If a language comes equipped with thorough, complete libraries, then idiomatic code in that language will be able to benefit from the existing library code and not have to repeat or reinvent common functionality in application code. The end result is, again, having to write and read less code to accomplish the same tasks.
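As a small illustration of the standard-library point, here is the same task (counting how often each word occurs) written by hand and with a library class. This is a Python sketch; the sample sentence and the two function names are my own, chosen only for the example.

```python
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()


def count_by_hand(items):
    """Frequency count written out explicitly, with no library help."""
    counts = {}
    for item in items:
        if item in counts:
            counts[item] += 1
        else:
            counts[item] = 1
    return counts


def count_with_library(items):
    """The same frequency count, delegated to the standard library."""
    return Counter(items)


# Both versions produce the same mapping from word to count.
assert count_by_hand(words) == dict(count_with_library(words))
print(count_with_library(words).most_common(1))
```

The second version is shorter to write and, arguably, quicker to read, which is the sense of "powerful" I have in mind here.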
I keep saying "generally" and "to a point", because once a language gets too terse, it gets more difficult for humans to decipher. I suppose at this extreme, a language may still be considered "more powerful" (or even "too powerful"). So I guess I'm saying my personal interpretation of "powerful" includes some aspects of "useful" and "readable" in it as well.
C is powerful, because it is low level and gives you access to hardware. Python is powerful because you can prototype quickly. Lisp is powerful because its REPL gives you fantastic debugging opportunities. SQL is powerful because you say what you want and the DBMS will figure out the best way to do it for you. Haskell is powerful because each function can be tested in isolation. C++ is powerful because it has ten times the number of syntactic constructs that any one person ever needs or uses. APL is powerful since it can squeeze a ten-screen program into ten characters. Hell, COBOL is powerful because... why else would all the banks be using it? :)
"Powerful" has no real technical meaning, but lots of people have made proposals.
A couple of the more interesting ones:
Paul Graham wants to call a language "more powerful" if you can write the same programs in fewer lines of code (or some other sane, sensible measure of program size).
Matthias Felleisen has written a very serious theoretical study called On the Expressive Power of Programming Languages.
As someone who knows and uses many programming languages, I believe that there are real differences between languages, and that "power" can be a convenient shorthand to describe ways in which one language might be better than another. Nevertheless, whenever I hear a discussion or claim that one language is more powerful than another, I tend to keep one hand firmly on my wallet.
The only meaningful way to describe "power" in a programming language is "can do what I require with the least amount of resources" where "resources" is defined as "whatever costs I'd rather not pay" and could, thus, be development time, CPU time, memory space, money, etc.
So basically the definition of "power" is purely subjective and rendered meaningless in any objective discussion.
Powerful means "high in power". "Power" is something that increases your ability to do things. "Things" vary in shape, size and other things. Loosely speaking therefore, "powerful" when applied to a programming language means that it helps you perform your tasks quickly and efficiently.
This makes "powerful" somewhat well defined but not constant across domains. A language powerful in one domain might be crippling in another eg. C is very powerful if you want to do systems level programming since it gives you direct access to the machine and hardware and structures that let you code much faster than you would in assembly. C compilers also produce tight code that runs fast. However, once you move to web applications, C can become very "unpowerful" and crippling since it's so much effort to get something up and running and you have to worry about a lot of extraneous details like memory etc.
Sometimes, languages are "powerful" in multiple domains. This gives them a general "powerful" tag (or badge, since we are on SO here). PG's claim is that with LISP, this is the case. That might be true or might not be.
At the end of the day, "powerful" is a loaded word, so you should evaluate who is saying it, why he's saying it and what it means to your work.
There are really only two meanings people are worried about:
"Powerful" in the sense of "takes less resources (time, money, programmers, LOC, etc.) to achieve the same/better result", and "powerful" in the sense of "is capable of doing a wide range of tasks".
Some languages are extremely resource-effective for a small range of tasks. Others are not so resource-effective but can be applied to a wide range of tasks (e.g. C, which is often used in OS development, creation of compilers and runtime libraries, and work with microcontrollers).
Which of these two meanings someone has in mind when they use the term "powerful" depends on the context (and even then is not always clear). Indeed often it is a bit of both.
Typically there are two distinct meanings:
Expressive, meaning the code tends to be very short and understandable
Low level, meaning you have very fine-grained control over the hardware.
For most languages, these two definitions are at opposite ends of the spectrum: Python is very expressive but not very low level; C is very low level but not very expressive. Depending on which definition you pick, either language is powerful or not powerful.
Nothing, absolutely nothing.
To high-level programmers it might mean a lot of built-in datatypes. Or maybe abstractions to easily create or follow Design Patterns.
Paul Graham is a very high-level guy; here is what he has to say:
http://www.paulgraham.com/avg.html
Java guys might tell you something about portability, the power to reach every platform.
C/UNIX programmers may tell you that it's speed and efficiency, complete control over every inch of memory.
VHDL/Verilog programmers will tell you it's complete control over every clock and gate so as to not waste any electricity or time.
But in my opinion a "powerful language" supports all of the features you need to complete your task. Documentation may be important, or perhaps it is portability, or the ability to do graphics. It could be anything; writing a GUI in assembly is just stupid, and so is trying to design an embedded processor in Flash.
Choosing a language that suits your needs perfectly will always feel like power.
I view the term as marketing fluff, no one well-defined meaning.
Consider, say, assembler, C, and C++. On occasion one drops from C++ "down" to C for particular needs, and in turn from C down to assembler. So that makes assembler the most powerful, because it's the only language that can do everything. Or, to argue the other way, a single line of C++ code can replace several of C (hiding polymorphic dispatch via function pointers, for example) and a single line of C replaces many of assembler. So C++ is more powerful because one line does "more".
I think the term had some currency when products such as early databases and spreadsheets had in-built languages, some quite restricted. So vendors would tout their language as being "powerful" because it was less restricted.
It can have several meanings. In the very basic sense there's power as far as what is computable. In that sense the most powerful languages are Turing Complete which includes pretty much every general purpose programming language (as opposed to most markup languages and domain specific languages which are often not Turing complete).
In a more pragmatic sense it often refers to how concisely (and readably) you can do certain things. Basically how easy is it to do certain tasks in one language compared to another.
What language is more powerful (besides being somewhat subjective) depends heavily on what you're trying to do. If your requirements are to get something running on a small device with 64k of memory you're likely not going to be using Java. Most likely the right language would be C or C++ (or if you're really hard core assembly). If you need a very simple CRUD app done in 1 day, maybe something like Ruby On Rails would be the way to go (I know Rails is a framework and Ruby is the language, but these days what libraries and frameworks are available factor greatly into picking a language)
I think that, perhaps coincidentally, the physics definition of power is relevant here: "The rate at which work is performed."
Of course, a toaster does not perform very quickly the work of putting out fires. Similarly, the power of a programming language is not universal, but specific to the domain or task to which it is being applied. C is a powerful language for writing device drivers or implementations of higher-level languages; Python is a powerful language for writing general-purpose applications; XPath is a powerful language for writing queries on structured data sets.
So given a problem domain, the power of a language can be said to be the rate at which a competent programmer is able to use it to solve problems in that domain.
A precise answer can be attempted, as long as we don't assume that the elements defining "powerful" (in the context of languages) come from too many dimensions.
Consider how many there could be (and many are still missing from this list):
runtime speed
code size
expressiveness
supported paradigms
development / debugging time
domain specialization
standard libs
codebase
toolchain ecosystem
portability
community
support / documentation
popularity
(add more here)
These and other parameters together draw a picture of what "programming in some language" would be like at some level. That is only the definition, though; the only real knowledge comes with the actual practice of using the language, but I digress.
The question comes down to which parameter will represent the intrinsic quality of a language. If you refer to a language in itself, its ultimate, intrinsic purpose is "express things", and thus the most representative parameter is rightfully expressiveness, and is also one that resonates frequently when someone talks about how powerful a language is.
At the moment you try to widen the question/answer to cover more than the expressiveness of the language "as a language, as a tongue", you are more talking about different kinds of "environment", social environment, development environment, commercial environment, etc.
Depending on the complexity of the environment to be defined, you'll have to mix more parameters that come from multiple, vast, overlapping and sometimes contradictory dimensions, and eventually the point of getting the definition will be lost or the question will have to be narrowed.
This approximation still won't answer "what is an expressive language", but, again, a common understanding is given by the definitions that Vineet points out in his answer and that Forest remarks on in the comments. I agree; for me, "expression" is "conveying meaning".
I remember many instructors in college calling whatever language they were teaching "powerful".
Leads me to think:
Powerful = a relative term comparing the latest way to code something vs. the original or previous way.
I find it useless to use the word "powerful" when discussing anything software related. Every time my professor in college would introduce a new concept such as polymorphism he would say "so this is a really powerful feature". After a while I got annoyed. If everything is powerful then nothing is. It's all the same. You can write code to do anything. Does it really matter how much code is required to do it? You can say it's short or efficient, but powerful is just useless. Nuclear energy is powerful. Code is words.
I think that power would normally refer to how quickly it can process data; for example, I found that in Python as soon as a list exceeds a length of approx. 2000 it becomes unbearably slow, whereas in C++ a list can easily contain 20,000 entries without doing so.
I'm hoping to use either Haskell or OCaml on a new project because R is too slow. I need to be able to use support vector machines, ideally separating out each execution to run in parallel. I want to use a functional language and I have the feeling that these two are the best so far as performance and elegance are concerned (I like Clojure, but it wasn't as fast in a short test). I am leaning towards OCaml because there appears to be more support for integration with other languages so it could be a better fit in the long run (e.g. OCaml-R).
Does anyone know of a good tutorial for this kind of analysis, or a code example, in either Haskell or OCaml?
Hal Daume wrote several major machine learning algorithms during his Ph.D. (he is now an assistant professor and a rising star in the machine learning community).
On his web page, there are an SVM, a simple decision tree and a logistic regression, all in OCaml. By reading this code, you can get a feeling for how machine learning models are implemented in OCaml.
Another good example of writing basic machine learning models is the Owl library for scientific and numeric computations in OCaml.
I'd also like to mention F#, a new .Net language similar to OCaml. Here's a factor graph model written in F# analyzing Chess play data. This research also has a NIPS publication.
FP is suitable for implementing machine learning and data mining models, but what you gain most here is NOT performance. It is true that FP supports parallel computing better than imperative languages like C# or Java. But implementing a parallel SVM or decision tree has very little to do with the language! Parallel is parallel. The numerical optimizations behind machine learning and data mining are usually imperative; writing them pure-functionally is usually hard and less efficient. Making these sophisticated algorithms parallel is a very hard task at the algorithm level, not at the language level. If you want to run 100 SVMs in parallel, FP helps here. But I don't see the difficulty in running 100 libsvm instances in parallel in C++, not to mention that single-threaded libsvm is more efficient than a not-well-tested Haskell SVM package.
Then what do FP languages, like F#, OCaml, Haskell, give?
Easy to test your code. FP languages usually have a top-level interpreter, you can test your functions on the fly.
Few mutable states. This means that passing the same parameter to a function, this function always gives the same result, thus debugging is easy in FPs.
Code is succinct. Type inference, pattern matching, closures, etc. You focus more on the domain logic, and less on the language part. So when you write the code, your mind is mainly thinking about the programming logic itself.
Writing code in FPs is fun.
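To illustrate the "few mutable states" point, here is a minimal sketch, written in Python purely for brevity rather than in OCaml or Haskell. Both function names are hypothetical. The first function is pure; the second hides mutable state, so the same argument can give different results depending on call history.

```python
def normalise(values):
    """Pure: the result depends only on the argument."""
    total = sum(values)
    return [v / total for v in values]


def make_running_mean():
    """Returns a function whose result depends on hidden state from earlier calls."""
    seen = []  # mutable state captured by the closure below

    def running_mean(x):
        seen.append(x)
        return sum(seen) / len(seen)

    return running_mean


# The pure function can be checked in isolation, in any order, any number of times.
assert normalise([1, 3]) == normalise([1, 3])

# The stateful one cannot: the same input yields different outputs.
running_mean = make_running_mean()
first = running_mean(4)   # mean of [4]
running_mean(2)           # mean of [4, 2]
second = running_mean(4)  # mean of [4, 2, 4]
print(first, second)
```

When most of a program looks like normalise, a failing test points straight at one function and its inputs, which is what makes debugging easier.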
The only problem I can see is that OCaml doesn't really support multicore parallelism, while GHC has excellent support and performance. If you're looking to use multiple threads of execution, on multiple calls, GHC Haskell will be a lot easier.
Secondly, the Haskell FFI is more powerful (that is, it does more with less code) than OCaml's, and more libraries are available (via Hackage: http://hackage.haskell.org ) so I don't think foreign interfaces will be a deciding factor.
As far as multi-language integration goes, combining C and Haskell is remarkably easy, and I say this as someone who is (unlike dons) not really much of an expert on either. Any other language that integrates well with C shouldn't be much trickier; you can always fall back to a thin interface layer in C if nothing else. For better or worse, C is still the lingua franca of programming, so Haskell is more than acceptable for most cases.
...but. You say you're motivated by performance issues, and want to use "a functional language". From this I infer you're not previously familiar with the languages you ask about. Among Haskell's defining features are that it, by default, uses non-strict evaluation and immutable data structures--which are both incredibly useful in many ways, but it also means that optimizing Haskell for performance is often dramatically different from other languages, and well-honed instincts may lead you astray in baffling ways. You may want to browse performance-related topics on the Haskell wiki to get a feel for the issues.
Which isn't to say that you can't do what you want in Haskell--you certainly can. Both laziness and immutability can in fact be exploited for performance benefits (Chris Okasaki's thesis provides some nice examples). But be aware that there'll be a bit of a learning curve when it comes to dealing with performance.
Both Haskell and OCaml provide the lovely benefits of using an ML-family language, but for most programmers, OCaml is likely to offer a gentler learning curve and better immediate results.
It's hard to give a definitive answer on this. Haskell has the advantages that Don mentioned along with having a more powerful type system and cleaner syntax. OCaml will be easier to learn if you are coming from almost any other language (this is because Haskell is as functional as functional languages get), and working with mutable random access structures can be a little clunky in Haskell. You will also likely find the performance characteristics of your OCaml code more intuitive than Haskell's because of Haskell's lazy evaluation.
Really, I would recommend you evaluate both if you have the time. Here are some relevant Haskell resources:
http://hackage.haskell.org/package/hslibsvm
http://hackage.haskell.org/package/HSvm
Real World Haskell: this is a great freely available book for Haskell
Learn You a Haskell: this tutorial is just plain fun to read
Oh, if you look further into Haskell be sure to sign up for the Haskell Beginners and Haskell Cafe lists. The community is friendly and eager to help out newcomers (is my bias showing?).
If speed is your prime concern then go for C. Haskell is pretty good performance wise but you are never going to get as fast as C. To my knowledge the only functional language that has bettered C in a benchmark is Stalin Scheme but that is very old and nobody really knows how it works.
I've written genetic programming libraries where performance was key, and I wrote them in a functional style in C. The functional style allowed me to easily parallelise the code using OMP, and it scales linearly up to 8 cores within a single process. You certainly can't do that in OCaml, although Haskell is improving all the time with regard to concurrency and parallelism.
The downside of using C was that it took me months to finally find all the bugs and stop the core dumps which was extremely challenging because of the concurrency. Haskell would probably have caught 90% of those bugs on the first compilation.
So, speed at any cost? Looking back, I wish I'd used Haskell, as I could have lived with it being 2-3 times slower if it had saved me over a month of development time.
While dons is correct that multicore parallelism at the thread level is better supported in Haskell, it sounds like you could live with process level parallelism (from your phrase: ideally separating out each execution to run in parallel.) which is supported quite well in OCaml. Keith pointed out that Haskell has a more powerful type system, but it can also be said that OCaml has a more powerful module system than Haskell.
As others have pointed out, OCaml's learning curve will be lower than Haskell's; you'll likely be more productive more quickly in OCaml. That said, learning OCaml is a great stepping-stone towards learning Haskell because many of the underlying concepts are very similar, so you could always migrate to Haskell later and find a lot of things familiar there. And as you pointed out, there is an OCaml-R bridge.
As examples of Haskell and OCaml in machine learning, see the material on Hal Daume's and Lloyd Allison's homepages. IMO it is much more straightforward to achieve C++-like performance in OCaml than in Haskell. Though, as already said, Haskell has a much nicer community (packages, tools and support), syntax and features (e.g. FFI, probability monads via typeclasses) and parallel programming support.
Having revamped OCaml-R, I've got a few comments to make on integrating OCaml and R. It might be worthwhile to use OCaml to call R code; it works, but is not yet exactly straightforward. So using it to pilot R is worthwhile. Integrating R functionality much more thoroughly is still cumbersome as, for example, much remains to be done to export R's type system and data to OCaml in a seamless way (you will have work to do). Moreover, the interaction of R's GC and OCaml's GC is a delicate point: you free n values in O(n^2) time, which isn't nice (to solve this point, you either need a more flexible R API, as far as I understand it, or to implement a GC in the binding itself as a big R array for proper interaction between GCs).
In a nutshell, I'd go for the "pilot R from OCaml" approach.
Contributions on the GC interaction layer and on mapping R datatypes to OCaml are most welcome.
You may want to take a look at this : http://www.haskell.org/pipermail/haskell-cafe/2010-May/077243.html
Late answer but a machine learning library in Haskell is available here : https://github.com/mikeizbicki/HLearn
This library implements various ML algorithms that are designed to have much faster cross-validation than the usual implementations. It is based on the paper "Algebraic classifiers: a generic approach to fast cross-validation, online training, and parallel training". The authors claim a 400x speed-up compared to the same task in Weka.
For Haskell, consider checking hasktorch (which I managed to use for my AI thesis). For OCaml there seem to be TensorFlow bindings.
So, I am writing some sort of a statistics program (actually I am redesigning it to something more elegant) and I thought I should use a language that was created for that kind of stuff (dealing with huge data of stats, connections between them and some sort of genetic/neural programming).
To tell you the truth, I just want an excuse to dive into lisp/smalltalk (aren't smalltalk/lisp/clojure the same? - like python and ruby? -semantics-wise) but I also want a language to be easily understood by other people that are fond of the BASIC language (that's why I didn't choose LISP - yet :D).
I also checked Prolog and it seems a pretty cool language (easy to do relations between data and easier than Lisp) but I'd like to hear what you think.
Thx
Edit:
I always confuse common lisp with Smalltalk. Sorry for putting these two langs together. Also what I meant by "other people that are fond of the BASIC language" is that I don't prefer a language with semantics like lisp (for people with no CS background) and I find Prolog a little bit more intuitive (but that's my opinion after I just messed a little bit with both of them).
Is there any particular reason not to use R? It's sort of a build vs. buy (or in this case download) decision. If you're doing a statistical computation, R has many packages off the shelf. These include many libraries and interfaces for various types of data sources. There are also interface libraries for embedding R in other languages such as Python, so you can build a hybrid application with a GUI in Python (for example) and a core computation engine using R.
In this case, you could possibly reduce the effort needed for implementation and wind up with a more flexible application.
If you've got your heart set on learning another language, by all means, do it. There are several good free (some as in speech, some as in beer) implementations of Smalltalk, Prolog and LISP.
If you're putting a user interface on the system, Smalltalk might be the better option. If you want to create large rule sets as a part of your application, Prolog is designed for this sort of thing. Various people have written about the LISP epiphany that influences the way you think about programming but I can't really vouch for this from experience - I've only really used AutoLISP for writing automation scripts on AutoCAD.
At the risk of offending some, I have a hard time reconciling "easily understood by other people that are fond of the BASIC language" with any of the languages you mentioned. That's not intended as a criticism, but as an observation that each of the languages you mention has a style and natural idiom that's quite different from that of BASIC.
Smalltalk - pure OO from the ground up, usually (e.g. Squeak) coupled with an integrated environment that is simultaneously the IDE and the runtime. IOW you enter the Smalltalk VM and work inside it rather than just writing a text that is "source code".
LISP - much closer to functional programming (although with imperative overtones); the prefix notation is the first barrier to most people who "like" other languages, but the concept and use of macros is a much more substantial one.
Clojure - The combination of LISP, OO, and JVM integration makes this one even less BASIC-like.
Python and Ruby - I lump these together (at the risk of further annoying fans of either ;-) because they are both OO languages with distinct notations that will take an outsider some time to learn. The use of indentation alone for control nesting in Python and the Perl-like use of special characters in Ruby are frequent complaints from newcomers. Although both can be written in an imperative style, that would be considered non-standard by seasoned users.
Prolog - This is the most unlike BASIC of all languages mentioned. All of the other languages you mentioned can be (ab)used in a semi-procedural style, but that is essentially impossible in Prolog. It requires a thorough understanding of, and comfort with, recursion to do anything non-trivial.
Code written with a "native accent" in essentially all of these languages (but especially Prolog, IMHO) will make use of idioms and concepts that are outside the norm for conventional BASIC programming. Put another way, if you pick one of these and then write code "with a BASIC accent" you've pretty much wasted the benefits that the language can offer.
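To make the "accent" point concrete, here is a minimal sketch in Python (picked only because it is one of the languages discussed above). The data and function names are made up for illustration. Both functions return the same result; the first is what a literal translation of a BASIC FOR/NEXT loop tends to look like, the second is closer to how a seasoned Python user would write it.

```python
# Hypothetical data: a handful of test scores.
scores = [72, 88, 95, 61, 79]


def passing_basic_accent(values, threshold):
    """Index-driven loop, roughly a line-by-line translation of BASIC."""
    result = []
    i = 0
    while i < len(values):
        if values[i] >= threshold:
            result.append(values[i])
        i = i + 1
    return result


def passing_idiomatic(values, threshold):
    """Same filter expressed as a list comprehension."""
    return [v for v in values if v >= threshold]


print(passing_basic_accent(scores, 75))
print(passing_idiomatic(scores, 75))
```

Neither version is wrong, but only the second uses what the language was designed to offer, and the gap gets much wider in Lisp or Prolog.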
I believe that all of them are worth learning for the concepts they can teach (or at least reinforce, depending on your background). But the similarity to Language X (for a wide range of values of X) is not what you'll get.
I can answer you partially
(aren't Smalltalk/Lisp/Clojure the same? - like python and ruby? -semantics-wise)
No, they are not. Smalltalk is an OO language built on message passing instead of method calls. Lisp is Lisp ;-) That means a truly functional language with a powerful macro system, OO support (in CL) of a kind not seen in other languages, and many more features. Clojure is a Lisp-like language that lacks many Lisp features but integrates well with the JVM. It doesn't support automatic tail call optimization, for example (you have to use an explicit recur). And Python and Ruby are classic imperative OO languages with some limited functional ability. Note the word limited. For example, Guido doesn't like functional programming, and in Python 3.0 reduce() was moved out of the built-ins into the functools module.
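For a feel of the functional tools Python does offer, here is a small sketch using only the built-in `filter` and `map` plus `functools.reduce` from the standard library (in Python 3, `reduce` has to be imported from `functools`). The input list is made up for illustration.

```python
from functools import reduce

# Hypothetical input.
numbers = [1, 2, 3, 4, 5, 6]

# Keep the even numbers, square them, then fold the squares into a sum.
evens = filter(lambda n: n % 2 == 0, numbers)
squares = map(lambda n: n * n, evens)
total = reduce(lambda acc, n: acc + n, squares, 0)

print(total)  # 2*2 + 4*4 + 6*6 = 56
```

It works, but most Python programmers would write `sum(n * n for n in numbers if n % 2 == 0)` instead, which is roughly what "limited" means in practice.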
If you are familiar with imperative procedural programming as in Python and you want to change your paradigm, you should make your decision carefully.
Prolog is a very different language. It can be very hard to grasp, mainly because it relies heavily on recursion for even very basic tasks. If you are really willing, then give it a go. It can be very powerful because it lets you express relationships and solve complicated problems simply; typical examples are Towers of Hanoi or quicksort. It will change the way you think, which can be difficult if you are used to imperative languages.
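To show the kind of recursive thinking involved, here is a rough sketch of Towers of Hanoi written in Python, not Prolog, so it only approximates the style. The function name and peg labels are made up for illustration. The whole solution is stated as a relationship between the problem for n disks and the problem for n - 1 disks, with no loop and no mutable state.

```python
def hanoi(n, source, target, spare):
    """Return the list of (from_peg, to_peg) moves for n disks."""
    if n == 0:
        return []  # base case: nothing to move
    return (
        hanoi(n - 1, source, spare, target)    # park n-1 disks on the spare peg
        + [(source, target)]                   # move the largest disk
        + hanoi(n - 1, spare, target, source)  # bring the n-1 disks back on top
    )


for move in hanoi(2, "a", "c", "b"):
    print(move)
```

A Prolog version would read much the same way, with a base clause and a recursive clause, which is why getting comfortable with recursion comes first.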
If you're interested in Prolog then there's a free version of Visual Prolog available and the commercial version is reasonably priced.
It's a strongly typed offshoot of Prolog, so it isn't your classic implementation of the language, but it has a respectable history - Borland marketed its DOS ancestor as Turbo Prolog back in the late '80s.
It's also Windows only, but it can be used to create standard Windows DLLs, so you can link your code into a 'normal' Windows programming language. I've never used the package in anger myself, but I did a couple of Prolog courses at Uni, so I have downloaded it from time to time to play with and look for possible uses, and it looks solid enough. Might be just the set of cogs you're looking for.