What would programming languages look like if every computable thing could be done in 1 second? - programming-languages

Inspired by this question
Suppose we had a magical Turing Machine with infinite memory, and unlimited CPU power.
Use your imagination as to how this might be possible, e.g. it uses some sort of hyperspace continuum to automatically parallelize anything as much as is desired, so that it can calculate the answer to any computable question, no matter what its time complexity is or how many actual "logical steps" it involves, in one second.
However, it can only answer computable questions in one second... so I'm not positing an "impossible" machine (at least I don't think so)... For example, this machine still wouldn't be able to solve the halting problem.
What would the programming language for such a machine look like? All programming languages I know about currently have to make some concessions to "algorithmic complexity"... with that constraint removed, though, I would expect that all we would care about would be the "expressiveness" of the programming language, i.e. its ability to concisely express "computable questions"...
Anyway, in the interests of a hopefully interesting discussion, opening it up as community wiki...

SendMessage travelingSalesman "Just buy a ticket to the same city twice already. You'll spend much more money trying to solve this than you'll save by visiting Austin twice."
SendMessage travelingSalesman "Wait, they built what kind of computer? Nevermind."

This is not really logical. If a thing takes O(1) time, then doing it n times will take O(n) time, even on a quantum computer. It is impossible for "everything" to take O(1) time.
For example: Grover's algorithm, the one mentioned in the accepted answer to the question you linked to, takes O(√n) time to find an element in a database of n items. And that's not O(1).

The amount of memory, the speed of the memory, or the speed of the processor doesn't define the time and space complexity of an algorithm. Basic mathematics does that. Asking what programming languages would look like if everything could be computed in O(1) is like asking what our calculators would look like if pi were 3 and every square root came out to an integer. It's really impossible, and even if it weren't, it's not likely to be very useful.
Now, asking ourselves what we would do with infinite processing power and infinite memory could be a useful exercise. We'd still have to deal with the complexity of algorithms, but we'd probably work somewhat differently. For that I recommend The Hundred-Year Language.

Note that even if the halting problem is not computable, "does this halt within N steps on all possible inputs of size smaller than M" is!
As such, any programming language would become purely specification. All you need to do is accurately specify the pre- and post-conditions of a function, and the compiler could derive the fastest possible code that satisfies your spec.
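A minimal sketch of that idea, assuming a toy expression language and an example spec that are entirely made up for illustration (this is Python, not any real spec language): enumerate candidate programs and keep the first one that meets the post-condition on every bounded input. The search is computable, just absurdly expensive on ordinary hardware; on the one-second machine its cost would not matter.
from itertools import product

# A toy "specification compiler": enumerate programs in a tiny expression
# language until one satisfies the spec on every bounded input.
CONSTS = [0, 1, 2, 3]

def expressions(depth):
    """Yield all expressions of the toy language up to the given depth."""
    if depth == 0:
        yield "x"
        for c in CONSTS:
            yield str(c)
        return
    yield from expressions(depth - 1)
    for op in ("+", "*"):
        for left, right in product(expressions(depth - 1), repeat=2):
            yield f"({left} {op} {right})"

def synthesize(spec, bound, max_depth=2):
    """Return the first expression whose value matches `spec` on 0..bound-1."""
    for expr in expressions(max_depth):
        if all(eval(expr, {"x": x}) == spec(x) for x in range(bound)):
            return expr
    return None

# Spec: post-condition f(x) == x*x + 1, checked on all inputs smaller than 20.
print(synthesize(lambda x: x * x + 1, bound=20))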
Also, this would trigger a singularity very quickly. Constructing an AI would be a lot easier if you could do near infinite computation -- and once you had one, of any efficiency, it could ask the computable question "How would I improve my program if I spent a billion years thinking about it?"...

It could possibly be a Haskell-ish language. Honestly, it's a dream to code in. You program the "laws" of your types, classes, and functions and then let them loose. It's incredibly fun and powerful, and you can write some very succinct and elegant code. It's like an art.

Maybe it would look more like pseudo-code than "real" code. After all, you don't have to worry about any implementation details any more, because whichever way you go, it'll be fast enough.

Scalability would not be an issue any longer. We'd have AIs way smarter than us.
We wouldn't need to program any longer and instead the AI would figure out our intentions before we realize them ourselves.

SQL is such a language - you ask for some piece of data and you get it. If you didn't have to worry about minute implementation details of the db this might even be fun to program in.

You underestimate O(1). It means that there exists a constant C > 0 such that the time to compute the problem is bounded by C.
What you ignore is that the actual value of C can be large, and it can be (and mostly is) different for different algorithms. You may have two algorithms (or computers, it doesn't matter), both O(1), but in one of them C may be a billion times bigger than in the other; that one will then be much slower, and perhaps very slow in terms of wall-clock time.
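A tiny illustration of that point (the two functions and the factor of a million are made up): both are O(1) with respect to their argument, yet their hidden constants, and therefore their running times, differ enormously.
import timeit

def cheap_constant(n):
    # O(1): no work depends on n.
    return 42

def expensive_constant(n):
    # Also O(1): a fixed million iterations, regardless of n.
    total = 0
    for _ in range(1_000_000):
        total += 1
    return total

print(timeit.timeit(lambda: cheap_constant(10**9), number=10))
print(timeit.timeit(lambda: expensive_constant(10**9), number=10))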

If it will all be done in one second, then most languages will eventually look like this; I call it DWIM theory (Do What I Mean theory):
Just do what I said (without any bugs this time)
Because if we ever develop a machine that can compute everything in one second, then we will probably have mind control at that stage, and at the very least artificial intelligence.

I don't know what new languages would come up (I'm a physicist, not a computer scientist) but I'd still write my programs for it in Python.

Related

What's the overhead of the different forms of parallelism in Julia v0.5?

As the title states, what is the overhead of the different forms of parallelism, at least in the current implementation of Julia (v0.5, in case the implementation changes drastically in the future)? I am looking for some "practical measures", some general heuristics or ballparks to keep in my head for when it can be useful. For example, it's pretty obvious that multiprocessing won't give you gains in a loop like:
addprocs(4)
@parallel (+) for i = 1:4
    rand()
end
because each process only takes one random number. But is there a general heuristic for knowing when it will be worthwhile? Also, what about a heuristic for threading? It's surely lower overhead than multiprocessing, but, for example, with 4 threads, for what N is it a good idea to multithread:
A = rand(4)
Base.@threads (+) for i = 1:N
    A[i % 4 + 1]
end
(I know there isn't a threaded reduction right now, but let's act like there is, or edit with a better example). Sure, I can benchmark every example, but some good rules to keep in mind would go a long way.
In more concrete terms: what are some good rules of thumb?
How many numbers do you need to be adding/multiplying before threading gives performance enhancements, or before multiprocessing gives performance enhancements?
How much does this depend on Julia's current implementation?
How much does it depend on the number of threads/processes?
How much does it depend on the architecture? Are there good rules for knowing when the threshold should be higher/lower on a particular system?
What kinds of applications violate these heuristics?
Again, I'm not looking for hard rules, just general guidelines to guide development.
A few caveats: 1. I'm speaking from experience with version 0.4.6 (and prior); I haven't played with 0.5 yet (but, as I hope my answer below demonstrates, I don't think this is essential vis-a-vis the response I give). 2. This isn't a fully comprehensive answer.
Nevertheless, from my experience, the overhead for multiple processes itself is very small provided that you aren't dealing with data movement issues. In other words, in my experience, any time you find yourself wishing something were faster than a single process on your CPU can manage, you're well past the point where parallelism will be beneficial. For instance, in the sum of random numbers example that you gave, I found through testing just now that the break-even point was somewhere around 10,000 random numbers. Anything more and parallelism was the clear winner. Generating 10,000 random numbers is trivial for modern computers, taking a tiny fraction of a second, and is well below the threshold where I'd start getting frustrated by the slowness of my scripts and want parallelism to speed them up.
Thus, I at least am of the opinion that although there are probably even more wonderful things the Julia developers could do to cut down on the overhead, at this point anything pertinent to Julia isn't going to be your limiting factor, at least in terms of the computation aspects of parallelism. I think there are still improvements to be made in terms of enhancing both the ease and the efficiency of parallel data movement (I like the package that you've started on that topic as a good step; you and I would probably both agree there's still a ways to go). But the big limiting factors will be:
How much data do you need to be moving around between processes?
How much read/write to your memory do you need to be doing during your computations? (e.g. flops per read/write)
Aspect 1 might at times lean against using parallelism. Aspect 2 is more likely just to mean that you won't get so much benefit from it. And, at least as I interpret "overhead," neither of these really falls directly into that specific consideration. Both of these are, I believe, going to be far more heavily determined by your system hardware than by Julia.
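For intuition only, here is a rough analogue of the break-even experiment described above, written in Python with multiprocessing rather than Julia (the sizes, the worker count, and the crossover point are machine-dependent and purely illustrative):
import random
import time
from multiprocessing import Pool

def partial_sum(n):
    # The work each process does: sum n random numbers.
    return sum(random.random() for _ in range(n))

if __name__ == "__main__":
    with Pool(4) as pool:
        for n in (1_000, 100_000, 10_000_000):
            t0 = time.perf_counter()
            partial_sum(n)                              # serial
            t1 = time.perf_counter()
            sum(pool.map(partial_sum, [n // 4] * 4))    # 4 worker processes
            t2 = time.perf_counter()
            print(f"n={n:>10}  serial {t1 - t0:.4f}s  parallel {t2 - t1:.4f}s")
Small n is dominated by the cost of shipping work to the workers; only for large n does the parallel column win, which is the same break-even behaviour described above.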

Do limitations of CPU speed and memory prevent us from creating AI systems?

Many technology optimists say that in 15 years the speed of computers will be comparable with the speed of the human brain. This is why they believe that computers will achieve the same level of intelligence as humans.
If Moore's law holds, then every 18 months we should expect a doubling of CPU speed. 15 years is 180 months, so we will have ten doublings, which means that in 15 years computers will be 1024 times faster than they are now.
But is speed really the source of the problem? If it were, we would be able to build an AI system NOW; it would just be 1024 times slower than in 15 years, which means that to answer a question it would need 1024 seconds (about 17 minutes) instead of an acceptable 1 second. But do we have a strong (but slow) AI system now? I think not. Even if today (2015) we give a system 1 hour instead of 17 minutes, or 1 day, or 1 month, or even 1 year, it still will be unable to answer complex questions formulated in natural language. So it is not speed that causes the problem.
It means that in 15 years our intelligence will not be 1024 times faster than now (because we have no intelligence to speed up). Instead, our "stupidity" will be 1024 times faster than now.
We need both faster hardware and better algorithms. Of course speed alone is not enough as you pointed out.
We need self-modifying, meta-learning algorithms capable of creating hypotheses and performing experiments to verify them (like humans do). Systems that learn to learn and self-improve. Algorithms that can prove that a given self-modification is optimal in a certain sense and will lead to even better self-modifications in the future. Systems that can reflect on and inspect their own software (can you call it consciousness?). Such research is being done and may create superhuman intelligence in the future, or even the technological singularity, as some believe.
There is one problem with this approach, though. People doing this research usually assume that consciousness is computable, that it is all about intelligence. They don't take into account experiences like pleasure and pain, which have nothing to do (in my opinion) with computation or intellect. You can understand pain only through experience (not intellectual speculation). Setting a variable pleasure to 5, or behaving as if one feels pleasure, is very different from experiencing pleasure. Some people say that feelings originate in the brain, so it is enough to understand the brain. Not necessarily. A child can ask: "How did they put small people inside the TV box?" Of course the TV is just a receiver and there are no small people inside. The brain might be a receiver too. Do we need higher knowledge for feelings and other experiences?
The question has to be answered in the context of computation and complexity.
Every algorithm has its own complexity and running time (see Big O notation). There are non-computable problems, such as the halting problem; for these it has been proven that no algorithm exists, independent of the hardware.
Computable problems are described by the number of steps an algorithm requires with respect to the size of the input: as the input grows, the execution time of the algorithm also grows. These algorithms can be roughly categorized into two groups: exponential-time and non-exponential-time algorithms. The running time of exponential-time algorithms grows drastically with the input size and quickly becomes intractable.
The execution time of these problems can be improved with better hardware, but the complexity will always be the same. This means that no matter what CPU you use, the algorithm will always require the same number of steps. The hardware matters for providing an answer in less time, but the hardness of the problem remains the same. Thus, the limitation of the hardware is not what prevents us from creating an AI system. For instance, you can use parallel programming (e.g. a GPU) to improve the execution time of an algorithm drastically, but the algorithm is still the same as the normal CPU algorithm.
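To put rough numbers on that (the step budgets below are arbitrary, not measurements), a sketch in Python: a machine 1024 times faster moves the largest feasible input size a lot for polynomial algorithms and hardly at all for an exponential one.
import math

def max_feasible_n(budget, cost):
    """Largest n with cost(n) <= budget, found by doubling then binary search."""
    hi = 1
    while cost(hi * 2) <= budget:
        hi *= 2
    lo, hi = hi, hi * 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cost(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

budget = 10**12            # elementary steps today (arbitrary figure)
faster = budget * 1024     # the machine after ten doublings

for name, cost in [("n log n", lambda n: n * math.log2(n)),
                   ("n**3",    lambda n: n**3),
                   ("2**n",    lambda n: 2**n)]:
    print(f"{name:8} {max_feasible_n(budget, cost):>12} -> {max_feasible_n(faster, cost):>12}")
The exponential row moves by only about 10 (exactly the 2^10 = 1024 speedup), while the polynomial rows move by orders of magnitude.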
I would say no. As you showed, speed is not the only factor in intelligence. I for one would point to language, yes, language. Language is the primary skill we learn as humans, so why not for computers? Language gives an understanding that can be shared across the globe, provided you know that language. Humans use nonverbal and verbal language to communicate. But I honestly think it really works something like this:
Humans go through experiences. These experiences have a bigger impact on our lives the closer we are to our birth date, or the more emotional they are. For example, the first time we are told no means A LOT more to us as an infant than as a 70-year-old adult. These get stored as either long-term or short-term memory and correlated to that event later on in life for reference. We mainly store events to learn from them, to prevent negative experiences or promote positive ones.
Think of it as a tag cloud. The more often you do task A, the bigger the cloud is in memory. We then store crucial details such as type of emotion, location, smells etc. Now when we reference them again from memory we pick out those details and create a logical sentence:
Touching that stove hurt me when I was at grandma's house.
All of the key words (touching, stove, hurt, grandma's house) would have to be stored to have a complete memory.
Now, inside of this sentence we have learned a lot more than just being hurt by the stove at grandma's house. We have learned that stoves can be hot and dangerous, and that grandma allows one in her house. We also learned how long it takes to heal from such an event, emotionally and physically, to gauge how important the event is. And so much more. So we also store this sub-event information inside of other knowledge bubbles. And these bubbles continue to grow exponentially.
Now when asked: Are stoves dangerous?
You can identify the words in the sentence:
are, stoves, dangerous, question
and reference the definition of dangerous as: hurt, bad
and then provide more evidence that this is true, such as personal experience to result in:
Yes, stoves are dangerous because I was hurt at grandma's house by one.
So intelligence seems to be a mix of events, correlation and data retrieval to solve some solution. I'm sure there's a lot more to it than that but this is just my understanding of intelligence.
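Purely as a toy sketch of the tag-cloud idea described above (every event, tag, and weight here is invented): store experiences with tags and an emotional weight, then answer a question by pulling the most heavily weighted memories that share tags with it.
memories = [
    {"event": "touched the stove at grandma's house",
     "tags": {"stove", "grandma", "hurt", "hot"},
     "emotion": 0.9},
    {"event": "baked cookies with the stove",
     "tags": {"stove", "grandma", "pleasure"},
     "emotion": 0.4},
]

def answer(question_tags, danger_tags=frozenset({"hurt", "bad", "hot"})):
    # Pull every memory that shares a tag with the question, strongest first.
    relevant = sorted((m for m in memories if m["tags"] & question_tags),
                      key=lambda m: m["emotion"], reverse=True)
    for m in relevant:
        if m["tags"] & danger_tags:
            return f"Yes, because I remember: {m['event']}."
    return "No strong memory says so."

print(answer({"stove", "dangerous"}))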

How is FRP handled in terms of memory?

Reading about FRP (Functional Reactive Programming), I'm amazed at how intuitive and logical it seems compared to the standard imperative approach; one thing, however, puzzles me: how does the computer not immediately run out of memory doing it?
From what I've gathered from [here], is that in FRP the complete history (past, present, and future) of a value's change is first class. That notion immediately rings an alarm in my head saying it has got to eat up your memory very fast if it's used in an environment where the past of the value isn't cleared from memory immediately.
Reading about [Fran], I've noticed several of the examples having recursively defined functions with no termination condition. If the function never terminates and returns its value to the function calling it, how is it ever going to get anything done? Or for that matter, how's it not blowing the stack after a while? Even a lazy language like Haskell will run into stack overflows at some point.
An explanation of these things would be greatly appreciated, as it completely baffles me.
The fact that this can work for simple cases should not be much of a surprise: we already comfortably use infinite data structures in Haskell thanks to laziness and garbage collection. As long as your final result does not depend on having all your values at once, they can be collected as you go along or not forced in the first place.
This is why this classical Fibonacci example runs in constant¹ space: previous entries in the list are not needed once the next two are calculated, so they are collected as you go along—as long as you do not have any other pointers to the list.
fib n = fibs !! n
  where fibs = 0 : 1 : zipWith (+) fibs (drop 1 fibs)
Try running this function for different inputs and looking at memory usage. (Run it with +RTS -s.)
(If you want a more detailed explanation with diagrams, take a look at this post I wrote.)
The point is, even if an unbounded amount of information is available to the programmer, we can still garbage collect most of it if nothing else depends on it.
Exactly the same logic can be used to implement FRP programs efficiently.
Of course, everything is not that easy. In the fibs example, the memory usage would go way up if we had an active pointer to the beginning of the fibs list. The same thing happens with FRP if you have a computation that depends on too much past data: it's called a time leak.
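For a rough analogue in Python rather than Haskell (the stream and the sizes are illustrative, but the retention principle is the same): consuming a lazy stream lets old values be collected, while holding on to the whole prefix, like keeping a live pointer to the head of fibs, keeps every value in memory.
from itertools import islice

def fibs():
    # Lazily yield Fibonacci numbers; values the caller has moved past
    # can be garbage-collected, just like the collected list cells above.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Constant number of live values: only the generator's running pair survives.
nth = next(islice(fibs(), 100_000, None))

# Linear memory: keeping the whole prefix alive is the analogue of holding an
# active pointer to the beginning of the fibs list (a "time leak" in FRP terms).
prefix = list(islice(fibs(), 100_001))
assert prefix[-1] == nth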
Dealing with time leaks is one of the open problems with implementing an efficient, well-behaved FRP framework. It's difficult to provide expressive FRP abstractions without allowing the possibility of poor or even catastrophic memory usage. I believe most current approaches end up providing abstract FRP types along with a blessed set of operations that is less likely to cause these sorts of leaks; a particularly extreme form of this is Arrowized FRP which does not provide a behavior/signal type at all but rather expresses everything with transformations between signals (as arrows).
I've never tried to implement a nice FRP system myself, so I can't really explain the problems in any more detail. If you're interested in more details on this topic, a great place to look is Conal Elliott's blog—with this post as a good starting point. You can also take a look at some of the papers he's written like "Push-Pull Functional Reactive Programming" as well as other papers on the subject, including some about Arrowized FRP like "Functional Reactive Programming, Continued" (chosen almost at random).
footnotes
¹ It's not really constant space because the intermediate results get bigger themselves. But it should maintain a constant number of list cells in memory.
About the time leaks part of your question: this is indeed one of the main challenges in implementing FRP. However, FRP researchers and implementers have found several ways to avoid them.
It all depends on the precise API you offer for signals. The main question is whether or not you provide higher-order FRP. This often takes the form of a "monadic join" primitive for signals: a way to convert a signal of signals into a signal, or in other words an API to produce a signal that dynamically switches between a number of other signals. Such an API is very powerful, but can introduce the potential for time leaks, i.e. the problem that you ask about: the need to keep all of a signal's previous values in memory. However, as Heinrich Apfelmus mentions in a comment to a previous answer, there are ways to solve this by restricting higher-order APIs in certain ways, using the type system or otherwise. See that comment for links to further explanations.
Many FRP libraries simply do not offer higher-order APIs and thus (quite easily) avoid the problem of time leaks. You mentioned Elm, which is in this category, as mentioned here under "Signals are not monads in Elm". This does come at the cost of expressiveness, because no powerful monadic API is offered, but not everyone believes you need the general power of such an API in an FRP framework/library.
Finally, I recommend an interesting presentation by Elm's main author Evan Czaplicki, who does a very good job of explaining these problems and providing an overview of possible ways to solve them. He categorizes FRP approaches according to how they solve them.

Standard (simple?) benchmark code/test?

Is there some kind of standard benchmarking system or outline or something? I am looking at Go, LLVM, D, and other languages and I wanted to know how they fare in execution time, memory usage, etc.
I found https://benchmarksgame-team.pages.debian.net/benchmarksgame/ but the code is NOT THE SAME. In one example the C++ source is < 100 lines while the C source is > 650. I hardly call that fair. Another test has the stupid mistake of putting a lock inside the loop while other languages put it outside.
So I wanted to know of some test I might consider looking at/running that perhaps uses no nonstandard or even complex libs, something implemented completely inside a single source file. Something fair.
For several years the benchmarks game website featured this on the Help page -
What does "not fair" mean? (A fable)
They raced up, and down, and around and around and around, and forwards and backwards and sideways and upside-down.
Cheetah's friends said "it's not fair" - everyone knows Cheetah is the fastest creature but the races are too long and Cheetah gets tired!
Falcon's friends said "it's not fair" - everyone knows Falcon is the fastest creature but Falcon doesn't walk very well, he soars across the sky!
Horse's friends said "it's not fair" - everyone knows Horse is the fastest creature but this is only a yearling, you must stop the races until a stallion takes part!
Man's friends said "it's not fair" - everyone knows that in the "real world" Man would use a motorbike, you must wait until Man has fueled and warmed up the engine!
Snail's friends said "it's not fair" - everyone knows that a creature should leave a slime trail, all those other creatures are cheating!
Dalmatian's tail was banging on the ground. Dalmatian panted and between breaths said "Look at that beautiful mountain, let's race to the top!"
At that time "it's not fair" comments were mostly special pleading intended to gain an advantage for programming language X to the disadvantage of programming language Y.
But the issues your question raises are a little different.
Firstly, look at the n-body programs on the benchmarks game website. Even though the programs are written in different languages, there's very little difference in the way the programs are coded. So far no one has found an effective way to make use of quad-core for this small n-body problem, so there are no special multi-core programs. The programs do not use non-standard or complex libraries. The programs are completely implemented inside a single source file.
I said there's very little difference in the way the n-body programs are coded, but does that really mean the programs are the same? Soon after the project had been revived, 6 or 7 years ago, I remember an Ada programmer half-joking about comparing apples to oranges because the assembly language from the Ada programs wasn't the same as the assembly language from the C programs - so obviously like wasn't being compared to like :-)
On the other hand, the Ada source code would have to be written in a different way than the C source code was written to make the Ada compiler produce the same assembly language as the C compiler produced. On the other hand, if the assembly language produced by both compilers really was line-by-line the same, why would there be a performance difference?
When there's very little difference in the way the programs are coded, then at first glance the comparison appears to be fair, but forcing different languages to be coded like language X may favour language X.
As Yannick Versley noted, the point of using a different language is for the different approaches that language provides. In other words, there's more than one way to do the same thing.
Look at the mandelbrot programs on the benchmarks game website - the simplest C program is half the size of the fastest C program; the simplest C program is sequential and uses doubles, while the fastest C program uses all 4 cores through OMP and GCC intrinsics.
Other languages take different approaches to use all 4 cores - does that mean we should only compare sequential programs and ignore the reality of multi-core computing?
Other language implementations may not provide an equivalent to GCC intrinsics - does that mean we should only compare programs that use doubles? But other language implementations take different approaches in the way they represent doubles - does that mean we should ignore all floating point programs?
The problem is that programming languages (and programming language implementations) are more different than apples to oranges, but we still ask - Will my program be faster if I write it in language X? - and still wish for a simpler answer than - It depends how you write it!
The different tasks and different programs on the benchmarks game website show that some of the performance comparison answers are confusing and complicated - the details matter, a lot.
Benchmarking is not entirely about being fair - it's about choosing something for your own workload, within your own constraints.
If you want to use the alioth shootout site, you can still get interesting information if you exclude solutions that are too verbose, or too slow (the exact balancing depends on what you want to do - do you write code that runs for five seconds, or one that will occupy a dozen computers for five months). Look at the most concise examples for one particular problem to see the general problem structure - then see what typical optimizations people applied to make the code run faster.
Having a benchmark with THE SAME code misses the point, because you need different things to help in different languages; Java has GC, which means that it will do well on the trees test, whereas you need custom memory allocation in C/C++ to compete with that (and that particular benchmark is structured so that standard malloc does really poorly), for the spectral-norm one, you need non-boxed double arrays...
If you want to come up with your own solutions, have a go at Project Euler - there are a lot of problems that do not depend on complex libraries, yet are challenging to optimize. Otherwise, try to come up with scoring criteria that you consider adequate to filter or rank the existing contributions in the shootout (or outside it - for example, there are ShedSkin and Cython solutions to some of the problems, which are "unofficial" because these languages are not included).

Write the longest possible loop

Recently I was asked this question in a technical discussion. What is the longest possible loop that can be written in computational science, considering the machine/architecture on which it is to run? This loop has to be as long as possible, yet not an infinite loop, and should not end up crashing the program (recursion, etc...).
I honestly did not know how to attack this problem, so I asked him whether it is practically possible. He said that using some computer science concepts, you can arrive at a hypothetical number which may not be practical, but nevertheless it will still not be infinite.
Does anyone here know how to analyse/attack this problem?
P.S. Choosing some highest limit for a type that can store the highest numerical value is apparently not an answer.
Thanks in advance,
You are getting into the field of Turing machines.
Simply put (let's stay in the deterministic realm...), your computer/machine can be in a finite number of states that are passed through during the algorithm. Every state is unique and occurs only once; otherwise you would, by definition, have an endless loop, like a "goto".
We could remove that limitation, but it would not make much sense, because then a trivial algorithm can always be found that has one more loop iteration than any other possible algorithm.
So it depends on the machine's possible states, which you could naively translate as "its RAM".
So the question now is: what's the longest possible loop on a machine that can be in X different states? And Wikipedia gives the answer.
Please read up on the Busy Beaver problem.
http://en.wikipedia.org/wiki/Busy_beaver
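Just to make the busy-beaver idea concrete, here is a minimal Turing-machine simulator in Python (a sketch, with the standard 2-state, 2-symbol champion hard-coded): it loops for as many steps as any halting 2-state, 2-symbol machine can, which is 6, before stopping.
RULES = {  # (state, symbol) -> (write, move, next_state)
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "HALT"),
}

def run(rules, state="A"):
    tape, head, steps = {}, 0, 0
    while state != "HALT":
        write, move, state = rules[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        steps += 1
    return steps, sum(tape.values())

print(run(RULES))   # (6, 4): 6 steps taken, 4 ones written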
The largest possible finite value? As a mathematician, I find that ridiculous. Perhaps the problem could be explained better.
If we're talking about language limitation, well, some languages define arbitrary-precision integers (like Python and Common Lisp), so you could count up to any number you liked as far as the language goes. You could easily set it too large for any actual machine, but that's not a language limitation.
As far as counting on actual machines go, it's a matter of the number of possible states. For each bit you can allocate as a counter (and it doesn't have to be as one data element, since it's real easy to make an arbitrary-length counter), that's two states, so if you had 8G of memory available you could count to something like 2^8G with it. You could of course use the file system for more counter space.
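A sketch of that counting argument in Python with a deliberately tiny amount of "memory" (two bytes rather than 8G): treat the bytes as one big counter, increment it until it wraps, and the loop runs for exactly 2^(8*n) iterations before terminating.
def longest_loop(n_bytes=2):
    counter = bytearray(n_bytes)
    steps = 0
    while True:
        steps += 1
        # Ripple-carry increment of the arbitrary-length counter.
        i = 0
        while i < n_bytes and counter[i] == 255:
            counter[i] = 0
            i += 1
        if i == n_bytes:      # wrapped around: every state has been visited
            return steps
        counter[i] += 1

print(longest_loop(2))        # 65536 == 2**16 iterations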
Or, assuming you're not using physically reversible computation, you could take the minimum energy necessary to flip a bit, divide the amount of energy available (like the total expected solar output or whatever) by it, and get a limit.
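A back-of-envelope version of that energy bound (the room-temperature Landauer limit and one year of the Sun's total output are my own choice of figures, and everything is rounded):
import math

k_B = 1.380649e-23                  # Boltzmann constant, J/K
T = 300                             # room temperature, kelvin
landauer = k_B * T * math.log(2)    # ~2.9e-21 J minimum per bit flip

solar_luminosity = 3.8e26           # watts, total output of the Sun
year = 3.15e7                       # seconds

bit_flips = solar_luminosity * year / landauer
print(f"about {bit_flips:.1e} flips, i.e. a counter of roughly {math.log2(bit_flips):.0f} bits")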
There is a limit for Turing machines of specified complexity. It goes up pretty fast.
There are too many possible answers to provide just one.
If "long" means time, the following finite loop should be a good bet:
for (unsigned long long i = 0; i < ULLONG_MAX; ++i) sleep(UINT_MAX);
Okay, this answer is not really serious, but my opinion is that the question is totally useless to ask in a job interview. Why? According to S.Lott's answer, this could be about the Busy Beaver problem, which is almost 50 years old and totally unknown because nobody could make use of it in a real job.
