Memoizing a Haskell Array

Continuing my exploration of CRCs in Haskell, I've written the following code to generate a table for CRC32 calculation:
crc32Table = listArray (0, 255) $ map (tbl 0xEDB88320) [0..255]
tbl polynomial byte = (iterate f byte) !! 8
  where f r = xor (shift r (-1)) ((r .&. 1) * polynomial)
This correctly generates the table. I want to make frequent accesses to this table but 1) don't want to hardcode the results into code and 2) don't want to recalculate this table every time I reference it.
How would I memoize this array in Haskell? The Haskell memoization pages haven't given me any clues.

The discussion at this question should help explain what's going on: When is memoization automatic in GHC Haskell?
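As folks have said in the comments, crc32Table, if it is monomorphically typed, should be computed only once and retained. A top-level value with a monomorphic type is a constant applicative form (CAF), which GHC normally evaluates on first use and then shares between all later references.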
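To make the "monomorphically typed" part concrete, here is a minimal sketch of the question's table with explicit type signatures. The choice of Word32 for both index and element, the use of shiftR in place of shift r (-1), and the helper name tableEntry are my assumptions, not something the question specifies.

```haskell
import Data.Array (Array, listArray, (!))
import Data.Bits (shiftR, xor, (.&.))
import Data.Word (Word32)

-- Monomorphic top-level constant: it takes no arguments and has no
-- class constraints, so it is built on first use and then shared.
crc32Table :: Array Word32 Word32
crc32Table = listArray (0, 255) (map (tbl 0xEDB88320) [0 .. 255])

-- One table entry: eight rounds of the reflected CRC step.
tbl :: Word32 -> Word32 -> Word32
tbl polynomial byte = iterate f byte !! 8
  where
    f r = (r `shiftR` 1) `xor` ((r .&. 1) * polynomial)

-- Hypothetical accessor: looks up the entry for the low byte of i.
tableEntry :: Word32 -> Word32
tableEntry i = crc32Table ! (i .&. 0xFF)
```

If the signature were left polymorphic (for example with a Bits constraint), crc32Table would behave like a function of its class dictionary and could be rebuilt at each use, which is the case the linked discussion warns about.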

Related

How to filter by predicate on index in Repa

I have two Repa arrays a1 and a2, and I would like to eliminate all the elements of a2 for which the element at the corresponding index in a1 is above a certain threshold. For example:
import qualified Data.Array.Repa as R -- for Repa
import Data.Array.Repa (Z (..), (:.)(..))
a1 = R.fromFunction (Z :. 4) $ \(Z :. x) -> [8, 15, 9, 14] !! x
a2 = R.fromFunction (Z :. 4) $ \(Z :. x) -> [0, 1, 2, 3] !! x
threshold = 10
desired = R.fromFunction (Z :. 2) $ \(Z :. x) -> [0, 2] !! x
-- 15 and 14 are above the threshold, 10
One way to do this is with selectP, but I would like to avoid it, since it computes the arrays, and I would like my arrays to remain in delayed form if possible.
Another way is with the repa-array package, but the stack solver does not seem to know how to pull in this library with resolver nightly-2017-04-10.

One way to look at this issue is that, in order to create a Repa array, you need to know its size (extent) upon creation (e.g. with fromFunction). In the case of a filter operation, though, there is no way to know the size of the resulting array without applying the thresholding predicate, which essentially means computing the values of the resulting array.
Another way to look at it: a delayed array is simply a function from an index to a value, which is fine for most operations. For filtering, though, finding the value at a particular index of the result requires knowing all the values that come before that index, because at any location a value may or may not survive the predicate.
The vector package solves this issue elegantly with stream fusion, and repa-array, the next version of Repa, which is still at an experimental stage, seems to be trying a similar approach, extended to higher dimensions (I might be wrong; I haven't looked too closely).
So, short answer: there is no way to do filtering with Repa-style functional fusion. Either:
stick to selectP - faster (probably), but less memory efficient (for sure), or
piggyback onto ifilter from the vector package for sequential filtering

You can build a list of pairs with zip, then filter with a predicate function of type (Int,Int) -> Bool, and lastly extract the first or second element of each pair (depending on which one you want) using map fst or map snd respectively. Everything you need for this is in Prelude.
I hope this is enough information for you to put the pieces together yourself. If in doubt, look at the type signatures of the functions I mentioned.
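For reference, the zip / filter / map snd pipeline described above could look roughly like this on plain lists. The helper name filterByPartner is made up for illustration, and this operates on lists rather than Repa arrays, so it does not keep anything in delayed form.

```haskell
a1, a2 :: [Int]
a1 = [8, 15, 9, 14]
a2 = [0, 1, 2, 3]

threshold :: Int
threshold = 10

-- Keep each element of ys whose partner in xs is at or below the limit.
-- (filterByPartner is a hypothetical name, not a library function.)
filterByPartner :: Int -> [Int] -> [Int] -> [Int]
filterByPartner limit xs ys =
  map snd (filter (\(x, _) -> x <= limit) (zip xs ys))

desired :: [Int]
desired = filterByPartner threshold a1 a2
```

With the data from the question, desired evaluates to [0,2], since 15 and 14 are above the threshold.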

Generating triangular number using iteration in haskell

I am trying to write a function in Haskell to generate triangular numbers. I am not allowed to use recursion; I am supposed to use iteration.
Here is my code:
triSeries 0 = [0]
triSeries n = take n $iterate (\x->(0+x)) 1
I know that the function I pass to iterate is wrong.
But I have spent hours looking for the right one. Any hints, please?

Start by writing out some triangular numbers
T(1) = 1
T(2) = 1 + 2
T(3) = 1 + 2 + 3
An iterative process to generate T(n) is to start from [1..n], take each element of the list in turn, and add it to a running total. In a language with mutable state, you might write:
def tri(n):
    total = 0
    for x in range(1, n + 1):
        total += x
    return total
In Haskell, you can iteratively consume a list of numbers and accumulate state via a fold function (foldl, foldr, or some variant). Hopefully that's enough to get started with.
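A minimal sketch of that fold, assuming the strict left fold foldl' from Data.List is acceptable for the exercise (the function name triangular is my own choice):

```haskell
import Data.List (foldl')

-- The accumulator (starting at 0) plays the role of the running total;
-- foldl' feeds it each element of [1..n] in turn.
triangular :: Int -> Int
triangular n = foldl' (+) 0 [1 .. n]
```

For example, triangular 3 adds 1, 2 and 3 to give 6, matching T(3) above.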

Wikipedia might be a hint here; its closed-form formula gives something like
triangular :: Int -> Int
triangular x = x * (x + 1) `div` 2
triSeries could then be something like
triSeries :: Int -> [Int]
triSeries x = map triangular [1..x]
and works like this:
> triSeries 10
[1,3,6,10,15,21,28,36,45,55]
As for iterate: maybe there is some way to use it here, but as John said, foldl would be sufficient. Take a look at this page; what you are looking for is at the very beginning.
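If iterate really is required, one possible way (a sketch, not necessarily what the exercise intends) is to iterate over a pair that carries both the index and the current triangular number:

```haskell
-- State is (index, triangular number at that index).
-- Each step moves to the next index and adds it to the running sum.
triSeries :: Int -> [Int]
triSeries n = map snd (take n (iterate step (1, 1)))
  where
    step (i, t) = (i + 1, t + i + 1)
```

Here triSeries 10 gives [1,3,6,10,15,21,28,36,45,55]. Note that this version returns [] for 0, unlike the special case triSeries 0 = [0] in the question.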

It is not clear what is meant by "recursion is not allowed, use iteration". All functions that appear to be "iterative" are recursive inside.
iterate in all your uses can only modify the input with a constant, and iterate (+1) 1 is the same as [1..]. Consider using a Data.List function that can combine a number from the infinite range [1..] with the previously computed sum to produce an infinite list of such sums:
T_i = i + T_{i-1}
This is definitely cheaper than x*(x+1) `div` 2.
Consider using a Data.List function that can produce an infinite list of finite lists of sums from an infinite list of sums. This is going to be cheaper than computing a list of 10, then a list of 11 repeating the same computation done for the list of 10, etc.
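One reading of these two hints, offered as a guess at the functions meant rather than a statement of them, is scanl1 for the running sums and inits for the growing prefixes:

```haskell
import Data.List (inits)

-- Infinite list of running sums: T_i = i + T_{i-1}.
triangulars :: [Int]
triangulars = scanl1 (+) [1 ..]

-- Infinite list of finite prefixes: [[1],[1,3],[1,3,6],...].
-- drop 1 skips the empty prefix that inits produces first.
triSeriesAll :: [[Int]]
triSeriesAll = drop 1 (inits triangulars)
```

Both lists are lazy, so take 3 triSeriesAll only forces the first three sums.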

Could a concatenative language use prefix notation?

Concatenative languages have some very intriguing characteristics, such as being able to compose functions of different arity and being able to factor out any section of a function. However, many people dismiss them because of their use of postfix notation and how it's tough to read. Plus the Polish probably don't appreciate people using their carefully crafted notation backwards.
So, is it possible to have prefix notation? If it is, what would the tradeoffs be?
I have an idea of how it could work, but I'm not experienced with concatenative languages so I'm probably missing something. Basically, a function would be evaluated in reverse order and values would be pulled from the stack in reverse order. To demonstrate this, I'll compare postfix to what prefix would look like. Here are some concatenative expressions with the traditional postfix notation.
5 dup * ! Multiply 5 by itself
3 2 - ! Subtract 2 from 3
(1, 2, 3, 4, 5) [2 >] filter length ! Get the number of integers from 1 to 5
! that are greater than 2
The expressions are evaluated from left to right: in the first example, 5 is pushed on the stack, then dup duplicates the top value on the stack, then * multiplies the top two values on the stack. Functions pull their last argument first from the stack: in the second example, when - is called, 2 is at the top of the stack, but it is the last argument.
Here is what I think prefix notation would look like:
* dup 5
- 3 2
length filter (1, 2, 3, 4, 5) [< 2]
The expressions are evaluated from right to left, and functions pull their first argument first from the stack. Note how the prefix filter example reads much more closely to its description and looks similar to the applicative style. One issue I noticed is factoring things out might not be as useful. For example, in postfix notation you can factor out 2 - from 3 2 - to create a subtractTwo function. In prefix notation you can factor out - 3 from - 3 2 to create a subtractFromThree function, which doesn't seem as useful.
Barring any glaring issues, perhaps a concatenative language that uses prefix notation could win over the people who dislike postfix notation. Any insight is appreciated.

Well certainly, if your words are still fixed-arity then it's just a matter of executing tokens right to left.
It's only because of variable-arity functions that prefix notation implies parentheses, and it's only because of wanting human "reading order" to match execution order that being a stack language implies postfix.
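To illustrate "executing tokens right to left" with fixed-arity words, here is a toy evaluator. The Token type and its four words are invented for this sketch and only cover the first two examples from the question; quotations and filter are left out.

```haskell
import Control.Monad (foldM)

-- Hypothetical token set for a tiny fixed-arity stack language.
data Token = Num Int | Dup | Mul | Sub
  deriving (Eq, Show)

-- Apply one token to the stack (the head of the list is the top).
-- Words pull their first argument from the top, as the question proposes.
step :: [Int] -> Token -> Maybe [Int]
step st (Num n) = Just (n : st)
step (x : st) Dup = Just (x : x : st)
step (x : y : st) Mul = Just (x * y : st)
step (x : y : st) Sub = Just (x - y : st)
step _ _ = Nothing -- stack underflow

-- Prefix evaluation: run the program from its last token to its first.
evalPrefix :: [Token] -> Maybe [Int]
evalPrefix = foldM step [] . reverse
```

With this, evalPrefix [Mul, Dup, Num 5] yields Just [25] and evalPrefix [Sub, Num 3, Num 2] yields Just [1], matching "* dup 5" and "- 3 2".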

I'm writing such a language right now as it happens, and so far I like some of the side-effects of using prefix notation. The semantics are based on Joy:
Files are parsed from left to right, but executed from right to left.
By extension, definitions must come after the point at which they are used.
As a nice side-effect, comments are simply lists which are dropped.
Here's the factorial function, for instance:
def 'fact [cond [* fact - 1 dup] [1 drop] dup]
I also find it easier to reason about the code as I'm writing it, but I don't have a strong background in concatenative languages. Here's my (probably-naive) derivation of the map function over lists. The 'nb' function drops something and is used for comments. 'stash [f]' pops into a temp, runs 'f' on the rest of the stack, then pushes the temp back on.
def 'map [q [cons map stash [head swap i] dup stash [tail dup]] [nb] is_cons nip]
nb [map [f] (cons x y) -> cons map [f] x f y
stash [tail dup] [f] (cons x y) = [f] y (cons x y)
dup [f] y (cons x y) = [f] [f] y (cons x y)
stash [head swap i] [f] [f] y (cons x y) = [f] x (f y)
cons map [f] x (f y) = cons map [f] x f y
map [f] [] -> []]

I just came from reading about the Om language.
It seems to be just what you are talking about. From its description (emphasis mine):
The Om language is:
a novel, maximally-simple concatenative, homoiconic programming and algorithm notation language with:
minimal syntax, comprised of only three elements.
prefix notation, in which functions manipulate the remainder of the program itself. [...]
It also states that it's not finished, and will experience much change yet.
Still, it seems to be working, and really interesting as proof of concept.

I imagine a concatenative prefix language without a stack. It could call functions, which would then themselves interpret code until they got all the operands they need. The interpreter would then call the next function. It would only need one memory construct: the result. Everything else could be read from the source code at the time of execution. As you might have noticed, I am talking about an interpreted language, not a compiled one.

Computing recurrence relations in Haskell

Greetings, StackOverflow.
Let's say I have the following two recurrence relations for computing S(i,j)
I would like to compute the values S(0,0), S(0,1), S(1,0), S(2,0), etc. in an asymptotically optimal way. A few minutes with pencil and paper reveal that it unfolds into a tree-like structure which can be traversed in several ways. Now, it's unlikely the tree will be useful later on, so for now I'm looking to produce a nested list like [[S(00)],[S(10),S(01)],[S(20),S(21),S(12),S(02)],...]. I have created a function to produce a flat list of S(i,0) (or S(0,j), depending on the first argument):
osrr xpa p predexp = series
  where
    -- the list refers to itself lazily to reach the two previous terms
    series = os00 : os00 * (xpa + rp) : zipWith3 osrr' [1..] (tail series) series
    osrr' n a b = xpa * a + rp * n * b
    os00 = sqrt (pi/p) * predexp
    rp = recip (2*p)
I am, however, at a loss as to how to proceed further.

I would suggest writing it in a direct recursive style and using memoization to create your traversal:
import qualified Data.MemoCombinators as Memo
osrr p = memoed
  where
    memoed = Memo.memo2 Memo.integral Memo.integral osrr'
    osrr' a b = ... -- recursive calls to memoed (not osrr or osrr')
The library will create an infinite table to store values you have already computed. Because the memo constructors are under the p parameter, the table exists for the scope of p; i.e. osrr 1 2 3 will create a table for the purpose of computing S(2,3), and then clean it up. You can reuse the table for a specific p by partially applying:
osrr1 = osrr p
Now osrr1 will share the table between all its calls (which, depending on your situation, may or may not be what you want).

First, there must be some boundary conditions that you've not told us about.
Once you have those, try stating the solution as a recursively defined array. This works as long as you know an upper bound on i and j. Otherwise, use memo combinators.
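As a sketch of the "recursively defined array" idea: since the actual recurrences and boundary conditions for S(i,j) are not given here, the code below uses a stand-in recurrence of my own, T(i,0) = T(0,j) = 1 and T(i,j) = T(i-1,j) + T(i,j-1), purely to show the shape. The name table and its parameters are hypothetical.

```haskell
import Data.Array (Array, array, range, (!))

-- Stand-in recurrence (an assumption, NOT the S(i,j) from the question):
--   T(i,0) = T(0,j) = 1,   T(i,j) = T(i-1,j) + T(i,j-1)
-- The array's cells refer back to the array itself; laziness ensures
-- each cell is computed at most once, on demand.
table :: Int -> Int -> Array (Int, Int) Integer
table maxI maxJ = t
  where
    bnds = ((0, 0), (maxI, maxJ))
    t = array bnds [(ix, cell ix) | ix <- range bnds]
    cell (0, _) = 1
    cell (_, 0) = 1
    cell (i, j) = t ! (i - 1, j) + t ! (i, j - 1)
```

Swapping in the real recurrence would mean replacing the three cell equations; the upper bounds maxI and maxJ have to be known up front, which is the limitation mentioned above.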

Haskell function taking a long time to process

I am doing question 12 of Project Euler, where I must find the first triangle number with 501 divisors. So I whipped this up in Haskell:
divS n = [ x | x <- [1..(n)], n `rem` x == 0 ]
tri n = (n* (n+1)) `div` 2
divL n = length (divS (tri n))
answer = [ x | x <- [100..] , 501 == (divL x)]
The first function finds the divisors of a number.
The second function calculates the nth triangle number.
The third function finds the number of divisors of the nth triangle number.
The fourth should return the triangle number which has 501 divisors.
But so far this has run for a while without returning a result. Is the answer very large, or do I need some serious optimisation to make this work in a realistic amount of time?

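You need to use properties of the divisor function: http://en.wikipedia.org/wiki/Divisor_function
Notice that n and n + 1 are always coprime and that the divisor function d is multiplicative on coprime arguments, so you can get d(n * (n + 1) / 2) by multiplying previously computed values: d(n/2) * d(n + 1) when n is even, and d(n) * d((n + 1)/2) when n is odd.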
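A rough sketch of how that could be used, with made-up names (numDivisors, triDivisors, firstTriangleOver). It recomputes divisor counts instead of caching the previous value, and counts divisors from a simple trial-division factorisation, so it is only meant to show the coprimality trick:

```haskell
-- d(p1^a1 * ... * pk^ak) = (a1 + 1) * ... * (ak + 1)
numDivisors :: Int -> Int
numDivisors = go 2
  where
    go p m
      | m == 1 = 1
      | p * p > m = 2 -- what is left of m is a single prime
      | m `rem` p == 0 =
          let (e, m') = strip p m 0
           in (e + 1) * go (p + 1) m'
      | otherwise = go (p + 1) m
    -- Divide out every factor p, returning the exponent and the remainder.
    strip p m e
      | m `rem` p == 0 = strip p (m `div` p) (e + 1)
      | otherwise = (e, m)

-- d(T(n)), splitting n * (n + 1) / 2 into two coprime factors.
triDivisors :: Int -> Int
triDivisors n
  | even n = numDivisors (n `div` 2) * numDivisors (n + 1)
  | otherwise = numDivisors n * numDivisors ((n + 1) `div` 2)

-- First triangle number with more than k divisors.
firstTriangleOver :: Int -> Int
firstTriangleOver k =
  head [n * (n + 1) `div` 2 | n <- [1 ..], triDivisors n > k]
```

For instance, firstTriangleOver 5 is 28, whose six divisors are 1, 2, 4, 7, 14 and 28.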

It is probably faster to prime-factorise the number and then use the factorisation to find the divisors, than using trial division with all numbers <= sqrt(n).
The Sieve of Eratosthenes is a classical way of finding primes, which may be modified slightly to find the number of divisors of each natural number. Instead of just marking each non-prime as "not prime", you could make a list of all the primes dividing each number.
You can then use those primes to compute the complete set of divisors, or just the number of them, since that is all you need.
Another variation would be to mark not just multiples of primes, but multiples of all natural numbers. Then you could simply use a counter to keep track of the number of divisors for each number.
You also might want to check out The Genuine Sieve of Eratosthenes, which explains why trial division is way slower than the real sieve.
Lastly, you should look carefully at the different kinds of arrays in Haskell. I think it is probably easier to use the ST monad to implement the sieve, but it might be possible to achieve the correct complexity using accumArray, if you can make sure that your update function is strict. I have never managed to get this to work though, so you are on your own here.
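For what it's worth, the "mark multiples of all natural numbers and keep a counter" variation can be written quite compactly with accumArray. This is only a sketch with a hypothetical name (divisorCounts); I have not measured whether it reaches the intended complexity, which depends on how strictly the accumulation is performed.

```haskell
import Data.Array (Array, accumArray, (!))

-- divisorCounts limit ! k is the number of divisors of k, for 1 <= k <= limit.
-- Every d contributes 1 to each of its multiples m <= limit.
divisorCounts :: Int -> Array Int Int
divisorCounts limit =
  accumArray (+) 0 (1, limit)
    [(m, 1) | d <- [1 .. limit], m <- [d, 2 * d .. limit]]
```

The table needs an upper limit chosen in advance, so for an open-ended search you would have to guess a bound and enlarge it if the answer is not found.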

If you were using C instead of Haskell, your function would still take a long time.
To make it faster you will need to improve the algorithm, using suggestions from the above answers. I suggest changing the title and question description accordingly. Following that I'll delete this comment.
If you wish, I can spoil the problem by sharing my solution.
For now I'll give you my top-level code:
main =
  print .
  head . filter ((> 500) . length . divisors) .
  map (figureNum 3) $ [1..]
The algorithmic improvement lies in the divisors function. You can further improve it using rawicki's suggestion, but already this takes less than 100ms.

Some optimization tips:
Check for divisors between 1 and sqrt(n) only. Every divisor d below sqrt(n) pairs with the divisor n/d above it, so each hit counts twice (once if d*d == n).
Don't build a list of divisors and then count the list; count them directly.
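Put together, the two tips might look like this (countDivisors is just an illustrative name):

```haskell
-- Count divisors by testing candidates d with d * d <= n.
-- Each hit below the square root stands for the pair (d, n `div` d);
-- an exact square root is counted once.
countDivisors :: Int -> Int
countDivisors n = go 1 0
  where
    go d acc
      | d * d > n = acc
      | d * d == n = acc + 1
      | n `rem` d == 0 = go (d + 1) (acc + 2)
      | otherwise = go (d + 1) acc
```

Compared with the original divS, this does about sqrt(n) remainder tests per number instead of n, and never allocates the list of divisors.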
