Periodicity (Fibonacci mod sequence) in an infinite list in Haskell

I need to write a function in Haskell that works as follows:
periodicity :: [Integer] -> [Integer]
periodicity [1,2,3,3,4,1,2,3,3,4...] = [1,2,3,4]
periodicity [0,1,2,2,5,4,3,3,0,1,2,5,4...] = [0,1,2,5,4,3]
That is, from a list you get the part that repeats forever, which in mathematics would be called the period of a function.
I've tried this, but it doesn't work the way I want, because I need it to work with infinite lists:
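periodicity :: Eq a => [a] -> [a]
periodicity xs = take n xs
  where l = length xs
        n = head [m | m <- divisors l,
                      concat (replicate (l `div` m) (take m xs)) == xs]
        -- divisors is not defined in the question; a simple definition is assumed here
        divisors k = [d | d <- [1..k], k `mod` d == 0]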
I have found this function, which gives me the length of the period. With it I could solve the problem, but I don't understand the code after the where:
periodo 1 = 1
periodo n = f 1 ps 0
  where
    f 0 (1 : xs) pi = pi
    f _ (x : xs) pi = f x xs (pi + 1)
    ps = 1 : 1 : zipWith (\u v -> (u + v) `mod` n) (tail ps) ps

The function you want, as you have stated it, is impossible [1].
But since you said that what you are really after is the Pisano period, it's enough to notice that two successive numbers are enough to determine the remainder of a Fibonacci sequence (mod n or otherwise). So you are really looking for the first reoccurrence of an adjacent pair, e.g.
0, 1, 1, 2, 0, 2, 2, 1, 0, 1, 1, 2, 0, 2, 2, 1, 0, 1, 1, 2, 0, 2, 2, 1, 0
^^^^                    ^^^^
[--------- 8 -----------)
I am not much for coding people's problems for them, but I can sketch the way I would solve this. One thing to keep in mind is that the periodicity might have a prefix that does not repeat -- I don't know whether this actually occurs in Fibonacci sequences mod n, but it occurs in general. So we need to be prepared to throw away a prefix.
First, zip the list with its tail to get a list of adjacent pairs
[ 0, 1, 1, 2, 0, 2, 2, 1 ...]
-> [(0,1), (1,1), (1,2), (2,0), (0,2), (2,2), (2,1), ... ]
From this, fold through the list building a Data.Map keyed on this pair, where the value is the index at which the pair first occurred. You could do this with foldr but I'd probably just use a recursive function with an accumulator. For the above example the map at each step would look like:
{(0,1): 0}
{(0,1): 0, (1,1): 1}
{(0,1): 0, (1,1): 1, (1,2): 2}
{(0,1): 0, (1,1): 1, (1,2): 2, (2,0): 3}
...
When you reach a point in the list where the key is already present, you can then subtract the index stored in the map from the current index, and there's your period. The stored index is also the length of the non-repeating prefix.
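The sketch above could look roughly like the following. The names fibsMod and pairPeriod are made up for illustration and are not from the question. The result is the pair (index of the first occurrence, period), or Nothing if a finite list ends before any adjacent pair repeats. It is only meaningful for sequences in which two consecutive elements determine everything that follows, and fibsMod assumes n is at least 1.

```haskell
import qualified Data.Map.Strict as Map

-- Fibonacci numbers reduced mod n, starting from 0, 1 (hypothetical helper; assumes n >= 1).
fibsMod :: Integer -> [Integer]
fibsMod n = ps
  where ps = 0 : (1 `mod` n) : zipWith (\u v -> (u + v) `mod` n) ps (drop 1 ps)

-- Walk the adjacent pairs, remembering the index at which each pair was first seen.
-- On the first repeated pair, return (index of first occurrence, distance between occurrences).
pairPeriod :: Ord a => [a] -> Maybe (Int, Int)
pairPeriod xs = go Map.empty 0 (zip xs (drop 1 xs))
  where
    go _ _ [] = Nothing
    go seen i (p : rest) =
      case Map.lookup p seen of
        Just j  -> Just (j, i - j)
        Nothing -> go (Map.insert p i seen) (i + 1) rest
```

For the mod 3 example in the diagram, pairPeriod (fibsMod 3) gives Just (0,8), so the repeating part is take 8 (fibsMod 3).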
[1] Here's a proof. Let's say you have the specification for a Turing machine, and you make a list steps of the steps of its execution. This list will be finite if it halts, infinite otherwise. Now construct this list:
bad = zipWith const (cycle [1,2,3]) steps ++ cycle [1,2,3,4]
This list cycles with period 3 as long as the machine runs, and with period 4 afterward. So if the Turing machine halts, periodicity bad = 4, otherwise periodicity bad = 3. That is, periodicity can decide the halting problem, which is impossible.

What you are asking for is impossible for an arbitrary infinite list. We can only examine a finite sublist in finite time, and the next element of the list might, for all we know, break the pattern.
In your comments, you clarify that you really are looking for a periodic part of the Fibonacci sequence, modulo m. In that special case, it is possible, if I understand you correctly.
The Fibonacci sequence (mod m) is periodic after a certain point if the same value repeats three times in a row: the previous two values are then both equal to their predecessors, so the sequence becomes periodic with a period of 1. It is also periodic after a certain point if any sequence of two or more numbers repeats even once, as then we know that the current value and its predecessor are repeats of the values k terms before each of them, and the recurrence will generate the same subsequence again with period k. There is no shorter period, or we would have detected it, going left to right.
Furthermore, any sequence that repeats infinitely will repeat once first, so this detects all such sequences.
Therefore, a better way to calculate this than I originally wrote would be to search for the current number and its predecessor earlier in the list. (You can use luqui’s strategy of building a list of consecutive pairs, or search the same data structure recursively instead of building a new one.) If a match exists, the sequence is guaranteed to repeat with a period equal to the distance between the two appearances of the same pair.
That takes time quadratic in the length of the non-periodic initial subsequence, since you search each initial subsequence from the beginning. It can be done in linear time with an upper bound of m²+2 steps. There are only m possible values, and so only m² possible pairs of values, and a sequence of k numbers contains k-1 consecutive pairs of numbers. By the pigeonhole principle, the first m²+2 elements of the sequence must therefore contain some pair of consecutive values in two different places, and the sequence is periodic from the first instance of that pair onward. So searching that fixed-length initial subsequence suffices, and we can build a table of the index (if any) of each of the m² potential pairs in the list until we encounter the first duplicate. (That said, we would need to use a mutable array, so we sacrifice either speed or functional purity.)
This is similar to luqui’s algorithm, but with a faster lookup.
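A minimal sketch of the table-based variant, assuming the array package that ships with GHC and m at least 1. The name pisanoTable and the encoding of a pair (a, b) as the index a * m + b are choices made here for illustration. The mutation is confined to the ST monad, so the function is still pure from the outside.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newArray, readArray, writeArray)

-- Returns (index of the first occurrence of the repeated pair, period) for Fibonacci mod m.
-- The table has one slot per possible pair; -1 marks a pair that has not been seen yet.
pisanoTable :: Int -> (Int, Int)
pisanoTable m = runST $ do
  table <- newArray (0, m * m - 1) (-1) :: ST s (STUArray s Int Int)
  let go i a b = do
        j <- readArray table (a * m + b)
        if j >= 0
          then pure (j, i - j)                    -- pair seen before at index j
          else do
            writeArray table (a * m + b) i        -- record first occurrence
            go (i + 1) b ((a + b) `mod` m)
  go 0 0 (1 `mod` m)
```

The pigeonhole argument guarantees that the loop stops after at most m²+1 pairs, since by then some slot must already be filled.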
Conjecture
The sequence is periodic iff 0:1 appears more than once. If every Fibonacci sequence (mod m) is periodic, then the period is simply the position of the second occurrence of [0,1].
0:1 would be generated only by a preceding -1:1, which would be generated by a preceding -3:2, which would be generated by a preceding -8:5, and so on. [...,-8,5,-3,2,-1,1,0] is exactly the Fibonacci sequence, backwards, with alternating sign, mod m, and if any two consecutive numbers appear in the original sequence, it is periodic. Thus, iff [0,1,1] would ever be generated by this pattern, it will eventually generate 0:1 in the Fibonacci sequence mod m. This occurs iff m-1 and 1 occur consecutively in Fibo mod m, in either order.
Two Special Cases
If Fibo mod m contains m-1:1 at position i, the sequence has period i+2, and if it contains 1:m-1, the sequence has period 2i+4. (If the sequence contains 1:-1, the next position is i+2 and the next i+2 steps are: {0,-1,-1,-2,-3,-5,...,-1,1}). So this lets us shortcut a bit; when we see 1,4 at position 8 of Fibo mod 5, we know the sequence has a period of 20. In this special case, the scan needs fewer than half the elements on average, has an upper bound of m²/2+1 elements to scan in order to rule the case out, and uses constant memory.
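The shortcut can be sketched as a constant-memory scan, assuming the claims above hold. The name pisanoShortcut is hypothetical. Positions are counted from 0 with the sequence starting 0, 1, so the scan below begins at position 1 with the pair (1, 1). The m-1:1 case is tested before the 1:m-1 case so that m = 2, where both patterns coincide, yields the shorter period.

```haskell
-- Period of Fibonacci mod m, found by scanning for 0:1, (m-1):1 or 1:(m-1).
pisanoShortcut :: Int -> Int
pisanoShortcut m
  | m < 2     = 1                        -- mod 1 the sequence is constantly 0
  | otherwise = go 1 1 1                 -- position 1 holds the pair (1, 1)
  where
    go i a b
      | (a, b) == (0, 1)     = i         -- the start pair came back: period i
      | (a, b) == (m - 1, 1) = i + 2     -- two more steps reach 0:1
      | (a, b) == (1, m - 1) = 2 * i + 4 -- negated copy follows, then the original
      | otherwise            = go (i + 1) b ((a + b) `mod` m)
```

For m = 5 the scan stops at the pair 1,4 in position 8 and returns 20, as in the example above.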

Related

Why is [a..b] an empty list when a > b?

If I enter [5..1] into the Haskell console it returns [], whereas I expected [5, 4, 3, 2, 1].
In general, [a..b] = [] if a > b. Why?
The Report covers the details. In Section 3.10:
Arithmetic sequences satisfy these identities:
[ e1..e3 ] = enumFromTo e1 e3
In Section 6.3.4:
For the types Int and Integer, the enumeration functions have the following meaning:
The sequence enumFromTo e1 e3 is the list [e1,e1 + 1,e1 + 2,…e3]. The list is empty if e1 > e3.
For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3 + i∕2 for positive increment i, or when they become less than e3 + i∕2 for negative i.
Then the next question is, "Why was the Report specified that way?". There I think the answer is that this choice is quite natural for a mathematician, which most of the original committee were to some extent. It also has a number of nice properties:
If [x..y] has n values, then [x..y-1] and [x+1..y] have n-1 values (where in n-1, the subtraction saturates at 0, an ahem natural choice).
Checking whether a particular element is in the range [x..y] only requires checking that it is at least x and at most y -- you need not first determine which of x or y is bigger.
It prevents a certain class of surprising off-by-one errors: if you want to take the next n>=0 elements after x, you can write [x..x+n-1]. If you choose the other rule, where [x..y] might mean [y,y+1,...,x] if y is smaller, there is no way to create an empty list with [_.._] syntax, so no uniform way to take the next n elements. One would have to write the more cumbersome if n>0 then [x..x+n-1] else []; and it would be very easy to forget to write this check.
If you would like the list [5,4,3,2,1], that may be achieved by specifying an explicit second step, as in [5,4..1].
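As a small illustration, the explicit-step form can be wrapped in a helper that counts in either direction. The name rangeEither is hypothetical, and note that such a helper gives up the empty-range property argued for above, since it can never return [] from two bounds.

```haskell
-- Hypothetical helper: counts up when a <= b, down otherwise.
rangeEither :: Int -> Int -> [Int]
rangeEither a b
  | a <= b    = [a .. b]          -- enumFromTo a b
  | otherwise = [a, a - 1 .. b]   -- enumFromThenTo a (a - 1) b
```

Here rangeEither 5 1 is [5,4,3,2,1], while the plain [5..1] stays [].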

Taxicab Numbers in Haskell

A taxicab number is defined as a positive integer that can be expressed as a sum of two cubes in at least two different ways.
1729=1^3+12^3=9^3+10^3
I wrote this code to produce a taxicab number which on running would give the nth smallest taxicab number:
taxicab :: Int -> Int
taxicab n = [(cube a + cube b)
            | a <- [1..100],
              b <- [(a+1)..100],
              c <- [(a+1)..100],
              d <- [(c+1)..100],
              (cube a + cube b) == (cube c + cube d)]!!(n-1)
cube x = x * x * x
But the output I get is not what I expected. For the numbers one to three the code produces correct output, but taxicab 4 produces 39312 instead of 20683. Another strange thing is that 39312 is actually the 6th smallest taxicab number, not the fourth!
So why is this happening? Where is the flaw in my code?
I think you mistakenly believe that your list contains the taxicab numbers in an increasing order. This is the actual content of your list:
[1729,4104,13832,39312,704977,46683,216027,32832,110656,314496,
216125,439101,110808,373464,593047,149389,262656,885248,40033,
195841,20683,513000,805688,65728,134379,886464,515375,64232,171288,
443889,320264,165464,920673,842751,525824,955016,994688,327763,
558441,513856,984067,402597,1016496,1009736,684019]
Recall that a list comprehension such as [(a,b) | a<-[1..100],b<-[1..100]] will generate its pairs as follows:
[(1,1),...,(1,100),(2,1),...,(2,100),...,...,(100,100)]
Note that when a gets to its next value, b is restarted from 1. In your code, suppose you just found a taxicab number of the form a^3+b^3, and then no larger b gives you a taxicab. In such case the next value of a is tried. We might find a taxicab of the form (a+1)^3+b'^3 but there is no guarantee that this number will be larger, since b' is any number in [a+2..100], and can be smaller than b. This can also happen with larger values of a: when a increases, there's no guarantee its related taxicabs are larger than what we found before.
Also note that, for the same reason, a hypothetical taxicab of the form 101^3+b^3 could be smaller than the taxicabs you have on your list, but it does not occur there.
Finally, note that your function is quite inefficient, since every time you call taxicab n you recompute all of the first n taxicab values.
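One possible fix, as a sketch rather than the only approach: collect all sums of two cubes up to a bound, sort them, and keep the values that occur at least twice. The name taxicabsUpTo is hypothetical. Sums above limit³ are discarded because some of their representations could involve a cube root larger than the bound and so be missed.

```haskell
import Data.List (group, sort)

-- All taxicab numbers not exceeding limit^3, in increasing order.
taxicabsUpTo :: Int -> [Int]
taxicabsUpTo limit =
  [ s | (s : _ : _) <- group (sort sums)   -- keep sums that occur at least twice
      , s <= limit ^ (3 :: Int) ]          -- only sums whose representations are all in range
  where
    sums = [ a ^ (3 :: Int) + b ^ (3 :: Int) | a <- [1 .. limit], b <- [a + 1 .. limit] ]

-- Same interface as in the question, with the same bound of 100.
taxicab :: Int -> Int
taxicab n = taxicabsUpTo 100 !! (n - 1)
```

With this version taxicab 4 is 20683 and taxicab 6 is 39312. Each call still rebuilds the list; binding taxicabsUpTo 100 to a top-level name would let it be shared between calls.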

Counting change in Haskell

I came across the following solution to the DP problem of counting change:
count' :: Int -> [Int] -> Int
count' cents coins = aux coins !! cents
  where aux = foldr addCoin (1:repeat 0)
          where addCoin c oldlist = newlist
                  where newlist = (take c oldlist) ++ zipWith (+) newlist (drop c oldlist)
It ran much faster than my naive top-down recursive solution, and I'm still trying to understand it.
I get that given a list of coins, aux computes every solution for the positive integers. Thus the solution for an amount is to index the list at that position.
I'm less clear on addCoin, though. It somehow uses the value of each coin to draw elements from the list of coins? I'm struggling to find an intuitive meaning for it.
The fold in aux also ties my brain up in knots. Why is 1:repeat 0 the initial value? What does it represent?
It's a direct translation of the imperative DP algorithm for the problem, which looks like this (in Python):
def count(cents, coins):
    solutions = [1] + [0]*cents # [1, 0, 0, 0, ... 0]
    for coin in coins:
        for i in range(coin, cents + 1):
            solutions[i] += solutions[i - coin]
    return solutions[cents]
In particular, addCoin coin solutions corresponds to
for i in range(coin, cents + 1):
    solutions[i] += solutions[i - coin]
except that addCoin returns a modified list instead of mutating the old one. As to the Haskell version, the result should have an unchanged section at the beginning until the coin-th element, and after that we must implement solutions[i] += solutions[i - coin].
We realize the unchanged part by take c oldlist and the modified part by zipWith (+) newlist (drop c oldlist). In the modified part we add together the i-th elements of the old list and i - coin-th elements of the resulting list. The shifting of indices is implicit in the drop and take operations.
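To make the index shifting concrete, here is addCoin lifted out of the where clause so it can be tried on its own, together with the starting list. The name base is introduced here only for illustration.

```haskell
-- addCoin exactly as in the question, moved to the top level.
addCoin :: Int -> [Int] -> [Int]
addCoin c oldlist = newlist
  where newlist = take c oldlist ++ zipWith (+) newlist (drop c oldlist)

-- With no coins, there is one way to make 0 and no way to make anything else.
base :: [Int]
base = 1 : repeat 0
```

After adding a coin of value 2, take 6 (addCoin 2 base) is [1,0,1,0,1,0]: only even amounts can be made, each in one way. Adding a coin of value 1 on top, take 6 (addCoin 1 (addCoin 2 base)) is [1,1,2,2,3,3].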
A simpler, classic example for this kind of shifting and recursive definition is the Fibonacci numbers:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
We would write this imperatively as
def fibs(limit):
    res = [0, 1] + [0]*(limit - 2)
    for i in range(2, limit):
        res[i] = res[i - 2] + res[i - 1]
    return res
Turning back to coin change, foldr addCoin (1:repeat 0) corresponds to the initialization of solutions and the for loop on the coins, with the change that the initial list is infinite instead of finite (because laziness lets us do that).

Generating triangular number using iteration in haskell

I am trying to write a function in Haskell to generate triangular numbers. I am not allowed to use recursion; I am supposed to use iteration.
Here is my code:
triSeries 0 = [0]
triSeries n = take n $ iterate (\x->(0+x)) 1
I know that the function I pass to iterate is wrong.
I have spent hours looking for the right function. Any hints, please?
Start by writing out some triangular numbers
T(1) = 1
T(2) = 1 + 2
T(3) = 1 + 2 + 3
An iterative process to generate T(n) is to start from [1..n], take the first element of the list, and add it to a running total. In a language with mutable state, you might write:
def tri(n):
    total = 0
    for x in range(1, n + 1):
        total += x
    return total
In Haskell, you can iteratively consume a list of numbers and accumulate state via a fold function (foldl, foldr, or some variant). Hopefully that's enough to get started with.
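As one possible reading of that hint, the loop can be written as a strict left fold whose accumulator plays the role of the running total. The name tri is taken from the Python sketch above.

```haskell
import Data.List (foldl')

-- T(n): fold over [1..n], adding each element to the running total.
tri :: Int -> Int
tri n = foldl' (+) 0 [1 .. n]
```

For example, tri 4 is 1 + 2 + 3 + 4 = 10, and tri 0 is 0 because the list is empty.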
Wikipedia may offer a hint: the closed-form formula there gives something like
triangular :: Int -> Int
triangular x = x * (x + 1) `div` 2
triSeries could be something like
triSeries :: Int -> [Int]
triSeries x = map triangular [1..x]
and works like that
> triSeries 10
[1,3,6,10,15,21,28,36,45,55]
As for iterate: maybe there is some way to use it here, but as John said, foldl would be sufficient. Take a look at this page; what you are looking for is at the very beginning.
It is not clear what is meant by "recursion is not allowed, use iteration". All functions that appear to be "iterative" are recursive inside.
iterate in all your uses can only modify the input with a constant, and iterate (+1) 1 is the same as [1..]. Consider using a Data.List function that can combine a number from the infinite range [1..] and the previously computed sum to produce an infinite list of such sums:
T_i=i+T_{i-1}
This is definitely cheaper than x*(x+1) `div` 2
Consider using a Data.List function that can produce an infinite list of finite lists of sums from an infinite list of sums. This is going to be cheaper than computing a list of 10, then a list of 11 repeating the same computation done for the list of 10, etc.
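The answer does not name the functions it has in mind. One guess is scanl1 for the running sums and inits for the list of finite prefixes, sketched below. The names triangulars and triPrefixes are made up here.

```haskell
import Data.List (inits)

-- Running sums of [1..]: each element is the index plus the previous sum.
triangulars :: [Int]
triangulars = scanl1 (+) [1 ..]

-- The first n triangular numbers.
triSeries :: Int -> [Int]
triSeries n = take n triangulars

-- Every finite prefix of the infinite list, starting with the empty one.
triPrefixes :: [[Int]]
triPrefixes = inits triangulars
```

With these, triSeries 10 is [1,3,6,10,15,21,28,36,45,55], and take 3 triPrefixes is [[],[1],[1,3]].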

Finding the minimum number of swaps to convert one string to another, where the strings may have repeated characters

I was looking through a programming question, when the following question suddenly seemed related.
How do you convert a string to another string using as few swaps as possible, where a swap is defined as follows? The strings are guaranteed to be interconvertible (they have the same set of characters, this is given), but the characters can be repeated. I saw web results on the same question, without the characters being repeated though.
Any two characters in the string can be swapped.
For instance : "aabbccdd" can be converted to "ddbbccaa" in two swaps, and "abcc" can be converted to "accb" in one swap.
Thanks!
This is an expanded and corrected version of Subhasis's answer.
Formally, the problem is, given an n-letter alphabet V and two m-letter words, x and y, for which there exists a permutation p such that p(x) = y, determine the least number of swaps (permutations that fix all but two elements) whose composition q satisfies q(x) = y. Assuming that m-letter words are maps from the set {1, ..., m} to V and that p and q are permutations on {1, ..., m}, the action p(x) is defined as the composition p followed by x.
The least number of swaps whose composition is p can be expressed in terms of the cycle decomposition of p. When j_1, ..., j_k are pairwise distinct in {1, ..., m}, the cycle (j_1 ... j_k) is a permutation that maps j_i to j_{i+1} for i in {1, ..., k - 1}, maps j_k to j_1, and maps every other element to itself. The permutation p is the composition of every distinct cycle (j p(j) p(p(j)) ... j'), where j is arbitrary and p(j') = j. The order of composition does not matter, since each element appears in exactly one of the composed cycles. A k-element cycle (j_1 ... j_k) can be written as the product (j_1 j_k) (j_1 j_{k-1}) ... (j_1 j_2) of k - 1 swaps. In general, every permutation can be written as a composition of m - c swaps, where c is the number of cycles in its cycle decomposition (counting fixed points as cycles of length one). A straightforward induction proof shows that this is optimal.
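For a single given permutation, the m - c count can be computed by walking each cycle once. This is a minimal sketch, assuming the permutation is supplied as a list of 0-based images (position j maps to the j-th list element) and is a valid permutation. The name minSwapsPerm is hypothetical. It does not solve the repeated-character problem, where the permutation itself must be chosen.

```haskell
import qualified Data.IntMap.Strict as IntMap
import qualified Data.IntSet as IntSet

-- Minimum number of swaps composing to the given permutation: m minus its number of cycles.
minSwapsPerm :: [Int] -> Int
minSwapsPerm images = length images - cycles
  where
    perm = IntMap.fromList (zip [0 ..] images)
    (cycles, _) = foldl visit (0, IntSet.empty) (IntMap.keys perm)
    -- Start a new cycle at every element not reached by an earlier walk.
    visit (count, seen) j
      | j `IntSet.member` seen = (count, seen)
      | otherwise              = (count + 1, walk j seen)
    -- Follow j, p(j), p(p(j)), ... until the cycle closes.
    walk j seen
      | j `IntSet.member` seen = seen
      | otherwise              = walk (perm IntMap.! j) (IntSet.insert j seen)
```

For example, the 3-cycle [1,2,0] needs 3 - 1 = 2 swaps, and the identity [0,1,2] needs none.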
Now we get to the heart of Subhasis's answer. Instances of the asker's problem correspond one-to-one with Eulerian (for every vertex, in-degree equals out-degree) digraphs G with vertices V and m arcs labeled 1, ..., m. For j in {1, ..., m}, the arc labeled j goes from y(j) to x(j). The problem in terms of G is to determine the maximum number of parts that a partition of the arcs of G into directed cycles can have. (Since G is Eulerian, such a partition always exists.) This is because the permutations q such that q(x) = y are in one-to-one correspondence with the partitions, as follows. For each cycle (j_1 ... j_k) of q, there is a part whose directed cycle consists of the arcs labeled j_1, ..., j_k.
The problem with Subhasis's NP-hardness reduction is that arc-disjoint cycle packing on Eulerian digraphs is a special case of arc-disjoint cycle packing on general digraphs, so an NP-hardness result for the latter has no direct implications for the complexity status of the former. In very recent work (see the citation below), however, it has been shown that, indeed, even the Eulerian special case is NP-hard. Thus, by the correspondence above, the asker's problem is as well.
As Subhasis hints, this problem can be solved in polynomial time when n, the size of the alphabet, is fixed (fixed-parameter tractable). Since there are O(n!) distinguishable cycles when the arcs are unlabeled, we can use dynamic programming on a state space of size O(mn), the number of distinguishable subgraphs. In practice, that might be sufficient for (let's say) a binary alphabet, but if I were to try to solve this problem exactly on instances with large alphabets, then I likely would try branch and bound, obtaining bounds by using linear programming with column generation to pack cycles fractionally.
@article{DBLP:journals/corr/GutinJSW14,
author = {Gregory Gutin and
Mark Jones and
Bin Sheng and
Magnus Wahlstr{\"o}m},
title = {Parameterized Directed $k$-Chinese Postman Problem and $k$
Arc-Disjoint Cycles Problem on Euler Digraphs},
journal = {CoRR},
volume = {abs/1402.2137},
year = {2014},
ee = {http://arxiv.org/abs/1402.2137},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
You can construct the "difference" strings S and S', i.e. a string which contains the characters at the differing positions of the two strings, e.g. for acbacb and abcabc it will be cbcb and bcbc. Let us say this contains n characters.
You can now construct a "permutation graph" G which will have n nodes and an edge from i to j if S[i] == S'[j]. In the case of all unique characters, it is easy to see that the required number of swaps will be (n - number of cycles in G), which can be found out in O(n) time.
However, in the case where there are any number of duplicate characters, this reduces to the problem of finding out the largest number of cycles in a directed graph, which, I think, is NP-hard, (e.g. check out: http://www.math.ucsd.edu/~jverstra/dcig.pdf ).
In that paper a few greedy algorithms are pointed out, one of which is particularly simple:
At each step, find the minimum length cycle in the graph (e.g. Find cycle of shortest length in a directed graph with positive weights )
Delete it
Repeat until all vertices have been covered.
However, there may be efficient algorithms utilizing the properties of your case (the only one I can think of is that your graphs will be K-partite, where K is the number of unique characters in S). Good luck!
Edit:
Please refer to David's answer for a fuller and correct explanation of the problem.
Do an A* search (see http://en.wikipedia.org/wiki/A-star_search_algorithm for an explanation) for the shortest path through the graph of equivalent strings from one string to the other. Use the Levenshtein distance / 2 as your cost heuristic.
