What does heuristic h: {1, ... , N} --> R mean? - search

I want to know what a heuristic h: {1, ... , N} --> R, with the goal state always being 1, means.
The states are represented as points in a 2D Cartesian system, with coordinates (x,y).

Wikipedia describes this notation of a function. In your specific case:
h: {1, ..., N} --> R
we have:
h: The function's symbol (h for heuristic)
{1, ..., N}: the domain of your function, in this case the set of all integers from 1 up to and including N. This is the "input" that your function can take. Note that this means that your function h(x) is not, for example, defined for x = 1.5. It can only take integers between 1 and N (both inclusive) as input.
R: The codomain of your function, in this case the set R, which is probably supposed to denote the set of all real numbers. Your function can produce any real number as output.
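As a concrete illustration, here is a minimal Python sketch of such a function. The coordinates, the helper name `make_heuristic`, and the choice of straight-line distance to state 1 are all assumptions made for the example; the question does not say which heuristic is intended.

```python
import math

def make_heuristic(coords, goal=1):
    """Build h: {1, ..., N} -> R from a table of state coordinates.

    coords maps each state number to its (x, y) point. Using the
    Euclidean distance to the goal is an assumption for illustration.
    """
    gx, gy = coords[goal]

    def h(state):
        # h is only defined on the domain {1, ..., N}.
        if state not in coords:
            raise ValueError(f"h is not defined for {state!r}")
        x, y = coords[state]
        return math.hypot(x - gx, y - gy)  # a real number (the codomain R)

    return h

# Hypothetical states 1..3; state 1 is the goal.
h = make_heuristic({1: (0, 0), 2: (3, 4), 3: (1, 1)})
print(h(1), h(2))
```

Calling `h(1.5)` raises an error, mirroring the point that 1.5 is outside the domain.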

Related

Octave fplot abs looks very strange

f = @(x)(abs(x))
fplot(f, [-1, 1])
Freshly installed octave, with no configuration edited. It results in the following image, where it looks as if it is constant for a while around 0, looking more like a \_/ than a \/:
Why does it look so different from a usual plot of the absolute value near 0? How can this be fixed?
Since fplot is written in Octave it is relatively easy to read. Its location can be found using the which command. On my system this gives:
octave:1> which fplot
'fplot' is a function from the file /usr/share/octave/5.2.0/m/plot/draw/fplot.m
Examining fplot.m reveals that the function to be plotted, f(x), is evaluated at n equally spaced points between the given limits. The algorithm for determining n starts at line 192 and can be summarised as follows:
n is initially chosen to be 8 (unless specified differently by the user)
Construct a vector of arguments using a coarser grid of n/2 + 1 points:
x0 = linspace (limits(1), limits(2), n/2 + 1)'
(The linspace function will accept a non-integer value for the number of points, which it rounds down)
Calculate the corresponding values:
y0 = f(x0)
Construct a vector of arguments using a grid of n points:
x = linspace (limits(1), limits(2), n)'
Calculate the corresponding values:
y = f(x)
Construct a vector of values corresponding to the members of x but calculated from x0 and y0 by linear interpolation using the function interp1():
yi = interp1 (x0, y0, x, "linear")
Calculate an error metric using the following formula:
err = 0.5 * max (abs ((yi - y) ./ (yi + y + eps))(:))
That is, err is proportional to the maximum difference between the calculated and linearly interpolated values.
If err is greater than tol (2e-3 unless specified by the user) then put n = 2*(n-1) and repeat. Otherwise plot(x,y).
Because abs(x) is essentially a pair of straight lines, if x0 contains zero then the linearly interpolated values will always exactly match their corresponding calculated values and err will be exactly zero, so the above algorithm will terminate at the end of the first iteration. If x doesn't contain zero then plot(x,y) will be called on a set of points that doesn't include the 'cusp' of the function and the strange behaviour will occur.
This will happen if the limits are equally spaced either side of zero and floor(n/2 + 1) is odd, which is the case for the default values (limits = [-5, 5], n = 8).
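The first iteration described above can be re-created outside Octave. The following Python sketch is only a re-implementation of the summarised steps, not the actual fplot.m code; the helper names are made up, and exact rational arithmetic (`fractions.Fraction`) is used instead of floating point so that "exactly zero" can be checked literally.

```python
from fractions import Fraction

def linspace(lo, hi, count):
    """Return `count` equally spaced points from lo to hi, inclusive."""
    step = (hi - lo) / (count - 1)
    return [lo + i * step for i in range(count)]

def interp_linear(x0, y0, x):
    """Piecewise-linear interpolation of the samples (x0, y0) at x."""
    for i in range(len(x0) - 1):
        xa, xb = x0[i], x0[i + 1]
        if xa <= x <= xb:
            return y0[i] + (y0[i + 1] - y0[i]) * (x - xa) / (xb - xa)
    raise ValueError("x lies outside the sampled range")

def first_pass(f, lo, hi, n=8):
    """One pass of the summarised algorithm; returns (x, y, err)."""
    lo, hi = Fraction(lo), Fraction(hi)
    x0 = linspace(lo, hi, n // 2 + 1)   # coarse grid, n/2 + 1 rounded down
    y0 = [f(v) for v in x0]
    x = linspace(lo, hi, n)             # fine grid that would be plotted
    y = [f(v) for v in x]
    yi = [interp_linear(x0, y0, v) for v in x]
    eps = Fraction(1, 2 ** 52)          # stand-in for Octave's eps
    err = max(abs((a - b) / (a + b + eps)) for a, b in zip(yi, y)) / 2
    return x, y, err

x, y, err = first_pass(abs, -1, 1)      # default n = 8
print(err == 0, 0 in x, min(y))
```

With n = 8 the coarse grid contains 0, the error is zero, and the plotted grid skips 0, so the lowest plotted value is 1/7 rather than 0.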
The behaviour can be avoided by choosing a combination of n and limits so that either of the following is the case:
a) the set of m = floor(n/2 + 1) equally spaced points doesn't include zero or
b) the set of n equally spaced points does include zero.
For example, limits equally spaced either side of zero and odd n will plot correctly. This will not work for n=5, though, because, strangely, if the user inputs n=5, fplot.m substitutes 8 for it (I'm not sure why it does this; I think it may be a mistake). So fplot(@abs, [-1, 1], 3) and fplot(@abs, [-1, 1], 7) will plot correctly but fplot(@abs, [-1, 1], 5) won't.
(n/2 + 1) is odd, and therefore x0 contains zero for symmetrical limits, only for every 2nd even n. This is why it plots correctly with n=6 because for that value n/2 + 1 = 4, so x0 doesn't contain zero. This is also the case for n=10, 14, 18 and so on.
Choosing slightly asymmetrical limits will also do the trick; try: fplot(@abs, [-1.1, 1.2])
The documentation says: "fplot works best with continuous functions. Functions with discontinuities are unlikely to plot well. This restriction may be removed in the future." so it is probably a bug/feature of the function itself that can't be fixed except by the developers. The ordinary plot() function works fine:
x = [-1 0 1];
y = abs(x);
plot(x, y);
The weird shape comes from the sampling rate, i.e. the number of points at which the function is evaluated. This is controlled by the parameter N of fplot. The default call seems to accidentally skip x=0, and with fplot(@abs, [-1, 1], N=5) I get the same funny shape as you:
However, trying out different values of N can yield the correct shape; try e.g. fplot(@abs, [-1, 1], N=6):
In general, though, I would suggest using much higher numbers, like N=100.

Periodicity (Fibonacci mod sequence) in infinites list Haskell

I need to create a function in Haskell, which works as follows
periodicity :: [Integer] -> [Integer]
periodicity [1,2,3,3,4,1,2,3,3,4...] = [1,2,3,4]
periodicity [0,1,2,2,5,4,3,3,0,1,2,5,4...] = [0,1,2,5,4,3]
That is to say, that from a list you get the part that is always repeated, what in Mathematical Sciences would be called period of a function.
I've tried this, but it doesn't work the way I want, because I need it to work with infinite lists:
-- divisors is not in the Prelude; it is assumed to be defined elsewhere
periodicity :: Eq a => [a] -> [a]
periodicity xs = take n xs
  where l = length xs
        n = head [m | m <- divisors l,
                      concat (replicate (l `div` m) (take m xs)) == xs]
I have found this function that gives me the length of period, I could have solved the problem, but I don't understand the code after where:
periodo 1 = 1
periodo n = f 1 ps 0
  where
    f 0 (1 : xs) pi = pi
    f _ (x : xs) pi = f x xs (pi + 1)
    ps = 1 : 1 : zipWith (\u v -> (u + v) `mod` n) (tail ps) ps
The function you want, as you have stated it, is impossible [1].
But since you said that what you are really after is the Pisano period, it's enough to notice that two successive numbers are enough to determine the rest of a Fibonacci sequence (mod n or otherwise). So you are really looking for the first reoccurrence of an adjacent pair, e.g.
0, 1, 1, 2, 0, 2, 2, 1, 0, 1, 1, 2, 0, 2, 2, 1, 0, 1, 1, 2, 0, 2, 2, 1, 0
^^^^                    ^^^^
[--------- 8 -----------)
I am not much for coding people's problems for them, but I can sketch the way I would solve this. One thing to keep in mind is that the periodicity might have a prefix that does not repeat -- I don't know whether this actually occurs in Fibonacci sequences mod n, but it occurs in general. So we need to be prepared to throw away a prefix.
First, zip the list with its tail to get a list of adjacent pairs
[ 0, 1, 1, 2, 0, 2, 2, 1 ...]
-> [(0,1), (1,1), (1,2), (2,0), (0,2), (2,2), (2,1), ... ]
From this, fold through the list building a Data.Map keyed on this pair, where the value is the index it first occurred. You could do this with foldr but I'd probably just use a recursive function with an accumulator. For the above example the map at each step would look like:
{(0,1): 0}
{(0,1): 0, (1,1): 1}
{(0,1): 0, (1,1): 1, (1,2): 2}
{(0,1): 0, (1,1): 1, (1,2): 2, (2,0): 3}
...
When you reach a point in the list where the key is already present, you can then subtract the index stored in the map from the current index, and there's your period.
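For readers who want to see the bookkeeping spelled out, here is a rough Python sketch of the same idea (in Haskell the dictionary would be a Data.Map). The function name and the decision to also return the length of the non-repeating prefix are assumptions for illustration.

```python
def pisano_period(n):
    """Return (prefix_length, period) of the Fibonacci sequence mod n.

    Walks over adjacent pairs, remembering the index at which each pair
    was first seen, and stops at the first pair that has been seen before.
    """
    first_seen = {}          # pair -> index of its first occurrence
    a, b = 0, 1 % n          # 1 % n keeps the n == 1 case consistent
    index = 0
    while (a, b) not in first_seen:
        first_seen[(a, b)] = index
        a, b = b, (a + b) % n
        index += 1
    start = first_seen[(a, b)]
    return start, index - start

print(pisano_period(3))  # the example above: period 8
```

The prefix length is returned only because the text allows for a non-repeating prefix; for the inputs tried here it comes out as 0.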
[1] Here's a proof. Let's say you have the specification for a Turing machine, and you make a list steps of the steps of its execution. This list will be finite if it halts, infinite otherwise. Now construct this list:
bad = zipWith const (cycle [1,2,3]) steps ++ cycle [1,2,3,4]
This list cycles with period 3 as long as the machine runs, and with period 4 afterward. So if the Turing machine halts, periodicity bad = 4, otherwise periodicity bad = 3. That is, periodicity can decide the halting problem, which is impossible.
What you are asking for is impossible for an arbitrary infinite list. We can only examine a finite sublist in finite time, and the next element of the list might, for all we know, break the pattern.
In your comments, you clarify that you really are looking for a periodic part of the Fibonacci sequence, modulo m. In that special case, it is possible, if I understand you correctly.
The Fibonacci sequence (mod m) is periodic after a certain point if the same value repeats three times in a row: the last two values are then both equal to their predecessors, so the sequence becomes periodic with a period of 1. It is also periodic after a certain point if any sequence of two or more numbers repeats even once, as then we know that the current value and its predecessor are repeats of the values k terms before each of them, and the function will generate the same subsequence again with period k. There is no shorter period, or we would have detected it, going left to right.
Furthermore, any sequence that repeats infinitely will repeat once first, so this detects all such sequences.
Therefore, a better way to calculate this than I originally wrote would be to search for the current number and its predecessor earlier in the list. (You can use luqui’s strategy of building a list of consecutive pairs, or search the same data structure recursively instead of building a new one.) If a match exists, the sequence is guaranteed to repeat with a period equal to the distance between the two appearances of the same pair.
That takes time quadratic in the length of the non-periodic initial subsequence, since you search each initial subsequence from the beginning. It can be done in linear time with an upper bound of m²+2 steps: there are only m possible values, meaning only m² possible pairs of values, and a sequence of k numbers contains k-1 consecutive pairs of numbers. Therefore, by the pigeonhole principle, the first m²+2 elements of the sequence must contain some pair of consecutive values in two different places, and the sequence becomes periodic from the first instance of that pair onward. So searching that fixed-length initial subsequence suffices, and we can build a table of the index (if any) of each of the m² potential pairs in the list until we encounter the first duplicate. (That said, we would need to use a mutable array, so we sacrifice either speed or functional purity.)
This is similar to luqui’s algorithm, but with a faster lookup.
Conjecture
The sequence is periodic iff 0:1 appears more than once. If every Fibonacci sequence (mod m) is periodic, then the period is simply the position of the second occurrence of [0,1].
0:1 would be generated only by a preceding -1:1, which would be generated by a preceding -3:2, which would be generated by a preceding -8:5, and so on. [...,-8,5,-3,2,-1,1,0] is exactly the fibonacci sequence, backwards, with alternating sign, mod m, and if any two consecutive numbers appear in the original sequence, it is periodic. Thus, iff [0,1,1] would ever be generated by this pattern, it will eventually generate 0:1 in the Fibonacci sequence mod m. This occurs iff m-1 and 1 occur consecutively in Fibo mod m, in either order.
Two Special Cases
If Fibo mod m contains m-1:1 at position i, the sequence has period i+2, and if it contains 1:m-1, the sequence has period 2i+4. (If the sequence contains 1:-1, the next position is i+2 and the next i+2 steps are: {0,-1,-1,-2,-3,-5,...,-1,1}). So this lets us shortcut a bit; when we see 1,4 at position 8 of Fibo mod 5, we know the sequence has a period of 20. In this special case, the scan needs fewer than half the elements on average, has an upper bound of m²/2+1 elements to scan in order to rule the case out, and uses constant memory.

Proving nonregularity

Suppose I have a language L = {wxwR} where wR is the reverse of w, w and x each have a minimum length of 1, w can consist of 0's and 1's, while x can only consist of 1's.
How do I prove that this language is not regular? Is there any other way than using the pumping lemma? If using the pumping lemma, I'm still figuring out what x,y, and z I should pick for the string s=xyz, I would appreciate if you give me any hint.
Thanks!
You should take another look into how to use the pumping lemma. You have to pick a string s, such that for each partition x,y,z, one of the pumping lemma conditions is violated.
So, let n be the "pumping-lemma-number". Pick s = 0^n 1 0^n, which is in L with w = 0^n and x = 1.
From 1) you know that |xy| <= n. From 2) you know that |y| >= 1. Thus y only contains 0s, all taken from the first block.
Following 3), xy^2z also has to be in L, but in xy^2z the first block of 0s is longer than the second one, so the string is no longer of the form w x wR. This means 3) is violated and thus L is not regular.
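To build some intuition, the argument can be checked by brute force for a small n. The Python sketch below uses made-up helper names and a fixed small n; it is a sanity check of the reasoning for one value, not a proof.

```python
def in_language(s):
    """True if s = w + x + reverse(w) with |w| >= 1, |x| >= 1, x all 1's."""
    for k in range(1, (len(s) - 1) // 2 + 1):   # k = |w|
        w = s[:k]
        middle = s[k:len(s) - k]                # candidate for x
        tail = s[len(s) - k:]                   # candidate for reverse(w)
        if middle and set(middle) == {"1"} and tail == w[::-1]:
            return True
    return False

def pumped_strings(s, n):
    """Yield x + y*2 + z for every split with |xy| <= n and |y| >= 1."""
    for i in range(n):
        for j in range(i + 1, n + 1):
            x, y, z = s[:i], s[i:j], s[j:]
            yield x + y * 2 + z

n = 4
s = "0" * n + "1" + "0" * n
print(in_language(s), any(in_language(t) for t in pumped_strings(s, n)))
```

For n = 4 the original string is accepted, while none of the pumped strings are.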

Finding the minimum number of swaps to convert one string to another, where the strings may have repeated characters

I was looking through a programming question, when the following question suddenly seemed related.
How do you convert a string to another string using as few swaps as follows. The strings are guaranteed to be interconvertible (they have the same set of characters, this is given), but the characters can be repeated. I saw web results on the same question, without the characters being repeated though.
Any two characters in the string can be swapped.
For instance : "aabbccdd" can be converted to "ddbbccaa" in two swaps, and "abcc" can be converted to "accb" in one swap.
Thanks!
This is an expanded and corrected version of Subhasis's answer.
Formally, the problem is: given an n-letter alphabet V and two m-letter words, x and y, for which there exists a permutation p such that p(x) = y, determine the least number of swaps (permutations that fix all but two elements) whose composition q satisfies q(x) = y. Assuming that m-letter words are maps from the set {1, ..., m} to V and that p and q are permutations on {1, ..., m}, the action p(x) is defined as the composition p followed by x.
The least number of swaps whose composition is p can be expressed in terms of the cycle decomposition of p. When j_1, ..., j_k are pairwise distinct in {1, ..., m}, the cycle (j_1 ... j_k) is a permutation that maps j_i to j_{i+1} for i in {1, ..., k - 1}, maps j_k to j_1, and maps every other element to itself. The permutation p is the composition of every distinct cycle (j p(j) p(p(j)) ... j'), where j is arbitrary and p(j') = j. The order of composition does not matter, since each element appears in exactly one of the composed cycles. A k-element cycle (j_1 ... j_k) can be written as the product (j_1 j_k) (j_1 j_{k-1}) ... (j_1 j_2) of k - 1 swaps. In general, every permutation can be written as a composition of m - c swaps, where c is the number of cycles in its cycle decomposition (fixed points count as cycles of length one). A straightforward induction proof shows that this is optimal.
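The "m minus the number of cycles" count is easy to compute directly. A small Python sketch follows; it assumes the permutation is given as a 0-based list where p[i] is the image of i, which is a representation chosen for the example rather than anything fixed by the answer.

```python
def min_swaps_for_permutation(p):
    """Least number of swaps whose composition is the permutation p.

    p[i] is the image of i (0-based). The result is len(p) minus the
    number of cycles, with fixed points counted as cycles of length one.
    """
    seen = [False] * len(p)
    cycles = 0
    for start in range(len(p)):
        if seen[start]:
            continue
        cycles += 1
        j = start
        while not seen[j]:      # walk the cycle containing `start`
            seen[j] = True
            j = p[j]
    return len(p) - cycles

print(min_swaps_for_permutation([1, 2, 0]))  # one 3-cycle -> 2 swaps
```

This only covers a single, fixed permutation; the difficulty in the asker's problem is choosing the best permutation when characters repeat.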
Now we get to the heart of Subhasis's answer. Instances of the asker's problem correspond one-to-one with Eulerian (for every vertex, in-degree equals out-degree) digraphs G with vertices V and m arcs labeled 1, ..., m. For j in {1, ..., m}, the arc labeled j goes from y(j) to x(j). The problem in terms of G is to determine the maximum number of parts that a partition of the arcs of G into directed cycles can have. (Since G is Eulerian, such a partition always exists.) This is because the permutations q such that q(x) = y are in one-to-one correspondence with the partitions, as follows. For each cycle (j_1 ... j_k) of q, there is a part whose directed cycle is comprised of the arcs labeled j_1, ..., j_k.
The problem with Subhasis's NP-hardness reduction is that arc-disjoint cycle packing on Eulerian digraphs is a special case of arc-disjoint cycle packing on general digraphs, so an NP-hardness result for the latter has no direct implications for the complexity status of the former. In very recent work (see the citation below), however, it has been shown that, indeed, even the Eulerian special case is NP-hard. Thus, by the correspondence above, the asker's problem is as well.
As Subhasis hints, this problem can be solved in polynomial time when n, the size of the alphabet, is fixed (fixed-parameter tractable). Since there are O(n!) distinguishable cycles when the arcs are unlabeled, we can use dynamic programming on a state space of size O(mn), the number of distinguishable subgraphs. In practice, that might be sufficient for (let's say) a binary alphabet, but if I were to try to solve this problem exactly on instances with large alphabets, then I would likely try branch and bound, obtaining bounds by using linear programming with column generation to pack cycles fractionally.
@article{DBLP:journals/corr/GutinJSW14,
author = {Gregory Gutin and
Mark Jones and
Bin Sheng and
Magnus Wahlstr{\"o}m},
title = {Parameterized Directed $k$-Chinese Postman Problem and $k$
Arc-Disjoint Cycles Problem on Euler Digraphs},
journal = {CoRR},
volume = {abs/1402.2137},
year = {2014},
ee = {http://arxiv.org/abs/1402.2137},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
You can construct the "difference" strings S and S', i.e. a string which contains the characters at the differing positions of the two strings, e.g. for acbacb and abcabc it will be cbcb and bcbc. Let us say this contains n characters.
You can now construct a "permutation graph" G which will have n nodes and an edge from i to j if S[i] == S'[j]. In the case of all unique characters, it is easy to see that the required number of swaps will be (n - number of cycles in G), which can be found out in O(n) time.
However, in the case where there are any number of duplicate characters, this reduces to the problem of finding out the largest number of cycles in a directed graph, which, I think, is NP-hard, (e.g. check out: http://www.math.ucsd.edu/~jverstra/dcig.pdf ).
In that paper a few greedy algorithms are pointed out, one of which is particularly simple:
At each step, find the minimum length cycle in the graph (e.g. Find cycle of shortest length in a directed graph with positive weights )
Delete it
Repeat until all vertices have been covered.
However, there may be efficient algorithms utilizing the properties of your case (the only one I can think of is that your graphs will be K-partite, where K is the number of unique characters in S). Good luck!
Edit:
Please refer to David's answer for a fuller and correct explanation of the problem.
Do an A* search (see http://en.wikipedia.org/wiki/A-star_search_algorithm for an explanation) for the shortest path through the graph of equivalent strings from one string to the other. Use the Levenshtein distance / 2 as your cost heuristic.
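As a baseline for small inputs, here is a Python sketch of the search over strings. Note that it is plain breadth-first search (equivalent to A* with a zero heuristic), not the Levenshtein-based A* suggested above, and the function name is made up. It is exponential in general and only meant for checking small cases.

```python
from collections import deque
from itertools import combinations

def min_swaps_bfs(source, target):
    """Fewest swaps turning source into target, found by brute-force BFS."""
    if sorted(source) != sorted(target):
        raise ValueError("strings are not interconvertible")
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        current, dist = queue.popleft()
        if current == target:
            return dist
        chars = list(current)
        for i, j in combinations(range(len(chars)), 2):
            if chars[i] == chars[j]:
                continue                          # swap would change nothing
            chars[i], chars[j] = chars[j], chars[i]
            neighbour = "".join(chars)
            chars[i], chars[j] = chars[j], chars[i]   # undo the swap
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))

print(min_swaps_bfs("aabbccdd", "ddbbccaa"), min_swaps_bfs("abcc", "accb"))
```

On the two examples from the question this reproduces the stated answers of two swaps and one swap.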

Can someone help me with this proof using the pumping lemma?

I just started reading about the pumping lemma and know how to perform a few proofs, mostly by contradiction. It is only this particular question which I don't seem to find an answer for. I have no idea on how to begin. I can assume that there has to be a pumping length P and that for all w element of L that the LENGTH(w) >= P. And of course that we can write w as xyz with the three normal conditions of the pumping lemma.
I have to prove that the following language is non-regular:
L = {x + y = z | x,y,z element of {0,1}* and #(x) + #(y) = #(z) }
Can someone help me with this? I really want to master the process of proving these kinds of questions.
Edit:
Sorry, forgot to say that the alphabet is {0,1,+,=} and # means the binary value of the string. Like #(00101) = 5 and #(110) = 6.
Since you want to master the process, I'll point out a few things before showing a proof.
The first thing to notice is that the + and the = may only appear once each. So when you write your string w as w = abc, the pumped portion, b, cannot contain + or = otherwise you'd reach a trivial contradiction (I'm not using the more standard w = xyz notation to avoid confusion with L's definition).
Another thing to notice is that normally, you'd pick a specific string w to pump. In this case, it could be easier to pick a class of strings that share a certain property. The pumping lemma only requires you to reach a contradiction using one string, but there's no reason you can't reach a contradiction with multiple strings.
Proof:
So let w be any string in L such that |w| ≥ P and x, y, z do not contain leading 0's. By the pumping lemma we can write w as w = abc, where b is not empty. Since b cannot contain + or =, it is fully contained in either x, y, or z. Pumping w with any i ≠ 1 results in the binary equation no longer holding since exactly one of x, y, z would be a different number (this is why we needed the no leading 0's bit).
Choose as the string 1(0^(n+1)) + 1(0^n) = 11(0^n).
In other words, your string will read "two to the power n+1 plus two to the power n is equal to 11 followed by n zeroes".
Since the string to be pumped will consist entirely of symbols from the first addend, pumping must change the number represented (adding or removing digits to a number will change the number; this is true because our string doesn't contain leading zeroes) and if x + y = z holds, then x' + y = z does not hold if x' != x (over integers, at least).
Since the pumping lemma requires pumped strings to be in the language, and pumping this string fails, we have that the language is not regular.
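The claim can be sanity-checked for a small pumping length. The Python sketch below makes a few assumptions for illustration: the helper names are made up, the string is written without spaces (the alphabet is {0,1,+,=}), and an empty binary string is treated as the value 0.

```python
def binary_value(bits):
    """#(bits); treating the empty string as 0 is an assumption."""
    return int(bits, 2) if bits else 0

def in_sum_language(s):
    """True if s has the form x+y=z over {0,1} with #(x) + #(y) = #(z)."""
    if s.count("+") != 1 or s.count("=") != 1:
        return False
    left, z = s.split("=")
    if "+" not in left:          # the + must come before the =
        return False
    x, y = left.split("+")
    return binary_value(x) + binary_value(y) == binary_value(z)

def chosen_string(p):
    """1 0^(p+1) + 1 0^p = 11 0^p, written without spaces."""
    return "1" + "0" * (p + 1) + "+" + "1" + "0" * p + "=" + "11" + "0" * p

p = 5
w = chosen_string(p)
still_in_language = [
    (i, j, k)
    for i in range(p)
    for j in range(i + 1, p + 1)   # split w = a b c with |ab| <= p, |b| >= 1
    for k in (0, 2, 3)             # pump down, or up two or three times
    if in_sum_language(w[:i] + w[i:j] * k + w[j:])
]
print(in_sum_language(w), still_in_language)
```

For p = 5 the chosen string is in the language and the list of surviving pumped strings is empty, matching the argument above.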
