What does it mean for a heuristic to be considered admissible?

I've been told that an admissible heuristic for a search algorithm is one which never overestimates the cost of the shortest path to the goal. However, is it valid for non-goal nodes to have a heuristic value of 0, or is there an additional rule of admissibility stating that only goal states may have a heuristic value of 0?
For example, suppose the true shortest-path distance from each node to the goal state D is as follows:
A = 5
B = 4
C = 3
D = 0
Would the following heuristic be valid:
A = 4
B = 4
C = 0
D = 0
Would this heuristic also be valid (while also being useless)?
A = 0
B = 0
C = 0
D = 0

An admissible heuristic is simply one that, as you said, does not overestimate the distance to a goal. It is allowed to underestimate, and the two examples you gave are indeed valid, admissible heuristics.
Typically, in the kinds of algorithms we're talking about with these heuristics (for instance, A*), it is beneficial if the heuristic is as close to the truth as possible. So, as you already noticed yourself, the last example with heuristic values of 0 for all nodes would not be very useful. You want your heuristic values to be as close to the truth as possible while still being admissible (making sure they never overestimate).
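To make the condition concrete, here is a minimal Python sketch (the function name and dictionaries are illustrative, not from any library) that checks a heuristic against the true distances from your example:
def is_admissible(h, true_dist):
    # Admissibility: h(n) <= true distance h*(n) for every node n.
    return all(h[n] <= true_dist[n] for n in true_dist)

true_dist = {'A': 5, 'B': 4, 'C': 3, 'D': 0}
print(is_admissible({'A': 4, 'B': 4, 'C': 0, 'D': 0}, true_dist))  # True
print(is_admissible({'A': 0, 'B': 0, 'C': 0, 'D': 0}, true_dist))  # True, though useless
print(is_admissible({'A': 6, 'B': 4, 'C': 0, 'D': 0}, true_dist))  # False: overestimates at A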

Compare strings lexicographically in multiple queries

Compare two numbers, given as binary strings, lexicographically, multiple times.
What I tried
The question is straightforward: compare the strings after each change. But I am getting a Time Limit Exceeded error because of the multiple queries.
I was searching the internet and came across segment trees for solving range queries. But alas, I am not able to visualise how they can help here.
Any hint is appreciated.
A segment tree seems overkill for this problem. When will B be lexicographically larger than A? When at some index i, A[i] = 0, B[i] = 1 and A[0:i] = B[0:i]. Iterate over both strings at the same time and keep all indexes where they differ in a set.
For each query at index i, update B[i] to 1. Then check whether B[i] = A[i]. If they are equal, erase i from the index set; otherwise, add it to the set. If there is no index left in the set, A and B are now equal => answer YES.
If there is at least 1 element, get the lowest index. If this index is j, that means A[0:j] = B[0:j] but A[j] != B[j]. So either A[j] is 0 and B[j] is 1, or A[j] is 1 and B[j] is 0. Depending on that, answer YES or NO.
This has complexity O(Q log N), where Q is the number of queries and N the length of the strings.
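Here is a minimal Python sketch of this approach. It assumes, as above, that each query sets B[i] to '1' and that YES means B is now greater than or equal to A; since the standard library has no ordered set, a min-heap with lazy deletion plays that role:
import heapq

def answer_queries(a, b, queries):
    # Maintain the set of positions where a and b differ; a min-heap
    # with lazy deletion stands in for an ordered set of those positions.
    b = list(b)
    diff = {i for i in range(len(a)) if a[i] != b[i]}
    heap = list(diff)
    heapq.heapify(heap)
    answers = []
    for i in queries:
        b[i] = '1'                       # assumed query semantics
        if a[i] == b[i]:
            diff.discard(i)
        elif i not in diff:
            diff.add(i)
            heapq.heappush(heap, i)
        if not diff:
            answers.append("YES")        # a and b are now equal
        else:
            while heap[0] not in diff:   # drop stale heap entries
                heapq.heappop(heap)
            j = heap[0]                  # lowest differing index
            answers.append("YES" if b[j] > a[j] else "NO")
    return answers

print(answer_queries("0101", "0011", [1, 2]))  # ['YES', 'YES']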

Use dynamic programming to find a subset of numbers whose sum is closest to given number M

Given a set A of n positive integers a1, a2, ..., an and another positive integer M, I want to find a subset of numbers of A whose sum is closest to M. In other words, I'm trying to find a subset A′ of A such that the absolute value |M − Σa∈A′ a| is minimized, where Σa∈A′ a is the total sum of the numbers of A′. I only need to return the sum of the elements of the solution subset A′ without reporting the actual subset A′.
For example, if A = {1, 4, 7, 12} and M = 15, then the solution subset is A′ = {4, 12}, and thus the algorithm only needs to return 4 + 12 = 16 as the answer.
The dynamic programming algorithm for the problem should run in
O(nK) time in the worst case, where K is the sum of all numbers of A.
You construct a dynamic programming table of size n*K where
D[i][j] = can you get sum j using the first i elements?
The recursive relation you can use is: D[i][j] = D[i-1][j-a[i]] OR D[i-1][j]. This relation can be derived by considering that the i-th element can either be included or left out.
Time complexity: O(nK), where K = sum of all elements.
Lastly, you iterate over every sum you can obtain, i.e. D[n][j] for j = 0..K; whichever j with D[n][j] true is closest to M is your answer.
For a dynamic programming algorithm, we:
Define the value we will work on
The set of values here is actually a table.
For this problem, we define the value DP[i][j] as an indicator of whether we can obtain sum j using the first i elements (1 means yes, 0 means no).
Here 0 <= i <= n and 0 <= j <= K, where K is the sum of all elements in A.
Define the recursive relation
DP[i+1][j] = 1, if (DP[i][j] == 1 || DP[i][j-A[i+1]] == 1);
else, DP[i+1][j] = 0.
Don't forget to initialize the table to 0 in the first place, with DP[0][0] = 1 (using no elements, you can only obtain the sum 0); this handles the boundary and trivial cases.
Calculate the value you want
Through a bottom-up implementation, you can finally fill the whole table.
Now things become easy: you just need to find the value closest to M in the table whose entry is 1.
Here, just work on DP[n][j], since row n covers the whole set. Find the j closest to M whose value is 1.
Time complexity is O(nK), since you fill n*K cells in total.
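For concreteness, here is a minimal Python sketch of this table. It uses a 1-D rolling array in place of the full 2-D table (which changes nothing about the O(nK) bound), and the function name is illustrative:
def closest_subset_sum(a, m):
    k = sum(a)
    # reachable[j] is True iff some subset of the elements seen so
    # far sums to j; a 1-D array stands in for the 2-D table DP[i][j].
    reachable = [False] * (k + 1)
    reachable[0] = True                  # DP[0][0] = 1: the empty subset
    for x in a:
        # iterate downwards so each element is used at most once
        for j in range(k, x - 1, -1):
            if reachable[j - x]:
                reachable[j] = True
    # among all achievable sums, pick the one minimising |m - j|
    return min((j for j in range(k + 1) if reachable[j]),
               key=lambda j: abs(m - j))

print(closest_subset_sum([1, 4, 7, 12], 15))  # 16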

Is there any paper or explanation on how to implement a two-dimensional KMP?

I tried to solve the problem of two-dimensional search using a combination of Aho-Corasick and a one-dimensional KMP; however, I still need something faster.
To elaborate, I have a matrix A of characters of size n1*n2 and I wish to find all occurrences of a smaller matrix B of size m1*m2, and I want that to be in O(n1*n2 + m1*m2) if possible.
For example:
A = a b c b c b
b c a c a c
d a b a b a
q a s d q a
and
B = b c b
c a c
a b a
The algorithm should return the indices of, say, the upper-left corner of each match, which in this case are (0,1) and (0,3). Notice that occurrences may overlap.
There is an algorithm called the Baker-Bird algorithm that I just recently encountered that appears to be a partial generalization of KMP to two dimensions. It uses two algorithms as subroutines - the Aho-Corasick algorithm (which itself is a generalization of KMP), and the KMP algorithm - to efficiently search a two-dimensional grid for a pattern.
I'm not sure if this is what you're looking for, but hopefully it helps!
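To illustrate the structure, here is a simplified Python sketch of Baker-Bird. The genuine algorithm does the row-labelling step with Aho-Corasick; this sketch swaps that step for direct substring lookup in a hash map, which costs O(n1*n2*m2) instead of the optimal bound, but the column phase is real KMP over row labels:
def baker_bird_simplified(A, B):
    # Label each text cell with the pattern row (if any) ending there,
    # then run KMP down every column of labels.
    n1, n2 = len(A), len(A[0])
    m1, m2 = len(B), len(B[0])
    row_id = {}
    for row in B:                        # distinct pattern rows -> ids
        row_id.setdefault(row, len(row_id))
    pat = [row_id[row] for row in B]     # the pattern as a label sequence
    label = [[-1] * n2 for _ in range(n1)]
    for i in range(n1):
        for j in range(m2 - 1, n2):
            label[i][j] = row_id.get(A[i][j - m2 + 1 : j + 1], -1)
    fail = [0] * m1                      # KMP failure function of pat
    k = 0
    for q in range(1, m1):
        while k and pat[k] != pat[q]:
            k = fail[k - 1]
        if pat[k] == pat[q]:
            k += 1
        fail[q] = k
    matches = []
    for j in range(m2 - 1, n2):          # KMP down each label column
        k = 0
        for i in range(n1):
            while k and label[i][j] != pat[k]:
                k = fail[k - 1]
            if label[i][j] == pat[k]:
                k += 1
            if k == m1:                  # whole pattern matched
                matches.append((i - m1 + 1, j - m2 + 1))
                k = fail[k - 1]
    return matches

A = ["abcbcb", "bcacac", "dababa", "qasdqa"]
B = ["bcb", "cac", "aba"]
print(baker_bird_simplified(A, B))       # [(0, 1), (0, 3)]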

Given string s, find the shortest string t, such that, t^m=s

Examples:
s="aabbb" => t="aabbb"
s="abab" => t = "ab"
How fast can it be done?
Of course, naively, for every m that divides |s|, I can test whether substring(s, 0, |s|/m)^m = s.
One can figure out the solution in O(d(|s|)·n) time, where n = |s| and d(x) is the number of divisors of x. Can it be done more efficiently?
This is the problem of computing the period of a string. Knuth, Morris and Pratt's sequential string matching algorithm is a good place to get started; it appears in a paper entitled "Fast Pattern Matching in Strings" from 1977.
If you want to get fancy with it, then check out the paper "Finding All Periods and Initial Palindromes of a String in Parallel" by Breslauer and Galil in 1991. From their abstract:
An optimal O(log log n) time CRCW-PRAM algorithm for computing all
periods of a string is presented. Previous parallel algorithms compute
the period only if it is shorter than half of the length of the
string. This algorithm can be used to find all initial palindromes of
a string in the same time and processor bounds. Both algorithms are
the fastest possible over a general alphabet. We derive a lower bound
for finding palindromes by a modification of a previously known lower
bound for finding the period of a string [3]. When p processors are
available the bounds become Θ(⌈n/p⌉ + log log_⌈1+p/n⌉ 2p).
I really like this thing called the z-algorithm: http://www.utdallas.edu/~besp/demo/John2010/z-algorithm.htm
For every position it calculates the length of the longest substring starting there that is also a prefix of the whole string (in linear time, of course). For example, with the z-values listed from position 1 onwards:
a a b c a a b x a a a z
- 1 0 0 3 1 0 0 2 2 1 0
Given this "z-table" it is easy to find all strings that can be exponentiated to the whole thing: just check, for each position, whether pos + z[pos] = n.
In our case:
a b a b
- 0 2 0
Here pos = 2 gives you 2 + z[2] = 4 = n, hence the shortest string you can use is the one of length 2.
This will even allow you to find cases where only a prefix of the exponentiated string matches, like:
a b c a
- 0 0 1
Here (abc)^2 can be cut down to your original string. But of course, if you don't want matches like this, just go over the divisors only.
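Here is a minimal Python sketch of this approach: the standard Z-algorithm plus the divisor check (function names are illustrative):
def z_array(s):
    # z[i] = length of the longest common prefix of s and s[i:]
    n = len(s)
    z = [0] * n
    l = r = 0
    for i in range(1, n):
        if i < r:
            z[i] = min(r - i, z[i - l])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z

def shortest_repeated_root(s):
    # Smallest t with t^m = s: the first length p that divides |s|
    # and satisfies p + z[p] = |s|; s itself always works.
    n = len(s)
    z = z_array(s)
    for p in range(1, n):
        if n % p == 0 and p + z[p] == n:
            return s[:p]
    return s

print(shortest_repeated_root("abab"))   # ab
print(shortest_repeated_root("aabbb"))  # aabbb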
Yes, you can do it in O(|s|) time.
You can search for a "target" string of length n in a "source" string of length m in O(n+m) time. Build a solution based on that.
Let both the source and the target be s. An additional constraint is that the trivial starting position, and any position in the source that does not divide |s|, are not valid starting positions for the match. Of course the search per se will always fail, but if there is a partial match and you have reached the end of the source string, then you have a solution to the original problem.
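One concrete rendering of this idea uses the KMP failure function, whose last entry records exactly how far such a self-match got when the end of the source was reached. A minimal sketch (the function name is illustrative):
def shortest_period_kmp(s):
    # fail[i] = length of the longest proper border of s[:i+1];
    # n - fail[n-1] is the smallest period of s, and it yields a t
    # with t^m = s exactly when it divides n.
    if not s:
        return s
    n = len(s)
    fail = [0] * n
    k = 0
    for i in range(1, n):
        while k and s[i] != s[k]:
            k = fail[k - 1]
        if s[i] == s[k]:
            k += 1
        fail[i] = k
    p = n - fail[n - 1]
    return s[:p] if n % p == 0 else s

print(shortest_period_kmp("abab"))   # ab
print(shortest_period_kmp("aabbb"))  # aabbb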
A modification of Boyer-Moore could possibly handle this in O(n), where n is the length of s:
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

Explanations about the mechanics of a simple factorial function

I'm new to Haskell, so I'm both naive and curious.
There is a definition of a factorial function:
factorial n = product [1..n]
I naively understand this as: take the product of every number between 1 and n. So why does
factorial 0
return 1 (which is the correct result, as far as my maths are not too rusty)?
Thank you
That's because of how product is defined, something like:
product [] = 1
product (n:ns) = n * product ns
or equivalently
product = foldr (*) 1
via the important function foldr:
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Read up on folding here. But basically, any recursion must have a base case, and product's base case (on an empty list) clearly has to be 1.
The story of the empty product is long and interesting.
It makes a lot of sense to define it as 1.
Despite that, there is some more debate about whether we are justified in defining 0^0 as 1, although 0^0 can also be thought of as an empty product in most contexts. See the 0^0 debate here and also here.
Now I will show an example where the empty-product convention can yield a surprising, unintuitive outcome.
How can the concept of a prime be defined without the necessity of excluding 1 explicitly? It seems so unaesthetic to say that "a prime is such and such, except for this and that". Can the concept of a prime be defined with some handy definition which excludes 1 in a "natural", "automatic" way, without mentioning the exclusion explicitly?
Let us try this approach:
Let us call a natural number c composite iff c can be written as a product of natural numbers a1 ⋅ ... ⋅ an, all of which are different from c.
Let us call a natural number p prime iff p cannot be written as a product of natural numbers a1, ..., an each differing from p.
Let us test whether this approach is any good:
6 = 6 ⋅ 1 = 3 ⋅ 2: 6 is composite, and this fact is witnessed by the following factorisation: 6 can be written as the product 3 ⋅ 2, or in other words, the product of the sequence ⟨3, 2⟩, notated as Π ⟨3, 2⟩.
Till now, our new approach is O.K.
5 = 5 ⋅ 1 = 1 ⋅ 5: 5 is prime, as there is no sequence ⟨a1, ..., an⟩ such that
all its members a1, ..., an would differ from 5
but the product itself, Π ⟨a1, ..., an⟩, would equal 5.
Till now, our new approach is O.K.
Now let us investigate 1:
1 = Π ⟨⟩,
The empty product is a good witness: with it, 1 satisfies the definition of being composite(!). Who is the witness? Where is the witnessing factorization? It is none other than the empty product Π ⟨⟩, the product of the empty sequence ⟨⟩.
Π ⟨⟩ equals 1.
All factors of the empty product Π ⟨⟩, i.e. the members of the empty sequence ⟨⟩, satisfy the requirement that each of them differs from 1, simply because the empty sequence ⟨⟩ has no members at all, and thus none of its members can equal 1. (This argument is simply a vacuous truth about the members of the empty set.)
Thus 1 is composite (with the trivial factorization given by the empty product Π ⟨⟩).
Thus, 1 is excluded from being a prime, naturally and automatically, by definition. We have reached our goal. For this, we have exploited the convention that the empty product is 1.
Some drawbacks: although we succeeded in excluding 1 from being a prime, at the same time 0 "slipped in": 0 became a prime (at least in zero-divisor-free rings, like the natural numbers). Although this strange thing makes some theorems formally more concise (the Goldbach conjecture, the fundamental theorem of arithmetic), I cannot claim that it is not a drawback.
A bigger drawback is that some concepts of arithmetic seem to become untenable with this new approach.
In any case, I only wanted to demonstrate that defining the empty product as 1 can end up formalizing unintuitive things (which is not necessarily a problem; set theory abounds with unintuitive things, see how to produce gold for free), but at the same time it can provide useful strength in some contexts.
It's traditional to define the product of all the elements of the empty list to be 1, just as it's traditional to define the sum of all the elements of the empty list to be 0. That way
(product list1) * (product list2) == product (list1 ++ list2)
among other convenient properties.
Also, your memory is correct, and 0! is defined to be 1. This also has many convenient properties, including being consistent with the definition of factorials in terms of the gamma function.
Not sure I understand your question; are you asking how to write such a function?
Just as an exercise, you could use pattern matching to approach it like this:
factorial :: Int->Int
factorial 0 = 1
factorial n = product [1..n]
The first line is the function declaration/type signature. The next two lines are equations defining the function: Haskell pattern matching matches the actual runtime parameter against whichever equation is appropriate.
Of course as others have pointed out, the product function handles the zero case correctly for you.
