What is the solution path yielded by a DFS on this graph - search

Using a DFS on this graph, the nodes are visited in the following order(for more than one successor node, nodes are pushed to the "frontier" in alphabetical order):
S->A->E->D->F->G
Is that visitation sequence the solution path aswell? If so, why is it not S->A->E->G, since G is also a successor node of E?
PS: Im new to algorithms, so if I'm clearly not understanding the concept, please let me know.

If you are visiting the nodes, the DFS approach will traverse the graph based on the creation order of the adjacency list.
For example, the order of inserting node E's successors may be in the following ways:
1- E-> D, G
2- E-> G, D
In the first way, you will traverse D->F->G or D->G directly and in both cases you will visit node G before traversing any of node E other successors, so you will not be able to traverse the pathS->A->E->G because node G will be already visited before from node D or F.
In the second way, you will traverse E->G directly, so this will result in traversing the path S->A->E->G, but also you will not be able to access node G from node D or F because it will be already visited from node E.
The previous scenario will happen if you are visiting with true or false, but if you are trying to find the shortest path using the costs on edges, then you will need to use Dijkstra's algorithm for finding shortest path on a graph, and you can read more about it here if you are not familiar with it.

I assume its taking into account both the heuristic and edge cost to determine the next node to visit.
Starting at S it looks at its three possibilities:
A = 9 + 13 = 21
B = 14 + 14 = 28
C = 15 + 15 = 30
It then chooses A and looks at its only available path from A and goes to E.
From E we have two possibilities:
D = 2 + 8 = 10
G = 19 + 0 = 19
It will then choose D and now it has two possibilities:
F = 11 + 5 = 16
G = 16 + 0 = 16
Its a tie so depending on how the algorithm was set it up and the solution you gave it goes to F which then has two possibilities:
E = 6 + 7 = 13
G = 6 + 0 = 6
It then goes to G and finally it sees that this is the goal node and returns the state sequence.

Related

Find H Index of the nodes of a graph using NetworkX

Definition of H Index used in this algorithm
Supposing a relational expression is represented as y = F(x1, x2, . . . , xn), where F returns an integer number greater than 0, and the function is to find a maximum value y satisfying the condition that there exist at least y elements whose values are not less than y. Hence, the H-index of any node i is defined as
H(i) = F(kj1 ,kj2 ,...,k jki)
where kj1, kj2, . . . , kjki represent the set of degrees of neighboring nodes of node i.
Now I want to find the H Index of the nodes of the following graphs using the algorithm given below :
Graph :
Code (Written in Python and NetworkX) :
def hindex(g, n):
nd = {}
h = 0
# print(len(list(g.neighbors(n))))
for v in g.neighbors(n):
#nd[v] = len(list(g.neighbors(v)))
nd[v] = g.degree(v)
snd = sorted(nd.values(), reverse=True)
for i in range(0,len(snd)):
h = i
if snd[i] < i:
break
#print("H index of " + str(n)+ " : " + str(h))
return h
Problem :
This algorithm is returning the wrong values of nodes 1, 5, 8 and 9
Actual Values :
Node 1 - 6 : H Index = 2
Node 7 - 9 : H Index = 1
But for Node 1 and 5 I am getting 1, and for Node 8 and 9 I am getting 0.
Any leads on where I am going wrong will be highly appreciated!
Try this:
def hindex(g, n):
sorted_neighbor_degrees = sorted((g.degree(v) for v in g.neighbors(n)), reverse=True)
h = 0
for i in range(1, len(sorted_neighbor_degrees)+1):
if sorted_neighbor_degrees[i-1] < i:
break
h = i
return h
There's no need for a nested loop; just make a decreasing list, and calculate the h-index like normal.
The reason for 'i - 1' is just that our arrays are 0-indexed, while h-index is based on rankings (i.e. the k largest values) which are 1-indexed.
From the definition of h-index: For a non-increasing function f, h(f) is max i >= 0 such that f(i) >= i. This is, equivalently, the min i >= 1 such that f(i) < i, minus 1. Here, f(i) is equal to sorted_neighbor_degrees[i - 1]. There are of course many other ways (with different time and space requirements) to calculate h.

TSP / CPP variant - subtour constraint

I'm developing an optimization problem that is a variant on Traveling Salesman. In this case, you don't have to visit all the cities, there's a required start and end point, there's a min and max bound on the tour length, you can traverse each arc multiple times if you want, and you have a nonlinear objective function that is associated with the arcs traversed (and number of times you traverse each arc). Decision variables are integers, how many times you traverse each arc.
I've developed a nonlinear integer program in Pyomo and am getting results from the NEOS server. However I didn't put in subtour constraints and my results are two disconnected subtours.
I can find integer programming formulations of TSP that say how to formulate subtour constraints, but this is a little different from the standard TSP and I'm trying to figure out how to start. Any help that can be provided would be greatly appreciated.
EDIT: problem formulation
50 arcs , not exhaustive pairs between nodes. 50 Decision variables N_ab are integer >=0, corresponds to how many times you traverse from a to b. There is a length and profit associated with each N_ab . There are two constraints that the sum of length_ab * N_ab for all ab are between a min and max distance. I have a constraint that the sum of N_ab into each node is equal to the sum N_ab out of the node you can either not visit a node at all, or visit it multiple times. Objective function is nonlinear and related to the interaction between pairs of arcs (not relevant for subtour).
Subtours: looking at math.uwaterloo.ca/tsp/methods/opt/subtour.htm , the formulation isn't applicable since I am not required to visit all cities, and may not be able to. So for example, let's say I have 20 nodes and 50 arcs (all arcs length 10). Distance constraints are for a tour of exactly length 30, which means I can visit at most three nodes (start at A -> B -> C ->A = length 30). So I will not visit the other nodes at all. TSP subtour elimination would require that I have edges from node subgroup ABC to subgroup of nonvisited nodes - which isn't needed for my problem
Here is an approach that is adapted from the prize-collecting TSP (e.g., this paper). Let V be the set of all nodes. I am assuming V includes a depot node, call it node 1, that must be on the tour. (If not, you can probably add a dummy node that serves this role.)
Let x[i] be a decision variable that equals 1 if we visit node i at least once, and 0 otherwise. (You might already have such a decision variable in your model.)
Add these constraints, which define x[i]:
x[i] <= sum {j in V} N[i,j] for all i in V
M * x[i] >= N[i,j] for all i, j in V
In other words: x[i] cannot equal 1 if there are no edges coming out of node i, and x[i] must equal 1 if there are any edges coming out of node i.
(Here, N[i,j] is 1 if we go from i to j, and M is a sufficiently large number, perhaps equal to the maximum number of times you can traverse one edge.)
Here is the subtour-elimination constraint, defined for all subsets S of V such that S includes node 1, and for all nodes i in V \ S:
sum {j in S} (N[i,j] + N[j,i]) >= 2 * x[i]
In other words, if we visit node i, which is not in S, then there must be at least two edges into or out of S. (A subtour would violate this constraint for S equal to the nodes that are on the subtour that contains 1.)
We also need a constraint requiring node 1 to be on the tour:
x[1] = 1
I might be playing a little fast and loose with the directional indices, i.e., I'm not sure if your model sets N[i,j] = N[j,i] or something like that, but hopefully the idea is clear enough and you can modify my approach as necessary.

How to delete the element in a list which is duplicated for linear representation in python?

https://imgur.com/a/0lFwssy
I want to draw an evolution diagram like this, [1,2,3,4] is an annotation to the point:
1 :(x=1,y=2)
2 :(x=2,y=3)
3 :(x=3,y=5)
4 :(x=4,y=6)
The connection is like:
a = [1,1,2,3] *Starting Point
b = [2,4,4,4] *Ending Point
And because point1 and point2 both connect to point4 and I don't want the connection of point to point4 because point1 evolved to point2 first.
So I want to get
https://imgur.com/a/asAUlHQ
c = [1,2,3]
d = [2,4,4]
I tried to use zip to write a for loop but it failed.
How to get c and d in python?
From what I understand it looks like you are looking for a minimum spanning tree for the graph where the edges are (a_i,b_i). You can do this as follows:
A = sp.sparse.csr_matrix((len(a),len(a)),dtype='bool')
A[a,b] = 1
c,d = sp.sparse.csgraph.minimum_spanning_tree(A).nonzero()
Note that the minimum spanning tree is not unique.

Recursive arithmetic sequence in Haskell

It's been nearly 30 years since I took an Algebra class and I am struggling with some of the concepts in Haskell as I work through Learn you a Haskell. The concept that I am working on now is "recursion". I have watched several youtube videos on the subject and found a site with the arithmetic sequence problem: an = 8 + 3(an-1) which I understand to be an = an-1 + 3 This is what I have in Haskell.
addThree :: (Integral a) => a -> a
addThree 1 = 8
addThree n = (n-1) + 3
Running the script yields:
addThree 1
8
addThree 2
4
addThree 3
6
I am able to solve this and similar recursions on paper, (after polishing much rust), but do not understand the syntax in Haskell.
My Question How do I define the base and the function in Haskell as per my example?
If this is not the place for such questions, kindly direct me to where I should post. I see there are Stack Exchanges for Super User, Programmers, and Mathematics, but not sure which of the Stack family best fits my question.
First a word on Algebra and you problem: I think you are slightly wrong - if we write 3x it usually means 3*x (Mathematicans are even more lazy then programmers) so your series indeed should look like an = 8 + 3*an-1 IMO
Then an is the n-th element in a series of a's: a0, a1, a2, a3, ... that's why you there is a big difference between (n-1) and addThree (n-1) as the last one would designate an-1 while the first one would just be a number not really connected to your series.
Ok, let's have a look at your series an = 8 + 3an-1 (this is how I would understand it - because otherwise you would have x=8+3*x and therefore just x = -4:
you can choose a0 - let's say it`s 0 (as you did?)
then a1=8+3*0 = 8
a2=8+3*8 = 4*8 = 32
a3=8+3*32 = 8+3*32 = 104
...
ok let's say you want to use recursion than the problem directly translates into Haskell:
a :: Integer -> Integer
a 0 = 0
a n = 8 + 3 * a (n-1)
series :: [Integer]
series = map a [0..]
giving you (for the first 5 elements):
λ> take 5 series
[0,8,32,104,320]
Please note that this is a very bad performing way to do it - as the recursive call in a really does the same work over and over again.
A technical way to solve this is to observe that you only need the previous element to get the next one and use Data.List.unfoldr:
series :: [Integer]
series = unfoldr (\ prev -> Just (prev, 8 + 3 * prev)) 0
now of course you can get a lot more fancier with Haskell - for example you can define the series as it is (using Haskells laziness):
series :: [Integer]
series = 0 : map (\ prev -> 8 + 3 * prev) series
and I am sure there are much more ways out there to do it but I hope this will help you along a bit

Count the Number of Zero's between Range of integers

. Is there any Direct formula or System to find out the Numbers of Zero's between a Distinct Range ... Let two Integer M & N are given . if I have to find out the total number of zero's between this Range then what should I have to do ?
Let M = 1234567890 & N = 2345678901
And answer is : 987654304
Thanks in advance .
Reexamining the Problem
Here is a simple solution in Ruby, which inspects each integer from the interval [m,n], determines the string of its digits in the standard base 10 positional system, and counts the occuring 0 digits:
def brute_force(m, n)
if m > n
return 0
end
z = 0
m.upto(n) do |k|
z += k.to_s.count('0')
end
z
end
If you run it in an interactive Ruby shell you will get
irb> brute_force(1,100)
=> 11
which is fine. However using the interval bounds from the example in the question
m = 1234567890
n = 2345678901
you will recognize that this will take considerable time. On my machine it does need more than a couple of seconds, I had to cancel it so far.
So the real question is not only to come up with the correct zero counts but to do it faster than the above brute force solution.
Complexity: Running Time
The brute force solution needs to perform n-m+1 times searching the base 10 string for the number k, which is of length floor(log_10(k))+1, so it will not use more than
O(n (log(n)+1))
string digit accesses. The slow example had an n of roughly n = 10^9.
Reducing Complexity
Yiming Rong's answer is a first attempt to reduce the complexity of the problem.
If the function for calculating the number of zeros regarding the interval [m,n] is F(m,n), then it has the property
F(m,n) = F(1,n) - F(1,m-1)
so that it suffices to look for a most likely simpler function G with the property
G(n) = F(1,n).
Divide and Conquer
Coming up with a closed formula for the function G is not that easy. E.g.
the interval [1,1000] contains 192 zeros, but the interval [1001,2000] contains 300 zeros, because a case like k = 99 in the first interval would correspond to k = 1099 in the second interval, which yields another zero digit to count. k=7 would show up as 1007, yielding two more zeros.
What one can try is to express the solution for some problem instance in terms of solutions to simpler problem instances. This strategy is called divide and conquer in computer science. It works if at some complexity level it is possible to solve the problem instance and if one can deduce the solution of a more complex problem from the solutions of the simpler ones. This naturally leads to a recursive formulation.
E.g. we can formulate a solution for a restricted version of G, which is only working for some of the arguments. We call it g and it is defined for 9, 99, 999, etc. and will be equal to G for these arguments.
It can be calculated using this recursive function:
# zeros for 1..n, where n = (10^k)-1: 0, 9, 99, 999, ..
def g(n)
if n <= 9
return 0
end
n2 = (n - 9) / 10
return 10 * g(n2) + n2
end
Note that this function is much faster than the brute force method: To count the zeros in the interval [1, 10^9-1], which is comparable to the m from the question, it just needs 9 calls, its complexity is
O(log(n))
Again note that this g is not defined for arbitrary n, only for n = (10^k)-1.
Derivation of g
It starts with finding the recursive definition of the function h(n),
which counts zeros in the numbers from 1 to n = (10^k) - 1, if the decimal representation has leading zeros.
Example: h(999) counts the zero digits for the number representations:
001..009
010..099
100..999
The result would be h(999) = 297.
Using k = floor(log10(n+1)), k2 = k - 1, n2 = (10^k2) - 1 = (n-9)/10 the function h turns out to be
h(n) = 9 [k2 + h(n2)] + h(n2) + n2 = 9 k2 + 10 h(n2) + n2
with the initial condition h(0) = 0. It allows to formulate g as
g(n) = 9 [k2 + h(n2)] + g(n2)
with the intital condition g(0) = 0.
From these two definitions we can define the difference d between h and g as well, again as a recursive function:
d(n) = h(n) - g(n) = h(n2) - g(n2) + n2 = d(n2) + n2
with the initial condition d(0) = 0. Trying some examples leads to a geometric series, e.g. d(9999) = d(999) + 999 = d(99) + 99 + 999 = d(9) + 9 + 99 + 999 = 0 + 9 + 99 + 999 = (10^0)-1 + (10^1)-1 + (10^2)-1 + (10^3)-1 = (10^4 - 1)/(10-1) - 4. This gives the closed form
d(n) = n/9 - k
This allows us to express g in terms of g only:
g(n) = 9 [k2 + h(n2)] + g(n2) = 9 [k2 + g(n2) + d(n2)] + g(n2) = 9 k2 + 9 d(n2) + 10 g(n2) = 9 k2 + n2 - 9 k2 + 10 g(n2) = 10 g(n2) + n2
Derivation of G
Using the above definitions and naming the k digits of the representation q_k, q_k2, .., q2, q1 we first extend h into H:
H(q_k q_k2..q_1) = q_k [k2 + h(n2)] + r (k2-kr) + H(q_kr..q_1) + n2
with initial condition H(q_1) = 0 for q_1 <= 9.
Note the additional definition r = q_kr..q_1. To understand why it is needed look at the example H(901), where the next level call to H is H(1), which means that the digit string length shrinks from k=3 to kr=1, needing an additional padding with r (k2-kr) zero digits.
Using this, we can extend g to G as well:
G(q_k q_k2..q_1) = (q_k-1) [k2 + h(n2)] + k2 + r (k2-kr) + H(q_kr..q_1) + g(n2)
with initial condition G(q_1) = 0 for q_1 <= 9.
Note: It is likely that one can simplify the above expressions like in case of g above. E.g. trying to express G just in terms of G and not using h and H. I might do this in the future. The above is already enough to implement a fast zero calculation.
Test Result
recursive(1234567890, 2345678901) =
987654304
expected:
987654304
success
See the source and log for details.
Update: I changed the source and log according to the more detailed problem description from that contest (allowing 0 as input, handling invalid inputs, 2nd larger example).
You can use a standard approach to find m = [1, M-1] and n = [1, N], then [M, N] = n - m.
Standard approaches are easily available: Counting zeroes.

Resources