AI50 Pset0 Degrees: What seems to be the problem in my code's shortest_path function?

This function is supposed to return a list of (movie_id, person_id) tuples, which gives the number of degrees of separation, via shared movies, between two different actors (whose person_ids are denoted by source and target).
However, it takes a very long time to compute the output, even with small datasets, and sometimes the program simply crashes, even when the degree of separation is only 2 or 3. How can I make this algorithm more efficient?
See here for more background: https://cs50.harvard.edu/ai/2020/projects/0/degrees/
def shortest_path(source, target):
    # create node for the start. initialise action and parent to None, and state to start id parameter
    source_node = Node (source, None, None)
    # create empty set of tuples called explored
    explored = set()
    frontier = QueueFrontier()
    # add start node to frontier list
    frontier.add (source_node)
    # execute search for target id (while True loop):
    while True:
        # remove a node from frontier (name it removed_node)
        removed_node = frontier.remove()
        # check if removed_node.state is target id
        if removed_node.state != target:
            neighbours = neighbors_for_person(removed_node.state)
            for neighbour in neighbours:
                neighbour_node = Node (neighbour[1], removed_node.state, neighbour[0])
                if neighbour != target:
                    frontier.add(neighbour_node)
                else:
                    frontier.add_infront(neighbour_node)
            explored.add (removed_node)
            # if removed_node.state is not target id:
            # get set of neighbours from get_neighbours function with its input as removed_node
            # generate node for each neighbour in the set, and add all the neighbours to frontier
            # add removed_node to explored set
        else:
            path = []
            path.append ((removed_node.action, removed_node.state))
            while removed_node.parent != None:
                find_node = removed_node.parent
                for a_node in explored:
                    if find_node == a_node.state:
                        removed_node = a_node
                        path.append ((removed_node.action, removed_node.state))
                        break
            path.reverse()
            path.pop(0)
            print (path)
            return path
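One likely cause of the slowdown is that neighbours are added to the frontier without checking whether they have already been seen, so the same people are queued again and again. The following is a minimal sketch of a breadth-first search that records each person the first time it is reached. It is not the course's reference solution: the toy data in TOY_NEIGHBORS and this stand-in neighbors_for_person are assumptions made only so the example runs on its own.

```python
from collections import deque

# Hypothetical toy data: person_id -> set of (movie_id, person_id) pairs.
TOY_NEIGHBORS = {
    "a": {("m1", "b")},
    "b": {("m1", "a"), ("m2", "c")},
    "c": {("m2", "b"), ("m3", "d")},
    "d": {("m3", "c")},
    "e": set(),
}


def neighbors_for_person(person_id):
    """Stand-in for the project's helper of the same name (assumption)."""
    return TOY_NEIGHBORS[person_id]


def shortest_path(source, target):
    """Return a list of (movie_id, person_id) steps, or None if unreachable."""
    if source == target:
        return []
    # parents maps person_id -> (movie_id, previous person_id); it doubles
    # as the "already seen" set, so no person is queued twice.
    parents = {source: None}
    frontier = deque([source])
    while frontier:
        person = frontier.popleft()
        for movie_id, neighbor in neighbors_for_person(person):
            if neighbor in parents:
                continue
            parents[neighbor] = (movie_id, person)
            if neighbor == target:
                # Walk back through parents to rebuild the path.
                path = []
                node = neighbor
                while parents[node] is not None:
                    movie, previous = parents[node]
                    path.append((movie, node))
                    node = previous
                path.reverse()
                return path
            frontier.append(neighbor)
    return None


print(shortest_path("a", "d"))
```

Checking for the target when a neighbour is generated, rather than when it is removed from the frontier, also avoids expanding a whole extra level of the graph.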

Related

Applying the change of list at the start of a function

I have been trying to create a word search generator using a list of lists. One requirement is that the list must keep the words appended earlier rather than starting from a blank list. My approach was simply to call the previous function for this.
initial= list()
def board ():
    inital = add()
    test = list()
    for a in range(5):
        test.append(a)
    initial = test.copy()
    return initial
def add():
    initial = board()
    for a in range(5,10):
        initial.append(a)
    initial = initial.copy()
    return initial
initial = add()
inital = board()
print(initial)
When I tried running it in the terminal, an error message appeared saying that the maximum recursion depth was exceeded. What does this mean, and how do I achieve the goal (having a list that prints 5,6,7,8,9,0,1,2,3,4)?
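The error arises because board() calls add() and add() calls board(), so neither function ever returns. One possible way around it, sketched below, is to pass the list in as an argument and return an extended copy. The items parameter is an assumption introduced for illustration and is not part of the original code.

```python
def add(items):
    """Return a copy of items with 5..9 appended."""
    result = items.copy()
    for a in range(5, 10):
        result.append(a)
    return result


def board(items):
    """Return a copy of items with 0..4 appended."""
    result = items.copy()
    for a in range(5):
        result.append(a)
    return result


# Each call builds on the list returned by the previous one,
# so earlier contents are preserved without any recursion.
initial = board(add([]))
print(initial)  # [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
```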

Priority Queue without built in Python functions

I am trying to set up a new server for a project in mechanical engineering. I plan on running a priority queue on the server to decide what it will do next. Most mechanical engineers do not know how to code, so I cannot find many people to help.
For reasons that are difficult to go into, I need to avoid most built-in functions within Python. I want to limit myself to the following: pop, insert, if statements, and such. I have a simple version written; however, I need it in one function that computes whenever I input values. The current one essentially sorts at the end.
I want to turn it into a class PriorityQueue that sorts items as they enter.
class Queue:
    def __init__(self, iL=[]):
        self.aList = iL[:]
    def enqueue(self, newItem:'item'):
        """put an item in back the queue"""
        # your code...
        self.aList.append(newItem)
    def dequeue(self)->'item': #return the item
        """remove an item from front of the queue, return the item"""
        # your code...
        return self.aList.pop(0)
    def empty(self):
        """Is the queue empty?"""
        # your code...
        return self.aList == []
    def __repr__(self): # for testing only
        return repr(self.aList)
# Use the Queue class as a Priority Queue by
# having a priority associated with each item, e.g.:
# (2,'code') where item 'code' have priority 2
# (1, 'eat') this entry has a priority 1 and the item 'eat'
# (3, 'sleep') 'sleep' has priority 3
# Let's consider #1 to be the highest priority, be #1, then 2, then 3
# given this type of priority queue as input
# define a function to sort the queue,
# such that highest priority item (smallest number)will be served first
# return a sorted queue
def queueSorted(qu:Queue) -> Queue:
    # your code...
    # must use only those 3 queue functions
    # cannot use any python function for sorting
    qusorted = Queue()
    while not qu.empty():
        # find min, enqueue it into sorted
        qutemp = Queue()
        imin = qu.dequeue()
        while not qu.empty():
            itemp = qu.dequeue()
            if itemp < imin:
                itemp, imin = imin, itemp
            qutemp.enqueue(itemp)
        qusorted.enqueue(imin)
        # do rest
        qu = qutemp
    return qusorted
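A minimal sketch of a queue that keeps itself sorted as items enter might look like the following. It relies only on list insert, pop, a while loop and comparisons, plus len, which is assumed here to be acceptable. The class layout is illustrative rather than a definitive design.

```python
class PriorityQueue:
    """Queue of (priority, item) tuples; the smallest priority is served first."""

    def __init__(self):
        self.items = []

    def enqueue(self, entry):
        """Insert entry at the position that keeps the list sorted."""
        i = 0
        # Walk past every entry whose priority is not larger, so that
        # entries with equal priority keep their arrival order.
        while i < len(self.items) and self.items[i][0] <= entry[0]:
            i += 1
        self.items.insert(i, entry)

    def dequeue(self):
        """Remove and return the entry with the highest priority."""
        return self.items.pop(0)

    def empty(self):
        return self.items == []


pq = PriorityQueue()
pq.enqueue((2, 'code'))
pq.enqueue((1, 'eat'))
pq.enqueue((3, 'sleep'))
first = pq.dequeue()
print(first)  # (1, 'eat')
```

Each enqueue costs time proportional to the queue length, which is usually fine for short task lists.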

How to say if a word tree is similar to another?

I want to know when a tree is similar to part of another, for instance when
When did Beyonce start becoming popular?
is partly included in the following sentences:
She started becoming popular in the late 1990s (1st case)
She rose to fame in the late 1990s (2nd case)
I am able to tell when one text is strictly included in another: I created a class that transforms a spaCy array into a tree, and I show later how to transform text to spaCy.
class WordTree:
    '''Tree for spaCy dependency parsing array'''
    def __init__(self, tree, is_subtree=False):
        """
        Construct a new 'WordTree' object.
        :param array: The array containing the dependency
        :param parent: The parent of the array if exists
        :return: returns nothing
        """
        self.parent = []
        self.children = []
        self.data = tree.label().split('_')[0] # the first element of the tree # We are going to add the synonyms as well.
        for subtree in tree:
            if type(subtree) == Tree:
                # Iterate through the depth of the subtree.
                t = WordTree(subtree, True)
                t.parent=tree.label().split('_')[0]
            elif type(subtree) == str:
                surface_form = subtree.split('_')[0]
                self.children.append(surface_form)
And I can tell when one sentence is included in another thanks to the following functions:
def isSubtree(T,S):
    if S is None:
        return True
    if T is None:
        return False
    if areIdentical(T, S):
        return True
    return any(isSubtree(c, S) for c in T.children)
def areIdentical(root1, root2):
    '''
    function to say if two roots are identical
    '''
    # Base Case
    if root1 is None and root2 is None:
        return True
    if root1 is None or root2 is None:
        return False
    # Check if the data of both roots and their children are the same
    return (root1.data == root2.data and
            ((areIdentical(child1 , child2))
             for child1, child2 in zip(root1.children, root2.children)))
For instance:
# first tree creation
text = "start becoming popular"
textSpacy = spacy_nlp(text)
treeText = nltk_spacy_tree(textSpacy)
t = WordTree(treeText[0])
# second tree creation
question = "When did Beyonce start becoming popular?"
questionSpacy = spacy_nlp(question)
treeQuestion = nltk_spacy_tree(questionSpacy)
q = WordTree(treeQuestion[0])
# tree comparison
isSubtree(t,q)
This returns True. So how can I tell when a tree is partly included in another (1st case) or similar to another (2nd case)?
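One possible way to put a number on "partly included" is to count how many parent-child edges of the query tree also occur in the other tree. The sketch below is only an illustration of that idea. The Node class, the edge_overlap function and the hand-built lemmatised trees are all assumptions and do not come from real spaCy output.

```python
class Node:
    """Hypothetical minimal tree node: a label plus child nodes."""

    def __init__(self, data, children=()):
        self.data = data
        self.children = list(children)


def edges(node):
    """Collect every (parent label, child label) pair in the tree."""
    found = set()
    for child in node.children:
        found.add((node.data, child.data))
        found |= edges(child)
    return found


def edge_overlap(query, other):
    """Fraction of the query's edges that also occur in the other tree."""
    query_edges = edges(query)
    if not query_edges:
        return 0.0
    return len(query_edges & edges(other)) / len(query_edges)


# Hand-built, lemmatised trees for illustration only.
query = Node("start", [Node("become", [Node("popular")])])
case1 = Node("start", [Node("she"),
                       Node("become", [Node("popular")]),
                       Node("in", [Node("1990s")])])
case2 = Node("rise", [Node("she"),
                      Node("to", [Node("fame")]),
                      Node("in", [Node("1990s")])])

print(edge_overlap(query, case1))  # 1.0
print(edge_overlap(query, case2))  # 0.0
```

With these toy trees the first case scores 1.0, while the second scores 0.0 because no labels are shared. This suggests the second case needs some notion of word similarity (synonyms or embeddings) on top of a purely structural comparison.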

How to Pop all items from a BST without destroying the tree?

I was trying to create a method that returns all the nodes of a binary search tree without deleting them. I want to leave my data structure untouched.
But I can't find a way to do it!
The only thing I have succeeded at was creating a pop method that returns the (cloned) root of the BST and then deletes it. The code is this:
def pop(self):
    #Empty tree.
    if self.root == None:
        return None
    #Copy the root.
    return_node = self.root.clone()
    #Find MAX Node.
    if self.root.left != None:
        #Starting node.
        MAX = self.root.left
        #Find max node.
        while MAX.right != None:
            MAX = MAX.right
        #Swap data with root.
        self.root.swap_data(MAX)
        #Destroy max node.
        MAX.destroy()
    #Find MIN Node.
    elif self.root.right != None:
        #Starting node.
        MIN = self.root.right
        #Find min node.
        while MIN.left != None:
            MIN = MIN.left
        #Swap data with root.
        self.root.swap_data(MIN)
        #Destroy min node.
        MIN.destroy()
    #Else set root to none.
    else:
        self.root = None
    #Return the popped node.
    self.size -= 1
    return return_node
So this code works like a charm with this loop:
node = tree.pop()
while node != None:
    #Do something with current node
    node = tree.pop() #Keep moving.
but the problem is that the tree will eventually be destroyed.
After that I thought that a traversal method could do the trick, but
I couldn't get it to work.
def preorder(self, root):
    if root == None:
        return
    #Do something here.
    #But how am i going to return all the nodes
    #Using this traversal method?
    self.preorder(root.left)
    self.preorder(root.right)
So is there a way to get all the items from a binary search tree
without destroying it?
It is hard to tell without the code that implements your BST and without knowing what you want the result to look like.
The pop method you provided is incorrect. During node removal, 3 cases are possible:
The current node has no right child node: in that case we need to move left to current.
The current node has only a right child node without a left subchild: we move right to current.
The current node has a right child with a left subchild: we need to move the leftmost node of the right child to current.
It seems you're reinventing the wheel, but there's an almost iconic implementation of a BST in Python here
Hope that helps.
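For the traversal route mentioned in the question, one common approach is to write the traversal as a generator, which hands back items one at a time and leaves the tree intact. The sketch below assumes a hypothetical minimal Node class with data, left and right attributes, since the original BST implementation is not shown.

```python
class Node:
    """Hypothetical minimal BST node (the original class is not shown)."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def inorder(node):
    """Yield every item of the subtree in sorted order without modifying it."""
    if node is None:
        return
    yield from inorder(node.left)
    yield node.data
    yield from inorder(node.right)


root = Node(5, Node(3, Node(1)), Node(8))
for item in inorder(root):
    print(item)  # do something with the current item

items = list(inorder(root))  # the tree can be traversed again at any time
```

Moving the yield node.data line before the two recursive calls would give a pre-order traversal instead.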

How to null out exceptions in an htmlChecker

While this is a project assignment for class I am trying to understand how to do a specific part of the project.
I need to go through an html file and check if all the opening statements are matched to closing statements. Further, they must be in the correct order and this must be checked using a stack I've implemented. As of right now I am working on extracting each tag from the file. The tough part seems to be the two exceptions that I am working on here: the <br/> and the <meta> tags. I need these tags to be removed so the program doesn't read them as an opening or closing statement.
class Stack(object):
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        # Remove and return the top item
        return self.items.pop()
def getTag(file):
    EXCEPTIONS = ['br/', 'meta']
    s = Stack()
    balanced = True
    i = 0
    isCopying = False
    currentTag = ''
    isClosing = False
    while i < len(file) and balanced:
        symbol = file[i]
        if symbol == "<":
            if i < (len(file) - 1) and file[i + 1] == "/":
                i = i + 1
                isClosing = True
            isCopying = True
        if symbol == ">":
            if isClosing == True:
                top = s.pop()
                if not matches(top, symbol):
                    balanced = False
            else:
                s.push(currentTag)
            currentTag = ''
            isCopying = False
        if isCopying == True:
            currentTag += symbol
        i = i + 1
The code reads in the file and goes letter by letter to search for <string>. If it exists, it pushes it onto the stack. The matches function checks whether the closing statement equals the opening statement. The exceptions list holds the tags I have to check for, which would otherwise disrupt how the strings are placed on the stack. I am having a tough time trying to incorporate them into my code. Any ideas? Before I push onto the stack, I should go through a filter to see whether that statement is valid or not. A basic if statement should suffice.
If I read your requirements correctly, you're going about this very awkwardly. What you're really looking to do is tokenize your file, and so the first thing you should do is get all the tokens in your file, and then check to see if it is a valid ordering of tokens.
Tokenization means you parse through your file, find all valid tokens, and put them in an ordered list. A valid token in your case is any string that starts with a < and ends with a >. I think you can safely discard the rest of the information. It would be easiest if you had a Token class to contain your token types.
Once you have that ordered list of tokens it is much easier to determine if they are a 'correct ordering' using your stack:
is_correct_ordering algorithm:
    For each element in the list
        if the element is an open-token, put it on the stack
        if the element is a close-token
            if the stack is empty return false
            if the top element of the stack is a matching open token
                pop the top element of the stack
            else return false
        discard any other token
    If the stack is NOT empty, return false
    Else return true
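The algorithm above can be sketched in Python roughly as follows. To keep the example self-contained, tokens are represented as plain (kind, name) tuples, where kind is "open", "close" or "other". That representation is an assumption made for illustration and not the Token classes discussed in this answer.

```python
def is_correct_ordering(tokens):
    """Return True if every close token matches the most recent open token."""
    stack = []
    for kind, name in tokens:
        if kind == "open":
            stack.append(name)
        elif kind == "close":
            # A close token needs a matching open token on top of the stack.
            if not stack or stack[-1] != name:
                return False
            stack.pop()
        # Any other kind of token (for example br or meta) is ignored.
    return not stack


good = [("open", "html"), ("other", "meta"), ("open", "body"),
        ("other", "br"), ("close", "body"), ("close", "html")]
bad = [("open", "html"), ("open", "body"), ("close", "html")]

print(is_correct_ordering(good))  # True
print(is_correct_ordering(bad))   # False
```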
Naturally, having a reasonable Token class structure makes things easy:
class Token:
    def matches(self, t: 'Token') -> bool:
        pass # TODO Implement
    @classmethod
    def tokenize(cls, token_string: str) -> 'Token':
        pass # TODO Implement to return the proper subclass instantiation of the given string
class OpenToken(Token):
    pass
class CloseToken(Token):
    pass
class OtherToken(Token):
    pass
This breaks the challenge into two parts: first parsing the file for all valid tokens (easy to validate because you can hand-compare your ordered list with what you see in the file) and then validating that the ordered list is correct. Note that here, too, you can simplify what you're working on by delegating work to a sub-routine:
def tokenize_file(file) -> list:
    token_list = []
    i = 0
    while i < len(file):
        token_string, token_end = get_token(file[i:])
        token_list.append(Token.tokenize(token_string))
        i = i + token_end + 1 # Skip past the end of this token
    return token_list
def get_token(file) -> tuple:
    # Note this is a naive implementation. Consider the edge case:
    # <img src="Valid string with >">
    token_string = ""
    for x in range(len(file)):
        token_string += file[x]
        if file[x] == '>':
            return token_string, x
    # Note that this function will fail if the file terminates before you find a closing tag!
The above should turn something like this:
<html>Blah<meta src="lala"/><body><br/></body></html>
Into:
[OpenToken('<html>'),
OtherToken('<meta src="lala"/>'),
OpenToken('<body>'),
OtherToken('<br/>'),
CloseToken('</body>'),
CloseToken('</html>')]
Which can be much more easily handled to determine correctness.
Obviously this isn't a complete implementation of your problem, but hopefully it will help straighten out the awkwardness you've chosen with your current direction.
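As a rough illustration of the tokenizing step, the sketch below uses a regular expression to pull out every <...> span and labels each one as open, close or other. The (kind, name) tuples, the classify helper and the VOID_TAGS set are assumptions made for this example. It shares the naive limitation noted above for a > inside an attribute value.

```python
import re

# Assumption: the two exception tags named in the question.
VOID_TAGS = {"br", "meta"}


def classify(tag):
    """Return (kind, name) for a raw tag string such as '<body>' or '</body>'."""
    inner = tag[1:-1].strip()
    if inner.startswith("/"):
        return ("close", inner[1:].strip().lower())
    self_closing = inner.endswith("/")
    parts = inner.rstrip("/").split()
    name = parts[0].lower() if parts else ""
    if self_closing or name in VOID_TAGS:
        return ("other", name)
    return ("open", name)


def tokenize(text):
    """Find every '<...>' span in text and classify it, ignoring other text."""
    return [classify(tag) for tag in re.findall(r"<[^>]*>", text)]


tokens = tokenize('<html>Blah<meta src="lala"/><body><br/></body></html>')
for token in tokens:
    print(token)
```

Handling the exception tags at classification time means the stack check never sees them as opening or closing statements.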
