Trying to learn Python; code leads to infinite loop and I can't figure out why? - python-3.x

I am trying to learn Python and want to run a random walk that plots the points. I have tried debugging this myself but cannot figure out where it goes wrong. I apologise, since this seems like a really simple problem, but I am getting frustrated.
One file rw_visual.py sets things up and then calls the other file random_walk.py to generate the points in the walk.
rw_visual.py: [code was posted as an image; not preserved]
random_walk.py: [code was posted as an image; not preserved]
In debugging, rw_visual.py seems to run until it reaches the call "rw.fill_walk()" and then it hangs. This tells me that something is wrong in the while loop in random_walk.py. As hard as I try, I cannot figure it out, though.
Sorry for the very basic question.

Python indentation determines scope. With the indentation of your while loop (and everything it should contain) corrected, I think this produces the results you're looking for. I left out the graphical part and just printed the x and y coordinates produced by the random walk. You can take over the graphical part from here.
from random import choice
class RandomWalk():
    def __init__(self, num_points=50):
        self.num_points = num_points
        self.x_values = [0]
        self.y_values = [0]
    def fill_walk(self):
        while len(self.x_values) < self.num_points:
            x_direction = choice([1, -1])
            x_distance = choice([0,1,2,3,4])
            x_step = x_direction * x_distance
            y_direction = choice([1, -1])
            y_distance = choice([0,1,2,3,4])
            y_step = y_direction * y_distance
            if x_step == 0 and y_step == 0:
                continue
            next_x = self.x_values[-1] + x_step
            next_y = self.y_values[-1] + y_step
            print (str(next_x) + " " + str(next_y))
            self.x_values.append(next_x)
            self.y_values.append(next_y)
rw = RandomWalk()
rw.fill_walk()
RESULTS
-2 -3
1 0
-2 0
-1 1
-1 -3
1 -1
4 0
0 0
0 4
0 5
3 5
1 3
1 4
1 3
-2 4
-3 7
0 7
1 7
-2 5
-2 1
-3 1
-1 0
-4 3
-3 5
0 9
3 7
3 4
-1 5
1 8
4 10
6 11
6 7
9 9
13 10
12 10
12 11
9 9
12 10
16 11
15 7
14 6
14 3
16 2
18 2
15 0
13 -2
12 -1
8 1
12 1
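The original random_walk.py was only posted as an image, so the exact mistake cannot be reproduced here. As a hedged illustration, the sketch below assumes the append call had ended up outside the while loop. The function name walk, the flag append_in_loop, and the max_iterations safety cap are all hypothetical and exist only so that the demo terminates.

```python
from random import choice

def walk(num_points, append_in_loop, max_iterations=10_000):
    """Return (x_values, iterations). max_iterations is a demo-only safety cap."""
    x_values = [0]
    iterations = 0
    while len(x_values) < num_points and iterations < max_iterations:
        iterations += 1
        step = choice([-1, 1]) * choice([1, 2, 3, 4])
        next_x = x_values[-1] + step
        if append_in_loop:
            # Correct indentation: the list grows on every pass,
            # so the loop condition eventually becomes false.
            x_values.append(next_x)
    if not append_in_loop:
        # Assumed mistake: append runs once, after the loop, so
        # len(x_values) never changes while the loop is running.
        x_values.append(next_x)
    return x_values, iterations

good, good_iterations = walk(50, append_in_loop=True)
bad, bad_iterations = walk(50, append_in_loop=False)
print(len(good), good_iterations)  # 50 49
print(len(bad), bad_iterations)    # 2 10000
```

Without the cap, the second call would never return, which matches the hang described in the question.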

Related

recursive function does not work as expected

Could someone explain the code? I just can not understand why this code gives output like this:
1
3
6
10
15
21
I expected the code to give something like this:
1
3
5
7
9
11
What am I missing here?
def tri_recursion(k):
    if(k > 0):
        result = k + tri_recursion(k-1)
        print(result)
    else:
        result = 0
    return result
tri_recursion(6)
For your recursive function, the termination condition is k=0.
It's clear that if k=0, tri_recursion(0) = 0.
If k=1, tri_recursion(1) = 1 + tri_recursion(0), which from above, is 1 + 0 or 1.
If k=2, tri_recursion(2) = 2 + tri_recursion(1), which from above, is 2 + 1 or 3.
If k=3, tri_recursion(3) = 3 + tri_recursion(2), which from above, is 3 + 3 or 6.
If k=4, tri_recursion(4) = 4 + tri_recursion(3), which from above, is 4 + 6 or 10.
If k=5, tri_recursion(5) = 5 + tri_recursion(4), which from above, is 5 + 10 or 15.
If k=6, tri_recursion(6) = 6 + tri_recursion(5), which from above, is 6 + 15 or 21.
See the pattern?
Your code is calculating the sum of numbers up to n where n is 6 in the above case. The print statement prints the intermediate results. Hence the output 1 3 6 10 15 21.
1 - The sum of numbers from 0 to 1
3 - The sum of numbers from 0 to 2
6 - The sum of numbers from 0 to 3
10 - The sum of numbers from 0 to 4
15 - The sum of numbers from 0 to 5
21 - The sum of numbers from 0 to 6
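To make the difference between the two sequences concrete, here is a small sketch. It restates tri_recursion without the print call, checks it against the closed form k(k+1)/2, and shows that the odd numbers the asker expected come from a different formula, 2k - 1. The list names are illustrative only.

```python
def tri_recursion(k):
    """Sum of the integers 1..k, computed recursively (no printing)."""
    if k > 0:
        return k + tri_recursion(k - 1)
    return 0

# What the original code prints on the way back up the recursion.
running_sums = [tri_recursion(k) for k in range(1, 7)]
# The same values from the closed form for triangular numbers.
closed_form = [k * (k + 1) // 2 for k in range(1, 7)]
# The sequence the asker expected: consecutive odd numbers.
odd_numbers = [2 * k - 1 for k in range(1, 7)]

print(running_sums)  # [1, 3, 6, 10, 15, 21]
print(odd_numbers)   # [1, 3, 5, 7, 9, 11]
```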

Replacing the first column values according to the second column pattern

How can I use regex to replace values in a DataFrame, here in column 5, according to a pattern in column 1? Column 5 consists only of ones for now. However, I would like to start changing this column when the pattern 34444 appears in column 1. The program is then supposed to replace the ones with 11111, 22222, 33333, etc., each time the pattern appears, until the end of the file.
Sample of the file:
     0  5  1            2           3           4
11   1  1  1  -173.386856   -0.152110  -58.235509
12   2  1  1  -176.102464   -1.020643   -1.217859
13   3  1  1  -175.792961  -57.458357  -58.538891
14   4  1  1  -172.774153  -59.284206   -1.988605
15   5  1  1  -174.974179  -56.371161  -58.406157
16   6  1  3   138.998480   12.596951    0.223780
17   7  1  4   138.333252   11.884713   -0.281429
18   8  1  4   139.498084   13.356891   -0.480091
19   9  1  4   139.710930   11.981460    0.697098
20  10  1  4   138.452807   13.136061    0.990663
21  11  1  3   138.998480   12.596951    0.223780
22  12  1  4   138.333252   11.884713   -0.281429
23  13  1  4   139.498084   13.356891   -0.480091
24  14  1  4   139.710930   11.981460    0.697098
25  15  1  4   138.452807   13.136061    0.990663
Expected result:
     0  5  1            2           3           4
11   1  1  1  -173.386856   -0.152110  -58.235509
12   2  1  1  -176.102464   -1.020643   -1.217859
13   3  1  1  -175.792961  -57.458357  -58.538891
14   4  1  1  -172.774153  -59.284206   -1.988605
15   5  1  1  -174.974179  -56.371161  -58.406157
16   6  1  3   138.998480   12.596951    0.223780
17   7  1  4   138.333252   11.884713   -0.281429
18   8  1  4   139.498084   13.356891   -0.480091
19   9  1  4   139.710930   11.981460    0.697098
20  10  1  4   138.452807   13.136061    0.990663
21  11  2  3   138.998480   12.596951    0.223780
22  12  2  4   138.333252   11.884713   -0.281429
23  13  2  4   139.498084   13.356891   -0.480091
24  14  2  4   139.710930   11.981460    0.697098
25  15  2  4   138.452807   13.136061    0.990663
Yeah, if you really want re, there is a way. But I doubt it would be really more efficient than a for-loop.
1. re.finditer
import pandas as pd
import numpy as np
import re
# present col1 as number-strings
arr1 = df['1'].values
str1 = "".join([str(i) for i in arr1])
ans = np.ones(len(str1), dtype=int)
# when a pattern is found, increase latter elements by 1
for match in re.finditer('34444', str1):
    e = match.end()
    ans[e:] += 1
# replace column 5
df['5'] = ans
# Output
df[['0', '5', '1']]
Out[50]:
     0  5  1
11   1  1  1
12   2  1  1
13   3  1  1
14   4  1  1
15   5  1  1
16   6  1  3
17   7  1  4
18   8  1  4
19   9  1  4
20  10  1  4
21  11  2  3
22  12  2  4
23  13  2  4
24  14  2  4
25  15  2  4
2. naïve for-loop
This checks the array directly, element by element. Compared with re.finditer, no typecasting is involved, but an explicit for-loop has to be written. The same output is obtained. Please benchmark it yourself if efficiency becomes relevant, say, with tens of millions of rows.
arr1 = df['1'].values
ans = np.ones(len(arr1), dtype=int)
n = len(arr1)
for i, el in enumerate(arr1):
    # termination: fewer than 5 elements left, so no full pattern can start here
    if i > n - 5:
        break
    # ignore non-3 elements
    if el != 3:
        continue
    # if found, increase latter elements by 1
    if np.all(arr1[i+1:i+5] == 4):
        ans[i+5:] += 1
df['5'] = ans
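The labelling rule that both versions implement can also be sketched without pandas or NumPy, which makes it easy to check on a plain list. This is only an illustration of the logic, not a replacement for the answer's code: the function name label_after_pattern is hypothetical, and the sketch assumes the pattern cannot overlap itself (true for 3, 4, 4, 4, 4).

```python
def label_after_pattern(values, pattern):
    """Return one label per value; the label goes up by 1 after each pattern match."""
    values, pattern = list(values), list(pattern)
    m = len(pattern)
    # Indices just past the end of every occurrence of the pattern.
    ends = {i + m for i in range(len(values) - m + 1)
            if values[i:i + m] == pattern}
    labels = []
    label = 1
    for i in range(len(values)):
        if i in ends:
            label += 1  # everything after a match gets the next label
        labels.append(label)
    return labels

# Column 1 of the sample file from the question.
column_1 = [1, 1, 1, 1, 1, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4]
print(label_after_pattern(column_1, [3, 4, 4, 4, 4]))
# [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
```

With a DataFrame, the result could be assigned back with something like df['5'] = label_after_pattern(df['1'], [3, 4, 4, 4, 4]), assuming the column holds integers.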

Tracing sorting algorithms

I am trying to trace the changes made by a selection sort algorithm in Python. Here is a piece of my code and what I've tried. The problem I am facing is printing the results in a table-like format.
l = [2,5,1,7,9,5,3,0,-1]
iterat = 1
print('Iteration' + '\t\t\t' + 'Results')
for i in range(1, len(l)):
    val_to_sort = l[i]
    while l[i-1] > val_to_sort and i > 0:
        l[i-1], l[i] = l[i], l[i-1]
        i -= 1
        print(iterat, '\t\t\t', l[0:iterat + 1],'|',l[iterat:])
    iterat += 1
From the code above, I am obtaining the following results: [image not preserved]
But I am trying to obtain results like these: [image not preserved]
Unindent print one level to the left, so it is inside the for block instead of the while block.
Use join and map to print the lists as strings.
You can use enumerate instead of manually incrementing iterat.
def format_list(l):
    return ' '.join(map(str, l))
l = [2,5,1,7,9,5,3,0,-1]
print('Iteration' + '\t\t\t' + 'Results')
for iterat, i in enumerate(range(1, len(l)), 1):
    val_to_sort = l[i]
    while l[i-1] > val_to_sort and i > 0:
        l[i-1], l[i] = l[i], l[i-1]
        i -= 1
    print(iterat, '\t\t\t', format_list(l[0:iterat + 1]),'|', format_list(l[iterat:]))
Outputs
Iteration Results
1 2 5 | 5 1 7 9 5 3 0 -1
2 1 2 5 | 5 7 9 5 3 0 -1
3 1 2 5 7 | 7 9 5 3 0 -1
4 1 2 5 7 9 | 9 5 3 0 -1
5 1 2 5 5 7 9 | 9 3 0 -1
6 1 2 3 5 5 7 9 | 9 0 -1
7 0 1 2 3 5 5 7 9 | 9 -1
8 -1 0 1 2 3 5 5 7 9 | 9
I can't help you with the Cyrillic text though ;)
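Since tab stops render differently across terminals, a fixed-width column is one possible way to get a steadier table. The sketch below is a variation on the code above, not the answer's own method: the function name trace_sort and the column width of 12 are arbitrary choices, and it splits the list at iterat + 1 so that no element appears on both sides of the bar, which is an assumption about the desired output.

```python
def trace_sort(items):
    """Return (step, sorted_part, remaining_part) after each outer pass."""
    items = list(items)  # work on a copy
    rows = []
    for step, i in enumerate(range(1, len(items)), 1):
        # Swap the new element leftwards until it is in place.
        while i > 0 and items[i - 1] > items[i]:
            items[i - 1], items[i] = items[i], items[i - 1]
            i -= 1
        rows.append((step, items[:step + 1], items[step + 1:]))
    return rows

rows = trace_sort([2, 5, 1, 7, 9, 5, 3, 0, -1])
print(f"{'Iteration':<12}Results")
for step, done, rest in rows:
    left = ' '.join(map(str, done))
    right = ' '.join(map(str, rest))
    # :<12 pads the step number to a fixed width instead of using tabs.
    print(f"{step:<12}{left} | {right}")
```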

How can I write this code more efficiently to make it run faster?

The code transforms the dataset so that, for each pair of movies, it counts the number of users who have seen both movies and stores that count as a column value.
I have written the code as shown here, but it takes a long time to execute as the number of pairs increases.
from itertools import combinations
import pandas as pd
def dataset_to_item_graph(self):
    self.dataset1=self.dataset
    items=self.dataset['movieId'].unique()
    print(len(items))
    ux=combinations(items,2)
    item_edges=[]
    for x in ux:
        i = x[0]
        j = x[1]
        # users who rated movie i and users who rated movie j
        a = set(self.dataset1.loc[self.dataset1['movieId'] == i]['userId'])
        b = set(self.dataset1.loc[self.dataset1['movieId'] == j]['userId'])
        c = a.intersection(b)
        if len(c) >0:
            edge_list=[i,j,len(c)]
            item_edges.append(edge_list)
        else:
            continue
    item_graph = pd.DataFrame(item_edges, columns=['movie1','movie2','weight'])
    return item_graph
This is the sample dataset I am working with:
userId movieId rating timestamp
0 1 1 4.0 964982703
1 1 3 4.0 964981247
2 1 6 4.0 964982224
3 1 47 5.0 964983815
4 1 50 5.0 964982931
5 2 1 3.0 964982931
6 2 3 4.0 964982831
7 2 6 4.0 964982933
8 3 47 5.0 964981249
9 3 1 2.0 964981248
10 3 50 3.5 965982931
This is the output I am expecting:
movieId1 movieId sum
0 1 3 2
1 1 6 2
2 1 47 2
3 1 50 2
4 3 6 2
5 3 47 1
6 3 50 1
7 6 47 1
8 6 50 1
9 47 50 2
It seems your problem is that big for loop. It could be worthwhile to launch subprocesses to compute those steps in parallel instead of sequentially. Do you know the multiprocessing module? You could try looking at this article, especially the example at the end, which uses from multiprocessing import Queue.
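Independently of parallelism, much of the cost probably comes from filtering the whole DataFrame twice for every pair of movies. As a different technique from the one suggested above, and only a sketch, one could group the movies by user once and count pairs with collections.Counter. The function name count_common_viewers and the tuple-based input are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

def count_common_viewers(ratings):
    """ratings: iterable of (user_id, movie_id). Returns Counter {(m1, m2): users}."""
    movies_by_user = defaultdict(set)
    for user_id, movie_id in ratings:
        movies_by_user[user_id].add(movie_id)
    pair_counts = Counter()
    for movies in movies_by_user.values():
        # Every pair of movies seen by this user gains one common viewer.
        pair_counts.update(combinations(sorted(movies), 2))
    return pair_counts

# (userId, movieId) columns of the sample dataset from the question.
sample = [(1, 1), (1, 3), (1, 6), (1, 47), (1, 50),
          (2, 1), (2, 3), (2, 6),
          (3, 47), (3, 1), (3, 50)]
pair_counts = count_common_viewers(sample)
for (movie1, movie2), weight in sorted(pair_counts.items()):
    print(movie1, movie2, weight)
```

With a DataFrame, the input could be built with zip(df['userId'], df['movieId']). Only pairs that share at least one user ever appear, so the zero-weight pairs that the original loop has to skip are never generated.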

writing a line in a text in a given sequence

I have found some code and I changed it for my purpose. The code goes to the given line number and adds a new line with a certain format. But it does not work if I have a sequence of numbers. I could not find out why.
This is my code:
import fileinput
x = [2, 4, 5, 6]
for line in fileinput.FileInput("1.txt", inplace=1):
    print(line, end="")
    for index, item in enumerate(x):
        if line.startswith("ND "+str(x[index]-1)):
            print("ND "+str(x[index])+" 0 0 0 0")
This is the input file "1.txt":
ND 1 12 11 8 9
ND 3 15 11 7 9
ND 7 8 9 2 3
ND 8 4 5 1 12
ND 9 2 3 6 10
This is the result now :
ND 1 12 11 8 9
ND 2 0 0 0 0
ND 3 15 11 7 9
ND 4 0 0 0 0
ND 7 8 9 2 3
ND 8 4 5 1 12
ND 9 2 3 6 10
What I need should be like this:
ND 1 12 11 8 9
ND 2 0 0 0 0
ND 3 15 11 7 9
ND 4 0 0 0 0
ND 5 0 0 0 0
ND 6 0 0 0 0
ND 7 8 9 2 3
ND 8 4 5 1 12
ND 9 2 3 6 10
Can you please give me a hint! How should I change my code?
Would something like this work for you? (You don't need to maintain the list of missing lines in x.) It's not the most elegant code, but it can be improved later if it works for you.
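import fileinput
n = 1
for line in fileinput.FileInput("1.txt", inplace=1):
    # emit placeholder lines until n catches up with the current line's number
    while not line.startswith("ND %d" % n):
        print("ND %d 0 0 0 0" % n)
        n+=1
    # line already ends with a newline, so suppress print's own
    print(line, end="")
    n+=1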
You are missing ND 4 and ND 5 lines in your 1.txt. That's why you cannot print the ND 5 0 0 0 0 and ND 6 0 0 0 0.
You can use a regex to extract the line number from the text and compare:
import fileinput
import re
x = [2, 4, 5, 6]
last = 0
for line in fileinput.FileInput("1.txt", inplace=1):
    # using regex to extract the "current" line number from ND...
    current = int(re.search(r'\d+', line).group())
    for index, item in enumerate(x):
        # "=" because there's a case that your given line already exists
        if item > last and item <= current:
            print("ND "+str(item)+" 0 0 0 0")
    last = current
    print(line, end="")
As I said in the comment, it needs a break with the maximum number of ND in the file. So in this case at n == 10.
import fileinput
n = 1
for line in fileinput.FileInput("1.txt", inplace=1):
    while not line.startswith("ND %d" % n):
        print("ND %d 0 0 0 0" % n)
        n+=1
        # stop filling once the maximum ND number in the file is reached
        if n == 10:
            break
    print(line, end="")
    n+=1
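The approaches above all rewrite the file in place while reading it, which can be awkward to test. As a hedged alternative, the same idea can be sketched on an in-memory list of lines. The function name insert_missing is hypothetical, and the sketch assumes every line has the form "ND <number> ..." and that the output should be ordered by that number.

```python
def insert_missing(lines, numbers):
    """Return lines plus an 'ND n 0 0 0 0' line for each n in numbers that is absent."""
    # Index the existing lines by the integer that follows "ND".
    by_number = {int(line.split()[1]): line for line in lines}
    for n in numbers:
        # setdefault leaves a line untouched if that number already exists.
        by_number.setdefault(n, f"ND {n} 0 0 0 0")
    return [by_number[n] for n in sorted(by_number)]

# Contents of 1.txt from the question, without trailing newlines.
sample_lines = [
    "ND 1 12 11 8 9",
    "ND 3 15 11 7 9",
    "ND 7 8 9 2 3",
    "ND 8 4 5 1 12",
    "ND 9 2 3 6 10",
]
for line in insert_missing(sample_lines, [2, 4, 5, 6]):
    print(line)
```

The result could then be written back with a normal open(..., "w") call once it looks right.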
