I've encountered a problem while trying to simulate a math question from school. I've tested the inner loop independently and its result is what I expect, so I have no idea where the problem is or how to resolve it. It should be a simple bug.
This is the question:
There is a bag containing three red balls, four white balls and five black balls. One ball is drawn at a time, without replacement. What is the probability that red is the first color to be completely collected?
And this is my code (the comments were added for this post and are not in my actual code):
import random as rd

y = 1000  # total try
succ = 0  # success times
orgbg = ['r','r','r','w','w','w','w','b','b','b','b','b']  # original bag for each loop initialization
while (y >= 0):
    redball = 0
    blackball = 0
    whiteball = 0
    newbg = orgbg  # every bag for a single try
    while (redball < 3 and whiteball < 4 and blackball < 5):
        tknum = rd.randrange(0,len(newbg),1)
        tkball = newbg[tknum]
        if (tkball == 'r'):
            redball = redball + 1
        elif (tkball == 'w'):
            whiteball = whiteball + 1
        else:
            blackball = blackball + 1
        del newbg[tknum]
    if (redball == 3):
        succ = succ + 1
    y = y - 1
print(succ)
This is what the error report says:
ValueError: empty range for randrange() (0,0, 0)
When I turn the code
tknum = rd.randrange(0,len(newbg),1)
into
tknum = rd.randrange(5,len(newbg),1)
The error report says:
ValueError: empty range for randrange() (5,5, 0)
I guess the problem is that the initialization in the outer loop, newbg = orgbg, doesn't work as intended, but how can that happen?
Sorry for asking such a lengthy question. I'm a beginner and this is the first time I've asked a question on Stack Overflow, so feel free to give me suggestions on my code style, my approach, or the way I ask questions; next time I will do better. Hope you don't mind.
I think your problem is indeed linked to the initialization in the outer loop, newbg = orgbg. To correct your code, you should replace this line with
newbg = deepcopy(orgbg)
and import the corresponding module at the start of your code:
from copy import deepcopy
The explanation of the bug is linked to the way Python handles names and memory when "copying" a list. The assignment newbg = orgbg does not copy anything: it simply binds a second name to the same list object, so del newbg[tknum] also empties orgbg, and every subsequent try starts with an empty bag, hence the empty range for randrange(). Note that since the list only holds immutable strings, a shallow copy (newbg = orgbg[:] or newbg = list(orgbg)) would also be enough here; a deep copy only matters for nested structures. It is better explained here: https://www.python-course.eu/deep_copy.php or What exactly is the difference between shallow copy, deepcopy and normal assignment operation?
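A minimal demonstration of the aliasing problem (my own example, not from the original post):

orgbg = ['r', 'w', 'b']
newbg = orgbg        # no copy is made: both names refer to the same list
del newbg[0]
print(orgbg)         # ['w', 'b'] -- the "original" bag shrank too

newbg = orgbg[:]     # a shallow copy; enough here since the items are strings
del newbg[0]
print(orgbg)         # ['w', 'b'] -- unchanged this time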
I have code that reads data and grabs specific data from an object's fields.
How can I eliminate the quadruple for loop here? Its performance seems quite slow.
data = readnek(filename)  # read in data
bigNum = 200000
for myNodeVal in range(0, 7):  # all 6 elements.
    cs_coords = np.ones((bigNum, 2))  # initialize data
    counter = 0
    for iel in range(bigNum):
        for ix in range(0, 7):
            for iy in range(0, 7):
                z = data.elem[iel].pos[2, myNodeVal, iy, ix]
                x = data.elem[iel].pos[0, myNodeVal, iy, ix]
                y = data.elem[iel].pos[1, myNodeVal, iy, ix]
                cs_coords[counter, 0:2] = [x, y]
                counter += 1
You can remove the two innermost loops by using a transposed view that is reshaped to build a block of 49 [x, y] values, which is then assigned to cs_coords in a vectorized way. The access to z can also be removed for better performance (the Python interpreter optimizes nearly nothing). Here is an (untested) example:
data = readnek(filename)  # read in data
bigNum = 200000
for myNodeVal in range(0, 7):  # all 6 elements.
    cs_coords = np.ones((bigNum, 2))  # initialize data
    counter = 0
    for iel in range(bigNum):
        arr = data.elem[iel].pos
        view_x = arr[0, myNodeVal, 0:7, 0:7].T
        view_y = arr[1, myNodeVal, 0:7, 0:7].T
        cs_coords[counter:counter+49] = np.hstack([view_x.reshape(-1, 1), view_y.reshape(-1, 1)])
        counter += 49
Note that the initial code is probably flawed, since cs_coords.shape[0] is bigNum while counter runs up to bigNum * 49. You certainly need the shape (bigNum*49, 2) instead, so as to avoid out-of-bounds errors.
Note also that the above code is still far from optimal, since it creates many small arrays, and Numpy is not optimized for very small arrays (neither is CPython). It is hard to do much better without more information about data. Numba or Cython could certainly help to speed this code up. Still, even with such tools, the code will not be very efficient, since the memory access pattern is inefficient (poor cache locality) and the overall code will be memory-bound.
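If the per-element pos arrays can be gathered into one big array up front (an assumption on my part: it depends on what readnek returns, and it costs a large one-off allocation), the loop over iel disappears as well. A rough, untested sketch:

import numpy as np

# Assumes every data.elem[iel].pos has the same shape, e.g. (3, nNodes, 7, 7).
pos_all = np.stack([data.elem[iel].pos for iel in range(bigNum)])

for myNodeVal in range(0, 7):
    # Swap the (iy, ix) axes so the flattened order matches the original loops:
    view_x = pos_all[:, 0, myNodeVal].transpose(0, 2, 1).reshape(-1)
    view_y = pos_all[:, 1, myNodeVal].transpose(0, 2, 1).reshape(-1)
    cs_coords = np.stack([view_x, view_y], axis=1)  # shape (bigNum * 49, 2)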
I've boiled down the logic of something I'm struggling to figure out (I'm new to Python, and it's been a long time since I've done any coding). The intention of the code below is to roll a d6 limit times (in this case 300) and, over a defined number of iterations, write each batch of 300 rolls to its own file.
What I get is n (loops) files with the same data in each one. So right now this will return random1.txt, random2.txt and random3.txt and all will have the same values in them.
Obviously I need to reinitialize results in some way at the start of each iteration of the parent while loop (while loops >= 1:), I just can't figure out how.
If anyone can take pity on a blundering artist I'd appreciate it! This is part of an art project I'm working on to make generative art with an axidraw if anyone is curious.
import numpy as np
import os

loops = 3  # will generate n files
limit = 300
throw = 1.0
results = []
while loops >= 1:
    loops -= 1
    while throw <= limit:
        roll = np.random.randint(1, 7)
        throw += 1
        results.append(roll)
    n = 1
    while os.path.exists("random%s.txt" % n):
        n += 1
    listToStr = ' '.join(map(str, results))
    f = open("random%s.txt" % n, "w")
    f.write(listToStr)
    f.close()
Some code for a blundering artist :-)
import numpy as np

nRolls = 8
for f in [1, 2, 3]:
    # Generate 8 rolls of the die
    rolls = np.random.randint(1, 7, nRolls)
    # Save like a CSV
    np.savetxt(f'random{f}.txt', [rolls], delimiter=',', fmt='%d')
random1.txt
5,2,3,5,4,6,6,1
random2.txt
1,6,6,1,1,4,2,1
random3.txt
5,4,4,4,6,3,1,5
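For completeness, the original while-loop version also works once results and throw are reset at the top of the outer loop. Because they never were, the inner loop only ran while writing the first file, and the same 300 rolls were written every time. A minimal repair of the original script:

import numpy as np
import os

loops = 3  # will generate n files
limit = 300

while loops >= 1:
    loops -= 1
    results = []  # reset for every file
    throw = 1.0   # reset the counter too, or the inner loop runs only once
    while throw <= limit:
        results.append(np.random.randint(1, 7))
        throw += 1
    n = 1
    while os.path.exists("random%s.txt" % n):
        n += 1
    with open("random%s.txt" % n, "w") as f:
        f.write(' '.join(map(str, results)))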
In the code supplied below I am trying to iterate over a 2D numpy array indexed as [i][k].
Originally this was code written in Fortran 77, which is older than my grandfather, and I am trying to adapt it to Python.
(For anyone interested: it is a simple hydraulic transient event solver.)
Bear in mind that all variables are defined in my full code, which I don't paste here.
H = np.zeros((NS,50))
Q = np.zeros((NS,50))
Here I am assigning the first row values:
for i in range(NS):
    H[0][i] = HR-i*R*Q0**2
    Q[0][i] = Q0
CVP = .5*Q0**2/H[N]
T = 0
k = 0
TAU = 1
#Interior points:
HP = np.zeros((NS,50))
QP = np.zeros((NS,50))
while T <= Tmax:
    T += dt
    k += 1
    for i in range(1, N):
        CP = H[k][i-1]+Q[k][i-1]*(B-R*abs(Q[k][i-1]))
        CM = H[k][i+1]-Q[k][i+1]*(B-R*abs(Q[k][i+1]))
        HP[k][i-1] = 0.5*(CP+CM)
        QP[k][i-1] = (HP[k][i-1]-CM)/B
    #Boundary Conditions:
    HP[k][0] = HR
    QP[k][0] = Q[k][1]+(HP[k][0]-H[k][1]-R*Q[k][1]*abs(Q[k][1]))/B
    if T == Tc:
        TAU = 0
        CV = 0
    else:
        TAU = (1.-T/Tc)**Em
        CV = CVP*TAU**2
    CP = H[k][N-1]+Q[k][N-1]*(B-R*abs(Q[k][N-1]))
    QP[k][N] = -CV*B+np.sqrt(CV**2*(B**2)+2*CV*CP)
    HP[k][N] = CP-B*QP[k][N]
    for i in range(NS):
        H[k][i] = HP[k][i]
        Q[k][i] = QP[k][i]
Remember i is for rows and k is for columns
What I expect is that the values are calculated for all k columns until the T <= Tmax condition is met. I cannot figure out what my mistake is; I am getting the following errors:
RuntimeWarning: divide by zero encountered in true_divide
CVP = .5*Q0**2/H[N]
RuntimeWarning: invalid value encountered in multiply
QP[N][k] = -CV*B+np.sqrt(CV**2*(B**2)+2*CV*CP)
QP[N][k] = -CV*B+np.sqrt(CV**2*(B**2)+2*CV*CP)
ValueError: setting an array element with a sequence.
Looking at your first iteration:
H = np.zeros((NS,50))
Q = np.zeros((NS,50))
for i in range(NS):
    H[0][i] = HR-i*R*Q0**2
    Q[0][i] = Q0
The shape of H is (NS, 50), but when you iterate over range(NS) you apply that index to the 2nd dimension. Why? Shouldn't it apply to the dimension with size NS?
In numpy, arrays have 'C' order by default: the last dimension is the innermost. They can also have 'F' (Fortran) order, but let's not go there. Thinking of a 2D array as a table, we typically talk of rows and columns, though those terms don't have a formal definition in numpy.
Let's assume you want to set the first column to these values:
for i in range(NS):
    H[i, 0] = HR - i*R*Q0**2
    Q[i, 0] = Q0
But we can do the assignment a whole row or column at a time. I believe newer versions of Fortran also have these 'whole-array' operations:
Q[:, 0] = Q0
H[:, 0] = HR - np.arange(NS) * R * Q0**2
One point of caution when translating to Python: indexing starts with 0, and so do range(...) and np.arange(...).
H[0][i] is functionally the same as H[0, i]. But when using slices you have to use the H[:, i] form.
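A tiny illustration of the difference (my own example, with made-up numbers):

import numpy as np

H = np.zeros((4, 6))
H[0][2] = 1.0    # same element as H[0, 2]
H[0, 2] = 1.0    # preferred numpy style

H[:, 0] = 9.0    # sets the whole first column at once
# H[:][0] = 9.0 would NOT do that: H[:] is a view of the whole
# array, so this would set the whole first *row* instead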
I suspect your other iterations have similar problems, but I'll stop here for now.
Regarding the errors:
The first:
RuntimeWarning: divide by zero encountered in true_divide
CVP = .5*Q0**2/H[N]
You initialize H as zeros so it is normal that it complains of division by zero. Maybe you should add a conditional.
The third:
QP[N][k] = -CV*B+np.sqrt(CV**2*(B**2)+2*CV*CP)
ValueError: setting an array element with a sequence.
You define CVP = .5*Q0**2/H[N] and then CV = CVP*TAU**2, which is a sequence (H[N] is a whole row, so CVP and CV are arrays). You then try to assign a value derived from it to QP[N][k], which is a single element: you are trying to assign an array to a scalar.
For the second error, I think it is related to the third. If you could provide more information, I would like to try to understand what happens.
Hope this has helped.
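A minimal reproduction of that third error, to make it concrete (my own example, not from the original code):

import numpy as np

QP = np.zeros((3, 3))
CV = np.array([1.0, 2.0])  # a sequence, like CVP in the question
QP[0][0] = -CV * 2         # ValueError: setting an array element with a sequence.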
I've made a simple terrain generator, but it takes an excessive amount of time to generate anything bigger than 50x50. Is there anything I can do to optimise the code so that I can generate larger maps? I know that things such as pygame or numpy might be better suited for this, but at my school they won't install those, so this is what I have to work with.
Here's the relevant code:
def InitMap(self):
    aliveCells = []
    for x in range(self.width):
        for y in range(self.height):
            if random.random() < self.aliveChance:
                aliveCells.append(self.FindInGrid(x,y))
    return aliveCells

def GenerateMap(self):
    aliveCells = self.InitMap()
    shallowCells = []
    self.count = 1
    for i in range(self.steps):
        aliveCells = self.DoGenStep(aliveCells)
    for i in aliveCells:
        self.canvas.itemconfig(i, fill="green")
    for i in aliveCells:
        for j in self.FindNeighbours(i):
            if j not in aliveCells: self.canvas.itemconfig(i, fill="#0000FF")

def DoGenStep(self, oldAliveCells):
    newAliveCells = []
    for allCells in self.pos:
        for cell in allCells:
            self.root.title(str(round((self.count/(self.height*self.width)*100)/self.steps))+"%")
            self.count += 1
            aliveNeighbours = 0
            for i in self.FindNeighbours(cell):
                if i in oldAliveCells: aliveNeighbours += 1
            if cell in oldAliveCells:
                if aliveNeighbours < self.deathLimit:
                    pass
                else:
                    newAliveCells.append(cell)
            else:
                if aliveNeighbours > self.birthLimit:
                    newAliveCells.append(cell)
    return newAliveCells

def FindNeighbours(self, cell):
    cellCoords = self.GetCoords(cell)
    neighbours = []
    for xMod in [-1, 0, 1]:
        x = xMod + cellCoords[0]
        for yMod in [-1, 0, 1]:
            y = yMod + cellCoords[1]
            if x < 0 or x >= self.width: pass
            elif y < 0 or y >= self.height: pass
            elif xMod == 0 and yMod == 0: pass
            else: neighbours.append(self.FindInGrid(x, y))
    return neighbours
NB: You didn't add the method "FindInGrid", so I'm making some assumptions. Please correct me if I'm wrong.
One thing which would help immensely for larger maps, and also at high densities, is to store the entire grid rather than just the alive cells. By storing only the alive cells in a list, you make the behavior of your program O((x*y)^2), since for every cell you iterate over all alive cells to test membership. If you stored the entire grid, this would not be necessary, and the calculation could be performed in time linear in the surface of your grid, rather than quadratic.
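A minimal sketch of that idea, assuming cells are addressed by (x, y) coordinates rather than the canvas ids returned by FindInGrid (names and signature are illustrative, not from the original code):

def do_gen_step(old_grid, width, height, death_limit, birth_limit):
    # old_grid[x][y] is True if the cell is alive; list-membership tests disappear
    new_grid = [[False] * height for _ in range(width)]
    for x in range(width):
        for y in range(height):
            alive_neighbours = 0
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nx, ny = x + dx, y + dy
                    if (dx or dy) and 0 <= nx < width and 0 <= ny < height:
                        alive_neighbours += old_grid[nx][ny]
            if old_grid[x][y]:
                new_grid[x][y] = alive_neighbours >= death_limit
            else:
                new_grid[x][y] = alive_neighbours > birth_limit
    return new_grid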
Additional point:
self.root.title(str(round((self.count/(self.height*self.width)*100)/self.steps))+"%")
That's a string operation, which makes it relatively expensive. Are you sure you need to do this after each and every update of a single cell?
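If the percentage display is worth keeping at all, updating it once per generation instead of once per cell removes nearly all of that cost. A sketch against the GenerateMap loop (assuming the display only needs per-step granularity):

for step in range(self.steps):
    aliveCells = self.DoGenStep(aliveCells)
    # one title update per generation, not width*height of them
    self.root.title(str(round((step + 1) / self.steps * 100)) + "%")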
The code below generates two random integers within range specified by argv, tests if the integers match and starts again. At the end it prints some stats about the process.
I've noticed, though, that increasing the value of argv drastically reduces the percentage of tested possibilities.
This seems counterintuitive to me, so my question is: is this an error in the code, or are the numbers real, and if so, what am I not thinking about?
#!/usr/bin/python3
import sys
import random

x = int(sys.argv[1])
a = random.randint(0,x)
b = random.randint(0,x)
steps = 1
combos = x**2
while a != b:
    a = random.randint(0,x)
    b = random.randint(0,x)
    steps += 1
percent = (steps / combos) * 100
print()
print()
print('[{} ! {}]'.format(a,b), end=' ')
print('equality!'.upper())
print('steps'.upper(), steps)
print('possible combinations = {}'.format(combos))
print('explored {}% possibilities'.format(percent))
Thanks
EDIT
For example:
./runscrypt.py 100000
will return something like:
[65697 ! 65697] EQUALITY!
STEPS 115867
possible combinations = 10000000000
explored 0.00115867% possibilities
"explored 0.00115867% possibilities" <-- This number is too low?
This experiment really follows a geometric distribution.
That is: let Y be the random variable counting the number of iterations before a match is seen. Then Y is geometrically distributed with parameter p = 1/x, the probability of generating two matching integers. (Strictly speaking, randint(0, x) is inclusive, so there are x + 1 possible values and p = 1/(x + 1); for large x the difference is negligible.)
The expected value is E[Y] = 1/p, a standard property of the geometric distribution. So in your case the expected number of iterations is 1/(1/x) = x.
The number of combinations is x^2.
So the expected percentage of explored possibilities is really x/(x^2) = 1/x.
As x approaches infinity, this number approaches 0.
In the case of x=100000, the expected percentage of explored possibilities = 1/100000 = 0.001% which is very close to your numerical result.
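A quick simulation (my own sketch) backs this up: the average number of steps over many runs comes out close to x, i.e. close to 1/x of the x^2 combinations:

import random

def average_steps(x, trials=1000):
    total = 0
    for _ in range(trials):
        steps = 1
        while random.randint(0, x) != random.randint(0, x):
            steps += 1
        total += steps
    return total / trials

x = 1000
mean = average_steps(x)
print(mean)                      # close to x + 1 (randint(0, x) has x + 1 values)
print(mean / x**2 * 100, '%')    # explored percentage, close to 100 / x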