For seq2seq text summarization I am facing an issue with repetition. What is the best way to handle the repetition problem in seq2seq text summarization?
At decoding time I get repetitive output like:
Original summary: siri like digital assistant is poised to replace bing search in the windows phone 8 1 update is described as ‘a personal assistant on your phone ready to help with suggestions tasks and lots in the video it is expected to make its official debut at the build developer conference in san francisco next month
With Rep Predicted summary: the app is a app of the app of the app of the app of the app to create a app to create a app to create a app
Without Rep Predicted summary: the app is a app of the app of the app of the app of the app to create a app to create a app to create a app
Code for decoding with repetition:
def decode_sequence_original(input_seq):
    # Encode the input as state vectors.
    e_out, e_h, e_c = encoder_model.predict(input_seq)
    # Generate an empty target sequence of length 1.
    target_seq = np.zeros((1, 1))
    # Populate the first word of the target sequence with the start token.
    target_seq[0, 0] = target_word_index['sostok']
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + [e_out, e_h, e_c])
        # Sample a token
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_token = reverse_target_word_index[sampled_token_index]
        if sampled_token != 'eostok':
            decoded_sentence += ' ' + sampled_token
        # Exit condition: either hit max length or find the stop token.
        if (sampled_token == 'eostok' or
                len(decoded_sentence.split()) >= (max_summary_len - 1)):
            stop_condition = True
        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index
        # Update internal states
        e_h, e_c = h, c
    return decoded_sentence
To avoid repetition I added this change (if the greedy token would repeat the previous one, fall back to the second-most-probable token):
def decode_sequence(input_seq):
    # Encode the input as state vectors.
    e_out, e_h, e_c = encoder_model.predict(input_seq)
    # Generate an empty target sequence of length 1.
    target_seq = np.zeros((1, 1))
    # Populate the first word of the target sequence with the start token.
    target_seq[0, 0] = target_word_index['sostok']
    stop_condition = False
    decoded_sentence = ''
    prev_token = 0
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + [e_out, e_h, e_c])
        # Sample a token (argmax = greedy decoding)
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_token = reverse_target_word_index[sampled_token_index]
        if sampled_token_index == prev_token:
            # Same token as the previous step: take the second-most-probable token
            temp15 = np.argsort(output_tokens[0, -1, :])
            sampled_token_index = temp15[-2]  # index of the second-largest probability
            sampled_token = reverse_target_word_index[sampled_token_index]
        if sampled_token != 'eostok':
            decoded_sentence += ' ' + sampled_token
        prev_token = sampled_token_index
        # Exit condition: either hit max length or find the stop token.
        if (sampled_token == 'eostok' or
                len(decoded_sentence.split()) >= (max_summary_len - 1)):
            stop_condition = True
        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index
        # Update internal states
        e_h, e_c = h, c
    return decoded_sentence
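A stronger option than the previous-token check is to block repeated n-grams during decoding. Below is a minimal sketch of trigram blocking, assuming the same encoder_model/decoder_model/vocabulary objects as above: at each step, any token that would complete a trigram already present in the output is masked out before the argmax.
import numpy as np

def decode_sequence_no_trigram(input_seq):
    # Sketch only: assumes the same models and vocab dictionaries as above.
    e_out, e_h, e_c = encoder_model.predict(input_seq)
    target_seq = np.zeros((1, 1))
    target_seq[0, 0] = target_word_index['sostok']
    decoded_tokens = []
    while len(decoded_tokens) < max_summary_len - 1:
        output_tokens, h, c = decoder_model.predict([target_seq] + [e_out, e_h, e_c])
        probs = output_tokens[0, -1, :].copy()
        if len(decoded_tokens) >= 2:
            # Mask every token that would repeat an already-generated trigram
            prefix = tuple(decoded_tokens[-2:])
            seen = {tuple(decoded_tokens[i:i + 3])
                    for i in range(len(decoded_tokens) - 2)}
            for tok in range(len(probs)):
                if prefix + (tok,) in seen:
                    probs[tok] = -np.inf
        sampled_token_index = int(np.argmax(probs))
        if reverse_target_word_index[sampled_token_index] == 'eostok':
            break
        decoded_tokens.append(sampled_token_index)
        # Feed the chosen token and the updated states back into the decoder
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index
        e_h, e_c = h, c
    return ' '.join(reverse_target_word_index[t] for t in decoded_tokens)
Beam search with a repetition or coverage penalty is the other standard remedy; trigram blocking is just the cheapest to bolt onto greedy decoding.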
Premise
I am trying to solve a set of coupled PDEs that describe the diffusion of charged particles with different diffusion coefficients using FiPy. The ultimate goal is to obtain the concentration profiles for both species and the electric field.
The geometry is an infinitely long cylinder with radius R. I want to use a non-uniform grid with more points close to the domain walls.
Charged particles diffuse from the center of the domain (left boundary) to the walls of the domain (right boundary). This translates to a Dirichlet boundary condition (B.C.) at the right boundary, where both species' concentrations = 0, and a Neumann B.C. at the left boundary, where the species fluxes are 0, to describe the radial symmetry. Because the charged species diffuse at different rates, an electric field arises from the space charge. The electric field accelerates the slower species and decelerates the faster species in proportion to the field magnitude.
P is the positively charged species concentration, N is the negatively charged species concentration, and E is the space-charge electric field.
Issue
I can't seem to get a sensible solution from my code, and I think it may be related to how I cast the gradient/divergence terms as ConvectionTerms:
from fipy import *
import scipy.constants as constant
from fipy.tools import numerix
import numpy as np

## Defining physical constants
pi = constant.pi
m_argon = 6.6335e-26  # kg
k_b = constant.k  # J/K
e_0 = constant.epsilon_0  # F/m
q_e = constant.elementary_charge  # C
m_e = constant.electron_mass  # kg
planck = constant.h

def char_diff_length(L, R):
    """Characteristic diffusion length in a cylinder.
    Used for determining the ambipolar diffusion coefficient.
    ref: https://doi.org/10.6028/jres.095.035"""
    a = (pi/L)**2
    b = (2.405/R)**2
    c = (a + b)**(1/2)
    return c

def L_Debye(ne, Te):
    """Electron Debye screening length given in m.
    ne is in #/m3, Te is in K."""
    if ne < 3.3e-5:
        ne = 3.3e-5
    return ((e_0*k_b*Te)/(ne*q_e**2))**(1/2)

## Setting system parameters
# Operation parameters
Pressure = 1.e5  # ambient pressure Pa
T_g = 400.  # background gas temperature K
n_g = Pressure/k_b/T_g  # gas number density #/m3
Q_std = 300.  # standard volumetric flowrate in sccm
T_e_0 = 11.  # plasma temperature ratio T_e/T_g, here assumed T_e = 0.5 eV and T_g = 500 K
n_e_0 = 1.e20  # electron density in bulk plasma #/m3
# Geometric parameters
R_b = 1.e-3  # radius of cylinder m
L = 1.e-1  # length of cylinder m
# Transport parameters
D_ion = 4.16e-6  # m2/s ion diffusion, obtained from https://doi.org/10.1007/s12127-020-00258-z
mu_ion = D_ion*q_e/k_b/T_g  # ion electrical mobility using Einstein relation
D_e = 100.68122*D_ion  # m2/s electron diffusion
mu_e = D_e*q_e/k_b/T_g  # electron electrical mobility using Einstein relation
Lambda = char_diff_length(L, R_b)
debyelength_e = L_Debye(n_e_0, T_g)
gamma = (Lambda/debyelength_e)**2
delta = D_ion/D_e

def d_j(rb, n):  # sets the desired spatial steps for the mesh
    dj = np.zeros(n)
    for j in range(n):
        dj[j] = 2*rb*(1 - j/n)/n
    return dj

# Initializing mesh
dj = d_j(1., 100)  # 100 points
mesh = CylindricalGrid1D(dr=dj)

# Declaring cell variables
N = CellVariable(mesh=mesh, value=1., hasOld=True, name="electron density")
P = CellVariable(mesh=mesh, value=1., hasOld=True, name="ion density")
H = CellVariable(mesh=mesh, value=0., hasOld=True, name="electric field")

# Setting boundary conditions
N.constrain(0., mesh.facesRight)  # electron density = 0 at walls
P.constrain(0., mesh.facesRight)  # ion density = 0 at walls
H.constrain(0., mesh.facesLeft)  # electric field = 0 in the center
N.faceGrad.constrain([0.], mesh.facesLeft)  # flux of electrons = 0 in the center
P.faceGrad.constrain([0.], mesh.facesLeft)  # flux of ions = 0 in the center

if __name__ == '__main__':
    viewer = Viewer(vars=(P, N))
    viewer.plot()

eqn1 = (TransientTerm(var=P) == DiffusionTerm(coeff=delta, var=P)
        - ConvectionTerm(coeff=[H.cellVolumeAverage,], var=P)
        - ConvectionTerm(coeff=[P.cellVolumeAverage,], var=H))
eqn2 = (TransientTerm(var=N) == DiffusionTerm(var=N)
        + (1/delta)*(ConvectionTerm(coeff=[H.cellVolumeAverage,], var=N)
        + ConvectionTerm(coeff=[N.cellVolumeAverage,], var=H)))
eqn3 = (TransientTerm(var=H) == gamma*(ConvectionTerm(coeff=[delta**2,], var=P)
        - ConvectionTerm(coeff=[delta,], var=N)
        - H*(delta*P.cellVolumeAverage + N.cellVolumeAverage)))

P.setValue(1.)
N.setValue(1.)
H.setValue(0.)

eqn1d = eqn1 & eqn2 & eqn3
timesteps = 1e-5
steps = 100
for i in range(steps):
    P.updateOld()
    N.updateOld()
    H.updateOld()
    res = 1e10
    sweep = 0
    while res > 1e-3 and sweep < 20:
        res = eqn1d.sweep(dt=timesteps)
        sweep += 1

if __name__ == '__main__':
    viewer.plot()
Electric field is a vector, not a scalar.
H = CellVariable(rank=1, mesh=mesh, value = 0., hasOld = True, name = "electric field")
Correcting that should make converting the terms to FiPy clearer:
There's no reason to run the chain rule on the last term of eq1 or eq2; they're already in the canonical form for a FiPy ConvectionTerm. After the chain rule, they become, e.g., H·∇P + P(∇·H), neither of which is a form that FiPy likes. You could write those last two terms as explicit sources, but you shouldn't.
eqn1 = (TransientTerm(var=P) == DiffusionTerm(coeff=delta, var=P)
- ConvectionTerm(coeff=H, var=P))
eqn2 = (TransientTerm(var=N) == DiffusionTerm(var=N)
+ (1/delta)*ConvectionTerm(coeff=H, var=N))
I don't really understand eq3. It looks sort of like an integration of the continuity equation? I don't see it on a quick scan of the Phelps paper you cite. Regardless, it's not in a form that FiPy is amenable to; you can write it, but it won't solve well. The terms on the right aren't ConvectionTerms; they're just gradients.
If you're going to be allowing charge separation and worrying about the Debye length, I think you should be solving Poisson's equation. Can you share where this equation comes from? We might be able to put it in a form that FiPy will be happier with.
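For what it's worth, here is a minimal sketch of what a Poisson-based formulation could look like in FiPy. The potential variable phi, the sign convention E = -grad(phi), and the 1/gamma scaling are all assumptions on my part, not the original formulation:
# Sketch only (assumed nondimensionalization): solve Poisson's equation
# for a potential phi rather than evolving the field H directly.
phi = CellVariable(mesh=mesh, value=0., hasOld=True, name="potential")
phi.constrain(0., mesh.facesRight)  # assumed reference potential at the wall

# laplacian(phi) == -(P - N)/gamma, written with implicit source coupling
poisson = (DiffusionTerm(coeff=1., var=phi)
           == -ImplicitSourceTerm(coeff=1./gamma, var=P)
           + ImplicitSourceTerm(coeff=1./gamma, var=N))

# The drift field for the convection terms is then the face gradient of phi
E_face = -phi.faceGrad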
eq3 is a modified Poisson's equation. I tried to follow the procedure outlined by Freeman, where a time derivative is taken of the Poisson equation so that the species continuity equations can be substituted in. Freeman solved these equations using the Gear package, which I can only assume is a Fortran package. I followed his steps out of naiveté because I am out of my depth with numerical methods.
I will try solving again with the Poisson equation in its standard form.
Edit: I have changed the electric field H to a rank-1 variable, modified eq3, and slightly changed the definition of gamma. Everything else is unchanged.
H = CellVariable(rank=1, mesh=mesh, value=0., hasOld=True, name="electric field")

charlength = char_diff_length(L, R_b)
debyelength_e = L_Debye(n_e_0, T_g)
gamma = (debyelength_e/charlength)**2
delta = D_ion/D_e

eqn1 = (TransientTerm(var=P) == DiffusionTerm(coeff=delta, var=P)
        - ConvectionTerm(coeff=H, var=P))
eqn2 = (TransientTerm(var=N) == DiffusionTerm(var=N)
        + (1/delta)*ConvectionTerm(coeff=H, var=N))
eqn3 = (ConvectionTerm(coeff=gamma/delta, var=H) == ImplicitSourceTerm(var=P)
        - ImplicitSourceTerm(var=N))

P.setValue(1.)
N.setValue(1.)
H.setValue(0.)

eqn1d = eqn1 & eqn2 & eqn3
timesteps = 1e-8
steps = 100
for i in range(steps):
    P.updateOld()
    N.updateOld()
    H.updateOld()
    res = 1e10
    sweep = 0
    while res > 1e-3 and sweep < 20:
        res = eqn1d.sweep(dt=timesteps)
        sweep += 1

if __name__ == '__main__':
    viewer.plot()
It does not give me the same errors as before, which is some indication of progress. However, it is spitting out a new error:
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 2 has 2 dimension(s)
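One guess, and it is only an assumption on my part rather than a verified fix: FiPy's ConvectionTerm expects a rank-1 (vector) coefficient, so the scalar gamma/delta may be the source of the dimension mismatch. On this 1D mesh the coefficient could be wrapped as a one-element vector:
# Assumed fix, not verified against this exact setup:
eqn3 = (ConvectionTerm(coeff=(gamma/delta,), var=H)
        == ImplicitSourceTerm(coeff=1., var=P)
        - ImplicitSourceTerm(coeff=1., var=N))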
I'm trying to extract the orientation of a Wiimote using cwiid in Python. I've managed to get the accelerometer values, but there don't seem to be any object attributes relating to the purely gyroscopic data.
This guy managed to do it in Python, but to the best of my knowledge there's no example Python code online:
https://www.youtube.com/watch?v=cUjh0xQO6eY
There is information on WiiBrew about the controller data, but again, this seems to be excluded from any Python library.
Does anyone have any suggestions? This link has an example of getting gyro data, but the packages used don't seem to be available.
I was actually looking for this a few days ago and found this post: https://ofalcao.pt/blog/2014/controlling-the-sbrick-with-a-wiimote. More specifically, I think the code you're looking for is:
# roll = accelerometer[0], standby ~125
# pitch = accelerometer[1], standby ~125
...
roll=(wm.state['acc'][0]-125)
pitch=(wm.state['acc'][1]-125)
I'm assuming you can use the z-axis (index 2) for the yaw.
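Note that the raw accelerometer counts are most useful once converted to angles. Here is a minimal sketch; the 125 zero-offset and the ~26 counts-per-g scale are assumptions that vary per remote, so calibrate first:
import math

def accel_to_angles(acc, zero=125.0, scale=26.0):
    # Convert raw counts to g-units (zero offset and scale are assumed values)
    x, y, z = [(a - zero) / scale for a in acc]
    roll = math.atan2(y, z)                        # rotation about the long axis
    pitch = math.atan2(-x, math.sqrt(y*y + z*z))   # nose up/down
    # Yaw is not observable from gravity alone; that needs the gyro.
    return roll, pitch  # radians

# e.g. roll, pitch = accel_to_angles(wm.state['acc'])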
So this question has a few parts. Firstly, how do you extract the gyro data from the MotionPlus sensor? To do this, the MotionPlus first needs to be enabled.
The gyro provides the angular rotation rates, but due to drift caused by integration errors you can't simply integrate them to get Euler angles. The second part of the question is how to use this data to give orientation, and that is done with either a Kalman filter (a fairly complex matrix sequence) or a complementary filter (a less complex mathematical operation). Both of these filters essentially combine the gyro and accelerometer data, so, as mentioned in a comment above, they result in more stable measurements, less drift, and a system that isn't prone to breaking when the remote is shaken.
Kalman filter:
http://blog.tkjelectronics.dk/2012/09/a-practical-approach-to-kalman-filter-and-how-to-implement-it/
Using PyKalman on Raw Acceleration Data to Calculate Position
Complementary filter
https://www.instructables.com/Angle-measurement-using-gyro-accelerometer-and-Ar/
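For reference, the complementary filter itself is only one line. A minimal sketch (the 0.98/0.02 split matches the HPF/LPF constants in the full script below; gyro_rate and accel_angle are assumed to be in consistent units):
def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    # Trust the integrated gyro at short timescales (high-pass) and the
    # accelerometer-derived angle at long timescales (low-pass).
    return alpha * (angle + gyro_rate * dt) + (1.0 - alpha) * accel_angle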
I'm still developing the core but will post when I'm finished, hopefully tomorrow.
The foundation code I am using to test the measurements is found here:
http://andrew-j-norman.blogspot.com/2010/12/more-code.html. Very handy, as it plots the sensor readings automatically after recording. Doing this shows that, even when the remote is still, estimating position by simple integration of the angular velocities results in drift of the position vector.
EDIT:
Testing this shows that the gyro sensor accurately calculates the angle change over time; however, there remains drift in the acceleration, which I believe is unavoidable.
Here is an image demonstrating the gyro motion sensor:
Just finished up the code:
#!/usr/bin/python
import sys
import math
import cwiid
import numpy as np
from time import time, asctime, sleep, perf_counter
from numpy import *
from pylab import *
from operator import add

HPF = 0.98  # complementary filter: weight on the integrated gyro signal
LPF = 0.02  # complementary filter: weight on the accelerometer angle

def calibrate(wiimote):
    print("Keep the remote still")
    sleep(3)
    print("Calibrating")
    messages = wiimote.get_mesg()
    i = 0
    accel_init = []
    angle_init = []
    while i < 1000:
        sleep(0.01)
        messages = wiimote.get_mesg()
        for mesg in messages:
            # Motion plus:
            if mesg[0] == cwiid.MESG_MOTIONPLUS:
                angle_init.append(mesg[1]['angle_rate'])
            # Accelerometer:
            elif mesg[0] == cwiid.MESG_ACC:
                accel_init.append(list(mesg[1]))
        i += 1
    accel_init_avg = list(np.mean(np.array(accel_init), axis=0))
    print(accel_init_avg)
    angle_init_avg = list(np.mean(np.array(angle_init), axis=0))
    print("Finished Calibrating")
    return (accel_init_avg, angle_init_avg)

def plotter(plot_title, timevector, data, position, n_graphs):
    subplot(n_graphs, 1, position)
    plot(timevector, data[0], "r",
         timevector, data[1], "g",
         timevector, data[2], "b")
    xlabel("time (s)")
    ylabel(plot_title)

print("Press 1+2 on the Wiimote now")
wiimote = cwiid.Wiimote()
# Rumble to indicate a connection
wiimote.rumble = 1
print("Connection established - release buttons")
sleep(0.2)
wiimote.rumble = 0
sleep(1.0)
wiimote.enable(cwiid.FLAG_MESG_IFC | cwiid.FLAG_MOTIONPLUS)
wiimote.rpt_mode = cwiid.RPT_BTN | cwiid.RPT_ACC | cwiid.RPT_MOTIONPLUS
accel_init, angle_init = calibrate(wiimote)

print("Press plus to start recording, minus to end recording")
loop = True
record = False
accel_data = []
angle_data = []
messages = wiimote.get_mesg()
while loop:
    sleep(0.01)
    messages = wiimote.get_mesg()
    for mesg in messages:
        # Motion plus:
        if mesg[0] == cwiid.MESG_MOTIONPLUS:
            if record:
                angle_data.append({"Time": perf_counter(),
                                   "Rate": mesg[1]['angle_rate']})
        # Accelerometer:
        elif mesg[0] == cwiid.MESG_ACC:
            if record:
                accel_data.append({"Time": perf_counter(),
                                   "Acc": [mesg[1][i] - accel_init[i]
                                           for i in range(len(accel_init))]})
        # Button:
        elif mesg[0] == cwiid.MESG_BTN:
            if mesg[1] & cwiid.BTN_PLUS and not record:
                print("Recording - press minus button to stop")
                record = True
                start_time = perf_counter()
            if mesg[1] & cwiid.BTN_MINUS and record:
                if len(accel_data) == 0:
                    print("No data recorded")
                else:
                    print("End recording")
                    print("{0} data points in {1} seconds".format(
                        len(accel_data), perf_counter() - accel_data[0]["Time"]))
                record = False
                loop = False
        else:
            pass
wiimote.disable(cwiid.FLAG_MESG_IFC | cwiid.FLAG_MOTIONPLUS)
if len(accel_data) == 0:
    sys.exit()

timevector = []
a = [[], [], []]
v = [[], [], []]
p = [[], [], []]
last_time = 0
velocity = [0, 0, 0]
position = [0, 0, 0]
for n, x in enumerate(accel_data):
    if n == 0:
        origin = x
        last_time = x["Time"]
    else:
        elapsed = x["Time"] - origin["Time"]
        delta_t = x["Time"] - last_time
        timevector.append(elapsed)
        for i in range(3):
            acceleration = x["Acc"][i] - origin["Acc"][i]
            velocity[i] = velocity[i] + delta_t * acceleration
            position[i] = position[i] + delta_t * velocity[i]
            a[i].append(acceleration)
            v[i].append(velocity[i])
            p[i].append(position[i])
        last_time = x["Time"]

n_graphs = 3
if len(angle_data) == len(accel_data):
    n_graphs = 5
    # Tilt angle implied by the accelerometer (gravity) at each sample
    angle_accel = [(math.pi)/2 if (j**2 + k**2) == 0
                   else math.atan(i/math.sqrt(j**2 + k**2))
                   for i, j, k in zip(a[0], a[1], a[2])]
    ar = [[], [], []]  # Angle rates
    aa = [[], [], []]  # Angles
    angle = [0, 0, 0]
    for n, x in enumerate(angle_data):
        if n == 0:
            origin = x
            last_time = x["Time"]  # reset the timing reference for the gyro stream
        else:
            delta_t = x["Time"] - last_time
            for i in range(3):
                rate = x["Rate"][i] - origin["Rate"][i]
                # Complementary filter: integrated gyro (high-pass) blended with
                # the accelerometer angle at this sample (low-pass)
                angle[i] = HPF*(angle[i] + delta_t*rate) + LPF*angle_accel[n-1]
                ar[i].append(rate)
                aa[i].append(angle[i])
            last_time = x["Time"]

plotter("Acceleration", timevector, a, 1, n_graphs)
if n_graphs == 5:
    plotter("Angle Rate", timevector, ar, 4, n_graphs)
    plotter("Angle", timevector, aa, 5, n_graphs)
show()
I am facing a problem segmenting characters from a license plate image.
I have applied the following method to extract the license plate characters; after this I will feed every character into an OCR (Tesseract):
Convert RGB to grayscale (rgb2gray).
Canny edge detection and dilation.
Label and filter regions.
Extract single-character images.
The results are attached below:
The first column works well.
In the second column there is a Chinese plate, and it is difficult to correctly identify every character.
In the third column there is another plate, which was rotated via the Hough transform because it was taken from another angle; however, no characters can be detected there.
I'm working here, on Google Colab.
What can I improve? Any advice?
These are the images I'm working with, at this link.
I think the problem may be in the Canny edge detection:
img3 = feature.canny(img2, sigma=3)
img4 = morphology.dilation(img3)
Or filtering areas:
label_img = measure.label(img4)
regions = measure.regionprops(label_img)
fig, ax = plt.subplots()
ax.imshow(img, cmap=plt.cm.gray)

def in_bboxes(bbox, bboxes):
    for bb in bboxes:
        minr0, minc0, maxr0, maxc0 = bb
        minr1, minc1, maxr1, maxc1 = bbox
        if minr1 >= minr0 and maxr1 <= maxr0 and minc1 >= minc0 and maxc1 <= maxc0:
            return True
    return False

bboxes = []
for props in regions:
    y0, x0 = props.centroid
    minr, minc, maxr, maxc = props.bbox
    if maxc - minc > img4.shape[1] / 7 or maxr - minr < img4.shape[0] / 3:
        continue
    bbox = [minr, minc, maxr, maxc]
    if in_bboxes(bbox, bboxes):
        continue
    if abs(y0 - img4.shape[0] / 2) > img4.shape[0] / 4:
        continue
    bboxes.append(bbox)
    bx = (minc, maxc, maxc, minc, minc)
    by = (minr, minr, maxr, maxr, minr)
    ax.plot(bx, by, '-r', linewidth=2)
Or should I apply a projection before these steps? (See the sketch below.)
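For the projection idea, here is a minimal sketch of column-projection segmentation, a common alternative for plates: binarize the plate, count foreground pixels per column, and split characters at near-empty columns. The 0.05 threshold fraction is an assumed starting value, not a tuned one.
import numpy as np
from skimage.filters import threshold_otsu

def segment_by_projection(gray_plate, min_gap=2):
    # Binarize (characters assumed dark on a light plate; invert if needed)
    binary = gray_plate < threshold_otsu(gray_plate)
    # Vertical projection: count of foreground pixels in each column
    col_profile = binary.sum(axis=0)
    is_char = col_profile > 0.05 * binary.shape[0]  # assumed threshold fraction
    # Group consecutive "character" columns into (start, end) spans
    spans, start = [], None
    for x, flag in enumerate(is_char):
        if flag and start is None:
            start = x
        elif not flag and start is not None:
            if x - start >= min_gap:
                spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(is_char)))
    return [binary[:, s:e] for s, e in spans]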
Edit
I believe there is a problem with the normalization of the histogram, since one must divide by the radius of each element.
I am trying to calculate the particle-number fluctuations and the radial distribution function of a 2D Lennard-Jones (LJ) system using Python 3. Although I believe the particle fluctuations come out right, the pair correlation g(r) comes out right for small distances but then blows up (the calculation uses numpy's histogram method).
The thing is, I can't find out why such behavior emerges, perhaps because of some misunderstanding of a method. As it is, I am posting the relevant code right below; if needed, I can also upload other parts of the code or the entire script.
Note first that since we are working in the grand canonical ensemble, as the number of particles changes, so does the array that stores the particles, and perhaps that's another point where an implementation mistake could exist.
Particle removal or insertion
def mcex(L, npart, particles, beta, rho0, V, en):
    factorin = (rho0*V)/(npart + 1)
    factorout = (npart)/(V*rho0)
    print("factorin=", factorin)
    print("factorout", factorout)
    # Produce random number and check:
    rand = random.uniform(0, 1)
    if rand <= 0.5:
        # Insert a particle at a random location
        x_new_coord = random.uniform(0, L)
        y_new_coord = random.uniform(0, L)
        new_particle = [x_new_coord, y_new_coord]
        new_E = particleEnergy(new_particle, particles, npart + 1)
        deltaE = new_E
        print("dEin=", deltaE)
        # Acceptance rule for inserting
        if deltaE > 10:
            P_in = 0
        else:
            P_in = factorin * math.exp(-beta*deltaE)
        print("pinacc=", P_in)
        rand = random.uniform(0, 1)
        if rand <= P_in:
            particles.append(new_particle)
            en += deltaE
            npart += 1
            print("accepted insertion")
    else:
        if npart != 0:
            p = random.randint(0, npart - 1)
            this_particle = particles[p]
            prev_E = particleEnergy(this_particle, particles, p)
            deltaE = prev_E
            print("dEout=", deltaE)
            # Acceptance rule for removing
            if deltaE > 10:
                P_re = 1
            else:
                P_re = factorout * math.exp(beta*deltaE)
            print("poutacc=", P_re)
            rand = random.uniform(0, 1)
            if rand <= P_re:
                particles.remove(this_particle)
                en += deltaE
                npart = npart - 1
                print("accepted removal")
    print()
    return particles, en, npart
Monte Carlo relevant part: for 1 in 10 moves, attempt to insert or remove a particle:
# MC
for step in range(0, runTimes):
    print(step)
    print()
    rand = random.uniform(0, 1)
    if rand <= 0.9:
        # ----------- change energies -------------------------
        # ........
        # ........
        pass
    else:
        particles, en, N = mcex(L, N, particles, beta, rho0, V, en)
    # stepList.append(step)
    if (step + 1) % 1000 == 0:
        for i, particle1 in enumerate(particles):
            for j, particle2 in enumerate(particles):
                if j != i:
                    # print(particle1)
                    # print(particle2)
                    # print(i)
                    # print(j)
                    dist.append(distancesq(particle1, particle2))
        NList.append(N)
We call the function mcex defined above, and perhaps the particles array is not updated correctly there.
Finally, we create the g(r) histogram, where perhaps the normalization or the use of the histogram method is not as it should be:
RDF(N,particles,L)
with the function:
def RDF(N, particles, L):
    minb = 0
    maxb = 8
    nbin = 500
    skata = np.asarray(dist).flatten()
    rDf = np.histogram(skata, np.linspace(minb, maxb, nbin))
    prefactor = (1/2/np.pi) * (L**2/N**2) / len(dist) * (nbin/(maxb - minb))
    # prefactor = (1/(2*np.pi))*(L**2/N**2)/(len(dist)*num_increments/(rMax + 1.1*dr))
    rDf = [prefactor*rDf[0], 0.5*(rDf[1][1:] + rDf[1][:-1])]
    print('skata', len(rDf[0]))
    print('incr', len(rDf[1]))
    plt.figure()
    plt.plot(rDf[1], rDf[0])
    plt.xlabel("r")
    plt.ylabel("g(r)")
    plt.show()
The results are: [plot: particle number N fluctuations] and [plot: the computed g(r), which blows up at larger r], but we want [plot: the expected g(r) shape].
Although I have accepted an answer, I am posting some more details here.
To normalize the pair correlation correctly, one must divide each "number of particles found at a certain distance" (mathematically, the sum of delta functions of the distances) by the distance itself.
Understanding first that a numpy histogram is a pair of arrays, the first being the counts in each bin and the second the vector of bin edges, one must take each element of the counts array, np.histogram[0], and combine it pairwise with the bin midpoints derived from np.histogram[1].
That is, one must do the following:
def RDF(N, particles, L):
    minb = 0
    maxb = 25
    nbin = 200
    width = (maxb - minb)/nbin
    rings = np.linspace(minb, maxb, nbin)
    skata = np.asarray(dist).flatten()
    rDf = np.histogram(skata, rings, density=True)
    prefactor = 1/(np.pi*(L**2/N**2))
    rDf = [prefactor*rDf[0], 0.5*(rDf[1][1:] + rDf[1][:-1])]
    rDf[0] = np.multiply(rDf[0], 1/(rDf[1]*width))
Before the last multiply line, we center the bins so that their number equals the number of elements of the counts array (you have five fingers, but only four gaps between them).
Your g(r) is not correctly normalised. You need to divide the number of pairs found in each bin by the average density of the system times the area of the annulus associated with that bin, where the latter is just 2πr·dr, with r being the bin's midpoint and dr the bin size. As far as I can tell, your prefactor does not contain the r bit. There is also something else missing, but it's hard to tell without knowing what all the other constants contain.
EDIT: here is a link that will guide you through implementing a routine to compute the radial distribution function in 2D and 3D.
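To make that normalisation concrete, here is a minimal self-contained 2D sketch; dist is assumed to hold plain (not squared) pair distances collected over n_snapshots configurations, and N_avg and L are the average particle number and box side (names are illustrative, not from the original script):
import numpy as np

def rdf_2d(dist, N_avg, L, r_max, nbins=200, n_snapshots=1):
    # Count ordered pairs in each radial bin
    counts, edges = np.histogram(np.asarray(dist), bins=nbins, range=(0.0, r_max))
    r = 0.5*(edges[1:] + edges[:-1])   # bin midpoints
    dr = edges[1] - edges[0]           # bin width
    shell_area = 2.0*np.pi*r*dr        # area of the annulus at radius r
    rho = N_avg/L**2                   # average number density
    # Expected ordered-pair count per shell for an ideal gas:
    # each of the N_avg particles sees rho*shell_area neighbours on average
    ideal = N_avg * rho * shell_area * n_snapshots
    return r, counts/ideal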
import numpy as np

# Zero-pad an image
def zero_pad(image, pad_height, pad_width):
    H, W = image.shape
    out = np.zeros((H + 2*pad_height, W + 2*pad_width))
    out[pad_height:H + pad_height, pad_width:W + pad_width] = image
    return out

# A step-by-step implementation of a convolution filter
def conv(image, kernel):
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))
    image_pad = zero_pad(image, (Hk - 1)//2, (Wk - 1)//2)
    for i in range(Hi):
        for j in range(Wi):
            out[i, j] = np.sum(kernel*image_pad[i:Hk + i, j:Wk + j])
    return out

# Accelerate convolution using FFT
def conv_faster(image, kernel):
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))
    # Expand image and kernel by zero-padding
    if (Hi + Hk) % 2 == 0:
        x = Hi + Hk
    else:
        x = (Hi + Hk) - 1
    if (Wi + Wk) % 2 == 0:
        y = Wi + Wk
    else:
        y = (Wi + Wk) - 1
    image_pad = np.zeros((x, y))
    kernel_pad = np.zeros((x, y))
    image_pad[0:Hi, 0:Wi] = image
    kernel_pad[0:Hk, 0:Wk] = kernel
    # Put the image and kernel at the center of the frequency domain
    for p in range(Hi):
        for q in range(Wi):
            image_pad[p, q] *= (-1)**(p + q)
    for p in range(Hk):
        for q in range(Wk):
            kernel_pad[p, q] *= (-1)**(p + q)
    # FFT of the image and the kernel
    image_pad_fft = np.fft.fft2(image_pad)
    kernel_pad_fft = np.fft.fft2(kernel_pad)
    # Keep only the imaginary part of the kernel's transform
    kernel_pad_fft = 1j*np.imag(kernel_pad_fft)
    # Multiply the two in the frequency domain
    out_pad = np.fft.ifft2(image_pad_fft*kernel_pad_fft)
    # Take the real part of the result
    out = np.real(out_pad[0:Hi, 0:Wi])
    # Counteract the preceding centralization
    for p in range(Hi):
        for q in range(Wi):
            out[p, q] *= (-1)**(p + q)
    return out
There are some differences between their outputs. I think the results should be the same; how can I refine the code? Thanks!
The kernel is [[1, 0, -1], [2, 0, -2], [1, 0, -1]].
I input this image into the two functions.
The step-by-step function produces this result:
The accelerated function produces this result:
For a convolution, the kernel must be flipped; what you do in conv() is a correlation.
Since your kernel is symmetric apart from a minus sign (flipping it is the same as negating it), result2 = -result1 in your current results.
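A minimal sketch of the fix, assuming the conv() defined above and a numpy-array kernel: flip the kernel along both axes before the sliding-window sum, which turns the correlation into a true convolution.
import numpy as np

def conv2(image, kernel):
    # True convolution: flip the kernel along both axes, then correlate
    return conv(image, np.asarray(kernel)[::-1, ::-1])

# Optional sanity check against SciPy, if it is available:
# from scipy.signal import convolve2d
# assert np.allclose(conv2(image, kernel),
#                    convolve2d(image, kernel, mode='same', boundary='fill'))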