How to fix the issue of plotting a 2D sine wave in Python

I want to generate a 2D travelling sine wave. To do this, I set the parameters of the plane wave and generate the wave for any time instant as follows:
import numpy as np
import matplotlib.pyplot as plt

f = 10  # frequency
fs = 100  # sample frequency
Ts = 1/fs  # sample period
t = np.arange(0, 0.5, Ts)  # time index
c = 50  # speed of wave
w = 2*np.pi*f  # angular frequency
k = w/c  # wave number
resolution = 0.02
x = np.arange(-5, 5, resolution)
y = np.arange(-5, 5, resolution)
M = len(x)
N = len(y)
xx, yy = np.meshgrid(x, y)
theta = np.pi / 4  # direction of propagation
kx = k * np.cos(theta)
ky = k * np.sin(theta)
So, the plane wave would be
plane_wave = np.sin(kx * xx + ky * yy - w * t[1])
plt.figure()
plt.imshow(plane_wave, cmap='seismic', origin='lower', aspect='auto')
This gives a smooth plane wave, as shown in figure 01. The variation of the sine wave with time, plotted with plt.figure(); plt.plot(plane_wave[2, :]), is shown in figure 02.
However, when I append plane waves at different time instants, a discontinuity arises (figures 03 & 04), and I want to get rid of this problem.
I'm new to Python and any help will be highly appreciated. Thanks in advance.
arr = []
for count in range(len(t)):
    p = np.sin(kx * xx + ky * yy - w * t[count])  # plane wave at time t[count]
    arr.append(p)
arr = np.array(arr)
print(arr.shape)
pp, q, r = arr.shape
sig = np.reshape(arr, (-1, r))
print('The signal shape is :', sig.shape)
plt.figure(); plt.imshow(sig.transpose(), cmap='seismic', origin='lower', aspect='auto')
plt.xlabel('X'); plt.ylabel('Y')
plt.figure(); plt.plot(sig[2, :])

This is not so much a programming problem; it has more to do with the fact that you are using the physical quantities in a somewhat unusual way. Your plots are absolutely fine and correct.
What you seem to have misunderstood is that you are dealing with a 2D problem plus a third dimension for time. This is by no means wrong, but if you append snapshots of the 2D wave side by side, you are reusing the x spatial dimension to represent temporal variations, which makes the use of that coordinate axis inconsistent.
To make this more intuitive, consider two time instants separately. Does it not match your intuition that every point on the 2D plane must have a different amplitude at the two instants (unless, of course, time has progressed by a multiple of the period of the wave)? This is indeed the case, so when you append the two snapshots a discontinuity appears at the seam. To avoid it you would have to use either a time step equal to one period, which I believe is of no practical use, or a constant time step that makes the phase of the wave on the left border of the image at the current time equal to the phase on the right border of the image at the previous time step (see the sketch below). Yet this will always be a constant time step, alternating the phase on the edges of the image between those two values.
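For concreteness, here is a minimal sketch of how such a matching time step could be computed. The derivation is mine, not part of the original answer; it follows the left/right-border framing above, with kx, w and the x range taken from the question's code:
import numpy as np

# Values from the question's code.
f, c, theta = 10, 50, np.pi / 4
w = 2 * np.pi * f
kx = (w / c) * np.cos(theta)
x_min, x_max = -5, 5

# Continuity at the seam requires the phase at the right border at time t,
# kx*x_max - w*t, to equal the phase at the left border at time t + dt,
# kx*x_min - w*(t + dt), modulo 2*pi.  Solving for the smallest positive dt:
dt = (-kx * (x_max - x_min)) % (2 * np.pi) / w
print(dt)  # about 0.059 s here, versus a wave period of 0.1 s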
The same applies to the 1D case, because there you also use the two coordinate axes to represent the wave (x is the x spatial dimension and y is used to represent the amplitude). This is what can be seen in your last plot.
Now, what would be the solution, you may ask. The solution is provided by simple inspection of the mathematical formula of the wave function. In 2D it is a scalar function of three variables (it takes three values as input and outputs one), so you need at least four dimensions to represent it. Alas, we cannot perceive a fourth spatial dimension, but this is not a problem in your case, since the output of the function is represented with colors. That leaves a third spatial dimension that can be used to represent the temporal evolution of your function. All you have to do is create a 3D array whose third dimension represents time, with the 2D snapshots stored in the first two dimensions.
When it comes to visualising the results, you could either use some kind of waterfall plot, where the z-axis represents time, or use the fourth dimension we can perceive, namely time itself, to create an animation of the evolution of the wave.
I am not very familiar with Python, so I will only provide a generic, naive implementation. I am sure a lot of people here could simplify and/or optimise the following snippet. I assume that everything in your first two blocks of code is available, so changes are needed only in the last block you present:
arr = np.zeros((*xx.shape, len(t)))  # initialise the array holding the temporal evolution of the snapshots
# (xx.shape rather than (len(xx), len(yy)): len() of a 2D array only counts its rows)
for i in range(len(t)):
    arr[:, :, i] = np.sin(kx * xx + ky * yy - w * t[i])
# Below you can plot the figures with any function you prefer or make an animation out of it
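For instance, here is a minimal animation sketch using matplotlib's FuncAnimation. This is my addition, not part of the original answer; it assumes arr and t from the snippets above and matplotlib.pyplot imported as plt:
import matplotlib.animation as animation

fig, ax = plt.subplots()
im = ax.imshow(arr[:, :, 0], cmap='seismic', origin='lower', aspect='auto')

def update(i):
    im.set_data(arr[:, :, i])  # swap in the i-th snapshot
    ax.set_title('t = {:.2f} s'.format(t[i]))
    return [im]

ani = animation.FuncAnimation(fig, update, frames=len(t), interval=50)
plt.show()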

Related

Detecting duplicate audio files

I have snippets of audio that are almost the same, which I want to group together (samples 5 and 3 below). There are other portions that are similar but differ (3 and 4; there is a double drum hit at the end of 3), and completely different ones (sample 8).
How can I group together samples that are almost the same? I tried taking the difference (attempting to minimize it), but that does not work since they are not aligned. I also tried to take audio features like pitch distribution, but since the sounds are similar in pitch those don't get separated well.
The files are available here: https://drive.google.com/drive/folders/14UQQDfIBUNRO_1Pv8bkPf9noi86M7lKd
Here's something that appears to work for the data you are using but may (likely does) have weaknesses when it comes to other data or other sorts of data. But maybe it will be helpful nonetheless.
The basic idea of this solution is to compute the MFCCs of each of the samples to get feature vectors and then find a distance (here just using basic Euclidean distance) between those feature sets with the assumption (which seems to be true for your data) that the least similar samples will have a large distance and the closest will have the least. Here's the code:
import librosa
import scipy.spatial  # makes scipy.spatial.distance available
import matplotlib.pyplot as plt
sample3, rate = librosa.load('sample3.wav', sr=None)
sample4, rate = librosa.load('sample4.wav', sr=None)
sample5, rate = librosa.load('sample5.wav', sr=None)
sample8, rate = librosa.load('sample8.wav', sr=None)
# cut the longer sounds to same length as the shortest
len5 = len(sample5)
sample3 = sample3[:len5]
sample4 = sample4[:len5]
sample8 = sample8[:len5]
mf3 = librosa.feature.mfcc(y=sample3, sr=rate)
mf4 = librosa.feature.mfcc(y=sample4, sr=rate)
mf5 = librosa.feature.mfcc(y=sample5, sr=rate)
mf8 = librosa.feature.mfcc(y=sample8, sr=rate)
# average across the frames. dubious?
amf3 = mf3.mean(axis=0)
amf4 = mf4.mean(axis=0)
amf5 = mf5.mean(axis=0)
amf8 = mf8.mean(axis=0)
f_list = [amf3, amf4, amf5, amf8]
results = []
for i, features_a in enumerate(f_list):
    results.append([])
    for features_b in f_list:
        result = scipy.spatial.distance.euclidean(features_a, features_b)
        results[i].append(result)
plt.ion()
fig, ax = plt.subplots()
ax.imshow(results, cmap='gray_r', interpolation='nearest')
spots = [0, 1, 2, 3]
labels = ['s3', 's4', 's5', 's8']
ax.set_xticks(spots)
ax.set_xticklabels(labels)
ax.set_yticks(spots)
ax.set_yticklabels(labels)
The code plots a heatmap of the distances between all the samples. The code is lazy, so it recomputes both the elements that are symmetric across the diagonal (which are identical) and the diagonal itself (which should be zero distance), but those serve as sanity checks: it is nice to see white down the diagonal and to see that the matrix is symmetric.
The real information is that clip 8 is black against all the other clips (i.e. furthest from them) and clip 3 and clip 5 are the least distant from one another.
This basic idea could be done with a feature vector generated in a different sort of way (e.g. instead of MFCCs, you could use the embeddings from something like YAMNet) or with a different way of finding a distance between the feature vectors.
For the grouping part of what you want to do, you could experimentally work out a threshold on the distance metric below which you would consider a clip to be in the same group as another. With more clips, you could compute all these distances and then hand that distance matrix over to a clustering algorithm (like HDBSCAN) to cluster the clips.
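As a sketch of that clustering step (my addition; the threshold value is a placeholder to be tuned, and results is the distance matrix built above), scipy's hierarchical clustering can consume a precomputed distance matrix directly:
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

dist = np.array(results)  # symmetric distance matrix from the code above
condensed = squareform(dist, checks=False)  # condensed form expected by linkage
Z = linkage(condensed, method='average')

threshold = 50.0  # hypothetical cut-off; tune against clips you know are similar
labels = fcluster(Z, t=threshold, criterion='distance')
print(labels)  # clips sharing a label fall into the same group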

using Geopandas, How to randomly select in each polygon 5 Points by sampling method

I want to select 5 points in each polygon based on a random sampling method, and I need the coordinates (lat, long) of the 5 points in each polygon in order to identify which crop is grown there.
Any ideas how to do this using geopandas?
Many thanks.
My suggestion involves sampling random x and y coordinates within the shape's bounding box and then checking whether the sampled point is actually within the shape. If the sampled point is within the shape then return it, otherwise repeat until a point within the shape is found. For sampling, we can use the uniform distribution, such that all points in the shape have the same probability of being sampled. Here is the function:
import numpy as np
from shapely.geometry import Point

def random_point_in_shp(shp):
    within = False
    while not within:
        x = np.random.uniform(shp.bounds[0], shp.bounds[2])
        y = np.random.uniform(shp.bounds[1], shp.bounds[3])
        within = shp.contains(Point(x, y))
    return Point(x, y)
and here's an example of how to apply this function to an example GeoDataFrame called geo_df to get 5 random points for each entry:
for num in range(5):
    geo_df['Point{}'.format(num)] = geo_df['geometry'].apply(random_point_in_shp)
There might be more efficient ways to do this, but depending on your application the algorithm could be sufficiently fast. With my test file, which contains ~2300 entries, generating five random points for each entry took around 15 seconds on my machine.
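As a possible shortcut (my addition, not part of the original answer): recent geopandas versions (0.12+, with shapely 2.0) provide GeoSeries.sample_points, which performs this uniform within-polygon sampling directly and is typically much faster:
# Sample 5 uniformly distributed points per polygon; the result is one
# MultiPoint per row, which explode() turns into individual points.
points = geo_df.geometry.sample_points(5)
individual_points = points.explode(index_parts=True)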

How can I compute (for later use) a wave with a very high frequency?

I'm running a physics simulation related to visible light, and the resulting wave function has a very, very high frequency -- cyclic frequency is on the order of 1.0e15, and the spatial frequency k is on the order of 1.0e7. Thankfully, I only use the spatial frequency, but when I calculate it for later usage (using either math or numpy), I get something that resembles a beat wave, unless I use N ~= k sample points, because I have to calculate it over a much greater range (on the order of 1.0e-3 - 1.0e-1). It produces a beat wave so consistently I spent a few hours to make sure I'm not actually calculating one. I'll also have to use fft() on the resulting wave and I'm afraid it won't work properly with a misrepresented wave.
I've tried using various amounts of sample points, but unless it's extraordinarily high (takes a good minute or two to calculate), only the prominence of beating changes. Just in case I'm misusing numpy, I tried the same thing with appending wave.value calculated by math.sin to a float array, but it had the same result.
import numpy as np
import matplotlib.pyplot as plt

mmScale = 1.0e-3
nmScale = 1.0e-9
c = 3.0e8
N = 1000

class Wave:
    def __init__(self, amplitude, wavelength):
        self.wavelength = wavelength * nmScale
        self.amplitude = amplitude
        self.omega = 2 * np.pi * c / self.wavelength
        self.k = 2 * np.pi / self.wavelength

    def value(self, time, travel):
        return self.amplitude * np.sin(self.omega * time - self.k * travel)

x = np.linspace(50, 250, N) * mmScale
wave = Wave(1, 400)
y = wave.value(0.1, x)
plt.plot(x, y)
plt.show()
The code above produces a graph of the function, and you can put in different values for N to see how it gives different waveforms.
Your sampling spatial frequency is:
1/Ts = N / ((250 - 50) * mmScale) = 1000 / 0.2 = 5000 [samples/meter]
Your wave's spatial frequency is:
1/Tw = 1 / wavelength = 1 / (400e-9) = 2500000 [wavelengths/meter]
You fail to satisfy the Nyquist criterion by a factor of (2 * 2500000) / 5000 = 1000.
Thus you must expect serious aliasing effects. See https://en.wikipedia.org/wiki/Aliasing.
Not much can be done to battle it, but there are some tricks that may help, depending on the application. One is to represent the wave as a complex envelope around the carrier, here the 400 nm (400e-9 m) wavelength component. Please provide more detail on what you do with the wave.
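To illustrate the envelope idea, here is a minimal sketch of my own, under the assumption that any modulation of interest varies slowly compared with the carrier:
import numpy as np

# Instead of sampling sin(k0*x) directly, write the field as
# Re{ E(x) * exp(1j*k0*x) } and store only the slowly varying envelope E(x),
# which needs far fewer samples than the carrier itself.
k0 = 2 * np.pi / 400e-9  # carrier spatial frequency (400 nm wave)
x = np.linspace(50e-3, 250e-3, 1000)  # same range as in the question

# Hypothetical slow modulation (e.g. a beam profile); this is all that
# actually needs to be sampled and stored.
envelope = np.exp(-((x - 0.15) / 0.05) ** 2)

# The full field can be reconstructed analytically at any x where needed:
# field = np.real(envelope * np.exp(1j * k0 * x))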

Point of change in the slope of the signal

Following is the graph you get on plotting the given data points. There is an exact point where the slope changes before giving a stable line. What we have done is obtain the first derivative and look for a point where the slope transitions from positive to negative values. But for many data points, such a transition was not found. So is there a better method to do this?
How do you find that point (marked as a red circle in the graph) using slope in python?
[graph of the signal]
[-0.0006029533498891765, -0.0005180378648295125, -0.0004122940532457625, -0.0002953349889182749, -0.00018692087906219124, -0.00010093727469359659, -4.699724959278395e-05, -1.602178963390488e-05, -5.340596544722853e-07, 9.079014125876195e-06, 1.976020721514149e-05, 3.0441400304406785e-05, 3.845229512135229e-05, 4.3258832011533466e-05, 4.432695132046416e-05, 4.592913028383938e-05, 5.020160751956215e-05, 5.6076263718660146e-05, 5.9814681299896755e-05, 6.195091991774426e-05, 6.408715853560565e-05, 6.568933749899475e-05, 6.889369542577295e-05, 7.209805335256503e-05, 7.370023231594025e-05]
This problem requires some additional specification. The core question is: what do you mean by "a stable line"? One potential definition is "consecutive line segments with the exact same slope." However, since the slopes of successive segments are unlikely to be exactly equal, this may not be helpful.
Another potential definition is "consecutive line segments whose slopes differ by less than a defined cut-off value." Whenever we're talking about a difference of slopes, we want to look at the second derivative. We can identify transition points by finding where the absolute value of the second derivative is less than the chosen cut-off value at each point.
The question then becomes, what cut-off value is acceptable? Since you want a method that classifies the 10th point as a transition point, I'll use that to inform the decision.
Here is code that defines a cut-off value and uses that to identify points with nearly similar slopes:
data = [-0.0006029533498891765, -0.0005180378648295125, -0.0004122940532457625, -0.0002953349889182749, -0.00018692087906219124, -0.00010093727469359659, -4.699724959278395e-05, -1.602178963390488e-05, -5.340596544722853e-07, 9.079014125876195e-06, 1.976020721514149e-05, 3.0441400304406785e-05, 3.845229512135229e-05, 4.3258832011533466e-05, 4.432695132046416e-05, 4.592913028383938e-05, 5.020160751956215e-05, 5.6076263718660146e-05, 5.9814681299896755e-05, 6.195091991774426e-05, 6.408715853560565e-05, 6.568933749899475e-05, 6.889369542577295e-05, 7.209805335256503e-05, 7.370023231594025e-05]
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
# Generate figure
plt.figure()
plt.subplot(2,1,1)
plt.title('Data')
plt.plot(data)
plt.scatter(np.arange(len(data)), data)
plt.scatter(9, data[9], c='r') # Identify 10th point
plt.subplot(2,1,2)
plt.title('First Derivative')
deriv1 = data - np.roll(data, -1) # Use simple difference to compute the derivative
deriv1 = deriv1[0:-1] # Remove the last point
plt.plot(deriv1)
plt.scatter(np.arange(len(deriv1)), deriv1)
plt.scatter(9, deriv1[9], c='r') # Identify 10th point
plt.tight_layout()
# Approximate second derivative
deriv2 = deriv1 - np.roll(deriv1, -1) # Use simple difference to compute the derivative
deriv2 = deriv2[0:-1] # Remove the last point
# Plot data
plt.figure()
plt.subplot(2,1,1)
plt.title('Second Derivative')
x = np.arange(len(deriv2))
plt.plot(deriv2)
plt.scatter(x, deriv2)
plt.scatter(9, deriv2[9], c='r') # Identify 10th point
plt.subplot(2,1,2)
plt.title('Absolute Value of Second Derivative')
y = np.abs(deriv2)
plt.plot(x, y)
plt.scatter(x, y)
plt.scatter(9, y[9], c='r') # Identify 10th point
# Correctly scale y axis
diff = max(y) - min(y)
scale = 0.1*diff
plt.ylim(min(y)-scale, max(y)+scale)
# Define cutoff value
cutoff = 1e-17
# Identify points where abs(deriv2) < cutoff
idx_filter = y <= cutoff
plt.axhline(y = cutoff, c='r', linestyle='--', alpha=0.5)
plt.scatter(x[idx_filter], y[idx_filter], s=200, edgecolor='r', facecolor='none')
plt.tight_layout()
As it turns out, there ARE two line segments with precisely the same slope. The 10th point identifies the start of them. The following code finds that transition point concisely, and should work for data with exactly one such point. It can be adapted to find multiple transition points if needed.
# Compute the first derivative
deriv = data - np.roll(data, -1) # Use simple difference to compute the derivative
deriv = deriv[0:-1] # Remove the last point
# Compute the second derivative
deriv2 = deriv - np.roll(deriv, -1) # Use simple difference to compute the derivative
deriv2 = deriv2[0:-1] # Remove the last point
# Define cutoff value
cutoff = 1e-17
# Identify points where abs(deriv2) < cutoff
idx_filter = np.abs(deriv2) <= cutoff
x_transition = int(np.arange(len(deriv2))[idx_filter][0])
y_transition = data[x_transition]
print('Transition Point Index: '+str(x_transition))
print('Transition Point Value: '+str(y_transition))
print('Difference in slopes: {:.20f}'.format(deriv2[x_transition]))
>>> Transition Point Index: 9
>>> Transition Point Value: 9.079014125876195e-06
>>> Difference in slopes: 0.00000000000000000000
Since there were no x-values provided, the derivative approximation is simplified by assuming that the x-distance between each successive point is 1. Addition of x-data would require slight modifications to the derivative approximations, and another derivative approximation method may be more appropriate if the data is unevenly distributed along the x-axis.
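If x-data does become available, one option (my suggestion, not part of the original answer) is np.gradient, which handles unevenly spaced samples directly:
import numpy as np

# Hypothetical x-values for the data points; replace with the real ones.
x_vals = np.linspace(0.0, 2.4, len(data))

deriv1 = np.gradient(data, x_vals)  # first derivative dy/dx
deriv2 = np.gradient(deriv1, x_vals)  # second derivative, for the same cut-off test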

Cosmic ray removal in spectra

Python developers
I am working on spectroscopy at a university. My experimental 1-D data sometimes shows "cosmic rays": 3-pixel, ultra-high-intensity spikes, which are not what I want to analyze. So I want to remove these weird peaks.
Does anybody know how to fix this issue in Python 3?
Thanks in advance!!
A simple solution could be to use the algorithm proposed by Whitaker and Hayes, in which they use modified z-scores on the derivative of the spectrum. This Medium post explains how it works and its implementation in Python: https://towardsdatascience.com/removing-spikes-from-raman-spectra-8a9fdda0ac22
The idea is to calculate the modified z-scores of the spectrum's derivative and apply a threshold to detect the cosmic spikes. Afterwards, a fixer is applied to remove the spike points and replace them with the mean of the surrounding pixels.
import numpy as np

# Definition of a function to calculate the modified z-score.
def modified_z_score(intensity):
    median_int = np.median(intensity)
    mad_int = np.median(np.abs(intensity - median_int))
    modified_z_scores = 0.6745 * (intensity - median_int) / mad_int
    return modified_z_scores

# Once the spike detection works, the spectrum can be fixed by averaging the
# points surrounding each spike. y is the intensity values of a spectrum,
# m is the half-width of the window used to calculate the mean.
def fixer(y, m):
    threshold = 7  # binarization threshold
    spikes = abs(np.array(modified_z_score(np.diff(y)))) > threshold
    y_out = y.copy()  # so we don't overwrite y
    for i in np.arange(len(spikes)):
        if spikes[i] != 0:  # if we have a spike at position i
            w = np.arange(i - m, i + 1 + m)  # select 2m + 1 points around the spike
            w2 = w[spikes[w] == 0]  # from that interval, keep those which are not spikes
            y_out[i] = np.mean(y[w2])  # and average their values
    return y_out
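A hypothetical usage sketch (the file name and window size are my assumptions, not from the original answer):
import numpy as np
import matplotlib.pyplot as plt

spectrum = np.loadtxt('spectrum.txt')  # assumed 1-D intensity array
despiked = fixer(spectrum, m=3)  # m=3 -> up to 7-point averaging window

plt.plot(spectrum, label='raw')
plt.plot(despiked, label='despiked')
plt.legend()
plt.show()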
The answer depends on what your data looks like:
If you have access to the two-dimensional CCD readouts that the one-dimensional spectra were created from, then you can use the lacosmic module to get rid of the cosmic rays there.
If you have only one-dimensional spectra, but multiple spectra from the same source, then a quick ad-hoc fix is to roughly normalise the spectra and remove those pixels that are several times brighter than the corresponding pixels in the other spectra.
If you have only one one-dimensional spectrum from each source, then a less reliable option is to remove all pixels that are much brighter than their neighbours. (Depending on the shape of your cosmics, you may even want to remove the nearest 5 pixels or so, to catch the wings of the cosmic-ray peak as well.)
