I am hoping to get some help understanding how and where to insert my own binary string, which I generated, in order to test it for randomness with the linear complexity test. I am very new to coding and would appreciate any help. I have included the code I was using below, as I was unsuccessful in running the test.
Thanks in advance!
from copy import copy
from numpy import dot, histogram, zeros
from scipy.special import gammaincc
class ComplexityTest:
    @staticmethod
    def linear_complexity_test(my_binary_string: str, verbose=False, block_size=4):
"""
Note that this description is taken from the NIST documentation [1]
[1] http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
The focus of this test is the length of a linear feedback shift register (LFSR). The purpose of this test is to
determine whether or not the sequence is complex enough to be considered random. Random sequences are
characterized by longer LFSRs. An LFSR that is too short implies non-randomness.
:param my_binary_string: a binary string
        :param verbose: True to display debug messages, False to turn them off
        :param block_size: size of each block
        :return: (p_value, bool) a tuple containing the p-value and the test result (True or False)
"""
        # NOTE: this hard-coded string overrides whatever is passed in as
        # my_binary_string; remove this assignment to test your own sequence.
        my_binary_string = '0010101100010010100010011110110101111010011111110001111001101101'
length_of_my_binary_string = len(my_binary_string)
# The number of degrees of freedom;
# K = 6 has been hard coded into the test.
degree_of_freedom = 6
# π0 = 0.010417, π1 = 0.03125, π2 = 0.125, π3 = 0.5, π4 = 0.25, π5 = 0.0625, π6 = 0.020833
# are the probabilities computed by the equations in Section 3.10
        pi = [0.010417, 0.03125, 0.125, 0.5, 0.25, 0.0625, 0.020833]
t2 = (block_size / 3.0 + 2.0 / 9) / 2 ** block_size
mean = 0.5 * block_size + (1.0 / 36) * (9 + (-1) ** (block_size + 1)) - t2
number_of_block = int(length_of_my_binary_string / block_size)
if number_of_block > 1:
block_end = block_size
block_start = 0
            blocks = []
            for i in range(number_of_block):
                blocks.append(my_binary_string[block_start:block_end])
                block_start += block_size
                block_end += block_size
complexities = []
for block in blocks:
complexities.append(ComplexityTest.berlekamp_massey_algorithm(block))
t = ([-1.0 * (((-1) ** block_size) * (chunk - mean) + 2.0 / 9) for chunk in complexities])
vg = histogram(t, bins=[-9999999999, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 9999999999])[0][::-1]
im = ([((vg[ii] - number_of_block * pi[ii]) ** 2) / (number_of_block * pi[ii]) for ii in range(7)])
xObs = 0.0
for i in range(len(pi)):
xObs += im[i]
            # P-value = igamc(K/2, xObs/2); igamc is the complemented (upper)
            # incomplete gamma function, which scipy provides as gammaincc
            p_value = gammaincc(degree_of_freedom / 2.0, xObs / 2.0)
if verbose:
print('Linear Complexity Test DEBUG BEGIN:')
print("\tLength of input:\t", length_of_binary_data)
print('\tLength in bits of a block:\t', )
print("\tDegree of Freedom:\t\t", degree_of_freedom)
print('\tNumber of Blocks:\t', number_of_block)
print('\tValue of Vs:\t\t', vg)
print('\txObs:\t\t\t\t', xObs)
print('\tP-Value:\t\t\t', p_value)
print('DEBUG END.')
return (p_value, (p_value >= 0.01))
else:
return (-1.0, False)
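The code above also calls ComplexityTest.berlekamp_massey_algorithm, which wasn't included in what I posted. For the test to run, the class needs that method too; a minimal sketch of the standard Berlekamp-Massey algorithm over GF(2) (a generic textbook reconstruction, not necessarily the method from the original source) would be:
    @staticmethod
    def berlekamp_massey_algorithm(block_data):
        # Returns the linear complexity (shortest LFSR length) of a binary string.
        n = len(block_data)
        s = [int(bit) for bit in block_data]
        c = [0] * n  # current connection polynomial
        b = [0] * n  # previous connection polynomial
        c[0], b[0] = 1, 1
        l, m = 0, -1  # current LFSR length and index of the last length change
        for i in range(n):
            # discrepancy between the LFSR prediction and the actual bit
            d = s[i]
            for j in range(1, l + 1):
                d ^= c[j] & s[i - j]
            if d == 1:
                t = c[:]
                for j in range(n - i + m):
                    c[j + i - m] ^= b[j]
                if l <= i // 2:
                    l, m, b = i + 1 - l, i, t
        return l
With that in place, and with the hard-coded assignment at the top of linear_complexity_test removed, you can test your own sequence by passing it in directly:
p_value, passed = ComplexityTest.linear_complexity_test('0010101100010010100010011110110101111010011111110001111001101101', verbose=True)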
question up front
def shade_func(color, offset):
return tuple([int(c * (1 - offset)) for c in color])
def tint_func(color, offset):
return tuple([int(c + (255 - c) * offset) for c in color])
def tone_func(color, offset):
return tuple([int(c * (1 - offset) + 128 * offset) for c in color])
Given an objective over a collection of colors that returns the least distance to a target color, how do I ensure that basinhopping isn't better than plain minimize in scipy?
I was thinking that, for any one color, there will be up to 4 moments in a v-shaped curve, and so only one minimum. If the value at offset zero is itself a minimum, maybe it could be 5. Am I wrong? In any case each is a single optimum, so if we are only searching one color at a time, there is no reason to use basinhopping.
If we instead use basinhopping to scan all colors at once (we can scale the two different dimensions; in fact, this is where the idea of a preprocessor function first came from), it scans them, but does not do such a compelling job of scanning all colors. Some colors it only tries once. I think it might completely skip some colors with large enough sets.
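One way to sanity-check the single-minimum intuition is to brute-force the offset axis for one color and count strict local minima (a rough sketch using the shade/tint/tone functions above with a plain squared-distance metric instead of CIEDE2000, so the metric and the sample colors here are assumptions):
import numpy as np

def sq_dist(a, b):
    # simple squared euclidean distance as a stand-in metric
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

color = (224, 176, 255)   # mauve
target = (120, 94, 136)   # an arbitrary darker target
offsets = np.linspace(-1, 1, 2001)
values = [min(sq_dist(target, shade_func(color, abs(o))),
              sq_dist(target, tint_func(color, abs(o))),
              sq_dist(target, tone_func(color, o)))
          for o in offsets]
# count strict interior local minima; the int() rounding in the transforms
# creates flat plateaus, so treat the count as indicative rather than exact
minima = [i for i in range(1, len(values) - 1)
          if values[i - 1] > values[i] < values[i + 1]]
print(len(minima))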
details
I was inspired by the way artyclick shows colors and allows searching for them. If you look at an individual color, for example mauve, you'll notice that it prominently displays the shades, tints, and tones of the color, rather as an artist might like. If you ask it for the name of a color, it will use a hidden unordered list of about a thousand color names, and some javascript, to find the nearest color name to the color you chose. In fact, it will also show alternatives.
I noticed that quite often a shade, tint or tone of an alternative (or even of the best match) was a better match than the color it provided. For those who don't know about shade, tint and tone, there's a nice write-up at Dunn-Edwards Paints. It looks like shade and tint are the same but with signs reversed, if doing this on tuples representing colors. For tone it is different; a negative value would, I think, saturate the result.
I felt like there must be authoritative (or at least well-sourced) color-name lists it could be using.
In terms of the results, since I want any color or its shade/tint/tone, I want a result like this:
{'color': '#aabbcc',
'offset': {'type': 'tint', 'value': 0.31060384614807254}}
So I can return the actual color name from the color, plus the type of color transform to get there and the amount you have to go.
For the distance between colors, there is a great algorithm meant to model human perception, called CIEDE 2000, which I am using. Frankly, I'm just using a snippet I found that implements it; it could be wrong.
So now I want to take in two colors, compare their shade, tint, and tone to a target color, and return the one with the least distance. After I am done, I can reconstruct whether it was a shade, tint or tone transform from the result just by running all three once and choosing the best fit. With that structure, I can iterate over every color, and that should do it. I use optimization because I don't want to hard-code which offsets it should consider (though I am reconsidering this choice now!).
Because I want to consider negatives for tone but not for shade/tint, my objective will have to transform that. I have to include two values to optimize, since the objective function will need to know which color to transform (or else the result will give me no way of knowing which color to use the offset with).
So my call should look something like the following:
result = min((minimize(objective, (i, 0), bounds=[(i, i), (-1, 1)]) for i in range(len(colors))), key=lambda r: r.fun)
offset_type = resolve_offset_type(result)
With that in mind, I implemented the solution below over the past couple of days.
current solution
from scipy.optimize import minimize
import numpy as np
import math
def clamp(low, x, high):
return max(low, min(x, high))
def hex_to_rgb(hex_color):
hex_color = hex_color.lstrip('#')
return tuple(int(hex_color[i:i+2], 16) for i in (0, 2, 4))
def rgb_to_hex(rgb):
return '#{:02x}{:02x}{:02x}'.format(*rgb)
def rgb_to_lab(color):
    # Convert sRGB to linear RGB, then to XYZ. Note that this stops at XYZ:
    # the XYZ -> Lab step is never applied, so the "lab" tuples used below are
    # really XYZ coordinates.
R = color[0] / 255.0
G = color[1] / 255.0
B = color[2] / 255.0
R = ((R + 0.055) / 1.055) ** 2.4 if R > 0.04045 else R / 12.92
G = ((G + 0.055) / 1.055) ** 2.4 if G > 0.04045 else G / 12.92
B = ((B + 0.055) / 1.055) ** 2.4 if B > 0.04045 else B / 12.92
X = R * 0.4124 + G * 0.3576 + B * 0.1805
Y = R * 0.2126 + G * 0.7152 + B * 0.0722
Z = R * 0.0193 + G * 0.1192 + B * 0.9505
return (X,Y,Z)
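# Note: rgb_to_lab above stops at XYZ and never applies the XYZ -> Lab step.
# A common completion (D65 white point) would be the helper below; it is an
# addition for illustration and is not wired into the rest of the code, since
# the distances reported later came from the XYZ version.
def xyz_to_lab(xyz):
    X, Y, Z = xyz[0] / 0.95047, xyz[1] / 1.00000, xyz[2] / 1.08883
    f = lambda t: t ** (1.0 / 3) if t > 0.008856 else 7.787 * t + 16.0 / 116
    return (116 * f(Y) - 16, 500 * (f(X) - f(Y)), 200 * (f(Y) - f(Z)))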
def shade_func(color, offset):
return tuple([int(c * (1 - offset)) for c in color])
def tint_func(color, offset):
return tuple([int(c + (255 - c) * offset) for c in color])
def tone_func(color, offset):
return tuple([int(c * (1 - offset) + 128 * offset) for c in color])
class ColorNameFinder:
def __init__(self, colors, distance=None):
if distance is None:
distance = ColorNameFinder.ciede2000
self.distance = distance
self.colors = [hex_to_rgb(color) for color in colors]
    @classmethod
    def euclidean(cls, left, right):
        return (left[0] - right[0]) ** 2 + (left[1] - right[1]) ** 2 + (left[2] - right[2]) ** 2
    @classmethod
    def ciede2000(cls, color1, color2):
# Convert color to LAB color space
lab1 = rgb_to_lab(color1)
lab2 = rgb_to_lab(color2)
# Compute CIE 2000 color difference
C1 = math.sqrt(lab1[1] ** 2 + lab1[2] ** 2)
C2 = math.sqrt(lab2[1] ** 2 + lab2[2] ** 2)
a1 = math.atan2(lab1[2], lab1[1])
a2 = math.atan2(lab2[2], lab2[1])
dL = lab2[0] - lab1[0]
dC = C2 - C1
dA = a2 - a1
dH = 2 * math.sqrt(C1 * C2) * math.sin(dA / 2)
L = 1
C = 1
H = 1
LK = 1
LC = math.sqrt(math.pow(C1, 7) / (math.pow(C1, 7) + math.pow(25, 7)))
LH = math.sqrt(lab1[0] ** 2 + lab1[1] ** 2)
CB = math.sqrt(lab2[1] ** 2 + lab2[2] ** 2)
CH = math.sqrt(C2 ** 2 + dH ** 2)
SH = 1 + 0.015 * CH * LC
SL = 1 + 0.015 * LH * LC
SC = 1 + 0.015 * CB * LC
T = 0.0
if (a1 >= a2 and a1 - a2 <= math.pi) or (a2 >= a1 and a2 - a1 > math.pi):
T = 1
else:
T = 0
dE = math.sqrt((dL / L) ** 2 + (dC / C) ** 2 + (dH / H) ** 2 + T * (dC / SC) ** 2)
return dE
def __factory_objective(self, target, preprocessor=lambda x: x):
def fn(x):
print(x, preprocessor(x))
x = preprocessor(x)
color = self.colors[x[0]]
offset = x[1]
bound_offset = abs(offset)
offsets = [
shade_func(color, bound_offset),
tint_func(color, bound_offset),
tone_func(color, offset)]
least_error = min([(right, self.distance(target, right)) \
for right in offsets], key = lambda x: x[1])[1]
return least_error
return fn
def __resolve_offset_type(self, sample, target, offset):
bound_offset = abs(offset)
shade = shade_func(sample, bound_offset)
tint = tint_func(sample, bound_offset)
tone = tone_func(sample, offset)
lookup = {}
lookup[shade] = "shade"
lookup[tint] = "tint"
lookup[tone] = "tone"
offsets = [shade, tint, tone]
least_error = min([(right, self.distance(target, right)) for right in offsets], key = lambda x: x[1])[0]
return lookup[least_error]
def nearest_color(self, target):
target = hex_to_rgb(target)
preprocessor=lambda x: (int(x[0]), x[1])
result = min(\
[minimize( self.__factory_objective(target, preprocessor=preprocessor),
(i, 0),
bounds=[(i, i), (-1, 1)],
method='Powell') \
for i, color in enumerate(self.colors)], key=lambda x: x.fun)
color_index = int(result.x[0])
nearest_color = self.colors[color_index]
offset = preprocessor(result.x)[1]
offset_type = self.__resolve_offset_type(nearest_color, target, offset)
return {
"color": rgb_to_hex(nearest_color),
"offset": {
"type": offset_type,
"value": offset if offset_type == 'tone' else abs(offset)
}
}
Let's demonstrate this with mauve. We'll define a target that is similar to a shade of mauve, include mauve in a list of colors, and ideally we'll get mauve back from our test.
colors = ['#E0B0FF', '#FF0000', '#000000', '#0000FF']
target = '#DFAEFE'
agent = ColorNameFinder(colors)
agent.nearest_color(target)
We do get mauve back:
{'color': '#e0b0ff',
'offset': {'type': 'shade', 'value': 0.0031060384614807254}}
The distance is 0.004991238317138219:
agent.distance(hex_to_rgb(target), shade_func(hex_to_rgb(colors[0]), 0.0031060384614807254))
why use Powell's method?
In this arrangement, it is simply the best. No other method that uses bounds did a good job of scanning positives and negatives, and I had mixed results using the preprocessor to scale the values back to negative with bounds of (0, 2).
I do notice that in the sample test, a range between about 0.0008 and 0.003 seems to produce the same distance, and that the values my approach considers include a large number of these. Is there a more efficient solution?
If I'm wrong, please let me know.
correctness of the color transformations
What is adding a negative amount of white (in the case of a tint)? I was thinking it is like adding a positive amount of black, i.e. a shade, with signs reversed.
My implementation is not correct:
agent.distance(hex_to_rgb(target), shade_func(hex_to_rgb(colors[0]), 0.1)) - agent.distance(hex_to_rgb(target), tint_func(hex_to_rgb(colors[0]), -0.1))
produces 0.3239904390784106 instead of 0.
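Working through the algebra per channel: shade_func(c, x) gives c * (1 - x) = c - c*x, while tint_func(c, -x) gives c + (255 - c) * (-x) = c - 255*x + c*x. Setting the two equal forces 255*x = 2*c*x, i.e. c = 127.5, so the two transforms only coincide at mid-grey, which is consistent with the nonzero difference above.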
I'll probably be fixing that soon.
I want to determine the quality of the text in an image by giving it a score or rating (something like "image-text is 90% bad; texts are not readable").
What I am doing now is using the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) model to assess the quality.
It gives scores from 0 to 100: 0 for good quality and 100 for bad quality.
The problem I am having with this code is that it gives bad scores even to good quality "image-texts".
Also, the score sometimes exceeds 100, but according to the reference I am using, the score should be between 0 and 100 only.
Can someone please suggest how I can get promising and reliable results for assessing the quality of text-based images?
import collections
from itertools import chain
# import urllib.request as request
import pickle
import numpy as np
import scipy.signal as signal
import scipy.special as special
import scipy.optimize as optimize
# import matplotlib.pyplot as plt
import skimage.io
import skimage.transform
import cv2
from libsvm import svmutil
from os import listdir
# Calculating Local Mean
def normalize_kernel(kernel):
return kernel / np.sum(kernel)
def gaussian_kernel2d(n, sigma):
Y, X = np.indices((n, n)) - int(n/2)
gaussian_kernel = 1 / (2 * np.pi * sigma ** 2) * np.exp(-(X ** 2 + Y ** 2) / (2 * sigma ** 2))
return normalize_kernel(gaussian_kernel)
def local_mean(image, kernel):
return signal.convolve2d(image, kernel, 'same')
# Calculating the local deviation
def local_deviation(image, local_mean, kernel):
"Vectorized approximation of local deviation"
sigma = image ** 2
sigma = signal.convolve2d(sigma, kernel, 'same')
return np.sqrt(np.abs(local_mean ** 2 - sigma))
# Calculate the MSCN coefficients
def calculate_mscn_coefficients(image, kernel_size=6, sigma=7 / 6):
C = 1 / 255
kernel = gaussian_kernel2d(kernel_size, sigma=sigma)
local_mean = signal.convolve2d(image, kernel, 'same')
local_var = local_deviation(image, local_mean, kernel)
return (image - local_mean) / (local_var + C)
# It is found that the MSCN coefficients are distributed as a Generalized Gaussian Distribution (GGD) for a broader spectrum of distorted image.
# Calculate GGD
def generalized_gaussian_dist(x, alpha, sigma):
beta = sigma * np.sqrt(special.gamma(1 / alpha) / special.gamma(3 / alpha))
    coefficient = alpha / (2 * beta * special.gamma(1 / alpha))
return coefficient * np.exp(-(np.abs(x) / beta) ** alpha)
# Pairwise products of neighboring MSCN coefficients
def calculate_pair_product_coefficients(mscn_coefficients):
return collections.OrderedDict({
'mscn': mscn_coefficients,
'horizontal': mscn_coefficients[:, :-1] * mscn_coefficients[:, 1:],
'vertical': mscn_coefficients[:-1, :] * mscn_coefficients[1:, :],
'main_diagonal': mscn_coefficients[:-1, :-1] * mscn_coefficients[1:, 1:],
'secondary_diagonal': mscn_coefficients[1:, :-1] * mscn_coefficients[:-1, 1:]
})
# Asymmetric Generalized Gaussian Distribution (AGGD) model
def asymmetric_generalized_gaussian(x, nu, sigma_l, sigma_r):
def beta(sigma):
return sigma * np.sqrt(special.gamma(1 / nu) / special.gamma(3 / nu))
coefficient = nu / ((beta(sigma_l) + beta(sigma_r)) * special.gamma(1 / nu))
f = lambda x, sigma: coefficient * np.exp(-(x / beta(sigma)) ** nu)
return np.where(x < 0, f(-x, sigma_l), f(x, sigma_r))
# Fitting Asymmetric Generalized Gaussian Distribution
def asymmetric_generalized_gaussian_fit(x):
def estimate_phi(alpha):
numerator = special.gamma(2 / alpha) ** 2
denominator = special.gamma(1 / alpha) * special.gamma(3 / alpha)
return numerator / denominator
def estimate_r_hat(x):
size = np.prod(x.shape)
return (np.sum(np.abs(x)) / size) ** 2 / (np.sum(x ** 2) / size)
def estimate_R_hat(r_hat, gamma):
numerator = (gamma ** 3 + 1) * (gamma + 1)
denominator = (gamma ** 2 + 1) ** 2
return r_hat * numerator / denominator
def mean_squares_sum(x, filter=lambda z: z == z):
filtered_values = x[filter(x)]
squares_sum = np.sum(filtered_values ** 2)
        return squares_sum / filtered_values.size
def estimate_gamma(x):
left_squares = mean_squares_sum(x, lambda z: z < 0)
right_squares = mean_squares_sum(x, lambda z: z >= 0)
return np.sqrt(left_squares) / np.sqrt(right_squares)
def estimate_alpha(x):
r_hat = estimate_r_hat(x)
gamma = estimate_gamma(x)
R_hat = estimate_R_hat(r_hat, gamma)
solution = optimize.root(lambda z: estimate_phi(z) - R_hat, [0.2]).x
return solution[0]
def estimate_sigma(x, alpha, filter=lambda z: z < 0):
return np.sqrt(mean_squares_sum(x, filter))
def estimate_mean(alpha, sigma_l, sigma_r):
return (sigma_r - sigma_l) * constant * (special.gamma(2 / alpha) / special.gamma(1 / alpha))
alpha = estimate_alpha(x)
sigma_l = estimate_sigma(x, alpha, lambda z: z < 0)
sigma_r = estimate_sigma(x, alpha, lambda z: z >= 0)
constant = np.sqrt(special.gamma(1 / alpha) / special.gamma(3 / alpha))
mean = estimate_mean(alpha, sigma_l, sigma_r)
return alpha, mean, sigma_l, sigma_r
# Calculate BRISQUE features
def calculate_brisque_features(image, kernel_size=7, sigma=7 / 6):
def calculate_features(coefficients_name, coefficients, accum=np.array([])):
alpha, mean, sigma_l, sigma_r = asymmetric_generalized_gaussian_fit(coefficients)
if coefficients_name == 'mscn':
var = (sigma_l ** 2 + sigma_r ** 2) / 2
return [alpha, var]
return [alpha, mean, sigma_l ** 2, sigma_r ** 2]
mscn_coefficients = calculate_mscn_coefficients(image, kernel_size, sigma)
coefficients = calculate_pair_product_coefficients(mscn_coefficients)
features = [calculate_features(name, coeff) for name, coeff in coefficients.items()]
flatten_features = list(chain.from_iterable(features))
return np.array(flatten_features, dtype=object)
# Loading image from local machine
def load_image(file):
return cv2.imread(file)
# return skimage.io.imread("img.png", plugin='pil')
path = "C:\\Users\\Krishna\\PycharmProjects\\ImageScore\\images2\\"
image_list = listdir(path)
for file in image_list:
image = load_image(path+file)
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# image = load_image()
# gray_image = skimage.color.rgb2gray(image)
# _ = skimage.io.imshow(image)
#%%time
# Calculate Coefficients
mscn_coefficients = calculate_mscn_coefficients(gray_image, 7, 7/6)
coefficients = calculate_pair_product_coefficients(mscn_coefficients)
# Fit Coefficients to Generalized Gaussian Distributions
brisque_features = calculate_brisque_features(gray_image, kernel_size=7, sigma=7/6)
# Resize Image and Calculate BRISQUE Features
downscaled_image = cv2.resize(gray_image, None, fx=1/2, fy=1/2, interpolation = cv2.INTER_CUBIC)
downscale_brisque_features = calculate_brisque_features(downscaled_image, kernel_size=7, sigma=7/6)
brisque_features = np.concatenate((brisque_features, downscale_brisque_features))
# a pretrained SVR model to calculate the quality assessment. However, in order to have good results, we need to scale the features to [-1, 1]
def scale_features(features):
with open('normalize.pickle', 'rb') as handle:
scale_params = pickle.load(handle)
min_ = np.array(scale_params['min_'])
max_ = np.array(scale_params['max_'])
return -1 + (2.0 / (max_ - min_) * (features - min_))
def calculate_image_quality_score(brisque_features):
model = svmutil.svm_load_model('brisque_svm.txt')
scaled_brisque_features = scale_features(brisque_features)
x, idx = svmutil.gen_svm_nodearray(
scaled_brisque_features,
isKernel=(model.param.kernel_type == svmutil.PRECOMPUTED))
nr_classifier = 1
prob_estimates = (svmutil.c_double * nr_classifier)()
return svmutil.libsvm.svm_predict_probability(model, x, prob_estimates)
print(calculate_image_quality_score(brisque_features))
Here is one output of the quality score I am getting for one of the "text-based images":
156.04440687506016
A bit of background:
I want to calculate the array factor of an MxN antenna array, which is given by the following equation:
AF(theta, phi) = sum over i of w_i * exp(-1j * k * (x_i*sin(theta)*cos(phi) + y_i*sin(theta)*sin(phi) + z_i*cos(theta)))
where w_i is the complex weight of the i-th element, (x_i, y_i, z_i) is the position of the i-th element, k is the wave number, theta and phi are the elevation and azimuth respectively, and i ranges from 0 to MxN-1.
In the code I have:
- theta and phi are np.mgrid arrays with shape (200, 200) each,
- w_i and (x, y, z)_i are np.array with shape (N*M,) each,
so AF is a np.array with shape (200, 200) (sum over i). There is no problem so far, and I can get AF easily doing:
af = zeros([theta.shape[0],phi.shape[0]])
for i in range(self.size[0]*self.size[1]):
af = af + ( w[i]*e**(-1j*(k * x_pos[i]*sin(theta)*cos(phi) + k * y_pos[i]* sin(theta)*sin(phi)+ k * z_pos[i] * cos(theta))) )
Now, each w_i depends on frequency, and so does AF; I now have w_i with shape (N*M, 1000) (I have 1000 samples of each w_i in frequency). I tried to use the above code, changing
af = zeros([1000,theta.shape[0],phi.shape[0]])
but I get 'operands could not be broadcast together'. I can solve this by using a for loop over the 1000 values, but it is slow and a bit ugly. So, what is the correct way to do the summation, or the correct way to properly define w_i and AF?
Any help would be appreciated. Thanks.
edit
The code with the new dimension I'm trying to add is the following:
from numpy import *
class AntennaArray:
def __init__(self,f,asize=None,tipo=None,dx=None,dy=None):
self.Lambda = 299792458 / f
self.k = 2*pi/self.Lambda
self.size = asize
self.type = tipo
self._AF_DATA_SIZE = 200
self.theta,self.phi = mgrid[0 : pi : self._AF_DATA_SIZE*1j,0 : 2*pi : self._AF_DATA_SIZE*1j]
self.element_pos = None
self.element_amp = None
self.element_pha = None
        if dx is None:
            self.dx = self.Lambda / 2
        else:
            self.dx = dx
        if dy is None:
            self.dy = self.Lambda / 2
        else:
            self.dy = dy
self.generate_array()
def generate_array(self):
M = self.size[0]
N = self.size[1]
dx = self.dx
dy = self.dy
x_pos = arange(0,dx*N,dx)
y_pos = arange(0,dy*M,dy)
z_pos = 0
ele = zeros([N*M,3])
for i in range(M):
ele[i*N:(i+1)*N,0] = x_pos[:]
for i in range(M):
ele[i*N:(i+1)*N,1] = y_pos[i]
self.element_pos = ele
#self.array_factor = self.calculate_array_factor()
def calculate_array_factor(self):
theta,phi = self.theta,self.phi
k = self.k
x_pos = self.element_pos[:,0]
y_pos = self.element_pos[:,1]
z_pos = self.element_pos[:,2]
w = self.element_amp*exp(1j*self.element_pha)
if len(self.element_pha.shape) > 1:
#I have f_size samples of w_i(f)
f_size = self.element_pha.shape[1]
af = zeros([f_size,theta.shape[0],phi.shape[0]])
else:
#I only have w_i
af = zeros([theta.shape[0],phi.shape[0]])
for i in range(self.size[0]*self.size[1]):
            # this for loop does the summation over i
af = af + ( w[i]*e**(-1j*(k * x_pos[i]*sin(theta)*cos(phi) + k * y_pos[i]* sin(theta)*sin(phi)+ k * z_pos[i] * cos(theta))) )
return af
I tried to test it with the following main:
from numpy import *
f_points = 10
M = 2
N = 2
a = AntennaArray(5.8e9,[M,N])
a.element_amp = ones([M*N,f_points])
a.element_pha = zeros([M*N,f_points])
af = a.calculate_array_factor()
But I get
ValueError: 'operands could not be broadcast together with shapes (10,) (200,200) '
Note that if I set
a.element_amp = ones([M*N])
a.element_pha = zeros([M*N])
This works well.
Thanks.
I had a look at the code, and I think this for loop:
af = zeros([theta.shape[0],phi.shape[0]])
for i in range(self.size[0]*self.size[1]):
af = af + ( w[i]*e**(-1j*(k * x_pos[i]*sin(theta)*cos(phi) + k * y_pos[i]* sin(theta)*sin(phi)+ k * z_pos[i] * cos(theta))) )
is wrong in many ways. You are mixing dimensions; you cannot loop that way.
And by the way, to make full use of numpy's efficiency, never loop over arrays. It slows down execution significantly.
I tried to rework that part.
First, I advise you not to use from numpy import *; it's bad practice (see here). Use import numpy as np. I reintroduced the np abbreviation, so you can see what comes from numpy.
Frequency independent case
This first snippet assumes that w is a 1D array of length 4: I am neglecting the frequency dependency of w, to show you how you can get what you already obtained without the for loop, using instead the power of numpy.
af_points = w[:,np.newaxis,np.newaxis]*np.e**(-1j*
(k * x_pos[:,np.newaxis,np.newaxis]*np.sin(theta)*np.cos(phi) +
k * y_pos[:,np.newaxis,np.newaxis]*np.sin(theta)*np.sin(phi) +
k * z_pos[:,np.newaxis,np.newaxis]*np.cos(theta)
))
af = np.sum(af_points, axis=0)
I am using numpy broadcasting to obtain a 3D array named af_points, whose shape is (4, 200, 200). To do it, I use np.newaxis to extend the number of axes of an array in order to use broadcasting correctly. More here on np.newaxis.
So, w[:,np.newaxis,np.newaxis] is an array of shape (4, 1, 1). Similarly for x_pos[:,np.newaxis,np.newaxis], y_pos[:,np.newaxis,np.newaxis] and z_pos[:,np.newaxis,np.newaxis]. Since the angles have shape (200, 200), broadcasting can be done, and af_points has shape (4, 200, 200).
Finally the sum is done by np.sum, summing over the first axis to obtain a (200, 200) array.
Frequency dependent case
Now w has shape (4, 10), where 10 are the frequency points. The idea is the same, just consider that the frequency is an additional dimension in your numpy arrays: now af_points will be an array of shape (4, 10, 200, 200) where 10 are the f_points you have defined.
To keep it understandable, I've split the calculation:
# exp_points is only the exponent, frequency independent. It will be a (4, 200, 200) array.
exp_points = np.e**(-1j*
(k * x_pos[:,np.newaxis,np.newaxis]*np.sin(theta)*np.cos(phi) +
k * y_pos[:,np.newaxis,np.newaxis]*np.sin(theta)*np.sin(phi) +
k * z_pos[:,np.newaxis,np.newaxis]*np.cos(theta)
))
af_points = w[:,:,np.newaxis,np.newaxis] * exp_points[:,np.newaxis,:,:]
af = np.sum(af_points, axis=0)
And now af has shape (10, 200, 200).
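As an aside, the same frequency-dependent sum can be written as a single np.einsum call, which makes the index bookkeeping explicit (an equivalent alternative, assuming the same w and exp_points as above):
# sum over the element axis i, keeping the frequency axis f and angle axes j, k
af = np.einsum('if,ijk->fjk', w, exp_points)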
I have a dataset from kaggle of 45,253 rows and a single column for temperature in Kelvin for the city of Detroit. Its mean = 282.97, std = 11, min = 243.48, max = 308.05.
This is the result when plotted as a histogram of 100 bins with density=True:
I am expected to write the following two functions and see which one approximates the histogram more closely:
Like this one here using scipy.stats.norm.pdf:
I generated the above image using:
x = np.linspace(dataset.Detroit.min(), dataset.Detroit.max(), 1001)
P_norm = norm.pdf(x, dataset.Detroit.mean(), dataset.Detroit.std())
plot_pdf_single(x, P_norm)
However, whenever I try to implement either of the two approximation functions, all of my values for P_norm come out as 0s or infs.
This is what I tried:
P_norm = [(1.0/(np.sqrt(2.0*pi*(std*std))))*np.exp(((-x_i-mu)*(-x_i-mu))/(2.0*(std*std))) for x_i in x]
I also broke it down into parts for a single x_i:
part1 = ((-x[0] - mu)*(-x[0] - mu)) / (2.0*(std * std))
part2 = np.exp(part1)
part3 = 1.0 / (np.sqrt(2.0 * pi * (std*std)))
total = part3*part2
I got the following values:
1145.3913234604413
inf
0.036267480036493875
inf
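(Comparing part1 with the formula used later: (-x[0] - mu) equals -(x[0] + mu), not -(x[0] - mu), and the exponent as written is missing the overall minus sign, so np.exp receives a large positive argument and overflows to inf; that would explain the values above.)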
Since both of the equations use the same formula:
def pdf_approximation(x_i, mu, std):
return (1.0 / (np.sqrt(2.0 * pi * (std*std)))) * np.exp((-(x_i-mu)*(x_i-mu)) / (2.0 * (std*std)))
The code for the first approximation is:
mu = 283
std = 11
P_norm = np.array([pdf_approximation(x_i, mu, std) for x_i in x])
plot_pdf_single(x, P_norm)
The code for the second approximation is:
mu1 = 276
std1 = 6
mu2 = 293
std2 = 6.5
P_norm = np.array([(pdf_approximation(x_i, mu1, std1) * 0.5) + (pdf_approximation(x_i, mu2, std2) * 0.5) for x_i in x])
plot_pdf_single(x, P_norm)
I have a list of growth rates and would like to calculate all available compounded growth rates:
l = [0.3, 0.2, 0.1]
Output (as a list):
o = [0.56, 0.716]
Calculation detail for the compounded growth rates:
0.56 = (1 + 0.3) * (1 + 0.2) - 1
0.716 = (1 + 0.3) * (1 + 0.2) * (1 + 0.1) - 1
The function should be flexible to the length of the input list.
You could express the computation with list comprehensions / generator expressions, using itertools.accumulate to handle the compounding:
import itertools as IT
import operator
def compound_growth_rates(l):
result = [xy-1 for xy in
IT.islice(IT.accumulate((1+x for x in l), operator.mul), 1, None)]
return result
l = [0.3, 0.2, 0.1]
print(compound_growth_rates(l))
prints
[0.56, 0.7160000000000002]
Or, equivalently, you could write this with list-comprehensions and a for-loop:
def compound_growth_rates(l):
add_one = [1+x for x in l]
products = [add_one[0]]
for x1 in add_one[1:]:
products.append(x1*products[-1])
result = [p-1 for p in products[1:]]
return result
I think the advantage of using itertools.accumulate is that it expresses the
intent of the code better than the for-loop. But the for-loop may be more
readable in the sense that it uses more commonly known syntax.
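If numpy is already a dependency, the same compounding can be done with a cumulative product (a third, equivalent variant; the function name here is just illustrative):
import numpy as np

def compound_growth_rates_np(l):
    # compound the growth factors, drop the first (single-period) entry, subtract 1
    return list(np.cumprod(np.array(l) + 1)[1:] - 1)

print(compound_growth_rates_np([0.3, 0.2, 0.1]))  # approximately [0.56, 0.716]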