How can I set default parameters for mutable structs in Julia?

Is there a way to add default parameters for mutable structs in Julia?
I'm trying to write something like the following:
mutable struct Scale
# Set default values that will be changed by fit!()
domain_min::Float64 = 0.0
domain_max::Float64 = 1.0
range_min::Float64 = 0.0
range_max::Float64 = 1.0
end
function fit!(data::Array)
# Set struct params here using `data`
end
Is there a way to do this or should I try a different approach?

This is exactly what Base.@kwdef does:
julia> Base.@kwdef mutable struct Scale
# Set default values that will be changed by fit!()
domain_min::Float64 = 0.0
domain_max::Float64 = 1.0
range_min::Float64 = 0.0
range_max::Float64 = 1.0
end
Scale
# All parameters to their default values
julia> Scale()
Scale(0.0, 1.0, 0.0, 1.0)
# Specify some parameter(s) using keyword argument(s)
julia> Scale(range_min = 0.5)
Scale(0.0, 1.0, 0.5, 1.0)

I prefer using Parameters.jl because it also displays structs in a nicer way, which is very helpful for debugging:
julia> using Parameters
julia> @with_kw struct A
a::Int=5
b::String="hello"
c::Float64
end;
julia> A(c=3.5)
A
a: Int64 5
b: String "hello"
c: Float64 3.5

Or you can also just go the long way and define it yourself with constructors, as you would normally do if you want to instantiate it in several possible ways.
mutable struct Scale
# Set default values that will be changed by fit!()
domain_min::Float64
domain_max::Float64
range_min::Float64
range_max::Float64
end
# With default values, but no keywords
julia> Scale(dmin=1.,dmax=2.,rmin=1.,rmax=2.) = Scale(dmin, dmax, rmin, rmax)
Scale
julia> Scale(3.,4.)
Scale(3.0, 4.0, 1.0, 2.0)
# With keyword arguments:
julia> Scale(;dmin=1.,dmax=2.,rmin=1.,rmax=2.) = Scale(dmin, dmax, rmin, rmax)
Scale
julia> Scale(rmax=3., rmin=1.2)
Scale(1.0, 2.0, 1.2, 3.0)
Notice the difference between the two constructors: one has a semicolon (;), the other does not. I would not recommend defining both constructors at the same time, as this may lead to some confusion.

Related

How to keep a trailing zero when printing a float without setting precision?

Given a whole floating point number, Rust does not include any decimals when converting it to a string. I want a way to keep the .0 around without setting a fixed precision since I like the default formatting for numbers that do have decimals:
fn main() {
println!("{}", 1.0);
println!("{}", 1.1999999);
println!("{:.1}", 1.0);
println!("{:.1}", 1.999999)
}
1
1.1999999
1.0
2.0
So I would like to be able to print that extra .0 without it affecting anything else.
A way to keep the additional .0, differentiating it from an integer, is to use the Debug formatter:
println!("{:?}", 1.0);
println!("{:?}", 1.1999999);
1.0
1.1999999
I don't see a way to get this behavior from the precision format specifier: a supplied precision is treated as an "exact" precision, and when none is supplied the formatter internally uses a "min" precision of 0 for Display and 1 for Debug. (source)

Counting Occurrences of Floats in Array of Floats

Say I have a list of keys that are floats.
keys = [0.999999, 1.999999]
Say I have another list of values.
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 1.0, 2.0]
I want to find the total number of times each key occurs in vals, measuring equality with np.isclose(). In the example above, the answer would be 5. The following snippet returns this answer, but it is extremely slow when keys and vals are large (10^6 and 10^7 elements, respectively).
import numpy as np

def count_float_keys(keys, vals):
    count = 0
    for key in keys:
        # indices of the entries of vals that np.isclose treats as equal to key
        present = np.where(np.isclose(vals, key))[0]
        count += len(present)
    return count
Is there a faster and cleaner alternative to do this?
Edit: 0.99999 is only used as a simplifying example. My data has random float values like 0.035014 that I am not allowed to round further.
Here you go:
import numpy as np

# generate random vals
vals = np.random.randint(0,2,(10,10)) + np.random.uniform(0,1,(10,10))
keys = [0.999999, 1.999999]
# check how often each value is in the tolerance of each key
res = [np.sum(np.isclose(vals,k, rtol=0.1, atol=0.1)) for k in keys]
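Since the question is about speed, here is a rough sketch of one alternative that is not taken from the answer above: sort vals once, then count the values falling inside each key's tolerance window with a binary search. It uses only the standard library. The function name count_close is hypothetical, and the window mirrors the documented np.isclose(vals, key) rule |val - key| <= atol + rtol * |key| with NumPy's default tolerances. Values lying exactly on a window edge could be classified differently because of floating-point rounding, so treat this as an approximation of the original loop.

```python
from bisect import bisect_left, bisect_right

def count_close(keys, vals, rtol=1e-05, atol=1e-08):
    """Count (key, val) pairs with |val - key| <= atol + rtol * |key|."""
    sorted_vals = sorted(vals)          # one O(n log n) sort up front
    total = 0
    for key in keys:
        tol = atol + rtol * abs(key)    # same tolerance rule as np.isclose(vals, key)
        lo = bisect_left(sorted_vals, key - tol)
        hi = bisect_right(sorted_vals, key + tol)
        total += hi - lo                # number of values inside the window
    return total

keys = [0.999999, 1.999999]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 1.0, 2.0]
print(count_close(keys, vals))          # 5 for the example in the question
```

Each key then costs two binary searches instead of a full pass over vals.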

Evaluating logpdf of vector of observations where each observation has different mean parameter

New to Julia and just trying to implement a basic Bayesian model. I would like to evaluate the log-likelihood of each data point, where each data point has a different mean parameter depending on their corresponding covariate, without having to implement a for loop over all data points.
using Distributions
y = -50:1:49
a = 1
b = 1
N = 100
x = rand(Normal(0, 1), N)
mu = a .+ b.*x
sigma = 5
# Can we evaluate the logpdf of every point in one call to logpdf without doing a for loop
loglikelihood = logpdf(Normal(mu, sigma), y)
MethodError: no method matching Normal(::Vector{Float64}, ::Int64)
Edit: I would like to clarify that the mu specified above is a vector of the same dimensions as y. Instead of evaluating the logpdf of each observation with Normal(::Real, ::Real) in a loop, I would like something to the effect of
logpdf(Normal(::Array, ::Real), ::Array). The code in the following chunk does what I want by taking the sum of the log-likelihood across observations, but I would prefer not to have to transform to a multivariate distribution.
using LinearAlgebra
logpdf(MvNormal(mu, diagm(repeat([sigma], outer=N))), y)
Thanks for your help.
Your code doesn't actually run, as there are undefined variables (a, b, y). But in general what you're asking works out of the box:
julia> using Distributions
julia> μ = 2.0; σ = 3.0;
julia> logpdf(Normal(μ, σ), 0:0.5:4)
9-element Vector{Float64}:
-2.2397730440950046
-2.1425508218727822
-2.073106377428338
-2.0314397107616715
-2.0175508218727827
-2.0314397107616715
-2.073106377428338
-2.1425508218727822
-2.2397730440950046
Here I'm getting the log pdf at values 0, 0.5, 1, ..., 3.5, 4. This works because there's a method for logpdf which takes an AbstractArray as second argument:
julia> @which logpdf(Normal(μ, σ), 0:0.5:4)
logpdf(d::UnivariateDistribution{S} where S<:ValueSupport, X::AbstractArray) in Distributions at deprecated.jl:70
julia> @which logpdf(Normal(μ, σ), 0.5)
logpdf(d::Normal, x::Real) in Distributions at ...\Distributions\bawf4\src\univariate\continuous\normal.jl:105
As you see there though, that method signature is actually deprecated. Let's start Julia with depwarn=yes to see the deprecation notice:
$> julia --depwarn=yes
julia> using Distributions
julia> logpdf(Normal(), 1:10)
┌ Warning: `logpdf(d::UnivariateDistribution, X::AbstractArray)` is deprecated, use `logpdf.(d, X)` instead.
│ caller = top-level scope at REPL[4]:1
└ @ Core REPL[4]:1
What this tells you is that actually you don't need a method signature which accepts an array, as Julia's built-in broadcasting syntax - appending a dot to a function call - gives you this for free. Returning to the first example:
julia> logpdf.(Normal(μ, σ), 0:0.5:4)
9-element Vector{Float64}:
-2.2397730440950046
-2.1425508218727822
-2.073106377428338
-2.0314397107616715
-2.0175508218727827
-2.0314397107616715
-2.073106377428338
-2.1425508218727822
-2.2397730440950046
Here, I'm actually calling the logpdf(d::Normal, x::Real) method, but the . after logpdf applies the function elementwise to the range 0:0.5:4.
The broadcast syntax also extends to constructors, so you can use it to construct multiple normal distributions with different mean:
julia> μ = rand(3)
3-element Vector{Float64}:
0.5341692431981215
0.5696647074299088
0.3021675356902611
julia> Normal.(μ, 5)
3-element Vector{Normal{Float64}}:
Normal{Float64}(μ=0.5341692431981215, σ=5.0)
Normal{Float64}(μ=0.5696647074299088, σ=5.0)
Normal{Float64}(μ=0.3021675356902611, σ=5.0)
That's what the error above is telling you: the Normal constructor does not accept a vector as its first argument, only a single value. If you want to apply it to multiple values, just broadcast!
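To make the elementwise idea concrete outside of Julia, here is a small standard-library Python sketch of what broadcasting logpdf over observations with one mean each amounts to. It is only an illustration, not the Distributions.jl implementation: the helper normal_logpdf is hypothetical, and the covariates x and observations y are made-up values.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a normal with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

# One mean per observation and one shared sigma, as in the question.
a, b, sigma = 1.0, 1.0, 5.0
x = [-1.0, 0.0, 2.0]    # hypothetical covariates
y = [0.5, 1.0, 2.5]     # hypothetical observations
mu = [a + b * xi for xi in x]

# Pair each observation with its own mean and evaluate elementwise.
pointwise = [normal_logpdf(yi, mi, sigma) for yi, mi in zip(y, mu)]
total = sum(pointwise)  # log-likelihood of all observations together
print(pointwise, total)
```

Pairing each observation with its own mean is what broadcasting over both the means and the observations does in the Julia version.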

How do I know the variable ordering for CheckSatisfied?

I am trying to write some unit tests for my constraints using the CheckSatisfied function. How do I know the variable order of the input vector x?
E.g.
q = prog.NewContinuousVariables(1, 'q')
r = prog.NewContinuousVariables(2, 'r')
formula = le(q, r[0] + r[1])
constraint = prog.AddConstraint(formula)
assert(constraint.evaluator().CheckSatisfied([0.3, 0.5, 1]))
How do I know which variable each of 0.3, 0.5, 1 corresponds to?
Is it dependent on how the constraints are added, and if so, how do I know the variable order for constraints added in the myriad of ways?
The order of the variables is stored in the return value of AddConstraint. If you check constraint.variables(), you will see the variable order. The pseudocode is
constraint = prog.AddConstraint(formula)
print(f"{constraint.variables()}")

How does this list comprehension with multiple assigned variables work?

I have a list of strings. I need to parse and convert the string into floats and use that for a calculation.
After multiple attempts, I figured out the easiest way to do this.
List=["1x+1y+0","1x-1y+0","1x+0y-3","0x+1y-0.5"]
I need to extract the numerical coefficients of x and y
I used:
for coef in re.split('x|y', line):
    float(coef)
This was not serving the purpose and then I found out that,
for line in List:
    a,b,c = [float(coef) for coef in re.split('x|y', line)]
this code works.
If I do
a=[float(coeff) for coeff in re.split('x|y',line)]
then a is a list with coefficients of the line
[1.0, 1.0, 0.0]
[1.0, -1.0, 0.0]
[1.0, 0.0, -3.0]
[0.0, 1.0, -0.5]
However, I am struggling to understand the logic. Here we used a list comprehension. How can we assign multiple variables from a list comprehension? Does it work as follows:
for each string in the list, it splits the string and converts the pieces into floats, and then assigns the three resulting numbers to three variables?
But how is it that if we assign to only one variable we get a list, but if we assign to multiple variables the type changes?
I am sorry if the question is too basic. I am new to Python, hence the doubt.
a, b, c = x is called sequence unpacking. It is (almost) equivalent to:
a = x[0]
b = x[1]
c = x[2]
So a,b,c = [float(coef) for coef in re.split('x|y', line)] actually means:
x = [float(coef) for coef in re.split('x|y', line)]
a = x[0]
b = x[1]
c = x[2]
But a = x is not unpacking; it is just normal assignment: it makes a refer to the same list as x. The difference: in the first case you assign a list to three variables, and each "gets" one item of the list. In the second case, you assign a list to one variable, and that variable "gets" the whole list. Assigning a list of three numbers to two variables (a, b = [1, 2, 3]) is invalid: you get an error message saying that there are too many values to unpack.
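A minimal runnable sketch of the two cases described above, using the strings from the question. The helper name parse_line is hypothetical and is introduced only to keep the example short.

```python
import re

def parse_line(line):
    """Split on 'x' or 'y' and convert each piece to a float."""
    return [float(coef) for coef in re.split('x|y', line)]

lines = ["1x+1y+0", "1x-1y+0", "1x+0y-3", "0x+1y-0.5"]

for line in lines:
    coefs = parse_line(line)   # plain assignment: coefs is the whole list
    a, b, c = coefs            # unpacking: each name gets one item
    print(coefs, a, b, c)

# Unpacking needs exactly as many names as there are items.
try:
    a, b = parse_line("1x+1y+0")
except ValueError as err:
    print("unpacking failed:", err)
```

The list comprehension always produces a list. Only the left-hand side of the assignment decides whether that list is stored whole or spread across several names.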
