Both bayes_R2 and loo_R2 get weird estimate 1 - brms

I ran a brms model on a dataset containing almost 11000 species using the following command.
fit <- brm(formula =brl ~ LH + (1|species),
data = data_,
cov_ranef = list(species=phyloMat),
family = gaussian(),
save_all_pars = T,
chains = 2,
cores = 10,
backend = "cmdstanr",
threads = threading(15)
)
Then, we used the bayes_R2 and loo_R2 methods to calculate R2. However, we got a strange result: the R2 estimate was 1 in both cases.
bayes_r2 = bayes_R2(fit)
loo_r2 = loo_R2(fit)
> print(bayes_r2)
Estimate Est.Error Q2.5 Q97.5
R2 1 2.769589e-09 1 1
> print(loo_r2)
Estimate Est.Error Q2.5 Q97.5
R2 1 1.131317e-10 1 1
Can you help us infer what caused this problem?
Thank you very much!


Plot output differences between python and julia

I am trying to use Julia as the main language for my work, but I find that this plot differs from the one Python produces (Python's output is the correct one).
Here is the Python code and its output:
import numpy as np
import math
import matplotlib.pyplot as plt
u = 9.27*10**(-21)
k = 1.38*10**(-16)
j2 = 7/2
nrr = 780
h = 1000
na = 6*10**(23)
rho = 7.842
mgd = 157.25
a = mgd
d = na*rho*u/a
m_f = []
igd = 7.0
for t in range(1,401):
    while True:
        h1 = h+d*nrr*igd
        x2 = (7*u*h1)/(k*t)
        x4 = 2*j2
        q2 = (x4+1)/x4
        m = abs(7*(q2*math.tanh(q2*x2)**-1 - (1/x4)*math.tanh(x2/x4)**-1))
        if abs(m - igd) < 10**(-12):
            break
        else:
            igd = m
    m_f.append(abs(m))
plt.plot(range(1,401), m_f)
plt.savefig("Py_plot.pdf")
It gives the following plot, which is correct:
[Figure: the expected plot, produced by the Python code]
But when I do the same calculation in Julia, the output differs from Python's. Here is my Julia code:
using Plots
u = 9.27*10^(-21)
k = 1.38*10^(-16)
j2 = 7/2
nrr = 780
h = 1000
na = 6*10^(23)
rho = 7.842
mgd = 157.25
a = mgd
d = na*rho*u/a
igd = 7.0
m = 0.0
m_f = Float64[]
for t in 1:400
    while true
        h1 = h+d*nrr*igd
        x2 = (7*u*h1)/(k*t)
        x4 = 2*j2
        q2 = (x4+1)/x4
        m = 7*(q2*coth(rad2deg(q2*x2))-(1/x4)*coth(rad2deg(x2/x4)))
        if abs(abs(m)-igd) < 10^(-10)
            break
        else
            igd = m
        end
    end
    push!(m_f, abs(m))
end
plot(1:400, m_f)
and this is the unexpected Julia output:
[Figure: the unexpected output from Julia]
Any help would be appreciated.
Code:
using Plots
const u = 9.27e-21
const k = 1.38e-16
const j2 = 7/2
const nrr = 780
const h = 1000
const na = 6.0e23
const rho = 7.842
const mgd = 157.25
const a = mgd
const d = na*rho*u/a
function plot_graph()
    igd = 7.0
    m = 0.0
    trange = 1:400
    m_f = Vector{Float64}(undef, length(trange))
    for t in trange
        while true
            h1 = h+d*nrr*igd
            x2 = (7*u*h1)/(k*t)
            x4 = 2*j2
            q2 = (x4+1)/x4
            m = abs(7*(q2*coth(q2*x2)-(1/x4)*coth(x2/x4)))
            if isapprox(m, igd, atol = 10^(-10))
                break
            else
                igd = m
            end
        end
        m_f[t] = m
    end
    plot(trange, m_f)
end
Plot:
Changes for correctness:
Changed na = 6*10^(23) to na = 6.0e23.
Since ^ has a higher precedence than *, 10^23 is evaluated first, and since the operands are Int values, the result is also an Int. However, Int (i.e. Int64) can only hold numbers up to approximately 9 * 10^18, so 10^23 overflows and gives a wrong result.
julia> 10^18
1000000000000000000
julia> 10^19 #overflow starts here
-8446744073709551616
julia> 10^23 #and gives a wrong value here too
200376420520689664
6.0e23 avoids this problem by directly using the scientific e-notation to create a literal Float64 value (Float64 can hold this value without overflowing).
Removed the rad2deg calls when calling coth. Julia's trigonometric functions take radians by default, and the Python version passes the same values straight to tanh without any conversion, so the rad2deg calls only distort the result.
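The Int64 overflow described above can be reproduced outside Julia. The following is only a sketch in Python, whose integers never overflow on their own, so the wrap to a signed 64-bit value is done by hand; wrap_int64 is a hypothetical helper written for this illustration, not part of any library.

```python
def wrap_int64(value: int) -> int:
    """Reduce a Python int to what a signed 64-bit integer would hold."""
    value &= (1 << 64) - 1      # keep only the low 64 bits
    if value >= 1 << 63:        # top bit set means negative in two's complement
        value -= 1 << 64
    return value


for exponent in (18, 19, 23):
    # 10^18 still fits; 10^19 and 10^23 wrap around
    print(exponent, wrap_int64(10 ** exponent))

# A float literal never goes through integer arithmetic at all.
print(6.0e23)
```

The wrapped values for 10^19 and 10^23 match the Julia REPL output quoted above, which is why multiplying by 6 afterwards cannot recover the intended 6.0e23.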
Other changes
Marked all the constants as const, and moved the rest of the code into a function. See Performance tip: Avoid non-constant global variables
Changed the abs(m - igd) < 10^-10 to isapprox(m, igd, atol = 10^-10), which performs basically the same check but is clearer and more flexible (e.g. if you wanted to change to a relative tolerance rtol later).
Stored the 1:400 as a named variable trange. This is just because it's used multiple times, so it's easier to manage as a variable.
Changed m_f = Float64[] to m_f = Vector{Float64}(undef, length(trange)) (and the push! at the end to an assignment). If the size of the array is known beforehand (as it is in this case), it's better for performance to pre-allocate it with undef values and then assign to it.
Changed u and k also to use the scientific e-notation, for consistency and clarity (thanks to @DNF for suggesting the use of this notation in the comments).
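For readers following along with the Python version, the tolerance check has a close counterpart in the standard library. This is a sketch, assuming you want the same absolute-only comparison that isapprox performs when only atol is given; converged is a hypothetical helper name.

```python
import math


def converged(m: float, igd: float, atol: float = 1e-10) -> bool:
    # rel_tol=0.0 switches off the relative check, leaving only the absolute one
    return math.isclose(m, igd, rel_tol=0.0, abs_tol=atol)


print(converged(7.0, 7.0 + 5e-11))   # difference is within 1e-10
print(converged(7.0, 7.0 + 5e-10))   # difference is larger than 1e-10

# Switching to a relative tolerance later is a one-argument change.
print(math.isclose(7.0, 7.0 + 5e-10, rel_tol=1e-9))
```

Keeping the tolerance in a named argument makes the stopping rule of the fixed-point loop easy to adjust without touching the loop body.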

getting marginal effect post-estimation for nested logit using R mlogit package

I have estimated a nested logit model in R using the mlogit package. However, I ran into problems when trying to estimate the marginal effects. Below is the code I used.
library(mlogit)
# data
data2 = read.csv(file = "neat_num_energy.csv")
new_ener2 <- mlogit.data(
data2,
choice="alter4", shape="long",
alt.var="energy_altern",chid.var="id")
# estimate model
nest2 <- mlogit(
alter4 ~ expendmaint + expendnegy |
educ + sex + ppa_power_sp + hu_price_powersupply +
hu_2price +hu_3price + hu_9price + hu_10price +
hu_11price + hu_12price,
data = data2,
nests = list(
Trad = c('Biomas_Trad', 'Solar_Trad'),
modern = c('Biomas_Modern', 'Solar_Modern')
), unscaled=FALSE)
# create Z variable
z3 <- with(data2, data.frame(
expendnegy = tapply(expendnegy, idx(nest2,2), mean),
expendmaint= tapply(expendmaint, idx(nest2,2), mean),
educ= mean(educ),
sex = mean(sex),
hu_price_powersupply = mean(hu_price_powersupply),
ppa_power_sp = mean(ppa_power_sp),
hu_2price = mean(hu_2price),
hu_3price = mean(hu_3price),
hu_9price = mean(hu_9price),
hu_10price = mean(hu_10price),
hu_11price = mean(hu_11price),
ppa_power_sp = mean(ppa_power_sp),
hu_12price = mean(hu_12price)
))
effects(nest2, covariate = "sex", data = z3, type = "ar")
#> Error in solve.default(H, g[!fixed]) : Lapack routine dgesv: system is exactly singular: U[6,6] = 0
My data is in long format; expendmaint and expendnegy are the only alternative-specific variables, and every other variable is case-specific.
altern4 is a nominal variable representing each alternative

Calculate probability of an event not by exclusion

I have some doubts about this kind of problem. For example:
"If we asked 20,000 people in a stadium to toss a coin 10 times, what is the probability of at least one person getting 10 heads?"
I took this example from Practical Statistics for Data Scientists.
The probability of at least one person getting 10 heads is calculated as 1 - P(nobody in the stadium getting 10 heads).
So we are working by exclusion here: we first compute the probability of the complementary event rather than that of the event we actually want to measure, namely at least one person getting 10 heads.
Why do we do it this way?
How can I calculate the probability of at least one person getting 10 heads without going through the probability of no one getting 10 heads?
As @Robert Dodier mentioned in the comments, the reason is that the calculation is simpler. I will use a stadium of 20 people instead of 20,000 as an example:
Method 1:
Probability of not getting 10 heads for one individual
= 1 - probability of getting 10 heads
= 1 - 10!/(10!0!)*0.5^10*(1-0.5)^0
= 0.9990234375
Probability of at least one person in the stadium getting 10 heads
= 1 - P(of nobody in the stadium getting 10 heads)
= 1 - 0.9990234375^20 (because all coin tosses are independent)
= 0.019351109194852834
Method 2:
Probability of getting 10 heads for one individual
= 10!/(10!0!)*0.5^10*(1-0.5)^0
= 0.0009765625
Probability of exactly 1, 2, 3, etc. persons in the stadium getting 10 heads:
p1 = 20!/(1!19!)*0.0009765625^1*(1-0.0009765625)^(20-1) = 0.019172021325613825
p2 = 20!/(2!18!)*0.0009765625^2*(1-0.0009765625)^(20-2) = 0.00017803929872270904
p3 = 20!/(3!17!)*0.0009765625^3*(1-0.0009765625)^(20-3) = 1.0442187608370032e-06
p4 = 20!/(4!16!)*0.0009765625^4*(1-0.0009765625)^(20-4) = 4.338152232216289e-09
p5 = 20!/(5!15!)*0.0009765625^5*(1-0.0009765625)^(20-5) = 1.3569977656981548e-11
p6 = 20!/(6!14!)*0.0009765625^6*(1-0.0009765625)^(20-6) = 3.316221323798032e-14
p7 = 20!/(7!13!)*0.0009765625^7*(1-0.0009765625)^(20-7) = 6.483326146232712e-17
p8 = 20!/(8!12!)*0.0009765625^8*(1-0.0009765625)^(20-8) = 1.029853859983202e-19
p9 = 20!/(9!11!)*0.0009765625^9*(1-0.0009765625)^(20-9) = 1.342266353839299e-22
p10 = 20!/(10!10!)*0.0009765625^10*(1-0.0009765625)^(20-10) = 1.443297154665913e-25
p11 = 20!/(11!9!)*0.0009765625^11*(1-0.0009765625)^(20-11) = 1.2825887804726853e-28
p12 = 20!/(12!8!)*0.0009765625^12*(1-0.0009765625)^(20-12) = 9.403143551852531e-32
p13 = 20!/(13!7!)*0.0009765625^13*(1-0.0009765625)^(20-13) = 5.656451493707817e-35
p14 = 20!/(14!6!)*0.0009765625^14*(1-0.0009765625)^(20-14) = 2.7646390487330485e-38
p15 = 20!/(15!5!)*0.0009765625^15*(1-0.0009765625)^(20-15) = 1.0809927854283668e-41
p16 = 20!/(16!4!)*0.0009765625^16*(1-0.0009765625)^(20-16) = 3.3021529369146104e-45
p17 = 20!/(17!3!)*0.0009765625^17*(1-0.0009765625)^(20-17) = 7.59508466888531e-49
p18 = 20!/(18!2!)*0.0009765625^18*(1-0.0009765625)^(20-18) = 1.2373875315877011e-52
p19 = 20!/(19!1!)*0.0009765625^19*(1-0.0009765625)^(20-19) = 1.2732289258503896e-56
p20 = 20!/(20!0!)*0.0009765625^20*(1-0.0009765625)^(20-20) = 6.223015277861142e-61
Probability of at least one person in the stadium getting 10 heads
= p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + p10 +
p11 + p12 + p13 + p14 + p15 + p16 + p17 + p18 + p19 + p20
= 0.01935110919485281
So the result is the same (the tiny difference is due to floating point precision), but as you can see the first calculation is slightly simpler for 20 people, never mind for 20000 ;)
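Both methods can be checked with a few lines of Python. This is a sketch; the function names are made up for the illustration, and math.comb supplies the binomial coefficients that were written out with factorials above.

```python
from math import comb


def at_least_one_by_complement(people: int, p_single: float) -> float:
    # Method 1: 1 - P(nobody succeeds)
    return 1.0 - (1.0 - p_single) ** people


def at_least_one_by_summing(people: int, p_single: float) -> float:
    # Method 2: P(exactly 1) + P(exactly 2) + ... + P(exactly `people`)
    return sum(
        comb(people, k) * p_single ** k * (1.0 - p_single) ** (people - k)
        for k in range(1, people + 1)
    )


p_ten_heads = 0.5 ** 10   # one person tossing 10 heads in a row

print(at_least_one_by_complement(20, p_ten_heads))
print(at_least_one_by_summing(20, p_ten_heads))

# The complement form scales to the full stadium without any extra work.
print(at_least_one_by_complement(20_000, p_ten_heads))
```

The summing version is only called with 20 people here: for 20,000 it would need 20,000 terms with enormous binomial coefficients, which is exactly the effort the complement avoids.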

How to create a watchdog on a program in python?

I want to know whether it is even possible to create a watchdog on a program.
I am building a discrete event simulation of a functioning machine.
The problem is this: if I inspect my machine at, say, time = 12 (with an inspection duration of, say, 2 hours) and the failure event occurs at 13 time units, the failure cannot be handled because I am "busy inspecting".
Is there some sort of "watchdog" that constantly tests whether the value of a variable has reached a certain limit and, if so, stops what the program is doing?
Here is my inspection program
def machine_inspection(tt, R, Dpmi, Dinv, FA, TRF0, Tswitch, Trfn):
End = 0
TPM = 0
logging.debug(' cycle time %f' % tt)
TRF0 = TRF0 - Dinv
Tswitch = Tswitch - Dinv
Trfn = Trfn - Dinv
if R == 0:
if falsealarm == 1:
FA += 1
else:
tt = tt + Dpmi
TPM = 1
End = 1
return (tt, End, R, TPM, FA, TRF0, Trfn, Tswitch)
Thank you very much!
Basically, you can't be inspecting for a duration x if tt + x would exceed the time to failure, TRF0 or Trfn.
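One way to read the constraint above is that no polling is needed in a discrete event simulation: the planned end of the inspection can simply be compared with the next failure time before the clock is advanced. The following is a minimal sketch under that assumption. run_inspection and its parameters are hypothetical names, and time_to_failure is assumed here to be an absolute simulation time, which may differ from how TRF0 and Trfn are tracked in the code above.

```python
def run_inspection(tt, duration, time_to_failure):
    """Advance the clock for an inspection, stopping early if a failure comes first.

    Returns (new_time, interrupted).
    """
    finish = tt + duration
    if finish > time_to_failure:
        # The failure happens while inspecting: jump to it instead of finishing.
        return time_to_failure, True
    return finish, False


# Example from the question: inspection starts at 12, lasts 2, failure at 13.
print(run_inspection(12, 2, 13))   # (13, True)

# No failure due during the inspection window.
print(run_inspection(12, 2, 20))   # (14, False)
```

The caller can then branch on the interrupted flag to schedule the failure handling instead of the rest of the inspection.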

Fortran error: Program received signal SIGSEGV: Segmentation fault - invalid memory reference

I'm trying to run an ocean temperature model for 25 years using the explicit method (parabolic differential equation).
If I run it for one year (a = 3600) or five years (a = 18000), it works fine.
However, when I run it for 25 years (a = 90000), it crashes.
a is the number of time steps used, and a year is taken to be 360 days. The time step is 4320 seconds (delta_t = 4320.).
Here is my code:
program task
!declare the variables
implicit none
! initial conditions
real,parameter :: initial_temp = 4.
! vertical resolution (delta_z) [m], vertical diffusion coefficient (av) [m^2/s], time step delta_t [s]
real,parameter :: delta_z = 2., av = 2.0E-04, delta_t = 4320.
! gamma
real,parameter :: y = (av * delta_t) / (delta_z**2)
! horizontal resolution (time) total points
integer,parameter :: a = 18000
!declaring vertical resolution
integer,parameter :: k = 101
! declaring pi
real, parameter :: pi = 4.0*atan(1.0)
! t = time [s], temp_a = temperature at upper boundary [°C]
real,dimension(0:a) :: t
real,dimension(0:a) :: temp_a
real,dimension(0:a,0:k) :: temp
integer :: i
integer :: n
integer :: j
t(0) = 0
do i = 1,a
t(i) = t(i-1) + delta_t
end do
! temperature of upper boundary
temp_a = 12. + 6. * sin((2. * t * pi) / 31104000.)
temp(:,0) = temp_a(:)
temp(0,1:k) = 4.
! Vertical resolution
do j = 1,a
do n = 1,k
temp(j,n) = temp(j-1,n) + (y * (temp(j-1,n+1) - (2. * temp(j-1,n)) + temp(j-1,n-1)))
end do
temp(:,101) = temp(:,100)
end do
print *, temp(:,:)
end program task
The variable a is on line 11 (integer,parameter :: a = 18000)
As said, a = 18000 works, a = 90000 doesn't.
At 90000 I get:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
RUN FAILED (exit value 1, total time: 15s)
I'm using Fortran on Windows 8.1 with NetBeans and Cygwin (which has gfortran built in).
I'm not sure whether this problem is caused by a bad compiler or by something else.
Does anybody have any ideas about this? It would help me a lot!
Regards
Take a look at the following lines from your code:
integer,parameter :: k = 101
real,dimension(0:a,0:k) :: temp
integer :: n
do n = 1,k
temp(j,n) = temp(j-1,n) + (y * (temp(j-1,n+1) - (2. * temp(j-1,n)) + temp(j-1,n-1)))
end do
The second dimension of your array temp has bounds 0:101. You loop n from 1 to 101, and in iteration n=101 you access temp(j-1,102), which is out of bounds.
This means you are reading whatever memory lies beyond temp. While this always makes your program incorrect, it only causes a crash some of the time, depending on various other things. Increasing a triggers it because Fortran stores arrays in column-major order: the first index (0:a) is contiguous in memory and the second index (0:k) is strided by a+1 elements. As a increases, the out-of-bounds access in the second dimension lands further beyond the end of temp, changing which memory the invalid access touches.
After your loop you set temp(:,101) = temp(:,100) meaning there is no need to calculate temp(:,101) in the above loop, so you can change its loop bounds from
do n = 1,k
to
do n = 1, k-1
which will fix the out of bounds access on temp.
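The corrected loop bounds can be illustrated with a small Python sketch of the same explicit scheme. It is not a translation of the program above: to stay short it keeps only the latest depth column instead of the full (0:a, 0:k) array, and the names step, run and surface_temp are invented for the example. The constants are taken from the question.

```python
import math

DELTA_Z = 2.0                            # vertical resolution [m]
AV = 2.0e-4                              # vertical diffusion coefficient [m^2/s]
DELTA_T = 4320.0                         # time step [s]
GAMMA = AV * DELTA_T / DELTA_Z ** 2      # same quantity as y in the Fortran code
K = 101                                  # depth indices run from 0 to K


def surface_temp(t):
    # Upper boundary condition from the question
    return 12.0 + 6.0 * math.sin(2.0 * math.pi * t / 31104000.0)


def step(prev, surface, gamma=GAMMA):
    """Advance one depth column (indices 0..k) by one time step."""
    k = len(prev) - 1
    new = prev[:]
    new[0] = surface
    for n in range(1, k):                # interior points 1..k-1, so n+1 never exceeds k
        new[n] = prev[n] + gamma * (prev[n + 1] - 2.0 * prev[n] + prev[n - 1])
    new[k] = new[k - 1]                  # bottom boundary, as in temp(:,101) = temp(:,100)
    return new


def run(steps):
    column = [surface_temp(0.0)] + [4.0] * K
    for j in range(1, steps + 1):
        column = step(column, surface_temp(j * DELTA_T))
    return column


final = run(100)
print(final[:5])
```

Because the interior loop stops at k-1, the stencil's n+1 neighbour is always a valid index, and the last point is filled by the boundary copy alone.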
