mgcv::gam, Error in names(dat) <- object$term : 'names' attribute [2] must be the same length as the vector [1]

I want to fit a hierarchical GAM with the gam function from the mgcv package. I used the same form of model in brms without problems and will eventually re-run it there, but the deadline for an abstract submission is Sunday, so I want to try the model in mgcv to get results more quickly.
My formula:
f = MDS1 ~ 1 + exposed + s(YEAR,bs = "tp")+ s(LEVEL, bs = "tp") +
t2(YEAR, SITE, bs = c("tp","re")) + s(INTERTIDAL_TRANSECT, bs = "re",
m = 1)
My data:
Classes ‘data.table’ and 'data.frame': 3992 obs. of 9 variables:
$ unique_id : chr "Babb's Cove-1-0-1988" "Babb's Cove-1-0-1989" "Babb's Cove-1-0-1990" "Babb's Cove-1-0-1992" ...
$ MDS1 : num -0.607 -0.607 -0.607 -0.607 -0.607 ...
$ MDS2 : num 0.19 0.19 0.19 0.19 0.19 ...
$ MDS3 : num 0.36 0.36 0.36 0.36 0.36 ...
$ SITE : chr "Babb's Cove" "Babb's Cove" "Babb's Cove" "Babb's Cove" ...
$ INTERTIDAL_TRANSECT: Factor w/ 21 levels "1","2","5","7",..: 1 1 1 1 1 1 1 1 1 1 ...
$ LEVEL : num 0 0 0 0 0 0 0 1 1 1 ...
$ YEAR : num 1988 1989 1990 1992 1994 ...
$ exposed : Factor w/ 2 levels "1","2": 2 2 2 2 2 2 2 2 2 2 ...
- attr(*, ".internal.selfref")=<externalptr>
- attr(*, "sorted")= chr "unique_id"
I have 2 questions:
a)
When I try to fit the model with fit_count <- gam(f, data = count_merge, method = "REML", family = gaussian()) I get :
Error in names(dat) <- object$term :
'names' attribute [2] must be the same length as the vector [1]
I think it's something with the t2() argument of the formula.
b)
I usually run GAM with brms and my formula for that model was :
MDS1 ~ 1 + exposed + s(YEAR,bs = "tp")+ s(LEVEL, bs = "tp") + t2(YEAR, SITE, bs = c("tp","re"), full = T) +(1|r|INTERTIDAL_TRANSECT),
family = gaussian()
Is my way to adapt the formula to mgcv::gam OK?

Q a)
Your SITE variable is a character vector, but it needs to be a factor for the "re" marginal basis in the t2() term.
Q b)
That looks OK, but you don't need the m = 1 in the s(INTERTIDAL_TRANSECT, bs = "re") term.
You should also use the full = TRUE option on the t2() term if you want the parameterisation to be the same between your {brms} call and the {mgcv} one.


OpenMDAO Dymos: How to run kinematic optimization with initial and final state values that depend on design variables?

The image above shows the kinematic optimization problem statement that I'm trying to implement. The initial and final state values are directly proportional to design variables.
And the following are the boundary constraints for the final joint positions:
0 <= xB <= 0.6*lo + d
0 <= yB <= 0.9*b
I believe this solves the problem without the need for integration, only relying on the geometric constraints of the system. Setting the constants d, b, and L0 to the appropriate values should let you find your particular solution.
import openmdao.api as om
import numpy as np
d = 1.0
b = 1.0
L0 = 1.0
p = om.Problem()
exec_1 = om.ExecComp('beta = arccos(0.5 * b - d_ab * sin(theta) / d_bc)')
exec_2 = om.ExecComp(['x_c = d_ab * cos(theta) + d_bc * cos(beta)',
'x_b = d_ab * cos(theta)',
'y_b = d_ab * sin(theta)'])
p.model.add_subsystem('exec_1', exec_1,
promotes_inputs=['b', 'd_ab', 'theta', 'd_bc'],
promotes_outputs=['beta'])
p.model.add_subsystem('exec_2', exec_2,
promotes_inputs=['d_ab', 'theta', 'd_bc', 'beta'],
promotes_outputs=['x_c', 'x_b', 'y_b'])
p.model.add_design_var('theta', lower=np.radians(45), upper=np.radians(360))
p.model.add_design_var('d_ab', lower=0.05, upper=0.9)
p.model.add_design_var('d_bc', lower=0.05)
p.model.add_constraint('x_b', lower=0.0, upper=d + 0.6 * L0)
p.model.add_constraint('y_b', lower=0.0, upper=0.9 * b)
p.model.add_constraint('x_c', equals=d + 0.6 * L0)
p.model.add_objective('beta', scaler=-1)
p.driver = om.pyOptSparseDriver(optimizer='IPOPT')
p.driver.opt_settings['print_level'] = 5
# p.driver = om.ScipyOptimizeDriver()
p.setup()
p.set_val('theta', np.pi/4)
p.run_driver()
Which gives
Optimization Problem -- Optimization using pyOpt_sparse
================================================================================
Objective Function: _objfunc
Solution:
--------------------------------------------------------------------------------
Total Time: 0.0620
User Objective Time : 0.0077
User Sensitivity Time : 0.0216
Interface Time : 0.0105
Opt Solver Time: 0.0221
Calls to Objective Function : 9
Calls to Sens Function : 9
Objectives
Index Name Value
0 exec_1.beta -1.268661E+00
Variables (c - continuous, i - integer, d - discrete)
Index Name Type Lower Bound Value Upper Bound Status
0 theta_0 c 7.853982E-01 9.733898E-01 6.283185E+00
1 d_ab_0 c 5.000000E-02 9.000000E-01 9.000000E-01 u
2 d_bc_0 c 5.000000E-02 3.675735E+00 1.000000E+30
Constraints (i - inequality, e - equality)
Index Name Type Lower Value Upper Status Lagrange Multiplier (N/A)
0 exec_2.x_c e 1.600000E+00 1.600000E+00 1.600000E+00 9.00000E+100
1 exec_2.x_b i 0.000000E+00 5.062501E-01 1.600000E+00 9.00000E+100
2 exec_2.y_b i 0.000000E+00 7.441175E-01 9.000000E-01 9.00000E+100
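As a quick sanity check, the reported optimum can be plugged back into the same geometric relations without OpenMDAO. The sketch below is only an illustration: the function name evaluate_geometry is hypothetical, and the numeric values are copied from the optimizer output above with d = b = L0 = 1.0 as in the script.

```python
import math

def evaluate_geometry(theta, d_ab, d_bc, b):
    """Evaluate the same expressions as exec_1 and exec_2 (hypothetical helper)."""
    beta = math.acos(0.5 * b - d_ab * math.sin(theta) / d_bc)
    x_b = d_ab * math.cos(theta)
    y_b = d_ab * math.sin(theta)
    x_c = x_b + d_bc * math.cos(beta)
    return beta, x_b, y_b, x_c

# Constants as set in the script above.
d, b, L0 = 1.0, 1.0, 1.0

# Design variable values copied from the printed solution.
beta, x_b, y_b, x_c = evaluate_geometry(
    theta=9.733898e-01, d_ab=9.0e-01, d_bc=3.675735, b=b
)

print(f"beta = {beta:.6f}")
print(f"x_b  = {x_b:.6f} (allowed: 0 .. {d + 0.6 * L0})")
print(f"y_b  = {y_b:.6f} (allowed: 0 .. {0.9 * b})")
print(f"x_c  = {x_c:.6f} (target: {d + 0.6 * L0})")
```

If the printed values agree with the table above (to the precision shown there), the constraints are satisfied by geometry alone, which is the point of the answer.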

MDP Policy Plot for a Maze

I have a 5x5 maze specified as follows.
r = [1 0 1 1 1
1 1 1 0 1
0 1 0 0 1
1 1 1 0 1
1 0 1 0 1];
Where 1's are the paths and 0's are the walls.
Assume I have a function foo(policy_vector, r) that maps the elements of the policy vector to the elements in r. For example 1=UP, 2=Right, 3=Down, 4=Left. The MDP is set up such that the wall states are never realized so policies for those states are ignored in the plot.
policy_vector' = [3 2 2 2 3 2 2 1 2 3 1 1 1 2 3 2 1 4 2 3 1 1 1 2 2]
symbols' = [v > > > v > > ^ > v ^ ^ ^ > v > ^ < > v ^ ^ ^ > >]
I am trying to display my policy decisions for a Markov Decision Process in the context of solving a maze. How would I plot something that looks like this? MATLAB is preferable, but Python is fine.
Even if somebody could just show me how to make a plot like this, I would be able to figure it out from there.
function [] = policy_plot(policy, r)
    [row, col] = size(r);
    symbols = {'^', '>', 'v', '<'};
    policy_symbolic = get_policy_symbols(policy, symbols);
    figure()
    hold on
    axis([0, col, 0, row])   % x spans the columns, y spans the rows
    grid on
    cnt = 1;
    fill([0,0,col,col],[row,0,0,row],'k')   % black background: the walls
    for rr = row:-1:1
        for cc = 1:col
            % green cell plus policy symbol for every path cell except the goal
            if r(row+1 - rr,cc) ~= 0 && ~(row == row+1 - rr && col == cc)
                fill([cc-1,cc-1,cc,cc],[rr,rr-1,rr-1,rr],'g')
                text(cc - 0.55,rr - 0.5,policy_symbolic{cnt})
            end
            cnt = cnt + 1;
        end
    end
    % after the loops rr = 1 and cc = col, i.e. the bottom-right (goal) cell
    fill([cc-1,cc-1,cc,cc],[rr,rr-1,rr-1,rr],'b')
    text(cc - 0.70,rr - 0.5,'Goal')

function [policy_symbolic] = get_policy_symbols(policy, symbols)
    % map each policy index (1..4) to its arrow symbol
    policy_symbolic = cell(size(policy));
    for ii = 1:length(policy)
        policy_symbolic{ii} = symbols{policy(ii)};
    end
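Since Python was also acceptable to the asker, here is a minimal text-only sketch of the same mapping. It is not a plot and not a port of the MATLAB figure; it assumes, as the MATLAB loop does, that the policy vector is ordered row by row from the top-left cell and that the goal is the bottom-right cell. The names policy_grid and SYMBOLS are made up for this illustration.

```python
# Maze from the question: 1 = path, 0 = wall.
r = [
    [1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
]
policy = [3, 2, 2, 2, 3, 2, 2, 1, 2, 3, 1, 1, 1,
          2, 3, 2, 1, 4, 2, 3, 1, 1, 1, 2, 2]

# 1 = up, 2 = right, 3 = down, 4 = left (as in the question).
SYMBOLS = {1: "^", 2: ">", 3: "v", 4: "<"}

def policy_grid(policy, maze, goal=None):
    """Return a grid of characters: '#' for walls, 'G' for the goal,
    and an arrow for every other path cell (row-major policy order assumed)."""
    rows, cols = len(maze), len(maze[0])
    if goal is None:
        goal = (rows - 1, cols - 1)  # assumption: goal is bottom-right
    grid = []
    for i in range(rows):
        line = []
        for j in range(cols):
            if maze[i][j] == 0:
                line.append("#")  # policy entries for walls are ignored
            elif (i, j) == goal:
                line.append("G")
            else:
                line.append(SYMBOLS[policy[i * cols + j]])
        grid.append(line)
    return grid

grid = policy_grid(policy, r)
for line in grid:
    print(" ".join(line))
```

The resulting grid can be handed to any plotting routine that draws one coloured cell and one text label per entry.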

How to extract data from .txt table with a for loop

I want to open Learn_full_data.txt, extract some rows from it, and write them to a new file called All_Data.txt using a for loop.
Learn_full_data.txt Table:
vp run trial img_order mimg perc_aha_norm perc_gen_norm moon_onset moon_pulse moon_pulse_time answer_time answer_pulse answer_pulse_time fix_time fix_pulse fixpulse_time flash_onset flash_pulse flash_pulse_time_(flash_onset) tar_time_(greyscale) tar_pulse tarpulse_time answer RT_answer aha RT_aha condition solved_testphase RT_solvedtest oldnew RT_oldnew remknow RT_remknow
1 1 1 70 mimg433 0,4375 0,5625 18066 6 20029 20083 7 22029 22099 8 24029 24116 8 24029 24633 10 28029 nicht_erkannt 1055 Aha 1145 exp 0 0 old 2030 know 381
1 1 2 146 mimg665 0,6 0,4 30666 12 32029 32683 13 34029 34699 16 40028 40716 16 40028 41233 18 44028 erkannt 990 keinAha 1240 exp 1 2758 old 634 rem 1063
2 1 1 130 mimg640 0,666667 1 17366 5 19328 19383 6 21328 21399 8 25328 25416 8 25328 25933 10 29328 erkannt 871 keinAha 2121 base 1 2891 old 3105 know 533
2 1 2 83 mimg500 0,454545 0,272727 33966 13 35328 35983 14 37328 37999 15 39328 40016 15 39328 40533 17 43328 nicht_erkannt 1031 Aha 1153 exp 0 0 new 2358 kA 2358
The column vp contains two subjects, so I created a list with the subjects from that column (there are many more; I've only pasted an excerpt):
list = ['1','2']
Now I want to iterate over the list with this code (if the item in the list matches Vp, then write some columns of that row from Learn_full_data.txt to All_Data.txt):
Learn = open('Learn_full_data.txt','r')
file = open('All_Data.txt','w')
file.write('Vp\tImg\tDescription\tPerc_gen_norm\tPerc_aha_norm\tCond\tGen\tRt_Gen\tRt_Solved\tInsight\tRt_Insight\tOldNew\tRt_OldNew\tRemKnow\tRt_RemKnow\n')
for i in list:
    for splitted in Learn:
        splitted = splitted.split()
        Vp = splitted[0]
        Img = str(splitted[4])
        Perc_gen_norm = splitted[6]
        Perc_aha_norm = splitted[5]
        Cond = splitted[26]
        Gen = splitted[22]
        Rt_Gen = splitted[23]
        Insight = splitted[24]
        Rt_Insight = splitted[25]
        Rt_Solved = splitted[28]
        OldNew = splitted[29]
        Rt_OldNew = splitted[30]
        RemKnow = splitted[31]
        Rt_Remknow = splitted[32]
        if i == str(Vp):
            file.write(str(Vp)+'\t'+str(Img)+'\t'+'Description'+'\t'+str(Perc_gen_norm)+'\t'+str(Perc_aha_norm)+'\t'+str(Cond)+'\t'+str(Gen)+'\t'+str(Rt_Gen)+'\t'+str(Insight)+'\t'+str(Rt_Insight)+'\t'+str(Rt_Solved)+'\t'+str(OldNew)+'\t'+str(Rt_OldNew)+'\t'+str(RemKnow)+'\t'+str(Rt_Remknow)+'\n')
The code's output contains only the first iteration over the list. I was expecting it to continue iterating:
Vp Img Description Perc_gen_norm Perc_aha_norm Cond Gen Rt_Gen Rt_Solved Insight Rt_Insight OldNew Rt_OldNew RemKnow Rt_RemKnow
1 mimg433 Description 0,5625 0,4375 exp nicht_erkannt 1055 Aha 1145 0 old 2030 know 381
1 mimg665 Description 0,4 0,6 exp erkannt 990 keinAha 1240 2758 old 634 rem 1063
The second iteration designated on the list doesn't happen. The second item of the list is '2' and the Vp item is also '2', so the second iteration should return the same for Vp '2' as it did for Vp '1'. Why does the for loop stop in Vp '1'?
The problem is that you iterate through all the lines of the file during the first iteration of your for i in list loop. In the second iteration, i.e. i = '2', the read cursor is still at the end of the file, so the inner loop has nothing left to read. You have to reset it to the first line in each iteration. This can be done with Learn.seek(0):
for i in list:
    Learn.seek(0)  # rewind to the start of the file for each subject
    for splitted in Learn:
        # split('\t') keeps the trailing newline on the last field
        splitted = splitted.split('\t')
        Vp = splitted[0]
        Img = str(splitted[4])
        Perc_gen_norm = splitted[6]
        Perc_aha_norm = splitted[5]
        Cond = splitted[26]
        Gen = splitted[22]
        Rt_Gen = splitted[23]
        Insight = splitted[24]
        Rt_Insight = splitted[25]
        Rt_Solved = splitted[28]
        OldNew = splitted[29]
        Rt_OldNew = splitted[30]
        RemKnow = splitted[31]
        Rt_Remknow = splitted[32]
        if i == str(Vp):
            file.write(str(Vp)+'\t'+str(Img)+'\t'+'Description'+'\t'+str(Perc_gen_norm)+'\t'+str(Perc_aha_norm)+'\t'+str(Cond)+'\t'+str(Gen)+'\t'+str(Rt_Gen)+'\t'+str(Insight)+'\t'+str(Rt_Insight)+'\t'+str(Rt_Solved)+'\t'+str(OldNew)+'\t'+str(Rt_OldNew)+'\t'+str(RemKnow)+'\t'+str(Rt_Remknow))
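An alternative to rewinding the file for every subject is to read it once and test each row against the whole set of wanted subjects. The sketch below illustrates that idea under stated assumptions: select_rows is a hypothetical helper, and the three-column sample stands in for Learn_full_data.txt. Note that rows then come out in file order rather than grouped by subject, which is a different ordering from the nested loop.

```python
import io

def select_rows(lines, wanted, columns):
    """Single pass over the file: keep rows whose first field is in `wanted`
    and return only the fields at the given column indices."""
    rows = iter(lines)
    next(rows, None)  # skip the header line
    selected = []
    for line in rows:
        fields = line.split()
        if fields and fields[0] in wanted:
            selected.append([fields[i] for i in columns])
    return selected

# Hypothetical three-column stand-in for Learn_full_data.txt.
sample = io.StringIO(
    "vp\timg\tcond\n"
    "1\tmimg433\texp\n"
    "2\tmimg640\tbase\n"
    "3\tmimg500\texp\n"
)

result = select_rows(sample, {"1", "2"}, [0, 1, 2])
for row in result:
    print("\t".join(row))
```

With a real file, an open file object can be passed in place of the StringIO, ideally opened in a with block so it is closed afterwards.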

Understanding direct self-reference in Haskell

I just started learning Haskell a few hours ago and am trying to understand what this definition of the Fibonacci sequence does:
fibs = 0 : 1 : next fibs
  where
    next (a : t@(b:_)) = (a+b) : next t
The next function is strange to me: it will eventually get some "invalid" input. At first it goes like this:
next (0:1) = (0+1) : next [1]
but then next ([1]) cannot be evaluated, since t@(b:_) has nothing to match. So how does next work?
My second confusion is fibs itself. Since it is supposed to be the Fibonacci sequence, I assume we get fibs = 0 : 1 : 1 : next fibs after the first step, but then we need to compute next([0, 1, 1]), which gives (0+1) : next([1, 1]) == 1 : next([1, 1]). So the first value produced by next fibs will be 1 again, and attaching this 1 to the original fibs gives 0 : 1 : 1 : 1, which is not the Fibonacci sequence.
I think I have misunderstood something, so how does it actually work?
The standard way to define the result of a recursive definition is to approximate that value, starting from undefined and unfolding the recursion from there as follows:
-- A function describing the recursion
f x = 0 : 1 : next x

fibs0 = undefined

fibs1 = f fibs0 = 0 : 1 : next undefined
                  -- next requires at least 2 elements
                = 0 : 1 : undefined

fibs2 = f fibs1 = 0 : 1 : next fibs1
                = 0 : 1 : next (0 : 1 : undefined)
                = 0 : 1 : 1 : next (1 : undefined)
                  -- next requires at least 2 elements
                = 0 : 1 : 1 : undefined

fibs3 = f fibs2 = 0 : 1 : next fibs2
                = 0 : 1 : next (0 : 1 : 1 : undefined)
                = 0 : 1 : 1 : next (1 : 1 : undefined)
                = 0 : 1 : 1 : 2 : next (1 : undefined)
                  -- next requires at least 2 elements
                = 0 : 1 : 1 : 2 : undefined

fibs4 = f fibs3 = 0 : 1 : next fibs3
                = 0 : 1 : next (0 : 1 : 1 : 2 : undefined)
                ...
If we keep going, we approach the full sequence "at the limit", approximating it step by step. This informal argument can be formally justified through Kleene's fixed-point theorem.
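The chain of approximations can also be mimicked in a strict language. The Python sketch below is only a model, not Haskell semantics: a partial list is represented by its known prefix (everything after it stands for undefined), and the names next_known, f and approximations are invented for this illustration.

```python
def next_known(known):
    """Model of `next` on a partial list: `known` is the defined prefix.
    Each output element needs two adjacent defined inputs."""
    return [a + b for a, b in zip(known, known[1:])]

def f(known):
    """One unfolding of the recursion: f x = 0 : 1 : next x."""
    return [0, 1] + next_known(known)

def approximations(n):
    """Known prefixes of fibs0 .. fibs<n>; fibs0 = undefined, i.e. nothing known."""
    approx = [[]]
    for _ in range(n):
        approx.append(f(approx[-1]))
    return approx

for k, prefix in enumerate(approximations(4)):
    print(f"fibs{k}: known prefix {prefix}, rest undefined")
```

Each step defines exactly one more element than the previous one, which is why the limit of the chain is the whole infinite sequence.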
The next function actually generates fibs itself, so it never calls next [0, 1, 1]; it calls next (0 : 1 : 1 : next rest).
Here is what is going on, in pictures:
fib = 0 : 1 : not-yet-evaluated-part
fib = 0 : 1 : [+] : not-yet-evaluated-part
      ^   ^    |
      *---*----*    (applying next to fib)
fib = 0 : 1 : 1 : [+] : not-yet-evaluated-part
          ^   ^    |
          *---*----*    (next calls itself)
fib = 0 : 1 : 1 : 2 : [+] : not-yet-evaluated-part
              ^   ^    |
              *---*----*    (etc)
next is the "railroad-building train": it requires some initial rails to run on (0 : 1 : ...).
Because no [] end-of-list marker is placed anywhere in the fibs list, the list is infinite.
However, I recommend starting with less obscure things; for instance, try to understand lists on their own first.

constructing an identifier string for each row in data

I have the following data:
library(data.table)
d = data.table(a = c(1:3), b = c(2:4))
and would like to get this result (in a way that would work with arbitrary number of columns):
d[, c := paste0('a_', a, '_b_', b)]
d
# a b c
#1: 1 2 a_1_b_2
#2: 2 3 a_2_b_3
#3: 3 4 a_3_b_4
The following works, but I'm hoping to find something shorter and more legible.
d = data.table(a = c(1:3), b = c(2:4))
d[, c := apply(mapply(paste, names(.SD), .SD, MoreArgs = list(sep = "_")),
1, paste, collapse = "_")]
One way, only slightly cleaner:
d[, c := apply(d, 1, function(x) paste(names(d), x, sep="_", collapse="_")) ]
a b c
1: 1 2 a_1_b_2
2: 2 3 a_2_b_3
3: 3 4 a_3_b_4
Here is an approach using do.call('paste') that requires only a single call to paste.
I will benchmark on a situation where the columns are integers (as this seems a more sensible test case).
N <- 1e4
d <- setnames(as.data.table(replicate(5, sample(N), simplify = FALSE)), letters[seq_len(5)])
f5 <- function(d){
  l <- length(d)
  # interleave names and columns: 1, l+1, 2, l+2, ...
  o <- c(1L, l + 1L) + rep(seq_len(l) - 1L, each = 2L)
  do.call('paste', c((c(as.list(names(d)), d))[o], sep = '_'))
}
microbenchmark(f1(d), f2(d),f5(d))
Unit: milliseconds
expr min lq median uq max neval
f1(d) 41.51040 43.88348 44.60718 45.29426 52.83682 100
f2(d) 193.94656 207.20362 210.88062 216.31977 252.11668 100
f5(d) 30.73359 31.80593 32.09787 32.64103 45.68245 100
To avoid looping through rows, you can use this:
do.call(paste, c(lapply(names(d), function(n)paste0(n,"_",d[[n]])), sep="_"))
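The idea behind this one-liner is independent of R: first label every column as a whole, then join the labelled columns element by element, so no per-row function call is needed. A small Python sketch of that idea follows; row_ids is a made-up name, and a plain dict of lists stands in for the data.table.

```python
def row_ids(columns):
    """columns: dict mapping column name -> list of values (all the same length).
    Returns one identifier string per row."""
    # Step 1: label each column as a whole, e.g. ["a_1", "a_2", "a_3"].
    labelled = [[f"{name}_{value}" for value in values]
                for name, values in columns.items()]
    # Step 2: join the labelled columns element-wise, one string per row.
    return ["_".join(parts) for parts in zip(*labelled)]

d = {"a": [1, 2, 3], "b": [2, 3, 4]}
ids = row_ids(d)
print(ids)
```

This works for any number of columns because both steps iterate over the column collection rather than naming columns explicitly.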
Benchmarking:
N <- 1e4
d <- data.table(a=runif(N),b=runif(N),c=runif(N),d=runif(N),e=runif(N))
f1 <- function(d)
{
do.call(paste, c(lapply(names(d), function(n)paste0(n,"_",d[[n]])), sep="_"))
}
f2 <- function(d)
{
apply(d, 1, function(x) paste(names(d), x, sep="_", collapse="_"))
}
require(microbenchmark)
microbenchmark(f1(d), f2(d))
Note: f2 is inspired by @Ricardo's answer.
Results:
Unit: milliseconds
expr min lq median uq max neval
f1(d) 195.8832 213.5017 216.3817 225.4292 254.3549 100
f2(d) 418.3302 442.0676 451.0714 467.5824 567.7051 100
Edit note: the previous benchmark with N <- 1e3 didn't show much difference in times. Thanks again @eddi.
