Mysterious syntax error [...] near "data" in JAGS

When running my JAGS model I get the following error message:
Error parsing model file: syntax error on line 5 near "data"
In brief, what I have is two response variables (cmax and cmd) and a predictor variable (dbh). My idea was to estimate the correlation between cmax and cmd that is not explained by dbh, since the simple correlations among these variables are trivial.
Below is my code:
# Model input
dat = list(N = nrow(data),  # number of observations
           data = as.matrix(log(data[, c("dbh", "cmax", "cmd")])),  # log-transform variables
           T = diag(2)/1000,  # var-covar matrix for non-informative priors
           r = 2,             # number of variables
           m = c(0, 0))       # means for non-informative priors
inits = list(P=diag(2)/1000, A=c(0,0), B=c(0,0))
# JAGS model
cat("model{
### Likelihood
for(i in 1:N){
M[i,1:r] <- A[1:r] + B[1:r]*data[i,1]
data[i,2:3] ~ dmnorm(M[i,1:r],P[1:r,1:r])
}
### Priors
P[1:r,1:r] ~ dwish(T,r)
A[1:r] ~ dmnorm(m,T)
B[1:r] ~ dmnorm(m,T)
### Statistics
V <- inverse(P)
sigmaH <- sqrt(V[1,1])
sigmaW <- sqrt(V[2,2])
covHW <- V[1,2]
corHW <- covHW/(sigmaH*sigmaW)
}",
file="Ch1/BM2.txt")
# Run JAGS
res = jags.model(file="Ch1/BM2.txt", data=dat, inits=inits, n.chains=1,
n.adapt=500)

So, when I learned that a JAGS model file can contain a "data" block in addition to the "model" block, I began to suspect that calling my input object "data" was not the cleverest idea: "data" is effectively a reserved word in the model file. After renaming this object everything worked fine.
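For reference, a minimal sketch of the renamed setup; the names d (for the original dataframe) and obs (for the matrix passed to JAGS) are mine, chosen only to avoid the reserved word, and every occurrence of data inside the model string is renamed accordingly:
# Rename the input so nothing in the model file is called "data"
d <- data  # d is assumed to be the original dataframe
dat <- list(N = nrow(d),
            obs = as.matrix(log(d[, c("dbh", "cmax", "cmd")])),  # log-transformed variables
            T = diag(2)/1000,
            r = 2,
            m = c(0, 0))
# Inside the model, for example:
#   M[i,1:r] <- A[1:r] + B[1:r]*obs[i,1]
#   obs[i,2:3] ~ dmnorm(M[i,1:r], P[1:r,1:r])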

Related

Log transformed data in GAM, how to plot response?

I used log-transformed data (dependent variable = count) in my generalised additive model (using mgcv) and tried to plot the response using "trans = plogis", as for logistic GAMs, but the results don't seem right. Am I forgetting something here? When I used linear models for my data first, I plotted the least-squares means. Any idea how I could plot the output of my GAMs in a more interpretable way other than on the log scale?
Cheers
Are you running a logistic regression for count data? Logistic regression is normally used for a binary variable or a proportion of binary outcomes.
That being said, the real question here is how to back-transform a response that was fitted on the log scale back to the original scale for plotting. That can easily be done using the itsadug package. I've simulated some silly data here just to show the code required.
With itsadug, you can visually inspect many aspects of GAM models. I'd encourage you to look at this: https://cran.r-project.org/web/packages/itsadug/vignettes/inspect.html
The transform argument of plot_smooth() can also be used with custom functions written in R. This can be useful if you have both centred and logged a dependent variable.
library(mgcv)
library(itsadug)
# Setting seed so it's reproducible
set.seed(123)
# Generating 50 samples from a uniform distribution
x <- runif(50, min = 20, max = 50)
# Taking the sin of x to create a dependent variable
y <- sin(x)
# Binding them to a dataframe
d <- data.frame(x, y)
# Logging the dependent variable after adding a constant, to avoid taking the log of non-positive values
d$log_y <- log(d$y + 1)
# Fitting a GAM to the transformed dependent variable
model_fit <- gam(log_y ~ s(x), data = d)
# Using the plot_smooth function from itsadug to back-transform to the original y scale
plot_smooth(model_fit, view = "x", transform = exp)
You can specify the trans function for back-transforming as trans = function(x){ exp(coef(gam)[1] + x) }, where gam is your fitted model and coef(gam)[1] is the intercept.
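For a single smooth, the same idea also works with mgcv's own plot() method; a minimal sketch using the model_fit object from the simulated example above (select, shift and trans are standard plot.gam arguments):
# Plot the smooth of x with the intercept added back (shift) and then exponentiated (trans),
# so the y-axis is on the original, un-logged scale
plot(model_fit, select = 1, shift = coef(model_fit)[1], trans = exp)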

Error: "Slicer stuck at value with infinite density" running binomial-beta model in JAGS

I'm trying to run a binomial-beta model in JAGS (see example code below). I keep getting the error:
Error: The following error was encountered while attempting to run the JAGS model:
Error in node a0
Slicer stuck at value with infinite density
which I am struggling to make sense of. I thought perhaps the initial conditions were sending the beta distribution into infinite regions of parameter space but after some investigation that doesn't seem to be the case.
Any thoughts on what this error means or how to adjust the code to accommodate it?
I've put my code below along with some made up sample data. This is the kind of data I might expect in my dataset.
#Generate some sample data
counts = c(80,37,10,43,55,23,53,100,7,11)
n = c(100,57,25,78,55,79,65,100,9,11)
consp = c(1.00, 0.57, 0.25, 0.78, 0.55, 0.79, 0.65, 1.00, 0.09, 0.11)
treat = c(0.5,0.5,0.2,0.9,0.5,0.2,0.5,0.9,0.5,0.2)
#Model spec
model1.string <-"model{
for (i in 1:length(counts)){
counts[i] ~ dbin(p[i],n[i])
p[i] ~ dbeta( ( mu[i] * theta[i]) , ((1-mu[i])*theta[i]))
mu[i] <- ilogit(m0 + m1*consp[i] + m2*treat[i])
theta[i] <- exp(k0 + k1*consp[i])
}
m0 ~ dnorm(0, 1)
m1 ~ dnorm(0, 1)
m2~ dnorm(-1, 1)
k0 ~ dnorm(1, 1)
k1 ~ dnorm(0, 1)
}"
#Specify number of chains
chains=5
#Generate initial conditions
inits = replicate(chains, list(m0 = runif(1, 0.05, 0.25),
                               m1 = runif(1, 0, 0.2),
                               m2 = runif(1, -1, 0),
                               k0 = runif(1, 0.5, 1.5),
                               k1 = runif(1, 0, 0.3)),
                  simplify = FALSE)
#Run
model1.spec<-textConnection(model1.string)
results <- autorun.jags(model1.string,
                        data = list(counts = counts,
                                    n = n,
                                    consp = consp,
                                    treat = treat),
                        startsample = 10000,
                        startburnin = 5000,
                        psrf.target = 1.02,
                        n.chains = 5,
                        monitor = c("m0", "m1", "m2", "k0", "k1"),
                        inits = inits)
The slice sampler (which JAGS uses here) doesn't work when the probability density of the sampled variable is infinite at a point. This can happen with the Beta distribution at 0 or 1, whenever the corresponding shape parameter is less than 1.
A work around is to truncate the node that creates the problem, as in:
p[i] ~ dbeta( ( mu[i] * theta[i]) , ((1-mu[i])*theta[i])) I(0.001,0.999)
(I don't really remember the syntax, but JAGS definitely allows truncated random variables.)
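For reference, JAGS writes truncation as T(lower, upper) appended to the stochastic statement (the I(,) form is WinBUGS notation), so the work-around would look roughly like:
# Keep p[i] away from 0 and 1, where the Beta density can become infinite
p[i] ~ dbeta(mu[i] * theta[i], (1 - mu[i]) * theta[i]) T(0.001, 0.999)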

Error in caret-svm - "NAs are not allowed in subscripted assignments"

Hello experts. I am a beginner in R. I am trying to use an SVM in caret for classification; the kernel is svmPoly.
First, I used the default parameters to train the model with leave-one-out cross-validation
The code is :
ctrl <- trainControl(method = "LOOCV",
                     classProbs = T,
                     savePredictions = T,
                     repeats = 1)
modelFit <- train(group ~ ., data = table_svm, method = "svmPoly",
                  preProc = c("center", "scale"),
                  trControl = ctrl)
The best accuracy was 80%, and the final values used for the model were degree = 1, scale = 0.1 and C = 1.
Second, I tried to tune the parameters.
The code is:
grid_svmpoly = expand.grid(degree = c(1:11), scale = seq(0, 5, length.out = 25), C = 10^c(0:4))
modelFit_tune <- train(group ~ ., data = table_svm, method = "svmPoly",
                       preProc = c("center", "scale"),
                       tuneGrid = grid_svmpoly,
                       trControl = ctrl)
I got an error message: Error in { :
task 264 failed - "NAs are not allowed in subscripted assignments"
I checked the data and found no NA.
There must be some NA values inside the data set. I am not new to this, but not an expert either. To make sure there are no NAs, first convert the data set into matrix format using:
x <- data.matrix(dataframe)
then use the which() function, which is very handy in this case:
which(is.na(x))
I hope this will help you find the answer. The returned indices are in column-major order (R stores matrices column by column).
Let me know if this resolves your query.
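A small sketch of that check using the questioner's table_svm (the data frame passed to train() above), with arr.ind = TRUE added so which() reports row and column positions directly:
x <- data.matrix(table_svm)        # coerce everything to a numeric matrix (factors become integer codes)
which(is.na(x))                    # linear indices of any NAs, in column-major order
which(is.na(x), arr.ind = TRUE)    # the same NAs as (row, column) positions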

unused variable(s) warning in runjags model

I am running JAGS models through the R package runjags. I just updated to JAGS 4.0.0 from JAGS 3.4, and have noticed some unexpected behavior that seems to be related to the update.
First, when I run a model, I now get a warning message WARNING: Unused variable(s) in data table: followed by a list of data objects that are referenced in the model and provided as data. It doesn't seem to affect the results (but it is very puzzling). I have, however, noticed a few times while playing around with this that for some variables the posteriors were virtually identical to the priors (indicating that no updating occurred). I can't seem to recreate the update failure right now, but below is a reproducible code example illustrating the odd warning message. The code example on the run.jags help page also produces the same warning.
Second, I thought I'd check to see if the same message pops up if I use the R package R2jags instead of runjags, but R2jags won't load because apparently rjags (one of the dependencies) is not compatible with JAGS 4.0 (it's looking for JAGS 3.x). Also, in the runjags function run.jags, the argument method="rjags" doesn't seem to work anymore, but method="parallel" does work.
I'm using runjags_2.0.1-4 and R 3.2.2.
So my questions are:
1) Is rjags really incompatible with JAGS 4.0? The motivation to go to 4.0 was to use vectors as indices (see https://martynplummer.wordpress.com/2015/08/16/whats-new-in-jags-4-0-0-part-34-r-style-features/).
2) What is up with the unused variable(s) warning, and should I be concerned about it?
Thanks,
Glenn
Code:
#--- GENERATE DATA ------------------------
rm(list=ls())
# Number of sites and observations per site
N <- 200
nobs <- 3
# generate covariates and standardize (where appropriate)
set.seed(123)
forest <- rnorm(N)
# relationship between occupancy and covariates
b0 <- 0.5
b.for <- 0.5
psi <- plogis(b0 + b.for*forest)
# draw occupancy for each site
z <- rbinom(n=N, size=1,prob=psi)
# specify detection probability
p <- 0.5
pz <- p*z
# generate the observations
Y <- rbinom(n=N, size=nobs,prob=pz)
#---- BUGS model ------------------------
model1 <- "model {
for (i in 1:N){
logit(eta[i]) <- b0 + b.for*forest[i]
z[i] ~ dbern(eta[i])
pz[i] <- z[i]*p
y[i] ~ dbin(pz[i],nobs)
} #i
b0.0 ~ dunif(0,1)
b0 <- log(b0.0/(1-b0.0))
b.for ~ dnorm(0,0.01)
p ~ dunif(0,1)
}"
occ.data1 <-list(y=Y,N=N,nobs=nobs,forest=forest)
inits1 <- function(){list(b0.0=runif(1),b.for=rnorm(1),p=runif(1),z=as.numeric(Y>0))}
parameters1 <- c("b0","b.for","p")
#---- RUN MODEL ------------------------
library(runjags)
ni <- 2000
nt <- 1
nb <- 1000
nc <- 3
ad <- 100
out <- run.jags(model=model1,data=occ.data1,monitor=parameters1,n.chains=nc,inits=inits1,burnin=nb,
sample=ni,adapt=ad,thin=nt,modules=c("glm","dic"),method="parallel")
To answer your questions:
1) rjags and JAGS use linked (non-interchangeable) versions, and CRAN systems are still using JAGS 3.4.0, so the version of rjags on CRAN matches that. This will be updated soon; in the meantime you can grab the correct version of rjags from the SourceForge page, as #jbaums notes.
2) This is a helpful message from JAGS/rjags telling you that you have specified something as data that the model isn't using. Remember that variable names are case sensitive, i.e.
library('runjags')
model <- "model {
m ~ dunif(-1000,1000)
#data# M
#inits# m
#monitor# m
}"
M <- 0
m <- list(-10, 10)
results <- run.jags(model, method="interruptible", n.chains=2)
results <- run.jags(model, method="rjags", n.chains=2)
... gives you a warning because M does not match m. Also note that the warning looks a bit different between the two function calls - in the first it comes half-way down the JAGS output, and in the second it comes as a warning in R after the function has completed.
As for 'should I be concerned' - yes if you think these variables should be in your model. If you can't find the problem try posting the code you are using - it got cut off from your original post.
Matt

matrices are not aligned Error: Python SciPy fmin_bfgs

Problem Synopsis:
When attempting to use the scipy.optimize.fmin_bfgs minimization (optimization) function, the function throws a
derphi0 = np.dot(gfk, pk)
ValueError: matrices are not aligned
error. According to my error checking this occurs at the very end of the first iteration through fmin_bfgs--just before any values are returned or any calls to callback.
Configuration:
Windows Vista
Python 3.2.2
SciPy 0.10
IDE = Eclipse with PyDev
Detailed Description:
I am using scipy.optimize.fmin_bfgs to minimize the cost of a simple logistic regression implementation (converting from Octave to Python/SciPy). Basically, the cost function is named cost_arr and the gradient descent is in the gradient_descent_arr function.
I have manually tested and fully verified that cost_arr and gradient_descent_arr work properly and return all values properly. I also tested to verify that the proper parameters are passed to fmin_bfgs. Nevertheless, when run, I get ValueError: matrices are not aligned. According to the source review, the exact error occurs in the
def line_search_wolfe1
function in scipy's line-search code ("Minpack's Wolfe line and scalar searches", as supplied by the scipy package).
Notably, if I use scipy.optimize.fmin instead, the fmin function runs to completion.
Exact Error:
File "D:\Users\Shannon\Programming\Eclipse\workspace\SBML\sbml\LogisticRegression.py", line 395, in fminunc_opt
    optcost = scipy.optimize.fmin_bfgs(self.cost_arr, initialtheta, fprime=self.gradient_descent_arr, args=myargs, maxiter=maxnumit, callback=self.callback_fmin_bfgs, retall=True)
File "C:\Python32x32\lib\site-packages\scipy\optimize\optimize.py", line 533, in fmin_bfgs
    old_fval, old_old_fval)
File "C:\Python32x32\lib\site-packages\scipy\optimize\linesearch.py", line 76, in line_search_wolfe1
    derphi0 = np.dot(gfk, pk)
ValueError: matrices are not aligned
I call the optimization function with:
optcost = scipy.optimize.fmin_bfgs(self.cost_arr, initialtheta, fprime=self.gradient_descent_arr, args=myargs, maxiter=maxnumit, callback=self.callback_fmin_bfgs, retall=True)
I have spent a few days trying to fix this and cannot seem to determine what is causing the matrices are not aligned error.
ADDENDUM: 2012-01-08
I worked with this a lot more and seem to have narrowed down the issues (but am baffled on how to fix them). First, fmin (using just fmin) works with these functions: cost, gradient. Second, the cost and gradient functions both accurately return expected values when tested in a single iteration in a manual implementation (NOT using fmin_bfgs). Third, I added error code to optimize.linesearch and the error seems to be thrown at def line_search_wolfe1 in the line: derphi0 = np.dot(gfk, pk).
Here, according to my tests, in scipy.optimize.optimize, pk = [[ 12.00921659] [ 11.26284221]] and gfk = [[-12.00921659] [-11.26284221]], i.e. both are 2x1 column arrays.
Note: according to my tests, the error is thrown on the very first iteration through fmin_bfgs (i.e., fmin_bfgs never even completes a single iteration or update).
I appreciate ANY guidance or insights.
My Code Below (logging, documentation removed):
Assume theta = 2x1 ndarray (Actual: theta Size = (2, 1))
Assume X = 100x2 ndarray (Actual: X Size = (2, 100))
Assume y = 100x1 ndarray (Actual: y Size = (100, 1))
def cost_arr(self, theta, X, y):
    theta = scipy.resize(theta, (2, 1))
    m = scipy.shape(X)
    m = 1 / m[1]  # Use m[1] because this is the length of X
    logging.info(__name__ + "cost_arr reports m = " + str(m))
    z = scipy.dot(theta.T, X)  # Must transpose the vector theta
    hypthetax = self.sigmoid(z)
    yones = scipy.ones(scipy.shape(y))
    hypthetaxones = scipy.ones(scipy.shape(hypthetax))
    costright = scipy.dot((yones - y).T, ((scipy.log(hypthetaxones - hypthetax)).T))
    costleft = scipy.dot((-1 * y).T, ((scipy.log(hypthetax)).T))
    # Return the scalar cost (assumed here to be the standard logistic-regression cost)
    return m * (costleft - costright)
def gradient_descent_arr(self, theta, X, y):
    theta = scipy.resize(theta, (2, 1))
    m = scipy.shape(X)
    m = 1 / m[1]  # Use m[1] because this is the length of X
    x = scipy.dot(theta.T, X)  # Must transpose the vector theta
    sig = self.sigmoid(x)
    sig = sig.T - y
    grad = scipy.dot(X, sig)
    grad = m * grad
    return grad
def fminunc_opt_bfgs(self, initialtheta, X, y, maxnumit):
    myargs = (X, y)
    optcost = scipy.optimize.fmin_bfgs(self.cost_arr, initialtheta, fprime=self.gradient_descent_arr, args=myargs, maxiter=maxnumit, retall=True, full_output=True)
    return optcost
In case anyone else encounters this problem ....
1) ERROR 1: As noted in the comments, I incorrectly returned the value from my gradient as a multidimensional array (m,n) or (m,1). fmin_bfgs seems to require a 1-d array output from the gradient (that is, you must return an (m,) array and NOT an (m,1) array). Use scipy.shape(myarray) to check the dimensions if you are unsure of the return value.
The fix involved adding:
grad = numpy.ndarray.flatten(grad)
just before returning the gradient from your gradient function. This "flattens" the array from (m,1) to (m,). fmin_bfgs can take this as input.
2) ERROR 2: Remember, fmin_bfgs seems to work with NONlinear functions. In my case, the sample that I was initially working with was a LINEAR function. This appears to explain some of the anomalous results even after the flatten fix mentioned above. For LINEAR functions, fmin, rather than fmin_bfgs, may work better.
QED
As of recent SciPy versions you need not pass the fprime argument; the gradient will be approximated numerically for you. You can also use the minimize function with method='BFGS' instead, without providing the gradient as an argument.

Resources