array index is greater than array upper bound for d - jags

Can anyone help me with this issue please? I have used other data sets with different number of studies (NS) and treatments (NT) and it worked fine.
Any help will be highly appreciated.
The dataset is as follows:
list(N=186, NS=5, NT=3, mean=c(0,0);
Where
N= number of intervals
NS= number of studies
NT= number of treatments
s[]= Study ID
r[]= no of events
n[]= no at risk
t[]= study arm ID
b[]= Study arm base
time[]= time in months
dt[]= difference in interval (months)
model {
for (i in 1:N) { # N=number of datapoints in dataset
#likelihood
r[i] ~dbin(p[i],n[i])
p[i]<-1- exp(-h[i]*dt[i]) # hazard h over interval [t,t+dt] expressed as deaths per unit person-time (e.g. months)
#fixed effects model
log(h[i]) <- nu[i]+log(time[i])*theta[i]
nu[i]<-mu[s[i],1]+d[s[i],1]*(1-equals(t[i],b[i]))
theta[i]<-mu[s[i],2]+ d[s[i],2]*(1-equals(t[i],b[i]))
}
# priors
d[1,1]<- 0
d[1,2]<- 0
for(j in 2 :NT){ # NT=number of treatments
d[j,1:2] ~ dmnorm(mean[1:2],prec2[,])
}
for(k in 1:NS) {
mu[k,1:2] ~ dmnorm(mean[1:2],prec2[,])
}
}
#Winbugs data set
list(N=176, NS=5, NT=3, mean=c(0,0),
prec2 = structure(.Data = c(0.0001,0,0,0.0001), .Dim = c(2,2)))
# initials 1
list(
d=structure(.Data=c(NA,NA,0,0,0,0,0,0), .Dim = c(4,2)),
mu = structure(.Data=c(1,1,1,1,1,1,1,1), .Dim = c(4,2)))
# initials 2
list(
d=structure(.Data=c(NA,NA,0.5,0.5,0.5,0.5,0.5,0.5), .Dim = c(4,2)),
mu = structure(.Data=c(0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5), .Dim = c(4,2)))
s[] r[] n[] t[] b[] time[] dt[]
1 1 62 1 1 3 2
1 2 59 1 1 7 4
1 6 53 1 1 11 2
1 2 51 1 1 13 2
1 3 48 1 1 15 2
1 2 45 1 1 17 2
1 5 40 1 1 19 2
1 2 37 1 1 23 4
1 2 35 1 1 25 2
1 2 32 1 1 27 2
1 1 31 1 1 29 2
1 2 28 1 1 31 2
1 2 26 1 1 33 2
1 2 23 1 1 35 2
1 1 21 1 1 39 4
1 1 14 1 1 51 12
1 2 55 2 1 5 4
1 1 54 2 1 7 2
1 2 52 2 1 9 2
1 1 51 2 1 11 2
1 5 46 2 1 13 2
1 2 44 2 1 15 2
1 3 41 2 1 17 2
1 3 37 2 1 19 2
1 2 35 2 1 21 2
1 1 34 2 1 23 2
1 1 33 2 1 25 2
1 1 32 2 1 31 6
1 3 29 2 1 33 2
1 1 28 2 1 35 2
1 1 26 2 1 39 4
1 1 24 2 1 41 2
1 1 22 2 1 43 2
1 2 19 2 1 45 2
2 8 169 1 1 3 4
2 10 148 1 1 5 2
2 8 137 1 1 7 2
2 6 127 1 1 9 2
2 8 118 1 1 11 2
2 7 109 1 1 13 2
2 3 105 1 1 15 2
2 4 95 1 1 17 2
2 3 84 1 1 19 2
2 3 76 1 1 21 2
2 4 68 1 1 23 2
2 4 60 1 1 25 2
2 4 50 1 1 27 2
2 1 35 1 1 31 4
2 2 29 1 1 33 2
2 1 25 1 1 35 2
2 3 21 1 1 37 2
2 1 18 1 1 39 2
2 2 11 1 1 43 4
2 1 180 2 1 1 2
2 11 162 2 1 3 2
2 9 147 2 1 5 2
2 9 135 2 1 7 2
2 6 125 2 1 9 2
2 6 116 2 1 11 2
2 6 106 2 1 13 2
2 7 95 2 1 15 2
2 1 92 2 1 17 2
2 5 84 2 1 19 2
2 3 77 2 1 21 2
2 2 67 2 1 23 2
2 1 59 2 1 25 2
2 4 49 2 1 27 2
2 1 40 2 1 29 2
2 2 34 2 1 31 2
2 3 23 2 1 37 6
2 1 19 2 1 39 2
4 1 62 1 1 3 2
4 2 59 1 1 7 4
4 6 53 1 1 11 2
4 2 51 1 1 13 2
4 3 48 1 1 15 2
4 2 45 1 1 17 2
4 5 40 1 1 19 2
4 2 37 1 1 23 4
4 2 35 1 1 25 2
4 2 32 1 1 27 2
4 1 31 1 1 29 2
4 2 28 1 1 31 2
4 2 26 1 1 33 2
4 2 23 1 1 35 2
4 1 21 1 1 39 4
4 1 14 1 1 51 12
4 2 55 2 1 5 4
4 1 54 2 1 7 2
4 2 52 2 1 9 2
4 1 51 2 1 11 2
4 5 46 2 1 13 2
4 2 44 2 1 15 2
4 3 41 2 1 17 2
4 3 37 2 1 19 2
4 2 35 2 1 21 2
4 1 34 2 1 23 2
4 1 33 2 1 25 2
4 1 32 2 1 31 6
4 3 29 2 1 33 2
4 1 28 2 1 35 2
4 1 26 2 1 39 4
4 1 24 2 1 41 2
4 1 22 2 1 43 2
4 2 19 2 1 45 2
5 8 169 1 1 3 4
5 10 148 1 1 5 2
5 8 137 1 1 7 2
5 6 127 1 1 9 2
5 8 118 1 1 11 2
5 7 109 1 1 13 2
5 3 105 1 1 15 2
5 4 95 1 1 17 2
5 3 84 1 1 19 2
5 3 76 1 1 21 2
5 4 68 1 1 23 2
5 4 60 1 1 25 2
5 4 50 1 1 27 2
5 1 35 1 1 31 4
5 2 29 1 1 33 2
5 1 25 1 1 35 2
5 3 21 1 1 37 2
5 1 18 1 1 39 2
5 2 11 1 1 43 4
5 1 180 2 1 1 2
5 11 162 2 1 3 2
5 9 147 2 1 5 2
5 9 135 2 1 7 2
5 6 125 2 1 9 2
5 6 116 2 1 11 2
5 6 106 2 1 13 2
5 7 95 2 1 15 2
5 1 92 2 1 17 2
5 5 84 2 1 19 2
5 3 77 2 1 21 2
5 2 67 2 1 23 2
5 1 59 2 1 25 2
5 4 49 2 1 27 2
5 1 40 2 1 29 2
5 2 34 2 1 31 2
5 3 23 2 1 37 6
5 1 19 2 1 39 2
3 2 179 1 1 1 2
3 4 172 1 1 3 2
3 3 168 1 1 5 2
3 6 157 1 1 7 2
3 4 151 1 1 9 2
3 9 142 1 1 11 2
3 10 130 1 1 13 2
3 7 123 1 1 15 2
3 3 119 1 1 17 2
3 5 112 1 1 19 2
3 3 108 1 1 21 2
3 3 103 1 1 23 2
3 12 91 1 1 25 2
3 2 68 1 1 27 2
3 2 46 1 1 29 2
3 8 29 1 1 31 2
3 2 23 1 1 33 2
3 3 8 1 1 35 2
3 5 175 3 1 3 4
3 7 163 3 1 5 2
3 12 151 3 1 7 2
3 12 139 3 1 9 2
3 4 132 3 1 11 2
3 9 122 3 1 13 2
3 7 114 3 1 15 2
3 4 108 3 1 17 2
3 7 101 3 1 19 2
3 5 96 3 1 21 2
3 7 89 3 1 23 2
3 2 87 3 1 25 2
3 4 68 3 1 27 2
3 4 50 3 1 29 2
3 3 40 3 1 31 2
3 3 22 3 1 33 2
3 1 8 3 1 35 2
END

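You have set NT = 3, so d only has rows 1 to 3, but the model indexes it as d[s[i],1] and d[s[i],2], and the s vector ranges from 1 to 5. That is why the index exceeds the upper bound of d.
Set NT = 5 or NT = length(unique(s)).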
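A quick way to catch this kind of mismatch before running the model is to compare the index values against the declared bound. The sketch below is in Python rather than JAGS, and the helper name values_above_bound is made up for illustration. It only assumes that s holds the study IDs from the data table and that d is declared with NT rows.

```python
def values_above_bound(index_values, upper_bound):
    """Return the distinct index values that exceed the declared upper bound."""
    return [v for v in sorted(set(index_values)) if v > upper_bound]

# Study IDs as they appear in the s[] column (one representative per study).
s = [1, 2, 4, 5, 3]
NT = 3  # number of rows declared for d in the model

too_large = values_above_bound(s, NT)
print(too_large)  # any value listed here would index past the end of d
```

An empty list means every index fits inside the declared array.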


how to shift column labels to left python

I have a dataframe and I want to shift the column names to the left, starting from a specific column. The original dataframe has many columns, so I cannot do this by renaming the columns by hand.
df=pd.DataFrame({'A':[1,3,4,7,8,11,1,15,20,15,16,87],
'H':[1,3,4,7,8,11,1,15,78,15,16,87],
'N':[1,3,4,98,8,11,1,15,20,15,16,87],
'p':[1,3,4,9,8,11,1,15,20,15,16,87],
'B':[1,3,4,6,8,11,1,19,20,15,16,87],
'y':[0,0,0,0,1,1,1,0,0,0,0,0]})
print((df))
A H N p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
Here I want to remove the label N first. The dataframe after removing label N:
A H p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
Required output:
A H P B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
Here the last column can be ignored.
Note: the original dataframe has many columns, so I cannot rename them manually and need an automatic way to shift the column names left.
You can do
df.columns=sorted(df.columns.str.replace('N',''),key=lambda x : x=='')
df
A H p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
Replace the columns with your own custom list.
>>> cols = list(df.columns)
>>> cols.remove('N')
>>> df.columns = cols + ['']
Output
>>> df
A H p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
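The list-based approach above can be wrapped in a small reusable function. This is a minimal sketch: the function name shift_labels_left and the filler parameter are assumptions made for illustration, not part of pandas.

```python
import pandas as pd

def shift_labels_left(df, label, filler=""):
    """Drop one column label and shift the labels after it one place left.

    The data stays where it is; the last column receives `filler` as its name.
    """
    cols = list(df.columns)
    cols.remove(label)          # raises ValueError if the label is absent
    out = df.copy()
    out.columns = cols + [filler]
    return out

df = pd.DataFrame({"A": [1, 3], "H": [1, 3], "N": [1, 98],
                   "p": [1, 9], "B": [1, 6], "y": [0, 0]})
shifted = shift_labels_left(df, "N")
print(shifted)
```

Working on a copy leaves the original frame unchanged, which makes it easy to compare before and after.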

How to copy values from one column in df1 to df2 based on specific values in other three columns?

I have two dataframes with similar shapes and column names and would like to copy the values of df1['property'] and paste them in df2['property'], but there is a condition.
df1:
i j k property
1 1 1 10
1 1 2 20
1 1 3 30
1 2 1 40
1 2 2 50
1 2 3 60
1 3 1 70
1 3 2 80
1 3 3 90
2 1 1 100
2 1 2 110
2 1 3 120
2 2 1 130
2 2 2 140
2 2 3 150
2 3 1 160
2 3 2 170
2 3 3 180
3 1 1 190
3 1 2 200
3 1 3 210
3 2 1 220
3 2 2 230
3 2 3 240
3 3 1 250
3 3 2 260
3 3 3 270
df2:
i j k property
1 1 1 100
2 1 1 100
3 1 1 100
1 2 1 100
2 2 1 100
3 2 1 100
1 3 1 100
2 3 1 100
3 3 1 100
1 1 2 100
2 1 2 100
3 1 2 100
1 2 2 100
2 2 2 100
3 2 2 100
1 3 2 100
2 3 2 100
3 3 2 100
1 1 3 100
2 1 3 100
3 1 3 100
1 2 3 100
2 2 3 100
3 2 3 100
1 3 3 100
2 3 3 100
3 3 3 100
The other three columns (i, j, k) represent different positions, and the copied value of df1['property'] must replace df2['property'] only where df1[['i','j','k']] is the same as df2[['i','j','k']]. Could anyone help me with this?
I think I should use the map function, but I do not know how to apply it with a condition on three columns.
IIUC you want DataFrame.merge:
df2['property']=( df2.drop('property',axis=1)
.merge(df1,on=['i','j','k'],how = 'left')['property']
.fillna(df2['property']) )
print(df2)
#or this:
#df2['property']=( df2.merge(df1,on=['i','j','k'],how = 'left')['property_y']
# .fillna(df2['property']) )
We could also use DataFrame.update:
df2_update=df2.set_index(['i','j','k'])
df2_update.update(df1.set_index(['i','j','k']))
df2_update = df2_update.reset_index()
print(df2_update)
Output
i j k property
0 1 1 1 10
1 2 1 1 100
2 3 1 1 190
3 1 2 1 40
4 2 2 1 130
5 3 2 1 220
6 1 3 1 70
7 2 3 1 160
8 3 3 1 250
9 1 1 2 20
10 2 1 2 110
11 3 1 2 200
12 1 2 2 50
13 2 2 2 140
14 3 2 2 230
15 1 3 2 80
16 2 3 2 170
17 3 3 2 260
18 1 1 3 30
19 2 1 3 120
20 3 1 3 210
21 1 2 3 60
22 2 2 3 150
23 3 2 3 240
24 1 3 3 90
25 2 3 3 180
26 3 3 3 270
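To see what the key-based merge does on a small case, here is a self-contained sketch. The three-row frames and the placeholder value -1 are made up for illustration. One key in df2 has no counterpart in df1, so it shows the fillna fallback.

```python
import pandas as pd

df1 = pd.DataFrame({"i": [1, 1, 2], "j": [1, 2, 1], "k": [1, 1, 1],
                    "property": [10, 40, 100]})
df2 = pd.DataFrame({"i": [2, 1, 3], "j": [1, 1, 1], "k": [1, 1, 1],
                    "property": [-1, -1, -1]})

# Left merge on the three key columns keeps df2's row order.
merged = df2.drop(columns="property").merge(df1, on=["i", "j", "k"], how="left")

result = df2.copy()
# Keys missing from df1 produce NaN; fall back to df2's original value there.
result["property"] = merged["property"].fillna(df2["property"])
print(result)
```

Because unmatched keys introduce NaN, the property column comes back as floats. Cast it with astype(int) if integers are needed.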
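I'd do this (note that it compares df1 and df2 row by row at the same position, so only rows whose i, j, k already line up are replaced):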
import pandas as pd, numpy as np
df1 = pd.DataFrame(dict(i=np.repeat([1,2,3],9), j=np.repeat([[1,2,3],[1,2,3],[1,2,3]],3), k=[1,2,3]*9,\
property=range(10,280,10)))
df2 = pd.DataFrame(dict(k=np.repeat([1,2,3],9), j=np.repeat([[1,2,3],[1,2,3],[1,2,3]],3), i=[1,2,3]*9,\
property=100))
df = pd.concat([df1,df2.rename(columns={"i":"ii","j":"jj","k":"kk","property":"property2"})],axis=1)
df.property2 = np.where((df.i==df.ii)&(df.j==df.jj)&(df.k==df.kk),df.property,df.property2)
df=df[["ii","jj","kk","property2"]]
print(df)
Gives:
ii jj kk property2
0 1 1 1 10
1 2 1 1 100
2 3 1 1 100
3 1 2 1 40
4 2 2 1 100
5 3 2 1 100
6 1 3 1 70
7 2 3 1 100
8 3 3 1 100
9 1 1 2 100
10 2 1 2 110
11 3 1 2 100
12 1 2 2 100
13 2 2 2 140
14 3 2 2 100
15 1 3 2 100
16 2 3 2 170
17 3 3 2 100
18 1 1 3 100
19 2 1 3 100
20 3 1 3 210
21 1 2 3 100
22 2 2 3 100
23 3 2 3 240
24 1 3 3 100
25 2 3 3 100
26 3 3 3 270

Getting a number of quarter from numeric week number and the week number within the quarter in python?

I have a list of numbers from 1 to 53. Using the numeric week number, I am trying to calculate 1) the quarter a week falls in and 2) the number of that week within the quarter. (Week 53 needs to be quarter 4, week 14; week 27 needs to be quarter 3, week 1.) I got this working in Excel, but not in Python. Any thoughts?
I tried the following, but with each method I have an issue with weeks like 13 or 27.
13 -> should be quarter 1, 27 -> should be quarter 3.
df['qtr1'] = df['wk']//13
df['qtr2']=(np.maximum((df['wk']-1),1)/13)+1
df['qtr3']=((df1['wk']-1)//13)
df['qtr4'] = df['qtr2'].astype(int)
Results are awkward
wk qtr qtr2 qtr3 qtr4
1.0 0 1.076923 -1.0 1
13.0 1(wrong) 1.923077 0.0 1
14.0 1 2.000000 1.0 2
27.0 2 3.000000 1.0 2 (wrong)
28.0 2 3.076923 2.0 3
You can convert your weeks to integers, by using astype:
df['wk'] = df['wk'].astype(int)
You should subtract one from the week number first, like:
df['qtr'] = ((df['wk']-1) // 13) + 1
df['weekinqtr'] = (df['wk']-1) % 13 + 1
since 13//13 will be 1, not zero. This gives us:
>>> df
wk qtr weekinqtr
0 1 1 1
1 13 1 13
2 14 2 1
3 26 2 13
4 27 3 1
5 28 3 2
If you want extra columns per quarter, you can use get_dummies(..) [pandas-doc] to obtain a one-hot encoding per quarter:
>>> df.join(pd.get_dummies(df['qtr'], prefix='qtr'))
wk qtr weekinqtr qtr_1 qtr_2 qtr_3
0 1 1 1 1 0 0
1 13 1 13 1 0 0
2 14 2 1 0 1 0
3 26 2 13 0 1 0
4 27 3 1 0 0 1
5 28 3 2 0 0 1
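The question also asks for week 53 to count as quarter 4, week 14, whereas the subtract-one formula would place it in a fifth quarter. A minimal sketch of one way to handle that is to cap the quarter at 4 and derive the week within the quarter from the capped value. The sample weeks are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"wk": [1, 13, 14, 26, 27, 28, 52, 53]})

# Quarter from the zero-based week, capped so week 53 stays in quarter 4.
qtr = ((df["wk"] - 1) // 13 + 1).clip(upper=4)
df["qtr"] = qtr
# Week within the quarter: subtract the 13-week blocks of the earlier quarters.
df["weekinqtr"] = df["wk"] - (qtr - 1) * 13
print(df)
```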
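Using integer division // and modulo % should work for what you want, I think (note that in the output below weeks 13, 26 and 39 land in the following quarter; subtracting 1 from the week before dividing avoids that):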
In [254]: df = pd.DataFrame({'week':range(1, 52)})
In [255]: df['qtr'] = (df['week'] // 13) + 1
In [256]: df['qtr_week'] = df['week'] % 13
In [257]: df.loc[(df['qtr_week'] ==0),'qtr_week']=13
In [258]: df
Out[258]:
week qtr qtr_week
0 1 1 1
1 2 1 2
2 3 1 3
3 4 1 4
4 5 1 5
5 6 1 6
6 7 1 7
7 8 1 8
8 9 1 9
9 10 1 10
10 11 1 11
11 12 1 12
12 13 2 13
13 14 2 1
14 15 2 2
15 16 2 3
16 17 2 4
17 18 2 5
18 19 2 6
19 20 2 7
20 21 2 8
21 22 2 9
22 23 2 10
23 24 2 11
24 25 2 12
25 26 3 13
26 27 3 1
27 28 3 2
28 29 3 3
29 30 3 4
30 31 3 5
31 32 3 6
32 33 3 7
33 34 3 8
34 35 3 9
35 36 3 10
36 37 3 11
37 38 3 12
38 39 4 13
39 40 4 1
40 41 4 2
41 42 4 3
42 43 4 4
43 44 4 5
44 45 4 6
45 46 4 7
46 47 4 8
47 48 4 9
48 49 4 10
49 50 4 11
50 51 4 12

Process by rows adding values

I'm trying to transpose and sum with the following criteria: I have to create one row for each LOGIN and DATE, listing the ACT values and the sum of the MAP values for each ACT. In the middle, separated by colons, goes the sum of all the MAP values. The input is as follows:
LOGIN DATE ACT MAP
1 11/02/2008 149 3
1 11/02/2008 18 1
1 11/02/2008 18 1
1 11/02/2008 18 5
1 13/02/2008 145 2
1 13/02/2008 43 3
2 13/02/2008 19 0
2 13/02/2008 18 1
2 14/02/2008 18 1
2 14/02/2008 18 1
3 14/02/2008 39 1
3 15/02/2008 149 0
3 15/02/2008 43 0
3 15/02/2008 19 1
3 15/02/2008 19 1
The output I need to create is:
1 11/02/2008 149 18 : 10 : 3 7
1 13/02/2008 145 43 : 5 : 2 3
2 13/02/2008 19 18 : 1 : 1 0
2 14/02/2008 18 : 2 : 2
3 14/02/2008 39 : 1 : 1
3 15/02/2008 149 43 19 : 2 : 0 0 2
The first row is built this way: 149 and 18 are the ACT values for this LOGIN and DATE, 3 is the MAP value for ACT 149, and 7 is the sum of the MAP values for ACT 18 (7 = 1+1+5). The 10 in the middle is 3+7.
I grouped and summed to obtain this, but I need to process it row by row:
LOGIN MAP
1 15
11/02/2008 10
13/02/2008 5
2 3
13/02/2008 1
14/02/2008 2
3 3
14/02/2008 1
15/02/2008 2
I transformed the input file and now it looks like this. Now I need to concatenate the values of the ACT column until I find a row with a blank ACT. For example, for the first group (up to the first blank) I need to create 18 149 10 7 3. For the second group I need to create 43 145 5 3 2.
LOGIN ACT Total
1 18 7
1 149 3
1 10
1 43 3
1 145 2
1 5
2 18 1
2 19 0
2 1
2 18 2
2 2
3 39 1
3 1
3 19 2
3 43 0
3 149 0
3 2
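One possible way to build the desired rows described in the question is to sum MAP per ACT first and then join the pieces per LOGIN and DATE. This is only a sketch using pandas: the helper name summarize is made up, only the first six input rows are used, and the ACT values are kept in order of first appearance.

```python
import pandas as pd

rows = [
    (1, "11/02/2008", 149, 3), (1, "11/02/2008", 18, 1),
    (1, "11/02/2008", 18, 1), (1, "11/02/2008", 18, 5),
    (1, "13/02/2008", 145, 2), (1, "13/02/2008", 43, 3),
]
df = pd.DataFrame(rows, columns=["LOGIN", "DATE", "ACT", "MAP"])

# Sum MAP per (LOGIN, DATE, ACT); sort=False keeps first-appearance order.
per_act = df.groupby(["LOGIN", "DATE", "ACT"], sort=False, as_index=False)["MAP"].sum()

def summarize(group):
    """Format one LOGIN/DATE group as 'ACTs : total : per-ACT sums'."""
    acts = " ".join(str(a) for a in group["ACT"])
    sums = " ".join(str(m) for m in group["MAP"])
    return f"{acts} : {group['MAP'].sum()} : {sums}"

lines = [
    f"{login} {date} {summarize(group)}"
    for (login, date), group in per_act.groupby(["LOGIN", "DATE"], sort=False)
]
print("\n".join(lines))
```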

How to find the number of duplicate lines, where each line contains a few numbers separated by spaces

Suppose I have a file like this:
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2
4 3 10 2 14 2 18 2 20 3 22 2 28 2 32 2
2 3 10 3 12 2 16 2 18 3 20 2 24 2 26 3
1 3 3 3 17 3 19 3 26 2 28 2 30 2 32 2
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2
The first and the last lines of the input are the same.
I want the output to look like this:
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2 2
4 3 10 2 14 2 18 2 20 3 22 2 28 2 32 2 1
2 3 10 3 12 2 16 2 18 3 20 2 24 2 26 3 1
1 3 3 3 17 3 19 3 26 2 28 2 30 2 32 2 1
The extra last column in the output gives the number of times that line occurs in the input.
How can I do this in bash?
I know the sort command, but it only works with one number per line.
Coming from sehe's suggestion, what about this?
sort your_file | uniq -c | awk '{for(i=2;i<=NF;i++) printf $i"\t"; printf $1"\n"}'
Output:
1 3 3 3 17 3 19 3 26 2 28 2 30 2 32 2 1
2 3 10 3 12 2 16 2 18 3 20 2 24 2 26 3 1
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2 2
4 3 10 2 14 2 18 2 20 3 22 2 28 2 32 2 1
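The pipeline above returns the lines in sorted order, while the desired output in the question keeps the order of first appearance. For comparison, here is a minimal Python sketch (not bash) that keeps that order. The function name count_duplicate_lines is made up for illustration, and runs of whitespace are normalized to single spaces.

```python
from collections import Counter

def count_duplicate_lines(lines):
    """Append each distinct line's occurrence count, in first-seen order."""
    # Counter keeps insertion order (Python 3.7+), so first appearance wins.
    counts = Counter(" ".join(line.split()) for line in lines if line.strip())
    return [f"{line} {n}" for line, n in counts.items()]

sample = [
    "4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2",
    "4 3 10 2 14 2 18 2 20 3 22 2 28 2 32 2",
    "4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2",
]
result = count_duplicate_lines(sample)
print("\n".join(result))
```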
