Error in Proc Freq - statistics

I have a data set with multiple visits, two treatment arms, and a Vehicle group. I also have a variable, "SSA", with two values, 1 and 0, where 1 stands for responder and 0 for non-responder subjects. While running PROC FREQ for chi-square statistics, I get the messages shown below. Here is the code I used:
PROC FREQ DATA=P&V1;
TABLE TREATMENT*SSA/CHISQ ;
WHERE TREATMENT IN (1 &TR1); *(&tri for treatment 2 and treatment 3);
RUN;
NOTE: No statistics are computed for TREATMENT * SSA since SSA has less than 2 nonmissing levels.
WARNING: No OUTPUT data set is produced because no statistics can be computed for this table, which has a row or column variable with less than 2 nonmissing levels.
These messages appear for my last visit, where SSA has the single value 0 in all treatment groups and Vehicle.
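The messages follow from the chi-square statistic itself: with only one nonmissing level of SSA, the table has a single column and there is nothing to compare. Below is a minimal Python sketch of that check, not SAS's implementation. The function name `chi_square` and the visit counts are made up for illustration.

```python
from collections import Counter

def chi_square(pairs):
    """Pearson chi-square for (row, col) pairs.

    Returns None when either variable has fewer than 2 nonmissing levels,
    which is the situation the NOTE above describes.
    """
    # Drop observations with a missing row or column value.
    pairs = [(r, c) for r, c in pairs if r is not None and c is not None]
    rows = Counter(r for r, _ in pairs)
    cols = Counter(c for _, c in pairs)
    if len(rows) < 2 or len(cols) < 2:
        return None
    n = len(pairs)
    cells = Counter(pairs)
    stat = 0.0
    for r, row_total in rows.items():
        for c, col_total in cols.items():
            expected = row_total * col_total / n
            stat += (cells[(r, c)] - expected) ** 2 / expected
    return stat

# Hypothetical data: (TREATMENT, SSA) per subject.
last_visit = [(1, 0)] * 10 + [(2, 0)] * 12           # SSA is 0 for everyone
earlier_visit = [(1, 1)] * 6 + [(1, 0)] * 4 + [(2, 1)] * 3 + [(2, 0)] * 9

print(chi_square(last_visit))      # None: no statistic can be computed
print(chi_square(earlier_visit))   # a positive chi-square value
```

For a visit like the last one, the practical options are to skip the test or to report that every subject was a non-responder.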

Related

How to return a sequence count after validating values from three different columns, even when matching data is found?

I have a range of cells in Excel. How can I increment numbers when the data must be validated against three different columns?
I tried the formula COUNTIF($A$2:A2,A2), which creates a number sequence, but I also need to validate data from other columns for it to return the correct sequence.
First validation: count the emp no in column range A1:A5, which returns a result under the Hierarchy column.
Second validation: check the % value under column L against the hierarchy levels below, which is where the problem comes from.
1 - 0.25
2 - 0.25
3 - 0.5
4 - 0.5
5 - 1
Third validation: check the type of relation (see the Relation column), which also has to be considered when returning the sequence number. Below is the Relation Level table.
I don't know how to join these three conditions so that the result is as below.
My real problem is how to get a sequence number when a person has 3 children, who should be tagged 2, 3, 4 (after the spouse, who is 1). The next relation, parent, should then get the next number after the last child, which is 5: according to the relation table the Parent level is 3, but it is adjusted according to the number of relations a person has. In this instance, even though the parent's count is 5, it should still have 0.5 EE % (see relation table level vs. % hierarchy level). I hope this makes sense, but let me know if you have any questions.
I hope someone can help me with this, because I am not an expert in Excel formulas. Thank you!
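The rule described above separates two things: the sequence number counts a person's dependents in relation order, while the EE % comes from the relation's level, not from the sequence number. Here is a rough Python sketch of that logic. The relation-level mapping is an assumption: only "Parent is level 3" is stated in the question, and Spouse = 1 and Child = 2 are guesses because the Relation Level table is not shown.

```python
# ASSUMED relation levels; only Parent = 3 is stated in the question.
RELATION_LEVEL = {"Spouse": 1, "Child": 2, "Parent": 3}

# Hierarchy level -> % value, as listed in the question.
LEVEL_PERCENT = {1: 0.25, 2: 0.25, 3: 0.5, 4: 0.5, 5: 1.0}

def tag_dependents(relations):
    """Return (sequence number, relation, EE %) for one employee's dependents."""
    # sorted() is stable, so dependents with the same relation keep their order.
    ordered = sorted(relations, key=lambda rel: RELATION_LEVEL[rel])
    return [
        (seq, rel, LEVEL_PERCENT[RELATION_LEVEL[rel]])
        for seq, rel in enumerate(ordered, start=1)
    ]

rows = tag_dependents(["Spouse", "Child", "Child", "Child", "Parent"])
for row in rows:
    print(row)
# The parent is sequence number 5 but keeps the level-3 percentage, 0.5.
```

In Excel terms, this suggests one formula for the running count and a separate lookup on the relation for the percentage, instead of deriving the percentage from the count.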

Survival rates >1 when using mrOpen in the package FSA

I am currently doing some population analyses with the package "FSA" in R.
By using the mrOpen command, I want to get the survival rate.
My raw data is a simple table with one row per individual, one column per sample date, and values of 0 and 1 (not captured or captured during the respective sampling).
id  total.captures  date1  date2  date3  etc
1   3               1      1      1      ...
2   1               1      0      0      ...
The first two columns contain the individual id and the aggregated number of captures which is why I excluded them in the analysis.
This is the exact code:
hold.data<-capHistSum(data, cols2use = c(3:13))
est.data<-mrOpen(hold.data)
summary(est.data)
confint(est.data)
It seems to work out, as I get the tables and summaries with all the parameters. See here as an example:
Screenshot_Results
However, there's a problem with the survival estimate phi.
The phi value is not always between 0 and 1; in some cases it exceeds 1.
Any idea what went wrong here?
Thanks,
Pia
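One possible explanation, offered as a sketch and not a diagnosis: mrOpen performs a Jolly-Seber analysis, and in the textbook Jolly-Seber formulas phi is a ratio of two estimated quantities, so nothing forces it below 1. The Python below uses the uncorrected textbook formulas, not FSA's code, and the counts are hypothetical.

```python
def marked_at_large(m, R, z, r):
    """Textbook Jolly-Seber estimate of marked animals alive at a sample.

    m: marked animals caught in this sample
    R: animals released after this sample
    z: animals marked before this sample, not caught in it, but caught later
    r: animals from this release that were caught again later
    """
    return m + R * z / r

def survival(M_i, m_i, R_i, M_next):
    """Textbook Jolly-Seber survival estimate between sample i and i+1."""
    return M_next / (M_i - m_i + R_i)

# Hypothetical summary counts for two consecutive samples.
M_i = marked_at_large(m=10, R=50, z=5, r=20)      # 22.5
M_next = marked_at_large(m=15, R=50, z=10, r=10)  # 65.0
phi = survival(M_i, m_i=10, R_i=50, M_next=M_next)
print(phi)  # exceeds 1 although the input counts are all plausible
```

Under this reading, a phi above 1 would reflect sampling variability, often with few recaptures, and not necessarily an error in the code. Checking the confidence intervals from confint(est.data) is a reasonable next step.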

What if my dataset contains only 0 and 1? Can I check for correlation for them and get the significant results in Excel?

My data set looks like this:
P T O
1 1 0
1 0 1
1 1 1
0 1 0
1 1 0
My doubt is that we have only two values, zero and one. Would that logically mean that a correlation cannot yield a level of significance? My assumption is that in this case the coefficient would be calculated from the occurrence of the 4 combinations, i.e. {(1,1),(1,0),(0,1),(0,0)}, and not from the magnitude of change in the variables. Conversely, if the coefficient works only on the magnitude of change, then is this the right method for my data set?
Could anyone tell me whether I am on the right track, or whether calculating such a coefficient yields no significance?
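For two 0/1 variables, the two views in the question coincide: the ordinary Pearson correlation equals the phi coefficient computed from the counts of the four combinations. A small Python sketch using the P and T columns above shows this. The function names are illustrative only.

```python
from math import sqrt

# Columns P and T from the sample data above.
P = [1, 1, 1, 0, 1]
T = [1, 0, 1, 1, 1]

def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def phi(x, y):
    """Phi coefficient from the counts of the four 0/1 combinations."""
    n11 = sum(1 for a, b in zip(x, y) if (a, b) == (1, 1))
    n10 = sum(1 for a, b in zip(x, y) if (a, b) == (1, 0))
    n01 = sum(1 for a, b in zip(x, y) if (a, b) == (0, 1))
    n00 = sum(1 for a, b in zip(x, y) if (a, b) == (0, 0))
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom

print(pearson(P, T), phi(P, T))  # the two values agree
```

So a correlation on binary data is meaningful. For significance, n * phi**2 is the Pearson chi-square statistic of the 2x2 table with 1 degree of freedom. With only five rows, as in the sample, no test will have much power.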

Rank, group and set a label in Spotfire table

I have two columns, [Clients] and [Sales]. I'd like to create a new column containing a score (from 1 to 5) for each "Client" in terms of Sales.
I want to rank [Sales] and split it uniformly into 5 groups, then set a label of 1 for the highest [Sales], 2 for the second group, etc.
Does anyone have an idea of an expression to use?
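You can use the Percentile function in conjunction with a case statement. In the screenshot below, I created four calculations to find the 20th, 40th, 60th, and 80th percentiles and then created a case statement to rank based on those values. Note that the case statement as written labels the lowest [Sales] group 1; reverse the labels (5, 4, 3, 2, 1) if 1 should mark the highest group, as the question asks.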
Percentile calculation:
Percentile([Sales],20)
Case statement:
case
when [Sales]<[20th Percentile] then 1
when ([Sales]>=[20th Percentile]) and ([Sales]<[40th Percentile]) then 2
when ([Sales]>=[40th Percentile]) and ([Sales]<[60th Percentile]) then 3
when ([Sales]>=[60th Percentile]) and ([Sales]<[80th Percentile]) then 4
when [Sales]>=[80th Percentile] then 5
else NULL
end
See attached screenshot
Data sample and calcs
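To make the grouping logic concrete outside Spotfire, here is a small Python sketch that splits values into five equal-sized groups by rank, with 1 for the highest sales as the question asks. It uses rank positions instead of percentile cut-offs, so it is an analogue of the expression above, not a translation of it. The sales figures are invented.

```python
def quintile_scores(sales):
    """Score each value 1..5 by rank, 1 being the top fifth of sales."""
    n = len(sales)
    # Indices of the values, ordered from highest to lowest sales.
    order = sorted(range(n), key=lambda i: sales[i], reverse=True)
    scores = [0] * n
    for position, index in enumerate(order):
        # Positions 0..n-1 are mapped evenly onto groups 1..5.
        scores[index] = position * 5 // n + 1
    return scores

# Hypothetical sales per client.
sales = [120, 950, 430, 610, 75, 880, 300, 540, 210, 700]
print(quintile_scores(sales))  # → [5, 1, 3, 2, 5, 1, 4, 3, 4, 2]
```

Rank-based splitting always gives equal group sizes when the row count is a multiple of 5. Percentile cut-offs can give uneven groups when many clients share the same sales value.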

Excel IF OR Statement

I am having trouble determining the correct way to calculate a final rank order for four categories. Each of the four metrics makes up part of a higher group. A Top 10 for each category is applied to the respective product for risk analysis.
CURRENT LOGIC - Assignment of 25% max per category.
Columns - Y4
Parts
0.25
25
=IF(L9=1,$Y$4,IF(L9=2,$Y$4*0.9, IF(L9=3,$Y$4*0.8, IF(L9=4,$Y$4*0.7, IF(L9=5,$Y$4*0.6, IF(L9=6,$Y$4*0.5, IF(L9=7,$Y$4*0.4, IF(L9=8,$Y$4*0.3, IF(L9=9,$Y$4*0.2, IF(L9=10,$Y$4*0.1,0))))))))))
DESIRED...
I would like to use a statement to determine three criteria in order to apply a score (1=100, 2=90, 3=80, etc..).
SUM the rank positions of each of the four categories-apply product rank ascending (not including NULL since it's not in the Top 10)
IF a product is identified in more than one metric-apply a significant contribution weight of (*.75),
IF a product has the number 1 rank in any of the four metrics-apply a score of (100).
Data - UPDATED EXAMPLE
(Product) Parts Labor Overhead External Final Score
"XYZ" 3 1 7 7 100
"ABC" NULL 6 NULL 2 100
"LMN" 4 NULL NULL NULL 70
This is way beyond my capability. ANY assistance is appreciated greatly!!!
Jim
I figured this is a good start and I can alter the weight as needed to reflect the reality of the situation.
=AVERAGE(G28:I28)+SUM(G28:I28)*0.25
However, I couldn't figure out how to put a cap on the score of no more than 100 points.
I am still unclear on exactly what you are attempting and whether this will work, but how about this simple matrix using an array formula and some conditional formatting?
Array Formula in F2 (make sure to press Ctrl+Shift+Enter when exiting formula edit mode)
=MIN(100,SUM(IF(B2:E2<>"NULL",CHOOSE(B2:E2,100,90,80,70,60,50,40,30,20,10))))
Conditional Formatting defined as shown below.
Red = 100 value where it comes from a 1
Yellow = 100 value where it comes from more than 1 factor, but without a 1.
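For readers who want to check the array formula's logic outside Excel, here is a short Python sketch of the same calculation: map each rank to a score (1 = 100, 2 = 90, ..., 10 = 10), skip NULLs, sum, and cap at 100. The names `RANK_SCORE` and `final_score` are illustrative, and `None` stands in for the "NULL" text.

```python
# Rank 1..10 -> 100, 90, ..., 10, mirroring the CHOOSE list in the formula.
RANK_SCORE = {rank: 110 - 10 * rank for rank in range(1, 11)}

def final_score(ranks):
    """Sum the scores of the non-NULL ranks and cap the total at 100."""
    total = sum(RANK_SCORE[rank] for rank in ranks if rank is not None)
    return min(100, total)

# Rows from the example data: Parts, Labor, Overhead, External.
print(final_score([3, 1, 7, 7]))              # "XYZ" → 100
print(final_score([None, 6, None, 2]))        # "ABC" → 100
print(final_score([4, None, None, None]))     # "LMN" → 70
```

These match the Final Score column in the updated example. The 0.75 weighting for products appearing in more than one metric is not part of the formula above and is not modelled here.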
