Which statistical test do I need to determine significance? - statistics

I am not sure which statistical test I need to evaluate my results. I tested 30 samples with three different techniques. How can I determine statistically which technique is better in the following cases?
Comparing all three in one test?
Comparing pairs in three separate tests, e.g. tech1 and tech2, tech1 and tech3, tech2 and tech3?
Thanks a lot.

So I applied a t-test to techniques 1 & 2, 2 & 3 and 1 & 3 in Microsoft Excel.
Because the same samples were evaluated with all three techniques, I used the paired, two-tailed form of the formula.
For example, to run the t-test for A and B:
=TTEST( 30 values of A, 30 values of B, 2, 1)
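To make explicit what the paired, two-tailed TTEST call computes, here is a minimal Python sketch using only the standard library. The three data lists are hypothetical placeholders (six values each rather than 30), and the p-value is obtained by numerically integrating the t density, which is an assumption of this sketch rather than a description of how Excel does it internally.

```python
import math
from itertools import combinations
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, using Simpson's rule on the density over [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)

def paired_t_test(a, b):
    """Paired t-test: a one-sample test on the per-sample differences."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1, two_tailed_p(t, n - 1)

# Hypothetical measurements: the same six samples under each technique.
techniques = {
    "tech1": [10, 12, 9, 11, 13, 10],
    "tech2": [12, 14, 10, 13, 15, 11],
    "tech3": [10, 13, 9, 12, 12, 11],
}

for (name_a, a), (name_b, b) in combinations(techniques.items(), 2):
    t, df, p = paired_t_test(a, b)
    print(f"{name_a} vs {name_b}: t = {t:.3f}, df = {df}, p = {p:.4f}")
```

Because three pairwise tests are run on the same data, a common precaution is to compare each p-value against a stricter threshold such as 0.05/3 (the Bonferroni correction) instead of 0.05.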

Related

how to compute t-test step by step

I am trying to compute the t-test using Excel, without the macro included in the software.
Specifically, given a dataset, for example
var1
21
34
23
32
21
42
32
12
53
31
21 - from here
41
12
14
24 - to here
I am interested in analysing the change in the last five rows (from 21 to 24).
What I did was compute the mean of the two samples, i.e. of set1 (the first ten rows, from 21 to 31) and of set2 (the last five rows, from 21 to 24).
Then I computed the variance of each set, using VAR.S.
Once I had done that, I used the formula for unequal sample sizes and unequal variances to determine the degrees of freedom.
Now I should use the T.DIST function in Excel to get the final result. However, I cannot work out which parameters to supply.
Could you please tell me what I should do, and whether what I have done so far is correct?
Thank you.
Having 21 at the start of both sets of data led to some confusion about which points were being compared with which; a different example (or explicitly identifying both sets in the diagram, not just the second one) could avoid that ambiguity.
Note that if these are time series data, the assumption of independence will usually not be tenable.
You don't need a macro; Excel has a built-in t-test function (T.TEST) that is capable of doing what you ask for.
T.TEST(array1,array2,tails,type)
array1 and array2 should be two non-overlapping sets of observations.
tails should be 1 or 2 depending on whether you need a 1- or 2-tailed test. I presume you want two tailed. (If it's one-tailed, beware; it actually just reports half the two-tailed p-value, which is not correct if the sample means are in the opposite direction to the one hypothesized in the one-tailed alternative.)
type is 1,2 or 3 depending on whether you want paired, independent with equal variance or independent with unequal variance. (It's possible to make the paired test option do a one-sample t-test as well)
There's an explicit example of using this function at the link above.
With your approach, you next need to compute the t-statistic before trying to use the T.DIST function; the t-statistic is that function's first argument.
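For the step-by-step route, the missing pieces are the t-statistic and the degrees of freedom for the unequal-variance (Welch) case. A minimal Python sketch on the data from the question, assuming set1 is the first ten rows and set2 the last five; `statistics.variance` is the sample variance, the same quantity VAR.S returns.

```python
import math
from statistics import mean, variance

set1 = [21, 34, 23, 32, 21, 42, 32, 12, 53, 31]  # first ten rows
set2 = [21, 41, 12, 14, 24]                      # last five rows

def welch_t(a, b):
    """t-statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, df = welch_t(set1, set2)
print(f"t = {t_stat:.4f}, df = {df:.4f}")
```

In Excel, the two-tailed p-value would then come from =T.DIST.2T(ABS(t), df), with t and df replaced by the cells holding these two results.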

count how often a piece of information appears and calculate that average

I do not want to know the traditional frequency or the traditional averages; so I'll give an example below:
I have this data:
1
3
5
5
2
3
5
5
1
3
The analysis that I would like to obtain is the following:
For example, number 1 appears once every eight rows, number 3 appears once every four rows, number 5 appears twice every two rows....
I did it by hand, but now I have more than 21000 rows of data and I'm stuck.
I searched but could not find a function that does this, so before developing my own I decided to ask for guidance on how to achieve it.
I believe that I was able to achieve the desired result:
The formula, ready to copy/paste, is:
=IF(CONCATENATE("1-",MATCH(D1,INDIRECT(ADDRESS(MATCH(D1,A1:A17,0)+1,1,4)&":A17"),0))="1-1",CONCATENATE("2-",MATCH(D1,INDIRECT(ADDRESS(MATCH(D1,A1:A17,0)+2,1,4)&":A17"),0)-1),CONCATENATE("1-",MATCH(D1,INDIRECT(ADDRESS(MATCH(D1,A1:A17,0)+1,1,4)&":A17"),0)))
Note that the IF function handles duplicates (like the number 5). If you have triplicates, you will have to add another instance of IF and adjust the formula accordingly.
Hope that helps!
Well this doesn't exactly reproduce your results, but you could start by looking at the max and min separation of the numbers:
=IF(COUNTIF(A$1:A$10,C2)<=1,"",MIN(IF((ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))>1)
*(ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))<=COUNTIF(A$1:A$10,C2)),
FREQUENCY(IF(A$1:A$10<>C2,ROW(A$1:A$10)),IF(A$1:A$10=C2,ROW(A$1:A$10)))))+1)
=IF(COUNTIF(A$1:A$10,C2)<=1,"",MAX(IF((ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))>1)
*(ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))<=COUNTIF(A$1:A$10,C2)),
FREQUENCY(IF(A$1:A$10<>C2,ROW(A$1:A$10)),IF(A$1:A$10=C2,ROW(A$1:A$10)))))+1)
This gives the min or max number of rows between each occurrence of the particular number.
Must be entered as an array formula using Ctrl+Shift+Enter.
You could add other statistics (like mean, standard deviation) the same way, although the average can be calculated simply as (lastrow-firstrow)/(count-1); e.g. for 5 it would be (8-3)/(4-1)=5/3.
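For comparison, the same min/max/average separation can be sketched outside Excel. This Python version uses the ten rows from the question and defines a gap as the difference between the row numbers of consecutive occurrences, which is an assumption about what "separation" should mean and may be offset by one from what the array formulas report.

```python
from collections import defaultdict

data = [1, 3, 5, 5, 2, 3, 5, 5, 1, 3]

def gap_stats(values):
    """Map each value to (min gap, max gap, mean gap), or None if it occurs once."""
    rows = defaultdict(list)
    for row, v in enumerate(values, start=1):  # 1-based rows, as in a worksheet
        rows[v].append(row)
    stats = {}
    for v, r in rows.items():
        gaps = [later - earlier for earlier, later in zip(r, r[1:])]
        stats[v] = (min(gaps), max(gaps), sum(gaps) / len(gaps)) if gaps else None
    return stats

stats = gap_stats(data)
for value in sorted(stats):
    print(value, stats[value])
```

The mean gap for 5 agrees with the (lastrow-firstrow)/(count-1) shortcut, since the gaps between consecutive occurrences always sum to lastrow-firstrow.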

Excel: Returning a specific value if found

I want to be able to calculate the average of a student's grades; however, if a specific number is found among the averaged subjects, then it should return that specific number AND NOT calculate the average.
Example:
If Jack has the following averaged grades in some number of subjects - 2 3 4 5 5 - then I would want Excel to calculate the average. The answer would be 3.8.
However, if Josh has the following averaged grades - 1 2 3 4 5 - then I would want Excel to return 1 as Josh's average, because it found a specific given number in his averaged grades (in this example, 1). I want it to return 1 as the answer, and not 3.
I don't know if that makes any sense; I tried to make it understandable. I tried combining various functions, but with no results. Do I have to use VBA for that?
Use this formula:
=IFERROR(INDEX(A:A,MATCH(1,A:A,0)),AVERAGE(A1:A5))
Try,
=if(countif(A:A, 1), 1, average(A:A))
=IF(ISNUMBER(MATCH(1,A1:A5,0)), 1, AVERAGE(A1:A5))
=IF(SUMPRODUCT(--(A1:A5=1)), 1, AVERAGE(A1:A5))
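All of these formulas apply the same rule: if the special value is present, return it; otherwise return the average. A minimal Python sketch of that rule, where the function name `grade_or_average` and the `sentinel` parameter are illustrative names and not anything from Excel:

```python
from statistics import mean

def grade_or_average(grades, sentinel=1):
    """Return the sentinel if it appears among the grades, else their mean."""
    return sentinel if sentinel in grades else mean(grades)

jack = [2, 3, 4, 5, 5]
josh = [1, 2, 3, 4, 5]

print(grade_or_average(jack))  # no 1 present, so the average is returned
print(grade_or_average(josh))  # a 1 is present, so 1 is returned
```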

Creating randomized lab partner matrix in Excel

I'd like to use Excel to generate a randomized lab partner list, without using VB (due to security settings on the PCs).
Parameters are as follows:
Number of students: 10-30, one worksheet per total number desired
Number of partners: Three for first two labs, and two for the other four-five.
Number of lab stations: 10
Repeats: Ideally none, but it is permissible for a student to have a repeat partner from one of the first two labs.
Excel version: 2007
To clarify, each student will have two labs where they share a lab station with up to two other students, giving a maximum lab size of 30 students. After that, they will be strictly limited to two students per station, giving a maximum of 20 students. Each student will have four of these limited labs, with there being a total of five such labs presented, to allow for either odd-numbered classes, or a class size between 21-30.
Each student is simply numbered from 1-30, so a cell could, for instance, state "5, 24" as the two students for that lab station.
True RNG is not important, and in fact, only needs to be performed once to make these matrices.
I think this is a bit tricky without using VBA, but here is one approach that is OK for small groups. I have tried it using a group of just nine so that the screen shot should be readable.
The method is basic Fisher-Yates:
A. Start with a group of students of size n, represented by a list of numbers 1 to n.
B. Generate a random number r in the range 1 to n.
C. Pick the rth element from the list.
D. Remove the rth element from the list.
E. Reduce n by 1.
F. Repeat from B until n=1.
In Excel:-
Fill A2:A10 and D2:L2 with numbers 1-9
Put the following in B2 and pull down:-
=RANDBETWEEN(1,10-A2)
Put this in C2 and pull down:-
=OFFSET(D2,0,B2-1)
Put this in D3 and pull down and across:-
=IF(D2>=$C2,E2,D2)
The IDs will be in column C, so the first three would be in group 1, the next three in group 2, etc.
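The draw-and-remove steps and the grouping of column C into threes can be sketched in a few lines of Python. This is only an illustration of the procedure, assuming nine students as in the worksheet; it does not attempt to enforce the no-repeat-partner requirement across labs, and the function names are made up for the example.

```python
import random

def draw_order(n, rng):
    """Steps A-F: repeatedly pick and remove a random element from 1..n."""
    pool = list(range(1, n + 1))       # A: the list of student numbers
    order = []
    while pool:
        r = rng.randint(1, len(pool))  # B: random r in 1..n (inclusive)
        order.append(pool.pop(r - 1))  # C, D: pick and remove the rth element
        # E: n shrinks by one automatically, since len(pool) has decreased
    return order

def make_groups(order, size):
    """Split the drawn order into consecutive groups of the given size."""
    return [order[i:i + size] for i in range(0, len(order), size)]

rng = random.Random()                  # pass a seed here for a repeatable draw
order = draw_order(9, rng)
groups = make_groups(order, 3)
print(groups)
```

Using `size=2` instead gives the pairs for the later labs; with an odd class size the last group is simply shorter.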
By the way, your question is a special case of generating non-repeating random numbers - see
Generating unique random numbers without VBA
The array formula described here does it in one step - modified slightly for this problem it would look like
=SMALL(IF(COUNTIF(C$1:C1,ROW(INDIRECT("1:9")))=0,ROW(INDIRECT("1:9"))),RANDBETWEEN(1,(9-ROWS(C$2:C2)+1)))

How to determine statistical significance using T-Test in Excel?

I have two data sets, A and B. I would like to know whether the average value of A
differs significantly from B's average. How can I do that in Excel 2007?
(I know there's a TTEST formula in Excel, and I also know I don't need the paired version of it. What other parameters do I need to set, and how do I interpret the result?)
Thanks,
Jon
=ttest(array1,array2,tails,type)
array1 is data set A
array2 is data set B
tails: 1 = one-tailed, 2 = two-tailed. Use one-tailed if you are testing specifically whether A is higher than B, or specifically whether A is lower than B. Use two-tailed if you are testing whether A is either higher or lower than B. (Probably 2 for your situation, since you are asking whether A differs from B in either direction.)
type: You said you don't need paired, which is Type 1. Type 2 is if your data sets have equal variance, and Type 3 is if they have unequal variance. For example if the data points in A are all pretty close, but in B they are wildly different, use Type 3.
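On interpreting the result: TTEST returns a p-value, and by convention a value below a chosen threshold (often 0.05) is read as a significant difference between the means. To show what the Type 2 (equal variance) option computes, here is a minimal Python sketch of the pooled-variance t-statistic. The two samples are hypothetical placeholders, not data from the question.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Independent two-sample t-statistic assuming equal variances (Type 2)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, df

# Hypothetical data sets A and B.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.8, 4.5, 5.0, 4.4, 4.7]

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

Type 3 differs only in how the standard error and the degrees of freedom are computed; when the two spreads are clearly different it is the safer choice.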

Resources