Data analysis for behavioural data - statistics

I have an organism x with n individuals tested. Each individual is tested only once. The individuals are trained and tested in a choice array with multiple possible choices. The test choices are of two types, p and q, with 10 of each, giving a total of 20 possible choices. Each individual is allowed to choose as many times as it wants until it is done making choices. What do I do with this data set? How do I analyse the preference?

Related

How to deal with non triplicated data in triplicated dataset

My problem is the following. I have a dataset with 10 variables and 8 samples. Each sample has been analysed in triplicate, so I have a dataset of 24 rows. However, some analyses (variables) were not performed in triplicate. Where an analysis was only done once, I have to introduce NA to fill the blanks. Where an analysis was performed more than three times, I have to introduce new rows that add NA for the analyses which were in fact done three times.
My ultimate goal is to apply ANOVA to this dataset. I have thought about repeating the value where I only have 1 analysis, and randomly eliminating values where I have more than 3 analyses, but I have the feeling this is not the most orthodox way to proceed.
I hope it is clear enough.
Thanks in advance!

Excel logical numeric evaluation

So I have a sheet where I am assigning levels to individuals based on their training, i.e. level 4 SME, level 3 trained and can train others, level 2 trained, and level 1 untrained. For each shift, I want at least 2 level 2 individuals. If so, readiness is 100%; anything above that is over 100% (which is fine), but anything less should be less than 100%. I am trying to do this with formulas but it is not working the way I want.
Table Layout
Formulas
The above example would show more than 100% because there are more than two people at level 2. I wish there was a way to loop in Excel so that I could increment a number for every count of 2.
How about this
=IF(COUNTIF(B2:E2,">=2")>=2,">=100%","not ready")
You can do a certain amount of looping in the UI with the LAMBDA function if you have Office 365, although I'm not clear as to your requirement.
=COUNTIF(B2:E2,">=2") * 0.5 --> format as percent
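For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of what the second formula computes: count the people at level 2 or above and divide by the 2 required. The function name `readiness`, its parameters, and the sample shift are made up for illustration and are not from the original sheet.

```python
def readiness(levels, min_level=2, required=2):
    """Readiness as a fraction: 1.0 means exactly `required` people qualify."""
    # Same idea as COUNTIF(B2:E2,">=2"): count people at or above the minimum level.
    qualified = sum(1 for level in levels if level >= min_level)
    # Dividing by 2 is the same as multiplying the count by 0.5.
    return qualified / required


# Hypothetical shift with four people at levels 4, 2, 1 and 3.
shift = [4, 2, 1, 3]
print(f"{readiness(shift):.0%}")  # three qualified people -> 150%
```

A shift with only one qualified person comes out at 50%, which matches the "less than 100%" requirement in the question.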

I am wondering if the statistical analysis I did makes any sense

I am helping with a retrospective study and the data isn't very well organized. Also, I am new to statistics, so I took a stab at analyzing the data myself. We will be getting the help of a statistician later on, but not sure when yet.
We are looking at about 100 patients, and each patient was followed up for a variable amount of time. Throughout each patient's follow-up, a variable number of observations were made at various timepoints. The observations included a set of lab values, anthropometric data, and demographic data. To conduct the analysis, we split the observations into time bins (e.g. 6 months follow-up, 1 year follow-up, etc.). Then, for each timepoint, we categorized each patient into one of 3 groups based on the outcome of interest. Also, for each timepoint, we selected one observation to represent each patient (since there could be many within the same time bin). For the analysis, we did the following:
1. ANOVA within each timepoint to compare the 3 outcome groups, looking at select independent variables of interest.
2. For the same variables of interest, a repeated measures ANOVA to see if they change over time.
3. Test for correlations between the variables of interest mentioned above and the other independent variables.
4. Test each independent variable in a univariate binomial logistic regression to see if it predicts outcome. There were 3 groups, so we did pairwise regressions (e.g. (outcome 1 + 2) vs (outcome 3), and (outcome 1) vs (outcome 2 + 3)).
5. Do a multivariate binomial logistic regression with forward elimination using only the significant independent variables retained from step 4.
6. If any independent variables of interest are retained in the MV regression, run it again testing for potential interactions with any variables they were correlated with in step 3. We tried to do this by making a new variable that is the product of the two variables and putting it into the regression.
What I'm trying to show with this analysis is that one key independent variable explains the difference in outcomes among the patients. So far my analysis seems to be doing this, as it seems to be one of the few variables retained at step 6 and with a good significance value. So sorry if this is very confusing to read.

Generate permutations

I have n players to assign to n games, where 10 <= n <= 20. Each player can sign up for up to 3 games but will only get one. Each player has a different score for each game they sign up for.
Example with 10 players:
It's always possible to assign player x to game x, but that will not always give the highest total score.
My goal is to get as high a score as possible, so I want to test the different permutations. I could theoretically test all permutations and throw away the infeasible ones, but that gives a huge number of possibilities (n!).
Is it possible to reduce the problem with the sign up limit of max 3 games? Maybe this can be done more easily than my approach? Any thoughts?
I'm working in Excel VBA.
I hope you find this as interesting as I do ...
Sorry if you find this unclear! My question is whether it's possible to generate a subset of all the permutations. More precisely, only the feasible ones (the ones without any zero score).
Well, just set this up in the solver using Linear Programming as you can see in the image. Have shown the formulae so you can build it as well, along with the solver settings.
Won't give the permutations, but does solve for the highest combination.
Edit, updated image... it now shows correct ranges for the calculations, after trying to make it fit a reasonable size...
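If you do want to walk through only the feasible permutations rather than use the solver, a backtracking search is one way to do it. The sketch below is in Python rather than VBA, and the function name `best_assignment` and the 4-player score table are invented for illustration. Each player only branches on the games they signed up for, so the search has at most 3^n leaves instead of n!, and branches are cut as soon as a game is already taken.

```python
def best_assignment(scores):
    """Find the highest-scoring feasible assignment by backtracking.

    scores[p] is a dict {game index: score} holding only the games
    that player p signed up for (at most 3 entries per player).
    Returns (best total, list giving each player's game).
    """
    n = len(scores)
    best = {"total": -1, "games": None}
    used = [False] * n        # used[g] is True once game g is taken
    current = [None] * n      # current[p] is the game given to player p

    def search(player, total):
        if player == n:       # every player has a game: a feasible permutation
            if total > best["total"]:
                best["total"] = total
                best["games"] = current.copy()
            return
        for game, score in scores[player].items():
            if not used[game]:            # skip games that are already taken
                used[game] = True
                current[player] = game
                search(player + 1, total + score)
                used[game] = False        # undo and try the next option

    search(0, 0)
    return best["total"], best["games"]


# Hypothetical sign-ups: player x always signed up for game x.
example = [
    {0: 5, 1: 9},
    {1: 4, 2: 7},
    {2: 3, 0: 8},
    {3: 6, 1: 2, 2: 1},
]
print(best_assignment(example))  # (30, [1, 2, 0, 3])
```

In this made-up example the "player x to game x" assignment scores 18, while the best feasible permutation scores 30. For n close to 20 the search can still be slow in the worst case, so the solver approach is the safer default.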

Creating randomized lab partner matrix in Excel

I'd like to use Excel to generate a randomized lab partner list, without using VB (due to security settings on the PCs).
Parameters are as follows:
Number of students: 10-30, one worksheet per total number desired
Number of partners: Three for first two labs, and two for the other four-five.
Number of lab stations: 10
Repeats: Ideally none, but it is permissible for a student to have a repeat partner from one of the first two labs.
Excel version: 2007
To clarify, each student will have two labs where they share a lab station with up to two other students, giving a maximum lab size of 30 students. After that, they will be strictly limited to two students per station, giving a maximum of 20 students. Each student will have four of these limited labs, with there being a total of five such labs presented, to allow for either odd-numbered classes, or a class size between 21-30.
Each student is simply numbered from 1-30, so a cell could, for instance, state "5, 24" as the two students for that lab station.
True RNG is not important, and in fact, only needs to be performed once to make these matrices.
I think this is a bit tricky without using VBA, but here is one approach that is OK for small groups. I have tried it using a group of just nine so that the screen shot should be readable.
The method is basic Fisher-Yates
A Start with a group of students size n represented by a list of numbers 1 to n.
B Generate a random number r in range 1 to n
C Pick the rth element from the list
D Remove the rth element from the list
E Reduce n by 1
F Repeat from B until n=1.
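Steps A to F can be sketched in a few lines of Python, which may help when checking the spreadsheet version. The names `draw_order` and `groups_of` are made up for this sketch; the loop runs until the list is empty, so the last remaining student is picked as well.

```python
import random


def draw_order(n, rng=random):
    """Return the numbers 1..n in random order using pick-and-remove."""
    pool = list(range(1, n + 1))        # A: list of students 1..n
    order = []
    while pool:
        r = rng.randint(1, len(pool))   # B: random r in 1..n (inclusive)
        order.append(pool.pop(r - 1))   # C + D: pick the r-th element and remove it
        # E: len(pool) is now one smaller, so the next draw uses the reduced n
    return order                        # F: loop repeats until nobody is left


def groups_of(order, size):
    """Split the shuffled order into consecutive groups of `size`."""
    return [order[i:i + size] for i in range(0, len(order), size)]


order = draw_order(9)
print(groups_of(order, 3))  # three lab groups of three students each
```

Taking the shuffled list three at a time gives the lab groups.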
In Excel:-
Fill A2:A10 and D2:L2 with numbers 1-9
Put the following in B2 and pull down:-
=RANDBETWEEN(1,10-A2)
Put this in C2 and pull down:-
=OFFSET(D2,0,B2-1)
Put this in D3 and pull down and across:-
=IF(D2>=$C2,E2,D2)
The IDs will be in column C, so the first three would be in group 1, the next three in group 2, etc.
By the way, your question is a special case of generating non-repeating random numbers - see
Generating unique random numbers without VBA
The array formula described here does it in one step - modified slightly for this problem it would look like
=SMALL(IF(COUNTIF(C$1:C1,ROW(INDIRECT("1:9")))=0,ROW(INDIRECT("1:9"))),RANDBETWEEN(1,(9-ROWS(C$2:C2)+1)))
