What type of ANOVA - statistics

What type of ANOVA fits five treatment groups? I have data for the number of colds reported as a function of vitamin C dose:
0mg 250mg 500mg 100mg 2000mg
5 6 4 6 3
6 5 6 6 0
2 4 2 3 1
5 4 5 0 3

This is a simple one-way ANOVA: one factor with five treatment levels. Be aware that with only four observations per group, your power is low.
Also be aware that your data are counts (integers, not continuous), so you may need to transform the response (for example log(x + 1), since some of your counts are 0) or use a Poisson model.
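To make the one-way layout concrete, here is a minimal sketch that computes the F statistic by hand in plain Python. The function name one_way_anova_f is made up for this illustration, the groups are the columns of the table in the question exactly as posted, and it only produces the F ratio and degrees of freedom (no p-value, and none of the count-data caveats above are addressed).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Columns of the table in the question, in the order posted.
colds = [
    [5, 6, 2, 5],  # 0mg
    [6, 5, 4, 4],  # 250mg
    [4, 6, 2, 5],  # 500mg
    [6, 6, 3, 0],  # "100mg" as posted
    [3, 0, 1, 3],  # 2000mg
]
f_stat, df_b, df_w = one_way_anova_f(colds)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(4, 15) = 1.67
```

The 4 and 15 degrees of freedom come from five groups and twenty observations in total.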

Related

Excel: SUMPRODUCT calculating shared workload in hours with multiple percentages

This is the same problem as (Excel: SUMPRODUCT calculating shared workload in hours with percentages) with an addition.
I'm trying to calculate the workload/hours for each employee for certain projects. In column B you can see the responsible employee (100% of the workload); in C you can see which employee is taking 50% or 25% of the workload off the responsible employee. So I need the sum of all hours, while deducting or adding the 50% or 25% in case the workload is shared and giving it to the employee helping.
The latest formula adding/subtracting only the 50% in C:
=SUMPRODUCT(($A6=$B$2:$B$5)*($B$1-$C$1*(""<>$C$2:$C$5))*E$2:E$5+($A6=$C$2:$C$5)*$C$1*E$2:E$5)
What would be the most elegant solution? I need the option to change the percentage.
100% 50% 25% Day 1 Day 2 Day 3
Project 1 Mark Peter 6 2 6
Project 2 Peter Lily 2 8 2
Project 3 Peter Lily 0 4 8
Project 4 Lily Mark 4 0 2
Mark 8 2 7
Peter 1 8 9
Lily 3 4 2
Perhaps:
= SUM(($B$2:$D$5=$A6)*$B$1:$D$1*E$2:E$5)
- SUM( ($B$2:$B$5=$A6)*NOT(ISBLANK($C$2:$D$5))*($C$2:$D$5<>$A6)*E$2:E$5*$C$1:$D$1 )
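The bookkeeping behind this kind of formula can be sketched in Python under one reading of the problem: the responsible employee keeps 100% minus whatever share the helpers take, and each helper gets their share. The function hours_for and the data layout are made up for this illustration, the hours are the Day 1 column, and the placement of each helper in the 50% or 25% column is an assumption, because the posted table does not show which column each helper sits in.

```python
SHARES = {"50%": 0.5, "25%": 0.25}  # editable, like the percentage header cells

# (responsible, {share column: helper}, hours for one day)
# Which helper sits in which share column is ASSUMED here for illustration.
projects = [
    ("Mark",  {"50%": "Peter"}, 6),
    ("Peter", {"50%": "Lily"},  2),
    ("Peter", {"25%": "Lily"},  0),
    ("Lily",  {"25%": "Mark"},  4),
]

def hours_for(name, projects, shares):
    """Hours kept as the responsible employee plus hours taken over as a helper."""
    total = 0.0
    for responsible, helpers, hours in projects:
        if responsible == name:
            # The responsible employee keeps whatever the helpers do not take.
            handed_off = sum(shares[col] for col in helpers)
            total += hours * (1.0 - handed_off)
        for col, helper in helpers.items():
            if helper == name:
                total += hours * shares[col]
    return total

workload = {n: hours_for(n, projects, SHARES) for n in ("Mark", "Peter", "Lily")}
print(workload)
```

A quick sanity check for any variant of the formula is that the per-employee totals add up to the total hours of the day (12 here).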

generate normalized discrete values for feature engineering

I have a dataframe in which one column stores discrete values, as shown below. I would like to create another column storing the normalized values. For instance, for 4050, the corresponding entry would be 4. Is there an efficient way to do that without writing my own function? Does sklearn have a function for generating normalized values?
Based on your comment:
there are around 20 different values, and the range is from 1000 to 9999, so I would like to use every 1000 as a category
This isn't really normalization in the strict sense of the word. However, to do that, you can easily use floor division (//):
df['new_column'] = df['values']//1000
For example:
>>> df
values
0 2021
1 8093
2 9870
3 4508
4 2645
5 1441
6 8888
7 8921
8 7292
9 8571
>>> df['new_column'] = df['values']//1000
>>> df
values new_column
0 2021 2
1 8093 8
2 9870 9
3 4508 4
4 2645 2
5 1441 1
6 8888 8
7 8921 8
8 7292 7
9 8571 8

Using Excel to allocate values based off their rank while remaining within constraints

I am trying to create a resource calculator that can tell me how many people I need to put on each section depending on the current work waiting and the work coming in, prioritizing the sections that have the most work waiting.
Upper Limit  Allocation  Prod  Ranking
12           [to calc]   28%   1
15                       18%   2
5                        17%   3
4                        8%    4
2                        6%    5
3                        .2%   6
4                        .2%   6
Similar to the other question, I have a constraint: I only have so much to allocate. For this example we will use 38 as the amount to be allocated.
I have used the formula from the other answer:
=MIN(A2,$E$1-SUMIF($D$2:$D$8,"<"&D2,$B$2:$B$8))
Where E1 contains the total to be allocated.
I have two issues with this formula:
1) I require a minimum of at least 1 person in each of these sections.
I have tried using a MAX function to simply set this value, but this leads to the allocated resources going over the total amount.
What formula would I need so that it accounts for the total available to allocate, the minimum requirement for each section, and the maximum limit for each section?
2) It only returns whole integers. Would there be a way to retrieve more precise results, maybe by changing it to a % distribution?
UL  Alloc  Rank  Capacity  Lower Limit
 2      1    15        93            1
 3      1    15
 4      1    15
 6      6     8
 1      1    15
 2      1    15
 4      4     9
 2      2     7
 4      4     4
15     15     2
12     12    10
12     12     1
 1      1    11
13     13     5
 6      6     6
 5      1    15
 5      5     3
 1      1    14
 2      2    13
 3      3    12
 3      1    15
Reference: Using the Excel's Rank() function to calculate allocations based on ranking and constraints
Simply subtract the minimum (100 in this formula) on all sides and add it back separately:
=MIN(A2-100,($E$1-100*COUNTA($A$2:$A$8))-(SUMIF($D$2:$D$8,"<"&D2,$B$2:$B$8)-COUNTIF($D$2:$D$8,"<"&D2)*100))+100
What is returned depends on your entries in Column A and in E1. You can change Column A based on a percentage distribution and the formula will return the corresponding values.
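The same idea (reserve the minimum for every row first, then hand out the remainder in rank order up to each upper limit) can be sketched in Python as a cross-check. The function name allocate is made up for this illustration; the inputs are the UL and Rank columns of the 21-row table above, with a capacity of 93 and a lower limit of 1. Tied ranks are served in row order here, which is an assumption of this sketch and not necessarily how the spreadsheet formula treats ties.

```python
def allocate(upper_limits, ranks, capacity, lower_limit):
    """Give every row the lower limit, then fill rows in rank order up to their limit."""
    n = len(upper_limits)
    remaining = capacity - lower_limit * n  # what is left after the reserved minimum
    if remaining < 0:
        raise ValueError("capacity cannot cover the lower limit for every row")
    alloc = [lower_limit] * n
    # Visit rows from best rank (1) to worst; ties are served in row order.
    for i in sorted(range(n), key=lambda i: ranks[i]):
        extra = max(0, min(upper_limits[i] - lower_limit, remaining))
        alloc[i] += extra
        remaining -= extra
    return alloc

upper = [2, 3, 4, 6, 1, 2, 4, 2, 4, 15, 12, 12, 1, 13, 6, 5, 5, 1, 2, 3, 3]
ranks = [15, 15, 15, 8, 15, 15, 9, 7, 4, 2, 10, 1, 11, 5, 6, 15, 3, 14, 13, 12, 15]
result = allocate(upper, ranks, capacity=93, lower_limit=1)
print(result)
print(sum(result))  # 93
```

With these inputs the sketch reproduces the Alloc column of the table, and the total never exceeds the capacity because only the remainder after the reserved minimum is handed out.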
Edit:
If you put your lower threshold in F2 and your constraint in E2, and use this formula
=MIN(A2-$F$2,($E$2-$F$2*COUNTA($A$2:$A$8))-(SUMIF($D$2:$D$8,"<"&D2,$B$2:$B$8)-COUNTIF($D$2:$D$8,"<"&D2)*$F$2))+$F$2
the result looks like this:

Predictive formula

I would like to predict how much we should keep in a box, the "Magical-box".
The "Magical-box" should be able to predict the value of the next deposit into it.
For example:
DepositNbr Coins MagicBox
1 6 6
2 4 4
3 10 13 <==> the prediction process may start from the third deposit
4 13 8
5 8 23
6 23 2
7 2 ...
Is there any way to perform this prediction based on past or present values? Any formula (Markov chain, normal distribution, regression, ...) is welcome.

slicing table into two parts and box it afterwards

I have a table like the following
0 1  2  3
4 5  6  7
8 9 10 11
and I want to make the following structure.
┌──────┬──┐
│0 1  2│ 3│
│4 5  6│ 7│
│8 9 10│11│
└──────┴──┘
Could anyone please help me?
And in J there is always another way!
   ]a=. i. 3 4
0 1  2  3
4 5  6  7
8 9 10 11
   ('' ;1 0 0 1) <;.1 a
┌──────┬──┐
│0 1  2│ 3│
│4 5  6│ 7│
│8 9 10│11│
└──────┴──┘
This uses the dyadic cut conjunction (;.), which has the general form x u ;. n y
y is the argument that we would like to partition, and x specifies where the partitions are to be put. n is positive if we would like the frets (the partition positions) included in the result, and a value of 1 means that each fret marks the start of a partition. u is the verb that we would like to apply to each partition.
One tricky point:
x is ('';1 0 0 1) because we want the entire first dimension of the array (rows) after which the 1's indicate the partition start. In this case we take all the rows and make the first partition the first 3 columns, and the final 1 makes the last partition its own column.
There is much going on in this solution, and that allows it to be used in many different ways, depending on the needs of the programmer.
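For readers less familiar with J, the fret idea can be sketched in plain Python. This is only an analogy for the column partitioning above, not a model of everything cut can do: the helper cut_columns is made up for this illustration, and it takes a 0/1 list in which each 1 starts a new group of columns, like the 1 0 0 1 in the J expression.

```python
def cut_columns(rows, frets):
    """Split every row into column groups; each 1 in frets starts a new group."""
    starts = [i for i, f in enumerate(frets) if f]
    bounds = starts + [len(frets)]
    # One "box" per pair of neighbouring boundaries, holding that slice of every row.
    return [[row[a:b] for row in rows] for a, b in zip(bounds, bounds[1:])]

table = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]
boxes = cut_columns(table, [1, 0, 0, 1])
print(boxes[0])  # [[0, 1, 2], [4, 5, 6], [8, 9, 10]]
print(boxes[1])  # [[3], [7], [11]]
```

Changing the fret list to [1, 0, 1, 0] would give two groups of two columns each, which is the kind of flexibility the paragraph above refers to.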
The title of your question ("slicing table into two parts and box it afterwards") suggests that the example you sketch may not reflect what you want to learn.
My impression is that you think of your resulting noun as a two-axis table boxed into two sections. The main problem with that interpretation is that boxes divide their contents very thoroughly. It takes special effort to make the numbers in your second box look like they've been trimmed from the structure in the first box. Such effort is rarely worthwhile.
If it is natural to need to take the 3 7 11 and remove it as a unit from the structure in which it occurs, there is an advantage to making it a row of the table, rather than a column. A 2-axis table is always a list of 1-axis lists. If your problem is a matter of segregating items, this orientation of the atoms makes it simpler to do.
Putting this into practice, here we deal with rows instead of columns:
   aa=: |:i.3 4
   aa
0 4  8
1 5  9
2 6 10
3 7 11
   (}: ; {:) aa
+------+------+
|0 4  8|3 7 11|
|1 5  9|      |
|2 6 10|      |
+------+------+
The program, in parentheses, can be read literally as "curtail link tail". This is the sort of program I'd expect from the title of your question.
Part of effective J programming is orienting the data (nouns) so that they are more readily manipulated by the programs (verbs).
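The same point about orientation carries over to other languages. As a rough Python analogy (nested lists standing in for the J table, with zip(*table) playing the role of transpose), once the 3 7 11 is a row, separating it is just "everything but the last item" and "the last item":

```python
table = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]

# Transpose so that the old last column becomes the last row (like |: in J).
transposed = [list(col) for col in zip(*table)]

# "Curtail" and "tail": all rows but the last, and the last row on its own.
curtailed, tail = transposed[:-1], transposed[-1]

print(curtailed)  # [[0, 4, 8], [1, 5, 9], [2, 6, 10]]
print(tail)       # [3, 7, 11]
```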
Here is one way:
   ]a=: i. 3 4
0 1  2  3
4 5  6  7
8 9 10 11
   3 ({."1 ; }."1) a
┌──────┬──┐
│0 1  2│ 3│
│4 5  6│ 7│
│8 9 10│11│
└──────┴──┘
In other words "take the first 3 items in each row of a and Link (;) with the result of dropping the first 3 items in each row of a"
Other methods and/or structures may be more appropriate depending on the exact use case.
