Create a histogram in Excel with three different levels - excel

Based on the picture that I have uploaded, how should I create it in Excel with three different levels?
Thanks!
Grades  Bins  Frequency  Intervals
9       9     0          0-9
6       19    2          10-19
1       29    1          20-29
7       39
5
5
2
4
6
2
10
11
15
18
20
21
23
25
26
27
29

To create a histogram in Excel you would use FREQUENCY.
This is no different when the data is split into groups, such as the groups for the colours.
The difference is that you would use an IF statement referring to the group column:
=FREQUENCY(IF(GroupRange="GroupName",DataRange),BinRange)
So if your data were in B2:B40, your group identifier in C2:C40, and your bins in E2:E12, you would use a formula such as:
=FREQUENCY(IF($C$2:$C$40="B",$B$2:$B$40),$E$2:$E$12)
Then put the groups next to each other, changing the "B" (or whatever the group name is) as you go.
Hopefully this will get you on the right track.
(Note: FREQUENCY must be array-entered into all cells in line with the bin range, using [Ctrl]+[Shift]+[Enter].)
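For anyone who wants to check the numbers outside Excel, here is a minimal Python sketch of the same idea: one FREQUENCY-style count per group. The function names and the sample rows are made up for illustration. It assumes the bins are sorted ascending and, like FREQUENCY, returns one extra count for values above the highest bin.

```python
from bisect import bisect_left

def frequency(values, bins):
    """Count values per bin, where each bin is an upper limit (value <= bin).

    The result has len(bins) + 1 entries; the last one counts values
    larger than the highest bin. `bins` must be sorted ascending.
    """
    counts = [0] * (len(bins) + 1)
    for v in values:
        # bisect_left finds the first bin that is >= v
        counts[bisect_left(bins, v)] += 1
    return counts

def grouped_frequency(rows, bins):
    """rows is a list of (group, value) pairs; returns {group: counts}."""
    by_group = {}
    for group, value in rows:
        by_group.setdefault(group, []).append(value)
    return {g: frequency(vals, bins) for g, vals in by_group.items()}

if __name__ == "__main__":
    # Hypothetical data: group label plus grade
    sample = [("A", 9), ("A", 10), ("B", 19), ("B", 25), ("B", 40)]
    print(grouped_frequency(sample, [9, 19, 29, 39]))
```

Each dictionary entry corresponds to one of the side-by-side FREQUENCY columns described above.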

Related

How to list all possible combinations of the values in FIVE columns in excel?

There is a question and answer already out there, "How to list all possible combinations of the values in three columns in excel?" That formula works exactly how I want it to, but I need to add two additional columns, and I do not understand the current formula well enough to extend it.
The current formula works for 3 columns. It needs to be updated to include 5:
=IFERROR(INDEX($A:$A,IF(INT((ROW(1:1)-1)/(((COUNTA(B:B)-1)*((COUNTA(C:C)-1)))))+2>COUNTA(A:A),-1,INT((ROW(1:1)-1)/(((COUNTA(B:B)-1)*((COUNTA(C:C)-1)))))+2))&" "&INDEX(B:B,MOD(INT((ROW(1:1)-1)/(COUNTA(C:C)-1)),(COUNTA(B:B)-1))+2)&" "&INDEX(C:C,MOD((ROW(1:1)-1),(COUNTA(C:C)-1))+2),"")
Also, if there is a way to explain how to add or remove a column, that would be extremely helpful as well.
Site  Product Type  Labor Hours  Machine Hours  Batch Size  Output (current formula)
MAR   UV            2            2              100         MAR UV 2
BEL   SOLVENT       5            5              300         MAR UV 5
      WATER         8            8              750         MAR UV 8
                    13           13             1750        MAR UV 13
                    18           18             3750        MAR UV 18
                                                5000        MAR SOLVENT 2
                                                            MAR SOLVENT 5
                                                            MAR SOLVENT 8
                                                            MAR SOLVENT 13
                                                            MAR SOLVENT 18
                                                            MAR WATER 2
                                                            MAR WATER 5
                                                            MAR WATER 8
                                                            MAR WATER 13
                                                            MAR WATER 18
                                                            BEL UV 2
                                                            BEL UV 5
                                                            BEL UV 8
                                                            BEL UV 13
                                                            BEL UV 18
                                                            BEL SOLVENT 2
                                                            BEL SOLVENT 5
                                                            BEL SOLVENT 8
                                                            BEL SOLVENT 13
                                                            BEL SOLVENT 18
                                                            BEL WATER 2
                                                            BEL WATER 5
                                                            BEL WATER 8
                                                            BEL WATER 13
                                                            BEL WATER 18
This is what I am seeing right now based on the current formula. It is only including the first 3 columns. I need it to include the next 2 as well. I also like this formula because it doesn't care how many additional rows will be in each column which may change dramatically in the future.
Below is the original question that has only 3 columns in the formulas
How to list all possible combinations of the values in three columns in excel?
Here's a way to do it without a formula:
Create a Pivot Table for your columns.
Set your columns, in order, in the "Rows" field of the PivotTable.
Change your Layout to "Tabular Form", and "Repeat Item Labels"
Remove all Totals and Subtotals
In your filters, untick (blank)
Change your Fields to have the following settings:
Include new items in manual filter
Show items with no data
This will automatically give you all items. If you add items to the list, just right-click on your PivotTable and Refresh.
I'm answering this part only :
a way to explain how to add for an additional or subtract a column
According to your first 3 columns, "Site-Product-Type", there are "2-3-5" items in each column. Separating the original formula into 3 lines:
=IFERROR( INDEX($A:$A,IF(INT((ROW(1:1)-1)/(((COUNTA(B:B)-1)*((COUNTA(C:C)-1)))))+2>COUNTA(A:A),-1,INT((ROW(1:1)-1)/(((COUNTA(B:B)-1)*((COUNTA(C:C)-1)))))+2))
&" "& INDEX(B:B,MOD(INT((ROW(1:1)-1)/(COUNTA(C:C)-1)),(COUNTA(B:B)-1))+2)
&" "& INDEX(C:C,MOD((ROW(1:1)-1),(COUNTA(C:C)-1))+2),"")
So each "Site" item needs to be repeated 3 * 5 times (15 times, e.g. MAR and BEL). [line 1]
And for each "Site" item, each "Product" item needs to be repeated 5 times (e.g. UV, SOLVENT, WATER). [line 2]
And for each "Product" item, each "Type" item needs to be repeated 1 time (e.g. 2, 5, 8, 13, 18). [line 3]
So the total number of outputs = 2 * 3 * 5 = 30. [This limit is enforced by the IFERROR(... , "") part of the formula: there is no output after 30 (or 2 * 3 * 5) lines.]
In the cited formula, this is done by combining the row number (used as a counter, via ROW()), COUNTA() (to count the number of elements in each column), MOD() (to get the repetition), and INDEX() (to fetch each column item, depending on the row number being processed; more info: last formulae in this link).
Taking it to 5 columns, "Site-Product-Type-Labor-Hours":
Get the number of elements/items for each column. (You should get 2-3-5-5-6.)
So the total number of outputs = 2 * 3 * 5 * 5 * 6 = 900.
each "Site" item needs to be repeated 3 * 5 * 5 * 6 times
for each "Site" item, each "Product" item needs to be repeated 5 * 5 * 6 times
for each "Product" item, each "Type" item needs to be repeated 5 * 6 times
for each "Type" item, each "Labor" item needs to be repeated 6 times
for each "Labor" item, each "Hours" item needs to be repeated 1 time
If you remove a column, just use the same pattern.
I hope you get the logic. ( :
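To make the repetition pattern concrete, here is a small Python sketch (not an Excel formula) that picks the item for each column from a 0-based row counter using integer division and modulo, which is the same role INT and MOD play in the formula. The function name combination_at is made up for illustration, and the column values are taken from the table in the question.

```python
from itertools import product

def combination_at(index, columns):
    """Return the combination for a 0-based row counter.

    Working from the last column backwards, each item is repeated
    `repeat` times, where `repeat` is the product of the sizes of
    all columns to its right (1 for the last column).
    """
    picks = []
    repeat = 1
    for col in reversed(columns):
        picks.append(col[(index // repeat) % len(col)])
        repeat *= len(col)
    return tuple(reversed(picks))

if __name__ == "__main__":
    columns = [
        ["MAR", "BEL"],
        ["UV", "SOLVENT", "WATER"],
        [2, 5, 8, 13, 18],
        [2, 5, 8, 13, 18],
        [100, 300, 750, 1750, 3750, 5000],
    ]
    total = 2 * 3 * 5 * 5 * 6  # 900 rows
    rows = [combination_at(i, columns) for i in range(total)]
    # Cross-check against the standard library's Cartesian product
    print(rows == list(product(*columns)))
```

Adding or removing a column only changes the list of columns; the repeat factor for each column is recomputed from whatever sits to its right.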

Percentage Greater than/Less than in a table

I have a table that I want to find the percentage greater than and percentage less than compared to a baseline, for the total group based on the weights of each group.
Here is my example table:
Benchmark   GRP 1  GRP 2  GRP 3  GRP 4
10          10     11     10     12
14          12     15     11     15
17          11     17     13     16
18          14     15     14     17
Population  40     45     30     80
What I want to do is find out, for each level of the benchmark, what % of the total population of all four groups is above or below the benchmark value.
I have tried various sumproducts and sumifs but can't seem to get it work.
Let me know your thoughts!
Thanks as always!
Assuming your sample data is in A1:E7 put the following formula into B9 and use Ctrl+Shift+Enter to record it as an array formula:
=SUM(IF(B$2:B$5>$A$2:$A$5,1,0))/COUNTA($A$2:$A$5)
This can then be copied across under the other groups. Below is showing how it works for me.
Note: The array formula will display with braces ({...}) around it but you do not type these.
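Note that the formula above gives, for each group, the share of benchmark levels that the group exceeds; it does not use the Population row. If the goal is a population-weighted share per benchmark level, one possible reading of the question is sketched below in Python. The function name, the dictionary layout, and the treatment of ties (a value equal to the benchmark counts as neither above nor below) are all assumptions.

```python
def population_share(benchmarks, groups, population):
    """For each benchmark level, return (benchmark, share_above, share_below).

    groups maps a group name to its values, one per benchmark level.
    population maps a group name to its population (the weight).
    A group equal to the benchmark counts as neither above nor below.
    """
    total = sum(population.values())
    result = []
    for i, bench in enumerate(benchmarks):
        above = sum(population[g] for g, vals in groups.items() if vals[i] > bench)
        below = sum(population[g] for g, vals in groups.items() if vals[i] < bench)
        result.append((bench, above / total, below / total))
    return result

if __name__ == "__main__":
    benchmarks = [10, 14, 17, 18]
    groups = {
        "GRP 1": [10, 12, 11, 14],
        "GRP 2": [11, 15, 17, 15],
        "GRP 3": [10, 11, 13, 14],
        "GRP 4": [12, 15, 16, 17],
    }
    population = {"GRP 1": 40, "GRP 2": 45, "GRP 3": 30, "GRP 4": 80}
    for bench, above, below in population_share(benchmarks, groups, population):
        print(bench, f"{above:.1%} above", f"{below:.1%} below")
```

In Excel the same weighting could probably be done with SUMPRODUCT over the Population row, but the sketch is only meant to pin down the arithmetic.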

Working with output of PI break sets of numbers, remove numbers

I have a list of numbers outputted from a review of PI (the number 3.1415926535897932384......)
It looks like this:
8850838032 0621312483 8327044318 1257233570 9958940293 1391776730 2923888859 5836058683 5192760238 4694561699 : 110000000001
9312900154 4838526183 9375914106 9846458403 5847003707 2451543553 9394699328 5157228504 5434270590 6509736487 : 110000000002
1284090545 3175919151 4159855781 3410862263 2549812643 7600394225 7109902021 0694219181 6542482795 7164656581 : 110000000003
1367977800 8915483236 6072599505 1466161901 1090687303 7608155585 3289637107 6490574006 0401938787 7258319674 : 110000000004
The list is in Notepad (a .txt file).
There is a set of 10 digits per number.
There is a set of 10 numbers per line (10 digits per number x 10 numbers per line).
There is a reference number at the end (the : 110000000001, : 110000000002 etc. numbers).
Here's what I would like to do:
Remove the reference numbers first (the : 110000000001, : 110000000002 etc. numbers).
Break each set of 10 digits into 5 sets of 2-digit numbers, with each 2-digit number appearing on a separate line of its own, e.g. 0621312483 to 06 21 31 24 83.
From this, remove all 2-digit numbers over 26, so that I am only left with numbers from 01 to 26, e.g. from 06 21 31 24 83 to 06 21 24.
This seems to have been required only once, so it may be simpler just to give the results:
6 21 24 4 18 12 23 2 13 23 5 2 16
12 1 6 3 0 7 24 22 4 5 9
12 9 5 10 22 25 26 0 25 9 20 21 6 21
13 0 15 5 14 16 19 1 10 3 8 15 7 6 4 1
(which don't end in the way indicated in the question) but the process was, assuming 8850838032 .... in A1:
Text to Columns with : as the delimiter, delete ColumnB.
="'"&TRIM(SUBSTITUTE(A1," ","")) in ColumnB copied down to B4.
Select ColumnB, Copy, Paste Special, Values over the top.
Text to Columns on ColumnB with Fixed width (first one character then every second).
Delete ColumnB.
Select B1:AY4, HOME > Styles - Conditional Formatting, New Rule..., Use a formula to determine which cells to format and Format values where this formula is true:
=B1>26
Format..., select a colour fill, OK, OK.
Select and Copy B1:AY4, say to AZ2, with Paste Special, Transpose.
Filter ColumnAZ by colour to select the coloured cells, blank out that selection, then reselect everything (Select All).
Repeat for ColumnsBA, BB and BC.
Tidy up to suit, eg by selecting AZ>BC, HOME > Editing - Find & Select, Go To Special, Blanks, select, Delete..., Shift cells up and then moving the contents of BA under AZ, etc and deleting what is not required.
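If this ever needs repeating on a longer file, a short script may be less work than the manual steps. Below is a minimal Python sketch of the three operations from the question: drop the reference number, split each 10-digit block into 2-digit pairs, and keep only 01 to 26. The function name keep_pairs and its low/high parameters are made up for illustration. Unlike the results listed above, it keeps leading zeros and, with the default low=1, drops 00.

```python
def keep_pairs(line, low=1, high=26):
    """Return the 2-digit pairs on one line whose value is in low..high."""
    digits_part = line.split(":")[0]  # drop the ": 110000000001" reference
    kept = []
    for block in digits_part.split():  # ten 10-digit blocks per line
        for i in range(0, len(block), 2):
            pair = block[i:i + 2]
            if low <= int(pair) <= high:
                kept.append(pair)
    return kept

if __name__ == "__main__":
    line = ("8850838032 0621312483 8327044318 1257233570 9958940293 "
            "1391776730 2923888859 5836058683 5192760238 4694561699 "
            ": 110000000001")
    # One pair per output line, as requested in the question
    print("\n".join(keep_pairs(line)))
```

To process the whole .txt file, the same function could be applied to each line read from the file; pass low=0 if 00 should be kept as in the results above.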

Find duplicates and count numbers at the same time

I have rows of data that contain numbers from 1 to 15; however, these numbers can be in any sequence. For example:
3 2 1 12 13 5 6 7 9 15 10 8 4 15 11
I know from a visual count these numbers above are all correct; as there are no duplicates, and all the numbers have values from 1 to 15. An example of a row of data I found to be wrong:
3 2 1 12 12 5 6 7 9 15 10 8 4 15
You can see this line has duplicated numbers i.e. 12, and number 11 is missing, so this row only has 14 elements in all.
However, I have many rows of data and it is impossible to visually check each row. I need to ensure in each row: there are 15 elements; there are no duplicates, and that the row contains values from 1 to 15 and find which rows are faulty to check these against the original paper data.
Is there a command or function that I can use in Excel to make this process easier?
You could find a set of conditions, each of which is true for rows that contain exactly those 15 numbers in any order and then test several of them. For example, if the row is in A5:O5:
=AND(COUNT(A5:O5)=15,SUM(A5:O5)=120,MIN(A5:O5)=1,MAX(A5:O5)=15,
AVERAGE(A5:O5)=8,ROUND(STDEV(A5:O5),3)=4.472)
This will show TRUE for a row that contains the integers 1 to 15 in any order, and is very unlikely (it could very well be impossible - I haven't checked) to show TRUE for a row that contains any different set of integers.
I'm pretty sure that the only way 15 positive integers less than 16 can add up to 120 other than by all being different is with duplication, so :
Check there are 15 numbers
Check their total is 120
Check the maximum is 15
Check not negative (nor zero):
=IF(OR(COUNT(A5:O5)<>15,SUM(A5:O5)<>120,MAX(A5:O5)>15,MIN(A5:O5)<1),"Error","Plausible")
then check for duplication with Conditional Formatting using a rule such as :
=COUNTIF($A5:$O5,A5)>1
and a distinctive format. Filter to select "Plausible"; anything that then shows the distinctive format is non-compliant.
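Outside Excel, the same three checks (15 elements, no duplicates, values 1 to 15) can be made exact rather than "plausible". Here is a minimal Python sketch; the function names and the shape of the report are made up for illustration, and it assumes each row has already been read into a list of integers.

```python
def is_valid(row, n=15):
    """True only if the row holds each of 1..n exactly once, in any order."""
    return sorted(row) == list(range(1, n + 1))

def check_row(row, n=15):
    """Describe what is wrong with a row, to compare against the paper data."""
    expected = set(range(1, n + 1))
    return {
        "count_ok": len(row) == n,
        "duplicates": sorted({v for v in row if row.count(v) > 1}),
        "missing": sorted(expected - set(row)),
        "out_of_range": sorted(set(row) - expected),
    }

if __name__ == "__main__":
    rows = [
        [3, 2, 1, 12, 12, 5, 6, 7, 9, 15, 10, 8, 4, 15],
        list(range(15, 0, -1)),
    ]
    for number, row in enumerate(rows, start=1):
        if not is_valid(row):
            print("row", number, check_row(row))
```

Only the faulty rows are printed, together with the duplicated and missing values, which should make the check against the original data quicker.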

A problem with connected points and determining geometry figures based on points' location analysis

In school we have a really hard problem, and none of the students has solved it yet. Take a look at the picture below:
http://d.imagehost.org/0422/mreza.gif
That's a kind of network of connected points, which doesn't end, and each point has its own number representing it. Let's say the numbers are like this: 1-23-456-78910-etc. (You can't see the numbers 5, 8, 9... on the picture, but they are there and their position is obvious: the point in the middle of 4 and 6 is 5, and so on.)
1 is connected to 2 and 3, 2 is connected to 1,3,5 and 4 etc.
The numbers 1-2-3 indicate they represent a triangle on the picture, but the numbers 1-4-6 do not because 4 is not directly connected with 6.
Let's look at 2-3-4-5: that's a parallelogram (you know why), but 4-6-7-9 is NOT a parallelogram, because in this problem there's a rule which says all the sides must be equal for all the figures, triangles and parallelograms alike.
Also there are hexagons, for ex. 4-5-7-9-13-12 is a hexagon - all sides must be equal here too.
12345 - that doesn't represent anything, so we ignore it.
I think I have explained the problem well. The actual task given to us is: for an input of numbers like the above, determine whether they form a triangle, a parallelogram or a hexagon (according to the described rules).
For ex:
1 2 3 - triangle
11 13 24 26 -parallelogram
1 2 3 4 5 - nothing
11 23 13 25 - nothing
3 2 5 - triangle
I was reading about computational geometry in order to solve this, but I gave up quickly; nothing seems to help here. A friend told me about this site, so I decided to give it a try.
If you have any ideas about how to solve this, please reply. You can use pseudocode, C++, whatever. Thank you very much.
Let's order the points like this:
1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
16 17 18 19 20 21
22 23 24 25 26 27 28
You can store this in a matrix. Now let row[i] = the row number i is on and col[i] = the column number i is on. These can be computed more or less efficiently for each i.
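As a sketch of how row[i] and col[i] could be computed directly (in Python here, though it ports to C++ easily): row r of this layout ends at the triangular number r*(r+1)/2, so the row can be recovered with an integer square root instead of a lookup table. The function name row_col is made up for illustration, and rows and columns are 1-based.

```python
from math import isqrt

def row_col(n):
    """Return the 1-based (row, column) of point n in the triangular layout.

    Row r starts at r*(r-1)/2 + 1, and 8*start - 7 == (2r - 1)**2,
    so the integer square root of 8n - 7 pins down the row.
    """
    r = (isqrt(8 * n - 7) + 1) // 2
    c = n - r * (r - 1) // 2
    return r, c

if __name__ == "__main__":
    for n in (1, 5, 11, 13, 24, 26):
        print(n, row_col(n))
```

With these coordinates, the checks described below become comparisons of row and column differences between the sorted points.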
First, sort your given numbers ascendingly. You will need exactly 3 points for a triangle, 4 for a parallelogram and 6 for a hexagon - anything else and you can dismiss it as no-figure.
Notice that we can only have right-angled triangles in this matrix, according to your rules. Label the three points A, B, C. You can check if these form a triangle by iterating from row[A] to row[B], then from col[B] to col[C] and then diagonally from row[C] to row[A] and checking to see if the distances are the same and if you get to the right positions. You can terminate this early, for example if B is 8 and A is 1, then you can tell you won't find it once you hit 11 on column 1.
For parallelograms a similar reasoning can be made. Label the 4 points A, B, C, D and remember to sort them ascendingly (remember, your points here are actually numbers). See if you can get from col[A] to col[B] on the same line, then from col[C] to col[D] on the same line and then diagonally or vertically-down from row[A] to row[C] and then (in the same direction you went the previous diagonal!) from row[B] to row[D].
Hexagons also have a specific format you must test for. Here's what hexagons look like in this representation:
1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
16 17 18 19 20 21
22 23 24 25 26 27 28
1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31 32 33 34 35 36
You can notice that every two pairs of points share the same column, and that the horizontal distance between the two middle points is twice the vertical distance between any two points and also twice the horizontal distance between any other two points.
You will also want to consider rotations, so you'll need to do more tests for each case.
You don't even really need the row and col arrays unless you plan on computing them efficiently. Just walk over your matrix until you identify the first point in sorted order and try to get to the others while following each of the rules.
Not exactly a nice way, but you will only need a 256x256 matrix for this, so while this does result in quite a lot of code, it's pretty efficient. I hope I made myself clear, if not please say what isn't clear. Anyway, maybe someone else will post a better solution, so wait a while longer if you can..
