Draw 16 players into 4 groups where no one comes from same earlier group - excel

I have between 16 and 40 players in groups of 4 or 5 players, which means 4 to 10 groups.
I want to draw the top 16. I already have this list, formatted as:
Name | Former Group (example with 4 groups)
Player1 | 1
Player2 | 2
Player3 | 3
Player4 | 4
Player5 | 4
Player6 | 3
Player7 | 2
Player8 | 1
Player9 | 2
Player10 | 1
Player11 | 3
Player12 | 4
Player13 | 1
Player14 | 2
Player15 | 4
Player16 | 3
I want to put this list into 4 groups with the click of a button, so that no new group contains two players from the same earlier group. The list is ordered with the players who finished 1st in their former group first, then the 2nd places, and so on. So with 10 groups of 4, the top 16 consists of ten 1st places and six 2nd places, and my list could look like this:
Name | Former Group (example with 10 groups)
Player1 | 1
Player2 | 2
Player3 | 3
Player4 | 4
Player5 | 5
Player6 | 6
Player7 | 7
Player8 | 8
Player9 | 9
Player10 | 10
Player11 | 3
Player12 | 4
Player13 | 7
Player14 | 5
Player15 | 9
Player16 | 1
I want to draw these top 16 players into 4 new groups so that nobody lands in a group with a player they already played in the first round.
So I thought I would create a function and call it on a button click.
On click, I want to collect these players from
AA6 to AA21 (their names)
AB6 to AB21 (their former group numbers)
and run them through my function.
Private Sub CommandButton11_Click()
    ' Names are in AA6:AA21 (column 27), former groups in AB6:AB21 (column 28)
    Dim playerNames(1 To 16) As String
    Dim playerGroups(1 To 16) As Integer
    Dim i As Long
    For i = 1 To 16
        playerNames(i) = Cells(i + 5, 27).Value
        playerGroups(i) = Cells(i + 5, 28).Value
    Next i

    ' Build one string per array and show both to check what was read
    Dim txt As String
    Dim txt2 As String
    Dim ii As Long
    For ii = LBound(playerNames) To UBound(playerNames)
        txt = txt & playerNames(ii) & vbCrLf
        txt2 = txt2 & playerGroups(ii) & vbCrLf
    Next ii
    MsgBox txt & txt2
End Sub
How can I create logic that never gives me groups where two players come from the same former group?
Then I want to paste the result into
AC6->AC9 (GroupA)
AD6->AD9 (GroupB)
AE6->AE9 (GroupC)
AF6->AF9 (GroupD)
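One possible way to approach the draw is randomized backtracking: shuffle the players, place them one at a time into a random group that still has room and no player from the same former group, and undo a placement if it leads to a dead end. The sketch below shows the idea in Python rather than VBA; the function name `draw_groups` and its parameters are hypothetical, and the input is assumed to be the 16 (name, former group) pairs read from AA6:AB21.

```python
import random

def draw_groups(players, n_groups=4, group_size=4, seed=None):
    """players: list of (name, former_group) pairs.
    Returns n_groups lists in which no two players share a former group."""
    rng = random.Random(seed)
    order = players[:]
    rng.shuffle(order)
    # Place players from the most crowded former groups first; the sort is
    # stable, so the shuffled order is kept among equally crowded groups.
    crowd = {}
    for _, former in order:
        crowd[former] = crowd.get(former, 0) + 1
    order.sort(key=lambda p: -crowd[p[1]])
    groups = [[] for _ in range(n_groups)]

    def place(i):
        if i == len(order):
            return True
        name, former = order[i]
        candidates = list(range(n_groups))
        rng.shuffle(candidates)
        for g in candidates:
            has_room = len(groups[g]) < group_size
            no_clash = all(f != former for _, f in groups[g])
            if has_room and no_clash:
                groups[g].append((name, former))
                if place(i + 1):
                    return True
                groups[g].pop()  # dead end: undo and try another group
        return False

    if not place(0):
        raise ValueError("no valid draw exists for these players")
    return groups

# The 4-group example list from the question
former = [1, 2, 3, 4, 4, 3, 2, 1, 2, 1, 3, 4, 1, 2, 4, 3]
players = [(f"Player{i + 1}", g) for i, g in enumerate(former)]
for label, group in zip("ABCD", draw_groups(players)):
    print("Group" + label, [name for name, _ in group])
```

The same loop structure should carry over to a VBA Sub that writes each group to AC6:AC9 through AF6:AF9. If more than 4 of the 16 players share a former group, no valid draw exists and the sketch raises an error.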

Related

remove duplicates but retain the first position in excel vba macro

I am looking to remove duplicate rows but keep the first one, using VBA macros in Excel 2010.
This is the initial data:
A | B
1. A | 1
2. A | 1
3. A | 1
4. A | 1
5. B | 2
6. B | 2
7. B | 2
After running the macro:
A | B
1. A | 1
2. | 1
3. | 1
4. | 1
5. B | 2
6. | 2
7. | 2
Can you help me, please?
Not elegant, but quick and dirty:
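Dim iLastRow As Long
iLastRow = 13 ' last data row; adjust to your sheet
' Helper column H keeps the first occurrence of each value and blanks the repeats
Range("h1:h" & iLastRow).Formula = "=if(countif(a$1:a1,a1)>1,"""",a1)"
' Copy the results back over column A as values, then clear the helper column
Range("a1:a" & iLastRow).Value = Range("h1:h" & iLastRow).Value
Range("h1:h" & iLastRow).Clear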
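The COUNTIF formula above keeps a value the first time it appears in the column and returns an empty string for every later occurrence. As a rough illustration of that logic outside Excel, here is a small Python sketch; the function name `blank_repeats` is hypothetical. One difference to keep in mind: COUNTIF compares text case-insensitively, while this sketch is case-sensitive.

```python
def blank_repeats(values):
    """Keep the first occurrence of each value, replace later ones with ''."""
    seen = set()
    result = []
    for value in values:
        result.append("" if value in seen else value)
        seen.add(value)
    return result

print(blank_repeats(["A", "A", "A", "A", "B", "B", "B"]))
# ['A', '', '', '', 'B', '', '']
```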

How can I get the count of sequential events pairs from a Pandas dataframe?

I have a dataframe that looks like this:
ID EVENT DATE
1 1 142
1 5 167
1 3 245
2 1 54
2 5 87
3 3 165
3 2 178
And I would like to generate something like this:
EVENT_1 EVENT_2 COUNT
1 5 2
5 3 1
3 2 1
The idea is to count how many items (ID) go from one event to the next one. I don't care about previous states; I just want to consider the next state from the current state (e.g., for ID 1, I don't want to count a transition from 1 to 3, because it first goes to event 5 and then to 3).
The date format is the number of days from a specific date (sort of like SAS format).
Is there a clean way to achieve this?
Let's try this:
(df.groupby([df['EVENT'].rename('EVENT_1'),
             df.groupby('ID')['EVENT'].shift(-1).rename('EVENT_2')])['ID']
   .count()).rename('COUNT').reset_index().astype(int)
Output:
| | EVENT_1 | EVENT_2 | COUNT |
|---:|----------:|----------:|--------:|
| 0 | 1 | 5 | 2 |
| 1 | 3 | 2 | 1 |
| 2 | 5 | 3 | 1 |
Details: group by 'EVENT' and by 'EVENT' shifted up one row within each ID, then count.
You could use groupby and shift. We'll also use rename_axis and reset_index to tidy up the final output:
(pd.concat([f.groupby([f['EVENT'], f['EVENT'].shift(-1).astype('Int64')]).size()
            for _, f in df.groupby('ID')])
   .groupby(level=[0, 1]).sum()
   .rename_axis(['EVENT_1', 'EVENT_2']).reset_index(name='COUNT'))
[out]
EVENT_1 EVENT_2 COUNT
0 1 5 2
1 3 2 1
2 5 3 1
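Both answers rely on `shift(-1)`, which pairs each row with the row below it, so they assume the rows are already ordered by DATE within each ID. To make the underlying counting explicit, here is a minimal sketch in plain Python (no pandas) that sorts each ID's events by date and counts adjacent pairs; the function name `count_transitions` is hypothetical.

```python
from collections import Counter, defaultdict

# (ID, EVENT, DATE) rows from the question
rows = [(1, 1, 142), (1, 5, 167), (1, 3, 245),
        (2, 1, 54), (2, 5, 87),
        (3, 3, 165), (3, 2, 178)]

def count_transitions(rows):
    """Count (event, next_event) pairs within each ID, ordered by date."""
    events_by_id = defaultdict(list)
    for id_, event, date in rows:
        events_by_id[id_].append((date, event))
    counts = Counter()
    for events in events_by_id.values():
        events.sort()  # order each ID's events by date
        for (_, current), (_, following) in zip(events, events[1:]):
            counts[(current, following)] += 1
    return counts

for (event_1, event_2), n in sorted(count_transitions(rows).items()):
    print(event_1, event_2, n)
# 1 5 2
# 3 2 1
# 5 3 1
```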

How can I drop consecutive row duplicates with condition of a column? [duplicate]

This post has been really helpful for getting the basis of what I want to do, however, I'm stuck with how to get to the finish line.
I have a large dataframe (approx. 10k rows) whose first few rows look like this, which I'll call df_a:
zone | value
0 | 12
1 | 12
2 | 99
3 | 12
0 | 12
1 | 12
2 | 12
3 | 99
I am looking to drop consecutive duplicates within 'value', however, based on the condition of zone. For example, in the above snippet I would want the second '12' to be dropped for zone = 1. So that I end up with:
zone | value
0 | 12
1 | 12
2 | 99
3 | 12
2 | 12
3 | 99
My initial idea was to loop over a list of zones, automatically create a new variable for each zone based on the zone name, and then run my drop-duplicates code (based on this answer). However, this doesn't work:
data_category_range = df_a['zone'].unique()
data_category_range = data_category_range.tolist()
for i,value in enumerate(data_category_range):
    data_category_range['zone_{}'.format(i)] = df_a[df_a['zone'] == value]
# de-duplicate
cols = ["zone","value"]
de_dup = df_a[cols].loc[(df_a[cols].shift() != df_a[cols]).any(axis=1)]
(This loop is within another loop which will iterate across dataframes with different 'zone' values, so the variable needs to be dynamic. I'm open to alternatives, as I understand this isn't best practice.)
Thanks!
You can use drop_duplicates on the pair of columns, which keeps the first occurrence of each (zone, value) combination:
import pandas as pd
data = pd.DataFrame(
    {"zone": [0, 1, 2, 3, 0, 1, 2, 3], "value": [12, 12, 99, 12, 12, 12, 12, 99]}
)
data.drop_duplicates(["zone", "value"])
This will give you
| | zone | value |
|---:|-------:|--------:|
| 0 | 0 | 12 |
| 1 | 1 | 12 |
| 2 | 2 | 99 |
| 3 | 3 | 12 |
| 6 | 2 | 12 |
| 7 | 3 | 99 |
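Note that drop_duplicates removes every repeated (zone, value) pair, not only consecutive ones. The two give the same result on the sample data but can differ when a zone returns to an earlier value. A minimal plain-Python sketch of the strictly consecutive-per-zone reading follows; the function name `drop_consecutive_per_zone` is hypothetical, and it assumes "consecutive" means "equal to the previous value seen for the same zone".

```python
def drop_consecutive_per_zone(rows):
    """rows: list of (zone, value). Drop a row when its value equals the
    previous value seen for the same zone."""
    last_value = {}
    kept = []
    for zone, value in rows:
        if zone in last_value and last_value[zone] == value:
            continue  # same value as this zone's previous row: drop it
        last_value[zone] = value
        kept.append((zone, value))
    return kept

rows = list(zip([0, 1, 2, 3, 0, 1, 2, 3], [12, 12, 99, 12, 12, 12, 12, 99]))
print(drop_consecutive_per_zone(rows))
# [(0, 12), (1, 12), (2, 99), (3, 12), (2, 12), (3, 99)]

# A zone that goes 12 -> 99 -> 12 keeps all three rows here,
# whereas drop_duplicates would remove the second 12.
print(drop_consecutive_per_zone([(0, 12), (0, 99), (0, 12)]))
# [(0, 12), (0, 99), (0, 12)]
```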

Creating A new column based on other columns' values with specific requirement in Python Dataframe

I want to create a new column in Python dataframe with specific requirements from other columns. For example, my python dataframe df:
A | B
-----------
5 | 0
5 | 1
15 | 1
10 | 1
10 | 1
20 | 2
15 | 2
10 | 2
5 | 3
15 | 3
10 | 4
20 | 0
I want to create a new column C with the requirements below:
1. When the value of B = 0, then C = 0.
2. Rows with the same value in B get the same value in C. The rows sharing a value in B are classified as start, middle, and end. So value 1 has 1 start, 2 middle, and 1 end; value 3 has 1 start, 0 middle, and 1 end. The calculation for each section is as follows.
I specify a threshold = 10.
Let's look at the rows where B = 1:
Start:
C.loc[2] = min(threshold, A.loc[1]) + A.loc[2]
Middle:
C.loc[3] = A.loc[3]
C.loc[4] = A.loc[4]
End:
C.loc[5] = min(threshold, A.loc[6])
The output value of C for every row with this B value is the sum of the above calculations.
3. When the value of B is unique and not 0, for example when B = 4:
C[10] = min(threshold, A.loc[9]) + min(threshold, A.loc[11])
I can solve points 1 and 3, but I'm struggling with point 2.
So, the final output will be:
A | B | C
--------------------
5 | 0 | 0
5 | 1 | 45
15 | 1 | 45
10 | 1 | 45
10 | 1 | 45
20 | 2 | 50
15 | 2 | 50
10 | 2 | 50
5 | 3 | 25
10 | 3 | 25
10 | 4 | 20
20 | 0 | 0
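Read literally, the rules above reduce to one expression per run of equal B values: the capped A value of the row before the run, plus the A values of every row in the run except the last, plus the capped A value of the row after the run. For a single-row run the middle sum is empty, which matches the B = 4 formula. The sketch below checks that reading against the expected output using plain Python lists instead of a dataframe. The function name `compute_c` is hypothetical, equal B values are assumed to be adjacent, and treating a missing neighbour at the top or bottom of the data as 0 is an assumption the question does not cover.

```python
def compute_c(a, b, threshold=10):
    """Return column C for columns A and B given as equal-length lists."""
    n = len(a)
    c = [0] * n
    i = 0
    while i < n:
        # Find the end j of the run of equal B values starting at i
        j = i
        while j + 1 < n and b[j + 1] == b[i]:
            j += 1
        if b[i] != 0:
            # Assumption: a missing neighbour contributes 0
            before = min(threshold, a[i - 1]) if i > 0 else 0
            after = min(threshold, a[j + 1]) if j + 1 < n else 0
            # a[i:j] is every row of the run except the last one
            total = before + sum(a[i:j]) + after
            for k in range(i, j + 1):
                c[k] = total
        i = j + 1
    return c

A = [5, 5, 15, 10, 10, 20, 15, 10, 5, 15, 10, 20]
B = [0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 0]
print(compute_c(A, B))
# [0, 45, 45, 45, 45, 50, 50, 50, 25, 25, 20, 0]
```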

I have data stored in excel where I need to sort that data

In Excel, I have data in the following columns:
Year Code Class Count
2001 RAI01 LNS 9
2001 RAI01 APRP 4
2001 RAI01 3
2002 RAI01 BPR 3
2002 RAI01 BRK 3
2003 RAI01 URE 3
2003 CFCOLLTXFT APRP 2
2003 CFCOLLTXFT BPR 2
2004 CFCOLLTXFT GRL 2
2004 CFCOLLTXFT HDS 2
2005 RAI HDS 2
I need to find the top 3 products for each customer in each year.
The real trick here is to rank each row based on a group.
Your rank is determined by your Count column (Column D).
Your group is determined by your Year and Code (I think) columns (Column A and B respectively).
You can use this gnarly sumproduct() formula to get a rank (Starting at 1) based on the Count for each Group.
So to get a ranking for each Year and Code from 1 to whatever, in a new column next to this data:
=SUMPRODUCT(($A$2:$A$50=A2)*(B2=$B$2:$B$50)*(D2<$D$2:$D$50))+1
And copy that down. Now you can AutoFilter on this to show all rows that have a rank less than 4. You can sort this on Customer, then Year and you should have a nice list of top 3 within each year/code.
Explanation of sumproduct.
Sumproduct goes row by row and applies the math that is defined for each row. When it is done it sums the results.
As an example, take the following worksheet:
+---+---+---+
| | A | B |
+---+---+---+
| 1 | 1 | 1 |
| 2 | 1 | 4 |
| 3 | 2 | 2 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
+---+---+---+
`=SUMPRODUCT((A1:A5)*(B1:B5))`
This sumproduct will take A1*B1, A2*B2, A3*B3, A4*B4, and A5*B5 and then add those five results up to give you a single number. That is 1 + 4 + 4 + 4 + 2 = 15.
It also works on conditional (Boolean) statements, where each row's test returns a 1 for True or a 0 for False.
As an example, take the following worksheet that holds the type of publication in a library and a count:
+---+----------+---+
| | A | B |
+---+----------+---+
| 1 | Book | 1 |
| 2 | Magazine | 4 |
| 3 | Book | 2 |
| 4 | Comic | 1 |
| 5 | Pamphlet | 2 |
+---+----------+---+
=SUMPRODUCT((A1:A5="Book")*(B1:B5))
This will test whether A1 is "Book" and return a 1 or 0, then multiply that result by whatever is in B1, and continue for each row in the range up to row 5. The result will be 1+0+2+0+0 = 3. There are 3 books in the library (it's not a very big library).
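To see the row-by-row arithmetic outside Excel, both worksheet examples can be reproduced with a small Python sketch. The helper name `sumproduct` is hypothetical; it simply multiplies the cells of each row and adds up the row products, with True counting as 1 and False as 0.

```python
def sumproduct(*columns):
    """Multiply the columns row by row, then add up the row products."""
    total = 0
    for row in zip(*columns):
        product = 1
        for cell in row:
            product *= cell  # True counts as 1 and False as 0
        total += product
    return total

# First worksheet: columns A and B
a = [1, 1, 2, 4, 1]
b = [1, 4, 2, 1, 2]
print(sumproduct(a, b))  # 15

# Second worksheet: publication type and count
kinds = ["Book", "Magazine", "Book", "Comic", "Pamphlet"]
counts = [1, 4, 2, 1, 2]
print(sumproduct([kind == "Book" for kind in kinds], counts))  # 3
```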
For this answer's sumproduct:
So ($A$2:$A$50=A2) compares each cell in A2 through A50 to A2, returning a 1 where they are equal and a 0 where they are not.
(B2=$B$2:$B$50) will test each cell B2 through B50 to see if it is equal to B2 and return a 1 or 0 for each test.
The same is true for (D2<$D$2:$D$50), but it's testing whether the current cell's count is less than the count in each of D2 through D50, so it returns a 1 for every row with a larger count.
So... essentially this is saying "For all the rows 2 through 50, find all the other rows that have the same values in columns A and B AND have a count greater than this row's count. Count all of the rows that meet those criteria, and add 1. This is the rank of this row within its group."
Copying this formula has it redetermine that rank for each row allowing you to rank and filter.
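The ranking logic of the formula can be sketched in Python as a check: for each row, count the rows with the same year and code and a larger count, then add 1. The function name `group_rank` is hypothetical, and the rows below are the first few (Year, Code, Count) values from the question with the Class column left out.

```python
def group_rank(rows):
    """rows: list of (year, code, count). Returns each row's rank within
    its (year, code) group, where rank 1 is the highest count."""
    ranks = []
    for year, code, count in rows:
        higher = sum(1 for y, c, n in rows
                     if y == year and c == code and n > count)
        ranks.append(higher + 1)
    return ranks

rows = [
    (2001, "RAI01", 9),
    (2001, "RAI01", 4),
    (2001, "RAI01", 3),
    (2002, "RAI01", 3),
    (2002, "RAI01", 3),
    (2003, "RAI01", 3),
    (2003, "CFCOLLTXFT", 2),
]
print(group_rank(rows))
# [1, 2, 3, 1, 1, 1, 1]
```

Rows with equal counts in the same group share a rank, as the two 2002 rows do here, so filtering on a rank less than 4 can return more than three rows for a group.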

Resources