Each row holds one lotto draw (5 numbers). I have a formula that finds the most frequent numbers together with how many times each was drawn. Is it possible to sort results that have the same draw count by row in the end result? That is, treating the row number as a value, a number drawn in the top rows should rank higher than one drawn in the bottom rows. How can I do that?
Formula used:
=LET(flatten, TEXTSPLIT(TEXTJOIN(";",,A1:F27),,";"), numUq, UNIQUE(flatten), matches, XMATCH(flatten,numUq),SORT(HSTACK(numUq, DROP(FREQUENCY(matches, UNIQUE(matches)),-1)),2,-1))
In the example screenshot, 35 and 13 have equal draw counts, but 13 should come before 35.
Data:
A   B   C   D   E   F
18  35  31  13  37  10
43  47  36  13  6   19
6   12  6   35  14  1
43  24  45  7   21  16
37  39  44  24  12  40
39  8   34  28  49  46
27  44  15  46  45  12
22  0   10  5   28  28
4   7   23  6   44  41
30  22  47  13  29  29
37  9   26  44  39  10
30  17  21  20  41  22
43  35  0   22  13  9
14  22  42  20  32  21
13  38  48  6   14  2
11  47  20  20  23  6
22  26  1   25  45  31
27  39  6   44  3   24
22  45  34  17  5   13
16  23  20  7   30  16
25  21  7   34  1   35
32  34  1   9   10  32
23  35  11  3   6   12
5   30  4   20  33  15
26  10  8   28  16  11
21  14  3   38  10  42
16  3   26  48  30  28
Here it is on a subset of the data (the first three rows). I have added a third column holding the average row of each unique number, and sorted first on frequency, then on average row:
=LET(range,A1:F3,uniques,UNIQUE(TOCOL(range)),rows,SEQUENCE(ROWS(range)),
avrow,BYROW(uniques,LAMBDA(uniq,SUM((range=uniq)*rows/SUM(--(range=uniq))))),
freq,DROP(FREQUENCY(range,uniques),-1),
SORTBY(HSTACK(uniques,freq,avrow),freq,-1,avrow,1))
Can 6 really occur twice in the same draw? Maybe not, but it doesn't affect the answer.
EDIT
Here is a version based on your original formula:
=LET(range,A1:F27,
flatten, TEXTSPLIT(TEXTJOIN(";",,A1:F27),,";"),
numUq, UNIQUE(flatten),
rows,SEQUENCE(ROWS(range)),
matches, XMATCH(flatten,numUq),
avrow,BYROW(numUq,LAMBDA(numUq,SUM((range=--numUq)*rows/SUM(--(range=--numUq))))),
freq,DROP(FREQUENCY(matches, UNIQUE(matches)),-1),
SORTBY(HSTACK(numUq,freq,avrow),freq,-1,avrow,1))
The sorting is based on number of appearances and average row, but you could use other measures like row of first appearance if you wanted to.
Different approach:
=LET(data,A1:F27,
a,TOCOL(data),
b,MMULT(--(TRANSPOSE(a)=a),SEQUENCE(COUNTA(a),,1,0)),
c,TOCOL(IF(ISNUMBER(data),MAX(ROW(data)+1)-ROW(data)^99)),
d,MMULT(--(TRANSPOSE(a)=a),c),
s,SORTBY(HSTACK(a,b),b,-1,d,1),
UNIQUE(s))
a "flattens" the data using TOCOL.
b creates a "countif" of the drawn values in a using MMULT.
c returns the maximum row value of the data + 1 minus the row value of each value found ^99.
^99 because I want the number to be higher if it would be found in the first row only versus if it was found in each row except the first.
d returns a "sumif" of the calculated row values of c against the values of a.
We then only need a and b for the list, combined with HSTACK, but we need them sorted by the count b descending and then by the sumif d ascending, using SORTBY.
This will sort it as you illustrated it.
If two numbers tie on both count and row (36 and 19 in the data), the one that comes first within the row is listed first.
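The MMULT "countif" and "sumif" steps may be easier to follow in array form. The sketch below reproduces only that trick with NumPy on the first three rows of the data; the variable names are illustrative, and plain row numbers are used as weights in place of the formula's c, so this is a simplified stand-in and not the same weighting.

```python
import numpy as np

# First three draws from the question's data.
draws = np.array([
    [18, 35, 31, 13, 37, 10],
    [43, 47, 36, 13, 6, 19],
    [6, 12, 6, 35, 14, 1],
])

a = draws.ravel()  # flatten row by row, like TOCOL(data)

# Equality matrix, like --(TRANSPOSE(a)=a): 1 where two cells hold the same number.
same = (a[:, None] == a[None, :]).astype(int)

# "countif": multiplying by a column of ones counts the matches for each cell.
counts = same @ np.ones(len(a), dtype=int)

# "sumif": multiplying by a weight per cell sums the weights of the matches.
# ASSUMPTION: plain row numbers are used here, not the formula's c.
row_of_cell = np.repeat(np.arange(1, draws.shape[0] + 1), draws.shape[1])
row_sums = same @ row_of_cell

print(counts[a == 6], row_sums[a == 6])
```

Each cell holding 6 gets a count of 3 and a row sum of 8 (rows 2, 3 and 3), which is the per-number information the sort keys are built from.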
I have the below data
df1
Hema shiva Ishan
0 22 30 33
1 34 32 21
2 20 12 14
3 26 14 18
4 12 28 17
5 30 11 22
6 18 15 18
7 19 18 19
8 22 20 32
I want to take the ratio between the first value of each column and the rest of that column, e.g. 22 for the first column, 30 for the second and 33 for the third.
The expected answer is below.
Please tell me if I am missing something.
Just divide the first row by the DF:
df.iloc[0] / df
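A runnable sketch of that one-liner on the question's data. The frame is rebuilt by hand here, so the name df and the construction are assumptions and not the asker's code.

```python
import pandas as pd

# Data copied from the question's df1.
df = pd.DataFrame({
    "Hema":  [22, 34, 20, 26, 12, 30, 18, 19, 22],
    "shiva": [30, 32, 12, 14, 28, 11, 15, 18, 20],
    "Ishan": [33, 21, 14, 18, 17, 22, 18, 19, 32],
})

# df.iloc[0] is a Series indexed by column name, so the division is
# aligned column by column: 22 / Hema, 30 / shiva, 33 / Ishan.
ratio = df.iloc[0] / df
print(ratio.head(3))
```

If the opposite ratio is wanted (each value divided by the first value of its column), df / df.iloc[0] gives that instead.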
I have a time series as a dataframe. The first column is the week number, the second are values for that week. The first week (22) and the last week (48), are the lower and upper bounds of the time series. Some weeks are missing, for example, there is no week 27 and 28. I would like to resample this series such that there are no missing weeks. Where a week was inserted, I would like the corresponding value to be zero. This is my data:
week value
0 22 1
1 23 2
2 24 2
3 25 3
4 26 2
5 29 3
6 30 3
7 31 3
8 32 7
9 33 4
10 34 5
11 35 4
12 36 2
13 37 3
14 38 10
15 39 5
16 40 7
17 41 10
18 42 11
19 43 15
20 44 9
21 45 13
22 46 5
23 47 6
24 48 2
I am wondering if this can be achieved in Pandas without creating a loop from scratch. I have looked into pd.resample, but can't achieve the results I am looking for.
I would set week as index, reindex with fill_value option:
start, end = df['week'].agg(['min','max'])
df.set_index('week').reindex(np.arange(start, end+1), fill_value=0).reset_index()
Output (head):
week value
0 22 1
1 23 2
2 24 2
3 25 3
4 26 2
5 27 0
6 28 0
7 29 3
8 30 3
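To try this end to end, here is a self-contained sketch that rebuilds the question's frame (the construction of df is an assumption for illustration) and applies the same set_index / reindex / reset_index chain.

```python
import numpy as np
import pandas as pd

# Weeks 22..48 with 27 and 28 missing, as in the question.
weeks = [w for w in range(22, 49) if w not in (27, 28)]
values = [1, 2, 2, 3, 2, 3, 3, 3, 7, 4, 5, 4, 2,
          3, 10, 5, 7, 10, 11, 15, 9, 13, 5, 6, 2]
df = pd.DataFrame({"week": weeks, "value": values})

start, end = df["week"].agg(["min", "max"])

# Reindex on the full week range; weeks that were absent get value 0.
full = (
    df.set_index("week")
      .reindex(np.arange(start, end + 1), fill_value=0)
      .reset_index()
)
print(full.head(9))
```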
I have a list of 1 column and 50 rows.
I want to divide it into 5 segments, with each segment becoming a column of a dataframe. I do not want the NaN values to appear (figure2). How can I solve that?
Like this:
df = pd.DataFrame(result_list)
AWA=df[:10]
REM=df[10:20]
S1=df[20:30]
S2=df[30:40]
SWS=df[40:50]
result = pd.concat([AWA, REM, S1, S2, SWS], axis=1)
result
Figure2
You can use numpy's reshape function:
import numpy as np
import pandas as pd
result_list = list(range(50))
pd.DataFrame(np.reshape(result_list, (10, 5), order='F'))
Out:
0 1 2 3 4
0 0 10 20 30 40
1 1 11 21 31 41
2 2 12 22 32 42
3 3 13 23 33 43
4 4 14 24 34 44
5 5 15 25 35 45
6 6 16 26 36 46
7 7 17 27 37 47
8 8 18 28 38 48
9 9 19 29 39 49
Column A holds dates and columns B and C hold measurements.
Dates Measurements
1 56 15
2 45 25
3 62 76
4 15 42
5 165 56
6 16 79
7 45 46
8 47 79
9 24 47
10 12 14
11 147 47
12 195 19
13 443 79
14 642 43
15 462 75
16 156 87
17 794 49
Start Date:2
Measurement:45
Code used to solve for the measurement
=VLOOKUP(B21,A2:C18,2,FALSE)
end date:14
Measure:642
=VLOOKUP(B22,A2:C18,2,FALSE)
I used VLOOKUP to find the values I want, but now I want to find the median of each column over the range from the start date to the end date.
How can I write a formula that, once it has located the two values, selects the whole range between them and returns the median?
Since your column A values are sorted in ascending order, we can use the very efficient:
=MEDIAN(INDEX(B2:B18,MATCH(B21,A2:A18)):INDEX(B2:B18,MATCH(B22,A2:A18,0)))
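As a cross-check of the idea (locate the start and end positions in the sorted date column, then take the median of everything between them), here is a small Python sketch using column B from the question. The function name median_between is hypothetical, and bisect is used as a rough analogue of MATCH on ascending data.

```python
from bisect import bisect_right
from statistics import median

# Column A (dates) and column B (first measurement) from the question.
dates = list(range(1, 18))
column_b = [56, 45, 62, 15, 165, 16, 45, 47, 24,
            12, 147, 195, 443, 642, 462, 156, 794]

def median_between(dates, values, start, end):
    """Median of values whose dates run from start to end (inclusive).
    Assumes dates is sorted ascending."""
    # bisect_right(...) - 1 is the position of the last date <= the lookup
    # value, similar to MATCH without a match type on ascending data.
    first = bisect_right(dates, start) - 1
    last = bisect_right(dates, end) - 1
    return median(values[first:last + 1])

result = median_between(dates, column_b, 2, 14)
print(result)
```

The binary search only works because the dates are sorted; on unsorted dates an exact lookup would be needed.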
Regards
I run calculations on 64 elements (for p=1:64 function end) and write the result values to an Excel file.
Is there any way to arrange the result values for each element row by row (the values of the first element on the first row, the values of the second element on the second row, and so on)?
I used P=reshape(A,[],16), but MATLAB fills the values in the wrong order, mixing the elements together.
For example,
If I set the loop for the calculation p=1:1 and use P=reshape(A,[],16) the result is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
If I set p=1:2 the result becomes:
for element 1: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31
for element 2: 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32
(the values of element 2 are: 17 18 19 20 21 22 23 24 25 ... 32)
The result for p=1:2 should be:
for element 1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
for element 2: 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
for element 3: 33 34 35, etc.
Try this:
P=reshape(A,16,[])'
Is this what you need?
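The reason this works is that MATLAB's reshape fills the output column by column. A NumPy sketch can emulate that with order="F"; the array A below is a made-up stand-in for the results of two elements with 16 values each.

```python
import numpy as np

# Stand-in for the results of two elements, 16 values each.
A = np.arange(1, 33)

# order="F" makes NumPy fill column by column, as MATLAB's reshape does.
mixed = A.reshape((-1, 16), order="F")      # like reshape(A,[],16)
per_row = A.reshape((16, -1), order="F").T  # like reshape(A,16,[])'

print(mixed[0])    # values of both elements interleaved
print(per_row[0])  # values of element 1 only
print(per_row[1])  # values of element 2 only
```

Filling 16-row columns first puts each element in its own column, and the transpose then turns those columns into rows.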