This is the original series. I'm trying to replace values of the non top 2 in the series with 'Other'.
Original Series(ser3):
b 8
c 6
a 5
h 4
g 2
d 2
f 2
e 1
This is my extracted top 2.
Top 2:
t2 = ((ser3.value_counts().head(2)))
b 8
c 6
Expected Output:
b 8
c 6
a Other
h Other
g Other
d Other
f Other
e Other
How can I do that? I do not want to convert to dictionary and replace the values by indexing. I prefer to do it by Series. I tried using .isin, but my code gives me an error.
a[a[~a.isin(t2)].index]='Other'
The above gives me an error.
You are close, need select t2.index and remove outer ser3[]:
ser3[~ser3.isin(t2.index)]='Other'
Related
If I had data in rows A to E as seen below in the table. Some of the values can be NA. IN column F if i wanted to input data from columns A to E in a way that if data in A exists use that otherwise if data in B exists use that otherwise until column E. If none of them have any values return NA. I would like to automate this where somewhere I just specify the order for example A, B, C, D and E OR A, C, E, D, B and the values in F update according to the reference table
Reference : C - B - A - E - D
a
b
c
d
e
f
3
4
3
2
2
7
1
7
NA
1
4
2
4
2
2
4
2
2
Use FILTER() with # operator.
=#FILTER(A2:E2,A2:E2<>"","NA")
For dynamic array approach (spill results automatically), try-
=BYROW(A2:E7,LAMBDA(x,INDEX(FILTER(x,x<>"","NA"),1,1)))
I am creating an excel file to swap excel columns which contains the number corresponding to the ASCII character such as low letter, upper letter, number, special characters.
Here is the original table and the corresponding letter to the number
A B C D E F G H
1 2 3 4 5 6 7 8
I want to swap each of the cell to the end. Meaning I need to swap 1 to 8. 2 to 7. 3 to 6.
A B C D E F G H
8 7 6 5 4 3 2 1
I would want to use the excel function to do this. Is there a way to achieve this? I have 156 columns.
what about this method.
A 1
B 2
C 3
D 4
E 5
F 6
G 7
H 8
If your data is always ordered then sorting would work.
But let's assume your data is not always alphabetically or numerically sortable, you can use a formula to reverse an unsorted list:
-
A
B
C
D
E
F
G
H
1
A
B
C
D
E
F
G
H
2
1
2
3
4
5
6
7
8
3
8
7
6
5
4
3
2
1
Rows 1 and 2 are your original data. Row 3 is the reversing formula.
Add the formula below into cell A3
=INDEX($A$2:$G$2, COLUMNS(A2:$G$2))
Note (important) that the first Range of the INDEX is absolute $A$2:$G$2, but the only the last value of the columns range is absolute A2:$G$2 (no dollars)
Either drag cell A3 across to H3 or copy cell A3 and paste over B3:H3
This has an advantage over plain sorting in that it can reverse unsorted lists for you.
There have been a lot of posts concerning splitting a single column into multiples, but I couldn't find an answer to a slight modification to the idea of splitting.
When you use str.split, it splits the string independent of order. You can modify it to be slightly more complex, such as ordering it by sorting alphabetically
e.x. dataframe (df)
row
0 a, e, c, b
1 b, d, a
2 a, b, c, d, e
3 d, f
foo = df['row'].str.split(',')
will split based on the comma and return:
0 1 2 3
0 a e c b
....
However that doesn't align the results by their unique value. Even if you use a sort on the split string, it will still only result in this:
0 1 2 3 4 5
0 a b c e
1 a b d
...
whereas I want it to look like this:
0 1 2 3 4 5
0 a b c e
1 a b d
2 a b c d e
...
I know I'm missing something. Do I need to add the columns first and then map the split values to the correct column? What if you don't know all of the unique values? Still learning pandas syntax so any pointers in the right direction would be appreciated.
Using get_dummies
s=df.row.str.get_dummies(sep=' ,')
s.mul(s.columns)
Out[239]:
a b c d e f
0 a b c e
1 a b d
2 a b c d e
3 d f
Below is a sample of the data I have. I want to match the data in Column A and B. If column B is not matching column A, I want to add a row and copy the data from Column A to B. For example, "4" is missing in column B, so I want to add a space and add "4" to column B so it will match column A. I have a large set of data, so I am trying to find a different way instead of checking for duplicate values in the two columns and manually adding one row at a time. Thanks!
A B C D
3 3 Y B
4 5 G B
5 6 B G
6 8 P G
7 9 Y P
8 11 G Y
9 12 B Y
10
11
12
11
12
I would move col B,C,D to a separate columns, say E,F,G, then using index matches against col A and col B identify which records are missing.
For col C: =IFERROR(INDEX(F:F,Match(A1,E:E,0)),"N/A")
For col D: =IFERROR(INDEX(G:G,Match(A1,E:E,0)),"N/A")
Following this you can filter for C="N/A" to identify cases where a B value is missing for an A value, and manually edit. Since you want A & B to be matching here col B is unnecessary, final result w/ removing col B and C->B, D->C:
A B C
3 Y B
4 N/A N/A
5 G B
6 B G
7 N/A N/A
Hope this helps!
I would like to get the first value (for a given key)from the vlookup that is not blank.
1 A 1 A
2 2 C
2 C => 3 B
3 B 4 W
4 W
4 X
Is it possible with vlookup or do I have to use INDEX, MATCH, CHOOSE etc?
If so can anyone provide an example? I cannot add extra columns.
Try this:
=IF(VLOOKUP($D1;$A$1:$B$6;2;FALSE)=0;VLOOKUP($D1;$A$1:$B$6;2;TRUE);VLOOKUP($D1;$A$1:$B$6;2;FALSE))
Use this formula, it looks for the first non blank:
=INDEX($B$1:$B$6,AGGREGATE(15,6,ROW($B$1:$B$6)/(($A$1:$A$6=D1)*($B$1:$B$6<>"")),1))
You should use a IF in an intermediate column, then use this intermediate in your VLOOKUP formula:
This gives, using an extra_tab if you can't insert columns in current sheet
Sheet1 extra_tab
A C D E A B
------------- ---------
1 A 1 A 1 A
2 2 C 0
2 C 3 B 2 C
3 B 4 W 3 B
4 W 4 W
4 Z 4 Z
To avoid blanks for further calculation Formula in extra_tab.A1 and extra_tab.B1 is:
A B
=Sheet1!B1 =IF(Sheet1!B1="";"";Sheet1!A1)
Formula in sheet1.D1 is:
==VLOOKUP(C1;extra_tab!A:B;2;FALSE)
Hope it helps