If I had data in rows A to E as seen below in the table. Some of the values can be NA. IN column F if i wanted to input data from columns A to E in a way that if data in A exists use that otherwise if data in B exists use that otherwise until column E. If none of them have any values return NA. I would like to automate this where somewhere I just specify the order for example A, B, C, D and E OR A, C, E, D, B and the values in F update according to the reference table
Reference : C - B - A - E - D
a
b
c
d
e
f
3
4
3
2
2
7
1
7
NA
1
4
2
4
2
2
4
2
2
Use FILTER() with # operator.
=#FILTER(A2:E2,A2:E2<>"","NA")
For dynamic array approach (spill results automatically), try-
=BYROW(A2:E7,LAMBDA(x,INDEX(FILTER(x,x<>"","NA"),1,1)))
Related
There have been a lot of posts concerning splitting a single column into multiples, but I couldn't find an answer to a slight modification to the idea of splitting.
When you use str.split, it splits the string independent of order. You can modify it to be slightly more complex, such as ordering it by sorting alphabetically
e.x. dataframe (df)
row
0 a, e, c, b
1 b, d, a
2 a, b, c, d, e
3 d, f
foo = df['row'].str.split(',')
will split based on the comma and return:
0 1 2 3
0 a e c b
....
However that doesn't align the results by their unique value. Even if you use a sort on the split string, it will still only result in this:
0 1 2 3 4 5
0 a b c e
1 a b d
...
whereas I want it to look like this:
0 1 2 3 4 5
0 a b c e
1 a b d
2 a b c d e
...
I know I'm missing something. Do I need to add the columns first and then map the split values to the correct column? What if you don't know all of the unique values? Still learning pandas syntax so any pointers in the right direction would be appreciated.
Using get_dummies
s=df.row.str.get_dummies(sep=' ,')
s.mul(s.columns)
Out[239]:
a b c d e f
0 a b c e
1 a b d
2 a b c d e
3 d f
I was wondering if there was a way to count the number of values by category. Example:
A 3
A 3
A 3
B 4
B 4
B 4
B 4
C 5
C 5
C 5
C 5
C 5
D 2
D 2
What is happening there is that there are 5 categories "A, B, C, D" and there are different counts of it. Duplicate values. I would like to create a new column and output the number of times it occurs in a different column as shown above. Please no VBA as i don't know it.
Try this...
=IF(A2<>A1,COUNTIF(A:A,A2),"")
Below is a sample of the data I have. I want to match the data in Column A and B. If column B is not matching column A, I want to add a row and copy the data from Column A to B. For example, "4" is missing in column B, so I want to add a space and add "4" to column B so it will match column A. I have a large set of data, so I am trying to find a different way instead of checking for duplicate values in the two columns and manually adding one row at a time. Thanks!
A B C D
3 3 Y B
4 5 G B
5 6 B G
6 8 P G
7 9 Y P
8 11 G Y
9 12 B Y
10
11
12
11
12
I would move col B,C,D to a separate columns, say E,F,G, then using index matches against col A and col B identify which records are missing.
For col C: =IFERROR(INDEX(F:F,Match(A1,E:E,0)),"N/A")
For col D: =IFERROR(INDEX(G:G,Match(A1,E:E,0)),"N/A")
Following this you can filter for C="N/A" to identify cases where a B value is missing for an A value, and manually edit. Since you want A & B to be matching here col B is unnecessary, final result w/ removing col B and C->B, D->C:
A B C
3 Y B
4 N/A N/A
5 G B
6 B G
7 N/A N/A
Hope this helps!
I have a table with a list of patients with repeating treatment events (several thousand of records).
Eg. in the following table in column A patients are coded with numbers (the same numbers the same patient), and in column B are coded the treatment events of the patients.
I want to exclude those patients who don’t have an initial treatment event (here "a"), and to mark them in column C for example with "E".
A B C
1 a
1 b
1 c
2 b E
2 c E
3 a
3 c
4 a
4 b
5 a
5 b
5 c
6 a
6 b
6 c
6 d
6 e
6 f
7 b E
7 f E
The formula to put in the column C is
=IF(COUNTIFS($A$2:$A$21,A2,$B$2:$B$21,"a")=0,"E","")
It counts the occurences of "a" treatments for each patient, and where there is none (count = 0) it puts letter E.
I would like to get the first value (for a given key)from the vlookup that is not blank.
1 A 1 A
2 2 C
2 C => 3 B
3 B 4 W
4 W
4 X
Is it possible with vlookup or do I have to use INDEX, MATCH, CHOOSE etc?
If so can anyone provide an example? I cannot add extra columns.
Try this:
=IF(VLOOKUP($D1;$A$1:$B$6;2;FALSE)=0;VLOOKUP($D1;$A$1:$B$6;2;TRUE);VLOOKUP($D1;$A$1:$B$6;2;FALSE))
Use this formula, it looks for the first non blank:
=INDEX($B$1:$B$6,AGGREGATE(15,6,ROW($B$1:$B$6)/(($A$1:$A$6=D1)*($B$1:$B$6<>"")),1))
You should use a IF in an intermediate column, then use this intermediate in your VLOOKUP formula:
This gives, using an extra_tab if you can't insert columns in current sheet
Sheet1 extra_tab
A C D E A B
------------- ---------
1 A 1 A 1 A
2 2 C 0
2 C 3 B 2 C
3 B 4 W 3 B
4 W 4 W
4 Z 4 Z
To avoid blanks for further calculation Formula in extra_tab.A1 and extra_tab.B1 is:
A B
=Sheet1!B1 =IF(Sheet1!B1="";"";Sheet1!A1)
Formula in sheet1.D1 is:
==VLOOKUP(C1;extra_tab!A:B;2;FALSE)
Hope it helps