I have the following Dataframe:
A B C
0 Success 1.5 AAA
1 Duplicate BBB
2 NaN 1.5 CCC
3 Rejected DDD
3 Rejected EEE
I am looking to capture each value in the C column when B is empty. The goal is to store this in a list.
The list would contain BBB,DDD,EEE
I've been searching on Stack Overflow for a bit and can't quite find this answer.
Any help would be greatly appreciated.
Thank you!
Based on the given description, you can try this to get the list of values of column C when column B values are empty. Read more about tolist here
required_list = df.loc[df['B'].isna(), 'C'].tolist()
Now you can iterate over the required list as needed.
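A minimal sketch of the `.loc` approach, for reference. The question does not say whether the blank cells in B are NaN or empty strings, so this assumes they may be either; the sample frame is reconstructed from the question and the name `empty_b` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question. The blank B cells are
# assumed to be empty strings here, which isna() alone would not catch.
df = pd.DataFrame(
    {
        "A": ["Success", "Duplicate", np.nan, "Rejected", "Rejected"],
        "B": [1.5, "", 1.5, "", ""],
        "C": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    }
)

# Treat both NaN and empty strings in B as "empty".
empty_b = df["B"].isna() | (df["B"] == "")

# Select column C for the rows where B is empty and convert to a list.
required_list = df.loc[empty_b, "C"].tolist()
print(required_list)
```

If the blanks are real NaN values, the `isna()` half of the mask is enough on its own.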
Try this
df[df["B"].isnull()]["C"].tolist()
This will do it:
import numpy as np
df['D'] = np.where(df['B'].isnull(), df['C'], None)
result = df['D'].dropna().tolist()
I am trying to convert a pandas dataframe with 2 columns into a dictionary such that the values of one column are the keys, and the values of the other column are the values of the dictionary. If the keys happen to be repeating (which they are), I want the values of the same key to be appended in a list.
So far I did the following, but this takes a very long time if I want to convert 100K-plus records to a dictionary.
A B
1 ab kate
2 ab drew
3 ab mike
4 ab eric
5 cd bobby
6 cd kyle
7 ab alex
8 ab michelle
9 cd heather
fdict = dict()
for d, d2 in zip(t.A, t.B):
    fdict.setdefault(d, list()).append(d2)
Please help me understand how I can do this faster using python.
Thanks !
I think this one-liner, df.set_index('ID').T.to_dict('list'), would serve your purpose and be faster. Note that it assumes a column named 'ID' with unique values.
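Because the keys in column A repeat, another option is to group by A and collect B into lists. This is a sketch under the assumption that the frame is called `t` with columns `A` and `B`, as in the question's loop. Whether it beats the loop on 100K rows is worth timing on your own data.

```python
import pandas as pd

# Sample data taken from the question.
t = pd.DataFrame(
    {
        "A": ["ab", "ab", "ab", "ab", "cd", "cd", "ab", "ab", "cd"],
        "B": ["kate", "drew", "mike", "eric", "bobby",
              "kyle", "alex", "michelle", "heather"],
    }
)

# Group the B values by their A key, turn each group into a list,
# then convert the resulting Series to a plain dict.
fdict = t.groupby("A")["B"].apply(list).to_dict()
print(fdict)
```

Row order within each key is preserved, so the lists match what the `setdefault` loop builds.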
Thanks so much for looking at my question! I am trying to create a formula that subtracts a specific value from another formula. However, that specific value may change.
Example:
A B C D
1 1 100 =(2000 - ( if A = 1, i want to subtract the C value where B =1))
1 2 250
1 3 310
1 4 .
2 1
2 2 =((2000 - ( if A = 2, i want to subtract the C value where B =1))
2 3
2 4
3 1
3 2
3 3
3 4
(A,B,C,D are the columns)
Hopefully this makes sense! I am trying to subtract the C value that goes along with the B1 value for each different A.
I was thinking of an INDEX/MATCH of some sort but wasn't exactly sure how to do that when the A's change. Thanks so much in advance for the help!
INDIRECT or INDEX functions can help you. See this answer.
Would something like a nested if function work for you here? For example:
=IF(A2=1,IF(B2=1,2000-C2,"Enter calculation if B2<>1"),"Enter calculation if A2<>1")
If this works, then you can simply copy/paste the function down the rows in column D.
Please forgive any errors or shortcomings in this question, it's my first on stackoverflow.
I have two sets of data in Excel of differing lengths and frequency, and would like to be able to place a value of 0 for where they don't synchronise, and match the rest.
For example, dataset 1 could be:
Date Set1
01-01-2010 10
01-03-2010 4
01-04-2010 8
01-05-2010 5
01-06-2010 10
01-09-2010 12
01-10-2010 9
01-11-2010 4
And dataset 2 could be:
Date Set2
01-03-2010 102
01-06-2010 104
01-10-2010 102
I'm looking for an output table that displays the values alongside each other for dates matching, 0 otherwise, like so:
Date Set1 Set2
01-01-2010 10 0
01-03-2010 4 102
01-04-2010 8 0
01-05-2010 5 0
01-06-2010 10 104
01-09-2010 12 0
01-10-2010 9 102
01-11-2010 4 0
I can't seem to be able to crack this with my limited knowledge and the lack of synchronisation in the data. Any help would be much appreciated, thanks.
You can do this using a VLOOKUP nested in an IFERROR statement.
The two formulas used (and dragged down to the last unique date row) are:
H3: =IFERROR(VLOOKUP(G3,A:B,2,0),0)
I3: =IFERROR(VLOOKUP(G3,D:E,2,0),0)
This will not work if you have duplicate dates in the same data set with varying values since VLOOKUP will always return the first matched value (reading top down).
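For readers who want to check the lookup logic outside Excel, here is a rough Python sketch of the same idea. It is not part of the answer: the variable names are illustrative, and `dict.get(date, 0)` stands in for `IFERROR(VLOOKUP(...), 0)`.

```python
# Dates are kept as strings exactly as written in the question.
set1 = [
    ("01-01-2010", 10),
    ("01-03-2010", 4),
    ("01-04-2010", 8),
    ("01-05-2010", 5),
    ("01-06-2010", 10),
    ("01-09-2010", 12),
    ("01-10-2010", 9),
    ("01-11-2010", 4),
]
set2 = [
    ("01-03-2010", 102),
    ("01-06-2010", 104),
    ("01-10-2010", 102),
]

# Build lookup tables. setdefault keeps the first value seen for a
# repeated date, which mirrors VLOOKUP returning the first match.
lookup1 = {}
for date, value in set1:
    lookup1.setdefault(date, value)
lookup2 = {}
for date, value in set2:
    lookup2.setdefault(date, value)

# Unique dates in order of first appearance across both sets.
dates = list(dict.fromkeys([d for d, _ in set1] + [d for d, _ in set2]))

# A missing date falls back to 0, like IFERROR(..., 0).
merged = [(d, lookup1.get(d, 0), lookup2.get(d, 0)) for d in dates]
for row in merged:
    print(*row)
```

As with the formulas, a date that appears twice in one set only contributes its first value.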
Place Set1 in A1:B9 (header in row 1). Add a column of zeros next to it in column C, so A2:A9 is dates, B2:B9 is values and C2:C9 is zeros.
Place Set2 (without the header) in A10:B12; move the Set2 data to column C and put zeros in column B, so A10:A12 is dates, B10:B12 is zeros, C10:C12 is values.
Sort the range A2:C12 by Date (column A).
Easier to show with a screenshot but newbies are not allowed to post images.
I'm not sure if I'm overcomplicating this...
Basically I'd like to have a formula which does the following:
if the value in column C is less than 6, look up the max value in B among those rows, but display the corresponding value of C.
So far I have this, but I'd like it to show 2, not 437:
{=MAX(IF(C2:C12<6,B2:B12, 0))}
any advice is appreciated. i'm shy, be nice..thanks
A B C
cat 110 3
dog 148 4
rooster 36 7
duck 32 8
pig 437 2
horse 44 6
eagle 215 5
dolphin 21 1
panda 2 9
iguana 257 10
fish 199 11
edit:
maybe something like
{=INDEX(C2:C12,MATCH(MAX(IF(C2:C12<6,C2:C12)),C2:C12,0))}
but I don't see where to put b2:b12
You really need two conditions
1) Column B is equal to =MAX(IF(C2:C12<6,B2:B12))
2) Column C is <6
so you can INDEX column C when those two are met, i.e.
=INDEX(C2:C12,MATCH(1,(B2:B12=MAX(IF(C2:C12<6,B2:B12)))*(C2:C12<6),0))
confirmed with CTRL+SHIFT+ENTER
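The two conditions can be checked against the sample data with a short Python sketch. This is only an illustration of the logic, not a replacement for the formula; the tuple layout `(A, B, C)` and the variable names are assumptions made here.

```python
# Sample data from the question as (A, B, C) tuples.
rows = [
    ("cat", 110, 3),
    ("dog", 148, 4),
    ("rooster", 36, 7),
    ("duck", 32, 8),
    ("pig", 437, 2),
    ("horse", 44, 6),
    ("eagle", 215, 5),
    ("dolphin", 21, 1),
    ("panda", 2, 9),
    ("iguana", 257, 10),
    ("fish", 199, 11),
]

# Condition 2: keep only the rows where C is less than 6.
candidates = [row for row in rows if row[2] < 6]

# Condition 1: among those rows, pick the one with the largest B.
best = max(candidates, key=lambda row: row[1])

# Return the C value of that row, as INDEX(C2:C12, ...) does.
result = best[2]
print(result)
```

The row selected is pig (B = 437), so the value returned is its C value, 2, which is what the question asks for.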
{=IF(C2<6,INDEX($C$2:$C$12,MATCH(MAX($B$2:$B$12),$B$2:$B$12,0)),0)}
You were almost there..
Basically if C<6 , find max of B , lookup it in B:C and display corresponding C
{=IFERROR(VLOOKUP(MAX(IF(C2:C12<6,B2:B12, 0)),B2:C12,2,FALSE),0)}
Since your question doesn't say what should happen when C>=6, I assume you don't want a value in that case.
Can answer more precisely if you clarify.
Mark this as answer if that's correct.
Hope this helps!
As far as I can see, you requested a single INDEX formula:
{=INDEX($C$2:$C$12,MATCH(MAX(IF($C$2:$C$12<6,$B$2:$B$12,0),0),IF($C$2:$C$12<6,$B$2:$B$12,0),0))}
This is an array formula, hit Ctrl+Shift+Enter while still in the formula bar.
Let's break this down.
=INDEX(C:C, - Index column C, as these are the values you want returned
MATCH(MAX(IF(C:C<6,B:B,0),0), - Find the largest value in the following array and return its relative position for INDEX()
IF(C:C<6,B:B,0),0)) - If the value in column C is less than 6, add the column B value to the array; otherwise add 0
I have 3 columns
a b c
jon ben 2
ben jon 2
roy jack 1
jack roy 1
I'm trying to retrieve all unique permutations e.g. ben and jon = jon and ben so they should only appear once. Expected output:
a b c
jon ben 2
roy jack 1
Any ideas of a function that could do this? The order in the output does not matter. I've tried concatenating and then removing duplicates, but obviously this only considers the string order.
I've created a fourth column by joining all three columns together with =a1&","&b1&","&c1 and used Excel's built-in Remove Duplicates function. This doesn't work because the order of the strings is different.
In your fourth column use the formula
=IF(A1<B1,A1&","&B1&","&C1,B1&","&A1&","&C1)
This joins A and B in alphabetical order, so you can then remove duplicates as you did before.
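The same ordering trick can be sketched in Python to see why it works: each row gets a key in which the two names are put in a fixed order, so "jon, ben" and "ben, jon" produce the same key. The names `seen` and `unique_rows` are illustrative only.

```python
# Sample data from the question as (a, b, c) tuples.
rows = [
    ("jon", "ben", 2),
    ("ben", "jon", 2),
    ("roy", "jack", 1),
    ("jack", "roy", 1),
]

seen = set()
unique_rows = []
for a, b, c in rows:
    # Put the two names in alphabetical order so the pair's
    # original order no longer matters.
    key = (min(a, b), max(a, b), c)
    if key not in seen:
        seen.add(key)
        # Keep the first row seen for each unordered pair.
        unique_rows.append((a, b, c))

for row in unique_rows:
    print(*row)
```

This keeps the first occurrence of each pair, which matches the expected output in the question.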